Glossary of terms

Allele

One of several possible versions of a gene. Each one contains a distinct variation in its DNA sequence. For example, a “deleterious allele” is a form of a gene that leads to disease.

Amino acid

The chemical building block of proteins. During translation, different amino acids are strung together to form a chain that folds into a protein.

Archaea

Microbes that look similar to bacteria but are actually more closely related to eukaryotes, such as humans. Archaea are single-celled organisms that don’t have a nucleus and can only be seen with a microscope. They’re found in many different habitats, and many of the first known examples were found in extreme environments.

Bacteria

An abundant type of microbe. These single-celled organisms are invisible to the naked eye, don’t have a nucleus, and can have many shapes. They’re found in all types of environments, from Arctic soil to inside the human body. Most bacteria are not harmful to human health, but certain pathogenic bacteria can cause illness.

Base

The four “letters” of the genetic code (A, C, T, and G) are chemical groups called bases or nucleobases. A = adenine, C = cytosine, T = thymine, and G = guanine. Instead of thymine, RNA contains a base called uracil (U).

Base pair

Different chemicals known as bases or nucleobases are found on each strand of DNA. Each base has a chemical attraction for a particular partner base, known as its complement. C matches up with G, while A pairs with T or U. These bonded genetic letters are called base pairs. Two strands of DNA can zip together to form a double-helix shape when complementary bases match up to form base pairs.

Cancer

A type of disease caused by uncontrolled growth of cells. Cancerous cells may form clumps or masses known as tumors, and can spread to other parts of the body through a process known as metastasis.

Cas

Abbreviation of CRISPR-associated. May refer to genes (cas) or proteins (Cas) that protect bacteria and archaea from viral infection.

Cas9

A protein derived from the CRISPR-Cas bacterial immune system that has been co-opted for genome engineering. Uses an RNA molecule as a guide to find a complementary DNA sequence. Once the target DNA is identified, Cas9 cuts both strands. Has been compared to “molecular scissors” or a “genetic scalpel.” In CRISPR immunity, cutting viral DNA prevents it from destroying the host cell. In genome engineering, cutting genomic DNA initiates a repair process that ends up making a change or “edit” to its sequence.

Cell

The basic unit of life. The number of cells in a living organism ranges from one (e.g. yeast) to quadrillions (e.g. blue whale). A cell is composed of four key macromolecules that allow it to function (protein, lipids, carbohydrates, and nucleic acids). Among other things, cells can build and break down molecules, move, grow, divide, and die.

Chromosome

The compact structure into which a cell’s DNA is organized, held together by proteins. The genomes of different organisms are arranged into varying numbers of chromosomes, and human cells have 23 pairs.

Cleave

The scientific term for cut or break apart. Typically refers to splitting apart a long polymeric molecule like DNA, RNA, or protein. For example, a nuclease like Cas9 can be directed to cleave DNA at a specific location.

Complementary

Describes any two DNA or RNA sequences that can form a series of base pairs with each other. Each base forms a bond with a complementary partner. T (DNA) and U (RNA) bond with A, and C complements G. For example, in CRISPR immunity, the spacer sequence in a guide RNA is complementary to a sequence found in a viral genome. When the RNA bases pair with complementary DNA bases from an invading virus, the Cas9 protein will cut the target to stop the viral infection.
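
The pairing rules above can be sketched in a few lines of Python. This is only an illustration: the function names and example sequences are invented here, and the sketch relies on one fact the glossary does not spell out, namely that two paired strands run in opposite directions, so one of them is read in reverse.

```python
# Pairing rules from the glossary: A pairs with T (DNA) or U (RNA), and C pairs with G.
DNA_COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_TO_DNA = {"A": "T", "U": "A", "C": "G", "G": "C"}


def reverse_complement(dna: str) -> str:
    """Return the DNA strand that would pair with `dna`, written 5' to 3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def rna_pairs_with_dna(rna: str, dna: str) -> bool:
    """Check whether an RNA can base-pair along its whole length with a DNA strand.

    Both sequences are written 5' to 3'. Paired strands run in opposite
    directions (an assumption not stated in the glossary), so the DNA is
    compared in reverse.
    """
    if len(rna) != len(dna):
        return False
    return all(
        RNA_TO_DNA[r] == d for r, d in zip(rna.upper(), reversed(dna.upper()))
    )


print(reverse_complement("ATGC"))
print(rna_pairs_with_dna("GCAU", "ATGC"))
```

A guide RNA spacer and its viral target would pass the second check; a sequence with even one non-pairing base would not.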

Cpf1 (Cas12a)

A protein derived from the CRISPR-Cas bacterial immune system that has been co-opted for genome engineering. Uses an RNA molecule as a guide to find a complementary DNA sequence. Once the target DNA is identified, Cpf1 cuts both strands. Has been compared to “molecular scissors” or a “genetic scalpel.” In CRISPR immunity, cutting viral DNA prevents it from destroying the host cell. In genome engineering, cutting genomic DNA initiates a repair process that ends up making a change or “edit” to its sequence.

CRISPR

Pronounced “crisper.” An adaptive immune system found in bacteria and archaea, co-opted as a genome engineering tool. Acronym of “clustered regularly interspaced short palindromic repeats,” which refers to a section of the host genome containing alternating repetitive sequences and unique snippets of foreign DNA. CRISPR-associated surveillance proteins use these unique sequences as molecular mugshots as they seek out and destroy viral DNA to protect the cell.

CRISPR RNA (crRNA)

During CRISPR immunity, the host cell generates crRNA molecules, each containing one spacer that is complementary to a portion of a viral genome. crRNAs guide CRISPR immune proteins to find and destroy matching invader sequences.

CRISPR screening

A technique that lets scientists see the effects of turning gene expression up or down with CRISPRa and CRISPRi. Instead of checking one gene at a time, a single CRISPR screen can provide information about thousands of different genes at a time.

CRISPRa and CRISPRi

CRISPRa stands for CRISPR activation and CRISPRi stands for CRISPR interference or inhibition. Both are methods for fine-tuning gene expression. If a gene were a car, CRISPRa is the gas pedal and CRISPRi is the brake. Using CRISPRa to activate a gene increases protein production. Using CRISPRi to “turn down” a gene reduces the number of protein products made from that gene.

dCas9

Catalytically inactive, or “dead,” Cas9. This mutated version of the Cas9 protein cannot cut, but still binds tightly to a particular DNA sequence specified by the guide RNA. Can be used to physically block the process of transcription, turning off a specific gene, or to shuttle other proteins to a particular site in the genome.

DNA

Abbreviation of deoxyribonucleic acid, a long molecule that encodes the information needed for a cell to function or a virus to replicate. Forms a double-helix shape that resembles a twisted ladder. Different chemicals called bases, abbreviated as A, C, T, and G, are found on each side of the ladder, or strand. The bases have an attraction for each other, making A stick to T while C sticks to G. These rungs of the ladder are called base pairs. The sequence of these letters is called the genetic code.

Double-strand break (DSB)

When both strands of DNA are broken, two free ends are created. May be made intentionally by a tool such as Cas9. Cells repair their DNA to prevent cell death, sometimes changing the DNA sequence at the site of the break. Initiating or controlling this process with the intent to alter a DNA sequence is known as genome engineering.

Enzyme

A molecule, typically a protein, that causes or catalyzes a chemical change. Usually an enzyme’s name describes a molecule involved in the activity it performs and ends with the suffix -ase. For example, lactase is a well-known enzyme that breaks down lactose, a sugar found in milk. Cas9 is a nuclease, an enzyme that breaks apart the backbone of nucleic acids (RNA or DNA).

Epigenetic

Refers to changes to a cell’s gene expression that do not involve altering its DNA code. Instead, the DNA and proteins that hold onto DNA are “tagged” with removable chemical signals. Epigenetic marks tell other proteins how to read the DNA, which parts to ignore, and which parts to transcribe into RNA.
Comparable to sticking a note that says “SKIP” onto a page of a book – a reader will ignore this page but the book itself has not been changed.

Eukaryote

A domain of organisms whose cells contain a nucleus and other organelles. Eukaryotes are often large and multicellular (e.g. elephants) but can also exist as microscopic, single cells (e.g. yeast). This category of life includes humans. Compare to prokaryotes (bacteria and archaea).

Expression

A product being made from a gene; can refer to either RNA or protein. When a gene is turned on, cellular machines “express” this by transcribing the DNA into RNA and/or translating the RNA into a chain of amino acids. For example, a “highly expressed” gene will have many RNA copies produced, and its protein product is likely to be abundant in the cell. CRISPRi and CRISPRa are methods for turning gene expression down or up, respectively.

Gene

A segment of DNA that encodes the information used to make a protein. Each gene is a set of instructions for making a particular molecular machine that helps a cell, organism, or virus function.

Gene drive

A mechanism for preferential inheritance of a particular DNA sequence. Usually, offspring have a semi-random chance of inheriting a given stretch of DNA from either parent. In a scientist-designed gene drive, a gene is engineered to have a 100% chance of being passed on. Gene drives can force the inheritance of a desirable trait through a population of organisms. For example, this approach could potentially make all mosquitoes incapable of transmitting the malaria parasite.
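
The difference between ordinary inheritance and a gene drive can be illustrated with some simple arithmetic. The sketch below is a toy model, not a description of any real gene drive: it assumes random mating, a very large population, and no cost to carrying the engineered sequence, and the function names are invented for this example. The only quantity taken from the definition above is the chance that a parent carrying one copy passes it on (50% normally, 100% for an ideal drive).

```python
def next_frequency(p: float, transmission: float) -> float:
    """Frequency of the engineered sequence in the next generation.

    p            -- current frequency of the sequence among all gene copies
    transmission -- chance that a one-copy carrier passes it on
                    (0.5 for ordinary inheritance, 1.0 for an ideal drive)
    """
    two_copies = p * p            # these parents always pass it on
    one_copy = 2 * p * (1 - p)    # these parents pass it on with `transmission`
    return two_copies + one_copy * transmission


def simulate(p0: float, transmission: float, generations: int) -> list:
    """Return the frequency in each generation, starting from p0."""
    freqs = [p0]
    for _ in range(generations):
        freqs.append(next_frequency(freqs[-1], transmission))
    return freqs


for label, t in [("ordinary", 0.5), ("gene drive", 1.0)]:
    print(label, [round(f, 3) for f in simulate(0.1, t, 10)])
```

Under these assumptions the frequency stays flat with ordinary inheritance and climbs toward 100% with the drive, which is the "forced inheritance" the definition describes.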

Gene therapy

Delivering corrective DNA to human cells as a medical treatment. Certain diseases can be treated or even cured by adding a healthy DNA sequence into the genomes of particular cells. Scientists and doctors typically use a harmless virus to shuttle genes into targeted cells or tissues, where the DNA is incorporated somewhere within the cells’ existing DNA. CRISPR genome editing is sometimes referred to as a gene therapy technique.

Genetically modified organism (GMO)

A genetically modified organism has had its DNA intentionally altered using scientific tools. Any organism can be engineered in this manner, including microbes, plants, and animals.

Genome

The entire DNA sequence of an organism or virus. The genome is essentially a huge set of instructions for making individual parts of a cell and directing how everything should run.

Genome editing

Intentionally altering the genetic code of a living organism. Can be done with ZFNs, TALENs, or CRISPR. These systems are used to create a double-strand break at a specific DNA site. When the cell repairs the break, the sequence is changed. Can be used to remove, change, or add DNA.

Genome surgery

Repairing harmful DNA through a one-time genome editing procedure. Unlike taking a drug that will temporarily reduce long-term symptoms, altering a patient’s genetic code with the CRISPR-Cas9 “molecular scalpel” would permanently and directly reverse the cause of a genetic disease.

Genomics

The study of the genome, all the DNA from a given organism. Involves a genome’s DNA sequence, organization and control of genes, molecules that interact with DNA, and how these different components affect the growth and function of cells.

Germ cells

The cells involved in sexual reproduction: eggs, sperm, and precursor cells that develop into eggs or sperm. The DNA in germ cells, including any mutations or intentional genetic edits, may be passed down to the next generation. In contrast, the genetic material in somatic cells (all the cells in the body except for germ cells) cannot be inherited by offspring. Note that genome editing in an early embryo is considered to be germline editing since any DNA changes will likely end up in all cells of the organism that is eventually born.

Guide RNA (gRNA)

A two-piece molecule that Cas9 binds and uses to identify a complementary DNA sequence. Composed of the CRISPR RNA (crRNA) and trans-activating CRISPR RNA (tracrRNA). Cas9 uses the tracrRNA portion of the guide as a handle, while the crRNA spacer sequence directs the complex to a matching DNA sequence.
Scientists have also formed a version of the guide RNA that consists of a single molecule, the single-guide RNA (sgRNA).

Homology-directed repair (HDR)

A way for a cell to repair a break in its DNA by “patching” it with a piece of donor DNA. The donor DNA must contain similar sequences, or homology, to the broken DNA ends for it to be incorporated. HDR is a more precise repair pathway than non-homologous end joining. In genome engineering, a researcher designs and adds in the donor DNA, potentially allowing scientists to replace a disease-causing gene with a healthy copy.

Indel

Abbreviation for insertion or deletion. Refers to the random removal or addition of nucleotides from a DNA sequence. This can be enough to stop a gene from functioning (imagine removing a page from the middle of an instruction manual). Indels occur when DNA is broken and “sloppily” repaired by the cell in a process called non-homologous end joining (NHEJ).
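
A short Python sketch can show why even a one-letter indel can be so disruptive. It relies on background not covered in this entry: the cell reads a gene three nucleotides at a time (each group of three is called a codon), so removing a single letter shifts every group that follows. The sequence and function names below are made up for illustration.

```python
def codons(seq: str) -> list:
    """Split a sequence into three-letter groups, dropping any incomplete tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def delete_base(seq: str, position: int) -> str:
    """Return `seq` with the nucleotide at `position` (counting from 0) removed."""
    return seq[:position] + seq[position + 1:]


original = "ATGGCCAAGTAA"            # hypothetical gene fragment
edited = delete_base(original, 4)    # a one-nucleotide deletion

print(codons(original))
print(codons(edited))
```

Every group after the deletion differs from the original, which is why a small indel is often enough to stop a gene from working.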

Microbe

A microscopic organism. Can be single-celled or multicellular, and is sometimes used to refer to viruses, although they are not considered to be alive. Examples include bacteria, yeast, and algae.

Mutation

A change from one genetic letter (nucleotide) to another. Variation in DNA sequence gives rise to the incredible diversity of species in the world, and even occurs between different organisms of the same species. While some mutations have no consequence at all, certain mutations can directly cause disease. Mutations may be caused by DNA-damaging agents such as UV light or may arise from errors that occur when DNA is copied by cellular enzymes. They can also be made deliberately via genome engineering methods.

Non-homologous end joining (NHEJ)

A way for a cell to repair a break in its DNA by attaching the free DNA ends. This pathway is “sloppier” than homology-directed repair, and often results in the random addition or removal of nucleotides around the site of the DNA break, causing insertions or deletions in the genetic code. In genome engineering, this allows scientists to stop a gene from working (similar to removing a page from the middle of an instruction manual).

Nick

When only one strand of DNA is broken, there is a gap called a nick in the backbone, but the DNA does not separate. A tool like CRISPR-Cas9 may be used to generate a nick.

Nuclease

An enzyme that breaks apart the backbone of RNA or DNA. Breaking one strand generates a nick and breaking both strands generates a double-strand break.
An endonuclease cuts in the middle of RNA or DNA, while an exonuclease cuts from the end of the strand. Genome engineering tools like Cas9 are endonucleases.

Nucleic acid

A term for DNA and RNA, both of which are made of nucleotides, the basic chemical units that are strung together to form each strand. One of the four macromolecules that make up all living things (protein, lipids, carbohydrates, and nucleic acids).

Nucleotide

One of the basic chemical units strung together to make DNA or RNA. Consists of a base, a sugar, and a phosphate group. The phosphates can link with sugars to form a string called the DNA/RNA backbone, while the bases can bind to their complementary partners to form base pairs.

Off-target effect

When a genome engineering enzyme cuts DNA at an unintended, or “off-target,” site that is similar to the intended target.
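
One simple way to think about "similar" is to count the positions at which two sequences differ. The Python sketch below does only that; it is an illustration, not how off-target prediction tools actually score sites. The function names, the example sequences, and the cutoff of three mismatches are all arbitrary choices made for this example.

```python
def mismatches(intended: str, site: str) -> int:
    """Count positions where two equal-length sequences differ."""
    if len(intended) != len(site):
        raise ValueError("sequences must be the same length")
    return sum(a != b for a, b in zip(intended.upper(), site.upper()))


def similar_sites(intended: str, candidates: list, max_mismatches: int = 3) -> list:
    """Return candidate sites that differ from the intended target, but only slightly.

    The default cutoff of 3 is an arbitrary value chosen for illustration.
    """
    return [
        site for site in candidates
        if 0 < mismatches(intended, site) <= max_mismatches
    ]


target = "ACGTACGTAC"   # hypothetical intended target
genome_sites = ["ACGTACGTAC", "ACGTACGTAA", "TTTTTTTTTT"]
print(similar_sites(target, genome_sites))
```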

Protospacer adjacent motif (PAM)

A short sequence that must be present next to a DNA target sequence for Cas9 to bind and cut. Prevents cleavage of the host’s own CRISPR array, where the PAM is not present.
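
The PAM requirement can be sketched as a simple search. The example below assumes details that this entry does not state: the commonly used Cas9 from Streptococcus pyogenes, whose PAM is any nucleotide followed by two Gs ("NGG"), sitting immediately after a 20-nucleotide target. It scans only the strand it is given, and the function name and test sequence are invented for illustration.

```python
import re


def find_targets(dna: str, target_len: int = 20) -> list:
    """Return (start, target, pam) for each site followed by an NGG PAM.

    Assumes the S. pyogenes Cas9 PAM (NGG) directly after the target.
    Only the given strand is scanned.
    """
    dna = dna.upper()
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % target_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]


sequence = "TTTT" + "ACGTACGTACGTACGTACGT" + "AGG" + "TTTT"
for start, target, pam in find_targets(sequence):
    print(start, target, pam)
```

A stretch of DNA with no GG after it, such as the spacers stored in the host's own CRISPR array, would return no sites at all.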

Pathogen

A microbe that causes illness. Most micro-organisms are not pathogenic to humans, but some strains or species are harmful.

Phage

A type of virus that infects bacteria or archaea, formally called bacteriophage.

Prokaryote

A category of living organisms that encompasses all bacteria and archaea. Prokaryotes are microscopic, single-celled organisms that do not have a nucleus or other membrane-bound organelles. Compare to eukaryotes.

Protein

A string of amino acids folded into a three-dimensional structure. Proteins are each specialized to perform a specific role to help cells grow, divide, and function. One of the four macromolecules that make up all living things (protein, lipids, carbohydrates, and nucleic acids).

Ribonucleoprotein complex (RNP)

An assembly of molecules containing both protein and RNA. Often used to describe Cas9 protein bound to guide RNA (gRNA), which together form an active enzyme. For genome editing in cells, Cas9 can be delivered as a pre-assembled RNP, or as DNA or RNA encoding the genetic instructions for the protein and RNA components.

RNA

Abbreviation of ribonucleic acid. Transcribed from a DNA template and typically used to direct the synthesis of proteins. CRISPR-associated proteins use RNAs as guides to find matching target sequences in DNA.

Single-guide RNA (sgRNA)

A version of the naturally occurring two-piece guide RNA complex engineered into a single, continuous sequence. The simplified single-guide RNA is used to direct the Cas9 protein to bind and cleave a particular DNA sequence for genome editing.

Somatic cells

All the cells in a multicellular organism except for germ cells (eggs or sperm). Mutations or changes to the DNA in the soma will not be inherited by subsequent generations.

Stem cells

Cells with the potential to turn into a specialized type of cell or to divide to make more stem cells. Most cells in your body are differentiated – that is, their fate has already been decided and they cannot morph into a different kind of cell. For example, a cell in your brain cannot transform into a skin cell. Embryonic stem cells are found in developing embryos, while adult stem cells are found in tissues including bone marrow, blood, and fat. Adult stem cells replenish the body as it becomes damaged over time.

Strand

A string of connected nucleotides; can be DNA or RNA. Two strands of DNA can zip together when complementary bases match up to form base pairs. DNA typically exists in this double-stranded form, which takes the shape of a twisted ladder or double helix. RNA is typically composed of just a single strand, though it can fold up into complex shapes.

Transcription activator-like effector nuclease (TALEN)

A genetic engineering tool wherein one portion of the protein recognizes a specific DNA sequence and another part cuts DNA. Made by attaching a series of smaller DNA-binding domains together to recognize a longer DNA sequence. This DNA-binding domain is fused to a nuclease that will cut nearby DNA. Like CRISPR-Cas9 and ZFNs, it can be used to alter DNA sequences.

Trans-activating CRISPR RNA (tracrRNA)

An RNA molecule that, in the natural CRISPR-Cas9 system, pairs with the CRISPR RNA (crRNA) to form the two-piece guide RNA (gRNA). Cas9 uses the tracrRNA portion of the guide as a handle, while the crRNA spacer sequence directs the complex to a matching DNA sequence.

Transcription

The process by which DNA information is copied into a strand of RNA; performed by an enzyme called RNA polymerase.

Translation

The process by which proteins are made based on instructions encoded in an RNA molecule. Performed by a molecular machine called the ribosome, which links together a series of amino acid building blocks. The resulting polypeptide chain folds up into a particular 3D shape, known as a protein.
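
Transcription and translation can be sketched together in a few lines of Python. This is a simplified illustration under stated assumptions: the codon table below lists only a handful of the 64 three-letter codes (enough for the made-up example gene), the function names are invented, and transcription is approximated by swapping T for U in the DNA strand that reads the same as the RNA, rather than by modeling RNA polymerase.

```python
# A small excerpt of the standard genetic code; None marks a stop signal.
CODON_TABLE = {
    "AUG": "Met", "UUU": "Phe", "GGC": "Gly", "GCC": "Ala", "AAA": "Lys",
    "UAA": None, "UAG": None, "UGA": None,
}


def transcribe(coding_dna: str) -> str:
    """Shortcut for transcription: the RNA matches this DNA strand, with U in place of T."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna: str) -> list:
    """Read the RNA three letters at a time and return the amino acid chain."""
    chain = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]  # KeyError if outside the excerpt
        if amino_acid is None:                   # stop signal: the chain is finished
            break
        chain.append(amino_acid)
    return chain


gene = "ATGTTTGGCAAATAA"   # hypothetical example gene
rna = transcribe(gene)
print(rna)
print(translate(rna))
```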

Virus

An infectious entity that can only persist by hijacking a host organism to replicate itself. Has its own genome, but is technically not considered a living organism. Viruses infect all organisms, from humans to plants to microbes. Multicellular organisms have sophisticated immune systems that combat viruses, while CRISPR systems evolved to stop viral infection in bacteria and archaea.

Zinc-finger nuclease (ZFN)

A genetic engineering tool wherein one portion of the protein recognizes a specific DNA sequence and another part cuts DNA. Made by attaching a series of smaller DNA-binding domains together to recognize a longer DNA sequence. This DNA-binding domain is fused to a nuclease that will cut nearby DNA. Like CRISPR-Cas9 and TALENs, it can be used to alter DNA sequences.

Credits

Glossary adapted from information provided by the Innovative Genomics Institute.

This IGI Glossary Icon Collection by Christine Liu of Two Photon Art for the Innovative Genomics Institute is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
