Base: Also referred to as a nucleotide, a base is a molecule that is the structural unit of DNA (or RNA). The four DNA bases are adenine (A), cytosine (C), guanine (G), and thymine (T). Joined together, they make up the strands of the DNA double helix.
Base pairs: DNA strands pair together using specific rules: adenine (A) bonds with thymine (T) and guanine (G) with cytosine (C). These complementary pairs of DNA bases are referred to as “base pairs.” The human genome is about three billion base pairs long.
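The pairing rules above can be sketched in a few lines of Python. This is only an illustration: the names `PAIRS`, `complement`, and `reverse_complement` are hypothetical, and the reversal step relies on a fact not stated in this glossary, namely that the two strands of the double helix run in opposite directions.

```python
# Pairing rules from the "Base pairs" entry: A bonds with T, G bonds with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the pairing partner of each base, position by position."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand read in its own direction.

    Assumption (not from the glossary): the two strands are antiparallel,
    so the partner strand is conventionally written reversed.
    """
    return complement(strand)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

Any character other than A, T, G, or C raises a `KeyError` in this sketch, which is usually preferable to silently passing bad input through.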
Bioinformatics: Bioinformatics, or “computational biology,” is a relatively new scientific discipline that involves using computer software and databases to analyze genomic data.
Biomarker: A measurement, such as a genetic mutation or the level of a specific protein in blood, that can be used to test for a disease or to predict its outcome.
Biorepository: A large collection or “warehouse” of biological materials (such as tissue, blood, urine, or other samples) stored, organized, and distributed for future use by scientists.
Chromosome: An organized structure of DNA found in cells. Chromosomes help maintain the integrity of DNA’s genetic code and regulate when genes are turned on or off. Human cells have 22 pairs of chromosomes called autosomes, plus two sex chromosomes (XX or XY). One autosome from each pair and one sex chromosome come from each parent.
Chronic bronchitis: A chronic inflammation of the bronchi, the medium-size airways in the lungs. The disease, one of two forms of chronic obstructive pulmonary disease (COPD), is characterized by a persistent cough and an excess production of mucus.
Chronic obstructive pulmonary disease (COPD): A progressively worsening lung disease, the symptoms of which include an overproduction of mucus, shortness of breath, wheezing, and chest tightness. COPD is caused by smoking or long-term exposure to other environmental irritants such as air pollution and chemical fumes. See also, COPD under "Learn More."
DNA: Shorthand for deoxyribonucleic acid, DNA is the molecule that carries the genetic instructions in (most) living organisms.
DNA microarray: Commonly known as a gene chip, a microarray can simultaneously test for millions of mutations in a patient or look at all 25,000 genes in the genome to measure which are turned on or off.
DNA sequence: The order of bases—As, Ts, Cs, and Gs—in a given region or a single strand of DNA.
DNA sequencing: The process of determining the order of nucleotide bases in a region of DNA. See also, Research Methods under "What We Do."
Emphysema: A progressive, obstructive lung disease that causes shortness of breath and ultimately leads to the destruction of lung tissue. Emphysema is often associated with chronic obstructive pulmonary disease (COPD).
Epigenetics: The study of factors other than changes in the DNA sequence that can alter how genes are expressed. The LGRC is studying methylation, one of the most important epigenetic factors.
Expression profile: Genes are expressed at different levels in different cell types—or as a disease develops and progresses. Expression profiles capture the levels for large numbers of genes and can be used as biomarkers.
Gene: The basic unit of heredity, a gene is a segment of DNA containing the information a cell needs to make (or “encode”) proteins or perform a regulatory function. Humans have about 25,000 protein-coding genes.
Gene expression: Genes turn on and off (or more accurately, up and down) by producing more or less RNA. Since genes encode proteins, an alteration in gene expression can lead to disease, such as when the wrong protein, or too much or too little of the right protein, is made in a particular cell.
Gene mutation: A change in a DNA sequence that changes the DNA code and so might result in disease. Mutations can be caused by many factors, including exposure to mutagens in tobacco smoke.
Gene regulation: The 25,000 or so genes in the cell encode proteins, not all of which are required by each and every cell at all times. Gene regulation is how cells control which genes are active in each cell.
Gene variant: Gene sequences vary among individuals. A gene variant, determined by its DNA sequence, is the specific form of a gene that a person carries. For example, the major blood types, A, B, and O, are determined through the gene variants a person carries for the ABO gene.
Genetics: The branch of biology that studies heredity and variation in living organisms; the science of identifying the genes responsible for particular traits (such as risk for disease).
Genome: Refers to all the biological information—all the genes—encoded in an organism’s DNA and required for life. See human genome.
Genomic analysis: The use of techniques such as genome sequencing, genome-wide association studies, gene expression analysis, and epigenetic analysis, together with bioinformatics, to understand how changes in the DNA sequence, its structure, or the way genes are expressed can be linked to disease.
Genomic data: The information that comes out of genomic studies.
Genomic medicine: An emerging practice of medicine that involves using genomic data to better predict, diagnose, and treat disease.
Genomic science: See genomics.
Genomics: The branch of biology that involves the simultaneous study of large numbers of genes, or all the genes, in an organism.
Genotype: An organism’s unique genetic blueprint as contained in its DNA and inherited from previous generations.
Human genome: The entire collection of genetic information found in the cells of humans—approximately three billion bases that comprise the entire DNA sequence in a cell.
Human Genome Project: An international research project to uncover the genetic makeup of humans by determining the sequence of DNA for the genome and mapping the approximately 25,000 genes it contains. Most of the project was conducted from 1990 to 2003, but further analysis is under way even today. For more information, visit the National Human Genome Research Institute.
Interstitial lung disease (ILD): A general term that describes a group of disorders in which scar tissue forms in the interstitium, the tissue and space around the air sacs of the lungs, causing progressive scarring and affecting the ability to breathe. See also, ILD under "Learn More."
Methylation: A type of chemical modification that adds a chemical group—a methyl group—to DNA without changing the sequence. Methylation is a type of epigenetic change that can alter how genes are expressed.
MicroRNA: Also known as miRNA, these are small RNA molecules that are transcribed from the genome. They do not encode proteins themselves, but they can prevent messages encoded by other genes from being translated into proteins.
Molecular biology: The branch of biology that involves the study of the interactions and regulation of molecules within a cell, including DNA, RNA, and proteins.
Mutation: A change in DNA sequence. Mutations can cause disease by altering the protein code contained within a gene. See also gene mutation.
Nucleic acid: A class of molecules found in cells that store and express genetic information. DNA and RNA are nucleic acids.
Nucleotide: See base.
Phenotype: The observable traits of an organism. Eye color, blood type, and weight are examples of phenotypes, as is the presence or absence of disease.
Protein: Also known as polypeptides, proteins are the primary building blocks of cells. A gene carries the code that a cell uses to make a particular protein.
RNA: Shorthand for ribonucleic acid, this molecule is similar to DNA, but is single-stranded and uses uracil (U) in place of thymine (T). RNA serves as the intermediary between genes and the proteins they manufacture. A DNA sequence of bases is “transcribed” into RNA and then later “translated” into protein.
Signaling pathway: Cells have networks of genes and proteins, or pathways, that carry out particular functions. A signaling pathway is one that responds to an outside stimulus and, typically, activates the expression of certain genes.
Telomerase genes: Genes that encode the enzyme telomerase, a protein that adds DNA sequence repeats to the ends of chromosomes (called telomeres). Repeated sequences help protect a chromosome’s structural integrity.
Transcription: The process by which the information stored in the genes of DNA is copied to RNA.
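As a rough illustration of transcription, the sketch below copies a DNA sequence into RNA letters by swapping thymine (T) for uracil (U), as described under RNA. The function name `transcribe` is hypothetical, and the sketch assumes the input is the strand whose sequence matches the RNA (often called the coding strand); it ignores the cellular machinery that actually does the copying.

```python
def transcribe(coding_strand: str) -> str:
    """Return the RNA sequence corresponding to a DNA coding strand.

    Assumption: the input is the strand that reads the same as the RNA,
    so the only change needed is T -> U.
    """
    return coding_strand.upper().replace("T", "U")


print(transcribe("ATGGCCTAA"))  # AUGGCCUAA
```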
Transgenic: Describing an organism into which a gene or genes from another organism's cells have been introduced through a laboratory process. For example, transgenic mice that carry mutated human genes are useful for studying disease, since the effects of the gene mutations can be observed in the mice.
Translation: The process by which the information carried in RNA is decoded to make proteins.
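Translation can be sketched as reading RNA three bases at a time and looking each triplet (a "codon") up in a table. The example below is a minimal sketch: `CODON_TABLE` and `translate` are hypothetical names, and the table holds only a handful of entries from the standard genetic code rather than all 64 codons.

```python
from typing import List

# A small subset of the standard genetic code, for illustration only.
CODON_TABLE = {
    "AUG": "Met",   # also the usual start codon
    "GCC": "Ala",
    "UUU": "Phe",
    "GGC": "Gly",
    "UAA": "Stop",
    "UAG": "Stop",
    "UGA": "Stop",
}


def translate(rna: str) -> List[str]:
    """Decode RNA into amino acids, stopping at the first stop codon.

    Codons missing from the partial table raise a KeyError.
    """
    protein = []
    # Step through the sequence in non-overlapping triplets.
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


print(translate("AUGGCCUUUGGCUAA"))  # ['Met', 'Ala', 'Phe', 'Gly']
```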