Definition
A gene is an extremely specific sequence of nucleotide monomers that has the ability to completely or partially control the expression of one or more traits in every type of living organism. Genes are formed from deoxyribonucleic acid (DNA) and, in the case of some viruses, ribonucleic acid (RNA) polymers.
Overview
Wilhelm Johannsen, who drew on his botany background to study the genetic traits of plants, was the first to use the term ‘gene’. Modern genetics no longer accepts earlier theories that depict the gene as a single piece of information that can only produce a single protein. We now know that one gene can give rise to multiple, different transcription units of messenger RNA (mRNA), depending on where the transcription process begins. A single gene may also make up just a small section of an mRNA transcription unit, again according to where on the gene transcription is initiated.
It is now understood that the product of a gene can perform a second function without losing its original function and without any change to the gene’s sequence. This phenomenon, known as ‘protein moonlighting’, means that one protein encoded by a single gene can carry out more than one unrelated role. The common definition of a single gene controlling a single function is outdated; although genetic research remains in its infancy, it is clear that a single gene can have many roles.
Gene Examples
The gene examples listed here are recent examples. A list composed in the future may differ. Due to the current surge in genetic research and our understanding of the codes that make each organism unique, gene examples are constantly evolving.
RNA Virus Genes
Viruses can be categorized according to the nucleic acid that carries their genes: they are either RNA or DNA viruses. Most viruses carry few genes, from a handful to around 200.
RNA viruses exhibit extremely high rates of gene mutation. A mutation is a change of a naturally coded sequence into a different sequence, either at a single point or in multiple areas of the gene, and it can arise during replication, transcription or translation. In RNA viruses this rate can be as high as one mutation for every replication. Mutation rates in DNA are much lower: approximately one in every few hundred replications to one in tens of thousands.
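As a rough illustration of the gap between these rates, the sketch below multiplies a per-replication mutation rate by a number of replications. The function name `expected_mutations` and the value of 300 standing in for "a few hundred" are assumptions made for this example, not measured figures.

```python
def expected_mutations(rate_per_replication: float, replications: int) -> float:
    """Expected number of mutations accumulated over a number of replications."""
    return rate_per_replication * replications


# Rates taken from the ranges quoted in the text; 300 is an illustrative
# stand-in for "a few hundred".
RATES = {
    "RNA virus (upper bound)": 1.0,
    "DNA (one in ~300)": 1 / 300,
    "DNA (one in 10,000)": 1 / 10_000,
}

for label, rate in RATES.items():
    total = expected_mutations(rate, replications=1000)
    print(f"{label}: about {total:.1f} mutations per 1000 replications")
```

Under these assumed values, the RNA figure is several hundred to several thousand times larger than the DNA figures, which is the point the paragraph makes.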
Viruses can also change their genetic information by way of recombination, where two viruses inside a host organism exchange their genetic material. A retrovirus can also insert a copy of its genome into the DNA of host cells.
This ability to constantly change the genetic code means RNA viruses can adapt to survive and replicate in previously immune or resistant hosts. Some of the world’s most feared viruses are RNA viruses. This group of pathogens includes the viruses that cause Ebola, rabies, influenza, West Nile fever, polio, and measles (pictured below).
An example of a viral gene is BALF5, which encodes a DNA polymerase protein subunit in the Epstein-Barr virus. Note that the Epstein-Barr virus is a DNA virus rather than an RNA virus.
Bacterial Genes
Bacteria are estimated to have between 500 and 7500 genes, depending on their complexity. Many bacteria have a single chromosome containing the bacterial genome, as well as separate structures called plasmids, which can replicate independently of the chromosome. This gives the DNA inside plasmids the name ‘extrachromosomal DNA’. While bacterial chromosomes are usually reported to be circular in form, they can also be linear. A basic diagram can be seen below.
An example of a bacterial gene is blaOXA-2, which encodes a beta-lactamase. This enzyme is known to increase the resistance of many bacteria, including Escherichia coli, to beta-lactam antibiotics.
Human Genes
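As a broad trend, more complex organisms have more complex genomes and a higher number of genes, although the relationship is not strict. Early results from the Human Genome Project suggested around 30,000 human genes; later estimates are closer to 20,000 to 25,000 protein-coding genes. These genes provide the codes for the proteins that create each person’s unique anatomical and physiological identity.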
The human genome contains approximately three billion base pairs of DNA, each pair formed by two complementary nucleotides. The sequence of these base pairs forms the code of each gene, and each gene provides the transferable data for one or more proteins.
The Forkhead box protein P2 (FOXP2) gene encodes a transcription factor. The FOXP2 gene is found at the same chromosomal locus in every human cell (except mature red blood cells), but is only expressed in the brain, the gut and the lung. This transcription factor binds to DNA but is not limited to a single function: it can bind to hundreds of DNA promoters and can therefore contribute to the production of more than one protein. One of FOXP2’s primary functions, however, is in the human development of speech and language. We know this because mutations in the FOXP2 gene lead to ‘autosomal dominant speech and language disorder with orofacial dyspraxia’, or SPCH1.
Mutations in the BRCA genes are well known as a cause of breast cancer. Normally, the BRCA genes stop tumor formation by repairing DNA damage caused by pollution, diet, lifestyle habits such as smoking, exposure to radiation, and many other factors. In humans with mutated or damaged BRCA genes, this protection no longer applies. Men and women with BRCA mutations at either locus 17q21 (BRCA1) or 13q12.3 (BRCA2) have a much higher risk of developing breast cancer, and females also have a higher risk of developing ovarian cancer.
The COL1A1 gene encodes the alpha-1 chain of type I collagen, a protein found in many types of connective tissue. This gene can be found at locus 17q21.33. This ‘address’ refers to the COL1A1 gene’s position on chromosome 17, more specifically on the longer ‘q’ arm, in region 2, band 1, sub-band 33.
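The way such an ‘address’ breaks down can be sketched in a few lines of Python. This is a minimal illustration only: the `Locus` type, the `parse_locus` function and the regular expression are hypothetical names written for this example, and the pattern covers only simple loci of the form used in this article, not the full cytogenetic nomenclature.

```python
import re
from typing import NamedTuple, Optional


class Locus(NamedTuple):
    chromosome: str          # "1" to "22", "X" or "Y"
    arm: str                 # "p" (short arm) or "q" (long arm)
    region: int
    band: int
    sub_band: Optional[str]  # digits after the dot, kept as text; None if absent


# Simplified pattern: chromosome, arm, one region digit, one band digit,
# then an optional ".sub-band" part.
_LOCUS_RE = re.compile(
    r"^(?P<chrom>[0-9]{1,2}|X|Y)(?P<arm>[pq])"
    r"(?P<region>[0-9])(?P<band>[0-9])(?:\.(?P<sub>[0-9]+))?$"
)


def parse_locus(text: str) -> Locus:
    """Split a locus string such as '17q21.33' into its parts."""
    match = _LOCUS_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised locus: {text!r}")
    return Locus(
        chromosome=match["chrom"],
        arm=match["arm"],
        region=int(match["region"]),
        band=int(match["band"]),
        sub_band=match["sub"],
    )


print(parse_locus("17q21.33"))  # COL1A1
print(parse_locus("13q12.3"))   # BRCA2
```

The sub-band is kept as text so that the digits after the dot are preserved exactly as written in the locus.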
MTHFR provides the code the human body requires in order to manufacture the enzyme methylenetetrahydrofolate reductase. Variants of the MTHFR gene are quite common, and some reduce or abolish the enzyme’s activity. This hinders steps in folate metabolism that are needed to convert homocysteine into methionine and that are linked to the synthesis of the nucleoside thymidine. The result can be hyperhomocysteinemia, a raised level of homocysteine in the blood, together with disturbed metabolism of folate (vitamin B9), which is necessary for embryonic neural development.
CXorf38 is another gene which codes for proteins involved in tissue formation. CXorf38 is predominantly expressed in glands and lymph nodes and can be found at locus Xp11.4, which indicates the X chromosome (a sex chromosome rather than an autosome), the shorter ‘p’ arm, region 1, band 1, sub-band 4. This locus is pictured below.
A Short Glossary of Genetic Terms
A gene is not the same as a genome. A genome describes the entire genetic blueprint of a single organism; a gene is a specific part of an organism’s DNA or, in viruses, RNA. A gene consists of a sequence of nucleotides that passes on genetic information through the processes of replication, transcription, and translation, yielding one or more proteins whose type depends on that specific sequence.
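The path from sequence to protein can be illustrated with a toy example. The sketch below is a simplification under stated assumptions: the DNA string is invented for illustration, the codon table holds only the five codons the example needs (taken from the standard genetic code), and `transcribe` and `translate` are hypothetical helper names.

```python
# Partial codon table: only the codons used in this example.
CODON_TABLE = {
    "AUG": "Met",
    "UUU": "Phe",
    "GGC": "Gly",
    "GCA": "Ala",
    "UAA": "Stop",
}


def transcribe(coding_strand: str) -> str:
    """DNA coding strand to mRNA: the same sequence with T replaced by U."""
    return coding_strand.upper().replace("T", "U")


def translate(mrna: str) -> list[str]:
    """Read the mRNA three bases at a time until a stop codon is reached."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "Stop":
            break
        protein.append(amino_acid)
    return protein


dna = "ATGTTTGGCGCATAA"  # invented sequence, not a real gene
mrna = transcribe(dna)
print(mrna)             # AUGUUUGGCGCAUAA
print(translate(mrna))  # ['Met', 'Phe', 'Gly', 'Ala']
```

Changing a single base in `dna` changes the codon read at that position, which is why the protein produced depends on the specific sequence.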
Other terms one often comes across in the field of genetics are:
- Trait: a specific characteristic, such as eye color
- Locus: the position on a chromosome where a single gene can be found
- Allele: the possible variations of a single gene found at a single locus, such as part of the blueprint data for blue and brown eyes
- Genotype: the entire set of alleles found in one or more genes and at various loci that control a single trait
- Phenotype: the observable trait – the actual eye color of the organism
The image below shows a simple diagram featuring fruit fly alleles on a single chromosome arm. The available genetic codes (alleles) for each potential and actual trait are found at the same locus. These traits include leg length, eye color, antennae length, wing shape and abdomen color.
Further Gene Terminology
Genes are associated with many other processes, as they are the primary building block of life. This title had previously been given to amino acids, but it is genes that carry the codes specifying how amino acids are assembled into proteins.
Gene Therapy
A scientific breakthrough of recent decades, gene therapy makes amendments to existing genes, either to correct mutations or to produce a specific protein. It does this by using a virus that has been made harmless to carry new genetic information into a target cell. This courier function gives the virus a new name: a vector. Gene therapy has not yet developed into a viable and common human therapy and is still in very early stages. Research currently concentrates on designing a safe but effective vector. Clinical trials have already begun in certain populations, and successes include treatments for immune deficiencies, hereditary degenerative vision loss, hemophilia, thalassemia, disorders of fat metabolism and cancer. However, success stories are intermittent and vectors can sometimes cause other disorders.
Gene Mutation
Genes and genetic information can undergo mutation in various ways. Substitution mutations, in which one base is exchanged for another, may be silent (the encoded amino acid is unchanged), missense (a different amino acid is encoded) or nonsense (a premature stop signal is created). Mutations can take place either in the DNA itself or during the processes of replication, transcription, and translation. Only a small percentage of gene mutations cause benign tumor growth or cancer. Mutations can be hereditary or acquired. Acquired mutations may be caused by environmental factors such as radiation, chemical pollutants, viral and bacterial infections, and physical processes of degeneration like the ageing process.
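The difference between silent, missense and nonsense mutations can be made concrete by comparing a codon before and after a single-base substitution. The sketch below is an illustration under assumptions: it uses the standard genetic code written in DNA letters, `classify_substitution` is a hypothetical helper, and it expects the original codon to code for an amino acid (a change to an existing stop codon is not treated separately).

```python
from itertools import product

# Standard genetic code in DNA letters, with codons ordered TTT, TTC, TTA, TTG,
# TCT, ... so that the third base varies fastest. "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon: str, position: int, new_base: str) -> str:
    """Classify replacing the base at `position` (0, 1 or 2) of `codon`."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before = CODON_TABLE[codon]
    after = CODON_TABLE[mutated]
    if after == before:
        return "silent"    # same amino acid despite the changed base
    if after == "*":
        return "nonsense"  # the codon now signals a premature stop
    return "missense"      # a different amino acid is encoded


# GAA codes for glutamate (E).
print(classify_substitution("GAA", 2, "G"))  # GAG is still glutamate
print(classify_substitution("GAA", 1, "T"))  # GTA codes for valine
print(classify_substitution("GAA", 0, "T"))  # TAA is a stop codon
```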
Gene Flow
Gene flow, also known as gene migration or allele flow, is the transfer of genetic information from one population of a species to another. For gene flow to occur, there must be migration in which the DNA of a usually absent population enters the gene pool of another.
Gene Pool
The sum of all the genes (expressed and non-expressed) within a population of a single species is known as the gene pool. This gives a broad range of possible phenotypes and therefore extensive variation within a single species.
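One common way to summarise a gene pool is to count how often each allele occurs at a locus. The sketch below assumes a small, invented population of diploid individuals, each carrying two alleles written ‘B’ or ‘b’; the function name `allele_frequencies` and the data are illustrative only.

```python
from collections import Counter


def allele_frequencies(genotypes):
    """Return the share of each allele among all allele copies in the population."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: count / total for allele, count in counts.items()}


# Invented population: each tuple holds the two alleles of one individual.
population = [("B", "B"), ("B", "b"), ("b", "b"), ("B", "b")]
print(allele_frequencies(population))  # {'B': 0.5, 'b': 0.5}
```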
Gene Regulation
Some genes are only expressed at certain times or in certain tissues or cell types. Every mechanism that induces or represses gene expression is grouped under the term gene regulation.
Genetic Drift
The geneticist Sewall Green Wright was the first to theorize about this mechanism of evolution, in which certain individuals within a population produce more offspring than others by chance. These individuals need not be fitter or stronger (survival of the fittest); other, random factors are responsible. An example might be the alpha males of a pack of African wild dogs being killed during a hunting trip, leaving one or two weaker males to father future generations. The gene pool then drifts towards the genetic information of these individuals.
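The effect of chance on a small population can be sketched with a short simulation. The code below uses the Wright-Fisher model, a standard textbook simplification that the text above does not name, so treat the model choice, the function name `simulate_drift` and all parameter values as assumptions. Each generation draws its allele copies at random from the previous generation, with no selection at all.

```python
import random


def simulate_drift(freq, population_size, generations, seed=0):
    """Track one allele's frequency under random sampling alone (no selection).

    Each generation holds 2 * population_size allele copies (diploid
    individuals); every copy carries the allele with probability equal to the
    allele's frequency in the previous generation.
    """
    rng = random.Random(seed)
    copies_per_generation = 2 * population_size
    history = [freq]
    for _ in range(generations):
        copies = sum(rng.random() < freq for _ in range(copies_per_generation))
        freq = copies / copies_per_generation
        history.append(freq)
    return history


# A small population drifts more strongly than a large one.
small = simulate_drift(0.5, population_size=10, generations=50, seed=1)
large = simulate_drift(0.5, population_size=1000, generations=50, seed=1)
print("small population, final frequency:", small[-1])
print("large population, final frequency:", large[-1])
```

Once the frequency reaches 0 or 1 in this model it stays there, which corresponds to an allele being lost from, or fixed in, the gene pool.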