User:Dennyigo/sandbox

From Wikipedia, the free encyclopedia

Non-coding DNA sequences are components of an organism's DNA that do not encode protein sequences. Some non-coding DNA is transcribed into functional non-coding RNA molecules (e.g. transfer RNA, ribosomal RNA, and regulatory RNAs), and some is transcribed into nonfunctional noise. Other functions of non-coding DNA include the transcriptional and translational regulation of protein-coding sequences, scaffold attachment regions, origins of DNA replication, centromeres and telomeres. Its RNA counterpart is non-coding RNA. In the classification below, function is defined as selected-effect function, meaning that a sequence is maintained in the genome by natural selection for its function.[1][2]

Functional non-coding DNA sequences

Cis- and trans-regulatory elements

Cis-regulatory elements are sequences that control the transcription of a nearby gene. Many such elements are involved in the evolution and control of development.[3] Cis-elements may be located in 5' or 3' untranslated regions or within introns. Trans-regulatory elements control the transcription of a distant gene.

Promoters facilitate the transcription of a particular gene and are typically upstream of the coding region. Enhancer sequences may also exert very distant effects on the transcription levels of genes.[4]

Telomeres

Telomeres are regions of repetitive DNA at the end of a chromosome, which provide protection from chromosomal deterioration during DNA replication. Recent studies have shown that telomeres also contribute to their own stability. Telomeric repeat-containing RNAs (TERRA) are transcripts derived from telomeres. TERRA has been shown to maintain telomerase activity and lengthen the ends of chromosomes.[5]

RNA specifying genes

RNA-specifying genes are regulatory factors that change the expression of RNA sequences without requiring a protein to do so.[citation needed]

Origins of replication

An origin of replication is a sequence at which DNA replication is initiated.[6]

Centromeres

Centromeres are important regions of a chromosome that allow for microtubule attachment.[7]

Scaffold attachment regions

Scaffold/matrix attachment regions are DNA elements whose function is to compartmentalize chromatin into functional domains. [8]

Nonfunctional non-coding DNA sequences

Pseudogenes

Pseudogenes are DNA sequences, related to known genes, that have lost their protein-coding ability or are otherwise no longer expressed in the cell. Pseudogenes arise from retrotransposition or genomic duplication of functional genes, and become "genomic fossils" that are nonfunctional due to mutations that prevent the transcription of the gene, such as within the gene promoter region, or fatally alter the translation of the gene, such as premature stop codons or frameshifts.[9] Pseudogenes resulting from the retrotransposition of an RNA intermediate are known as processed pseudogenes; pseudogenes that arise from the genomic remains of duplicated genes or residues of inactivated genes are nonprocessed pseudogenes.[9] Transpositions of once functional mitochondrial genes from the cytoplasm to the nucleus, also known as NUMTs, also qualify as one type of common pseudogene.[10] Numts occur in many eukaryotic taxa.
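
As a rough illustration of the two translation-disabling mutation types named above, the following Python sketch checks a coding sequence for an in-frame premature stop codon and for a length change that is not a multiple of three (a frameshift). The function names and the toy sequences are hypothetical and chosen only for illustration; actual pseudogene annotation relies on comparison with a parent gene and is considerably more involved.

```python
# Stop codons of the standard genetic code, written in the DNA alphabet.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_premature_stop(cds):
    """Return the 0-based codon index of the first in-frame stop codon that
    occurs before the final complete codon, or None if there is none."""
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    codons = [cds[i:i + 3].upper() for i in range(0, usable, 3)]
    for index, codon in enumerate(codons[:-1]):  # last codon may be the normal stop
        if codon in STOP_CODONS:
            return index
    return None


def is_frameshift(reference, variant):
    """True if the length difference between the sequences is not a multiple of 3."""
    return (len(variant) - len(reference)) % 3 != 0


# Toy sequences (hypothetical, not taken from any real gene).
functional = "ATGGCTGAAAAGTGA"    # ATG GCT GAA AAG TGA
nonsense = "ATGGCTTAAAAGTGA"      # GAA -> TAA at codon index 2
shifted = "ATGGCGTGAAAAGTGA"      # one G inserted after the fifth base

print(first_premature_stop(functional))    # None
print(first_premature_stop(nonsense))      # 2
print(is_frameshift(functional, shifted))  # True
print(first_premature_stop(shifted))       # 2
```

In this toy case the single-base insertion also creates an early TGA in the shifted reading frame, which shows how a frameshift can truncate translation.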

While Dollo's Law suggests that the loss of function in pseudogenes is likely permanent, silenced genes may actually retain function for several million years and can be "reactivated" into protein-coding sequences[11] and a substantial number of pseudogenes are actively transcribed.[9][12] Because pseudogenes are presumed to change without evolutionary constraint, they can serve as a useful model of the type and frequencies of various spontaneous genetic mutations.[13]

Introns

Illustration of an unspliced pre-mRNA precursor, with five introns and six exons (top). After the introns have been removed via splicing, the mature mRNA sequence is ready for translation (bottom).

Introns are non-coding sections of a gene that are transcribed into the precursor mRNA sequence but ultimately removed by RNA splicing during processing to mature messenger RNA. A small proportion of introns are functional, enabling alternative splicing.[14] Many introns appear to be mobile genetic elements.[15]

Studies of group I introns from Tetrahymena protozoans indicate that some introns appear to be selfish genetic elements, neutral to the host because they remove themselves from flanking exons during RNA processing and do not produce an expression bias between alleles with and without the intron.[15] Some introns appear to have significant biological function, possibly through ribozyme functionality that may regulate tRNA and rRNA activity as well as protein-coding gene expression, evident in hosts that have become dependent on such introns over long periods of time; for example, the trnL-intron is found in all green plants and appears to have been vertically inherited for several billions of years, including more than a billion years within chloroplasts and an additional 2–3 billion years prior in the cyanobacterial ancestors of chloroplasts.[15]

Transposons

Transposons are repeated DNA sequences that can sometimes move throughout the genome by a cut-and-paste mechanism. Active transposons can still replicate, while inactive ones are called fossils.[16]

Adenoviruses

Adenoviruses are viruses that infect human cells in order to replicate.[17]

NUMT

NUMTs are mitochondrial pseudogenes present in the human genome which have been associated with fossil record and complete mitochondrial genome sequences.[18]

Contents and History

The amount of non-coding DNA varies greatly among species. Often, only a small percentage of the genome is responsible for coding proteins, but an increasing percentage is being shown to have regulatory functions. When there is much non-coding DNA, a large proportion appears to have no biological function, as predicted in the 1960s. Since that time, this non-functional portion has controversially been called "junk DNA".[19]

The variation in non-coding proportions between species is illustrated by the fact that the H. sapiens genome is 98% non-coding, while the C. elegans genome is 76% non-coding, the S. cerevisiae genome 32%, and the E. coli genome only 12%.[20] In general, the fraction of non-coding DNA in the genome increases with organismal complexity: it is typically about 10% in bacteria, 32% in yeast, 76–77% in nematodes, and 98–98.5% in mammals.[21]

The non-coding genome percentages for specific taxa. Shabalina, Svetlana A.; Spiridonov, Nikolay A. (25 March 2004). "The mammalian transcriptome and the function of non-coding DNA sequences". Genome Biology. p. 105. doi:10.1186/gb-2004-5-4-105.

Use of the term "non-coding DNA" took off around the late 1970s.[22] Non-coding DNA is a sequence that does not code for amino acids, meaning that it is either not transcribed or not translated.[23] Non-coding DNA represents 98% of the human genome, and this 98% is composed of 54% fossil transposable elements and viruses, 28% introns, 9% unknown sequence, 5% pseudogenes, 1% centromeres, 0.5% untranscribed regulatory sequences, 0.3% origins of replication, 0.2% active transposable elements and viruses, and 0.1% telomeres.[24] The term "junk DNA" is still sometimes confused with non-coding DNA.[25] The two are not equivalent, for two reasons: junk DNA has no function but can be transcribed and translated, whereas non-coding DNA, as described above, can have a function but does not code for amino acids.[26] Some non-coding DNA can be junk, but junk DNA does not have to be non-coding DNA.[27]
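
The breakdown quoted above can be checked with a few lines of arithmetic. The Python sketch below takes the listed figures at face value (the dictionary keys are shortened labels introduced here) and shows that they sum to roughly 98%, which suggests they are percentages of the whole genome rather than of the non-coding fraction alone. It then rescales each category to its share of the non-coding fraction.

```python
# Percentages as listed in the paragraph above (labels shortened for display).
composition = {
    "fossil transposable elements and viruses": 54.0,
    "introns": 28.0,
    "unknown": 9.0,
    "pseudogenes": 5.0,
    "centromeres": 1.0,
    "untranscribed regulatory sequences": 0.5,
    "origins of replication": 0.3,
    "active transposable elements and viruses": 0.2,
    "telomeres": 0.1,
}

# The categories add up to about 98%, i.e. the non-coding share of the genome.
total = sum(composition.values())
print(f"sum of listed categories: {total:.1f}% of the genome")

# Rescale each category to its share of the non-coding fraction itself.
for name, percent in sorted(composition.items(), key=lambda item: -item[1]):
    share = 100 * percent / total
    print(f"{name:42s} {percent:5.1f}% of genome  {share:5.1f}% of non-coding")
```

The listed values sum to 98.1%, so the small excess over 98% is presumably due to rounding in the source figures.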

The Encyclopedia of DNA Elements (ENCODE) project uncovered, by direct biochemical approaches, that at least 80% of human genomic DNA has biochemical activity such as "transcription, transcription factor association, chromatin structure, and histone modification".[28] Though this was not necessarily unexpected due to previous decades of research discovering many functional non-coding regions,[29][30] some scientists criticized the conclusion for conflating biochemical activity with biological function.[31][32][33][34][35] Estimates for the biologically functional fraction of the human genome based on comparative genomics range between 8 and 15%.[36][37][38] However, others have argued against relying solely on estimates from comparative genomics due to its limited scope since non-coding DNA has been found to be involved in epigenetic activity and complex networks of genetic interactions and is explored in evolutionary developmental biology.[30][37][39][40]

Fraction of non-coding genomic DNA

Utricularia gibba has only 3% non-coding DNA.[41]

The amount of total genomic DNA varies widely between organisms, and the proportion of coding and non-coding DNA within these genomes varies greatly as well. For example, it was originally suggested that over 98% of the human genome does not encode protein sequences, including most sequences within introns and most intergenic DNA,[42] while 20% of a typical prokaryote genome is non-coding.[29]

In eukaryotes, genome size, and by extension the amount of non-coding DNA, is not correlated with organism complexity, an observation known as the C-value enigma.[43] For example, the genome of the unicellular Polychaos dubium (formerly known as Amoeba dubia) has been reported to contain more than 200 times the amount of DNA in humans.[44] The pufferfish Takifugu rubripes genome is only about one eighth the size of the human genome, yet seems to have a comparable number of genes; approximately 90% of the Takifugu genome is non-coding DNA.[42] Therefore, most of the difference in genome size is due not to variation in the amount of coding DNA but to a difference in the amount of non-coding DNA.[45]
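
The Takifugu comparison can be made concrete with the figures given in this section. The Python sketch below works in units of the human genome (human = 1.0) and treats "over 98%" and "approximately 90%" as exactly 0.98 and 0.90, which is a simplifying assumption; the variable and function names are illustrative only.

```python
# Relative genome sizes and non-coding fractions quoted in the text.
# Sizes are in units of the human genome; no absolute sizes are assumed.
genomes = {
    "human": {"relative_size": 1.0, "noncoding_fraction": 0.98},
    "Takifugu": {"relative_size": 1 / 8, "noncoding_fraction": 0.90},
}


def split(genome):
    """Return (coding, non-coding) amounts in units of the human genome."""
    noncoding = genome["relative_size"] * genome["noncoding_fraction"]
    return genome["relative_size"] - noncoding, noncoding


human_coding, human_noncoding = split(genomes["human"])
fugu_coding, fugu_noncoding = split(genomes["Takifugu"])

# Fraction of the total size difference that is explained by non-coding DNA.
size_gap = genomes["human"]["relative_size"] - genomes["Takifugu"]["relative_size"]
noncoding_share_of_gap = (human_noncoding - fugu_noncoding) / size_gap

print(f"coding ratio (human/Takifugu):     {human_coding / fugu_coding:.2f}")
print(f"non-coding ratio (human/Takifugu): {human_noncoding / fugu_noncoding:.2f}")
print(f"share of size gap from non-coding: {noncoding_share_of_gap:.1%}")
```

Under these assumptions the two genomes differ by a factor of about 1.6 in coding DNA but about 8.7 in non-coding DNA, and non-coding DNA accounts for roughly 99% of the difference in size.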

In 2013, a new "record" for the most efficient eukaryotic genome was discovered with Utricularia gibba, a bladderwort plant that has only 3% non-coding DNA and 97% coding DNA. Parts of the non-coding DNA were being deleted by the plant, which suggested that non-coding DNA may not be as critical for plants, even though non-coding DNA is useful for humans.[41] Other studies on plants have discovered crucial functions in portions of non-coding DNA that were previously thought to be negligible and have added a new layer to the understanding of gene regulation.[46]


Repeat sequences, transposons and viral elements

Mobile genetic elements in the cell (left) and how they can be acquired (right)

Transposons and retrotransposons are mobile genetic elements. Retrotransposon repeated sequences, which include long interspersed nuclear elements (LINEs) and short interspersed nuclear elements (SINEs), account for a large proportion of the genomic sequences in many species. Alu sequences, classified as a short interspersed nuclear element, are the most abundant mobile elements in the human genome. Some examples have been found of SINEs exerting transcriptional control of some protein-encoding genes.[47][48][49]

Endogenous retrovirus sequences are the product of reverse transcription of retrovirus genomes into the genomes of germ cells. Mutation within these retro-transcribed sequences can inactivate the viral genome.[50]

Over 8% of the human genome is made up of (mostly decayed) endogenous retrovirus sequences, as part of the over 42% fraction that is recognizably derived from retrotransposons, while another 3% can be identified as the remains of DNA transposons. Much of the remaining half of the genome that is currently without an explained origin is expected to have originated in transposable elements that were active so long ago (> 200 million years) that random mutations have rendered them unrecognizable.[51] Genome size variation in at least two kinds of plants is mostly the result of retrotransposon sequences.[52][53]

Evidence of functionality

Some non-coding DNA sequences must have some important biological function. This is indicated by comparative genomics studies that report highly conserved regions of non-coding DNA, sometimes on time-scales of hundreds of millions of years. This implies that these non-coding regions are under strong evolutionary pressure and positive selection.[54] For example, in the genomes of humans and mice, which diverged from a common ancestor 65–75 million years ago, protein-coding DNA sequences account for only about 20% of conserved DNA, with the remaining 80% of conserved DNA represented in non-coding regions.[55] Linkage mapping often identifies chromosomal regions associated with a disease with no evidence of functional coding variants of genes within the region, suggesting that disease-causing genetic variants lie in the non-coding DNA.[55] The significance of non-coding DNA mutations in cancer was explored in April 2013.[56]

Non-coding genetic polymorphisms play a role in infectious disease susceptibility, such as hepatitis C.[57] Moreover, non-coding genetic polymorphisms contribute to susceptibility to Ewing sarcoma, an aggressive pediatric bone cancer.[58]

Some specific sequences of non-coding DNA may be features essential to chromosome structure, centromere function and recognition of homologous chromosomes during meiosis.[59]

According to a comparative study of over 300 prokaryotic and over 30 eukaryotic genomes,[60] eukaryotes appear to require a minimum amount of non-coding DNA. The amount can be predicted using a growth model for regulatory genetic networks, implying that it is required for regulatory purposes. In humans the predicted minimum is about 5% of the total genome.

Over 10% of 32 mammalian genomes may function through the formation of specific RNA secondary structures.[61] The study used comparative genomics to identify compensatory DNA mutations that maintain RNA base-pairings, a distinctive feature of RNA molecules. Over 80% of the genomic regions presenting evolutionary evidence of RNA structure conservation do not present strong DNA sequence conservation.
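
The idea of a compensatory mutation can be sketched in a few lines of Python. The example below is a simplified illustration and not the method of the cited study: it treats a substitution pair as compensatory when both positions of a base pair change and the new bases can still pair. Allowing the G-U wobble pair alongside the Watson-Crick pairs is an assumption made here, and the function names are hypothetical.

```python
# Watson-Crick pairs plus the G-U wobble pair, written in the RNA alphabet.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def can_pair(a, b):
    """True if the two bases can form one of the pairs listed in PAIRS."""
    return (a, b) in PAIRS


def is_compensatory(ancestral, derived):
    """True if both positions of a base pair changed and pairing is preserved."""
    (a1, b1), (a2, b2) = ancestral, derived
    both_changed = a1 != a2 and b1 != b2
    return both_changed and can_pair(a1, b1) and can_pair(a2, b2)


print(is_compensatory(("G", "C"), ("A", "U")))  # True: both changed, still pairs
print(is_compensatory(("G", "C"), ("A", "C")))  # False: one side changed, pairing lost
print(is_compensatory(("G", "C"), ("G", "U")))  # False: only one position changed
```

The first case shows why such regions can conserve RNA structure without conserving DNA sequence: the sequence differs at both positions, yet the base pair survives.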

Non-coding DNA may serve to decrease the probability of gene disruption during chromosomal crossover.[62]

Evidence from Polygenic Scores and GWAS

The fraction of predictor SNPs in various polygenic risk predictors that are within, or close to, protein coding regions; the horizontal axis denotes the inclusion also of SNPs that are within 0-30,000 base pairs from coding regions. These predictors were trained using LASSO.[63]

Genome-wide association studies (GWAS) and machine learning analysis of large genomic datasets have led to the construction of polygenic predictors for human traits such as height, bone density, and many disease risks. Similar predictors exist for plant and animal species and are used in agricultural breeding.[64] The detailed genetic architecture of human predictors has been analyzed, and significant effects used in prediction are associated with DNA regions far outside coding regions. The fraction of variance accounted for (i.e., fraction of predictive power captured by the predictor) in coding vs. non-coding regions varies widely for different complex traits. For example, atrial fibrillation and coronary artery disease risk are mostly controlled by variants in non-coding regions (non-coding variance fraction over 70 percent), whereas diabetes and high cholesterol display the opposite pattern (non-coding variance roughly 20-30 percent).[63] Individual differences between humans are clearly affected in a significant way by non-coding genetic loci, which is strong evidence for functional effects. Whole exome genotypes (i.e., which contain information restricted to coding regions only) do not contain enough information to build or even evaluate polygenic predictors for many well-studied complex traits and disease risks.

In 2013, it was estimated that, in general, up to 85% of GWAS loci have non-coding variants as the likely causal association. The variants are often common in populations and were predicted to affect disease risks through small phenotypic effects, as opposed to the large effects of Mendelian variants.[65]

Regulating gene expression

Some non-coding DNA sequences determine the expression levels of various genes, both those that are transcribed to proteins and those that themselves are involved in gene regulation.[66][67][68]

Transcription factors

Some non-coding DNA sequences determine where transcription factors attach.[66] A transcription factor is a protein that binds to specific non-coding DNA sequences, thereby controlling the flow (or transcription) of genetic information from DNA to mRNA.[69][70]

Scientists showed experimentally, with brain organoids grown from stem cells, how differences between humans and chimpanzees are also substantially caused by non-coding DNA – in particular via cis-regulatory element-regulated expression of the ZNF558 gene for a transcription factor that regulates the SPATA18 gene.[71][72]

Operators

An operator is a segment of DNA to which a repressor binds. A repressor is a DNA-binding protein that regulates the expression of one or more genes by binding to the operator and blocking the attachment of RNA polymerase to the promoter, thus preventing transcription of the genes. This blocking of expression is called repression.[73]

Enhancers

An enhancer is a short region of DNA that can be bound with proteins (trans-acting factors), much like a set of transcription factors, to enhance transcription levels of genes in a gene cluster.[74]

Silencers

A silencer is a region of DNA that inactivates gene expression when bound by a regulatory protein. It functions in a way very similar to enhancers, differing only in that it inactivates genes.[75]

Promoters

A promoter is a region of DNA that facilitates transcription of a particular gene when a transcription factor binds to it. Promoters are typically located near the genes they regulate and upstream of them.[76]

Insulators

A genetic insulator is a boundary element that plays two distinct roles in gene expression, either as an enhancer-blocking code, or rarely as a barrier against condensed chromatin. An insulator in a DNA sequence is comparable to a linguistic word divider such as a comma in a sentence, because the insulator indicates where an enhanced or repressed sequence ends.[77]

Uses

Evolution

Shared sequences of apparently non-functional DNA are a major line of evidence of common descent.[78]

Non-functional sequences appear to accumulate mutations more rapidly than functional sequences due to a loss of selective pressure.[13] This allows for the creation of mutant alleles that incorporate new functions that may be favored by natural selection; thus, pseudogenes can serve as raw material for evolution and can be considered "protogenes".[79]

A study published in 2019 shows that new genes (termed de novo gene birth) can be fashioned from non-coding regions.[80] Some studies suggest at least one-tenth of genes could be made in this way.[80]

Long range correlations

A statistical distinction between coding and non-coding DNA sequences has been found. It has been observed that nucleotides in non-coding DNA sequences display long range power law correlations while coding sequences do not.[81][82][83]
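
One simple way to probe a sequence for correlations of this kind is to convert it to a numeric signal and estimate its autocorrelation at increasing distances. The Python sketch below does this under stated assumptions: the purine/pyrimidine encoding (+1 for A and G, -1 for C and T) and the plain covariance estimator are choices made here for illustration and are not necessarily those used in the cited studies. A power-law decay, C(l) proportional to l^(-γ), would appear as a straight line when C(l) is plotted against l on log-log axes, whereas an uncorrelated sequence gives values near zero at every non-zero lag.

```python
import random


def to_signal(seq):
    """Map purines (A, G) to +1 and pyrimidines (C, T) to -1."""
    return [1 if base in "AG" else -1 for base in seq.upper()]


def autocorrelation(signal, lag):
    """Mean of (x[i] - m) * (x[i + lag] - m) over all valid i, with m the mean of x."""
    n = len(signal)
    mean = sum(signal) / n
    products = [
        (signal[i] - mean) * (signal[i + lag] - mean) for i in range(n - lag)
    ]
    return sum(products) / (n - lag)


# A strictly alternating purine/pyrimidine sequence is perfectly (anti)correlated.
periodic = to_signal("AC" * 50)
print(autocorrelation(periodic, 1))  # -1.0
print(autocorrelation(periodic, 2))  # 1.0

# A random sequence (seeded for reproducibility) shows no correlation structure.
random.seed(0)
shuffled = to_signal("".join(random.choice("ACGT") for _ in range(10_000)))
for lag in (1, 10, 100):
    print(lag, round(autocorrelation(shuffled, lag), 4))
```

Applied to real coding and non-coding sequences, the same estimator could be evaluated over a wide range of lags to compare how quickly the correlations decay.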

Forensic anthropology

Police sometimes gather DNA as evidence for purposes of forensic identification. As described in Maryland v. King, a 2013 U.S. Supreme Court decision:[84]

The current standard for forensic DNA testing relies on an analysis of the chromosomes located within the nucleus of all human cells. 'The DNA material in chromosomes is composed of "coding" and "non-coding" regions. The coding regions are known as genes and contain the information necessary for a cell to make proteins. . . . Non-protein coding regions . . . are not related directly to making proteins, [and] have been referred to as "junk" DNA.' The adjective "junk" may mislead the lay person, for in fact this is the DNA region used with near certainty to identify a person.[84]

See also

References

  1. ^ Neander, Karen (June 1991). "Functions as Selected Effects: The Conceptual Analyst's Defense". Philosophy of Science. 58 (2): 168–184. doi:10.1086/289610.
  2. ^ Millikan, Ruth Garrett (June 1989). "In Defense of Proper Functions". Philosophy of Science. 56 (2): 288–302. doi:10.1086/289488.
  3. ^ Carroll SB (July 2008). "Evo-devo and an expanding evolutionary synthesis: a genetic theory of morphological evolution". Cell. 134 (1): 25–36. doi:10.1016/j.cell.2008.06.030. PMID 18614008. S2CID 2513041.
  4. ^ Visel A, Rubin EM, Pennacchio LA (September 2009). "Genomic views of distant-acting enhancers". Nature. 461 (7261): 199–205. Bibcode:2009Natur.461..199V. doi:10.1038/nature08451. PMC 2923221. PMID 19741700.
  5. ^ Cusanelli E, Chartrand P (May 2014). "Telomeric noncoding RNA: telomeric repeat-containing RNA in telomere biology". Wiley Interdisciplinary Reviews: RNA. 5 (3): 407–19. doi:10.1002/wrna.1220. PMID 24523222. S2CID 36918311.
  6. ^ Basic virology (3rd ed.). Malden, MA: Blackwell Pub. 2008. ISBN 1-4051-4715-6.
  7. ^ "Centromere - an overview | ScienceDirect Topics". www.sciencedirect.com.
  8. ^ Narwade, Nitin; Patel, Sonal; Alam, Aftab; Chattopadhyay, Samit; Mittal, Smriti; Kulkarni, Abhijeet (22 August 2019). "Mapping of scaffold/matrix attachment regions in human genome: a data mining exercise". Nucleic Acids Research. pp. 7247–7261. doi:10.1093/nar/gkz562.
  9. ^ a b c Zheng D, Frankish A, Baertsch R, Kapranov P, Reymond A, Choo SW, Lu Y, Denoeud F, Antonarakis SE, Snyder M, Ruan Y, Wei CL, Gingeras TR, Guigó R, Harrow J, Gerstein MB (June 2007). "Pseudogenes in the ENCODE regions: consensus annotation, analysis of transcription, and evolution". Genome Research. 17 (6): 839–51. doi:10.1101/gr.5586307. PMC 1891343. PMID 17568002.
  10. ^ Lopez JV, Yuhki N, Masuda R, Modi W, O'Brien SJ (1994). "Numt, a recent transfer and tandem amplification of mitochondrial DNA to the nuclear genome of the domestic cat". Journal of Molecular Evolution. 39 (2): 174–190. doi:10.1007/bf00163806. PMID 7932781.
  11. ^ Marshall CR, Raff EC, Raff RA (December 1994). "Dollo's law and the death and resurrection of genes". Proceedings of the National Academy of Sciences of the United States of America. 91 (25): 12283–7. Bibcode:1994PNAS...9112283M. doi:10.1073/pnas.91.25.12283. PMC 45421. PMID 7991619.
  12. ^ Tutar Y (2012). "Pseudogenes". Comparative and Functional Genomics. 2012: 1–4. doi:10.1155/2012/424526. PMC 3352212. PMID 22611337.
  13. ^ a b Petrov DA, Hartl DL (2000). "Pseudogene evolution and natural selection for a compact genome". The Journal of Heredity. 91 (3): 221–7. doi:10.1093/jhered/91.3.221. PMID 10833048.
  14. ^ Kelemen, Olga; Convertini, Paolo; Zhang, Zhaiyi; Wen, Yuan; Shen, Manli; Falaleeva, Marina; Stamm, Stefan (1 February 2013). "Function of alternative splicing". Gene. pp. 1–30. doi:10.1016/j.gene.2012.07.083.
  15. ^ a b c Nielsen H, Johansen SD (2009). "Group I introns: Moving in new directions". RNA Biology. 6 (4): 375–83. doi:10.4161/rna.6.4.9334. PMID 19667762. S2CID 30342385.
  16. ^ Martin, Munoz-Lopez; Jose, L. Garcia-Perez (31 March 2010). "DNA Transposons: Nature and Applications in Genomics". Current Genomics. pp. 115–128. doi:10.2174/138920210790886871.
  17. ^ "Adenovirus | CDC". www.cdc.gov. 16 March 2021.
  18. ^ Bensasson, Douda; Feldman, Marcus W.; Petrov, Dmitri A. (1 September 2003). "Rates of DNA Duplication and Mitochondrial DNA Insertion in the Human Genome". Journal of Molecular Evolution. pp. 343–354. doi:10.1007/s00239-003-2485-7.
  19. ^ Pennisi E (September 2012). "Genomics. ENCODE project writes eulogy for junk DNA". Science. 337 (6099): 1159–1161. doi:10.1126/science.337.6099.1159. PMID 22955811.
  20. ^ Shabalina, Svetlana A.; Spiridonov, Nikolay A. (25 March 2004). "The mammalian transcriptome and the function of non-coding DNA sequences". Genome Biology. p. 105. doi:10.1186/gb-2004-5-4-105.
  21. ^ Shabalina, Svetlana A.; Spiridonov, Nikolay A. (25 March 2004). "The mammalian transcriptome and the function of non-coding DNA sequences". Genome Biology. p. 105. doi:10.1186/gb-2004-5-4-105.
  22. ^ Michel, Jean-Baptiste; Shen, Yuan Kui; Aiden, Aviva Presser; Veres, Adrian; Gray, Matthew K.; Pickett, Joseph P.; Hoiberg, Dale; Clancy, Dan; Norvig, Peter; Orwant, Jon; Pinker, Steven; Nowak, Martin A.; Aiden, Erez Lieberman (14 January 2011). "Quantitative analysis of culture using millions of digitized books". Science (New York, N.Y.). pp. 176–182. doi:10.1126/science.1199644.
  23. ^ "Talking Glossary of Genetic Terms | NHGRI". www.genome.gov.
  24. ^ Moran, L. "What's In your genome? - the pie chart".
  25. ^ "Talking Glossary of Genetic Terms | NHGRI". www.genome.gov.
  26. ^ "Talking Glossary of Genetic Terms | NHGRI". www.genome.gov.
  27. ^ "Talking Glossary of Genetic Terms | NHGRI". www.genome.gov.
  28. ^ The ENCODE Project Consortium (September 2012). "An integrated encyclopedia of DNA elements in the human genome". Nature. 489 (7414): 57–74. Bibcode:2012Natur.489...57T. doi:10.1038/nature11247. PMC 3439153. PMID 22955616.
  29. ^ a b Costa F (2012). "7 Non-coding RNAs, Epigenomics, and Complexity in Human Cells". In Morris KV (ed.). Non-coding RNAs and Epigenetic Regulation of Gene Expression: Drivers of Natural Selection. Caister Academic Press. ISBN 978-1904455943.
  30. ^ a b Carey M (2015). Junk DNA: A Journey Through the Dark Matter of the Genome. Columbia University Press. ISBN 9780231170840.
  31. ^ McKie R (24 February 2013). "Scientists attacked over claim that 'junk DNA' is vital to life". The Observer.
  32. ^ Eddy SR (November 2012). "The C-value paradox, junk DNA and ENCODE". Current Biology. 22 (21): R898–9. doi:10.1016/j.cub.2012.10.002. PMID 23137679. S2CID 28289437.
  33. ^ Doolittle WF (April 2013). "Is junk DNA bunk? A critique of ENCODE". Proceedings of the National Academy of Sciences of the United States of America. 110 (14): 5294–300. Bibcode:2013PNAS..110.5294D. doi:10.1073/pnas.1221376110. PMC 3619371. PMID 23479647.
  34. ^ Palazzo AF, Gregory TR (May 2014). "The case for junk DNA". PLOS Genetics. 10 (5): e1004351. doi:10.1371/journal.pgen.1004351. PMC 4014423. PMID 24809441.
  35. ^ Graur D, Zheng Y, Price N, Azevedo RB, Zufall RA, Elhaik E (2013). "On the immortality of television sets: "function" in the human genome according to the evolution-free gospel of ENCODE". Genome Biology and Evolution. 5 (3): 578–90. doi:10.1093/gbe/evt028. PMC 3622293. PMID 23431001.
  36. ^ Ponting CP, Hardison RC (November 2011). "What fraction of the human genome is functional?". Genome Research. 21 (11): 1769–76. doi:10.1101/gr.116814.110. PMC 3205562. PMID 21875934.
  37. ^ a b Kellis M, Wold B, Snyder MP, Bernstein BE, Kundaje A, Marinov GK, et al. (April 2014). "Defining functional DNA elements in the human genome". Proceedings of the National Academy of Sciences of the United States of America. 111 (17): 6131–8. Bibcode:2014PNAS..111.6131K. doi:10.1073/pnas.1318948111. PMC 4035993. PMID 24753594.
  38. ^ Rands CM, Meader S, Ponting CP, Lunter G (July 2014). "8.2% of the Human genome is constrained: variation in rates of turnover across functional element classes in the human lineage". PLOS Genetics. 10 (7): e1004525. doi:10.1371/journal.pgen.1004525. PMC 4109858. PMID 25057982.
  39. ^ Mattick JS (2013). "The extent of functionality in the human genome". The HUGO Journal. 7 (1): 2. doi:10.1186/1877-6566-7-2. PMC 4685169.
  40. ^ Morris K, ed. (2012). Non-Coding RNAs and Epigenetic Regulation of Gene Expression: Drivers of Natural Selection. Norfolk, UK: Caister Academic Press. ISBN 978-1904455943.
  41. ^ a b "Worlds Record Breaking Plant: Deletes its non-coding "Junk" DNA". Design & Trend. May 12, 2013. Retrieved 2013-06-04.
  42. ^ a b Elgar G, Vavouri T (July 2008). "Tuning in to the signals: non-coding sequence conservation in vertebrate genomes". Trends in Genetics. 24 (7): 344–52. doi:10.1016/j.tig.2008.04.005. PMID 18514361.
  43. ^ Thomas CA (1971). "The genetic organization of chromosomes". Annual Review of Genetics. 5: 237–56. doi:10.1146/annurev.ge.05.120171.001321. PMID 16097657.
  44. ^ Gregory TR, Hebert PD (April 1999). "The modulation of DNA content: proximate causes and ultimate consequences". Genome Research. 9 (4): 317–24. doi:10.1101/gr.9.4.317. PMID 10207154.
  45. ^ Ohno S (1972). Smith HH (ed.). "So much "junk" DNA in our genome". Brookhaven Symposia in Biology. 23. Gordon and Breach, New York: 366–70. PMID 5065367. Retrieved 2013-05-15.
  46. ^ Waterhouse PM, Hellens RP (April 2015). "Plant biology: Coding in non-coding RNAs". Nature. 520 (7545): 41–2. Bibcode:2015Natur.520...41W. doi:10.1038/nature14378. PMID 25807488. S2CID 205243381.
  47. ^ Ponicsan SL, Kugel JF, Goodrich JA (April 2010). "Genomic gems: SINE RNAs regulate mRNA production". Current Opinion in Genetics & Development. 20 (2): 149–55. doi:10.1016/j.gde.2010.01.004. PMC 2859989. PMID 20176473.
  48. ^ Häsler J, Samuelsson T, Strub K (July 2007). "Useful 'junk': Alu RNAs in the human transcriptome". Cellular and Molecular Life Sciences (Submitted manuscript). 64 (14): 1793–800. doi:10.1007/s00018-007-7084-0. PMID 17514354. S2CID 5938630.
  49. ^ Walters RD, Kugel JF, Goodrich JA (August 2009). "InvAluable junk: the cellular impact and function of Alu and B2 RNAs". IUBMB Life. 61 (8): 831–7. doi:10.1002/iub.227. PMC 4049031. PMID 19621349.
  50. ^ Nelson PN, Hooley P, Roden D, Davari Ejtehadi H, Rylance P, Warren P, Martin J, Murray PG (October 2004). "Human endogenous retroviruses: transposable elements with potential?". Clinical and Experimental Immunology. 138 (1): 1–9. doi:10.1111/j.1365-2249.2004.02592.x. PMC 1809191. PMID 15373898.
  51. ^ Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, et al. (February 2001). "Initial sequencing and analysis of the human genome". Nature. 409 (6822): 860–921. Bibcode:2001Natur.409..860L. doi:10.1038/35057062. PMID 11237011.
  52. ^ Piegu B, Guyot R, Picault N, Roulin A, Sanyal A, Saniyal A, Kim H, Collura K, Brar DS, Jackson S, Wing RA, Panaud O (October 2006). "Doubling genome size without polyploidization: dynamics of retrotransposition-driven genomic expansions in Oryza australiensis, a wild relative of rice". Genome Research. 16 (10): 1262–9. doi:10.1101/gr.5290206. PMC 1581435. PMID 16963705.
  53. ^ Hawkins JS, Kim H, Nason JD, Wing RA, Wendel JF (October 2006). "Differential lineage-specific amplification of transposable elements is responsible for genome size variation in Gossypium". Genome Research. 16 (10): 1252–61. doi:10.1101/gr.5282906. PMC 1581434. PMID 16954538.
  54. ^ Ludwig MZ (December 2002). "Functional evolution of non-coding DNA". Current Opinion in Genetics & Development. 12 (6): 634–9. doi:10.1016/S0959-437X(02)00355-6. PMID 12433575.
  55. ^ a b Cobb J, Büsst C, Petrou S, Harrap S, Ellis J (April 2008). "Searching for functional genetic variants in non-coding DNA". Clinical and Experimental Pharmacology & Physiology. 35 (4): 372–5. doi:10.1111/j.1440-1681.2008.04880.x. PMID 18307723. S2CID 2000913.
  56. ^ Khurana E, Fu Y, Colonna V, Mu XJ, Kang HM, Lappalainen T, et al. (October 2013). "Integrative annotation of variants from 1092 humans: application to cancer genomics". Science. 342 (6154): 1235587. doi:10.1126/science.1235587. hdl:11858/00-001M-0000-0019-02F5-1. PMC 3947637. PMID 24092746.
  57. ^ Lu YF, Mauger DM, Goldstein DB, Urban TJ, Weeks KM, Bradrick SS (November 2015). "IFNL3 mRNA structure is remodeled by a functional non-coding polymorphism associated with hepatitis C virus clearance". Scientific Reports. 5: 16037. Bibcode:2015NatSR...516037L. doi:10.1038/srep16037. PMC 4631997. PMID 26531896.
  58. ^ Grünewald TG, Bernard V, Gilardi-Hebenstreit P, Raynal V, Surdez D, Aynaud MM, et al. (September 2015). "Chimeric EWSR1-FLI1 regulates the Ewing sarcoma susceptibility gene EGR2 via a GGAA microsatellite". Nature Genetics. 47 (9): 1073–8. doi:10.1038/ng.3363. PMC 4591073. PMID 26214589.
  59. ^ Subirana JA, Messeguer X (March 2010). "The most frequent short sequences in non-coding DNA". Nucleic Acids Research. 38 (4): 1172–81. doi:10.1093/nar/gkp1094. PMC 2831315. PMID 19966278.
  60. ^ Ahnert SE, Fink TM, Zinovyev A (June 2008). "How much non-coding DNA do eukaryotes require?". Journal of Theoretical Biology. 252 (4): 587–92. arXiv:q-bio/0611047. doi:10.1016/j.jtbi.2008.02.005. PMID 18384817. S2CID 1717725.
  61. ^ Smith MA, Gesell T, Stadler PF, Mattick JS (September 2013). "Widespread purifying selection on RNA structure in mammals". Nucleic Acids Research. 41 (17): 8220–36. doi:10.1093/nar/gkt596. PMC 3783177. PMID 23847102.
  62. ^ Dileep V (2009). "The place and function of non-coding DNA in the evolution of variability". Hypothesis. 7 (1): e7.
  63. ^ a b Yong SY, Raben TG, Lello L, Hsu SD (2020-07-21). "Genetic Architecture of Complex Traits and Disease Risk Predictors". Scientific Reports. 10 (1): 12055. Bibcode:2020NatSR..1012055Y. doi:10.1038/s41598-020-68881-8. PMC 7374622. PMID 32694572.
  64. ^ Wray NR, Kemper KE, Hayes BJ, Goddard ME, Visscher PM (April 2019). "Complex Trait Prediction from Genome Data: Contrasting EBV in Livestock to PRS in Humans". Genetics. 211 (4): 1131–1141. doi:10.1534/genetics.119.301859. PMC 6456317. PMID 30967442.
  65. ^ Pennacchio LA, Bickmore W, Dean A, Nobrega MA, Bejerano G (2013-03-18). "Enhancers: five essential questions". Nature Reviews Genetics. 14 (4): 288–295. doi:10.1038/nrg3458. PMC 4445073. PMID 23503198.
  66. ^ a b Callaway E (March 2010). "Junk DNA gets credit for making us who we are". New Scientist.
  67. ^ Carroll SB, Prud'homme B, Gompel N (May 2008). "Regulating evolution". Scientific American. 298 (5): 60–7. Bibcode:2008SciAm.298e..60C. doi:10.1038/scientificamerican0508-60. PMID 18444326.
  68. ^ Stojic L, Niemczyk M, Orjalo A, Ito Y, Ruijter AE, Uribe-Lewis S, Joseph N, Weston S, Menon S, Odom DT, Rinn J, Gergely F, Murrell A (February 2016). "Transcriptional silencing of long noncoding RNA GNG12-AS1 uncouples its transcriptional and product-related functions". Nature Communications. 7: 10406. Bibcode:2016NatCo...710406S. doi:10.1038/ncomms10406. PMC 4740813. PMID 26832224.
  69. ^ Latchman DS (December 1997). "Transcription factors: an overview". The International Journal of Biochemistry & Cell Biology. 29 (12): 1305–12. doi:10.1016/S1357-2725(97)00085-X. PMC 2002184. PMID 9570129.
  70. ^ Karin M (February 1990). "Too many transcription factors: positive and negative interactions". The New Biologist. 2 (2): 126–31. PMID 2128034.
  71. ^ "What makes us human? The answer may be found in overlooked DNA". Cell Press. Retrieved 15 November 2021.
  72. ^ Johansson PA, Brattås PL, Douse CH, Hsieh P, Adami A, Pontis J, Grassi D, Garza R, Sozzi E, Cataldo R, Jönsson ME, Atacho DAM, Pircs K, Eren F, Sharma Y, Johansson J, Fiorenzano A, Parmar M, Fex M, Trono D, Eichler EE, Jakobsson J (7 October 2021). "A cis-acting structural variation at the ZNF558 locus controls a gene regulatory network in human brain development". Cell Stem Cell. doi:10.1016/j.stem.2021.09.008. ISSN 1934-5909.
  73. ^ Lewin B (1990). Genes IV (4th ed.). Oxford: Oxford University Press. pp. 243–58. ISBN 978-0-19-854267-4.
  74. ^ Blackwood EM, Kadonaga JT (July 1998). "Going the distance: a current view of enhancer action". Science. 281 (5373): 60–3. Bibcode:1998Sci...281...60.. doi:10.1126/science.281.5373.60. PMID 9679020. S2CID 11666739.
  75. ^ Maston GA, Evans SK, Green MR (2006). "Transcriptional regulatory elements in the human genome". Annual Review of Genomics and Human Genetics. 7: 29–59. doi:10.1146/annurev.genom.7.080505.115623. PMID 16719718. S2CID 12346247.
  76. ^ "Analysis of Biological Networks: Transcriptional Networks – Promoter Sequence Analysis" (PDF). Tel Aviv University. Retrieved 30 December 2012.
  77. ^ Burgess-Beusse B, Farrell C, Gaszner M, Litt M, Mutskov V, Recillas-Targa F, Simpson M, West A, Felsenfeld G (December 2002). "The insulation of genes from external enhancers and silencing chromatin". Proceedings of the National Academy of Sciences of the United States of America. 99 Suppl 4: 16433–7. Bibcode:2002PNAS...9916433B. doi:10.1073/pnas.162342499. PMC 139905. PMID 12154228.
  78. ^ Max EE. "Plagiarized Errors and Molecular Genetics". TalkOrigins Archive.
  79. ^ Balakirev ES, Ayala FJ (2003). "Pseudogenes: are they "junk" or functional DNA?". Annual Review of Genetics. 37: 123–51. doi:10.1146/annurev.genet.37.040103.103949. PMID 14616058. S2CID 24683075.
  80. ^ a b Levy A (16 October 2019). "How evolution builds genes from scratch - Scientists long assumed that new genes appear when evolution tinkers with old ones. It turns out that natural selection is much more creative". Nature. 574 (7778): 314–316. doi:10.1038/d41586-019-03061-x. PMID 31619796. S2CID 204707405.
  81. ^ Peng CK, Buldyrev SV, Goldberger AL, Havlin S, Sciortino F, Simons M, Stanley HE (March 1992). "Long-range correlations in nucleotide sequences". Nature. 356 (6365): 168–70. Bibcode:1992Natur.356..168P. doi:10.1038/356168a0. PMID 1301010. S2CID 4334674.
  82. ^ Li W, Kaneko K (1992). "Long-Range Correlation and Partial 1/f^α Spectrum in a Non-Coding DNA Sequence" (PDF). Europhys. Lett. 17 (7): 655–660. Bibcode:1992EL.....17..655L. CiteSeerX 10.1.1.590.5920. doi:10.1209/0295-5075/17/7/014.
  83. ^ Buldyrev SV, Goldberger AL, Havlin S, Mantegna RN, Matsa ME, Peng CK, Simons M, Stanley HE (May 1995). "Long-range correlation properties of coding and noncoding DNA sequences: GenBank analysis". Physical Review E. 51 (5): 5084–91. Bibcode:1995PhRvE..51.5084B. doi:10.1103/PhysRevE.51.5084. PMID 9963221.
  84. ^ a b Slip opinion for Maryland v. King from the U.S. Supreme Court. Retrieved 2013-06-04.

Further reading

External links


Category:DNA Category:Gene expression