In his 2011 book The Myth of Junk DNA, Jonathan Wells called the notion of junk DNA a science-stopper. Noting discoveries already made by 2011, he said these are “exciting times,” predicting that ongoing research would continue to discover functions that were not yet imagined. As the following papers show, he was right.
In PNAS, Dongyin Guan and Mitchell A. Lazar commented on work by Fei et al., concluding that “noncoding mutations in enhancers and other, less well-characterized, TF [transcription factor] binding regions also have large effects on cell survival and proliferation.” Their review, “Shining light on dark matter in the genome,” begins,
The complexity of multicellular organisms requires the genome to be transcribed in a cell-type-dependent manner that is responsive to signals, such as hormones, from the internal environment. This is mediated by the epigenome, which decorates and organizes the genome in a web of modified histone proteins functioning in nucleosomes and chemical modifications to genomic DNA arranged 3-dimensionally in the cell nucleus. Functional features of the epigenome such as acetylation of histone lysine residues are “read” by specialized proteins such as those containing bromodomains. Likewise, the genome itself is read by proteins known as sequence-specific transcription factors (TFs), which recognize and bind to specific motifs in genomic DNA. The totality of these sites for a given transcription factor in a given cell is known as its “cistrome”. Most of these binding sites occur in the ∼99% of the genome that does not encode for proteins. [Emphasis added.]
The work by Fei et al. showed a way forward in identifying the binding sites that TFs use in non-coding DNA. Binding sites are like handles. The handles on a tool are functional, even if they are not part of the working mechanism itself. It would be dangerous to break the handles off a chainsaw, letting it swing loose and cause damage. For similar reasons, the binding sites in enhancers and other regulatory regions need to remain intact for the TFs that bind to them. They must not be broken by mutations, even if the real work is done by tools coded by the genes. In another study, hundreds of enhancers were deleted. The research so far has shown that most enhancer sites, but not all, are located near the genes they regulate. It would be reckless, therefore, to expect deletion of a non-coding region to have no effect.
Together, these studies from 2 independent groups suggest the functional enhancer-promoter interaction is more proximally than distally regulated at the genome-wide level. It would be interesting to combine such analysis with an unbiased study of genome architecture to determine the proportion of functional enhancers that are required for long-range enhancer-promoter interactions and how this differs from that of the entire cistrome.
Reflecting on the recent progress in identifying function in non-coding regions, they add, “The power of this technique will increase as more whole-genome sequence information becomes available.”
Portions of DNA that can change location have been characterized as “selfish elements” that parasitize the genome. News from Tokyo Tech is questioning that paradigm, saying, “Not so selfish after all ― Key role of transposable elements in mammalian evolution.” While the researchers still subscribe to a “co-option” theory of TEs, they are surprised at how many they found to be useful to the “host” genome:
The human genome contains 4.5 million copies of transposable elements (TEs), so-called selfish DNA sequences capable of moving around the genome through cut-and-paste or copy-and-paste mechanisms. Accounting for 30-50% of all of the DNA in the average mammalian genome, these TEs have conventionally been viewed as genetic freeloaders, hitchhiking along in the genome without providing any benefit to the host organism. More recently, however, scientists have begun to uncover cases in which TE sequences have been co-opted by the host to provide a useful function, such as encoding part of a host protein.
Professor Hidenori Nishihara’s team found tens of thousands of “potentially co-opted TE sequences” in a comprehensive search for them in mammalian genes.
Surprisingly, 20-30% of all of the binding sites across the genome were located in TEs, with as many as 38,500 TEs containing at least one binding site. The majority of these were in a copy-and-paste type of TE known as a retrotransposon, which duplicates itself, leaving a new copy in a new location.
The TE-derived binding site sequences were more conserved across species than expected, indicating that they are being preserved by evolution because they serve some important function.
The evolutionary co-option tale may be set for a fall, because “it remains unclear how common this mode of TE-mediated regulatory network evolution is.” Design advocates could look at this data with non-Darwinian assumptions and conclude, with Paul Nelson, “If it works, it’s not happening by accident.”
Non-Coding DNA Solution Proposed
A review essay in BioEssays by Giorgio Bernardi has the intriguing title, “The Genomic Code: A Pervasive Encoding/Molding of Chromatin Structures and a Solution of the ‘Non‐Coding DNA’ Mystery.” The solution? Much of the non-coding DNA is involved in creating the architectural structure that the coding regions require in order to be accessible. These sequences build the “LADs” (lamina-associated domains) that connect to the nuclear envelope, and the “TADs” (topologically associating domains) that create boundaries in the genome. This may help explain the mystery of GC-rich parts of the genome, which are characterized by a high proportion of guanine-cytosine base pairs in the double helix. Bernardi says:
the genomic code, which is responsible for the pervasive encoding and molding of primary chromatin domains (LADs and primary TADs, namely the “gene spaces”/”spatial compartments”) resolves the longstanding problems of “non‐coding DNA,” “junk DNA,” and “selfish DNA” leading to a new vision of the genome as shaped by DNA sequences.
Bernardi revels in the succeeding waves of discovery that unraveled the “mystery” of non-coding DNA. What were these long stretches of compositionally similar DNA, called isochores, doing? Why were they there? It took 66 years, but Bernardi seems pleased that molecular biology has finally arrived at a “complete picture of the genomic code, including the crucial connection with gene spaces/spatial compartments that the genes rely on for their function.” Junk DNA has thereby been promoted from a parasitic burden to an essential part of the genome.
Three main explanations were put forward for the existence of non‐coding DNA, a problem deserving of the term “mystery,” given that it had withstood 50 years of probing. Ohno, mostly focusing on pseudo‐genes, proposed that non‐coding DNA was “junk DNA.” Doolittle and Sapienza and Orgel and Crick suggested the idea of “selfish DNA,” mainly involving transposons visualized as molecular parasites rather than having an adaptive function for their hosts. In contrast, the ENCODE project claimed that the majority (≈80%) of the genome participated “in at least one biochemical RNA‐ and/or chromatin‐associated event in at least one cell type.” This claim, however, was rejected, mainly because of the loose definition of “functional” elements, in favor of the view that “junk DNA” or “selfish DNA” correspond to 80-95% of the human genome.
At first sight, the pervasive involvement of isochores in the formation of chromatin domains and spatial compartments seems to leave little or no room for “junk” or “selfish” DNA. However, one should now consider that coding sequences are compositionally correlated with the isochores in which they are located (the compositional constraints of the latter depending upon the need to encode/mold chromatin structures as already mentioned) and yet they are expressed. This indicates that there is no problem for transposons to be, on the one hand, compositionally correlated with the “host” isochores, and on the other, to be active. Needless to say, this view also leads to an understanding of the “overlapping” transcription of long non‐coding RNAs that originate from the majority of DNA sequences and that plays important roles.
Let the evolutionists relegate their selfish/junk terminology to the dustbin of history. Anything with an important role is not junk. ID advocates predicted that the vast non-coding portions of DNA would prove to be functional, and they were right.