The parts of the human genome that control when and where genes are turned on have been successfully identified.
The map created with this information will be a shot in the arm for researchers trying to understand and interpret genetic changes linked to human diseases.
The results are published in Nature today (Oct 13).
This became possible by comparing the sequences of 29 mammalian genomes. The mammals studied include chimpanzees, rhesus monkeys, mice, dogs, rabbits, rats, cats, squirrels, fruit bats, horses, cows, and even elephants.
The authors were able to detect highly conserved regions of the genome across all 29 mammals studied. These regions have remained the same across species for a very long time.
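To make the idea of a "conserved region" concrete, here is a minimal sketch in Python. It is not the authors' method, which relies on far more sophisticated statistical analysis; it only assumes that the sequences have already been aligned and flags stretches where every species carries the same base. The function name `conserved_runs`, the `min_len` threshold and the toy sequences are all made up for illustration.

```python
def conserved_runs(alignment, min_len=3):
    """Return (start, end) column ranges, end exclusive, where all sequences agree."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    runs = []
    start = None  # first column of the run currently being tracked
    for col in range(length):
        column = {seq[col] for seq in alignment}
        # A column counts as conserved if it holds one base and no gap character.
        identical = len(column) == 1 and "-" not in column
        if identical and start is None:
            start = col
        elif not identical and start is not None:
            if col - start >= min_len:
                runs.append((start, col))
            start = None
    # Close a run that extends to the end of the alignment.
    if start is not None and length - start >= min_len:
        runs.append((start, length))
    return runs


# Hypothetical aligned sequences from three species (illustration only).
toy_alignment = [
    "ACGTTGCAAGCT",
    "ACGTAGCAAGGT",
    "ACGTCGCAAGAT",
]
runs = conserved_runs(toy_alignment)
conserved_fraction = sum(end - start for start, end in runs) / len(toy_alignment[0])
```

In this toy case two stretches are shared by all three sequences, covering 9 of the 12 positions. Real genomes differ mainly in scale and in the statistics needed to tell true conservation from chance.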
These highly conserved regions make up nearly 4 per cent of the human genome. The authors were also able to ascribe potential functions to around 60 per cent of the bases found in the conserved regions. The comparative study has helped show how regulatory controls have remained the same across mammals.
By comparison, only about 1.5 per cent of the human genome was found to encode protein sequence when the genome was studied in isolation. A comparison with the genomes of mouse, rat and dog, however, suggested that at least 5 per cent of the human genome is probably functional.
A very interesting offshoot of this study is the confidence with which scientists can now trace how evolution over more than 100 million years has contributed to adaptation to different environments and lifestyles.
For instance, they were able to pinpoint specific proteins that are evolving rapidly, such as those involved in the immune system, taste perception and cell division. Even protein domains within genes, such as those related to bone remodelling and retinal functions, were found to be evolving rapidly.
Of special interest is the finding that certain DNA controls have been changing only in human and primate genomes.
Whereas scientists had earlier identified 200 such regions, the latest study expands the list to more than 1,000. This will help in understanding human evolution.
The study is particularly relevant to understanding the genetic variants, or mutations, closely tied to certain diseases. Disease can result when such mutations disrupt functional regions of the genome.
Surprisingly, most of these mutations have been found in the non-protein-coding regions of the genome. Without this comparative study, it would have been very difficult to identify disease-causing mutations in those regions.
“Sequencing of additional species should enable discovery of lineage-specific elements within mammalian clades and provide increased resolution for shared mammalian constraint,” the authors note.
The authors were also able to assign or suggest possible functions for more than half of the 360 million DNA letters present in the conserved elements. These regions have been carefully preserved across mammals for millions of years.
The authors now intend to sequence 100 to 200 mammalian species so as to achieve single-nucleotide resolution.
The biggest advantage of comparing the sequences of many mammals becomes apparent when interpreting the human genome.
Experimental studies that look for functional regions, for instance, require prior knowledge of the biochemical activity being sought.
But “comparative approaches provide an unbiased catalogue of shared functional regions independent of biochemical activity or condition,” the authors write. “It can thus capture experimentally intractable or rare activity patterns.”