Two decades ago, scientists presented the world with the first map of the human genome, a significant breakthrough. However, about 8% of human DNA remained unsequenced, making it an incomplete victory. Now, a multidisciplinary team has accounted for that 8% and completed the picture of the human genome. In 2020, the Genome India Project (GIP), an ambitious gene-mapping project, was approved by the Ministry of Science and Technology.
Understanding Genomes
A genome is the entirety of an organism’s genetic material. Human genomes are largely identical, with only minor differences in the deoxyribonucleic acid (DNA) from one individual to another. Every organism’s genetic makeup is contained in its DNA, the building block of life. The discovery of DNA’s double-helix structure by James Watson and Francis Crick in 1953 propelled efforts to decode how genes shape life, its traits, and the diseases they can cause. Each genome carries all the information needed to build and maintain the organism. In humans, a complete copy of the genome holds more than 3 billion DNA base pairs.
Distinguishing between Genomes and Genes
The Human Genome Project, which ran from 1990 to 2003, delivered the first human genome sequence in 2003. This international collaboration offered insights into the portion of the human genome known as euchromatin, which is gene-rich and contains the DNA that encodes proteins. The unsequenced 8% lay in regions called heterochromatin, which make up a smaller portion of the genome and do not produce proteins.
Challenges and Breakthroughs
Heterochromatin was given lower priority for two main reasons. First, it was considered “junk DNA” because it had no clear function. Second, euchromatin contained more genes and was easier to sequence with the technology available at the time. However, the Telomere-to-Telomere (T2T) project, a global collaboration, has now fully sequenced the genome using new methods of DNA sequencing and computational analysis.
Significance of the 8%
The newly sequenced reference genome, T2T-CHM13, includes highly repetitive DNA sequences found in and around telomeres (structures at the chromosome ends) and centromeres (the middle section of each chromosome). The sequence also reveals long stretches of DNA that are duplicated in the genome and play crucial roles in evolution and disease. The findings indicate a large number of genetic variations, predominantly within these repeated sequences. Even though they do not include active genes, these newly revealed regions contribute significantly to genomic functions.
Impact of this Breakthrough
A complete human genome will simplify the study of genetic variation among individuals and populations. Scientists can use it as a reference when studying the genomes of different individuals to identify potentially disease-causing variations. The T2T consortium used the now-complete genome sequence to discover over 2 million additional variants in the human genome. The new T2T reference genome will supplement the standard human reference genome, known as Genome Reference Consortium build 38 (GRCh38), which originated from the Human Genome Project and has been regularly updated.
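The idea of using a reference to spot variation can be sketched in a few lines of Python. This is a toy illustration only: it assumes the two sequences are already aligned and of equal length, which real variant-calling pipelines cannot assume, and the function name `find_variants` and both sequences are made up for the example.

```python
def find_variants(reference: str, sample: str) -> list[tuple[int, str, str]]:
    """Return (position, reference_base, sample_base) for each mismatch.

    Assumes both sequences are pre-aligned and the same length; only
    single-base substitutions are detected, not insertions or deletions.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (pos, ref_base, sample_base)
        for pos, (ref_base, sample_base) in enumerate(zip(reference, sample))
        if ref_base != sample_base
    ]


reference = "ACGTACGTAC"  # hypothetical reference fragment
sample = "ACGTTCGTAA"     # hypothetical fragment from one individual
variants = find_variants(reference, sample)
print(variants)  # [(4, 'A', 'T'), (9, 'C', 'A')]
```

A more complete reference matters here because a position can only be reported as a variant if the reference has a base at that position to compare against.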
Genome Sequencing in Agriculture and Disease Control
Genome sequencing techniques promise major advances in agriculture and disease control. For instance, they have been used to identify genetic markers for disease resistance and drought tolerance in diverse crop plants. This has sped up the development of new crop varieties and helped decipher host-pathogen relationships in crops. For example, Chinese scientists decoded the rice genome in 2002, and Indian Agricultural Research Institute (IARI) scientists used genome sequencing to develop better rice varieties. Genome sequencing also plays a role in combating diseases. The Cas9 protein, often mentioned in the news, is part of the CRISPR-Cas9 technology, which allows geneticists and medical researchers to edit parts of the genome by removing, adding, or altering sections of DNA.
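The three kinds of edit named above (removing, adding, and altering) can be pictured as simple operations on a string of bases. The Python sketch below is only an analogy: it does not model how Cas9 finds or cuts its target, and the helper names and the example sequence are hypothetical.

```python
def remove_section(dna: str, start: int, length: int) -> str:
    """Delete `length` bases beginning at index `start`."""
    return dna[:start] + dna[start + length:]


def add_section(dna: str, position: int, new_bases: str) -> str:
    """Insert `new_bases` so that they begin at index `position`."""
    return dna[:position] + new_bases + dna[position:]


def alter_section(dna: str, start: int, replacement: str) -> str:
    """Overwrite bases from `start` with `replacement` (same length)."""
    return dna[:start] + replacement + dna[start + len(replacement):]


dna = "AAACCCGGGTTT"  # made-up sequence
print(remove_section(dna, 3, 3))    # AAAGGGTTT
print(add_section(dna, 6, "TAT"))   # AAACCCTATGGGTTT
print(alter_section(dna, 3, "TTT")) # AAATTTGGGTTT
```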