Decoded script

GenomeIndia project creates genomic sequence database of 10,000 Indians, moving a step forward in disease detection and treatment

Imagine this. An Indian individual with elevated cholesterol levels is prescribed statins, a class of medications that block an enzyme crucial for making cholesterol. A year later, there is no change. The doctor knows that statins have been extensively assessed for efficacy only in western populations, and Indian genes may differ. A blood test indicates that the patient has a genetic mutation hindering statins’ effectiveness. The doctor then prescribes more suitable drugs based on the patient’s genetic profile.

Such personalised medical treatment could soon be possible. Scientists from 20 research institutions in the country have brought us closer to such a scenario. The group said on January 9 that the GenomeIndia project has successfully sequenced 10,074 DNA samples from healthy individuals, creating India’s largest genetic reference database so far. Analysis of 5,750 samples, as per the project website, found distinctive features in the DNA, including rare variations unique to Indians.

“By identifying genetic variants associated with diseases, the project will enable early diagnosis and prevention of genetic disorders,” says V Mohan, chairperson, Dr. Mohan’s Diabetes Specialities Centre and Madras Diabetes Research Foundation, Chennai. Although Mohan is not involved in the project, he says the database could help in research of rare monogenic forms of diabetes, caused by a change in a single gene. Genome sequencing also helps study why some population groups are more susceptible to specific diseases, says Raghu Padinjat, professor, National Centre for Biological Sciences, Bengaluru, and principal investigator of the project.

The project was conceptualised in 2017, to capture India’s genetic diversity. India has over 4,600 population groups, segregated by caste, tribe and religion. They differ in culture, location, climate, physical features, marriage practices, linguistics and genetic architecture. “Although 99.9 per cent of the genomes in humans are identical, 0.1 per cent is highly variable between people. This variable element is important for disease development and treatment responses,” Padinjat tells Down To Earth (DTE).

To create the reference database, the team needed DNA from healthy volunteers who were not on any medication, and whose family history did not show inter-marriage (outside the community). Work began in 2020, after the project received funds from the Centre’s Department of Biotechnology (DBT).

The first step was sample collection. In Maharashtra, for example, Mayurika Lahiri, associate professor, Indian Institute of Science Education and Research, Pune, and principal investigator of the project, set up camps in July-September 2021 (after delays due to the novel coronavirus pandemic). Volunteers shared their family history, and samples were tested for cholesterol, blood sugar levels, and so on. Similar drives took place in other states. Then came whole genome sequencing, which determines the order of nucleotides in an individual’s DNA and can catch variations in any part of the genome. The next step was analysis and annotation (designating locations of genes and other features), says Padinjat. The samples are now stored at the Centre for Brain Research, Indian Institute of Science campus, Bengaluru. The digital data is held at the Indian Biological Data Center, a repository at the Regional Centre for Biotechnology, Faridabad.

Indian genetics

While the GenomeIndia project is the largest so far, there have been earlier efforts, largely to boost representation of Indian genetic data. The first Indian genome was sequenced only in 2009, years after the Human Genome Project, an international initiative conducted in 1990-2003, sequenced the first human genome. In 2010, the Council of Scientific and Industrial Research (CSIR) launched the Indian Genome Variation Consortium to form a database of genomic variations in Austro-Asiatic, Tibeto-Burman, Indo-European and Dravidian linguistic groups. It sequenced single nucleotide polymorphisms (SNPs), variations that occur when a single nucleotide (a building block of DNA) differs. The team studied SNPs of 900 genes from over 1,800 people to show diversity in the Indian population.

In 2016, GenomeAsia 100K, a non-profit consortium, sequenced 100,000 Asian genomes, with nearly 600 from India. The database primarily included tribal groups and specific castes, mainly from southern India, reads a 2020 paper published in Nucleic Acids Research.

In 2019, CSIR launched the Indigene project that sequenced whole genomes of 1,029 individuals from different states, says Sridhar Sivasubbu, then chief scientist at CSIR-Institute of Genomics and Integrative Biology, Delhi. Sivasubbu, now with an oncology platform in Mumbai, and his team estimated 1 per cent (15 million) of Indians likely possess a genetic mutation predisposing them to genetic conditions like sudden cardiac death and treatable intellectual disabilities.

Privacy concerns

While tapping into genetic data has potential, it must be protected against misuse or discrimination.

Consider sickle cell disease. The inheritable disease is caused by a mutation in a gene affecting red blood cells, with symptoms like anaemia, fatigue and pain. It affects people of African, Central and South American, Mediterranean, West and South Asian descent. In the 1970s, several US states made it mandatory for African-Americans to test for the disease. Children who did not were not allowed to attend schools. This was seen as discrimination because other populations were not targeted for other debilitating diseases, reads a 2006 study in the Journal of Medical Ethics.

The US in 2008 introduced the Genetic Information Nondiscrimination Act, prohibiting genetic information discrimination in employment. The EU’s General Data Protection Regulation, 2016, also protects genetic data. France, Canada, Italy and Germany have similar measures, says a 2022 study in the European Journal of Human Genetics. “India, however, does not have such explicit regulation,” says Sivasubbu. Indigene created a freely available open database, Indigenome, which shares identity-masked aggregate data for research and development and, with written consent, for building products specific to India, he says. In 2021, DBT released Biotech-PRIDE (Promotion of Research and Innovation through Data Exchange) guidelines with norms for safety and quality of datasets and data access. The GenomeIndia database for now will be shared with researchers from renowned institutes, says Suchita Ninawe, a scientist with DBT. But scientists call for more legislation.

This was first published in the 16-28 February, 2025 print edition of Down To Earth
