Science & Technology

‘An individual’s genome carries information shaped over millennia of evolutionary history’

Down To Earth spoke with Aryn P Wilder, co-author of one of 11 research papers providing insights into mammalian evolution

 
By Rohini Krishnamurthy
Published: Friday 28 April 2023

The research papers aligned and compared the genomes of 240 mammal species. Photo: iStock

Zoonomia, an international collaboration focused on uncovering new ways of understanding mammalian evolution and humans, released 11 research papers on April 27, 2023 that provided insights into mammalian evolution.

One paper used genetic data to predict extinction risk. Down To Earth spoke with Aryn P Wilder, a researcher from San Diego Zoo Wildlife Alliance, who is also the paper’s corresponding author. Here are edited excerpts from the interview. 

Rohini Krishnamurthy: How did this study come about and what inspired the idea of using genomic data to predict extinction risk? 

 

Aryn Wilder: There has been this general idea that this [genome] information will be helpful for conservation, but with this study, we wanted to test that idea and quantify how well it works. 

RK: How did you estimate the extinction risk? What factors did you consider?

AW: We aligned and compared the genomes of 240 mammal species. We considered one individual per species and then lined them up.

This allowed us to look systematically across the genomes. We looked for positions in the genome that seem conserved across species. These regions have not changed much. 
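As a rough illustration of what "conserved across species" means, the sketch below scans a toy alignment and reports the columns where every sequence carries the same base. It is a simplified stand-in, not the method used in the study; the function name `conserved_columns`, the `min_identity` threshold and the four short sequences are all invented for this example.

```python
from collections import Counter


def conserved_columns(alignment, min_identity=1.0):
    """Return indices of alignment columns whose most common base is
    shared by at least `min_identity` of the sequences (gaps excluded)."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    conserved = []
    for i in range(length):
        column = [seq[i] for seq in alignment]
        base, count = Counter(column).most_common(1)[0]
        # A column dominated by gaps is not treated as conserved.
        if base != "-" and count / len(column) >= min_identity:
            conserved.append(i)
    return conserved


# Hypothetical toy data: one aligned sequence per "species".
toy_alignment = [
    "ACGTAC",
    "ACGTTC",
    "ACGAAC",
    "ACGTAC",
]

print(conserved_columns(toy_alignment))        # [0, 1, 2, 5]
print(conserved_columns(toy_alignment, 0.75))  # [0, 1, 2, 3, 4, 5]
```

Lowering `min_identity` relaxes the definition, so positions that differ in only one of the four toy sequences also count as conserved.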


We assume that a position in the genome that is similar across species is likely to be under strong selection to stay that way and that it carries out an important biological function. 

We also assume that any mutation at that conserved position is harmful. We also looked at protein-coding genes (genes that carry instructions to make proteins). 

From mouse studies, we picked up mutations in protein-coding genes that could cause lethality or be harmful. Then we assume that those are also likely to be really dangerous. This helped us estimate the genetic load, a measure of the total number of harmful or potentially harmful mutations across the genome. 

We also looked at genetic diversity (variation) across the genomes of all 240 species, as well as population history, measuring effective population size over time. We used a computer model that traces how and when mutations were likely to have arisen. 

Based on that, the programme tells us the population size over time, from 10,000 years ago to tens of millions of years in the past.

Three factors were considered: genetic diversity, historical effective population size and genetic load. We looked at how these factors are correlated with each other and with each species’ conservation status as classified by the International Union for Conservation of Nature’s (IUCN) Red List. 

RK: How were the factors correlated with each other?

AW: An individual’s genome carries that information shaped over millions of years of evolutionary history.

We found that effective population size, for example, is correlated with heterozygosity (a measure of genetic variability). Larger populations tend to have more genetic diversity, which is something that we would expect.
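For readers unfamiliar with the term, heterozygosity in a single genome can be thought of as the fraction of sites where the two copies an individual inherited differ. The sketch below shows that arithmetic on made-up data; the function name and the five toy genotypes are assumptions for illustration and do not come from the study.

```python
def heterozygosity(genotypes):
    """Fraction of genotyped sites where the two allele copies differ."""
    if not genotypes:
        raise ValueError("need at least one genotyped site")
    het_sites = sum(1 for first, second in genotypes if first != second)
    return het_sites / len(genotypes)


# Hypothetical genotypes at five sites in one diploid individual.
toy_genotypes = [
    ("A", "A"),
    ("C", "T"),  # heterozygous
    ("G", "G"),
    ("T", "T"),
    ("A", "G"),  # heterozygous
]

print(heterozygosity(toy_genotypes))  # 0.4
```

Two of the five toy sites are heterozygous, giving 0.4. Real genomes contain far more sites, and the fraction is typically much lower.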

Historically small populations also tended to have higher burdens of genetic load. And that fits well with theoretical predictions — so we confirmed expectations. 

We used machine learning — a type of artificial intelligence. Machine learning models based on genomic factors (historical effective population size, genetic load and genetic diversity) were compared to those based on ecological variables like body size, geographic home range and diet.

Models based on ecological variables also do a good job of predicting IUCN status. But collecting ecological variables can be really difficult. 

For example, collecting gestation length and litter size information is time-consuming and laborious. But a genome can be sequenced relatively quickly and cheaply these days.

RK: How accurate are predictions based on genetic information?

AW: We found that the genomic information from a single individual isn’t as good as ecological models at predicting extinction risk. However, the difference is small.


This suggests that a genome can be really valuable when we don’t have enough ecological information. To measure accuracy, we used a metric called area under the receiver operating characteristic curve (AUROC). 

It is a metric of how often the model is right versus wrong. A model with an AUROC of 0.5 just predicts at random, and a model with an AUROC of 1 has perfect accuracy.

The models that had only ecological variables had an AUROC of 0.88. The scores of genomic models ranged from 0.69 up to 0.82.
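To make the AUROC scale concrete: the score equals the probability that a randomly chosen positive case (here, imagine a threatened species) receives a higher model score than a randomly chosen negative case, with ties counted as half. The sketch below computes it directly from that definition. The labels and scores are invented for illustration and are unrelated to the study's data or models.

```python
def auroc(labels, scores):
    """AUROC from its pairwise definition: the share of (positive, negative)
    pairs in which the positive case gets the higher score (ties count 0.5)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical example: 1 = threatened, 0 = not threatened.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]

print(round(auroc(toy_labels, toy_scores), 2))  # 0.89
```

In this toy case, eight of the nine threatened/non-threatened pairs are ranked correctly, so the score is 8/9, or about 0.89. A model that gave every species the same score would land at 0.5.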

RK: Your study suggested that historical demography is an important indicator of the resilience of the current species. Could you elaborate on that?

AW: We hypothesise that species with historically small populations in the distant past also have small populations now. And just by having small population sizes, the species is much more vulnerable to environmental shifts, over-exploitation, or climate change. They lose resiliency just for that reason. 

RK: How can genetic data support conservation measures moving forward? 

AW: The IUCN classifies more than 20,000 species as data deficient. So there is no adequate data to categorise their extinction risk or put them into one of these conservation categories. 

The IUCN assessment process is lengthy, costly and complex. So we need a way to triage the species that are data deficient and prioritise the ones that are most likely to have high conservation needs and need intervention immediately.

By themselves, genomic models don’t give an end-all conservation status. We can’t classify animals as endangered based on that. Instead, we can use genetic data to say that this species has a high risk, so now we need to follow up and determine its conservation status.

RK: You have analysed 240 mammals. Do you plan to expand this in the future?

AW: Yes. We want to include more mammalian species and also species outside of mammals. We also hope to study multiple individuals per species. This may improve the predictions of extinction risks.


RK: What challenges must be addressed before scaling this mode of predicting extinction risks?

AW: The computation time and the computational power required for this kind of analysis are a challenge. It takes a lot of computation time on supercomputers. 

We hope to streamline that process and determine where we can take shortcuts.

For example, should we look at all genetic variants or only a subset of them across the genome? Do we need to align and compare multiple species every time we add a new species, or can we use a shortcut and add species to the existing alignments? These questions need to be worked out before the approach becomes widely available.
