Machine learning identifies mammal species with the


image: The model predicted a high zoonotic capacity for macaques, which are commonly traded and kept in zoos where there are many opportunities for close contact with humans.

Credit: Photo by Leng Cheng via Flickr.

Back-and-forth transmission of SARS-CoV-2 between humans and other mammals increases the risk of new variants and threatens efforts to control COVID-19. A study published today in Proceedings of the Royal Society B used a novel modeling approach to predict the zoonotic capacity of 5,400 mammal species, extending predictive capacity by an order of magnitude. Many of the species flagged as high risk live near people and in COVID-19 hotspots.

Limited data on ACE2, the cellular receptor that SARS-CoV-2 binds to in animals, is a major bottleneck in predicting which mammal species are high risk. ACE2 allows SARS-CoV-2 to enter host cells and is found in all major vertebrate groups. All vertebrates likely have ACE2 receptors, but sequences were available for only 326 species.

To overcome this hurdle, the team developed a machine learning model that combined data on the biological traits of 5,400 mammal species with the available data on ACE2. The objective: to identify mammal species with a high “zoonotic capacity”, that is, the capacity to become infected with SARS-CoV-2 and transmit it to other animals and to humans. The method could help extend predictive capacity to disease systems beyond COVID-19.
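The idea of extending labels from the few species with ACE2 sequences to species that have only trait data can be illustrated with a small sketch. This article does not describe the study's learning algorithm, so the example swaps in a plain k-nearest-neighbour vote as a stand-in. The trait names, values, labels, and the `predict_score` helper are all invented for illustration.

```python
from math import dist

# Hypothetical trait vectors: (log body mass, litter size, lifespan in years).
# Labels (1 = strong predicted ACE2 binding, 0 = weak) exist only for species
# whose ACE2 sequence is available. All values are made up.
labelled = {
    "species_a": ((1.2, 2.0, 12.0), 1),
    "species_b": ((1.0, 1.0, 15.0), 1),
    "species_c": ((-1.5, 6.0, 2.0), 0),
    "species_d": ((-1.2, 5.0, 3.0), 0),
    "species_e": ((0.9, 1.0, 20.0), 1),
}

# Species with trait data but no ACE2 sequence (also made up).
unlabelled = {
    "species_x": (1.1, 1.0, 14.0),
    "species_y": (-1.4, 7.0, 2.5),
}

def predict_score(traits, training, k=3):
    """Fraction of the k nearest labelled species that are strong binders.

    Traits are compared with unscaled Euclidean distance; a real analysis
    would standardise the traits first.
    """
    neighbours = sorted(training.values(), key=lambda item: dist(traits, item[0]))[:k]
    return sum(label for _, label in neighbours) / k

scores = {name: predict_score(traits, labelled) for name, traits in unlabelled.items()}
```

Here `scores` plays the role of a predicted zoonotic capacity for species that could not be labelled directly. A species whose traits resemble those of known strong binders receives a higher score.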

Co-lead author Ilya Fischhoff, postdoctoral associate at the Cary Institute of Ecosystem Studies, comments: “SARS-CoV-2, the virus that causes COVID-19, originated in an animal before it passed to humans. Now people have caused spillback infections in a variety of mammals, including those kept on farms, in zoos, and even in our homes. Knowing which mammals are able to re-infect us is essential to preventing spillback infections and dangerous new variants.”

When a virus passes from humans to animals and then back to humans, this is called secondary spillover. This phenomenon can accelerate the establishment of new variants in humans that are more virulent and less susceptible to vaccines. Secondary spillover of SARS-CoV-2 has already been reported among farmed mink in Denmark and the Netherlands, where it has led to at least one new variant of SARS-CoV-2.

Senior author and Cary Institute disease ecologist Barbara Han said, “Secondary spillover allows SARS-CoV-2 established in new hosts to transmit potentially more infectious strains to humans. Identifying the mammal species that effectively transmit SARS-CoV-2 is an important step in guiding surveillance and preventing the virus from continuously circulating between humans and other animals, which would make disease control even more expensive and difficult.”

Binding to ACE2 receptors is not always sufficient for SARS-CoV-2 to replicate, be shed, and be transmitted. The team trained their models on a conservative binding strength threshold informed by published ACE2 amino acid sequences from vertebrates, analyzed using a software tool called HADDOCK (High Ambiguity Driven protein-protein DOCKing). The software scored each species on predicted binding strength; stronger binding likely promotes successful infection and viral shedding.
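As a rough illustration of how a binding strength threshold turns docking scores into training labels, consider the sketch below. It assumes, as with HADDOCK scores, that more negative values indicate stronger predicted binding. The species names, the scores, the cut-off value, and the `label_binding` helper are hypothetical and are not taken from the study.

```python
# Hypothetical docking scores (arbitrary units); more negative is assumed
# to mean stronger predicted binding. The numbers are invented.
docking_scores = {
    "species_a": -135.0,
    "species_b": -120.5,
    "species_c": -142.3,
    "species_d": -110.0,
}

# Assumed cut-off for illustration only; not the threshold used in the study.
STRONG_BINDING_THRESHOLD = -130.0

def label_binding(scores, threshold):
    """Map each species to True if its score is at or below the threshold."""
    return {species: score <= threshold for species, score in scores.items()}

labels = label_binding(docking_scores, STRONG_BINDING_THRESHOLD)
strong_binders = sorted(s for s, is_strong in labels.items() if is_strong)
```

A more conservative (more negative) threshold labels fewer species as strong binders, which trades missed hosts for fewer false alarms.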

Cary Institute co-lead author and postdoctoral analyst Adrian Castellanos says: “The ACE2 receptor performs important functions and is common in vertebrates. It likely evolved in animals alongside other ecological and biological traits. By comparing the biological traits of species known to have the ACE2 receptor with the traits of other mammal species, we can make predictions about their ability to transmit SARS-CoV-2.”

This combined modeling approach predicted the zoonotic capacity of mammal species known to transmit SARS-CoV-2 with 72% accuracy and identified many additional mammal species that may transmit the virus. The predictions matched the results observed for white-tailed deer, mink, raccoon dogs, snow leopards and others. The model found that the riskiest mammal species are often those that live in disturbed landscapes and close to humans, including pets, livestock, and animals that are traded and hunted.

The top 10% of high-risk species spanned 13 orders. Primates were predicted to have the highest zoonotic capacity and the strongest viral binding among mammal groups. Water buffalo, bred for dairy production and agriculture, presented the highest risk among livestock. The model also predicted high zoonotic potential in mammals traded live, including macaques, Asian black bears, jaguars and pangolins, highlighting the risks posed by live markets and the wildlife trade.
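Rankings such as “the top 10%” or “the 90th percentile” can be read as percentile ranks over the predicted scores. The sketch below shows one common way to compute them, using invented scores for ten placeholder species. The `percentile_ranks` helper and its definition of rank (the share of species scoring at or below a given species) are assumptions for illustration, not the study's procedure.

```python
def percentile_ranks(scores):
    """Return each species' percentile rank: the share of species scoring at or below it."""
    values = list(scores.values())
    n = len(values)
    return {
        name: 100.0 * sum(v <= score for v in values) / n
        for name, score in scores.items()
    }

# Hypothetical predicted zoonotic capacity scores for ten placeholder species.
predicted = {f"species_{i}": i / 10 for i in range(1, 11)}

ranks = percentile_ranks(predicted)

# Species ranked above the 90th percentile, i.e. the top 10% under this definition.
top_decile = [name for name, rank in ranks.items() if rank > 90.0]
```

Percentile ranks depend only on the ordering of the scores, so they allow species to be compared without interpreting the raw model output directly.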

SARS-CoV-2 also presents challenges for wildlife conservation. Infection has already been confirmed in western lowland gorillas. For high-risk charismatic species like mountain gorillas, spillback infection could occur through ecotourism. Grizzly bears, polar bears and wolves, all in the 90th percentile of predicted zoonotic capacity, are frequently handled by biologists for research and management purposes.

Han explains, “Our model is the only one that has been able to make risk predictions for almost all mammal species. Whenever we hear of a new species testing positive for SARS-CoV-2, we revisit our list and find it ranked high. Snow leopards had a risk score around the 80th percentile. We now know that they are among the wild species that could die from COVID-19.”

People working near high-risk mammals should take extra precautions to prevent the spread of SARS-CoV-2. This includes prioritizing vaccination for veterinarians, zookeepers, animal handlers and others who have regular contact with animals. The results may also guide targeted vaccination strategies for at-risk mammals.

Han concludes, “We have found that the riskiest mammal species are often those that live alongside us. It is essential to target these species for further laboratory validation and field monitoring. We should also explore underutilized data sources, such as natural history collections, to fill gaps in data on animal and pathogen traits. More efficient iteration between computer predictions, laboratory analysis, and animal monitoring will help us better understand what enables spillover, spillback, and secondary transmission: information that is needed to guide the response to zoonotic pandemics today and in the future.”



  • Ilya R. Fischhoff – Cary Institute of Ecosystem Studies
  • Adrian A. Castellanos – Cary Institute of Ecosystem Studies
  • João PGLM Rodrigues – Department of Structural Biology, Stanford University School of Medicine
  • Arvind Varsani – Biodesign Center for Basic and Applied Microbiomics, Center for Evolution and Medicine, School of Life Sciences, Arizona State University; Structural Biology Research Unit, Department of Integrative Biomedical Sciences, University of Cape Town
  • Barbara A. Han – Cary Institute of Ecosystem Studies

Funding statement

The authors were supported by various grants during this work, including grants from the National Institutes of Health (NIH), the National Science Foundation (NSF), and the Defense Advanced Research Projects Agency (DARPA).


Cary Institute of Ecosystem Studies is an independent, non-profit center for environmental research. Since 1983, our scientists have been studying the complex interactions that govern the natural world and the impacts of climate change on these systems. Our results lead to more effective resource management, policy actions and environmental literacy. The staff includes global experts in the ecology of cities, diseases, forests and freshwater.

