Research team discovers new species of coronavirus in unexpected places


A former UBC postdoctoral researcher led an international research team that reanalyzed all public RNA sequencing data and discovered nearly ten times more RNA viruses than were previously known, including several new coronavirus species in unexpected places.

This planetary-scale database of RNA viruses can help pave the way for the rapid identification of virus spread in humans, as well as viruses that affect livestock, crops and endangered species.

Dr. Artem Babaian (he/him) started the Serratus Project collaboration. He published the research results in the scientific journal Nature last week.

Working with the Cloud Innovation Center, a public/private collaboration between UBC and Amazon Web Services, Project Serratus was able to build a “ridiculously powerful” supercomputer on AWS equivalent in power to 22,500 processors, Babaian said.

The supercomputer read 20 million gigabytes of publicly available genetic sequence data from 5.7 million biological samples worldwide, looking for a specific gene indicating the presence of an RNA virus. The samples have been collected and freely shared within the global research community for 13 years and include everything from ice core samples to animal feces.

Figure: World map of planetary DNA/RNA sequencing data. Credit: Serratus Project

Project Serratus researchers have found 132,000 RNA viruses (where only 15,000 were previously known) and nine new species of coronavirus. Babaian estimates that without CIC and the AWS Cloud, it would take a traditional supercomputer well over a year and hundreds of thousands of dollars to complete the 2,000 years of CPU time required for this analysis. Serratus completed it in 11 days for $24,000.
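For a rough sense of scale, the figures quoted above can be turned into per-sample numbers. The short Python sketch below is only back-of-the-envelope arithmetic on the article's rounded figures; the constant and function names are illustrative and are not part of the Serratus code.

```python
# Rounded figures as quoted in the article (not independent measurements).
KNOWN_VIRUSES_BEFORE = 15_000   # RNA viruses known before the project
VIRUSES_FOUND = 132_000         # RNA viruses reported by the project
SAMPLES = 5_700_000             # biological samples searched
DATA_GB = 20_000_000            # gigabytes of sequence data read
CPU_YEARS = 2_000               # CPU time required for the analysis
COST_USD = 24_000               # reported cost of the cloud run


def summarize():
    """Derive simple ratios from the quoted figures (illustrative only)."""
    return {
        "fold_increase": VIRUSES_FOUND / KNOWN_VIRUSES_BEFORE,
        "gb_per_sample": DATA_GB / SAMPLES,
        "usd_per_sample": COST_USD / SAMPLES,
        "usd_per_cpu_year": COST_USD / CPU_YEARS,
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value:.4g}")
```

By this rough arithmetic the catalogue of known RNA viruses grew about 8.8-fold, each sample averaged roughly 3.5 gigabytes of data, and the run cost well under one cent per sample.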

“We are entering a new era of understanding the genetic and spatial diversity of viruses in nature, and how a wide variety of animals interface with these viruses. The hope is that we won’t be caught off guard if something like SARS-CoV-2, the novel coronavirus that causes COVID-19, reappears. These viruses can be recognized more easily and their natural reservoirs can be found more quickly. The real goal is for these infections to be recognized so early that they never become pandemics,” said Babaian, who has a doctorate in medical genetics from UBC and is now a Banting Fellow at the University of Cambridge.

“If a patient has a fever of unknown origin, once that blood is sequenced, you can now connect that unknown virus in humans to a much larger database of existing viruses. If a patient, for example, has a viral infection of unknown origin in St. Louis, you can now search the database in about two minutes and connect that virus to, say, a camel from sub-Saharan Africa sampled in 2012.”

Babaian, 32, was conducting cancer genetics research with BC Cancer when the COVID-19 pandemic hit and he shifted gears.

The work, which Babaian said started as a “fun side project,” began March 3, 2020, when he and his friend and climbing partner, UBC engineering student Jeff Taylor, sketched the idea “on the back of a napkin.”

“I should have kept that napkin,” he noted.

Babaian reached out to UBC’s Cloud Innovation Center for help soon after. Serratus, named after Serratus Mountain in British Columbia’s Tantalus Range, which he and Taylor saw while climbing in 2020, was born.

Babaian recalled sitting in his wife’s nursing chair when the first results began to appear on his laptop, indicating that Serratus was not just working, but producing data at a nearly incomprehensible rate.

“It was probably the most exciting scientific period of my life,” he said. “There are two types of fun. Type 1 is smiling and having fun. Type 2 is when you’re miserable doing it but the memory shines, like rock climbing. In many ways, Serratus is type 2 fun. You just have to believe it’s going to work.”

Babaian said he would not have been able to do this work without the support of the UBC Cloud Innovation Center.

“The Cloud Innovation Center was really there to open doors for us,” he said. “We had an idea and they brought in experts from their networks to bring it to life. Now the global community can benefit from all this previously untapped research.”

“Artem approached us with an innovative vision. The power of the Cloud Innovation Center is that we pair our in-house UBC innovation and technology teams with those of Amazon Web Services,” said Marianne Schroeder, Director of the UBC Cloud Innovation Center. “It has been our great privilege to support the realization of this vision. Helping find technological solutions to complex problems is what we do.”

The Center, which was launched just before the pandemic in January 2020, supports challenges focused on community health and wellness. To date, the team has published more than 20 projects, including reference architectures and deployment guides, all available as open source.

“While the public cloud as we know it has been around for 15 years, the last few years of innovation at Amazon Web Services have really made genomics research possible in a new way,” said Coral Kennett, who leads the Center for Amazon Web Services. “We were able to give Artem access to computing power for pennies per request. We strongly encourage the research community to submit their projects and ideas to the Cloud Innovation Center so that more innovations come to light for the benefit of the community.”

Reference: “Petabase-scale sequence alignment catalyses virus discovery” by Robert C. Edgar, Jeff Taylor, Victor Lin, Tomer Altman, Pierre Barbera, Dmitry Meleshko, Dan Lohr, Gherman Novakovsky, Benjamin Buchfink, Basem Al-Shayeb, Jillian F. Banfield, Marcos de la Peña, Anton Korobeynikov, Rayan Chikhi and Artem Babaian, January 26, 2022, Nature.
DOI: 10.1038/s41586-021-04332-2

