Back on March 12—a lifetime ago—the New Mexico Health Department reported the state's first four COVID-19 cases: a Socorro couple who had recently traveled to Egypt and two women—one from Bernalillo County and the other from Santa Fe County—who had each been to New York.

I thought of those early reports recently after exploring an interactive New York Times feature that traces COVID-19's spread throughout the US. Cheerfully named "How the Virus Won," the piece uses travel information, hidden infections and, notably, the virus' genetic data to show how it spread across the country.

If you're interested in a deep dive into the genetics of this virus, the Johns Hopkins Bloomberg School of Public Health has an online white paper that lays it all out. In a nutshell, coronaviruses, including this one, are RNA viruses, meaning they have genomes made out of ribonucleic acid. Through sequencing, scientists are able to read the specific genetic information in the RNA of SARS-CoV-2, the virus that causes the disease named COVID-19.

For example, researchers cited in the Times story used genetic sampling to trace cases in 14 states back to the Seattle outbreak. More than 5,000 contagious people moved in and out of New York City in the first two weeks of March, and genetic sampling again links thousands of cases back to that city.

New Mexico scientists also have taken up this work on a variety of fronts. At the state level, New Mexico is part of a national collaboration through the Centers for Disease Control and Prevention conducting genome sequencing using existing samples from the state's Scientific Laboratory Division (the state lab). Deputy State Epidemiologist Chad Smelser tells me the virus had multiple introductions into New Mexico, with our early cases corresponding with those in New York and "some of our neighboring states." As noted, we also had early cases associated with travel to Egypt.

In addition to showing which strains of the virus came here, sequencing the virus has other benefits, Smelser notes, such as educating the public about the dangers of traveling, "which is one of the main things we discouraged during this pandemic, because we don't want to spread the virus around." The sequencing work also helps scientists see how the virus is changing over time.

Los Alamos National Laboratory scientist Patrick Chain (above) says the lab’s COVID-19 Genome Analytics site is intended to help democratize bioinformatics work sequencing and analyzing the SARS-CoV-2 genome. | Photos Courtesy LANL

Tracking those changes helps ensure diagnostic testing keeps pace. That’s one of the objectives for a new website launched by Los Alamos National Laboratory’s Bioscience Division. The COVID-19 Genome Analytics site (covid19.edgebioinformatics.org) screens current COVID-19 assays (basically tests that determine the accuracy of the tests) against the sequences to determine the most reliable ones for diagnosing the disease itself as the virus’ genetics shift.

The diagnostic angle "is extremely important," LANL scientist Patrick Chain says. "The virus mutates over time; it's not very fast, but some of those mutations are what allows researchers to track its progress and advancement along a geographic map. We were evaluating the mutations to see whether or not they affect the particular regions of the virus that are being targeted" by the testing reagents used in the diagnostic tests.
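For readers who want a feel for what that kind of check involves, here is a minimal sketch in Python. It is not LANL's code, and the short sequences and sample names are made up for illustration. It slides a test's target sequence along each genome, finds the closest match and flags any genome where that match contains more differences than allowed.

```python
def best_match(target: str, genome: str) -> tuple[int, int]:
    """Return (position, mismatches) for the genome window closest to target."""
    best_pos, best_mm = -1, len(target) + 1
    for i in range(len(genome) - len(target) + 1):
        window = genome[i:i + len(target)]
        # Count positions where the target and the genome window differ.
        mm = sum(a != b for a, b in zip(target, window))
        if mm < best_mm:
            best_pos, best_mm = i, mm
    return best_pos, best_mm


def screen_assay(target: str, genomes: dict[str, str],
                 max_mismatches: int = 0) -> dict[str, int]:
    """Return the genomes whose best match exceeds the mismatch allowance."""
    flagged = {}
    for name, seq in genomes.items():
        _, mm = best_match(target, seq)
        if mm > max_mismatches:
            flagged[name] = mm
    return flagged


# Hypothetical toy data: real targets and genomes are far longer.
target = "ACGTTGCA"
genomes = {
    "sample_A": "GGGACGTTGCATTT",  # contains the target exactly
    "sample_B": "GGGACGTAGCATTT",  # one letter changed inside the target
}
flagged = screen_assay(target, genomes)
```

In this toy case only `sample_B` is flagged, because a single changed letter falls inside the targeted stretch. A real screen would also have to account for insertions, deletions and both strands of the sequence, which this sketch ignores.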

Then there's the information the genomes provide about the different strains of the virus. The site can generate a complete SARS-CoV-2 genome for public data repositories and, moreover, provides tools for public health researchers who might not have either the background or resources for advanced bioinformatics.

"We're hoping to help public health labs, other state labs and researchers around the world be able to process the genetic sequencing data in a very easy and streamlined fashion that's also standardized between groups," says Chain, a lab scientist with a microbiology background, who also is a genomics and bio-informations expert. The latter field, he notes, is not a large one and not many researchers—be they epidemiologists or genomicists—"have adequate knowledge of the right technologies, the right tools and the right type of hardware that would be required to be able to process all this data properly."

This type of computational work could not have happened a decade ago, he notes. The first time genomics was really utilized was in the early 2000s for the Ebola outbreak but, even today, only a few thousand genomes have been sequenced for Ebola, "whereas even after the first few months of the [SARS-CoV-2] outbreak hitting North America, there were tens of thousands of genomes available for this work."

That's in part because the virus has been so widespread and in part because of a growing understanding of genome sequencing's value to public health responses. And, of course, the technology has advanced rapidly. All those factors have made the scale of genome sequencing for this outbreak unprecedented, Chain says. However, he adds: "From here on and out, you will see genome sequencing as the first line tool to try to understand the outbreak."