Model Citizens

Scientists join together to show the power of data in fighting the COVID-19 pandemic

When Los Alamos National Laboratory scientist Sara Del Valle gives public presentations on her work, she often shows a clip from the film Contagion, in which a new bat-borne virus spreads swiftly from China to the rest of the world, gruesomely killing many and crippling society while scientists race to curtail the spread and find a cure.

"I always tell people, 'Hollywood almost got it right,'" says Del Valle, a deputy group leader and applied mathematician in LANL's Information Systems and Modeling Group and part of the team modeling the COVID-19 pandemic. Where the film errs, she says, is in its portrayal of Dr. Erin Mears (Kate Winslet) as the sole epidemiologist investigating the outbreak.

"In reality," Del Valle says, "there are hundreds of thousands of people working around the clock in trying to help mitigate the spread."

For decades, scientists from across disciplines have been studying and modeling infectious diseases, providing crucial information for outbreaks such as H1N1, SARS and Ebola, as well as seasonal influenza. Del Valle gravitated to the field as an undergraduate when she discovered it served both her love of mathematics and her upbringing by pastor parents to perform public service. She has also developed epidemiological models for smallpox, anthrax and HIV.

To combat COVID-19, scientists have come together to provide policymakers models for everything from hospital surges to mortality rates to hidden outbreaks, using a variety of mathematical approaches and data sets, and sometimes drawing differing conclusions. New Mexico's modeling team has taken a distinctly local approach, relying on local data to monitor the disproportionate impact on the state's most vulnerable citizens. Just last week, the New Mexico Department of Health unveiled a new website making their work more visible.

Indeed, the pandemic has put all these scientists' work on display to the general public in unprecedented and sometimes misinterpreted ways, with discussions of transmission rates, mortality forecasts and peak dates dominating discourse everywhere from Facebook to the evening news.

For the scientists, their specialized work applying computational math to massive data sets has significant implications as government and health care leaders look to the models to make swift policy decisions. At the same time, the pandemic carries an opportunity to demonstrate the importance of science in civic life and a teachable moment about what the immediate and foreseeable future could look like for everyone.

Yet while the groundwork for modeling the COVID-19 pandemic was in place prior to the outbreak, even trained scientists admit they are in uncharted territory.

COVID-19's high level of contagion, its silent transmission, delays in testing and the long timeline before a vaccine emerges have created an unexpected situation in which "our only move to protect lives and prevent catastrophic hospital surges" was to shut everything down, says Lauren Ancel Meyers, the Cooley Centennial Professor of Integrative Biology and Statistics & Data Sciences at The University of Texas at Austin and a member of the External Faculty and Scientific Advisory Board of the Santa Fe Institute. Meyers is a pioneer in the field of epidemic modeling, and her lab has provided several reports featured in the national press on peak death rates, hidden epidemics of the disease in US cities and social distancing efficacy, to name just a few.

"In my 20 years of modeling pandemics and modeling global viruses that spread across the globe and are very deadly, this was not among the scenarios people in my field and public health agencies had really considered," she says.

Toward the end of September 2019, just a few months before the start of the COVID-19 pandemic, Meyers delivered the first part of a two-part lecture, "Preventing the Next Pandemic," to a packed crowd at the Lensic Performing Arts Center as part of Santa Fe Institute's annual Stanislaw Ulam Memorial lecture.

Meyers' interest in the topic of deadly outbreaks started in childhood when she devoured books on the topic and had nightmares "about the big one that was going to wipe out life on earth as we knew it." She went on to train as a mathematical biologist at Harvard and Stanford universities, but "everything clicked into place," she said, while she was a post-doc at SFI studying complex systems in science and was invited to collaborate with the Centers for Disease Control and Prevention. "I had the opportunity to use math and biology and complex systems thinking to solve really practical problems about how we could control emerging outbreaks. And ever since then that has been my professional charge: to build complex system models that help us understand how diseases are spreading and come up with better strategies for controlling them," she said during the lecture.

Three major outbreaks stoked her childhood fears: the 2002/2003 SARS outbreak, H1N1 in 2009 and the Ebola epidemic in west Africa in 2014. Her research on the diseases has appeared in more than 100 published peer-reviewed articles and also in reporting by the Wall Street Journal, New York Times and Washington Post, to name a few. Then COVID-19 came along. Meyers first read about the virus emerging in Wuhan in January and, from that moment, "we've been working around the clock not only to track and understand the virus, but to build models that can support public health and city/state national leaders navigating how best to protect our populations from transmission," she tells SFR during a telephone interview.

Both of Meyers' pandemic lectures are part of SFI's Complexity Explorer course on pandemics, and Meyers also provided an introduction to the math behind the modeling during her lecture in Santa Fe.

The COVID-19 pandemic has introduced many to the concept of flattening the curve, a visual representation of the slowing of spread. The curve represents how swiftly the virus is spreading. A steep curve means it's spreading quickly. The reproduction number, or R0 (pronounced R naught), is one key data point in disease modeling and expresses how many people one sick person can infect—how contagious an illness is (explained aptly by Winslet's character in Contagion). The effective rate of transmission measures what is actually happening in terms of person-to-person infection in specific communities.

As such, the transmission rate isn't fixed. Social distancing measures aim to reduce the rate. New Mexico's target for re-opening more of the economy is a 1.15 rate of transmission; the state's most recent modeling report calculated an effective transmission rate here of 1.24, compared to 1.28 the week prior. Small changes in the rate of transmission have huge impacts across large populations. The higher the rate of transmission, the more quickly cases double, a pattern known as exponential growth. As the rate falls, the curve flattens.
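To see why a shift from 1.28 to 1.15 matters, here is a minimal sketch of the arithmetic. It assumes a simplified picture in which cases arrive in discrete "generations" and each case infects the same number of new people; the function names and the starting figure of 100 cases are hypothetical, and this is not the calculation used in the state's modeling reports.

```python
import math


def cases_after(generations, r_eff, initial_cases=100):
    """New cases in the latest generation if every case infects r_eff others.

    initial_cases=100 is an arbitrary illustrative starting point.
    """
    return initial_cases * r_eff ** generations


def generations_to_double(r_eff):
    """Generations of spread needed for cases to double.

    Returns infinity when r_eff is 1 or lower, because cases stop growing.
    """
    if r_eff <= 1:
        return math.inf
    return math.log(2) / math.log(r_eff)


# The three rates mentioned in the article: the target and two weekly estimates.
for r_eff in (1.15, 1.24, 1.28):
    print(
        f"rate {r_eff}: doubles every {generations_to_double(r_eff):.1f} generations, "
        f"{cases_after(10, r_eff):.0f} cases in generation 10"
    )
```

Because the rate is applied again in every generation, a difference of a few hundredths compounds quickly, which is the sense in which small changes have large effects.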

While there are many different types of models and computations, the simplest—and perhaps the most well-known—is the SIR model, in which mathematical equations are used to model the flow of a virus through people in three states: susceptible, infected and recovered.
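For readers curious what those equations look like in practice, the following is a bare-bones sketch of a textbook SIR model, not New Mexico's model. The parameter names `beta` (contacts that spread infection per day) and `gamma` (fraction of infected people recovering per day), their values, and the simple day-by-day stepping are all assumptions chosen for illustration.

```python
def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Step a basic SIR model forward and return one (S, I, R) tuple per day.

    beta and gamma are illustrative parameters, not fitted to any data.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            # Susceptible people become infected through contact with infected people.
            new_infections = beta * s * i / population * dt
            # Infected people recover at a steady rate.
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((s, i, r))
    return history


def peak_infected(history):
    """Largest number of people infected at the same time."""
    return max(i for _, i, _ in history)


# Same hypothetical disease, with and without halving the contact rate.
unchecked = simulate_sir(1_000_000, 10, beta=0.3, gamma=0.1, days=400)
distanced = simulate_sir(1_000_000, 10, beta=0.15, gamma=0.1, days=400)
print(f"peak infected, unchecked: {peak_infected(unchecked):,.0f}")
print(f"peak infected, distanced: {peak_infected(distanced):,.0f}")
```

Lowering `beta` in this sketch stands in for social distancing: the peak arrives later and is lower, which is the flattened curve.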

The math used to understand how a disease will march through a network of people derives from an area of statistical physics called percolation theory, first developed in the 1940s to model the flow of a liquid through material and determine, by considering the configuration of the material's channels and the viscosity of the liquid, whether the liquid will permeate the material. "Disease transmission is similar," Meyers explained in her lecture. "The channels are the contacts between individuals or groups of individuals, the liquid is our disease and its viscosity is the transmissibility."
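Meyers' analogy can be tried out on a toy network. The sketch below is an illustration under stated assumptions, not her lab's code: it builds a hypothetical random contact network, then lets the disease cross each contact with a fixed probability, here called `transmissibility`, and counts how many people the outbreak reaches.

```python
import random
from collections import deque


def random_contact_network(n_people, n_contacts, rng):
    """Build a hypothetical network by linking random pairs of people.

    n_contacts must not exceed the number of possible pairs.
    """
    neighbors = {person: set() for person in range(n_people)}
    while n_contacts > 0:
        a, b = rng.sample(range(n_people), 2)
        if b not in neighbors[a]:
            neighbors[a].add(b)
            neighbors[b].add(a)
            n_contacts -= 1
    return neighbors


def outbreak_size(neighbors, patient_zero, transmissibility, rng):
    """Count people reached when each contact passes on the disease
    with probability `transmissibility`."""
    infected = {patient_zero}
    queue = deque([patient_zero])
    while queue:
        person = queue.popleft()
        for contact in neighbors[person]:
            if contact not in infected and rng.random() < transmissibility:
                infected.add(contact)
                queue.append(contact)
    return len(infected)


rng = random.Random(2020)  # fixed seed so repeated runs match
network = random_contact_network(1000, 2000, rng)
# Start from the best-connected person so the outbreak has a chance to spread.
start = max(network, key=lambda person: len(network[person]))
for t in (0.1, 0.3, 0.6):
    sizes = [outbreak_size(network, start, t, rng) for _ in range(100)]
    print(f"transmissibility {t}: average outbreak reaches {sum(sizes) / 100:.0f} people")
```

In a network like this, low transmissibility tends to leave outbreaks small and contained, while higher values let the disease permeate much of the network, mirroring the liquid in percolation theory.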

Scientists began applying this type of math to disease modeling in 1983 and it grew in importance in the early 2000s, in part as a result of Meyers' and her colleagues' work and research conducted at SFI.

In the COVID-19 pandemic, researchers across the globe are using different classes of models and different computational approaches. These include the SIR model (which the state of New Mexico uses), agent-based models and curve-fitting approaches, such as the one Meyers' lab used to model peak death in each US state, drawing on geolocation data from cellphones to determine the impact of social distancing. The much-referenced model from the University of Washington's Institute for Health Metrics and Evaluation also uses a curve-fitting approach. Much of Meyers' early work was focused on network modeling and using machine learning to improve outbreak detection, forecasting and control.

Whatever the approach, she tells SFR, every model is different, even ones in the same class.

"It's not a competition," she says. "None of us believe that we have the one right model and that everyone else has the wrong model. All of these models have different expertise built into them, different data built into them and they complement each other and they each give us a different perspective on what might be happening and what might be to come."

The CDC, she says, is taking an "ensemble" approach and asking all the modeling groups to provide their projections so they can be considered collectively to help guide decision making.

"There's a diverse ecosystem of modeling approaches," she says. "Even two models that have the same general style will have differences in the details."

Those details have been important in New Mexico, where health officials have emphasized their modeling efforts as distinctly local. Up until last week, New Mexicans learned of those projections from Health and Human Services Secretary David Scrase during weekly livestreamed updates with Gov. Michelle Lujan Grisham. On April 30, the health department unveiled a new modeling website, which it plans to update weekly.

Jason Mitchell, chief medical and clinical transformation officer for Presbyterian Healthcare Services, leads the 25-person team modeling the pandemic for the state. For Mitchell, that effort began with questions directly impacting Presbyterian's health facilities.

"We wanted to be able to be prepared for how many staff do we need, how many vents do we need, how many beds do we need, how much protection equipment? It was an exercise in preparing our organization to make sure we could take care of New Mexico," he says.

That effort became collaborative with Los Alamos and Sandia national labs along with health department epidemiologists, all of whom consult weekly to adjust projections.

"[It] became a great collaborative effort to each share the data analytics we have and to validate those analytics with one another and identify what's going to happen in the next week, the next month, and over the next few months to a year," Mitchell says. The process, he notes, "helped the governor make some really good policy decisions for the state and save a lot of lives."

That might not have happened, he says, if the state hadn't developed its own models.

"Obviously I look at a lot of the models," Mitchell says. "Some of the models painted very rosy pictures early on… those models are more concerning because they wouldn't encourage you to take action to prevent further spread."

For example, the state's April 28 epidemiological report discusses why the much-touted University of Washington's IHME model would have been less effective for New Mexico's response to the pandemic. For one, it used historic death rates to infer hospitalization and ICU use. New Mexico health officials say the state's low death rate creates problems in the IHME model, as does the latter's lack of New Mexico-specific data. IHME doesn't use real-time data or account for the state's diverse population, and it estimates, rather than calculates, the rate of effective transmission.

New Mexico's model, on the other hand, uses significant local data on a daily basis, including statewide testing rates and results; capacity and demand by county and facilities; county-level projections; adjusted population risk; along with data sets on social determinants of health and analytics from the Johns Hopkins Adjusted Clinical Groups system. The system also uses information derived from Presbyterian's health plan claims and clinical databases.

"New Mexico is unique," Mitchell says. "We have 19 pueblos, we have three tribes and we have portions of the Navajo Nation, and then we have both very urban and very rural areas in addition to that. We have a really unique geographic mix, and so a general state model would never address that mix."

Much of this has allowed the state to forecast COVID-19's impact on its most vulnerable citizens.

"When you look nationally, you see this massive disparate impact on poor communities and communities of color," Mitchell says. "We've been able to build that in advance into our model and predict that so we can direct where action needs to be taken into the state. That's a really important aspect of this model that no other models can offer you."

The state anticipated, for example, the high levels of COVID-19 cases in the northwestern counties such as McKinley and San Juan. As of last week's modeling report, pueblos and nursing homes in those northwest counties, along with Cibola County, continued to have growing COVID-19 cases—per capita four times that of the state overall.

"From a data point of view, you would expect a huge impact in portions of our state, specifically northwest portions," Mitchell says. "You've got a high level of disease burden there, you've got poverty in that section of the state." The state focused its efforts there, he says, because "it's really important to take care of populations that are vulnerable in this pandemic."

It's also important, he adds, to recognize that vulnerability stems from structural conditions, not culpability.

"All the tribes have really worked hard and done a good job of taking care of the people and the fact they had an outbreak is not because they did anything wrong or different," he says. "It's because you have large families living in homes together [leading to] family spread… you have diabetes and other disease burdens. This is because of the demographics and population of the state. I don't want people to think anyone in the state has not done the right thing."

Del Valle, who is part of the state team, hopes to focus future modeling work on the role of age and underlying factors for COVID-19 patients. "I'm interested in having a really good forecasting model that incorporates social demographic and health factors so we can better forecast for hospitalizations," she says.

A medical doctor, Mitchell is also certified in informatics and sees data playing a crucial role in contemporary health care.

"I'm a family medicine doctor, I love the breadth, I love the impact we can have on the community level. Data analytics is how you understand where to focus and my passion stems from that desire to help large populations and invest in the tools to do so. That translates very well to a pandemic: By having the right data, you can save lots of lives."

Saving lives drives all the models in the COVID-19 pandemic—their purpose is to help policy makers choose which actions will impact outcomes for the best: slower spread, fewer hospital surges, less death.

Needless to say, the pandemic has provided a teachable moment about disease modeling.

For instance, LANL has a forecasting model that calculates COVID-19's trajectory using confirmed cases and deaths. These types of models come closer to weather forecasting as they don't use social distancing, transmission rates or any other type of assumption. Del Valle says LANL's model has been made possible by the access to real-time data, often hard to come by for infectious disease modeling. She'd like to see more access going forward for all infectious diseases in order to provide such forecasting models for other outbreaks.

Projection models, by contrast, consider different assumptions and scenarios to compute potential outcomes. A common misperception by the public, scientists say, is trying to read such models like weather forecasts.

"If there's a hurricane coming, you can move stuff out of the way of the hurricane," says Samuel Scarpino, an assistant professor in the Network Science Institute at Northeastern University, "but as far as I know, we can't move the hurricane yet, probably never will be able to…The whole point of the models was to influence the decisions and thus influence the epidemic trajectory. We are trying to intervene, meaning if we are right in the beginning, we can very likely look like we were wrong because the measures went into place and the epidemic changed trajectory."

Scarpino, who runs Northeastern's Emergent Epidemics Lab, began his career as a field biologist and a bench scientist working on empirical population genetics. At UT Austin, Meyers was one of his dissertation advisors and he ended up working on several infectious disease projects with her, including the notable research and modeling efforts around the 2009 H1N1 pandemic. After earning his PhD, he became an Omidyar Fellow at SFI in 2013 where he made the conscious choice to blend both interests and build a career "around the intersection of ecology and public health epi."

Like all the scientists involved in modeling the pandemic, Scarpino has had his work reported on by the national press, and has devoted time to moderating science forums, along with sharing information with journalists, students and the general public in forums such as a Reddit Ask Anything session last month. Part of this front-facing effort stems from a belief in the importance of discourse, he says, but he also sees it as his responsibility in these unprecedented times.

"So much of what's going on in terms of our response is being made at the political levels, is a result of public opinion, is a result of what individuals know or understand or don't know or understand about this outbreak," he says. "Everything we can do to contribute to putting out accurate information that reflects our current best understanding and appropriately communicates the uncertainty or certainty is something that we just all have to do."

The type of computer modeling at play, Scarpino says, represents a field that has developed dramatically in the last 20 years, in terms of expertise, as well as relationship-building between scientists and public health officials. But COVID-19's scale has no contemporary precedent and its unknowns remain vast and significant.

In fact, all of the scientists interviewed were at pains to emphasize the importance of recognizing the pandemic as a fluctuating situation without scripted outcomes. Physics and turbulence drive the weather, Del Valle notes. Public policy and human behavior—less predictable factors, in other words—drive disease outbreaks.

"This is not a one and done thing," Mitchell says of the peak, the height of cases. Models of past epidemics, such as Spanish flu and H1N1, show that future waves and future peaks will come as restrictions lift. On this point, all the models agree. "We don't want to be sick and have mass casualties, and we don't want to languish where no one can earn an income or get basic healthcare services," he says. "Modeling becomes so imperative so we can have the best of both worlds."
