MODELING AND ANALYSIS OF THE SPREAD OF COVID-19 UNDER A MULTIPLE-STRAIN MODEL WITH MUTATIONS

oyagan@andrew.cmu.edu, anirudhs@princeton.edu, reletreby@cmu.edu, slevin@princeton.edu, jplotkin@sas.upenn.edu, poor@princeton.edu

ABSTRACT. Since December 2019, COVID-19 has caused worldwide devastation. To devise effective countermeasures, it is important to develop mathematical models that help us to understand and predict the spreading of COVID-19, as well as to provide guidelines on what can be done to limit its spread. To this end, we leverage recent work of Eletreby et al. (2020) which studies a model where multiple strains of a virus propagate through a network while also undergoing mutations. Highlighting the recent reports on a mutation of SARS-CoV-2 that is thought to be more transmissible than the original strain, we discuss the importance of incorporating mutations and evolutionary adaptations in epidemic models. We also demonstrate how the results of Eletreby et al. (2020) can be used to assess the effectiveness of mask-wearing in limiting the spread of COVID-19. These are supported by simulation results showing the impact of various mutation and mask-wearing possibilities.


INTRODUCTION
From its start in December 2019 in Wuhan, China, the novel coronavirus (which causes the respiratory disease known as COVID-19) has spread rapidly and broadly and has since devastated a significant fraction of the world population. The World Health Organization (WHO) has declared the spread of COVID-19 a pandemic, and as of February 2021, over 100 million individuals had been infected, with about 2.5 million dying of the disease. In addition, the spread of the virus and the countermeasures taken against it have severely impacted the economy, with industries such as tourism, travel, and entertainment suffering the most. Schools at all levels have been closed in several countries around the world, and highly anticipated events, including the Tokyo 2020 Summer Olympics and the Euro 2020 Championship, were postponed or cancelled. In summary, the spread of COVID-19 is among the most catastrophic events affecting the health and well-being of humans worldwide since World War II.
A key scientific goal concerning COVID-19 is to develop mathematical models that help us to understand and predict its spreading behavior, as well as to provide guidelines on what can be done to limit its spread. With tight restrictions on traveling, large gatherings, and commercial entertainment in place across many jurisdictions, another important question is to understand the order and time in which these restrictions can be safely eliminated.
In the recent work (Eletreby et al., 2020), a class of spreading processes (including information propagation in online social networks and virus propagation) was studied under a multiple-strain model with mutations; i.e., a mutation may take place at each host leading to a different strain of the virus/information with different transmissibility from the originally acquired strain. Their work aimed to bridge the disconnect between how spreading processes propagate and evolve in real life versus mathematical and simulation models that ignore evolutionary adaptations. The results in Eletreby et al. (2020) are shown to predict accurately the epidemic threshold, expected epidemic size, and the expected fraction of individuals infected by each strain in this model. The key finding is that classical epidemic models that do not consider the evolution of the strain lead to incorrect predictions on the spreading dynamics when mutations that affect transmission are present.
The purpose of the current paper is to discuss how the findings in Eletreby et al. (2020) can help address some key questions concerning the spread of COVID-19. We highlight the recent reports on a mutation of SARS-CoV-2 that is thought to be more transmissible than the original strain, and discuss the importance of incorporating mutation and evolutionary adaptations (together with the network structure) in epidemic models. We also demonstrate how the multiple-strain transmission model studied in Eletreby et al. (2020) can be used to assess the effectiveness of mask-wearing in limiting the spread of COVID-19. Finally, we present simulation results on a few sample cases to demonstrate our ideas and the utility of the findings of Eletreby et al. (2020) in the context of COVID-19.
The rest of the paper is organized as follows. In Section 2, we present the model studied by Eletreby et al. (2020) and summarize its main results. There, we also summarize some recent reports on the mutations that SARS-CoV-2 has exhibited so far. In Section 3, we present our ideas on how the model and findings of Eletreby et al. (2020) can be utilized to better understand the spread of COVID-19. There, we show, through an analogy between the multiple-strain virus propagation and a single-strain propagation where some individuals wear a mask, that the results in Eletreby et al. (2020) can help assess the effectiveness of mask-wearing in the spread of COVID-19. Finally, in Section 4 we present numerical results to demonstrate our main ideas in a few sample cases. We remark that our forthcoming conference publication (Sridhar et al., 2021) will support our work in this paper by providing an abridged discussion and additional simulations to validate our claims.

MODELING THE SPREAD OF A VIRUS
Modeling the spread of epidemics. What causes an outbreak of a disease? How can we predict its emergence and control its progression? Over the past several decades, multidisciplinary research efforts have converged on these questions, aiming to provide a better understanding of the intricate dynamics of disease propagation and accurate predictions of its course (Anderson et al., 1992; Barabási, 2016; Daszak et al., 1999; Fraser et al., 2004; Granell et al., 2014; Lloyd-Smith et al., 2005; Moreno et al., 2002; Morens et al., 2004; Newman, 2002; Pastor-Satorras & Vespignani, 2001; Wei et al., 1995; Wolfe et al., 2007). At the heart of these research efforts is the development of mathematical models that provide insights on predicting, assessing, and controlling potential outbreaks (Brauer et al., 2012; Diekmann & Heesterbeek, 2000; Keeling & Rohani, 2011; Siettos & Russo, 2013). The early mathematical models relied on the homogeneous mixing assumption, meaning that an infected individual is equally likely to infect any other individual in the population, without regard to her location, age, or the people with whom she interacts. A finer approach is taken by metapopulation models, where the population is divided into several sub-populations in which the epidemic may have different propagation characteristics or may interact in different ways (Hanski & Hanski, 1999; Keeling & Rohani, 2002; Watts et al., 2005). More recently, network epidemics has emerged as a mathematical modeling approach that takes the underlying contact network between individuals into consideration (Allard et al., 2009; Barabási, 2016; Keeling & Eames, 2005; Miller & Kiss, 2014; Newman, 2002; Pastor-Satorras et al., 2015). The main goal of these mathematical models is to characterize the speed and scale of propagation and to provide insights into how the parameters of a disease, e.g., its basic reproductive number, denoted by $R_0$, can be used to predict the ultimate reach of the virus. In a nutshell, $R_0$ is defined as the mean number of secondary infections in a naive population, i.e., the expected number of infections directly generated by one individual in a population where all individuals are susceptible to infection.

Evolution of infectious diseases.
A common theme among the proposed models for network epidemics is the assumption that the virus is transferred across individuals without going through any modification or evolution (Anderson et al., 1992; Balthrop et al., 2004; Dodds & Watts, 2004; Newman et al., 2002; Qian et al., 2012; Sahneh et al., 2013; Yagan & Gligor, 2012; Yagan et al., 2013; Zhuang & Yagan, 2016). However, in real-life spreading processes, pathogens evolve in response to changing environments and medical interventions (Alexander & Day, 2010; Antia et al., 2003; Leventhal et al., 2015; Morens et al., 2004; Pfennig, 2001). Although the vast majority of molecular changes are neutral or deleterious, there can be strong selection to promote the spread of rare adaptive changes in pathogens. In fact, 60% of the (approximately) 400 emerging infectious diseases that have been identified since 1940 are zoonotic¹ (Jones et al., 2008; Morse et al., 2012a). A zoonotic pathogen is usually poorly adapted, poorly replicated, and inefficiently transmitted when first introduced into the human population (Parrish et al., 2008), but it may eventually acquire and fix mutations that allow it to spread more efficiently from human to human (Jones et al., 2008; Morens et al., 2004; Morse et al., 2012b; Pfennig, 2001; Plowright et al., 2017; Woolhouse et al., 2005). Evolutionary adaptations are often the result of amino-acid substitutions in pathogen proteins that facilitate host cell binding, entry, and release. Some pathogens, such as seasonal influenza viruses, undergo ongoing amino-acid substitutions in immunogenic proteins that facilitate escape from immunity in the host population. But the molecular basis of adaptation can also include recombination and reassortment (e.g., H5N1 influenza) as well as hybridization (e.g., Phytophthora alni) (Woolhouse et al., 2005). In fact, a key event that caused the emergence of the 1918 H1N1 influenza pandemic was a reassortment of viral RNA segments during co-infection, resulting in a novel virus with increased infectivity and virulence (Klempner & Shapiro, 2004).
The emergence of COVID-19 provides an ongoing example of pathogen evolution, and it highlights the role of molecular evolution in facilitating pathogen establishment in a new host species. COVID-19 was originally a non-human disease; its ability to undergo animal-to-human transmission, and eventually human-to-human transmission, was enabled by mutation and selection that produced a strain well adapted to the human host. In addition to the original mutation that led COVID-19 to spread among humans, there is evidence (Long et al., 2020; Tang et al., 2020) that the novel coronavirus has already diverged into distinct lineages with functional differences. Detailed molecular studies have identified a specific point mutation in the spike protein that allows the virus to infect host cells more readily; this mutation also appears to dominate other variants epidemiologically. In the coming weeks and months, COVID-19 may evolve a greater variety of functional variants, including functions related to transmissibility, virulence, and even, eventually, vaccine escape. Functionally distinct strains of COVID-19 may co-evolve or compete with one another.
Multiple-strain model with mutations. In Eletreby et al. (2020), we studied the (inhomogeneous) multiple-strain² model and characterized the spread and evolution of a pathogen on a contact network. In particular, we i) developed a mathematical theory that characterizes the epidemic threshold, expected epidemic size, and the expected fraction of individuals infected by each strain; ii) validated our results on real-world contact networks (a contact network among students, teachers, and staff at a US high school (Salathé et al., 2010) and a contact network among professional staff and patients in a hospital in Lyon, France (Vanhems et al., 2013)); and iii) provided a detailed analysis of the case in which co-infection is possible. The model considered in our work³ is particularly appropriate for pathogens with short infectious periods and high mutation and generation rates, e.g., RNA viruses (Grenfell et al., 2004) such as COVID-19.
¹ A zoonosis is any disease or infection that is naturally transmissible from vertebrate animals to humans.
² At a high level, strains represent homogeneous groups within species (Balmer & Tanner, 2011) and they generally possess unique features such as virulence, infectivity, growth rate, etc.
The multiple-strain model (Alexander & Day, 2010) can be briefly outlined for the two-strain case as follows; cases with an arbitrary number $m$ of strains can be modeled similarly. Consider a spreading process that starts with an individual, i.e., the seed, receiving infection (from an external reservoir) with strain-1 of a particular pathogen. The seed infects each of her contacts independently with probability $T_1$, called the transmissibility of strain-1. Once a susceptible individual receives the infection from the seed, the pathogen may evolve within that new host prior to any subsequent infections. In particular, the pathogen may remain as strain-1 with probability $\mu_{11}$ or mutate to strain-2 (that has transmissibility $T_2$) with probability $\mu_{12} = 1 - \mu_{11}$. If the pathogen remains as strain-1 (respectively, mutates to strain-2) within a newly infected host, then that host infects each of her susceptible neighbors in the subsequent stages independently with probability $T_1$ (respectively, $T_2$). As the process continues to grow, if any susceptible individual receives strain-1, the pathogen may remain as strain-1 with probability $\mu_{11}$ or mutate to strain-2 with probability $\mu_{12} = 1 - \mu_{11}$ prior to subsequent infections. Similarly, if any susceptible individual receives strain-2, the pathogen may remain as strain-2 with probability $\mu_{22}$ or mutate to strain-1 with probability $\mu_{21} = 1 - \mu_{22}$ prior to subsequent infections. The process continues to grow until no additional infections are possible; see Figure 1. The key contribution of Eletreby et al. (2020) is the mathematical theory that enables calculating the number of individuals who will be infected by strain-1 and strain-2, respectively, as a function of the transmissibility parameters (i.e., $T_1$ and $T_2$), mutation probabilities (i.e., $\mu_{11}$ and $\mu_{22}$), and the structure of the underlying contact network (e.g., its degree distribution).
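The two-strain process outlined above can be sketched in a few lines of Python. This is only a minimal illustration under our own assumptions, not the simulation code of Eletreby et al. (2020): the function name `simulate_two_strain`, the adjacency-dictionary input, and the generation-by-generation (breadth-first) processing order are hypothetical choices made for the example, and strains 1 and 2 are stored as indices 0 and 1.

```python
import random
from collections import deque


def simulate_two_strain(adj, seed, T, mu, rng):
    """Run one realization of the two-strain process with mutations.

    adj  : dict mapping each node to a list of its neighbours (assumed input format)
    seed : node that receives strain-1 from the external reservoir
    T    : (T_1, T_2), transmissibility of each strain
    mu   : 2x2 row-stochastic matrix, mu[i][j] = P(strain i becomes strain j in a new host)
    rng  : random.Random instance
    Returns a dict mapping every infected node to the strain index (0 or 1) it carries.
    """
    strain = {seed: 0}          # the seed carries strain-1 (index 0)
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        s = strain[u]
        for v in adj[u]:
            if v in strain:     # v is already infected, so it is no longer susceptible
                continue
            if rng.random() < T[s]:
                # The pathogen may mutate inside the new host before it spreads further.
                stays = rng.random() < mu[s][s]
                strain[v] = s if stays else 1 - s
                queue.append(v)
    return strain
```

Averaging the number of nodes carrying each strain index over many independent runs gives an empirical counterpart of the per-strain epidemic sizes discussed in the text.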
It is important to note that the transmissibilities and mutation probabilities represent average values. In reality, the probability that individual A transmits the virus to individual B depends on numerous factors, such as the number of times A and B interact within close physical proximity and the amount of time A remains infectious. While the relatively simple multiple-strain model described above abstracts out these complexities through the use of the transmissibilities and mutation probabilities, it can accurately compute quantities such as the probability that an epidemic emerges, the epidemic threshold, and the final size of the epidemic in more complex models (see, e.g., Newman, 2002). On the other hand, the model involving transmissibilities does not capture the time-varying behavior of the epidemic, such as the location and size of the epidemic peak. We elaborate on models that can capture time-varying behavior in Section 4.
Modeling the contact network. In modeling the underlying contact network, we utilize random graphs with arbitrary degree distribution generated by the configuration model (Molloy & Reed, 1995; Newman et al., 2001). The configuration model generates graphs that have a specified degree sequence (sampled from an arbitrary degree distribution), but are otherwise random, by taking a uniformly random matching on the half-edges of the specified degree sequence. The model provides a tractable mathematical framework that allows the investigation of several key properties related to the spreading process and how it interacts with the structure of the underlying graph, as specified by its degree distribution. In addition, since the model could match the degree sequence of real-world social networks, it would essentially generate graphs that resemble such real-world networks to some extent.
³ An important focus of Eletreby et al. (2020) was to understand the role of mutation and evolution in the spread of information in social networks. The similarity between evolution of viruses and information was inspired by the mechanism of natural selection. For example, a virus may go through a number of mutations, which is mostly a random process. However, among them, the mutation that leads to highest infectiousness may be "naturally selected" and become prevalent. A similar process can be envisioned for the spread of information. Several people can attempt to modify a piece of information to make it more attractive, e.g., to get more "likes" or "retweets". Then, it will be the audience who may "select" the most appealing version of the information, and make it much more prevalent than others.
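The half-edge matching that defines the configuration model can be illustrated as follows. This is a minimal sketch under stated assumptions: the function name `configuration_model` and the adjacency-dictionary output are hypothetical, the degree sequence is supplied by the caller rather than sampled, and self-loops and repeated edges are kept (as in the standard multigraph version of the model) rather than removed.

```python
import random


def configuration_model(degrees, rng):
    """Build a random multigraph with the given degree sequence.

    degrees : list where degrees[v] is the desired degree of node v
    rng     : random.Random instance
    Returns a dict mapping each node to the list of its neighbours.
    Self-loops and multi-edges may occur; they are kept in this sketch.
    """
    if sum(degrees) % 2 != 0:
        raise ValueError("the sum of the degrees must be even")
    # One "stub" (half-edge) per unit of degree.
    stubs = [v for v, d in enumerate(degrees) for _ in range(d)]
    # Shuffling and pairing consecutive stubs yields a uniformly random matching.
    rng.shuffle(stubs)
    adj = {v: [] for v in range(len(degrees))}
    for a, b in zip(stubs[0::2], stubs[1::2]):
        adj[a].append(b)
        adj[b].append(a)
    return adj
```

For large sparse graphs the fraction of self-loops and repeated edges is small, which is one reason the multigraph version is commonly used for analysis.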

UTILIZING THE MULTIPLE-STRAIN MODEL WITH MUTATIONS TO BETTER UNDERSTAND THE SPREAD OF COVID-19.
As mentioned earlier, the main goal of this paper is to discuss potential ways that the model and results in Eletreby et al. (2020) can help shed light on the spread of COVID-19. We believe that the multiple-strain model of spreading processes can be utilized towards i) understanding the impact of potential mutations in COVID-19, and ii) assessing the impact of some mitigation strategies (e.g., mask-wearing) on the spread of COVID-19. Below, we explain in more detail how to achieve these goals, and present sample simulation results in Section 4.
3.1. Understanding the potential impact of mutations. In light of the discussion above, we believe it is of utmost importance to incorporate potential mutations and evolution into the mathematical models used for predicting the future spread of COVID-19. This may help us better prepare for different mutation scenarios, including worst-cases for the current or future pandemics. For example, it may help us understand the effects of a new strain emerging with significantly different spreading characteristics than existing ones. In Section 4, we provide several examples from simulations under different mutation scenarios to demonstrate these ideas.
3.2. Understanding the impact of mitigation strategies. In addition to understanding the potential impact of different mutation scenarios, we believe that the multiple-strain model (and the accompanying results) can also help in evaluating the impact of some mitigation strategies implemented to slow down the spread. At a high level, this is motivated by the intuitive similarity between "a virus mutating and becoming less able to infect individuals" and "people following mitigation measures (e.g., social distancing, wearing masks, etc.) and thus becoming less likely to contract the disease". The analogy is strengthened by the fact that those who do not follow mitigation measures are more likely to infect each other (in a manner similar to carrying a different strain of the virus) than those who follow the guidelines.
Below, we formally show how the impact of people wearing masks can be captured by the multiple-strain model in Eletreby et al. (2020). In Section 4, we provide simulation results for a few cases to demonstrate this analogy and show that the analytical results in Eletreby et al. (2020) can be directly used to determine the probability of emergence under the mask model introduced below.
Mapping the impact of wearing masks to the mutation model. We introduce a potential approach to understanding the impact of mask-wearing through the mutation model introduced in Eletreby et al. (2020). We first define some notation. Let $T_{NN}$, $T_{NM}$, $T_{MN}$, $T_{MM}$ denote the probabilities that an infectious individual will transfer the virus to each of their contacts in the network (independently from each other) according to the four possibilities given below.
• $T_{MM}$: probability that a mask-wearing individual infects a mask-wearing individual.
• $T_{MN}$: probability that a mask-wearing individual infects a non-mask-wearing individual.
• $T_{NM}$: probability that a non-mask-wearing individual infects a mask-wearing individual.
• $T_{NN}$: probability that a non-mask-wearing individual infects a non-mask-wearing individual.
We assume that these probabilities are ordered as
$$T_{MM} < T_{MN} < T_{NM} < T_{NN}. \tag{3.1}$$
The assumption $T_{MN} < T_{NM}$ can be explained as follows. The mask is likely to be useful in limiting the droplets emanating from an infectious person, but not as effective in preventing a healthy person from catching the virus from a non-mask-wearing, infectious individual. Finally, we assume that each individual has a probability $p$ of wearing a mask independently from others. As a starting point, it is convenient to assume that mask-wearing behavior is independent from the network structure. Future studies may focus on more complex cases to capture the notion that if most of one's friends are not wearing a mask, then he/she is also likely to not wear a mask (and vice versa). Henceforth, we refer to the spreading model with parameters $p$, $T_{MM}$, $T_{MN}$, $T_{NM}$, $T_{NN}$ described above as the mask model.
In the SIR/bond percolation model (Allard et al., 2009; Newman, 2002; Yagan et al., 2013), the main parameter of interest is the transmissibility of the virus, which represents the mean probability that an infectious person transfers the virus to a susceptible person. Here, the mean is taken because not every individual has the same contact frequency/behavior with each of their neighbors in the network. In the formulation described above, we can define and calculate two transmissibility parameters, one for mask-wearing individuals (say, $T_1$) and one for non-mask-wearing individuals (say, $T_2$). Initially, the probability that a susceptible vertex wears a mask is $p$, so we may compute the initial transmissibilities $T_1$ and $T_2$ by first conditioning on the mask-wearing status of a susceptible neighbor and then taking an expectation:
$$T_1 = p\,T_{MM} + (1-p)\,T_{MN}, \qquad T_2 = p\,T_{NM} + (1-p)\,T_{NN}. \tag{3.2}$$
With the ordering given at (3.1), we obtain $T_1 < T_2$. The parameters $T_1$, $T_2$ in (3.2) can be viewed as the transmission probabilities of two different strains of the virus. Those wearing a mask are assumed to be carrying strain-1, which has a smaller transmissibility than strain-2 carried by non-mask-wearers. Next, we calculate the mutation probabilities; e.g., the probability that an individual who is infected by a strain-1-carrying individual (i.e., a mask-wearer) transmits to its contacts strain-2 of the virus (i.e., transmits with the probability $T_2$ associated with non-mask-wearers). For a newly infected individual whose mask-wearing behavior is unknown, assume that we know the mask-wearing behavior of the person who infected them. We can calculate the posterior probability of this newly infected person wearing a mask as
$$P[\text{newly infected is wearing a mask at time } t \mid \text{they were infected by a mask-wearer}] = \frac{p\,T_{MM}}{p\,T_{MM} + (1-p)\,T_{MN}} = \mu_{11}.$$
In the mutation model of Eletreby et al. (2020), $\mu_{11}$ represents the probability that a person infected by strain-1 of the virus remains in strain-1. In the mask model, the quantity $\mu_{11}$ has a potentially useful meaning as well. In particular, each person infected by a mask-wearer will also be wearing a mask with probability $\mu_{11}$ (and not wearing a mask with probability $\mu_{12} = 1 - \mu_{11}$). Put differently, for a mask-wearing infectious person, $\mu_{11}$ (respectively, $\mu_{12}$) represents the expected fraction of mask-wearers (respectively, non-mask-wearers) among those that they infect.
Following the same approach, it is easy to compute all four mutation probabilities and thus the mutation probability matrix. We have
$$\boldsymbol{\mu} = \begin{bmatrix} \mu_{11} & \mu_{12} \\ \mu_{21} & \mu_{22} \end{bmatrix} = \begin{bmatrix} \dfrac{p\,T_{MM}}{p\,T_{MM} + (1-p)\,T_{MN}} & \dfrac{(1-p)\,T_{MN}}{p\,T_{MM} + (1-p)\,T_{MN}} \\[2ex] \dfrac{p\,T_{NM}}{p\,T_{NM} + (1-p)\,T_{NN}} & \dfrac{(1-p)\,T_{NN}}{p\,T_{NM} + (1-p)\,T_{NN}} \end{bmatrix}. \tag{3.3}$$
Using the analogy between the mask model and the multiple-strain model with mutations, we can leverage the analytical results obtained in Eletreby et al. (2020) and Alexander and Day (2010) to study the mask model. It is important to note that the two models are equivalent (with the analogy introduced above) as long as the parameter $p$ gives the probability that a susceptible individual wears a mask throughout the spreading process. In other words, the analogy would lead to exact results at the early stages of the spreading process, i.e., from the time the population is naive until a significant fraction of the population becomes infected. After that point, we would expect the fraction of susceptible individuals who wear a mask to increase as more people are infected (given that non-mask-wearers are more likely to be infected than mask-wearers), making it necessary to render the parameter $p$, and thus the mutation probabilities in (3.3) and transmissibilities in (3.2), time-varying. Because emergence is determined by the early stages of the process, the probability that an epidemic emerges is the same in both models, meaning that previous results on the mutation model (Alexander & Day, 2010; Eletreby et al., 2020) can be directly used in calculating the probability of emergence in the mask model; see Section 3.3 for a formal discussion of this. Although the results of Eletreby et al. (2020) do not lead to exact predictions for the expected size of the epidemic under the mask model, they can still be used to obtain rough estimates for the epidemic size, as we illustrate in Figure 6 in Section 4.
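The mapping from the mask model to the two-strain parameters can be made concrete with a short computation. The sketch below is an illustration under our reading of the text, not code from the cited works: it assumes that the transmissibilities are obtained by averaging over the mask status of a susceptible neighbor and that the mutation probabilities are the corresponding Bayes posteriors. The function name `mask_to_mutation` is hypothetical, and the formulas require $T_1, T_2 > 0$.

```python
def mask_to_mutation(p, T_MM, T_MN, T_NM, T_NN):
    """Map mask-model parameters to two-strain parameters (illustrative sketch).

    p    : probability that an individual wears a mask
    T_XY : probability that an infectious individual of type X infects a
           susceptible contact of type Y (M = mask-wearer, N = non-mask-wearer)
    Returns (T1, T2, mu) where mu[i][j] is the probability that a person
    infected by type i is of type j (index 0 = mask-wearer, 1 = non-mask-wearer).
    """
    # Average over the mask status of the susceptible neighbour.
    T1 = p * T_MM + (1 - p) * T_MN   # infector wears a mask ("strain-1")
    T2 = p * T_NM + (1 - p) * T_NN   # infector wears no mask ("strain-2")
    # Posterior mask status of a newly infected person, given the infector's type.
    mu = [
        [p * T_MM / T1, (1 - p) * T_MN / T1],
        [p * T_NM / T2, (1 - p) * T_NN / T2],
    ]
    return T1, T2, mu
```

Each row of the returned matrix sums to one by construction, which is the property the mutation matrix of the two-strain model must satisfy.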
3.3. The probability of emergence. We now provide a formal argument to show that the analytical results from Alexander and Day (2010) and Eletreby et al. (2020) on the mutation model yield precise results for the probability of epidemic emergence under the mask model. Recently, Tian et al. (2020) derived the probability that an epidemic emerges in the mask model by a direct analysis of the model. We briefly review their results. Let $\{p_k\}_{k=0}^{\infty}$ be the degree distribution of the contact network, where $p_k$ is the probability that a vertex has degree $k$. Let $g$ be the probability generating function (PGF) for the degree distribution, defined by
$$g(x) = \sum_{k=0}^{\infty} p_k x^k.$$
We also define $G$ to be the PGF for the excess degree distribution, defined by
$$G(x) = \sum_{k=1}^{\infty} \frac{k\,p_k}{\langle k \rangle}\, x^{k-1},$$
where $\langle k \rangle = \sum_{k=0}^{\infty} k\,p_k$ is the mean degree. They showed that the PGF for the number of infected neighbors of patient zero of each type (mask-wearing and non-mask-wearing) is given by
$$\gamma_M(s,t) = g\big(1 - p\,T_{MM}(1-s) - (1-p)\,T_{MN}(1-t)\big), \qquad \gamma_N(s,t) = g\big(1 - p\,T_{NM}(1-s) - (1-p)\,T_{NN}(1-t)\big), \tag{3.5}$$
where $\gamma_M$ is the PGF if patient zero wears a mask, and $\gamma_N$ is the PGF if patient zero does not wear a mask. Similarly, the PGF for the number of infected neighbors of a later-generation infective of each type is given by
$$\Gamma_M(s,t) = G\big(1 - p\,T_{MM}(1-s) - (1-p)\,T_{MN}(1-t)\big), \qquad \Gamma_N(s,t) = G\big(1 - p\,T_{NM}(1-s) - (1-p)\,T_{NN}(1-t)\big), \tag{3.6}$$
where, as before, $\Gamma_M$ is the PGF if the later-generation infective wears a mask, and $\Gamma_N$ is the PGF if the later-generation infective does not wear a mask. Finally, using well-known results from branching process theory, if $(Q_1, Q_2)$ is the smallest non-negative solution of the equation $(s, t) = (\Gamma_M(s, t), \Gamma_N(s, t))$, then we have $(P_1, P_2) = (\gamma_M(Q_1, Q_2), \gamma_N(Q_1, Q_2))$, where $P_1$ (resp., $P_2$) is the probability that the epidemic dies out in finite time if patient zero wears a mask (resp., does not wear a mask). The probability of emergence is the probability of the complement of the aforementioned event, given by $1 - P_1$ (resp., $1 - P_2$) if patient zero wears a mask (resp., does not wear a mask). A key insight from Tian et al. (2020) is that the PGFs in (3.5), (3.6) are the same PGFs used in the derivation of the probability of emergence in a two-strain model with mutations with transmissibilities $T_1$, $T_2$ and mutation probabilities $\boldsymbol{\mu}$. This shows that the probability of emergence is the same in the mask model and the mutation model.
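The fixed-point characterization above lends itself to a simple numerical procedure. The following is a minimal sketch, not the authors' code: it assumes the standard two-type form in which an infective of type $i$ has offspring PGF $G(1 - T_i + T_i(\mu_{i1} s + \mu_{i2} t))$, and it finds the smallest non-negative fixed point by iterating from $(0, 0)$. The names `poisson_pgf` and `emergence_probabilities` are hypothetical; for a Poisson degree distribution the degree PGF and the excess-degree PGF coincide.

```python
import math


def poisson_pgf(lam):
    """PGF of a Poisson(lam) degree distribution (equal to its excess-degree PGF)."""
    return lambda x: math.exp(lam * (x - 1.0))


def emergence_probabilities(T, mu, g, G, tol=1e-12, max_iter=100000):
    """Probability of emergence for each type of patient zero (illustrative sketch).

    T  : (T_1, T_2) transmissibilities of the two types
    mu : 2x2 row-stochastic mutation matrix
    g  : PGF of the degree distribution
    G  : PGF of the excess-degree distribution
    """
    def arg(i, q):
        # Probability that one edge from a type-i infective does not start
        # an infinite chain of infections, given extinction probabilities q.
        return 1.0 - T[i] + T[i] * (mu[i][0] * q[0] + mu[i][1] * q[1])

    # Iterating from (0, 0) converges monotonically to the smallest fixed point.
    q = [0.0, 0.0]
    for _ in range(max_iter):
        new_q = [G(arg(0, q)), G(arg(1, q))]
        converged = max(abs(new_q[0] - q[0]), abs(new_q[1] - q[1])) < tol
        q = new_q
        if converged:
            break
    # Patient zero uses the degree PGF g instead of the excess-degree PGF G.
    return [1.0 - g(arg(i, q)) for i in range(2)]
```

Convergence of this plain iteration slows down close to the epidemic threshold, so a larger iteration budget or a root-finding method may be preferable there.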
In Tian et al. (2020), the authors also derived the epidemic threshold. Define the matrices
$$\mathbf{T} = \begin{bmatrix} T_{MM} & T_{MN} \\ T_{NM} & T_{NN} \end{bmatrix}, \qquad \mathbf{p} = \begin{bmatrix} p & 0 \\ 0 & 1-p \end{bmatrix}.$$
Then the basic reproduction number is given by
$$R_0 = \frac{\langle k^2 \rangle - \langle k \rangle}{\langle k \rangle}\, \rho(\mathbf{T}\mathbf{p}), \tag{3.7}$$
where $\rho(\mathbf{T}\mathbf{p})$ is the spectral radius of $\mathbf{T}\mathbf{p}$. Then, the epidemic threshold is expressed in the usual manner through the basic reproduction number, with a positive probability of epidemic emergence only if $R_0 > 1$; if $R_0 < 1$, the spreading process almost surely dies out without reaching a positive fraction of the population.
It would be of interest to extend the results of Tian et al. (2020) and study a model that incorporates both mutations and the impact of mask-wearing. This would involve deriving a new set of formulas for the PGFs of the number of infected nodes under different strains and mask-wearing behavior. It might be possible to accomplish this by extending (3.5)-(3.6) in combination with the corresponding formulas from Eletreby et al. (2020) to a new, and more complicated, set of formulas; e.g., we would need to have four formulas for the case with two viral strains. Another interesting direction for future work would be to use a multi-layer contact network with layers representing different types of relationships between nodes; e.g., co-worker, neighbor, school, etc. This might help in understanding the impact of reducing, or entirely eliminating, the contact rate in certain layers (e.g., by closing schools) in slowing down the spread of the virus.

SIMULATION RESULTS
In this section, we present some simulation results pertaining to the mutation model and the mask model. The goal of our simulations is to qualitatively illustrate how, in the case of the mutation model, an unlikely but highly virulent strain of a virus can rapidly spread through a contact network, and, in the case of the mask model, how the efficacy of masks and the fraction of mask-users influence the spread of a virus. It is of special interest to understand how masks and viral mutations can affect the spread of SARS-CoV-2, so we choose our simulation parameters to match those of SARS-CoV-2. We emphasize that there is still much that is unknown about the spreading dynamics of COVID-19, particularly as to how asymptomatic individuals spread the virus. Thus our results should be taken as a qualitative assessment of how the COVID-19 pandemic may change and evolve.
We begin by reviewing our simulation model. As mentioned earlier, the mask and mutation models abstract the complex mechanics of viral spreading into the transmissibility parameters, which represent the probability that an individual eventually infects a given neighbor. While such a model can be readily analyzed from a theoretical lens, it does not describe the time-varying behavior of the epidemic (e.g., where the peak in cases occurs, how long the epidemic persists before the population recovers, etc.). To properly account for time-varying behavior, we assume that the time it takes for an infected individual to transmit the virus to a neighbor is an Exp($r$) random variable, which is independent of all other randomness in the system. The parameter $r$ is the contact rate, which represents the average number of potentially viral-spreading interactions in a given day. The contact rate may depend on the strain that is being transmitted, or could depend on whether the host or target is wearing a mask. If we denote the infectious period, which is the amount of time an infected person is contagious, by $IP$, the transmissibility of the virus is given by
$$T = 1 - e^{-r \cdot IP}.$$
Parameter selection. The reproductive number, $R_0$, of SARS-CoV-2 is estimated to be between 1.4 and 3.9 based on data from the earliest 425 confirmed cases in Wuhan, China; we will use the conservative estimate $R_0 = 4.0$. The same authors estimate the serial interval, which is the average time between infections, to be 7.5 days. For our model, this implies that the contact rate is $r = 1/7.5$ days$^{-1}$. We next estimate the infectious period. There have been multiple studies on the viral shedding period (Bullard et al., 2020; van Kampen et al., 2020; Wölfel et al., 2020), which indicate that the median viral shedding period after the onset of symptoms is at most 8 days. A recent survey of epidemiological studies of COVID-19 by McAloon et al. (2020) showed that the median time to symptom onset is 5 days. Since viral shedding can occur even before the onset of symptoms (He et al., 2020), a conservative estimate of the median infectious period (the amount of time for which an individual may infect others) is 13 days. Since the time between infections is an exponential random variable with a mean of 7.5 days, the transmissibility can be computed as
$$T = 1 - e^{-13/7.5} \approx 0.82.$$
If we assume that the contact network has a Poisson($\lambda$) degree distribution, (3.7) implies that $R_0 = \lambda T$. If we use the conservative estimate $R_0 = 4$, this shows that a reasonable approximation is $\lambda = 5$; i.e., each individual interacts with 5 others, on average.
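The arithmetic behind these parameter choices can be checked with a few lines of Python. This is a minimal sketch under the assumption that the infectious period is a fixed 13 days and the transmission time is exponential with rate $r = 1/7.5$ per day, so that the transmissibility is the probability that transmission occurs within the infectious period; the function name `transmissibility` is hypothetical.

```python
import math


def transmissibility(rate, infectious_period):
    """P(an Exp(rate) transmission time falls within a fixed infectious period)."""
    return 1.0 - math.exp(-rate * infectious_period)


# Values taken from the text: serial interval of 7.5 days, infectious period of 13 days.
T = transmissibility(1.0 / 7.5, 13.0)

# For a Poisson(lambda) contact network, R_0 = lambda * T, so lambda = R_0 / T.
R0 = 4.0
mean_degree = R0 / T

print(round(T, 2), round(mean_degree, 2))
```

The resulting mean degree is slightly below 5, which is consistent with the rounded choice $\lambda = 5$ used in the simulations.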

Emergence of highly contagious strains.
We consider a scenario where there is initially a single strain of the virus, but with some very small probability it may mutate into a highly contagious strain. In our simulations, we set $T_1 = 0.3$, $T_2 = 0.8$, $\mu_{12} \in \{0, 0.001, 0.003\}$ and $\mu_{22} = 1.0$. We simulated the epidemic on random networks with 5000 nodes and a Poisson(5) degree distribution. We conducted 1000 independent simulations, and averaged the time-varying behavior across all experiments. Our results are displayed in Figure 2. There are several interesting phenomena that are highlighted by these plots. While Strain 1 dominates the infection curve when $\mu_{12} = 0$, Strain 2 dominates Strain 1 by a wide margin even if $\mu_{12}$ is as small as 0.001; this margin is dramatically larger in the case where $\mu_{12} = 0.003$. Hence even if there is a very small chance that a virus could evolve into a highly contagious strain, it is likely in some cases that the more infectious strain will spread the most. Our forthcoming conference publication (Sridhar et al., 2021) includes simulations for additional transmissibilities which illustrate the same qualitative phenomenon.

[Figure caption fragment] We see that surgical masks significantly impede the spread of the virus compared to cloth masks as the fraction of mask-wearers increases.

The impact of mask-wearing.
As before, we assume that the contact rate between two non-mask-wearing individuals is $r = 1/7.5$ days$^{-1}$. When one or both of the individuals are wearing a mask, the contact rate decreases since it is harder to transmit the virus. To make this idea more precise, let $\epsilon_i$ denote the inward efficiency of a mask, defined to be the probability that the mask fails to block the pathogen from coming inside the mask. Similarly, let $\epsilon_o$ denote the outward efficiency of a mask, defined to be the probability that a mask-wearing individual transmits a pathogen to others. An inward efficiency of $\epsilon_i = 0$ implies that no pathogen can pass from the outside to the inside of a mask, while $\epsilon_i = 1$ means that the mask is useless at blocking pathogens from outside. Similar interpretations hold for $\epsilon_o = 0, 1$. Let $r_{MM}$ denote the contact rate between two mask-wearers, let $r_{MN}$ denote the contact rate between a mask-wearing infective and a non-mask-wearing neighbor, with similar interpretations for $r_{NM}$ and $r_{NN}$. We can then write

$$r_{MM} = \epsilon_i \epsilon_o r, \qquad r_{MN} = \epsilon_o r, \qquad r_{NM} = \epsilon_i r, \qquad r_{NN} = r.$$

Using the estimate IP = 13 days, we can compute the transmissibility $T_{MM}$ as

$$T_{MM} = 1 - e^{-r_{MM} \cdot \mathrm{IP}} = 1 - e^{-13 \epsilon_i \epsilon_o / 7.5}.$$

We can derive $T_{MN}$, $T_{NM}$, $T_{NN}$ through similar computations.
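To make the mask-dependent transmissibilities concrete, the sketch below tabulates them for the cloth and surgical mask estimates used later in this section (efficiencies 0.5 and 0.2). It assumes that each mask scales the baseline contact rate multiplicatively (outward efficiency for a masked infective, inward efficiency for a masked neighbor) and that each transmissibility has the form 1 − exp(−rate · IP); the function name and dictionary keys are illustrative only.

```python
import math


def mask_transmissibilities(eps_in, eps_out, r=1 / 7.5, ip=13.0):
    """Return T for each (infective, neighbor) mask combination.

    Keys: first letter = infective, second = neighbor; M = mask, N = no mask.
    Assumes contact rates scale multiplicatively with mask efficiencies.
    """
    rates = {
        "MM": eps_in * eps_out * r,  # both wear masks
        "MN": eps_out * r,           # masked infective, unmasked neighbor
        "NM": eps_in * r,            # unmasked infective, masked neighbor
        "NN": r,                     # neither wears a mask
    }
    return {pair: 1.0 - math.exp(-rate * ip) for pair, rate in rates.items()}


# Equal inward and outward efficiencies, as assumed in the text.
for mask_type, eps in [("cloth", 0.5), ("surgical", 0.2)]:
    T = mask_transmissibilities(eps, eps)
    print(mask_type, {pair: round(value, 3) for pair, value in T.items()})
```

With equal inward and outward efficiencies, the masked-to-unmasked and unmasked-to-masked transmissibilities coincide, and the mask-to-mask transmissibility is the smallest of the four.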
Several papers have assessed the effectiveness of different types of masks in preventing the transmission of respiratory droplets; we refer the reader to the review in Eikenberry et al. (2020, Section 2.3). For simplicity, we shall assume that the inward and outward efficiencies are the same: $\epsilon = \epsilon_i = \epsilon_o$. Based on the discussion by Eikenberry et al. (2020), a reasonable estimate of $\epsilon$ is 0.5 if cloth masks are used, and 0.2 if surgical masks are used. Our first set of results can be found in Figure 3, which shows how the infection curves change as the fraction of mask-wearers, p, varies between 0 and 1 when cloth masks or surgical masks are used. In Figure 4, we display the infection curves when we vary $\epsilon$ between 0 and 1 and fix p to be a constant. We set p = 0.77, which was the estimated fraction of mask-wearers in New Jersey as of November 20, 2020 ("IHME: COVID-19 Projections", 2020). For all the plots in Figures 3 and 4, the curves shown are generated by averaging over the infection curves of 100 independent simulations run on contact networks with 5000 nodes and a Poisson(5) degree distribution. Additional simulations regarding the time-varying behavior of the mask model can be found in Sridhar et al. (2021).

Next, we compare the mask model to the analytic results of the corresponding mutation model. In Figure 5, we study the probability of emergence as a function of p when cloth masks or surgical masks are used, as well as the probability of emergence as a function of $\epsilon$. To generate the plots in Figure 5, we conducted 10,000 independent simulations on contact networks with 50,000 nodes and a Poisson(5) degree distribution. To obtain the empirical estimates of the probability of emergence, we counted the fraction of simulations where the epidemic infected at least 5% of the population. We see a good fit between the empirical and theoretical values; while there are a few deviations, we expect that these are due to finite-sample effects.

Figure 5. The probability of emergence as a function of p and $\epsilon$. Lines (marked as "theory") correspond to analytical results from Eletreby et al. (2020) and Alexander and Day (2010) corresponding to the mutation model, while the symbols are obtained by empirical simulation of the mask model. Plot (a) studies the probability of emergence as a function of p when $\epsilon = 0.5$ (cloth masks) and plot (b) does the same when $\epsilon = 0.2$ (surgical masks). Plot (c) considers the probability of emergence as a function of $\epsilon$ when p = 0.77. The blue points/curves correspond to the case where patient zero wears a mask, and the red points/curves correspond to the case where patient zero does not wear a mask. For the most part, the empirical data matches the theoretical curves very well; we expect that the few fluctuations are due to finite-sample effects.

Finally, we study the expected size of the epidemic as a function of p as well as $\epsilon$ in Figure 6. To generate the empirical data in this figure, we ran 100 independent simulations on contact networks with 5000 nodes and a Poisson(5) degree distribution. If the epidemic did not infect more than 5% of the population, we threw out the simulation result and re-ran the simulation until it infected more than 5% of the population. This allowed us to estimate the expected epidemic size conditioned on the ultimate survival of the epidemic.
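The empirical estimate of the probability of emergence described above (the fraction of runs in which at least 5% of the population is infected) can be sketched as follows. This is an illustrative stand-in, not the simulation code used for the figures, and it makes several assumptions: the contact network is a uniformly random graph with a fixed number of edges, whose degrees are only approximately Poisson; timing is ignored, so each infected node infects each susceptible neighbor independently with the transmissibility for their mask statuses; inward and outward efficiencies are equal; and patient zero is node 0. All names and default sizes are hypothetical and much smaller than those reported in the text.

```python
import math
import random
from collections import deque


def poisson_like_graph(n, mean_degree, rng):
    """Random graph with n*mean_degree/2 distinct edges; degrees are
    approximately Poisson(mean_degree) for large n (stand-in assumption)."""
    adj = [set() for _ in range(n)]
    target_edges = int(n * mean_degree / 2)
    edges = 0
    while edges < target_edges:
        u, v = rng.randrange(n), rng.randrange(n)
        if u != v and v not in adj[u]:
            adj[u].add(v)
            adj[v].add(u)
            edges += 1
    return adj


def outbreak_size(adj, masked, T, seed, rng):
    """Final number of infected nodes when each infected node infects each
    susceptible neighbor independently with probability T[(src, dst)]."""
    infected = {seed}
    queue = deque([seed])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in infected and rng.random() < T[(masked[u], masked[v])]:
                infected.add(v)
                queue.append(v)
    return len(infected)


def emergence_probability(p, eps, n=2000, mean_degree=5.0, runs=200,
                          threshold=0.05, seed_masked=False, rng_seed=0):
    """Fraction of runs in which at least threshold*n nodes are infected."""
    rng = random.Random(rng_seed)
    r, ip = 1 / 7.5, 13.0
    scale = {True: eps, False: 1.0}  # a mask scales the contact rate by eps
    T = {(a, b): 1.0 - math.exp(-scale[a] * scale[b] * r * ip)
         for a in (True, False) for b in (True, False)}
    hits = 0
    for _ in range(runs):
        adj = poisson_like_graph(n, mean_degree, rng)
        masked = [rng.random() < p for _ in range(n)]
        masked[0] = seed_masked  # mask status of patient zero
        if outbreak_size(adj, masked, T, 0, rng) >= threshold * n:
            hits += 1
    return hits / runs
```

For instance, calling `emergence_probability(p=0.77, eps=0.5)` would give a rough estimate for cloth masks with an unmasked patient zero; larger values of `n` and `runs` reduce the finite-sample noise at the cost of running time.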
While the analytical predictions from the mutation model do not match the empirical results from the mask model perfectly concerning the expected epidemic size, they still provide a ballpark estimate and can potentially be useful. As explained before, the mismatch can be attributed to the fact that the fraction p of the mask-wearers among the susceptible population changes over time as the pathogen spreads to a significant fraction of the population, leading to a change in the mutation probabilities from the values calculated via (3.3). Further simulations in Sridhar et al. (2021) show that, in some cases, even though the expected size of the epidemic differs in the mask and mutation models, the time-varying behaviors of the two models are very close in the earlier stages of the epidemic propagation.

CONCLUSION
The COVID-19 pandemic has claimed millions of lives and disrupted the lives of billions. A key scientific goal in the fight against COVID-19 is to develop mathematical models to understand and predict its spreading behavior, as well as to assess the effectiveness of mitigation strategies implemented to limit its spread.
In this paper, we discuss how the multiple-strain epidemic model with mutations studied in Eletreby et al. (2020) can help address key questions concerning the spread of COVID-19. Highlighting the recent reports on a mutation of SARS-CoV-2 that is believed to be more transmissible than the original strain, we discuss the importance of incorporating mutation and evolutionary adaptations in epidemic models. We also demonstrate how the results of Eletreby et al. (2020) can be used to assess the effectiveness of mask-wearing in limiting the spread of COVID-19. We present simulation results on a few sample cases to demonstrate our ideas and the utility of the findings of Eletreby et al. (2020) in the context of COVID-19.
We believe that this paper may stimulate more research efforts on incorporating virus mutation and evolutionary adaptations in epidemic models. We also expect that the analogy established here between the mutation model and the mask model can help facilitate sound assessment of the effectiveness of masks in mitigating the spread of COVID-19.

Disclosure statement. The authors have no conflicts of interest to declare.