
Predictions, Role of Interventions, and Effects of a Historic National Lockdown in India's Response to the COVID-19 Pandemic: Data Science Call to Arms

Published on Jun 09, 2020
You're viewing an older Release (#5) of this Pub.

  • This Release (#5) was created on Jun 10, 2020
  • The latest Release (#8) was created on May 23, 2022.


With only 536 COVID-19 cases and 11 fatalities, India took the historic decision of a 21-day national lockdown on March 25, 2020. The lockdown was first extended to May 3 soon after the analysis of this article was completed, and then to May 18 while this article was being revised. In this article, we use a Bayesian extension of the susceptible-infected-removed (eSIR) model designed for intervention forecasting to study the short- and long-term impact of an initial 21-day lockdown on the total number of COVID-19 infections in India compared to other, less severe nonpharmaceutical interventions. We compare effects of hypothetical durations of lockdown on reducing the number of active and new infections. We find that the lockdown, if implemented correctly, can reduce the total number of cases in the short term, and buy India invaluable time to prepare its health care and disease-monitoring system. Our analysis shows we need to have some measures of suppression in place after the lockdown for increased benefit (as measured by reduction in the number of cases). A longer lockdown from 42–56 days is preferable to substantially ‘flatten the curve’ when compared to 21–28 days of lockdown. Our models focus solely on projecting the number of COVID-19 infections and thus inform policymakers about one aspect of this multifaceted decision-making problem. We conclude with a discussion on the pivotal role of increased testing, reliable and transparent data, proper uncertainty quantification, accurate interpretation of forecasting models, reproducible data science methods, and tools that can enable data-driven policymaking during a pandemic. Our software products are available at

Keywords: basic reproduction number, coronavirus, credible interval, India, intervention forecasting, SIR model

This article includes a select anonymous review document. Anonymous review is a vital process for high quality publications in HDSR. With the permission of the authors and reviewers, we selectively post anonymous review reports, sometimes with authors' responses, that we believe can further enrich the intellectual journey generated by the corresponding publication.

1. Introduction

Four months since the first case of COVID-19 in Wuhan, China, the SARS-CoV-2 virus has engulfed the world and has been declared a global pandemic (World Health Organization [WHO], 2020b). The number of confirmed cases worldwide stands at a staggering 1,930,780 (as of 9:20 a.m. EST April 14, 2020, Microsoft, n.d.). Of these, 10,815 confirmed cases are from India (Figure 1), the world’s largest democracy with a population of 1.34 billion (compared to China at 1.39 billion and the United States at 325.7 million) (World Bank, n.d.). India has been vigilant and early in instituting strong public health interventions, including sealing the borders with a travel ban/canceling almost all visas, closing schools and colleges, and diligently following up with community inspection of suspected/exposed cases with respect to adherence to quarantine recommendations (Table 1). On March 24, India took the historic decision of a 21-day national lockdown starting March 25, when it had reported only 536 COVID-19 cases and 11 fatalities. In the subsequent days, we have seen a steady growth in the number of new cases and fatalities, with growth rates slower than in other affected countries, but in 21 days, the curve has not yet ‘turned the corner’ or shown a steady decline in the number of newly diagnosed cases (Figure 2). All forecasting models in this article use data up to April 14 with the premise of a 21-day lockdown in place.

Figure 1. Description of the cases, recovered, and fatalities in India with landmark policy/recommendations. Data used up to April 14.

While India seems to have done relatively well in controlling the number of confirmed cases compared to other countries in the early phase of the pandemic (Figure 2), there is a critical missing or unknown component in this assessment: ‘The number of truly affected cases,’ which depends on the extent of testing, the accuracy of the test results and, in particular, the frequency and scale of testing of asymptomatic cases who may have been exposed. The frequency of testing has been low in India. According to the Indian Council of Medical Research (ICMR; 2020), only 229,426 subjects have been tested as of April 14 (<0.03% of the population). When there is no approved vaccine or drug for treating COVID-19, entering phase 2 or phase 3 of escalation will have devastating consequences on both the already overstretched health care system of India, and India’s large at-risk subpopulations (see Appendix Table A1). As seen for other countries like the United States or Italy, COVID-19 enters gradually and then explodes suddenly.

Table 1. Timeline of COVID-19 Interventions in India



March 3, 2020

●  India issues travel ban on four countries - China, South Korea, Italy, and Iran

March 6, 2020

●  Union Health Ministry issues advisory to avoid mass gatherings

March 7, 2020

●  Mayor of Agra urges the Union government to close down historical monuments including Taj Mahal

●  Kuwait suspends flights to India

March 9, 2020

●  Qatar puts India on travel ban list

March 10, 2020

●  Manipur closes its border with Myanmar

March 11, 2020

●  India suspends all visas/e-visas granted to nationals of France, Germany, and Spain on or before today

March 12, 2020

●  WHO declares the COVID-19 outbreak as 'pandemic'

●  India suspends all visas excepting those for diplomatic, UN, or international bodies, official and employment purposes until April 15

●  India reports 1st death

March 13, 2020

●  India reports 2nd death

●  Several academic institutions (e.g., JNU, IIT, IIM) cancel classes/convocations; some hostels close

March 16, 2020

●  Central government proposes social distancing measures until March 31

●  India bans passengers from EU countries, UK, and Turkey until March 31

●  Central government recommends closure of educational institutions until March 31

March 17, 2020

●  Taj Mahal is shut until March 31; ASI closes 3,000 monuments and 200 museums

●  Mandatory quarantine is imposed on passengers coming from UAE, Qatar, Oman, and Kuwait

●  India is heading toward a countrywide lockdown mode

March 19, 2020

●  India halts all incoming commercial international flights for one week

●  Some state governments ban public transportation

●  Prime Minister urges people of India to observe self-imposed curfew (‘Janata Curfew’) on March 22

March 20, 2020

●  Maharashtra announces lockdown in Mumbai, Nagpur, and Pune

●  Jawaharlal Nehru University (JNU) in Delhi orders students to vacate hostels

March 21, 2020

●  Private labs can conduct COVID-19 tests, says Maharashtra government

●  Rajasthan government declares lockdown until March 31

March 22, 2020

●  12 states, including Telangana and Delhi, announce lockdown until March 31

●  International commercial passenger flights are disallowed to land in India for one week starting today

●  Railways suspend all train services until March 31

March 23, 2020

●  Central government orders all states in India to impose lockdown

●  Legal action is to be initiated against people violating lockdown measures

March 24, 2020

●  Prime Minister of India announces lockdown for 21 days as country records 552 COVID-19 cases and 10 deaths

March 28, 2020

●  Central government unveils stimulus package to help those hit by 21-day lockdown

●  Priorities are to construct COVID-19 hospitals, sample testing, contact-tracing, and social distancing: Union Health ministry

April 2, 2020

●  Common exit strategy necessary for ‘staggered’ relaxations after lockdown period ends, prime minister tells chief ministers

April 6, 2020

●  Prime Minister instructs union ministers to prepare a graded plan to gradually open departments that are not COVID-19 hotspots

April 8, 2020

●  Prime minister and chief ministers decide on lockdown extension to April 11

April 9, 2020

●  Odisha extends lockdown until April 30 and becomes first Indian state to do so


We provide a table listing other highly affected countries along with their first reported case, initial interventions, crude fatality rates, and active case counts in Appendix Table A2 for reference. The estimated capacity of hospital beds in India is 70 per 100,000 people (World Bank, 2020), which is an upper bound on treatment capacity. Given an average occupancy rate of 75%, only a quarter of these are available (Sindhu et al., 2019). Moreover, critically ill COVID-19 patients (about 5 to 10% of those infected) will require ICU beds and ventilator support. India has only 35,000–58,000 ICU beds, with very high occupancy rates and at most one ventilator per two ICU beds (Times of India, 2020). In order to roll out interventions and plan for health care infrastructure, robust projection models for outcomes of interest are necessary. There are many outcomes that are of potential interest to policymakers, for example: How many infected cases will be hospitalized? How many will be admitted to the ICU? How many patients will need ventilators? And, finally, what will be the mortality due to COVID-19 infections? We focus on the number of active cases as our target of prediction due to the limited data from India on the other outcomes. From other nations we know that roughly 20% of infections will probably need hospitalization (Root, 2020), 5% will need ICU admission (Guan et al., 2020), and case-fatality rates vary from 1 to 5% of those hospitalized (Oke & Henegan, 2020). This may provide crude estimates of other outcomes from case-count predictions.

Figure 2. Early phase of the epidemic and daily growth in cumulative COVID-19 case counts in India compared to other countries affected by the pandemic using data through April 14.

At the time of this writing, there exist several models that have been used to analyze the COVID-19 case-count data from India. The approaches for modeling the disease transmission and then forecasting the number of cases at a future time can be broadly categorized into two genres: exponential/Poisson type models, and compartmental epidemiologic models. For instance, Ranjan (2020) and Gupta and Shankar (2020) used the classical exponential model, S. Das (2020) used a Poisson regression model, while Deb and Majumdar (2020) used an auto-regressive moving-average model to analyze incidence pattern over time. The compartmental epidemiologic models include variations of the susceptible-infected-removed (SIR) model, which is guided by a set of differential equations relating the number of susceptible people, the number of infected people (cases), and the number of people who have been removed (either recovered or dead) at any given time. This simple SIR model has been used by Ranjan (2020) and Dhanwant and Ramanathan (2020). Singh and Adhikari (2020) used an age-structured SIR and social contact model, where an SIR model is assumed in each age category. Another extension of the SIR model is the susceptible-exposed-infectious-recovered (SEIR) model, which incorporates an additional compartment of truly exposed people that is a latent variable. Mandal et al. (2020), Chatterjee et al. (2020), Sardar et al. (2020), and Senapati et al. (2020) used one or the other variation of the SEIR model. For example, Sardar et al. (2020) used an extra compartment for lockdown to capture in-home isolation and study the effects of lockdown on future case counts. Appendix Table A3 compares and summarizes these existing models for India.

In this article, we apply a Bayesian extension of the SIR model, the extended susceptible-infected-removed (eSIR) model, to explore two primary forecasting objectives: (a) forecasting future case counts (short term and long term) with different forms of suppression measures in place (post-lockdown) and (b) studying the relative impact of length/duration of a lockdown on our predictions of cumulative COVID-19 infections. We carry out extensive sensitivity analysis to assess the robustness of our forecasting models. We conclude with a discussion regarding the need for reliable case-count data, increased testing, the importance of uncertainty quantification of the projected case counts, and transparent data science methods that can inform and influence policymaking during a pandemic. Our data science products include three articles on media studying pre- (Ray et al., 2020), during (Salvatore, Wang, et al., 2020), and post- (Salvatore, Ray, et al., 2020) lockdown effects, providing critical information for policymakers and having an extensive reach (Reuters [Ghosal, 2020], Times of India [P. Das, 2020], The Guardian [Ellis-Petersen, 2020], The Economic Times [Noronha, 2020]), an interactive and dynamic R Shiny app that daily updates forecasts as new case counts are reported, and publicly available codes for reproducible research.

The rest of the article is organized as follows. In Section 2 we describe the structure of the eSIR model, our parameter choices, and the Bayesian computational algorithm. In Section 3 we present results from analyzing the data from India that include a sensitivity analysis. We assess how our forecasting model updates itself with more accrual of data over time. In Section 4 we provide an itemized discussion of some of the salient data and data science issues related to intervention forecasting and case-count projections. Section 5 presents a brief conclusion.

2. Methods and Notation

2.1. Study Design and Data Source

We used the current daily data on number of COVID-19 infected cases, recoveries, and deaths in India to predict the number of infected and removed cases at any given time (L. Wang et al., 2020). We obtained the data (up to April 14) from the 2019 Novel Coronavirus Visual Dashboard operated by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) and from (covid19india, 2020; Dong et al., 2020; Johns Hopkins University Center for Systems Science and Engineering, n.d.). Some of our testing data came from

2.2. Our Statistical Model for Predictions

Overview. The standard SIR model was recently extended to incorporate time-varying transmission rates or time-varying quarantine protocols and is known as the eSIR model (L. Wang et al., 2020). When using the eSIR model with time-varying disease transmission rate, it can depict a series of time-varying changes caused by either external variation like government-initiated macro-isolation measures, community-level protective measures and environment changes, or internal variations like mutations and evolutions of the pathogen. To implement the eSIR model, a Bayesian hierarchical framework is assumed. Using the current time-series data on the proportions of infected and the removed people, a Markov chain Monte Carlo (MCMC) implementation of this Bayesian model provides not only posterior estimation of parameters and prevalence of all the three compartments in the SIR model, but also predicted proportions of the infected and the removed people at future time points. The R package for implementing this general model for understanding disease dynamics is publicly available at

Mathematical Framework of the eSIR Model. The eSIR model works by assuming that the true underlying probabilities of the three compartments follow a latent Markov transition process, and that we observe only the daily proportions of infected cases and removed cases. First, let us establish some notation. Assume that the observed proportions of infected and removed cases on day $t$ are denoted by $Y_t^I$ and $Y_t^R$, respectively. Further, denote the true underlying probabilities of the S, I, and R compartments on day $t$ by $\theta_t^S$, $\theta_t^I$, and $\theta_t^R$, respectively, and assume that for any $t$, $\theta_t^S + \theta_t^I + \theta_t^R = 1$. Assuming a usual SIR model on the true proportions (Appendix Figure A1-A), we have the following set of differential equations:

$$\frac{d\theta_t^S}{dt} = -\beta\theta_t^S\theta_t^I,$$
$$\frac{d\theta_t^I}{dt} = \beta\theta_t^S\theta_t^I - \gamma\theta_t^I,$$
$$\frac{d\theta_t^R}{dt} = \gamma\theta_t^I.$$

Here, $\beta > 0$ denotes the disease transmission rate, and $\gamma > 0$ denotes the removal rate. The basic reproduction number $R_0 := \beta/\gamma$ indicates the expected number of cases generated by one infected case in the absence of any intervention and assuming that the whole population is susceptible. At this stage, for the observed infected and removed proportions, we assume a Beta-Dirichlet state-space model, independent conditionally on the underlying process:

$$Y_t^I \mid \boldsymbol{\theta}_t, \boldsymbol{\tau} \sim \text{Beta}\left(\lambda^I\theta_t^I,\ \lambda^I\left(1 - \theta_t^I\right)\right),$$
$$Y_t^R \mid \boldsymbol{\theta}_t, \boldsymbol{\tau} \sim \text{Beta}\left(\lambda^R\theta_t^R,\ \lambda^R\left(1 - \theta_t^R\right)\right).$$

Further, the Markov process on the latent proportions is built as:

$$\boldsymbol{\theta}_t \mid \boldsymbol{\theta}_{t-1}, \boldsymbol{\tau} \sim \text{Dirichlet}\left(\kappa f\left(\boldsymbol{\theta}_{t-1}, \beta, \gamma\right)\right)$$

where $\boldsymbol{\theta}_t$ denotes the vector of the underlying population probabilities of the three compartments, whose mean is modeled as an unknown function of the probability vector from the previous time point, along with the transition parameters; $\boldsymbol{\tau} = (\beta, \gamma, \boldsymbol{\theta}_0^T, \lambda, \kappa)$ denotes the whole set of parameters where $\lambda^I$, $\lambda^R$ and $\kappa$ are parameters controlling variability of the observation and latent process, respectively. The function $f(\cdot)$ is then solved as the mean transition probability determined by the SIR dynamical system, using a fourth-order Runge-Kutta approximation.
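As an illustration of how the mean transition $f(\cdot)$ can be evaluated, the following is a minimal Python sketch of one classical fourth-order Runge-Kutta step applied to the SIR equations above. The function names, the one-day step size, and the example values of $\beta$, $\gamma$, and the starting proportions are assumptions made for illustration; they are not taken from the eSIR package.

```python
def sir_derivatives(theta, beta, gamma):
    """Right-hand side of the SIR equations for theta = (S, I, R) proportions."""
    s, i, _ = theta
    return (-beta * s * i, beta * s * i - gamma * i, gamma * i)


def rk4_step(theta, beta, gamma, h=1.0):
    """Advance theta by one step of size h (assumed: one day) with classical RK4."""

    def shifted(base, slope, scale):
        # base + scale * slope, component by component
        return tuple(b + scale * k for b, k in zip(base, slope))

    k1 = sir_derivatives(theta, beta, gamma)
    k2 = sir_derivatives(shifted(theta, k1, h / 2), beta, gamma)
    k3 = sir_derivatives(shifted(theta, k2, h / 2), beta, gamma)
    k4 = sir_derivatives(shifted(theta, k3, h), beta, gamma)
    return tuple(
        t + h / 6 * (a + 2 * b + 2 * c + d)
        for t, a, b, c, d in zip(theta, k1, k2, k3, k4)
    )


# Hypothetical values: beta / gamma = 2.0, with 1% of the population infected.
theta_next = rk4_step((0.99, 0.01, 0.0), beta=0.164, gamma=0.082)
```

Because the three derivatives sum to zero, the updated proportions still sum to one, which is what allows the result to be scaled by $\kappa$ and used as a Dirichlet parameter vector.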

Priors and the MCMC Algorithm Setup of the eSIR Model. The prior on the initial vector of latent probabilities is set as

$$\boldsymbol{\theta}_0 \sim \text{Dirichlet}\left(1 - Y_1^I - Y_1^R,\ Y_1^I,\ Y_1^R\right), \quad \theta_0^S = 1 - \theta_0^I - \theta_0^R.$$

The prior distribution of the basic reproduction number is $R_0 \sim \text{LogNormal}(0.582, 0.223)$ so that $E(R_0) = 2$ and $\text{SD}(R_0) = 1$, where $E$ and $\text{SD}$ denote the mean and standard deviation, respectively. The prior distribution of the removal rate is $\gamma \sim \text{LogNormal}(-2.955, 0.910)$ so that $E(\gamma) = 0.082$ and $\text{SD}(\gamma) = 0.1$. The prior mean of the removal rate $\gamma$ indicates an average infectious period of 12 days, which is originally set using estimates from the SARS outbreak in Hong Kong (Mkhatshwa & Mummert, 2010) due to the similarity between the two viruses; this value also aligns well with several recent studies on COVID-19 in China (Chen et al., 2020; Li et al., 2020; Ryu & Chun, 2020). The prior mean of the basic reproduction number, 2.0, is approximately the average of estimates from many other COVID-19 studies on the Indian population (S. Das, 2020; Deb & Majumdar, 2020; Ranjan, 2020; Sardar et al., 2020; Singh & Adhikari, 2020). We have conducted a sensitivity analysis to evaluate how robust the model is toward the prior settings using Indian population COVID-19 data. The sensitivity issue can be minimized with more observed data of a longer exponentially increasing period and stronger intensities by focusing on cities or states that are highly exposed. Note that the prior mean of the distribution of the transmission rate $\beta$ equals $\gamma R_0$. For the variability parameters, the default choice is to set large variances in both observed and latent processes, which may be adjusted over the course of the epidemic with more data becoming available:

$$\kappa,\ \lambda^I,\ \lambda^R \overset{\text{iid}}{\sim} \text{Gamma}(2, 0.0001).$$
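The log-normal hyperparameters quoted above can be checked against the stated prior means and standard deviations. The sketch below uses the standard log-normal moment formulas and assumes that the second argument of LogNormal is the variance on the log scale; that reading is our inference from the numbers (it reproduces the stated moments), not something the text states explicitly. The function names are hypothetical.

```python
import math


def lognormal_mean_sd(mu, sigma2):
    """Mean and SD of a log-normal with log-scale mean mu and log-scale variance sigma2."""
    mean = math.exp(mu + sigma2 / 2)
    sd = math.sqrt((math.exp(sigma2) - 1) * math.exp(2 * mu + sigma2))
    return mean, sd


def lognormal_params(mean, sd):
    """Invert the moment formulas: (mu, sigma2) giving the requested mean and SD."""
    sigma2 = math.log(1 + (sd / mean) ** 2)
    mu = math.log(mean) - sigma2 / 2
    return mu, sigma2


r0_mean, r0_sd = lognormal_mean_sd(0.582, 0.223)        # close to (2.0, 1.0)
gamma_mean, gamma_sd = lognormal_mean_sd(-2.955, 0.910)  # close to (0.082, 0.1)
mu_r0, sigma2_r0 = lognormal_params(2.0, 1.0)            # close to (0.582, 0.223)
```

Under the same reading, the reciprocal of the prior mean of $\gamma$, $1/0.082 \approx 12.2$, matches the 12-day average infectious period mentioned above.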

Denoting $t_0$ as the last date of data availability, and assuming that the forecast spans over the period $[t_0 + 1, T]$, our algorithm is as follows.

0. Take $M$ draws from the posterior $[\boldsymbol{\theta}_{1:t_0}, \boldsymbol{\tau} \mid \mathbf{Y}_{1:t_0}]$.

1. For each solution path $m \in \{1, \ldots, M\}$, iterate between the following two steps via MCMC.

i. Draw $\boldsymbol{\theta}_t^{(m)}$ from $\left[\boldsymbol{\theta}_t \mid \boldsymbol{\theta}_{t-1}^{(m-1)}, \boldsymbol{\tau}^{(m)}\right]$, $t \in \{t_0 + 1, \ldots, T\}$.

ii. Draw $\mathbf{Y}_t^{(m)}$ from $\left[\mathbf{Y}_t \mid \boldsymbol{\theta}_t^{(m)}, \boldsymbol{\tau}^{(m)}\right]$, $t \in \{t_0 + 1, \ldots, T\}$.
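The two forecasting steps can be sketched in Python for a single posterior draw. This is an illustrative simplification, not the rjags implementation: it assumes the posterior draw of $\boldsymbol{\tau}$ and of the latent state at $t_0$ is already available, propagates each path from its own previous state, uses a one-day Runge-Kutta step for $f(\cdot)$, and obtains Dirichlet draws by normalizing Gamma draws. All names and numeric values are hypothetical.

```python
import random


def sir_rhs(theta, beta, gamma):
    """SIR derivatives for theta = (S, I, R) proportions."""
    s, i, _ = theta
    return (-beta * s * i, beta * s * i - gamma * i, gamma * i)


def f_rk4(theta, beta, gamma, h=1.0):
    """One-day mean transition f(theta, beta, gamma) via fourth-order Runge-Kutta."""

    def add(base, k, c):
        return tuple(b + c * x for b, x in zip(base, k))

    k1 = sir_rhs(theta, beta, gamma)
    k2 = sir_rhs(add(theta, k1, h / 2), beta, gamma)
    k3 = sir_rhs(add(theta, k2, h / 2), beta, gamma)
    k4 = sir_rhs(add(theta, k3, h), beta, gamma)
    return tuple(t + h / 6 * (a + 2 * b + 2 * c + d)
                 for t, a, b, c, d in zip(theta, k1, k2, k3, k4))


def draw_dirichlet(alpha, rng):
    """Dirichlet draw obtained by normalizing independent Gamma(alpha_k, 1) draws."""
    g = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(g)
    return tuple(x / total for x in g)


def forecast_path(theta_t0, tau, horizon, rng):
    """Simulate (theta_t, Y_t^I, Y_t^R) for t = t0+1, ..., t0+horizon for one draw of tau."""
    theta, path = theta_t0, []
    for _ in range(horizon):
        mean = f_rk4(theta, tau["beta"], tau["gamma"])
        # Step i: latent proportions from the Dirichlet transition.
        theta = draw_dirichlet([tau["kappa"] * m for m in mean], rng)
        # Step ii: observed proportions from the Beta observation model.
        y_i = rng.betavariate(tau["lam_I"] * theta[1], tau["lam_I"] * (1 - theta[1]))
        y_r = rng.betavariate(tau["lam_R"] * theta[2], tau["lam_R"] * (1 - theta[2]))
        path.append((theta, y_i, y_r))
    return path


# Hypothetical posterior draw and starting state, for illustration only.
rng = random.Random(2020)
tau = {"beta": 0.164, "gamma": 0.082, "kappa": 1e4, "lam_I": 1e4, "lam_R": 1e4}
path = forecast_path((0.90, 0.05, 0.05), tau, horizon=16, rng=rng)
```

Repeating `forecast_path` over all $M$ posterior draws would give the set of solution paths from which forecast summaries are computed.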

Modeling Intervention. We model the effect of interventions by assuming that the intervention will result in a decrease in the transmission from the S compartment to the I compartment. We do so by decreasing the effective rate of transition (or, equivalently, the chance of interaction between members of S and I), by introducing a time-varying transmission rate modifier $\pi(t) \in [0,1]$. This updates the flow between the three compartments (Appendix Figure A1-B) via a set of differential equations as follows:

$$\frac{d\theta_t^S}{dt} = -\beta\pi(t)\theta_t^S\theta_t^I,$$
$$\frac{d\theta_t^I}{dt} = \beta\pi(t)\theta_t^S\theta_t^I - \gamma\theta_t^I,$$
$$\frac{d\theta_t^R}{dt} = \gamma\theta_t^I.$$

The reproductivity is thus modified by the intervention over time as $R_0\pi(t)$. To better understand the introduction of this effect modifier, we follow an example given by L. Wang et al. (2020). Suppose at a time $t$, $q^S(t) \in [0,1]$ is the chance of an at-risk person being in home isolation, and $q^I(t) \in [0,1]$ is the chance of an infected person being in hospital quarantine. Consequently, the chance of disease transmission when an at-risk person meets an infected person is $\beta\{1 - q^S(t)\}\theta_t^S\{1 - q^I(t)\}\theta_t^I := \beta\pi(t)\theta_t^S\theta_t^I$, with $\pi(t) := \{1 - q^S(t)\}\{1 - q^I(t)\} \in [0,1]$. In effect, this $\pi(t)$ modifies the chance of a susceptible person meeting with an infected person, which is termed as a transmission modifier. In this article, the functional form of $\pi(t)$ is a continuous function that reflects a combination of steadily increased community-level awareness and responsible quarantine and preventive measures, and the country-wide lockdown measures initiated by the government. This predefined transmission modifier can be smoothly incorporated into the differential equations as well as the MCMC algorithms. Its functional form can be quite flexible in reflecting the changing pattern of human intervention that affects the transmission rate of the epidemic within the population.
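To make the role of the modifier concrete, the following Python sketch computes $\pi(t)$ from the two isolation chances and plugs it into the modified differential equations. The function names and the numeric values of $q^S(t)$, $q^I(t)$, $\beta$, and $\gamma$ are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def transmission_modifier(q_s, q_i):
    """pi(t) = {1 - q^S(t)} * {1 - q^I(t)}, with both chances in [0, 1]."""
    if not (0.0 <= q_s <= 1.0 and 0.0 <= q_i <= 1.0):
        raise ValueError("q_s and q_i must lie in [0, 1]")
    return (1.0 - q_s) * (1.0 - q_i)


def sir_rhs_with_intervention(theta, beta, gamma, pi_t):
    """SIR derivatives with the transmission rate scaled by pi(t)."""
    s, i, _ = theta
    new_infections = beta * pi_t * s * i
    return (-new_infections, new_infections - gamma * i, gamma * i)


# Hypothetical example: half of at-risk people isolate at home and
# 20% of infected people are in hospital quarantine.
pi_t = transmission_modifier(q_s=0.5, q_i=0.2)   # 0.5 * 0.8 = 0.4
modified_r0 = 2.0 * pi_t                         # R_0 * pi(t) = 0.8
derivs = sir_rhs_with_intervention((0.90, 0.05, 0.05), 0.164, 0.082, pi_t)
```

With $\pi(t) = 1$ the modified system reduces to the standard SIR equations, and with $\pi(t) = 0$ no new infections occur.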

Implementation of the eSIR Model. We implemented the proposed algorithm using the R package rjags (Plummer et al.), and the differential equations were solved via a fourth-order Runge–Kutta approximation. To ensure quality of the MCMC, we set the adaptation number to be $10^4$, thinned the chain by keeping one draw from every 10 random draws to reduce autocorrelation, set a burn-in period of $10^5$ draws to let the chain stabilize, and started from 4 separate chains. Thus, in total, we have $2 \times 10^5$ effective draws with about $2 \times 10^6$ draws discarded. One could reduce the computation time, but doing so might compromise the quality of the MCMC output. This implementation provides not only posterior estimation on parameters and prevalence of all three compartments in the SIR model, but also predicted proportions of the infected and the removed cases at future time points. To obtain predicted case counts from the predicted prevalence, we used 1.34 billion as the population of India, thus treating the country as a homogeneous system for the outbreak (World Bank, 2017).

Uncertainty Quantification. One major advantage of a Bayesian implementation is that uncertainty associated with all parameters and functions of parameters can be calculated from exact posterior draws without relying on large-sample approximation or the delta method. The credible intervals (CrI) for the prevalence are computed using the posterior distribution of proportions given the observed confirmed and removed prevalence, that is, $Y_{(t_0+1):T}^I \mid Y_{1:t_0}^I, Y_{1:t_0}^R$ and $Y_{(t_0+1):T}^R \mid Y_{1:t_0}^I, Y_{1:t_0}^R$, where $t_0$ denotes the last observed date, and $T$ denotes the last forecast date. More specifically, suppose we want to compute the 95% posterior CrI for the observed proportion of confirmed cases on the first day of forecast, that is, a CrI for the random variable $Y_{t_0+1}^I$. Then, from the $M$ solution paths of the posterior, we have the draws $\{Y_{t_0+1}^{I\,(m)},\ 1 \leq m \leq M\}$. We construct a 95% posterior CrI for $Y_{t_0+1}^I$ by simply computing the 2.5th and 97.5th percentiles from this set of $M$ draws. The cumulative prevalences are sums of the draws from the I and R compartments at a given time and thus the credible interval for the sum can be calculated in a similar way. Case counts can be obtained from prevalences by using population size. Similar techniques apply to $\theta_{t_0+j}^I$ for any $1 \leq j \leq T - t_0$ and transmission parameters like $\beta$ and $\gamma$. For instance, a 95% posterior CrI for $\beta$ can be constructed by calculating the 2.5th and 97.5th percentiles of $\{\beta^{(m)},\ 1 \leq m \leq M\}$. Therefore, we could simply define $R^{(m)} = \beta^{(m)}/\gamma^{(m)}$ for all $1 \leq m \leq M$, and compute the 95% posterior CrI for the effective reproduction number $R$ from $\{R^{(m)},\ 1 \leq m \leq M\}$.
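The percentile construction described above can be sketched in a few lines of Python. The helper names are hypothetical, and the linear interpolation between order statistics is one common convention among several; with a large number of draws the choice makes little practical difference.

```python
def percentile(draws, p):
    """Empirical quantile at probability p, with linear interpolation between order statistics."""
    xs = sorted(draws)
    pos = p * (len(xs) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (pos - lo) * (xs[hi] - xs[lo])


def credible_interval(draws, level=0.95):
    """Equal-tailed posterior credible interval from a set of draws."""
    tail = (1 - level) / 2
    return percentile(draws, tail), percentile(draws, 1 - tail)


def reproduction_number_draws(beta_draws, gamma_draws):
    """R^(m) = beta^(m) / gamma^(m), computed draw by draw."""
    return [b / g for b, g in zip(beta_draws, gamma_draws)]


# Toy draws (not model output), used only to show the mechanics.
lower, upper = credible_interval(range(1, 101))
r_draws = reproduction_number_draws([0.16, 0.18, 0.20], [0.08, 0.09, 0.08])
```

Because the ratio is formed within each draw before the percentiles are taken, the interval for $R$ reflects the joint posterior of $\beta$ and $\gamma$ rather than their marginal intervals.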

2.3. Parameter Choices for Short-Term Forecasts

We made projections of the cumulative number of cases over a time horizon to assess the short-term impact of lockdown as well as the long-term impact of lockdown and post-lockdown activities. For the short-term forecast on April 30, we assumed lockdown is implemented until April 14 with either a 1- or a 2-week delay in people’s adherence/compliance to lockdown restrictions. We compared these projections with two hypothetical scenarios: (A) no nonpharmaceutical intervention (i.e., a constant disease transmission rate over time since the first case was reported in India), and (B) a moderate intervention with social distancing and travel bans only (i.e., a decreased transmission rate compared to no intervention). The prior mean for $R_0$ (the expected number of cases generated by one infected person assuming that the whole population is susceptible) was set at 2.0. This was estimated based on the early phase data in India and is consistent with other models (S. Das, 2020; Deb & Majumdar, 2020; Ranjan, 2020; Sardar et al., 2020; Singh & Adhikari, 2020). For the no intervention and the moderate intervention scenarios, we chose the transmission rate and the removal rate such that the means for the prior distribution of the basic reproductive number $R_0$ are 2.0 and 1.5, respectively (SD = 1). The change in $R_0$ from 2.0 to 1.5 as an effect of intervention was created based on what we saw regarding the effect of interventions and the relative reduction of $R_0$ in Wuhan (C. Wang et al., 2020). Given the similar population size and comparable population densities in China and India, the assumption on similar effect of interventions on the pandemic across the two countries does not seem too restrictive.

For the current scenario of lockdown, our chosen mean for the prior of $R_0$ starts with 2.0 during the period of no intervention, drops to 75% of its original value or 1.5 during the period of moderate intervention, and further drops to 0.8 during the 21-day lockdown period, and moves back up to 1.5 after the lockdown ends as described in Figure 3 (assuming a gradual, moderate resumption of daily activities). The drop in $R_0$ from 2.0 to 0.8 during lockdown represents a 60% reduction, which is proportionally slightly less than the ~65% drop estimated in $R_0$ from the COVID-19 outbreak in Wuhan following the introduction of cordon sanitaire, or restriction of movement of people (Lin, 2020; Pan et al., 2020).

Figure 3. Implied $R_0$ schedules corresponding to the hypothetical scenarios under quick adherence. Corresponding plot for slow adherence is in Appendix Figure A2.

2.4. Parameter Choices for Long-Term Forecasts

For the longer term forecast until June 15, we considered three hypothetical post-lockdown scenarios: (i) people return to normal activities due to the urgent desire for reconnecting after lockdown, (ii) people return to moderate activities as they did during the period with social distancing and travel ban intervention, and (iii) people make a cautious return out of fear for the coronavirus and partake in subdued activities. For these three scenarios, we assume the prior mean on $R_0$ moves back up from 0.8 to 2.0, 1.5 and 1.2, respectively, 3 weeks after lockdown ends on April 14. We compared these post-lockdown scenarios with another hypothetical scenario involving perpetual social distancing and travel ban without any lockdown (we fixed the prior mean on $R_0$ at 1.5 over the entire forecasting interval). The time-dependent changes to $R_0$ values across our simulation scenarios are depicted in Figure 3.

2.5. Parameter Choices for Duration of Lockdown Analysis

To assess the long-term impact of lockdown duration, we considered four scenarios: 21-, 28-, 42-, and 56-day lockdown periods. In all scenarios, we assume the prior mean on $R_0$ remains at 0.8 for the duration of the lockdown. Post-lockdown, the prior mean on $R_0$ gradually returns to a value of 1.5 over a span of 3 weeks (analogous to the ‘moderate return’ scenario). The changes to $R_0$ values across our simulation scenarios for studying length of lockdown are depicted in Appendix Figure A7.

2.6. Open-Source Software

We are committed to data transparency and reproducible research. Daily updates of our India projections, based on cases, recovered, and deaths reported the day before by, a crowd-sourced database using state bulletins and official handles, can be found in our interactive and dynamic Shiny app ( Apart from the scenarios described in this article, anyone can create predictions under other hypothetical scenarios with quantification of uncertainties. Open-source code underlying this app is available at

3. Results

As we interpret the results from our model, let us use caution in not overinterpreting the numbers. Any statistical model is wrinkled with many assumptions. Similarly, the predictions themselves have large uncertainty (as reflected by the large upper-credible limits). A rigorous quantitative treatment often allows us to analyze a problem with clarity and objectivity, but we recommend focusing more on the qualitative takeaway messages from this exercise rather than concentrating on the exact numerical projections or quoting them with certainty.

3.1. Short-Term Forecast of Cumulative Case Counts in India

Under national lockdown (March 25–April 14), our predicted cumulative numbers of COVID-19 cases in India on April 30 are 19,625 and 19,503 (upper 95% CrI 130,326 and 129,422) assuming a 1- or 2-week delay (i.e., either quick or slow adherence), respectively, in people’s adherence to lockdown restrictions and a gradual, moderate resumption of daily activities post-lockdown (Figure 4, Appendix Figure A3). In comparison, the predicted cumulative numbers of cases under “no intervention” and the “intervention involving social distancing and travel bans without lockdown” are 222,000 and 53,000 (upper 95% CrI of nearly 1.4 million and 0.3 million), respectively. Under quick adherence, these figures correspond to relative reductions of 91% and 63% in cases due to lockdown with moderate return compared to “no intervention” and “social distancing and travel ban,” respectively. The relative reduction in cases between two scenarios (typically from the less intense to the more intense intervention) is calculated as the difference between the estimates on a particular day (e.g., April 30) under the two scenarios, divided by the estimate under the less intense intervention. For example, the estimate under the social distancing and travel ban scenario minus the estimate under the lockdown with moderate return scenario, divided by the estimate under the social distancing and travel ban scenario.
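The relative reduction defined above is simple arithmetic, and a short sketch makes the calculation explicit. The function name `relative_reduction` is an illustrative choice; the point estimates are the April 30 figures quoted in the paragraph above (quick adherence).

```python
def relative_reduction(weaker_estimate, stronger_estimate):
    """Share of cases avoided under the stronger intervention.

    Computed as (weaker - stronger) / weaker, where both arguments are
    projected cumulative case counts for the same day.
    """
    if weaker_estimate <= 0:
        raise ValueError("the estimate under the weaker intervention must be positive")
    return (weaker_estimate - stronger_estimate) / weaker_estimate


# April 30 point estimates quoted in the text (1-week delay in adherence).
lockdown_moderate_return = 19_625
no_intervention = 222_000
distancing_and_travel_ban = 53_000

print(f"{relative_reduction(no_intervention, lockdown_moderate_return):.0%}")            # 91%
print(f"{relative_reduction(distancing_and_travel_ban, lockdown_moderate_return):.0%}")  # 63%
```

The calculation uses point estimates only, so it does not carry over the wide credible intervals attached to each projection.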

Figure 4. Short-term daily growth in cumulative case counts in India assuming a 1-week delay in people’s adherence to restrictions. Observed data are shown for days up to April 14. Predicted future case counts for April 15 until April 30 are based on observed data until April 14 using the eSIR (extended susceptible-infected-removed) model. The dashed horizontal line represents the upper 95% credible limit for estimates under “lockdown with moderate release” scenario. Corresponding graph following a 2-week delay schedule can be found in Appendix Figure A3.

We are reporting only the upper credible limit here and elsewhere since the lower credible limits are very small and uninformative due to the large uncertainty in our predictions arising from many unknowns. We also believe that our point estimates are at best underestimates due to potential surveillance bias (underreporting and/or misdiagnosis of case counts) and our model not taking into account the population density, age-sex composition, and regional contact network structure of the whole nation. Increases in testing and community transmission may lead to a single-day spike in reported cases, which may shift the projections substantially upward. Regardless of the exact numbers, it is clear that the 21-day lockdown will likely have a strong relative effect on reducing the predicted number of cases in the short term when compared to weaker interventions.

3.2. Long-Term Impact of Lockdown on the Outbreak in India

We took a close look at what might be coming in the next 2 months, based on what we have seen in other countries and an epidemiologic model that has been gainfully employed to assess the effect of interventions in Hubei province (L. Wang et al., 2020). We estimate that roughly 388,000 (upper 95% CrI 2.4 million), 7.5 million (upper 95% CrI 104 million), and 18.5 million (upper 95% CrI 196 million) cases are prevented on May 15, June 15, and July 15, respectively, by instituting a 21-day lockdown with quick adherence and a cautious return compared to perpetual social distancing and travel ban (without lockdown) (Figure 5). This corresponds to relative reductions in cases of 93%, 96%, and 91%, respectively, compared to perpetual social distancing.

Without some measures of suppression after lockdown is lifted, the impact of lockdown in bringing down the case counts (the now ubiquitous term, ‘flattening the curve’) can be negated by as early as the first week of June. In fact, in Figure 5a, the preintervention (‘normal’) curve first passes the social distancing and travel ban curve on June 5. In particular, if people immediately go back to preintervention (‘normal’) activities post-lockdown, a surge in the predicted case counts is expected in the long term beyond what we would have seen if there were only social distancing and travel ban measures without any lockdown (27 million when post-lockdown activity returns to preintervention levels versus 26 million under social distancing and travel ban without a lockdown period on July 31; Figure 5). Longer lockdowns would delay this crossover, but a normal (preintervention) return post-lockdown would surpass social distancing and travel ban (if these scenarios continued perpetually). Long-term forecasting under slow adherence (2-week delay) can be seen in Appendix Figure A4.

Figure 5. Long-term daily growth in case counts in India per 100,000 people assuming a 1-week delay and how that is affected by different non-pharmaceutical intervention strategies. Predicted cumulative (a) and incident (b) case counts from May 1 to July 31 from the eSIR (extended susceptible-infected-removed) model are shown, based on observed data until April 14. Corresponding plots for slow adherence are in Appendix Figure A4.

We present posterior density and trace plots for the underlying model parameters $\beta$, $\gamma$, and $R_0$, posterior distributions for the predictions $Y$ and the latent proportions $\theta$ for the I and R compartments over time, and estimates and posterior distribution of the daily prevalence of active cases over time or $\frac{d\theta_t^I}{dt}$. These are contained in Appendix Figure A5.

3.3. Relative Impact of Duration of Lockdown on Predicted Case Counts

We took the quick adherence epidemiologic models and compared the 21-day lockdown with hypothetical 28-, 42-, and 56-day lockdown scenarios (Figure 6). When comparing a 21-day lockdown with a hypothetical lockdown of longer duration, we find that 28-, 42-, and 56-day lockdowns can approximately prevent 733,000 (upper 95% CrI 6.8 million), 1.4 million (upper 95% CrI 9.8 million), and 1.6 million (upper 95% CrI 10.3 million) cases by June 15, respectively. These numbers correspond to a relative reduction in cases of 45%, 86% and 96%, respectively. A 28-day lockdown does not appear to have a substantial impact on cumulative case counts when compared to a 21-day lockdown. From an epidemiologic perspective, there appears to be some evidence that suggests a 42- or 56-day lockdown would have a more meaningful impact on reducing cumulative COVID-19 case counts in India. Our models suggest that some form of post-lockdown suppression (e.g., extension of social distancing measures, limits of gathering size, etc.) is necessary to observe long-term benefits of the lockdown period. We note that longer lockdown periods are also accompanied by increasing costs to individuals, such as economic costs, mental health issues, and other public health exacerbation costs and must be considered in policymaking.

Figure 6. Cumulative (a) and incidence (b) graphs for forecasting models assuming a 1-week delay under 21-, 28-, 42-, and 56-day lockdown scenarios using observed data through April 14. Corresponding plots for slow adherence are in Appendix Figure A6.

Lockdown duration study under the slow adherence (2-week delay) scenario can be found in Appendix Figure A6. The implied $R_0$ plots can be found in Appendix Figure A7.

3.4. Sensitivity Analyses

We explored some alternative assumptions and conducted a thorough sensitivity analysis before settling on the models presented above. In one example, we assumed that there are actually 10 times the number of reported cases to date to reflect potential underreporting of cases due to lack of testing. We note that our predictions of case counts indeed go up and the effect of underreporting of cases is more palpable with long-term projections (see Table 2, underreporting). In another scenario, we assumed these cases occurred in metropolitan areas to reflect a potential intensification of case clustering. In our primary analyses, we assumed that the cumulative case counts across the country represent equal contributions from all the regions, and used the whole population of India as a scaling factor to compute initial inputs for $Y_{1:t_0}^{I}$ and $Y_{1:t_0}^{R}$. This may lead to extremely small proportions, which may in turn yield underestimated outputs from the eSIR model. Changing the total population size from that of India to that of representative (large) cities from the hub states (the states of Kerala, Maharashtra, and Karnataka, and the national capital region of Delhi) is a simple but intuitive way to potentially do away with the aforementioned underestimation. We note that this substantially reduces the width of the credible intervals (see Table 2, case-clustering). In a third scenario, we set the prior mean of $R_0$ at 3.0 or 4.0 instead of 2.0 (i.e., a single infected individual would infect 3 or 4 susceptible individuals, on average, instead of 2). In most of our analyses (Table 2), the posterior mean for $R_0$ was between 1.8 and 2.4, irrespective of whether a higher or lower starting (prior) mean was used. We observe that a prior mean of 4.0 for $R_0$ sways the posterior $R_0$ estimate substantially (posterior mean 3.38). As more data accumulate, we expect the effect of the prior on the posterior estimates to diminish. The posterior distributions of the prevalence in each compartment and latent proportions under these changing scenarios are available in Appendix Figure A8.
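To illustrate how the choice of scaling population and the underreporting multiplier change the model inputs, here is a minimal sketch. The two population sizes and the factor of 10 are taken from the text and the note to Table 2; the function name `to_proportions` and the cumulative counts in `toy_counts` are hypothetical and are not the article's data.

```python
NATIONAL_POPULATION = 1_340_000_000  # national population (note to Table 2)
METRO_POPULATION = 32_000_000        # metro-hotspot population (note to Table 2)
UNDERREPORTING_FACTOR = 10           # "10 times the number of reported cases" (text)


def to_proportions(cumulative_counts, population, inflation=1):
    """Convert cumulative counts into population proportions for the model input."""
    return [inflation * count / population for count in cumulative_counts]


# HYPOTHETICAL cumulative counts, used only to show the scaling.
toy_counts = [500, 1_000, 2_000]

primary = to_proportions(toy_counts, NATIONAL_POPULATION)
underreporting = to_proportions(toy_counts, NATIONAL_POPULATION, UNDERREPORTING_FACTOR)
case_clustering = to_proportions(toy_counts, METRO_POPULATION)

for label, values in [
    ("primary", primary),
    ("underreporting", underreporting),
    ("case-clustering", case_clustering),
]:
    print(label, [f"{value:.2e}" for value in values])
```

Both sensitivity scenarios enlarge the input proportions by a constant factor: 10 under the underreporting scenario and 1.34 billion / 32 million, or about 42, under the case-clustering scenario.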

Table 2. Comparison of Estimated Projections and Posterior Estimates of Model Parameters Across Different Sensitivity Analysis Scenarios Under 21-Day Lockdown With Moderate Return, Using Observed Data Until April 14

| Sensitivity Analysis | May 1 | May 15 | Posterior Estimates |
|---|---|---|---|
| Underreporting* | | | [1.05, 4.20]; [0.05, 0.39]; [0.03, 0.19] |
| Case-clustering** | | | [1.47, 4.70]; [0.07, 0.26]; [0.03, 0.10] |
| Prior mean for $R_0 = 2$ | | | [0.87, 3.26]; [0.06, 0.59]; [0.04, 0.35] |
| Prior mean for $R_0 = 3$ | | | [1.41, 4.07]; [0.09, 0.60]; [0.04, 0.30] |
| Prior mean for $R_0 = 4$ | | | [2.09, 5.27]; [0.10, 0.63]; [0.03, 0.23] |

Note. Prior SD for $R_0$ is 1.0.
* Observed case counts are multiplied by 10; prior mean for $R_0 = 2$.
** Assumes that the cases occur in metro hotspots; uses population size N = 32 million instead of the national population of 1.34 billion; prior mean for $R_0 = 2$.

In summary, these sensitivity analysis scenarios did not appreciably change our conclusions in broad qualitative terms, though the exact quantitative projections for case counts are quite sensitive to such choices. We note that the estimate of basic reproduction number $R_0$ is more robust to underreporting issues because counts in all compartments of our eSIR model are assumed to be underreported. Since underreported case counts affect all our hypothetical intervention scenarios in a similar way, the relative comparison of interventions and the associated conclusion remain valid in a qualitative sense. In all cases, our confidence in these projections decreases markedly the farther into the future we try to forecast. It is extremely important to update these models as new data arise.
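A heuristic sketch, not the authors' derivation, suggests why $R_0$ is comparatively insensitive to a uniform underreporting factor. It assumes the standard SIR equations in proportions (ignoring the transmission modifier and the observation model of the eSIR), an early-outbreak regime in which $\theta_t^S \approx 1$, and a constant factor $k$ by which both the infected and removed proportions are scaled. The symbol $k$ is introduced here for illustration only.

```latex
\begin{aligned}
\frac{d\theta_t^I}{dt} &= \beta\,\theta_t^S\,\theta_t^I - \gamma\,\theta_t^I
  \;\approx\; (\beta - \gamma)\,\theta_t^I
  \qquad (\theta_t^S \approx 1), \\
\frac{d\theta_t^R}{dt} &= \gamma\,\theta_t^I, \\
\frac{d\,(k\theta_t^I)}{dt} &\approx (\beta - \gamma)\,(k\theta_t^I),
  \qquad
\frac{d\,(k\theta_t^R)}{dt} = \gamma\,(k\theta_t^I), \\
R_0 &= \frac{\beta}{\gamma}.
\end{aligned}
```

Because the approximate equations are linear in $(\theta_t^I, \theta_t^R)$, the scaled trajectories $(k\theta_t^I, k\theta_t^R)$ satisfy them with the same $\beta$ and $\gamma$, so $R_0 = \beta/\gamma$ is unchanged to first order. The projected counts themselves still scale with $k$, which is consistent with the sensitivity of the projections noted above.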

3.5. Model Calibration

To check the calibration of our model, we truncated the data at several dates and assessed the quality of the predictions of active cases at a future date as each additional week of data was added (Table 3 and Appendix Figure A9). The projected case counts change substantially, and move closer to the observed counts, as more data are included. Our projections always underestimate the observed counts, partly because more testing is being done each week. However, the observed number of infected cases is always within the 95% credible interval provided by our model. This again reveals the large uncertainty in our predictions.
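The calibration check described above can be sketched as two small steps: truncate the observed series at a cutoff date, then compare the resulting projection with the count observed later. This is an illustrative sketch only. The cutoff dates are those listed in Table 3; the function names `truncate` and `check_projection`, and every number in the toy example, are hypothetical and are not the article's data. Only the upper credible limit is checked, mirroring the article's reporting.

```python
from datetime import date

CUTOFFS = [date(2020, 4, 1), date(2020, 4, 7), date(2020, 4, 14)]  # from Table 3


def truncate(series, cutoff):
    """Keep only the observations dated on or before `cutoff`."""
    return {day: count for day, count in series.items() if day <= cutoff}


def check_projection(projected, upper_limit, observed):
    """Compare one projection with the count that was later observed."""
    return {
        "underestimates": projected < observed,
        "within_upper_limit": observed <= upper_limit,
        "relative_error": (projected - observed) / observed,
    }


# HYPOTHETICAL toy numbers for illustration only; not the article's data.
toy_series = {
    date(2020, 3, 31): 100,
    date(2020, 4, 6): 250,
    date(2020, 4, 13): 600,
}

for cutoff in CUTOFFS:
    available = truncate(toy_series, cutoff)
    print(cutoff, len(available), "observations available for fitting")

print(check_projection(projected=8_000, upper_limit=50_000, observed=10_000))
```

In practice, the model would be refit on each truncated series, and `check_projection` would be applied to the resulting forecast for the target date.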

Table 3. Comparison of Model Projections Using Observed Data up to Different Dates Assuming a 21-Day Lockdown With Moderate Return

| | Projected Counts [Upper Credible Interval]: April 15 | Projected Counts [Upper Credible Interval]: May 1 | Posterior Estimates [95% CrI] |
|---|---|---|---|
| Used data up to April 1 | | | [0.84, 3.47]; [0.05, 0.70]; [0.03, 0.40] |
| Used data up to April 7 | | | [0.80, 3.22]; [0.05, 0.52]; [0.03, 0.32] |
| Used data up to April 14 | | | [0.87, 3.26]; [0.04, 0.35]; [0.04, 0.35] |

Note. All prediction scenarios assume a prior mean of $R_0 = 2$.

4. Discussion

Our projections using current daily data on case counts until April 14 in India show that the lockdown, if implemented correctly, has a high chance of reducing the number of COVID-19 cases in the short term and buying India invaluable time to prepare its health care and disease-monitoring system. In the long term, we need to have some measures of suppression in place after the lockdown is lifted to prevent a massive surge in the number of cases that can quickly overwhelm an already overstretched Indian health care system, resulting in increased fatalities. Specific vulnerable populations will be at higher risk of severity and fatality from COVID-19 infection: older persons and persons with preexisting medical conditions (e.g., high blood pressure, heart disease, lung disease, cancer, diabetes, or compromised immunity) (Centers for Disease Control and Prevention [CDC], 2020; WHO, 2020a). Appendix Table A1 provides a description of the approximate number of individuals in these high-risk categories in India. Beyond the fragile population characterized by health and economic indicators, we must remember that health care workers and first responders at the frontline of this pandemic are among the most vulnerable (C. Wang et al., 2020). Though we have focused on forecasting and modeling of COVID-19 case counts in this article, we recognize that this is only one component of the problem. Long-term surveillance and management of the COVID-19 crisis are needed, with attention not just to public health but also to the economic, social, and psychological impact that the crisis will have on the people of India.

4.1. Limitations

Our statistical modeling and forecasts are not without limitations. We have a limited number of data points and a wide time window over which to extrapolate for the long-term forecasts. The uncertainty in our predictions is largely due to many unknowns arising from model assumptions, population demographics, the number of COVID-19 diagnostic tests administered per day, testing criteria, accuracy of the test results, and heterogeneity in implementation of different government-initiated interventions and community-level protective measures across the country. We have neither accounted for age structure, contact patterns, or spatial information to finesse our predictions (Klein et al., 2020; Mandal et al., 2020; Singh & Adhikari, 2020) nor considered the possibility of a latent number of true cases, only a fraction of which are ascertained and observed (C. Wang et al., 2020). Increases in the frequency and scale of testing, and in community transmission of the SARS-CoV-2 virus, may lead to a single-day spike that can shift the projection curve substantially upward. COVID-19 hotspots in India are not uniformly spread across the country, and state-level forecasts (S. Das, 2020) may be more meaningful for state-level policymaking. By treating India as a homogeneous unit, we assume that the implementation and effects of public health interventions and policies are the same everywhere in India.

The eSIR model treats the entire group of people within a single compartment as homogeneous and exchangeable. We also assume that all subjects who were not infected are susceptible. Certainly, this overlooks the possibilities of people moving between states and different subsets of infected and susceptible populations having greater or lesser likelihood of coming into contact with one another. To account for all such potential factors, we need more-nuanced modeling. One potential way is to break up the interaction component into further subcompartments; however, sparsity of current data in each subcompartment is an issue. Another way that has been pursued in a recent study is to inform the SIR modeling via introducing contact networks at the initial stage (Bhattacharyya et al., 2020). However, it is worth noting that any such approach would need more-granular and reliable data containing individual details of the confirmed cases, including their location and travel history. Even though such data are available from some self-reporting–based repositories (Kaggle, 2020), the quality and detail of the information provided are quite heterogeneous and, thus, how to best utilize such data remains a question. We see the tremendous role of data and its transparency of collection and reporting in finessing our predictions. Accurate and consistent reporting of case counts and deaths due to COVID-19 is extremely critical. Future opportunities for improving our model include incorporating contagion networks, age structure, and test imperfection, and estimating an SEIR model and true fatality/death rates. Future research would benefit from easily accessible hospitalization data, accurate recording of deaths in death records, and availability of ecological-level covariate data. Regardless of the caveats in our study, our analyses show the impact and necessity of lockdown and of suppressed activity post-lockdown in India. Though the exact numerical projections are perhaps far from the truth, the qualitative inference on the relative effects of the interventions is still valuable and valid since all projections are subject to similar biases.

One ideological limitation of considering only the perspective of controlling COVID-19 transmission in our model is the inability to account for excess deaths due to other causes during this period (chronic disease and mental health–related diseases in particular), or the flexibility to factor in reduction in mortality/morbidity from some other infectious or flu-like illnesses, traffic accidents, or the health benefits of reduced air pollution levels. A more expansive framework of a cost–benefit analysis is needed as we gather more data and build an integrated landscape of changes in population-attributable risks due to various disease categories.

4.2. Testing

A reviewer of this article suggested giving a sense of the testing data from India and how that may affect our conclusions. India’s priorities in testing have changed multiple times over the past few weeks. On March 17, India proposed testing all people who recently traveled internationally and developed symptoms (fever, cough, difficulty in breathing, etc.) of COVID-19 within 14 days of return, all symptomatic contacts of laboratory-confirmed positive cases, or all symptomatic health care workers managing respiratory distress. On March 20, India revised this testing strategy to include all symptomatic health care workers, all hospitalized patients with severe acute respiratory illness (fever and cough and/or shortness of breath), and asymptomatic direct and high-risk contacts of a confirmed case to be tested once between day 5 and day 14 of coming into contact. The testing strategy was again revised on April 9 to include testing of all symptomatic people in hotspots/clusters and in large migration gatherings or evacuee centers. These testing strategies were obtained from the Indian Council of Medical Research (2020).

We looked a little deeper into the issue of testing bias using data from Our World in Data (Hasell et al., n.d.). Appendix Figure A10-A shows that, even though the number of tests in India has increased in recent days, the proportion of daily discovered positive cases still remains stable (at about 4%) and does not yet show an obvious increasing trend like in other countries such as the United States and United Kingdom. We also plot Iceland and South Korea on this figure as they have performed remarkably in administering a large number of tests per detected case and serve as examples for the world. We also looked at the proportion of the population tested in 61 countries around the world (Appendix Figure A10-B). While most advanced countries have tested around 1 to 3% of the population, India has tested roughly 0.06% of its population, and it will take weeks, if not months, for India to reach testing 1–3% of the population. In the absence of rapid and large-scale testing, informative proxies or surrogates can be tracked through syndromic surveillance, temperature reporting, and monitoring hospital admissions and medical claims due to respiratory and flu-like illnesses. This additional information will strengthen the prediction models. In the absence of these models, we rely on sensitivity analysis for underreporting as presented in Table 2.
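A back-of-the-envelope sketch shows why reaching 1–3% coverage would take weeks to months. The 0.06% share and the 1–3% targets come from the text, and the population of 1.34 billion is the figure quoted in the note to Table 2. The daily testing rate `HYPOTHETICAL_DAILY_TESTS` is an assumption chosen purely for illustration, as is the function name `days_to_reach`; neither reflects India's actual testing capacity.

```python
import math

POPULATION = 1_340_000_000   # national population quoted in the note to Table 2
TESTED_SHARE = 0.0006        # "roughly 0.06%" of the population tested (from the text)

# HYPOTHETICAL constant daily testing rate, for illustration only.
HYPOTHETICAL_DAILY_TESTS = 100_000


def days_to_reach(target_share, daily_tests):
    """Days needed to raise the tested share to `target_share` at a constant daily rate."""
    remaining_tests = (target_share - TESTED_SHARE) * POPULATION
    if remaining_tests <= 0:
        return 0
    return math.ceil(remaining_tests / daily_tests)


print(f"tests implied so far: {TESTED_SHARE * POPULATION:,.0f}")
for target in (0.01, 0.03):
    days = days_to_reach(target, HYPOTHETICAL_DAILY_TESTS)
    print(f"target {target:.0%}: about {days} days")
```

The sketch treats each test as covering a new person and holds the daily rate fixed, both of which are simplifications. A faster rate shortens the timeline proportionally.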

4.3. Our Data Science Product

Finally, in our strong commitment to reproducibility and dissemination of our research, we have made the code for our predictions available at GitHub and created an interactive and dynamic R Shiny app to visualize the observed data and create predictions under hypothetical scenarios with quantification of uncertainties. These forecasts are updated daily as new data come in. We hope these products will remain our contribution and service as data scientists during this tragic global catastrophe, and the model and methods will be used to analyze data from other countries.

5. Conclusion

Our epidemiologic and mathematical calculations make a convincing case for enforcing national lockdown in the largest democracy in the world, acting early, before the growth of COVID-19 infections in India starts to accelerate. We observe the public health benefit of extending the lockdown by 3 to 5 weeks in our projections. Measures of suppression are needed post-lockdown to acquire maximal long-term benefits from the lockdown. We also illustrate the critical role of epidemiologic forecasting in aiding policy decisions through this modeling exercise. We highlight the importance of conducting model checks, sensitivity analysis, and uncertainty quantification.

We realize that these draconian public health measures come at a tremendous price to social and economic health that can last for months or even years after the restrictions on social mobility are lifted. Our general message to the public is to proceed with prudence and caution and adhere to effective public health interventions until rapid and reliable home testing kits (there are none yet; Food and Drug Administration [FDA], 2020), FDA-approved drugs (WHO, 2020c), and a vaccine (Craven, 2020) become available.

Disclosure Statement

No conflicts of interest to declare. The research was supported by the University of Michigan Precision Health Initiative, University of Michigan School of Public Health, University of Michigan Institute for Health Care Policy and Innovation, Michigan Institute of Data Science, and the University of Michigan Rogel Cancer Center.


Acknowledgments

We the authors would like to thank the University of Michigan Advanced Research Computing Services, in particular Professor Ravi Pendse, for enabling daily updates to our models and allocating us abundant computational resources. We would also like to thank Professor Matthew Fox from the Boston University School of Public Health for his valuable comments on our R Shiny App. This work originated at a time when the authors in this study team came together to provide their service as quarantined data scientists. We thank our families and friends for their support during these unprecedented times. The initial report made a profound impact through media outlets; we remain grateful to journalists across the world who reported our findings. In the middle of a global pandemic, we found inspiration in the heroic battles by frontline health workers, we found hope in science, we found strength in the power of common people and we found solace in the magic of human kindness. We wish the government and the people of India the very best in their fight against the coronavirus. We are all in this together.


References

Beaubien, J. (2020, March 26). How South Korea reined in the outbreak without shutting everything down. NPR.

Bhattacharyya, R., Mohammed, S., Baladandayuthapani, V., Banerjee, S., & Nanda, U. (2020, April). Regional contact networks and the pandemic spread of COVID-19 in India. Medium.

Bryner, J. (2020, March 14). 1st known case of coronavirus traced back to November in China. Live Science.

Centers for Disease Control and Prevention. (2020). Coronavirus Disease 2019 (COVID-19): Are you at higher risk for severe illness?

Central Intelligence Agency. (2020). CIA World Factbook: Age Structure.

Chatterjee, K., Chatterjee, K., Kumar, A., & Shankar, S. (2020). Healthcare impact of COVID-19 epidemic in India: A stochastic mathematical model. Medical Journal Armed Forces India, 76(2), 147–155.

Chen, N., Zhou, M., Dong, X., Qu, J., Gong, F., Han, Y., Qiu, Y., Wang, J., Liu, Y., Wei, Y., Xia, J., Yu, T., Zhang, X., & Zhang, L. (2020). Epidemiological and clinical characteristics of 99 cases of 2019 novel coronavirus pneumonia in Wuhan, China: a descriptive study. The Lancet, 395(10223), 507–513.

covid19india. (2020). India COVID-19 tracker.

Craven, J. (2020, April 10). COVID-19 vaccine tracker. Regulatory Affairs Professionals Society.

Das, P. (2020, April 1). Epidemiologic models show we need aggressive measures in the early phase...lockdown buys us time. The Times of India.

Das, S. (2020). Prediction of COVID-19 disease progression in India: Under the effect of national lockdown. ArXiv.

Deb, S., & Majumdar, M. (2020). A time series method to analyze incidence pattern and estimate reproduction number of COVID-19. ArXiv.

Dhanwant, J. N., & Ramanathan, V. (2020). Forecasting COVID 19 growth in India using Susceptible-Infected-Recovered (S.I.R) model. ArXiv.

Dong, E., Du, H., & Gardner, L. (2020). An interactive web-based dashboard to track COVID-19 in real time. The Lancet Infectious Diseases. Advance online publication.

Ellis-Petersen, H. (2020, March 25). Overcome by anxiety: Indians in lockdown many can ill afford. The Guardian.

Food and Drug Administration. (2020, March 20). Coronavirus (COVID-19) update: FDA alerts customers about unauthorized fraudulent COVID-19 Tests.

Gan, N. (2020, March 24). China to lift lockdown on Wuhan, ground zero of coronavirus pandemic. CNN.

Ghosal, D. (2020, March 23). India faces spike in coronavirus cases, says study, in test for health system. Reuters.

Guan, W., Ni, Z., Hu, Y., Liang, W., Ou, C., He, J., Liu, L., Shan, H., Lei, C., Hui, D. S. C., Du, B., Li, L., Zeng, G., Yuen, K.-Y., Chen, R., Tang, C., Wang, T., Chen, P., Xiang, J., … Zhong, N. (2020). Clinical characteristics of coronavirus disease 2019 in China. New England Journal of Medicine, 382(18), 1708–1720.

Gupta, R., & Ram, C. V. S. (2019). Hypertension epidemiology in India: Emerging aspects. Current Opinion in Cardiology, 34(4), 331–341.

Gupta, S., & Shankar, R. (2020). Estimating the number of COVID-19 infections in Indian hot-spots using fatality data. ArXiv.

Hasell, J., Ortiz-Ospina, E., Mathieu, E., Ritchie, H., Beltekian, D., Macdonald, B., & Roser, M. (n.d.). Our World in Data COVID-19 testing dataset. Our World in Data. Retrieved April 14, 2020.

IMS Institute. (2013). Understanding Healthcare Access in India.

Indian Council of Medical Research. (2020). SARS-CoV-2 (COVID-19) Testing: Status Update 14 April 2020 9:00 PM IST.

International Diabetes Federation. (2020). IDF SEA members.

Johns Hopkins University Center for Systems Science and Engineering. (2020). JHU CSSE COVID-19 Github. Retrieved April 14, 2020.

Jones, J. (2020, March 28). Spanish PM announces stricter lockdown measures to tackle coronavirus. Reuters.

Kaggle. (2020). Novel Corona Virus 2019 dataset.

Karimi, N., & Krauss, J. (2020, March 15). Iran reports more than 100 new virus deaths as fears mount. Associated Press.

Klein, E., Lin, G., Tseng, K., Schueller, E., Kappor, G., & Laxminarayan, R. (2020). COVID-19 for India updates.

Lawler, D. (2020, March 23). Timeline: How Italy’s coronavirus crisis became the world’s deadliest. Axios.

Li, Q., Guan, X., Wu, P., Wang, X., Zhou, L., Tong, Y., Ren, R., Leung, K. S. M., Lau, E. H. Y., Wong, J. Y., Xing, X., Xiang, N., Wu, Y., Li, C., Chen, Q., Li, D., Liu, T., Zhao, J., Liu, M., … Feng, Z. (2020). Early transmission dynamics in Wuhan, China, of novel coronavirus–infected pneumonia. New England Journal of Medicine, 382(13), 1199–1207.

Lin, X. (2020). Analysis of 25,000 lab-confirmed COVID-19 cases in Wuhan: Epidemiological characteristics and non-pharmaceutical intervention effects.

Mandal, S., Bhatnagar, T., Arinaminpathy, N., Agarwal, A., Chowdhury, A., Murhekar, M., Gangakhedkar, R., & Sarkar, S. (2020). Prudent public health intervention strategies to control the coronavirus disease 2019 transmission in India: A mathematical model-based approach. Indian Journal of Medical Research, 151(2), 190–199.

Microsoft Corporation. (n.d.). Microsoft Bing COVID-19 tracker. Retrieved March 19, 2020.

Mkhatshwa, T., & Mummert, A. (2010). Modeling super-spreading events for infectious diseases: Case study SARS. ArXiv.

National Institute of Cancer Prevention and Research. (2020). Cancer Statistics.

Noronha, G. (2020, April 10). India could see a reduction in the number of coronavirus cases by next week: Study. The Economic Times.

Oke, J., & Heneghan, C. (2020, April). Global COVID-19 case fatality rates. CEBM Research.

Pan, A., Liu, L., Wang, C., Guo, H., Hao, X., Wang, Q., Huang, J., He, N., Yu, H., Lin, X., Wei, S., & Wu, T. (2020). Association of public health interventions with the epidemiology of the COVID-19 outbreak in Wuhan, China. JAMA, 323(19), 1915–1923.

Plummer, M., Stukalov, A., & Denwood, M. (n.d.). rjags: Bayesian Graphical Models using MCMC.

Prabhakaran, D., Singh, K., Roth, G. A., Banerjee, A., Pagidipati, N. J., & Huffman, M. D. (2018). Cardiovascular diseases in India compared with the United States. Journal of the American College of Cardiology, 72(1), 79–95.

Prinja, S., Bahuguna, P., Gupta, I., Chowdhury, S., & Trivedi, M. (2019). Role of insurance in determining utilization of healthcare and financial risk protection in India. PLoS ONE, 14(2).

Ranjan, R. (2020). Predictions for COVID-19 outbreak in India using epidemiological models. MedRxiv.

Ray, D., Bhattacharyya, R., Wang, L., Salvatore, M., Mohammed, S., Halder, A., Zhou, Y., Song, P., Purkayastha, S., Bose, D., Banerjee, M., Baladandayuthapani, V., Ghosh, P., & Mukherjee, B. (2020, March 21). Predictions and role of interventions for COVID-19 outbreak in India. Medium.

Root, A. (2020, April). Nearly a third of Americans with COVID-19 are hospitalized: Here are the latest numbers. Barron’s.

Ryu, S., & Chun, B. C. (2020). An interim review of the epidemiological characteristics of 2019 novel coronavirus. Epidemiology and Health, 42, Article e2020006.

Salvatore, M., Ray, D., Du, J., Wang, L., Das, S., Kleinsasser, M., Rix, A., Barker, D., & Mukherjee, B. (2020, April). Unlocking the 40-day national lockdown in India: There is no magic key. Medium.

Salvatore, M., Wang, L., Bhattacharyya, R., Purkayastha, S., Mohammed, S., Halder, A., Barker, D., Kleinsasser, M., Rix, A., Banerjee, M., Baladandayuthapani, V., Ray, D., & Mukherjee, B. (2020, April 3). Historic 21-day lockdown, predictions for lockdown effects and the role of data in this crisis of virus in India. Medium.

Salvi, S., Kumar, G. A., Dhaliwal, R. S., Paulson, K., Agrawal, A., Koul, P. A., ... & Christopher, D. J. (2018). The burden of chronic respiratory diseases and their heterogeneity across the states of India: The Global Burden of Disease Study 1990–2016. The Lancet Global Health, 6(12), e1363–e1374.

Sardar, T., Nadim, S. S., & Chattopadhyay, J. (2020). Assessment of 21 days lockdown effect in some states and overall India: A predictive mathematical study on COVID-19 outbreak. ArXiv.

Senapati, A., Rana, S., Das, T., & Chattopadhyay, J. (2020). Impact of intervention on the spread of COVID-19 in India: A model based study. ArXiv.

Sindhu, J., Reddy, K. K., Satyanarayana, N., Devaraya, S., & Fathima, A. (2019). Hospital utilization statistics: Thirty-five year trend analysis, a measure of operational efficiency of a tertiary care teaching institute in South India. IOSR Journal of Dental and Medical Sciences, 18(4), 49–55.

Singh, R., & Adhikari, R. (2020). Age-structured impact of social distancing on the COVID-19 epidemic in India. ArXiv.

Taylor, D. B. (2020, April 14). How the coronavirus pandemic unfolded: A timeline. The New York Times.

Times of India. (2020, March). Coronavirus: Does India have enough ventilators, hospital beds? The Times of India.

United Kingdom Department of Health & Social Care. (2020). Coronavirus action plan: A guide to what you can expect across the UK.

United Nations. (2019). United Nations World Population Prospects 2019 Data Query.

Wang, C., Liu, L., Hao, X., Guo, H., Wang, Q., Huang, J., He, N., Yu, H., Lin, X., Pan, A., Wei, S., & Wu, T. (2020). Evolving epidemiology and impact of non-pharmaceutical interventions on the outbreak of coronavirus disease 2019 in Wuhan, China. MedRxiv.

Wang, L., Zhou, Y., He, J., Zhu, B., Wang, F., Tang, L., Eisenberg, M., & Song, P. X. (2020). An epidemiological forecast model and software assessing interventions on COVID-19 epidemic in China. MedRxiv.

Wikipedia. (2020). COVID-19 pandemic in Germany.

World Bank. (n.d.). Data for India, United States, China.

World Bank. (2020). Hospital beds (per 1,000 people).

—. (2020). India (2017).

World Health Organization. (2020a). Q&A on coronaviruses (COVID-19).

World Health Organization. (2020b, March 11). WHO Director-General’s opening remarks at the media briefing on COVID-19—11 March 2020.

World Health Organization. (2020c, March 27). WHO Director-General’s opening remarks at the media briefing on COVID-19—27 March 2020.


Table A1. Proportion of Population in Specifically Vulnerable Subgroups Potentially at High Risk of Severe COVID-19 in India



(in millions)






Prinja et al. 2019

Population over 65


2020 (est.)

CIA World Factbook

Hypertension (men)*



Gupta & Ram 2019

Hypertension (women)*



Gupta & Ram 2019

People with cardiovascular disease*



Prabhakaran et al. 2018

Population with COPD*



Salvi et al. 2018

Population with asthma*



Salvi et al. 2018

Develop cancer by age 75 (men)**




Develop cancer by age 75 (women)**




Population with diabetes (adult)



Access to inpatient department facilities***



IMS Institute 2013

Access to outpatient department***



IMS Institute 2013

based on 2020 est. of 1.38 billion from UN Department of Economic and Social Affairs

* age-standardized; ** risk; *** defined as within 5-kilometer distance of home or work

Abbrev.: COPD, chronic obstructive pulmonary disease; IDF, International Diabetes Federation; NICPR, National Institute of Cancer Prevention and Research

Table A2. Intervention Landscape of Countries Severely Affected by COVID-19


Date of 1st case


Crude fatality rate

Active cases


China

Nov. 17, 2019*

Lockdown in Wuhan on 01/22, extended to neighboring cities in Hubei province on 01/23. Wuhan lockdown to be lifted on 04/08.¹



South Korea

Jan. 19, 2020**

Tested widely for the virus, isolated cases, and quarantined suspected cases. Figures indicate that this has helped suppress transmission of the virus. The country appears to have reined in the outbreak without some of the strict lockdown strategies deployed elsewhere in the world.²



United States

Jan. 20, 2020

On 01/31, restricted travel from China; expanded restrictions to other countries on 02/29. On 03/03, CDC lifted all restrictions on testing. On 03/15, CDC recommended no gatherings of 50 people or more. Stay-at-home directives issued at the state level.³




France

Jan. 24, 2020

On 03/17, France imposed a nationwide lockdown, prohibiting gatherings of any size and postponing the second round of its municipal elections. The lockdown was one of Europe’s most stringent. While residents were told to stay home, officials allowed people to go out for fresh air but warned that meeting a friend on the street or in a park would be punishable with a fine.³




Germany

Jan. 28, 2020

Germany has a National Pandemic Plan with three stages. In the first stage (containment), health authorities focus on identifying contact persons, who are put in personal quarantine, monitored, and tested. In the second stage (protection), the strategy changes to using measures to protect vulnerable persons from becoming infected. The final stage (mitigation) tries to avoid spikes of intensive treatment in order to maintain medical services.⁴



United Kingdom

Jan. 31, 2020

Citizens advised to stay at home in order to contain the spread of the disease. Violators to be fined, with the exception of a few special circumstances. The government is supporting and coordinating with research institutes to explore treatment and curative options.⁵




Italy

Jan. 31, 2020

Flights to China suspended and a national emergency declared on 01/31 after two cases confirmed in Rome. Schools and universities closed on 03/04. By 03/09, the entire nation placed under lockdown, with restaurants and bars closing on 03/11 and factories closing on 03/22. All non-essential production halted.⁶




Spain

Feb. 1, 2020

State of emergency declared on 03/14, allowing authorities to confine infected people and ration goods. Originally planned to last until 03/29, it has been extended to 04/12. Schools, bars, restaurants, and shops selling non-essential items have been shut since 03/14, and most of the population is housebound.⁷




Iran

Feb. 19, 2020

On 02/28, the Iranian authorities closed schools, canceled Friday prayers, and moved to restrict visitors from China. On 03/15, the official leading Iran's response to the new coronavirus acknowledged that the pandemic could overwhelm health facilities in the country, which is battling the worst outbreak in the Middle East.⁸



¶ Date of 1st case data obtained from JHU CSSE time series data on COVID-19 (except for China & South Korea)
† Microsoft Bing COVID-19 tracker (as of 1:00 PM EST, April 10, 2020)

Table A3. Comparison of Types of Infectious Disease Models, Specifically Those Used to Study COVID-19 in India

Model type: Exponential model
Reference: Gupta & Shankar (2020)
Research question: Provide estimate of the infected population using death counts
Advantages: Simple model helpful for scant data; modeled epidemic hotspots separately
Limitations: Does not account for population demographics (limited by data) or non-pharmaceutical (NP) intervention effects; requires infection fatality rate

Model type: Poisson log-linear model
Reference: Das (2020)
Research question: Short-term prediction of future case counts; estimate
Advantages: Simple model helpful for short- and medium-term forecasts using scant data; accounted for quadratic effect of time
Limitations: Does not account for population demographics (limited by data), hotspots, or NP intervention effects; surveillance bias

Model type: Autoregressive–moving-average model
Reference: Deb & Majumdar (2020)
Research question: Analyze the trend pattern of incidence; estimate
Advantages: Accounted for quadratic effect of time and lockdown effect; captured time-dependent incidence pattern
Limitations: Does not account for population demographics (limited by data) or hotspots; surveillance bias

Model type: Susceptible-infected-recovered (SIR) model
Reference: Ranjan (2020)
Research question: Long-term prediction of future case counts; estimate
Advantages: Classical epidemiologic model used; accounted for social distancing effects
Limitations: Does not account for population demographics (limited by data) or hotspots; surveillance bias; used first few weeks of data

Model type: SIR model
Reference: Dhanwant & Ramanathan (2020)
Research question: Long-term prediction of future case counts
Advantages: Classical epidemiologic model used; split observed data into training and test data; training data used to learn the transmission rate; incorporated lockdown effect
Limitations: Does not account for population demographics (limited by data) or hotspots; surveillance bias; lockdown training data not used to learn about transmission rate under lockdown

Model type: Age-structured SIR model
Reference: Singh & Adhikari (2020)
Research question: Study progress of the disease and impact of social distancing measures; estimate
Advantages: Extended epidemiologic model accounting for age distribution, social contact, and social distancing effect
Limitations: Does not account for other population demographics (limited by data) or hotspots; surveillance bias; complex model given the scant count data and spotty individual-level data

Model type: Susceptible-exposed-infectious-recovered (SEIR) model
Reference: Mandal et al. (2020)
Research question: Identify NP intervention strategies that can help control the outbreak
Advantages: Extended epidemiologic model with an added compartment for quarantine; accounted for other NP interventions and connectivity between two places
Limitations: Does not account for population demographics (limited by data); surveillance bias; complex model given the scant count data; lockdown effect not studied; studied four cities only

Model type: Expanded SEIR model
Reference: Chatterjee et al. (2020)
Research question: Assess the impact on healthcare resources; study the effect of different NP interventions
Advantages: Extended epidemiologic model with added subcompartments for quarantined, recovered, and death; accounted for different NP interventions; accounted for age groups
Limitations: Does not account for other population demographics (limited by data) or hotspots; surveillance bias; complex model given the scant count data and spotty individual-level data; hospitalization-related parameters based on UK data; lockdown effect not studied

Model type: Expanded SEIR model
Reference: Senapati et al. (2020)
Research question: Assess the effect of different NP interventions; estimate
Advantages: Extended epidemiologic model with added subcompartments for asymptomatic cases, quarantined, hospitalized, recovered, and death
Limitations: Does not account for other population demographics (limited by data) or hotspots; surveillance bias; complex model given the scant count data; lockdown effect not studied

Model type: Expanded SEIR model
Reference: Sardar et al. (2020)
Research question: Assess long-term effect of 21-day lockdown; estimate
Advantages: Extended epidemiologic model with added subcompartments for asymptomatic cases, lockdown, hospitalized, recovered, and death; accounted for transmission variability between symptomatic and asymptomatic groups; modeled hotspots and overall India
Limitations: Does not account for other population demographics (limited by data); surveillance bias; complex model given the scant count data

Figure A1. The SIR model with (A) or without (B) human intervention, where intervention is introduced through a transmission rate modifier $\pi(t)$.
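As a rough illustration of what Figure A1 depicts, the sketch below integrates a deterministic SIR system in population proportions, with the transmission rate scaled by a modifier $\pi(t)$. It is not the Bayesian eSIR implementation used in the article: the function names, the Euler integration scheme, the parameter values, and the lockdown schedule are all assumptions chosen only to show how $\pi(t) < 1$ dampens transmission, and setting $\pi(t) = 1$ recovers the standard SIR model.

```python
from typing import Callable, List, Tuple

State = Tuple[float, float, float]  # (susceptible, infected, removed) proportions


def simulate_sir(beta: float, gamma: float, pi: Callable[[float], float],
                 i0: float, days: int, steps_per_day: int = 10) -> List[State]:
    """Euler integration of a deterministic SIR model with a modifier pi(t).

    dS/dt = -beta * pi(t) * S * I
    dI/dt =  beta * pi(t) * S * I - gamma * I
    dR/dt =  gamma * I
    Returns one (S, I, R) tuple per day, starting at day 0.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    dt = 1.0 / steps_per_day
    path = [(s, i, r)]
    for day in range(days):
        for k in range(steps_per_day):
            t = day + k * dt
            new_infections = beta * pi(t) * s * i * dt
            new_removals = gamma * i * dt
            s -= new_infections
            i += new_infections - new_removals
            r += new_removals
        path.append((s, i, r))
    return path


def no_intervention(t: float) -> float:
    """pi(t) = 1 for all t: the standard SIR model."""
    return 1.0


def lockdown(t: float) -> float:
    """Hypothetical schedule: transmission cut to 40% for 21 days from day 10."""
    return 0.4 if 10 <= t < 31 else 1.0


# Illustrative values only (beta / gamma = 2); not estimates from the article.
baseline = simulate_sir(0.5, 0.25, no_intervention, i0=1e-6, days=120)
with_lockdown = simulate_sir(0.5, 0.25, lockdown, i0=1e-6, days=120)
```

Comparing `baseline[31][1]` with `with_lockdown[31][1]` gives the infected proportion at the end of the hypothetical lockdown under each scenario.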

Figure A2. Implied $R_0$ schedules corresponding to the hypothetical scenarios under slow adherence.

Figure A3. Short-term daily growth in cumulative case counts in India assuming a 2-week delay in people’s adherence to restrictions. Observed data are shown for days up to April 14. Predicted future case counts for April 15 until April 30 are based on observed data until April 14 using the eSIR model.

Figure A4. Long-term daily growth in case counts in India per 100,000 people assuming a 2-week delay and how that is affected by different non-pharmaceutical intervention strategies. Predicted cumulative (a) and incident (b) case counts from April 30 to July 31 from the eSIR model are shown, based on observed data until April 14.

a. Trace plots and posterior density plots for $\beta$

b. Trace plots and posterior density plots for $\gamma$

c. Trace plots and posterior density plots for $R_0$

d. Posterior distribution for the predictions $Y$ and the latent proportions $\theta$ for the I (infected) compartment

e. Posterior distribution for the predictions $Y$ and the latent proportions $\theta$ for the R (removed) compartment

f. Estimates and posterior distribution of the daily prevalence of active cases over time, or $\frac{d\theta_t^I}{dt}$

Figure A5. Trace plots and posterior density plots for the underlying model: parameters $\beta$ (a), $\gamma$ (b), and $R_0$ (c); posterior distributions for the predictions $Y$ and the latent proportions $\theta$ for the I (d) and R (e) compartments over time; and estimates and posterior distribution of the daily prevalence of active cases over time, or $\frac{d\theta_t^I}{dt}$ (f). These plots correspond to the 21-day lockdown with moderate return scenario under quick adherence.

Figure A6. Cumulative (a) and incidence (b) graphs for forecasting models assuming a 2-week delay under 21-, 28-, 42-, and 56-day lockdown scenarios using observed data through April 14.

Figure A7. Implied $R_0$ schedules corresponding to quick and slow adherence for the hypothetical lockdown duration scenarios.

a. Scenario with 10 times the number of reported cases (e.g., underreporting)

b. Scenario using metro population (e.g., to mimic case-clustering)

c. Scenario with prior mean of $R_0 = 2$

d. Scenario with prior mean of $R_0 = 3$

e. Scenario with prior mean of $R_0 = 4$

Figure A8. Posterior distributions of the projected case counts and latent proportions under sensitivity scenarios.

Figure A9. Model Calibration: Relative comparison of predictions using observed data up to a certain date (April 1, 7, and 14). Observed data (gray) is provided through April 30.


Figure A10. Daily testing patterns in selected countries (A); Testing numbers and proportions for 61 countries around the world affected by COVID-19 (B).

This article is © 2020 by Debashree Ray, Maxwell Salvatore, Rupam Bhattacharyya, Lili Wang, Jiacong Du, Shariq Mohammed, Soumik Purkayastha, Aritra Halder, Alexander Rix, Daniel Barker, Michael Kleinsasser, Yiwang Zhou, Debraj Bose, Peter Song, Mousumi Banerjee, Veerabhadran Baladandayuthapani, Parikshit Ghosh, and Bhramar Mukherjee. The article is licensed under a Creative Commons Attribution (CC BY 4.0) International license, except where otherwise indicated with respect to particular material included in the article. The article should be attributed to the authors identified above.
