TIME SERIES MODELS BASED ON GROWTH CURVES WITH APPLICATIONS TO FORECASTING CORONAVIRUS

p.kattuman@jbs.cam.ac.uk

Abstract. Time series models are developed for predicting future values of a variable which when cumulated is subject to an unknown saturation level. Such models are relevant for many disciplines, but here attention is focused on the spread of epidemics and the applications are for Coronavirus. The time series models are relatively simple but are such that their specification can be assessed by standard statistical test procedures. In the generalized logistic class of models, the logarithm of the growth rate of the cumulative series depends on a time trend. Allowing this trend to be time-varying introduces further flexibility and enables the effects of changes in policy to be tracked and evaluated.


Introduction
The progress of an epidemic typically starts off with the number of cases following an exponential growth path. Over time the growth rate falls and the total number of cases approaches a final level, the 'leveling of the curve'. Complex behavioural modeling of the progress of the disease, for example, by using the 'semi-mechanistic Bayesian hierarchical model' implemented by the team at Imperial College London, depends on many assumptions and unknowns; see Flaxman et al. (2020). Simple and transparent time series models may offer an alternative way of making predictions of the trajectory of the epidemic; see, for example, Chowell et al. (2016, section 2), where such approaches are called 'phenomenological'. Similar issues arise in economics where there is a contrast between calibrated and Bayesian models based on economic theory and data-based time series models. Avery et al. (2020) give an excellent review of these issues.
The progress towards an upper bound or saturation level can be taken on board with a sigmoid curve, such as the logistic

$$\mu(t) = \mu/(1 + \gamma_0 e^{-\gamma t}), \qquad \gamma_0, \gamma, \mu > 0, \quad -\infty < t < \infty, \tag{1.1}$$

where $\mu$ is the final level, $\gamma$ is a rate of progress parameter, $\gamma_0$ takes account of the initial conditions and $e$ is Euler's number. Logistic curves can be shown to arise from a model of a simple epidemic; see, for example, Daley and Gani (1999, ch. 2). More generally, sigmoid curves are used in many disciplines for a variety of applications, such as estimating the demand for new products and population growth of mammals subject to space and resource limitations. An early and influential discussion of growth curves was given by Gregg et al. (1964). More recently Panik (2014) describes growth curves and statistical methods for fitting them. Here we concentrate on an extension of the logistic curve called the generalized logistic (GL) or Richards curve; see Panik (2014, pp. 78-80). The Gompertz curve emerges as an important special case. Figure 1 shows two Gompertz growth curves together with the corresponding changes as given by the first derivatives. The effect of what turns out to be a key parameter, $\gamma$, on the peak is evident.
By formulating a statistical model, parameters such as $\gamma_0$, $\mu$ and $\gamma$ can be estimated from observations, denoted $Y_t$, made at discrete times $t = 1, \ldots, T$, on $\mu(t)$. One option is to work directly with the level, by basing a time series model on a deterministic trend, as in (1.1); for example, Meade and Islam (1995) report fitting a variety of growth curves to telecommunications data, while Ciufolini and Paolozzi (2020) report fitting a deterministic trend to total Coronavirus cases in Italy. The methods discussed in Panik (2014, ch. 4) are restricted to deterministic curves, possibly with a first-order autoregressive disturbance term. Our view is that, as with most economic and social time series, a deterministic trend fitted to the level is too inflexible. Instead we prefer to work with the change or the growth rate, with specification of the model informed by an assumption that the total follows a growth curve. The saturation level can be continually updated as new observations become available. A second advantage of working with the difference or the growth rate is that when logarithms are taken¹, estimation of the basic models derived from the GL class can be carried out by least squares regression, as proposed in Harvey (1984). Thus the method is viable even for a small number of observations. Finally the deterministic time trend in the estimating equations can be replaced by a stochastic one. This is particularly effective with a Gompertz function. The flexible trend, which can be estimated by the Kalman filter, as in the STAMP package of Koopman et al. (2020), allows parameters to evolve over time. As a result the model can adapt to significant events and changes in policy. Figure 2 shows predictions made for new cases of Coronavirus in Germany.
The predictions, which include a day of the week effect, were made using data up to the end of March and show a downward trend even though the series was only just reaching what subsequently turned out to be its peak. Moving beyond the peak signals that $R_t$, the effective reproduction number at time $t$, has dipped below one; see Flaxman et al. (2020) and Aronson et al. (2020).
When numbers are small, as is the case with deaths at the beginning or end of an epidemic, there is a strong argument for adopting a negative binomial distribution. Models formulated under an assumption of Gaussianity need to be modified accordingly and we show how this may be done using the score-driven approach described in Harvey (2013) and implemented in the TSL computer package of Lit et al. (2020). Section 2 describes the GL class of growth curves, while Section 3 discusses estimation. These methods are then applied in Section 4 to data on Coronavirus in UK hospitals and in Germany. Forecasts are set out and evaluated. The effects of policy, primarily the lockdown imposed on March 21st in the UK, are assessed in Section 5. There is concern about a potential second wave of infections as restrictions start to be eased and ways in which this might be monitored are discussed in Section 6. Section 7 concludes.

¹ One implication is that the forecasts of the change will tend asymptotically towards zero and never become negative. In some applications, $Y_t$ can go down as well as up and in these situations the methods proposed here do not apply. Instead flexibility can be brought into the deterministic growth curve by allowing parameters, such as $\mu$, to change over time; see Young and Ord (1989).

Growth curves
Let $\mu(t) \ge 0$ be a monotonically increasing function defined over the real line. The rate of change or 'incidence curve' is $d\mu(t)/dt \ge 0$. The generalized logistic is

$$\mu(t) = \mu/(1 + (\gamma_0/\kappa) e^{-\gamma t})^{\kappa}, \qquad \gamma_0, \gamma > 0, \tag{2.1}$$

where $\gamma$ is a growth rate parameter. The parameter $\kappa$ must be positive for there to be an upper asymptote; allowing $\kappa$ to be negative gives the class of general modified exponential (GME) growth curves. The logistic is obtained by setting $\kappa = 1$, while letting $\kappa \to \infty$ yields the Gompertz curve. When $\gamma_0$ is determined by the value of the curve at $t = 0$, it is

$$\gamma_0 = \kappa\left[(\mu/\mu(0))^{1/\kappa} - 1\right]. \tag{2.2}$$
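The members of the GL class can be compared numerically. The sketch below assumes the standard Richards form $\mu(t) = \mu/(1 + (\gamma_0/\kappa)e^{-\gamma t})^{\kappa}$, which is consistent with (2.2); the function names and the final level of 1000 are hypothetical, while $\gamma_0 = 20$ and $\gamma = 0.1$ are the values quoted for Figure 1.

```python
import math


def gl_curve(t, mu_bar, gamma0, gamma, kappa):
    """Generalized logistic (Richards) curve in the assumed standard form."""
    return mu_bar / (1.0 + (gamma0 / kappa) * math.exp(-gamma * t)) ** kappa


def logistic_curve(t, mu_bar, gamma0, gamma):
    """Logistic curve (1.1): the kappa = 1 member of the class."""
    return mu_bar / (1.0 + gamma0 * math.exp(-gamma * t))


def gompertz_curve(t, mu_bar, gamma0, gamma):
    """Gompertz curve: the limit of the GL curve as kappa grows without bound."""
    return mu_bar * math.exp(-gamma0 * math.exp(-gamma * t))


def gamma0_from_start(mu_bar, mu_0, kappa):
    """Equation (2.2): gamma0 implied by the starting value mu(0)."""
    return kappa * ((mu_bar / mu_0) ** (1.0 / kappa) - 1.0)


# Hypothetical final level; gamma0 and gamma as in the description of Figure 1.
MU_BAR, GAMMA0, GAMMA = 1000.0, 20.0, 0.1

level_logistic = gl_curve(30.0, MU_BAR, GAMMA0, GAMMA, kappa=1.0)
level_large_kappa = gl_curve(30.0, MU_BAR, GAMMA0, GAMMA, kappa=1e6)

# Round trip through (2.2): start a kappa = 2 curve and recover gamma0.
start_value = gl_curve(0.0, MU_BAR, GAMMA0, GAMMA, kappa=2.0)
recovered_gamma0 = gamma0_from_start(MU_BAR, start_value, kappa=2.0)
```

With $\kappa = 1$ the function coincides with the logistic, and a very large $\kappa$ is numerically indistinguishable from the Gompertz curve at moderate $t$.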

2.1. Where is the peak? The point of inflexion on the growth curve is the point at which the number of new cases, $d\mu(t)/dt$, peaks. Differentiating the logarithm of $d\mu(t)/dt = g(t)\mu(t)$ yields the condition

$$g(t) + g_g(t) = 0,$$

where $g_g(t)$ is the growth rate of the growth rate. It follows from (2.4) that the point of inflexion in the GL occurs when $g(t) = \gamma/\rho$, where $\rho = (\kappa + 1)/\kappa$. The corresponding value of $\mu(t)$ is

$$\mu(t^*) = \mu\,(\kappa/(\kappa + 1))^{\kappa}.$$

The change declines more slowly than it ascends when $\kappa > 1$; see, for example, Figure 1. When $\gamma_0$ is determined by $\mu(0)$, as in (2.2), the peak is at

$$t^* = \ln \gamma_0/\gamma, \qquad \gamma_0 > 1, \tag{2.9}$$

so it comes forward as $\gamma$ increases.
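The peak formula (2.9) can be checked against a brute-force search. This is a minimal sketch for the Gompertz case, assuming the form $\mu(t) = \mu\exp(-\gamma_0 e^{-\gamma t})$; the helper names are hypothetical and the final level is normalised to one. The parameter values are those quoted for the two curves in Figure 1.

```python
import math


def gompertz_incidence(t, mu_bar, gamma0, gamma):
    """Incidence curve d mu(t)/dt = g(t) mu(t) for the Gompertz curve."""
    level = mu_bar * math.exp(-gamma0 * math.exp(-gamma * t))
    growth_rate = gamma * gamma0 * math.exp(-gamma * t)
    return growth_rate * level


def peak_time(gamma0, gamma):
    """Equation (2.9): t* = ln(gamma0) / gamma, valid for gamma0 > 1."""
    return math.log(gamma0) / gamma


def grid_peak_time(mu_bar, gamma0, gamma, t_max=100.0, step=0.01):
    """Locate the maximum of the incidence curve by grid search."""
    n_points = int(round(t_max / step))
    grid = [i * step for i in range(n_points + 1)]
    return max(grid, key=lambda t: gompertz_incidence(t, mu_bar, gamma0, gamma))


slow_peak = peak_time(20.0, 0.10)   # roughly day 30
fast_peak = peak_time(20.0, 0.15)   # roughly day 20
slow_height = gompertz_incidence(slow_peak, 1.0, 20.0, 0.10)
fast_height = gompertz_incidence(fast_peak, 1.0, 20.0, 0.15)
```

In the Gompertz case the height of the peak works out to $\mu\gamma/e$, so a larger $\gamma$ both raises the peak and brings it forward, and a lower $\mu$ scales it down.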

2.2. Statistical distributions and epidemics. Writing a GL growth curve as $\mu(t) = \mu F(t)$ allows $F(t)$ to be interpreted as the cumulative distribution function (CDF) of the log of a Dagum distribution; see Kleiber and Kotz (2003, pp. 212-213). Hence the corresponding probability density function (PDF), $f(t)$, is a special case of an Exponential Generalized Beta of the Second Kind (EGB2) distribution; see McDonald and Xu (1995). The Gompertz distribution written using (2.10) has $f(t) = \gamma\gamma_0 F(t)\exp(-\gamma t)$. Figure 1 shows two Gompertz PDFs with the red bold dotted curve corresponding to $\gamma_0 = 20$ and $\gamma = 0.1$ and the blue dotted curve corresponding to $\gamma_0 = 20$ and $\gamma = 0.15$. The effect of increasing $\gamma$ is to raise the peak and bring it forward. Since $d\mu(t)/dt = \mu f(t)$ the peak is brought down by a lower $\mu$.
When estimating a deterministic growth curve, some researchers, for example Ciufolini and Paolozzi (2020), prefer to use a sigmoid defined in terms of the Gaussian error function, which is the CDF of a Gaussian variate. Similarly, Murray (2020) uses the Gaussian error function to model the logarithm of total Coronavirus cases in US states. It is difficult to see why the Gaussian error function, which has no closed form, might be preferred to the more flexible GL curve.
It follows from (2.5) that

$$f(t) = \gamma\kappa F(t)\,[1 - F(t)^{1/\kappa}],$$

where $\mu(t) = \mu F(t)$. In a simple epidemic, $d\mu(t)/dt$ is proportional to $F(t)(1 - F(t))$, which corresponds to a logistic growth curve, $\kappa = 1$. Allowing $\kappa$ to be other than one gives more flexibility and is a useful generalization if the model provides a good fit to the data. Indeed complex mechanistic models of epidemics, with the population assigned to compartments labeled Susceptible, Infected, and Recovered (SIR), often produce incidence curves that are positively skewed. An example is the model of Giordano et al. (2020) which is based on a system of eight differential equations.

Statistical modeling
In the observational model, the cumulative total at time $t - 1$, $Y_{t-1}$, replaces $\mu(t)$ and the (positive) change, $y_t = \Delta Y_t$, replaces $d\mu(t)/dt$, giving

$$\ln y_t = \rho \ln Y_{t-1} + \delta - \gamma t + \varepsilon_t, \tag{3.1}$$

where the disturbance terms $\varepsilon_t$ are assumed to be independently and identically distributed with mean zero and constant variance, $\sigma^2_\varepsilon$, that is, $\varepsilon_t \sim IID(0, \sigma^2_\varepsilon)$. Subtracting $\ln Y_{t-1}$ from both sides gives the form corresponding to (2.4), namely

$$\ln g_t = (\rho - 1)\ln Y_{t-1} + \delta - \gamma t + \varepsilon_t, \tag{3.2}$$

where $g_t = y_t/Y_{t-1}$, although it may also be defined as $\Delta \ln Y_t$. The parameters $\rho$, $\delta$ and $\gamma$ can therefore be estimated by regression. If $\rho$ takes a specified value, $\ln y_t - \rho \ln Y_{t-1}$ is simply regressed on a constant and time trend. The estimators of $\delta$ and $\gamma$ are then efficient when the disturbance is Gaussian, that is, $\varepsilon_t \sim NID(0, \sigma^2_\varepsilon)$. The observational model for the Gompertz curve is

$$\ln y_t = \ln Y_{t-1} + \delta - \gamma t + \varepsilon_t, \tag{3.3}$$

or the simple time trend regression

$$\ln g_t = \delta - \gamma t + \varepsilon_t. \tag{3.4}$$

3.1. A time-varying trend: the dynamic Gompertz model. A stochastic trend may be introduced into (3.4) to give the dynamic Gompertz model

$$\ln g_t = \delta_t + \varepsilon_t, \qquad \varepsilon_t \sim NID(0, \sigma^2_\varepsilon), \tag{3.5}$$

where

$$\delta_t = \delta_{t-1} - \gamma_{t-1} + \eta_t, \quad \eta_t \sim NID(0, \sigma^2_\eta), \qquad \gamma_t = \gamma_{t-1} + \zeta_t, \quad \zeta_t \sim NID(0, \sigma^2_\zeta),$$

and the normally distributed irregular, level and slope disturbances, $\varepsilon_t$, $\eta_t$ and $\zeta_t$, respectively, are mutually independent. When $\sigma^2_\eta = \sigma^2_\zeta = 0$, the trend is deterministic, that is, $\delta_t = \delta - \gamma t$ with $\delta = \delta_0$. On the one hand, when only $\sigma^2_\zeta$ is zero, the slope is fixed and the trend reduces to a random walk with drift. On the other hand, allowing $\sigma^2_\zeta$ to be positive, but setting $\sigma^2_\eta = 0$, gives an integrated random walk (IRW) trend, which when estimated tends to be relatively smooth. The degree of smoothness depends on the signal-noise ratio, $q_\zeta = \sigma^2_\zeta/\sigma^2_\varepsilon$.

The STAMP package of Koopman et al. (2020) can estimate a stochastic trend using techniques based on state space models and the Kalman filter. The Kalman filter outputs estimates of the state vector $(\delta_t, \gamma_t)'$. Estimates of the state at time $t$ conditional on information up to and including time $t$ are denoted $(\delta_{t|t}, \gamma_{t|t})'$ and given by the contemporaneous filter; the predictive filter, which outputs $(\delta_{t+1|t}, \gamma_{t+1|t})'$, estimates the state at time $t + 1$ from the same information set. It may sometimes be useful to review past movements by the smoother, which is the estimate of the state at time $t$ based on all $T$ observations in the series. Estimation of the unknown variance parameters is by maximum likelihood (ML). Tests for normality and residual serial correlation are based on the standardized innovations, that is, one-step ahead prediction errors, $v_t = \ln g_t - \delta_{t|t-1}$, $t = 3, \ldots, T$.
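The filtering step can be sketched without any package. The code below is a minimal Kalman filter for the dynamic Gompertz model, in which the level of $\ln g_t$ falls each period by a slope that follows a random walk. It is not the STAMP implementation: the function name is hypothetical, the variances are taken as known rather than estimated by ML, and the diffuse prior is approximated by a large initial variance, which is an assumption of this sketch.

```python
def dynamic_gompertz_filter(log_g, q_eta=0.0, q_zeta=0.001, sigma2_eps=1.0, big=1e7):
    """Filter ln g_t = delta_t + eps_t, where
    delta_t = delta_{t-1} - gamma_{t-1} + eta_t and gamma_t = gamma_{t-1} + zeta_t.

    q_eta and q_zeta are signal-noise ratios relative to sigma2_eps.
    Returns the filtered states (delta_{t|t}, gamma_{t|t}) and the innovations.
    The first two innovations only reflect the vague prior.
    """
    a = [0.0, 0.0]                    # predicted state (delta, gamma)
    P = [[big, 0.0], [0.0, big]]      # large variance stands in for a diffuse prior
    var_eta = q_eta * sigma2_eps
    var_zeta = q_zeta * sigma2_eps
    filtered, innovations = [], []
    for obs in log_g:
        # Updating step: only the level delta is observed.
        v = obs - a[0]
        F = P[0][0] + sigma2_eps
        K = [P[0][0] / F, P[1][0] / F]
        af = [a[0] + K[0] * v, a[1] + K[1] * v]
        Pf = [[P[i][j] - K[i] * P[0][j] for j in range(2)] for i in range(2)]
        filtered.append((af[0], af[1]))
        innovations.append(v)
        # Prediction step with transition matrix [[1, -1], [0, 1]].
        a = [af[0] - af[1], af[1]]
        M = [[Pf[0][0] - Pf[1][0], Pf[0][1] - Pf[1][1]],
             [Pf[1][0], Pf[1][1]]]
        P = [[M[0][0] - M[0][1] + var_eta, M[0][1]],
             [M[1][0] - M[1][1], M[1][1] + var_zeta]]
    return filtered, innovations


# Noise-free synthetic series with a constant slope of 0.07 (hypothetical values).
log_g = [-2.0 - 0.07 * t for t in range(1, 41)]
states, innovations = dynamic_gompertz_filter(log_g, q_zeta=0.001)
delta_end, gamma_end = states[-1]
```

On this deterministic series the filtered slope settles at 0.07; with real data, a larger `q_zeta` makes the slope respond faster to recent observations at the cost of a noisier estimate.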
A stochastic trend can be introduced into the more general GL model. However, unless ρ is fixed, it may be hard to estimate in small samples.
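With $\rho$ fixed and a deterministic trend, estimation reduces to regressing $\ln y_t - \rho \ln Y_{t-1}$ on a constant and a time trend. The following sketch simulates a series from the static Gompertz model and recovers $\delta$ and $\gamma$ by ordinary least squares; the function names, the simulated parameter values and the random seed are all hypothetical.

```python
import math
import random


def ols_trend(z):
    """OLS of z_t on a constant and a trend t = 1, ..., n; returns (intercept, slope)."""
    n = len(z)
    t_bar = (n + 1) / 2.0
    z_bar = sum(z) / n
    sxx = sum((t - t_bar) ** 2 for t in range(1, n + 1))
    sxz = sum((t - t_bar) * (zt - z_bar) for t, zt in enumerate(z, start=1))
    slope = sxz / sxx
    return z_bar - slope * t_bar, slope


def fit_gl_fixed_rho(cumulative, rho=1.0):
    """Regress ln y_t - rho ln Y_{t-1} on a constant and trend.

    Returns (delta_hat, gamma_hat); the trend is indexed from the first change.
    """
    z = []
    for prev, cur in zip(cumulative[:-1], cumulative[1:]):
        change = cur - prev                      # y_t, assumed positive
        z.append(math.log(change) - rho * math.log(prev))
    intercept, slope = ols_trend(z)
    return intercept, -slope


# Simulate from ln g_t = delta - gamma t + eps_t (illustrative values only).
random.seed(0)
TRUE_DELTA, TRUE_GAMMA, SIGMA = -1.5, 0.07, 0.1
Y = [100.0]
for t in range(1, 61):
    g = math.exp(TRUE_DELTA - TRUE_GAMMA * t + random.gauss(0.0, SIGMA))
    Y.append(Y[-1] * (1.0 + g))

delta_hat, gamma_hat = fit_gl_fixed_rho(Y, rho=1.0)
```

Because the regression needs only a constant and a trend, it remains usable with the short samples available early in an epidemic.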
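The Kalman filter can be by-passed by adopting the reduced form, which comes from the innovations form of the Kalman filter; see Harvey (2013, pp. 3, 71). For the GL curve, the reduced form is the steady-state innovations form, in which the innovations update the level and slope with gains $\alpha_1$ and $\alpha_2$ respectively. The filter for an IRW has $\alpha_2 = \alpha_1^2/(2 - \alpha_1)$ with $0 \le \alpha_1 \le 1$; see Harvey (2013, pp. 78-79). The implied value of $q_\zeta$ is

$$q_\zeta = \alpha_2^2/(1 - \alpha_1) = \alpha_1^4/[(1 - \alpha_1)(2 - \alpha_1)^2].$$

The parameter $\alpha_1$ thus plays a role similar to that of the signal-noise ratio, $q_\zeta$, and can be estimated by ML. When it is zero, the model reverts to (3.2).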
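The mapping from $\alpha_1$ to the implied signal-noise ratio can be sketched as follows. The relation $\alpha_2 = \alpha_1^2/(2 - \alpha_1)$ is taken from the text; the expression $q_\zeta = \alpha_2^2/(1 - \alpha_1)$ is an assumption here, derived from the steady state of an integrated random walk, and the function name is hypothetical. As a plausibility check, $\alpha_1 = 0.34$ gives a value close to the 0.007 reported for UK deaths in Section 4.4.

```python
def irw_gain_and_signal_noise(alpha1):
    """Return (alpha2, q_zeta) implied by alpha1 for an integrated random walk.

    alpha2 follows the relation quoted in the text; the formula for q_zeta
    is an assumed steady-state result, not quoted from the source.
    """
    if not 0.0 <= alpha1 < 1.0:
        raise ValueError("alpha1 must lie in [0, 1) for a finite q_zeta")
    alpha2 = alpha1 ** 2 / (2.0 - alpha1)
    q_zeta = alpha2 ** 2 / (1.0 - alpha1)
    return alpha2, q_zeta


alpha2_uk, q_uk = irw_gain_and_signal_noise(0.34)
alpha2_static, q_static = irw_gain_and_signal_noise(0.0)
```

Setting $\alpha_1 = 0$ gives $q_\zeta = 0$, the deterministic trend case.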

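3.2. Forecasts. Forecasts of future observations and an estimate of the final level can be obtained from the predictive recursions

$$\hat{y}_{T+\ell|T} = \hat{\mu}^{\rho}_{T+\ell-1|T} \exp(\delta_{T+\ell|T}), \tag{3.11}$$

$$\hat{\mu}_{T+\ell|T} = \hat{\mu}_{T+\ell-1|T} + \hat{y}_{T+\ell|T}, \qquad \ell = 1, 2, \ldots, \tag{3.12}$$

where $\hat{\mu}_{T|T} = Y_T$ and $\delta_{T+\ell|T} = \delta_T - \gamma\ell$. For the Gompertz model, where $\rho = 1$, the forecasts for the growth rate are simply

$$\hat{g}_{T+\ell|T} = \exp(\delta_T)\exp(-\gamma\ell), \tag{3.13}$$

so (3.11) and (3.12) yield

$$\hat{\mu}_{T+\ell|T} = Y_T \prod_{j=1}^{\ell} (1 + \exp(\delta_T)\exp(-\gamma j)). \tag{3.14}$$

The logarithm of the product in (3.14) can be approximated by the sum of the forecast growth rates.²

A future point of inflexion is given at the lead time at which the forecast growth rate falls to $\gamma/\rho$. In the Gompertz case, (3.13) gives $\ell^* = (\delta_T - \ln\gamma)/\gamma$, provided $\delta_T > \ln\gamma$.

An assumption of normality for the disturbances in (3.1) implies that, conditional on information at time $t - 1$, $y_t$ is log-normal. Thus there may be a case for adding $0.5\sigma^2$ to $\delta_T$, where $\sigma^2$ is the prediction error variance. (When there is no stochastic trend, $\sigma^2 = \sigma^2_\varepsilon$.) Predictive distributions of future observations may be obtained by simulation.

² Higher order terms can be neglected when $\hat{g}_{T+\ell|T}$ is small. Then $\sum_{j=1}^{\ell} \hat{g}_{T+j|T} = \exp(\delta_T)\sum_{j=1}^{\ell}\exp(-\gamma j) \to \exp(\delta_T)/(\exp\gamma - 1)$ as $\ell \to \infty$.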
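The forecasting recursions can be sketched in a few lines. The code below assumes a deterministic trend beyond the end of the sample, so that the forecast change at lead time $\ell$ is the previous forecast total raised to the power $\rho$ times $\exp(\delta_T - \gamma\ell)$; this form is a reconstruction consistent with (3.1) rather than a quotation. The function names and the end-of-sample values are hypothetical.

```python
import math


def forecast_gl(y_total_T, delta_T, gamma, rho=1.0, horizon=30):
    """Predictive recursions: returns (forecast changes, forecast cumulative totals)."""
    total = y_total_T
    changes, totals = [], []
    for ell in range(1, horizon + 1):
        change = total ** rho * math.exp(delta_T - gamma * ell)
        total += change
        changes.append(change)
        totals.append(total)
    return changes, totals


def gompertz_total(y_total_T, delta_T, gamma, horizon):
    """Closed-form product for the Gompertz case (rho = 1)."""
    total = y_total_T
    for j in range(1, horizon + 1):
        total *= 1.0 + math.exp(delta_T) * math.exp(-gamma * j)
    return total


def gompertz_peak_lead(delta_T, gamma):
    """Lead time at which the forecast growth rate exp(delta_T - gamma*l) equals gamma."""
    return (delta_T - math.log(gamma)) / gamma


# Illustrative end-of-sample values (not taken from the paper's tables).
changes, totals = forecast_gl(1000.0, delta_T=-2.0, gamma=0.05, rho=1.0, horizon=60)
closed_form = gompertz_total(1000.0, delta_T=-2.0, gamma=0.05, horizon=60)
peak_lead = gompertz_peak_lead(-2.0, 0.05)   # about 20 periods ahead
```

For $\rho = 1$ the recursion and the product give the same total; for other values of $\rho$ only the recursion applies.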
3.3. Models for the growth rate. A stochastic model for the logistic curve can be based on (2.6), as in Levenbach and Reuter (1976), by adding a serially independent Gaussian disturbance term so that

$$g_t = \gamma - (\gamma/\mu)\,\mu_{t|t-1} + \varepsilon_t,$$

where $\mu_{t|t-1}$ is an estimate of $\mu_t$, such as $Y_{t-1}$, based on information at time $t - 1$ and $\varepsilon_t$ has mean zero and constant variance, $\sigma^2_\varepsilon$, that is $\varepsilon_t \sim NID(0, \sigma^2_\varepsilon)$. Regressing $g_t$ on $Y_{t-1}$ gives estimates of the key parameters $\gamma$ and $\mu$. Generalization of this approach is based on (2.6) but leads to a nonlinear equation which requires a search over the range of $\kappa$ if it is unknown. In the Gompertz case, estimation is as straightforward as for the logistic model because

$$g_t = \gamma\ln\mu - \gamma\ln\mu_{t|t-1} + \varepsilon_t.$$

Models based on the logarithm of the growth rate are preferred because they have better statistical properties. For example, the disturbance term is less likely to be heteroscedastic.
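The regression of $g_t$ on $Y_{t-1}$ for the logistic case can be illustrated on a noise-free series. The sketch assumes the linear form $g_t = \gamma - (\gamma/\mu)Y_{t-1}$, so the intercept estimates $\gamma$ and minus the intercept over the slope estimates the final level; the function names and the simulated values are hypothetical.

```python
def simple_ols(x, y):
    """OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def fit_logistic_growth(cumulative):
    """Regress g_t = y_t / Y_{t-1} on Y_{t-1}; returns (gamma_hat, final_level_hat)."""
    lagged = cumulative[:-1]
    growth = [(cur - prev) / prev for prev, cur in zip(cumulative[:-1], cumulative[1:])]
    intercept, slope = simple_ols(lagged, growth)
    return intercept, -intercept / slope


# Noise-free discrete logistic path with gamma = 0.2 and final level 5000.
Y = [50.0]
for _ in range(60):
    Y.append(Y[-1] * (1.0 + 0.2 * (1.0 - Y[-1] / 5000.0)))

gamma_hat, final_level_hat = fit_logistic_growth(Y)
```

With noisy data the disturbance in this levels regression is more likely to be heteroscedastic, which is the reason given above for preferring the logarithmic form.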
3.4. Small numbers: the negative binomial distribution. When $y_t$ is small, it may be better to specify its distribution, conditional on past values, as discrete. The usual choice is the negative binomial which, when parameterized in terms of a time-varying mean, $\xi_{t|t-1}$, and a fixed positive shape parameter, $\upsilon$, has probability mass function (PMF)

$$p(y_t \mid Y_{t-1}) = \frac{\Gamma(\upsilon + y_t)}{y_t!\,\Gamma(\upsilon)}\,\xi_{t|t-1}^{y_t}\,(\upsilon + \xi_{t|t-1})^{-y_t}\,(1 + \xi_{t|t-1}/\upsilon)^{-\upsilon}, \qquad y_t = 0, 1, 2, \ldots,$$

with $Var_{t-1}(y_t) = \xi_{t|t-1} + \xi^2_{t|t-1}/\upsilon$. An exponential link function ensures that $\xi_{t|t-1}$ remains positive and at the same time yields an equation similar to (3.1):

$$\ln \xi_{t|t-1} = \rho\ln Y_{t-1} + \delta - \gamma t.$$

A stochastic trend may be introduced into the model, as in sub-section 3.2, by developing a filter for the time-varying trend similar in structure to that of (3.6). Because the observations are not Gaussian, the dynamic conditional score (DCS) framework described in Harvey (2013, pp. 77-79) is used to give

$$\ln \xi_{t|t-1} = \rho\ln Y_{t-1} + \delta_{t|t-1},$$

where $\delta_{t|t-1}$ is given by a filter like that for the IRW in the Gaussian model, but driven by $u_t = y_t/\xi_{t|t-1} - 1$, which is the conditional score for $\ln \xi_{t|t-1}$, that is $\upsilon(y_t - \xi_{t|t-1})/(\upsilon + \xi_{t|t-1})$, divided by the information quantity. The dynamic Gompertz model has $\rho = 1$. Estimation is by maximum likelihood. Predictions of future observations and the final level can be obtained from (3.9) and (3.10).
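The claim that the scaled score is $u_t = y_t/\xi_{t|t-1} - 1$ can be verified numerically. The sketch below uses the standard negative binomial PMF with mean $\xi$ and shape $\upsilon$, differentiates its logarithm with respect to $\ln\xi$ by central differences, and divides by the information quantity $\upsilon\xi/(\upsilon + \xi)$. The function names, the observation and the mean are hypothetical; $\upsilon = 13.25$ is the estimate reported for German deaths.

```python
import math


def nb_logpmf(y, xi, upsilon):
    """Log PMF of the negative binomial with mean xi and shape upsilon."""
    return (math.lgamma(upsilon + y) - math.lgamma(y + 1) - math.lgamma(upsilon)
            + y * math.log(xi) - y * math.log(upsilon + xi)
            - upsilon * math.log(1.0 + xi / upsilon))


def score_log_mean(y, xi, upsilon):
    """Conditional score with respect to ln(xi)."""
    return upsilon * (y - xi) / (upsilon + xi)


def information_log_mean(xi, upsilon):
    """Information quantity for ln(xi): the variance of the score."""
    return upsilon * xi / (upsilon + xi)


def scaled_score(y, xi, upsilon):
    """Score divided by information; should equal y / xi - 1."""
    return score_log_mean(y, xi, upsilon) / information_log_mean(xi, upsilon)


y_obs, xi, upsilon = 12, 8.0, 13.25   # y_obs and xi are illustrative
h = 1e-6
numeric_score = (nb_logpmf(y_obs, xi * math.exp(h), upsilon)
                 - nb_logpmf(y_obs, xi * math.exp(-h), upsilon)) / (2.0 * h)

# Sanity checks on the PMF itself: it sums to one and has mean xi.
support = range(0, 401)
total_prob = sum(math.exp(nb_logpmf(k, xi, upsilon)) for k in support)
mean_value = sum(k * math.exp(nb_logpmf(k, xi, upsilon)) for k in support)
```

Because $u_t$ does not depend on $\upsilon$, the shape parameter affects the filter only through the likelihood used to estimate the other parameters.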

Forecasting Coronavirus in the UK and Germany
We began working on this project at the beginning of April. At that time Coronavirus was not as far advanced in the UK as in Italy, and our initial exploration was focussed on Italy. Figure 3 shows new cases, as measured by hospital admissions on the European Centre for Disease Prevention and Control (ECDC) website, in the UK and Italy. Italy is seen to be clearly ahead and it is apparent that the decline is slower than the rise. This asymmetry is not consistent with a simple logistic model, (1.1), and later analysis confirmed this to be the case for most other countries, including the UK.

Economic and social time series are typically subject to periodic variation, due to seasonal, day of the week and other effects. Preliminary analysis of data for hospital admissions and deaths in Italy indicates a day of the week pattern and this is confirmed for the UK; one reason is that laboratory confirmation for the virus tends to slow down during weekends. Day of the week effects are initially included in the equations by the introduction of a single harmonic cycle with period seven; this possesses only two parameters as opposed to the six needed for a full set of dummy variables. Although we estimated many models in early April, the first results reported here for the UK are those obtained on April 13th, that is, with data up to the 12th, starting on March 5th. The models were updated on April 20th and 27th. Unfortunately, the data available to us after April 29th were not consistent with the earlier data as they were not confined to just tests carried out on hospital admissions.

4.1. Models fitted to new cases in the UK. Table 1 shows April 13th estimates for the GL, (3.2), the Gompertz, where $\rho$ is set to one, and the dynamic Gompertz where the trend is an IRW. The results for the logistic model are not reported because it gives an inferior fit and is rejected by a 't-test' on $\rho$; the standard error is in parentheses.
The diagnostics presented are the Durbin-Watson (DW) statistic for residual serial correlation, a portmanteau Q-statistic for serial correlation based on P autocorrelations, the Bowman-Shenton normality test statistic and a heteroscedasticity statistic formed as the ratio of the sum of squares of the last third of the residuals to that of the first third.
There is a substantive reduction in the prediction error variance for all models when daily effects are included. The peak day is Friday, the same day as was found for Italy. The results for the GL are shown in the penultimate column and may be compared with those in the first column. The likelihood ratio statistic is 8.48 which is significant at the 5% level of a $\chi^2_2$ distribution. Figure 4 shows the fit and residuals from the GL. Figure 5 shows the histogram and correlogram for the GL residuals but using data up to April 20th.
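The significance of the likelihood ratio statistic can be checked directly, since a chi-square distribution with two degrees of freedom has the closed-form survival function $\exp(-x/2)$. The sketch below is only an arithmetic check of the reported figure; the function names are hypothetical.

```python
import math


def chi2_sf_two_df(x):
    """Survival function of a chi-square variate with two degrees of freedom."""
    return math.exp(-x / 2.0)


def chi2_critical_two_df(level):
    """Critical value at the given significance level (two degrees of freedom)."""
    return -2.0 * math.log(level)


lr_statistic = 8.48                       # reported for the daily effect in the GL
p_value = chi2_sf_two_df(lr_statistic)    # a little over 0.014
critical_5pct = chi2_critical_two_df(0.05)
```

The statistic exceeds the 5% critical value of about 5.99, in line with the text, but falls short of the 1% value of about 9.21.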

4.2. Forecasts and forecast evaluation. The forecasts of new cases made on April 13th are shown in Figure 6, together with the actual values up to May 29th. The upper line is the dynamic Gompertz, while the lower lines are the GL with and without the daily effect. The actual outcome turns out to lie between the dynamic Gompertz and the GL. As with the German predictions shown in Figure 2, the important point is that for both models the forecasts are moving downwards even though the observations have barely peaked.

Predictions from the GL made on April 20th show little change from those made on April 13th. The final level under current policies is estimated to be 186,000. By contrast the dynamic Gompertz predictions change significantly. They are still higher than the GL predictions but the final level predicted (by the dynamic Gompertz model) is now 237,790, whereas on April 13th it was 308,960. The dynamic Gompertz predictions made on April 27th show little movement; the final predicted level is 253,800. Figure 7 shows the dynamic Gompertz predictions without the daily component, but the daily component was included when the models were estimated. The flexibility of the dynamic model resides in its ability to adapt to a situation in which the observations are falling less rapidly than indicated by the static GL.

... likelihood is smaller at 16.43. There is no evidence of residual serial correlation in either model and although the heteroscedasticity statistic indicates a diminishing variance, its value seems to be heavily influenced by just one observation (March 9th) near the beginning of the series. The fit is shown in Figure 8. The forecasts shown earlier in Figure 2 are quite remarkable in their accuracy over the next 36 days, that is, up to May 6th; see also Figure 9. The observations had not yet started to go down on April 1st, yet the sigmoid nature of the underlying growth curve means that the forecasts pick up the subsequent downward movement.
Fitting the dynamic Gompertz using data up to and including May 6th shows the model to be stable; see Table 2.

4.4. Deaths. Deaths in Germany behave in a way similar to new cases: Figure 11 shows the log growth rates. There are some zero values in the first three weeks of March so the dynamic Gompertz is estimated using data from March 22nd. Because the numbers are smaller it is perhaps not surprising that the signal-noise ratio is estimated to be zero. In other words, we end up with the static Gompertz model.

Figure 11. Logarithm of growth rate for infected cases and deaths in Germany.
The score-driven negative binomial model, introduced in sub-section 3.4, can be estimated with the TSL package of Lit et al. (2020). Observations from 11th March up to, and including, May 6th, are used. There are some zero values. Setting the $\rho$ parameter to unity gives $\alpha = 0$, corresponding to a deterministic trend, and $\gamma = 0.071$. The correlogram of the scores has a relatively high value at lag seven, indicating a daily effect. Modeling a fixed daily effect with dummy variables produces the fit in Figure 12, with a significantly increased log-likelihood. The parameter estimates are $\gamma = 0.070$, $\delta_T = -4.14$ and $\upsilon = 13.25$. It is reassuring that the values of $\gamma$ and $\delta_T$ are consistent with those reported for new cases. The total number of deaths on May 6th was 6996 and the final total is predicted to be 8714. When we revised our paper on June 19th the total was 8946 and daily deaths had fallen to very low numbers.
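The predicted final total can be roughly reproduced from the reported estimates. The sketch below assumes that the forecast growth rate at lead time $j$ is $\exp(\delta_T - \gamma j)$ and compares two calculations: the long-horizon product of $(1 + \text{growth rate})$ terms, and a first-order approximation in which the logarithm of that product is replaced by the sum of the growth rates, $\exp(\delta_T)/(\exp\gamma - 1)$. The function names are hypothetical, and the rounded parameter values mean that exact agreement should not be expected; which of the two calculations lies behind the published figure is not stated in the text.

```python
import math


def final_level_product(y_total_T, delta_T, gamma, horizon=2000):
    """Long-horizon value of Y_T * prod_j (1 + exp(delta_T - gamma * j))."""
    total = y_total_T
    for j in range(1, horizon + 1):
        total *= 1.0 + math.exp(delta_T - gamma * j)
    return total


def final_level_first_order(y_total_T, delta_T, gamma):
    """Approximation that keeps only the sum of the forecast growth rates."""
    return y_total_T * math.exp(math.exp(delta_T) / (math.exp(gamma) - 1.0))


# Reported values for German deaths (data up to May 6th).
TOTAL_MAY_6, DELTA_T, GAMMA = 6996.0, -4.14, 0.070

exact_total = final_level_product(TOTAL_MAY_6, DELTA_T, GAMMA)
approx_total = final_level_first_order(TOTAL_MAY_6, DELTA_T, GAMMA)
```

Under these assumptions the first-order approximation comes out at about 8714, the figure quoted above, while the exact product is a few deaths lower, at about 8707.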
Up to April 27th, the UK dataset on deaths from Coronavirus was restricted to deaths in hospitals. After April 28th, deaths in the community were included and the earlier figure revised. Although the relationship between deaths and new cases before April 28th is similar to that in Germany, there is a disconnect after that and the data on new cases are of very little help in predicting deaths.
Fitting a negative binomial model, with daily effects, to the series on UK deaths from March 17th to May 14th led to an estimated $\alpha$ of 0.34 (0.05). The changing trend is consistent with the Gaussian models fitted to new cases; the implied value of $q_\zeta$ is 0.007. The slope at the end of the series is 0.042 while $\upsilon = 29.91$ (6.13). Figure 13 shows the forecasts of the underlying trend up to one month ahead. New observations after the forecasts were made are marked by dots.

Figure 13. Forecasts (bold, blue) of daily deaths in the UK for one month ahead (using data up to and including May 14th).

The effect of policy interventions
The effect of significant interventions, which may be a policy change, such as the introduction of a lockdown, or an external event, such as the arrival of a cruise ship in a small port, may be modeled by intervention (dummy) variables. The difficulty is that the pattern of the response is rarely known and so it becomes difficult to obtain a meaningful estimate of the final effect. Nevertheless some notion of the response can be obtained by making forecasts at the time the policy is thought to have become effective⁷ and comparing these forecasts with the actual outcome. In order to investigate this possibility, the dynamic Gompertz model was estimated on UK data up to and including March 31 (10 days after lockdown) using a fixed $q_\zeta$ of 0.001 (as estimated later with a larger sample and reported in Table 1). No day of the week effect was included, because the series is rather short. Figure 14 shows there is considerable overprediction. However, if the signal-noise ratio is increased to $q_\zeta = 0.01$, so the most recent observations receive more weight, the predictions are excellent. The final level is now approximately 290,000 as against 1.8 million. New cases are at their maximum, with a value of 5,400, on April 18th, whereas with $q_\zeta = 0.001$, they do not peak until May 20th, when they reach 21,500. Although the variation in predictions is huge, the same is true of many of the large models where the output can be very sensitive to the assumptions made.

Information about the pattern of the response can also be obtained by fitting a dynamic Gompertz model and graphing estimates of the slope. Figure 15 shows filtered and smoothed estimates of slope in such a model based on data up to April 29th. The filtered estimates are most informative as they show the evolving changes in the slope after the lockdown of 21st March. As can be seen, the big falls occur at the beginning of April, with little or no change after mid-April. Figure 8 shows a very similar movement in Germany, but taking place a few weeks earlier.

⁷ Flaxman et al. (2020) provide information on the incubation period.
Growth curves can be used to parameterize a gradual response to an intervention. A permanent change is captured with the CDF and a temporary one with the PDF. A logistic CDF gives a response curve $W(t) = 1/(1 + \gamma^I_0 \exp(-\gamma^I(t - t_I)))$, where $t_I$ is the median. The $I$ superscript distinguishes the parameters $\gamma^I_0$ and $\gamma^I$ from the ones used to model the time series itself. With $t_L$ and $t_U$ denoting the beginning and the end of the time span during which gradual response to the intervention occurs, the intervention dummies are defined by $w_t = 0$ for $t < t_L$, $w_t = W(t)$ for $t = t_L, t_L + 1, \ldots, t_I, \ldots, t_U$ and $w_t = 1$ for $t = t_U + 1, \ldots, T$, or even just by $w_t = W(t)$ for $t = 1, \ldots, T$. With the Gompertz CDF the response is $W(t) = \exp(-\gamma^I_0 \exp(-\gamma^I(t - t_I)))$; in this case the point of inflexion, corresponding to the maximum change, comes before the median of the time span between $t_L$ and $t_U$. Figure 16 shows the CDFs for logistic and Gompertz distributions with the median set to seven; for logistic $\gamma^I = 0.5$ and for Gompertz $\gamma^I = 0.2$.

If the effect of a policy is to change $\gamma$ in the model, a slope intervention is needed in (3.2). This gives the model referred to as (5.1), but unless the sample is moderately large, $\rho$ will need to be fixed. When the full effect is realised, the slope on the time trend will have moved from $\gamma$ to $\gamma + \beta$. A positive $\beta$ will lower the growth rate, $g_t$, the peak of the incidence curve and the final level. Fitting the static model in (5.1) to new cases in the UK with $\rho = 1$ and a logistic intervention, starting on March 26th and ending on April 12th, gives an estimate of $\beta$ equal to 0.020 (0.004) and an estimate of $\gamma$ also equal to 0.020. The picture in Figure 17 is not unreasonable but the estimate of the overall effect is 0.041 which may be a slight underestimate because minus one times the final slope in Figure 15 is close to 0.05. When the slope is allowed to be stochastic, $\beta_T + \gamma$ is estimated at 0.054.
However, a stochastic slope risks some confounding with the intervention variable and in fact the estimate of $\beta$ is reduced to 0.014 (0.006). Although neither model is completely satisfactory, both give a significant coefficient for the intervention variable, albeit after a degree of data mining.

Figure 17. Estimates of logarithm of growth rate of total cases in UK with a logistic intervention and daily effect.
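The gradual-response dummies described in this section can be built as follows. This is a sketch only: the function names are hypothetical, and the initial-condition parameters are chosen so that the response equals one half at $t_I$ (one for the logistic, $\ln 2$ for the Gompertz), which is an assumption made here so that $t_I$ is the median in both cases. The rate parameters and the median of seven are those quoted for Figure 16; the window limits are illustrative.

```python
import math


def logistic_response(t, t_mid, rate, gamma0=1.0):
    """Logistic CDF response; gamma0 = 1 puts the median at t_mid."""
    return 1.0 / (1.0 + gamma0 * math.exp(-rate * (t - t_mid)))


def gompertz_response(t, t_mid, rate, gamma0=math.log(2.0)):
    """Gompertz CDF response; gamma0 = ln 2 puts the median at t_mid."""
    return math.exp(-gamma0 * math.exp(-rate * (t - t_mid)))


def gompertz_inflexion(t_mid, rate, gamma0=math.log(2.0)):
    """Time of maximum change of the Gompertz response."""
    return t_mid + math.log(gamma0) / rate


def intervention_dummies(n_obs, t_start, t_end, response):
    """w_t = 0 before t_start, response(t) from t_start to t_end, 1 afterwards."""
    dummies = []
    for t in range(1, n_obs + 1):
        if t < t_start:
            dummies.append(0.0)
        elif t <= t_end:
            dummies.append(response(t))
        else:
            dummies.append(1.0)
    return dummies


# Median at seven, as in Figure 16; the window 1..14 is a hypothetical choice.
w_logistic = intervention_dummies(20, 1, 14, lambda t: logistic_response(t, 7, 0.5))
w_gompertz = intervention_dummies(20, 1, 14, lambda t: gompertz_response(t, 7, 0.2))
inflexion = gompertz_inflexion(7, 0.2)   # a little before the median
```

The resulting series would enter the estimating equation as the variable that shifts the slope from $\gamma$ towards $\gamma + \beta$.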

A second wave?
With the relaxation of a lockdown, the unwelcome prospect of a second wave of infections arises. Dynamic GL models can monitor this possibility by tracking changes in the growth rate of the incidence curve. This growth rate, $g_y(t)$, depends on a potentially time-varying $\gamma$ parameter, $\gamma(t)$, and the growth rate, $g(t)$. Differentiating the logarithm of $d\mu(t)/dt$ gives

$$g_y(t) = g(t) + g_g(t) \tag{6.1}$$

and for GL curves, it follows from (2.4) that $g_g(t) = (\rho - 1)g(t) - \gamma(t)$, so that $g_y(t) = \rho g(t) - \gamma(t)$. In the discrete time dynamic Gompertz model, (3.5), the negative of the growth rate of the growth rate, $g_g(t)$, is tracked by the filtered estimates of the slope, that is, $\gamma_{t|t}$, while the growth rate itself is tracked by the exponential of the filtered level, that is, $g_{t|t} = \exp(\delta_{t|t})$. Figure 18 shows $\gamma_{t|t}$ and $g_{t|t}$ for Germany from a dynamic Gompertz model, together with the daily number of new cases. The maximum is identified as April 3rd and after this date $g_{t|t} < \gamma_{t|t}$; see (2.7). The possible onset of a second wave is raised if at some point $\gamma_{t|t}$ starts to fall below $g_{t|t}$, or below $\rho g_{t|t}$ in the general case. This would signal that the reproduction number, $R_t$, has moved up above one. The filtered estimates, $g_{t|t}$ and $\gamma_{t|t}$, are obtained by discounting of past observations, with the rate of discounting depending on the signal-noise ratio, $q_\zeta$. When a new policy is implemented, $q_\zeta$ may need to increase so that past observations are discounted at a faster rate, as illustrated in Figure 14. In these circumstances, the only viable tracking option may be to try a range of $q_\zeta$ values, bearing in mind the risk of triggering a false alarm. However, if the effects of a policy are spread over a period of time, as with a gradual relaxation of a lockdown, a value of $q_\zeta$ estimated with the complete sample may be perfectly satisfactory.

Figure 18. New cases in Germany together with filtered growth rate, $g_{t|t-1}$, and its rate of change, $\gamma_{t|t-1}$ (using data up to May 7th).
If γ t|t decreases, some idea of sampling variability is needed to assess the implications. This may be obtained from the standard deviation of γ t|t , as given by the Kalman filter. The variation in g t|t is harder to determine, but since it is typically much smoother than γ t|t it is likely to be small in comparison to the variation in γ t|t .
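The monitoring rule can be expressed as a simple check on the filtered series. In the sketch below the growth rate of the incidence curve is taken to be $\rho g_{t|t} - \gamma_{t|t}$, and a warning is raised only when it stays positive for several consecutive periods. The persistence requirement is a hypothetical device for limiting false alarms, not something proposed in the text, and the function names and the toy series are illustrative.

```python
def incidence_growth(g_filtered, gamma_filtered, rho=1.0):
    """Growth rate of new cases implied by filtered g and gamma: rho * g - gamma."""
    return [rho * g - gam for g, gam in zip(g_filtered, gamma_filtered)]


def first_alarm(g_filtered, gamma_filtered, rho=1.0, start=0, run=3):
    """First index at or after `start` where gamma falls below rho * g
    for `run` consecutive periods; returns None if there is no such run."""
    growth = incidence_growth(g_filtered, gamma_filtered, rho)
    flags = [value > 0.0 for value in growth]
    for i in range(start, len(flags) - run + 1):
        if all(flags[i:i + run]):
            return i
    return None


# Toy filtered series: growth falls below the slope (the peak), then rises again.
g_toy = [0.10, 0.06, 0.03, 0.02, 0.05, 0.06, 0.07]
gamma_toy = [0.05] * 7

alarm_index = first_alarm(g_toy, gamma_toy, rho=1.0, start=2, run=2)
no_alarm = first_alarm(g_toy[:5], gamma_toy[:5], rho=1.0, start=2, run=2)
```

In practice the comparison would also need to allow for the sampling variability of the filtered slope discussed above.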

Conclusions
A new class of time series models is developed for predicting future values of a variable which when cumulated is subject to an unknown saturation level. Such situations arise in a wide range of disciplines. The models provide a simple and viable alternative, or complement, to the forecasts produced by large scale mechanistic models.
Generalized logistic growth curves provide the basis for our models. Estimating equations for the logarithm of the growth rate of the cumulative variable provide a good fit to the data, as assessed by standard statistical tests. Such models feature a time trend which can be made time-varying using the Kalman filter. The Gompertz model is a special case which works particularly well with a stochastic trend and, when this modification is made, the fit is often better than with an unrestricted generalized logistic model with a deterministic trend. The dynamic Gompertz can adapt to changing conditions and tracking the slope, especially the filtered slope, can be informative about the effect of interventions.
Additional components can be included in the models. These include seasonal or day of the week effects. The latter turned out to be relevant for new cases and deaths in our application to coronavirus in the UK and Germany. The dynamic Gompertz model worked well for both countries but was particularly impressive for German new cases. Deaths were successfully modelled by a Gompertz model with a negative binomial conditional distribution and a dynamic equation driven by the conditional score.
Estimating a model up to the point at which the effect of an intervention is likely to make itself felt and then making (unconditional) forecasts provides information about the effects of the intervention. The effectiveness of this approach is illustrated by fitting a dynamic Gompertz model to UK data and then investigating the effect of the Coronavirus lockdown. Further insight into the effect of the lockdown is given by tracking the filtered estimate of the growth rate of the growth rate. The impact of the lockdown can be estimated ex post by including a growth curve as an explanatory variable, thereby allowing the intervention to have a gradual response.
Finally, we suggest that the possibility of a second wave can be monitored by tracking the filtered estimates of new cases or deaths given by our model. Of course, if these methods are to be useful in practice they require reliable up-to-date observations on new cases, preferably at a disaggregated level. Implementing viable tracking procedures and relating them to current methods for tracking the reproduction number, R, is the next phase in our research program.
Disclosure Statement. The authors have no conflicts of interest to declare.

... countries where testing is limited. Biases also arise from imperfect or delayed reporting and errors in reporting.
Overall, deaths are more likely to be reported accurately by healthcare providers, but sources of bias remain. Time delays in reporting and the misattribution of deaths are not uncommon. In addition, national reporting guidelines during the early days are often focussed on hospital deaths to the exclusion of deaths in community and care homes.
We focus on the UK and Germany due to our familiarity with the health data infrastructures and validation procedures in these countries. The data we use in this paper can be accessed from: https://www.ecdc.europa.eu/en/publications-data/download-todays-data-geographic-distribution-covid-19-cases-worldwide.