Overview

Using the daily incidence curve and a collection of festive or anomalous days, EpiInvert estimates a time-varying reproduction number and a restored incidence curve by inverting the renewal equation:

\((1) \ \ \ \ i_t=\sum_k i_{t-k}R_{t-k}\Phi_k\)

through a variational model, as described in PNAS, 2021 and Biology, 2022. See Rt comparison for a comparison with other methods that compute the reproduction number.
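To make equation (1) concrete, here is a minimal base R sketch that evaluates the renewal equation forwards, that is, it computes the incidence from a given reproduction number and serial interval. It only illustrates the equation; it is not how EpiInvert works internally, since EpiInvert solves the inverse problem. The function name `renewal_forward` and the serial interval values are hypothetical, and only non-negative lags k = 1..K are considered.

```r
# Forward evaluation of i_t = sum_k i_{t-k} R_{t-k} Phi_k (illustration only).
renewal_forward <- function(i_init, R, phi) {
  n <- length(R)
  K <- length(phi)                 # phi[k] is the serial interval at lag k = 1..K
  stopifnot(length(i_init) < n)
  i <- numeric(n)
  i[seq_along(i_init)] <- i_init   # seed values needed to start the recursion
  for (t in (length(i_init) + 1):n) {
    k <- seq_len(min(K, t - 1))    # lags that stay inside the series
    i[t] <- sum(i[t - k] * R[t - k] * phi[k])
  }
  i
}

phi <- c(0.2, 0.5, 0.3)            # hypothetical serial interval, sums to 1
R   <- rep(1, 10)                  # constant reproduction number equal to 1
i   <- renewal_forward(c(100, 100, 100), R, phi)
```

With a reproduction number equal to 1 and a serial interval that sums to 1, the incidence stays at its seed level, which is the usual interpretation of Rt = 1.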

A festive or anomalous day is any day on which we know “a priori” that the registered number of cases is biased. Typically, on those days one observes a sharp decrease in the number of registered cases, which is compensated by increased numbers over the next few days. This bias is corrected by redistributing the cases among the festive day and the following 2 days.
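The idea of redistributing cases over a 3-day window can be sketched as follows. This is a deliberately simplistic scheme (a flat split that preserves the window total), not the correction EpiInvert actually computes; the function name `redistribute_festive` and the numbers are hypothetical.

```r
# Simplistic illustration: spread the cases of the festive day and the next
# n_after days evenly over that window, keeping the total number of cases.
redistribute_festive <- function(incid, festive_idx, n_after = 2) {
  idx <- festive_idx:min(festive_idx + n_after, length(incid))
  incid[idx] <- sum(incid[idx]) / length(idx)
  incid
}

incid <- c(100, 100, 20, 150, 130, 100)   # day 3 is a festive day
fixed <- redistribute_festive(incid, festive_idx = 3)
```

Only the festive day and the 2 following days change, and the total number of cases is unchanged.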

On top of the festive day bias, there is a strong administrative weekly bias introduced by the way countries register new cases on each day of the week. This weekly bias is corrected using 7-day quasi-periodic multiplicative correction factors. We use the following notation:

\((2) \ \ \ \ i^f_t \ \ \text{is the festive day bias free incidence}\)

\((3) \ \ \ \ q_t \ \ \text{ is the 7-day quasi-periodic multiplicative correction factor}\)

\((4) \ \ \ \ i^b_t=i^f_tq_t \ \ \text{is the festive + weekly biases free incidence}\)
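As a rough illustration of equation (4), the sketch below builds a synthetic incidence with a weekly reporting pattern and removes it with multiplicative factors. It assumes strictly periodic factors estimated from a centred 7-day moving average, which is a simplification: EpiInvert estimates quasi-periodic factors within its variational model. All values are synthetic.

```r
n      <- 28
trend  <- rep(1000, n)                        # hypothetical underlying incidence
weekly <- rep(c(1.2, 1.1, 1.0, 1.0, 0.9, 0.5, 0.3), length.out = n)
i_f    <- trend * weekly                      # festive-bias-free, weekly-biased

# Centred 7-day moving average used as a crude weekly-bias-free reference
ma7 <- as.numeric(stats::filter(i_f, rep(1 / 7, 7), sides = 2))
day <- (seq_len(n) - 1) %% 7 + 1              # position within the week, 1..7

# One factor per day of the week, then expanded to the whole series (q_t)
q   <- as.numeric(tapply(ma7 / i_f, day, mean, na.rm = TRUE)[day])
i_b <- i_f * q                                # equation (4)
```

After the correction the weekly oscillation is gone and `i_b` is flat, at the level of the 7-day average of the synthetic series.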

Once the festive day and weekly biases are corrected in the incidence curve, the difference between the incidence curve and its expected value under the renewal equation is modeled by

\((5) \ \ \ \ i^b_{t}=i^r_t+\varepsilon_{t}(i_{t}^r)^a\)

where

\((6) \ \ \ \ i^r_t=\sum_k i^b_{t-k}R_{t-k}\Phi_k.\)

In a nutshell, the proposed variational model is based on estimating all the variables involved in order to minimize the difference between the incidence and its expected value using the renewal equation.

The exponent a in equation (5) is computed experimentally by linear regression (in t) applied to

\((7) \ \ \ \ (\log(|i^b_{t}-i^r_t|),\log(i^r_t)).\)

The normalized error of the model is given by

\((8) \ \ \ \ \varepsilon_t=\frac{i^b_t-i^r_t}{(i^r_t)^a}.\)

In Biology, 2022, it is shown experimentally that this normalized error is well approximated by exponentially distributed white noise.
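The regression in (7) and the normalized error in (8) can be tried on synthetic data. In the sketch below the data are generated from equation (5) with a known exponent, and the slope of the regression recovers it approximately. The noise distribution, sample size and value of `a_true` are arbitrary choices for illustration, not values taken from EpiInvert.

```r
set.seed(42)
n      <- 2000
a_true <- 0.7                                 # exponent used to generate the data
i_r    <- exp(runif(n, log(10), log(1e5)))    # synthetic expected incidence
eps    <- rnorm(n)                            # synthetic noise (arbitrary choice)
i_b    <- i_r + eps * i_r^a_true              # equation (5)

# Regression of log|i_b - i_r| on log(i_r): the slope estimates a
fit   <- lm(log(abs(i_b - i_r)) ~ log(i_r))
a_hat <- unname(coef(fit)[2])

# Normalized error, equation (8), using the estimated exponent
eps_hat <- (i_b - i_r) / i_r^a_hat
```

The intercept of the regression absorbs the average of log|eps|, so only the slope is interpreted as the exponent.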

To model the serial interval, EpiInvert allows a shifted log-normal parametric formulation. The shift can be negative, reflecting the fact that secondary cases may present symptoms earlier than the primary case. The user can also provide a non-parametric serial interval given by a numeric vector.
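A shifted log-normal serial interval can be discretized along the following lines. This is a sketch under stated assumptions, not EpiInvert's internal code: it assumes that the mean and standard deviation describe the log-normal before the shift is applied, and it assigns to each integer lag the probability mass within half a day of it. The function name `shifted_lognormal_si` and the `max_lag` cut-off are hypothetical.

```r
shifted_lognormal_si <- function(mean_si, sd_si, shift_si, max_lag = 40) {
  # Convert mean and sd of the log-normal to the parameters of plnorm()
  sdlog   <- sqrt(log(1 + (sd_si / mean_si)^2))
  meanlog <- log(mean_si) - sdlog^2 / 2
  lags <- shift_si:max_lag
  x    <- lags - shift_si                     # position on the unshifted axis
  # Probability mass of the unshifted log-normal on (x - 0.5, x + 0.5]
  w <- plnorm(x + 0.5, meanlog, sdlog) - plnorm(x - 0.5, meanlog, sdlog)
  data.frame(lag = lags, phi = w / sum(w))    # renormalize after truncation
}

si <- shifted_lognormal_si(mean_si = 11, sd_si = 6, shift_si = -1)
head(si)
```

With a negative shift the first lags are negative, which is how a secondary case showing symptoms before the primary case is represented.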

Package installation

You can install the development version of EpiInvert from GitHub with:

install.packages("devtools")
devtools::install_github("lalvarezmat/EpiInvert")

Examples

We attach some required packages:

library(EpiInvert)
library(ggplot2)
library(dplyr)
library(grid)

Loading stored data on COVID-19 daily incidence up to 2022-05-05 for France, Germany, the USA and the UK:

data(incidence)
tail(incidence)
#>           date   FRA    DEU    USA    UK
#> 828 2022-04-30 49482  11718  23349     0
#> 829 2022-05-01 36726   4032  16153     0
#> 830 2022-05-02  8737 113522  81644    32
#> 831 2022-05-03 67017 106631  61743 35518
#> 832 2022-05-04 47925  96167 114308 16924
#> 833 2022-05-05 44225  85073  72158 12460
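The rows above already hint at the weekly administrative bias: the zeros reported for the UK fall on a weekend. A quick check, with the UK values copied by hand from the output above so that the snippet runs without the package:

```r
uk_tail <- data.frame(
  date = as.Date(c("2022-04-30", "2022-05-01", "2022-05-02",
                   "2022-05-03", "2022-05-04", "2022-05-05")),
  UK   = c(0, 0, 32, 35518, 16924, 12460)
)

# ISO weekday number (1 = Monday, ..., 7 = Sunday), independent of the locale
uk_tail$wday <- as.integer(format(uk_tail$date, "%u"))

# Rows that fall on a Saturday or Sunday
uk_tail[uk_tail$wday >= 6, ]
```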

Loading some festive days for the same countries:

data(festives)
head(festives)
#>          USA        DEU        FRA         UK
#> 1 2020-01-01 2020-01-01 2020-01-01 2020-01-01
#> 2 2020-01-20 2020-04-10 2020-04-10 2020-04-10
#> 3 2020-02-17 2020-04-13 2020-04-13 2020-04-13
#> 4 2020-05-25 2020-05-01 2020-05-01 2020-05-08
#> 5 2020-06-21 2020-05-21 2020-05-08 2020-05-25
#> 6 2020-07-03 2020-06-01 2020-05-21 2020-06-21
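Since the festive day correction acts on each festive day and the 2 following days, it can be useful to list the dates that may be modified. The sketch below does this with base R, using the German dates copied by hand from the output above; the variable names are hypothetical.

```r
deu_festives <- c("2020-01-01", "2020-04-10", "2020-04-13",
                  "2020-05-01", "2020-05-21", "2020-06-01")

d <- as.Date(deu_festives)                 # festive days are stored as text

# Each festive day together with the 2 days that follow it
affected <- sort(unique(c(d, d + 1, d + 2)))
affected
```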

Example 1

We run EpiInvert on the data for Germany. The first parameter is a numeric vector with the daily incidence, the second is the date of the last incidence value, and the third is a character vector with the festive days (this last parameter is optional):

res <- EpiInvert(incidence$DEU,"2022-05-05",festives$DEU)

Plotting the results:

EpiInvert_plot(res)


EpiInvert returns a list with the following elements:

Example 2

Running EpiInvert for France using only the last 365 days of data, set through the max_time_interval parameter. If you are not constrained by the computational cost of the algorithm, you can choose a large value for this parameter (for instance 9999) to ensure that EpiInvert uses the whole available sequence in the estimation.

res <- EpiInvert(incidence$FRA,"2022-05-05",festives$FRA,
                 select_params(list(max_time_interval = 365)))

Plot of the incidence between “2021-12-15” and “2022-01-15”. Observe that the festive day bias correction only modifies the original incidence on the festive days and the following 2 days.

EpiInvert_plot(res,"incid","2021-12-15","2022-01-15")


Example 3

Running EpiInvert for the UK using a non-parametric serial interval shifted by -2 days.

First, load the serial interval data:

data(si_distr_data)
head(si_distr_data)
#> [1] 3.285609e-06 3.401902e-04 3.904441e-03 1.543537e-02 3.466818e-02
#> [6] 5.608451e-02

res <- EpiInvert(incidence$UK,"2022-05-05",festives$UK,
                 select_params(list(si_distr = si_distr_data,
                                    shift_si_distr = -2)))
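When supplying your own serial interval, it can help to check how the vector and the shift combine. The sketch below uses a small hypothetical vector (not `si_distr_data`) and assumes that the first entry of the vector corresponds to the lag equal to the shift; that convention is an assumption made for illustration and should be checked against the package documentation.

```r
si_raw <- c(0.05, 0.15, 0.30, 0.25, 0.15, 0.10)  # hypothetical serial interval
shift  <- -2

si   <- si_raw / sum(si_raw)          # make sure the weights sum to 1
lags <- shift + seq_along(si) - 1     # assumed convention: first entry at lag = shift

mean_lag <- sum(lags * si)            # mean of the shifted serial interval
data.frame(lag = lags, phi = si)
```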

Plot of the serial interval used (including the shift):

EpiInvert_plot(res,"SI")


Example 4

Running EpiInvert for the USA, changing the default values of the parametric serial interval (a shifted log-normal):

res <- EpiInvert(incidence$USA,"2022-05-05",festives$USA,
       select_params(list(mean_si = 11,sd_si=6,shift_si=-1)))

Plot of the reproduction number Rt, including an empirical 95% confidence interval of the variation of the EpiInvert Rt estimate as a function of the number of future days available:

EpiInvert_plot(res,"R")


Updated version of the incidence curves

To load an updated version of the incidence file used in these examples, you can execute:

incidence <- read.csv(url("https://www.ctim.es/covid19/incidence.csv"))
tail(incidence)
#>           date   FRA   DEU    USA    UK
#> 842 2022-05-12 36047 68999 112112 14800
#> 843 2022-05-13 32773 61859  81683  6587
#> 844 2022-05-14 30459  6151  16092     0
#> 845 2022-05-15 22844  2305  30890     0
#> 846 2022-05-16  5936 86252 145288 23820
#> 847 2022-05-17 43727 72051 112487  8596

You can pass the last date of the incidence to EpiInvert using the dates included in the incidence file:

res <- EpiInvert(incidence$USA,incidence$date[length(incidence$date)],festives$USA)
EpiInvert_plot(res)
#> Warning: Removed 2 row(s) containing missing values (geom_path).

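Before passing a freshly downloaded file to EpiInvert, a couple of base R checks on the date column can catch problems early. The sketch below uses a small data frame copied by hand from the last rows shown above, so it runs without downloading anything; `incidence_tail` is a hypothetical name.

```r
incidence_tail <- data.frame(
  date = c("2022-05-15", "2022-05-16", "2022-05-17"),
  USA  = c(30890, 145288, 112487)
)

# Equivalent to incidence$date[length(incidence$date)], but shorter
last_date <- tail(incidence_tail$date, 1)

# The dates should parse as ISO dates and be consecutive days
dates <- as.Date(incidence_tail$date, format = "%Y-%m-%d")
dates_ok <- !anyNA(dates) && all(diff(dates) == 1)
```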