
Our use of cookies

Cookies are small data files stored on a user's device when the user browses a website. Each file contains an ID number, the name of the server that deposited it and, in some cases, an expiry date. We use cookies to record information about your visit, your language preference and other settings, in order to optimise your next visit and make the site even more useful to you.

To improve your experience, we use cookies to store certain browsing information, provide secure navigation, and collect statistics with a view to improving the site's features. For a complete list of the cookies we use, download "Ghostery", a free browser plug-in that can detect and, in some cases, block cookies.

Ghostery is available here for free: https://www.ghostery.com/fr/products/

You can also visit the CNIL web site for instructions on how to configure your browser to manage cookie storage on your device.

In the case of third-party advertising cookies, you can also visit the following site: http://www.youronlinechoices.com/fr/controler-ses-cookies/, offered by digital advertising professionals within the European Digital Advertising Alliance (EDAA). From the site, you can deny or accept the cookies used by advertising professionals who are members.

It is also possible to block certain third-party cookies directly via publishers:

  • Analytical and performance cookies: Realytics, Google Analytics, Spoteffects, Optimizely
  • Targeted advertising cookies: DoubleClick, Mediarithmics

The following types of cookies may be used on our websites:

Mandatory cookies

These cookies are needed to ensure the proper functioning of the site and cannot be disabled. They help ensure a secure connection and the basic availability of our website. Our EZPublish content management system (CMS) uses CAS and PHP session cookies and the New Relic cookie for monitoring purposes (IP, response times). These cookies are deleted at the end of the browsing session (when you log off or close your browser window).

Functional cookies

These cookies allow us to analyse site use in order to measure and optimise performance. They allow us to store your sign-in information and display the different components of our website in a more coherent way. Our EZPublish CMS uses the XiTi cookie to measure traffic. Our service provider is AT Internet, which stores this data (IP, date and time of access, length of the visit and pages viewed) for six months.

Social media and advertising cookies

These cookies are used by advertising agencies such as Google and by social media sites such as LinkedIn and Facebook. Among other things, they allow pages to be shared on social media, the posting of comments, and the publication (on our site or elsewhere) of ads that reflect your interests. Our EZPublish CMS does not use this type of cookie.

For more information about the cookies we use, contact INRA’s Data Protection Officer by email at cil-dpo@inra.fr or by post at:

INRA
24, chemin de Borde Rouge – Auzeville – CS 52627
31326 Castanet Tolosan CEDEX - France

Last updated: May 2018


Epidemiological modelling and its use to manage COVID-19

Insights into mechanistic models, by the DYNAMO team

Over the next few weeks, we will present some key elements of epidemiological modelling through short educational articles. These articles will help you to better understand and decipher the assumptions underlying the epidemiological models that are currently widely used, and how these assumptions can impact predictions regarding the spread of pathogens, particularly SARS-CoV-2. The objective is to discover the advantages and limitations of mechanistic modelling, an approach that is at the core of the DYNAMO team's work. The example models will be inspired by models used during the crisis, but sometimes simplified to make them accessible.

#8 - The link to data: how are models calibrated?

The relevance of the predictions of epidemiological models is often limited by the uncertainty of their parameter values. But how are these parameters estimated? And how is the onset of the epidemic defined?

Different information sources can be used:

  • the scientific literature,
  • experimental and historical data,
  • monitoring data on the current epidemic dynamics (number of individuals tested positive, number of individuals developing severe clinical signs, number of deaths due to the disease). Since the beginning of the COVID-19 epidemic, all of this information has been collected and centralised.

Parameters describing observable processes (such as the duration of the symptomatic state or of hospitalisation) are readily available. More uncertain, often unobservable parameters (transmission rate, duration of latency) need to be estimated. For this purpose, various inference methods exist, each with its advantages and disadvantages. Inference methods that maximise the likelihood of the model are often used (for example, see this link). However, the likelihood of a model cannot always be computed, especially when the model is dynamic, has a large number of variables and is stochastic, or when the available data are spatiotemporal, incomplete, censored or imperfect. Likelihood-free methods have been developed to overcome this methodological problem.
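
To make this concrete, here is a minimal sketch of likelihood-based calibration, assuming daily death counts follow a Poisson distribution around the predictions of a deliberately simplified, deterministic SIR-type model. The model, its parameter values and the function names (simulate_deaths, neg_log_lik) are illustrative assumptions for this article, not the models discussed above.

    import numpy as np
    from scipy.optimize import minimize_scalar
    from scipy.stats import poisson

    def simulate_deaths(beta, n_days=80, n=1e6, gamma=0.1, mu=0.01):
        """Toy deterministic SIR with a death flow (daily Euler steps)."""
        s, i, deaths = n - 1.0, 1.0, []
        for _ in range(n_days):
            new_inf = min(beta * s * i / n, s)   # cap so S stays non-negative
            deaths.append(mu * gamma * i)        # fraction mu of those leaving I die
            s -= new_inf
            i += new_inf - gamma * i
        return np.array(deaths)

    def neg_log_lik(beta, observed):
        """Poisson negative log-likelihood of the observed daily deaths."""
        expected = np.maximum(simulate_deaths(beta, n_days=len(observed)), 1e-9)
        return -poisson.logpmf(observed, expected).sum()

    # Synthetic "observed" data for the demo; in practice this would be the
    # monitoring data mentioned above.
    observed = np.round(simulate_deaths(0.35, n_days=60))
    fit = minimize_scalar(lambda b: neg_log_lik(b, observed),
                          bounds=(0.05, 1.5), method="bounded")
    print("estimated transmission rate:", round(fit.x, 3))

When the model is stochastic or the data are incomplete, such a likelihood can no longer be written down so simply, which is exactly where the likelihood-free methods described next come in.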

Although maximising the likelihood is possible in our example, let us use likelihood-free methods to explain how they work. These methods are preferred in the DYNAMO team because the epidemiological systems we usually study are complex. Here we used the ABC-SMC (Approximate Bayesian Computation - Sequential Monte Carlo) method. ABC methods are quite intuitive: (1) sets of parameters are generated by drawing parameter values from prior distributions; (2) a simulation of the model is carried out for each parameter set and compared to the real data via summary statistics (i.e. a simplified representation of the data); (3) the parameter sets for which the distance between simulated and observed summary statistics falls below a tolerance threshold are selected (the smaller the distance, the more plausible the parameter set); (4) the parameter values are then estimated (posterior distributions). A disadvantage of these methods is the very large number of simulations that must be carried out, resulting in substantial computational costs.
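
Below is a minimal sketch of the rejection flavour of ABC, following the four steps above; ABC-SMC refines this by iterating over a sequence of decreasing tolerance thresholds and reweighting the accepted parameter sets. The function names and the toy demonstration (recovering the mean of a normal sample) are our own illustrative assumptions, not the DYNAMO team's code.

    import numpy as np

    rng = np.random.default_rng(0)

    def abc_reject(simulate, observed, prior_draw, summary, n_draws=50_000, tol=0.1):
        """Basic ABC rejection sampling.

        (1) draw parameter sets from the priors, (2) simulate the model,
        (3) keep the sets whose simulated summary statistics lie within
        'tol' of the observed ones, (4) return them as a posterior sample.
        """
        s_obs = summary(observed)
        accepted = []
        for _ in range(n_draws):
            theta = prior_draw()                          # step (1)
            sim = simulate(theta)                         # step (2)
            dist = np.linalg.norm(summary(sim) - s_obs)
            if dist < tol:                                # step (3)
                accepted.append(theta)
        return np.array(accepted)                         # step (4)

    # Tiny self-test: recover the mean of a normal sample.
    data = rng.normal(2.0, 1.0, size=200)
    post = abc_reject(simulate=lambda th: rng.normal(th[0], 1.0, size=200),
                      observed=data,
                      prior_draw=lambda: rng.uniform(-5.0, 5.0, size=1),
                      summary=lambda x: np.array([np.mean(x)]),
                      n_draws=5_000, tol=0.2)
    print("posterior mean:", round(float(post.mean()), 2))

The n_draws loop is the cost highlighted above: millions of simulations may be needed when the model is expensive to run and the tolerance is small.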

Let us return to the model of article #6, with lockdown from 16 March, and estimate 4 of its parameters: β (the transmission rate), σ (the multiplying factor reducing the excretion of Ip, Ia and Ips), 1/ε (the average duration of latency), and the date of introduction (which should be understood here as the date of lasting establishment of the infection in the population). The data used correspond to the hospital data of the COVID-19 epidemic, more specifically the number of deaths over time, represented in the model by the M state. The amount of available data may impact the estimates, especially if data are scarce. For this example, we estimated the parameters using the data available at three dates: one week before lockdown (t = 68 days), the day before lockdown (t = 75 days), and one week after lockdown (t = 82 days). Post-lockdown data were not used. These scenarios lead to parameter estimates that are sufficiently different to predict contrasting epidemic dynamics, clearly illustrating the need to update the models very regularly, especially if they are used for health management purposes.
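
The three data-availability scenarios can be mimicked by simply truncating the observed series of deaths at each date and rerunning the same inference. The sketch below reuses the illustrative simulate_deaths and abc_reject helpers from the previous sketches; the toy simulator has a single parameter, β, whereas the real inference covers β, σ, 1/ε and the date of introduction.

    import numpy as np

    rng = np.random.default_rng(1)
    # Synthetic stand-in for the observed daily deaths up to t = 82 days.
    daily_deaths = np.round(simulate_deaths(0.35, n_days=82))

    def summary(x):
        """Two coarse summary statistics: total deaths (log scale), peak timing."""
        x = np.asarray(x, dtype=float)
        return np.array([np.log1p(x.sum()), np.argmax(x) / len(x)])

    cutoffs = {"t = 68 (one week before lockdown)": 68,
               "t = 75 (day before lockdown)": 75,
               "t = 82 (one week after lockdown)": 82}

    for label, t in cutoffs.items():
        observed = daily_deaths[:t]                       # data available at date t
        post = abc_reject(simulate=lambda th, t=t: simulate_deaths(th[0], n_days=t),
                          observed=observed,
                          prior_draw=lambda: rng.uniform(0.05, 1.5, size=1),
                          summary=summary,
                          n_draws=20_000, tol=0.05)
        print(label, "-> beta:", round(float(post.mean()), 2),
              "90% interval:", np.percentile(post, [5, 95]).round(2))

Each posterior sample is then summarised by its mean and a 90% interval, mirroring the presentation of the table below.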

Values of estimated parameters (mean and 90% credibility interval)

Scenario    β                   σ                   1/ε               date of intro.
t = 68      1.89 [1.09; 2.76]   0.48 [0.35; 0.79]   3.3 [1.2; 4.7]    34 [33; 35]
t = 75      1.48 [0.77; 2.33]   0.43 [0.12; 0.84]   3.3 [1.0; 4.9]    19 [16; 21]
t = 82      1.48 [0.85; 2.40]   0.47 [0.17; 0.85]   3.3 [1.1; 4.8]    22 [20; 24]

In addition to the model parameters, the initial conditions of the model can also be estimated; in our example, this highlights an establishment of the infection in the population about one month before the first deaths.

Figure: posterior distributions of the date of establishment of the infection, estimated from three data sets (i.e. according to the amount of usable data).

Using the data available as of March 10 (in blue) yields an estimated introduction date of around February 5 (t = 34 days). With more data available (in green, then yellow), the date of introduction is estimated to be around January 24 (t = 22 days).

Figure: model predictions in number of new deaths per day (left) and cumulative number of deaths (right), depending on the amount of data that can be used to estimate the model parameters. The other parameters and simulation conditions are the same as in article #6.

By using the data available one week after lockdown (t = 82 days, in yellow) to estimate the model parameters, the simulations give results closer to the observed data than in the other cases (where fewer data are available).

We hope that this article has convinced you that models and observational data must work together! Updating the models in real time as new cases occur is a necessary challenge, especially at the beginning of an epidemic, to improve the predictive quality of the models and thus the confidence in their predictions. However, inference methods are not infallible. The results also depend on the assumptions made (the structure of the model) and on the type and quality of the data. Moreover, the more parameters there are to estimate, the more complex the task becomes. Calibrating a model is a process that can take a long time, and it must be coupled with analyses to verify that the available data actually make it possible to estimate the desired parameters (identifiability).

Article #9 will take a step back from the previous articles to discuss why (and how) to mobilize a mechanistic modelling approach in epidemiology.