The COVID-19 Model Challenges

Does democracy save lives? Do female political leaders respond more effectively to public health crises? Are ethnically diverse societies more vulnerable to COVID-19? How do you think political and social features of countries relate to cumulative COVID-19 deaths? Work with data we have assembled to build statistical models that predict COVID-19 mortality across and within countries. Take our model-building challenges!

New pedagogical materials will be made available in early 2022 to assist statistics and data analysis instructors in using the Model Challenges in the classroom in the first half of 2022. Please contact us for more information.

What are the challenges?

The challenge for each team is to build a statistical model using political and social variables to predict COVID-19 mortality numbers as of 31 August 2021. The overarching goal is to harness the collective capacity of social scientists to better understand cross-national and subnational patterns of COVID-19 mortality. We offer four separate model challenges: one that uses cross-national data and three subnational challenges (for India, Mexico, and the United States) in which we ask entrants to build statistical models to predict COVID-19 mortality at the state level. You may submit only one model per challenge but may enter as many of the four challenges as you wish.

How will we rank challenge submissions?

We aim to use a principled method to generate an aggregate model that combines and ranks the full set of legible model submissions. To do this, we will use a stacking approach: a method for generating a meta-model that weights the predictions from many models to produce a new and more accurate predictive model. A single model receives a lot of weight if its predictions are useful for generating the overall prediction, given the predictions from the other models. For each challenge, you may submit both a general and a parameterized model. A general model provides a structure linking COVID-19 deaths to social and political predictors, with actual parameter values and predictions to be calculated in the future. A parameterized model is a general model that also includes guesses about parameter values. A parameterized model indicates how strong a relationship is, for example, whereas a general model does not. Separate weights will be generated for general and parameterized model predictions. We use the weights that the stacking analysis places on each model to rank all legible submitted models.
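To make the idea of stacking weights concrete, here is a minimal sketch in Python. It is an illustration only, not the procedure the project uses: it assumes each model's predictions are already available, restricts the weights to be non-negative and sum to one, and finds them with a simple exponentiated-gradient loop that minimises squared error. The function name `stack_weights` and all numbers are hypothetical toy values.

```python
import math
from typing import List, Sequence


def stack_weights(preds: Sequence[Sequence[float]], y: Sequence[float],
                  steps: int = 5000, lr: float = 0.05) -> List[float]:
    """Estimate non-negative stacking weights that sum to one.

    preds[j][i] is model j's prediction for unit i and y[i] is the
    observed outcome. The weights minimise the mean squared error of
    the weighted-average prediction (an assumed loss, for illustration).
    """
    k, n = len(preds), len(y)
    w = [1.0 / k] * k  # start from equal weights
    for _ in range(steps):
        # Weighted-average prediction for every unit.
        combo = [sum(w[j] * preds[j][i] for j in range(k)) for i in range(n)]
        # Gradient of the mean squared error with respect to each weight.
        grad = [2.0 / n * sum((combo[i] - y[i]) * preds[j][i] for i in range(n))
                for j in range(k)]
        # Multiplicative update, then renormalise so the weights sum to one.
        w = [w[j] * math.exp(-lr * grad[j]) for j in range(k)]
        total = sum(w)
        w = [v / total for v in w]
    return w


# Toy data (made up): three models predicting four units.
observed = [1.0, 2.0, 3.0, 4.0]
predictions = [
    [1.1, 1.9, 3.2, 3.9],  # tracks the outcome closely
    [2.5, 2.5, 2.5, 2.5],  # predicts the mean everywhere
    [4.0, 3.0, 2.0, 1.0],  # gets the ordering backwards
]
weights = stack_weights(predictions, observed)
# Rank models by the weight they receive, highest first.
ranking = sorted(range(len(weights)), key=lambda j: weights[j], reverse=True)
```

In this toy case the first model receives nearly all of the weight, because the other two add nothing useful once its predictions are taken into account. That is the sense in which a model's weight depends on the predictions of the other models.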

How do you take the challenges?

Teams or individuals are invited to submit general and, optionally, parameterized models. You create a general model by choosing up to three social or political variables to predict cumulative COVID-19 deaths as of 31 August 2021. Doing this allows you to submit your model to the general model challenge. You may also select your preferred functional form from a set of standard options, or you may customize your functional form for general model submission. As you work, you will see how your variables perform on data as of 16 November 2020. You create a parameterized model by providing guesses for the values of your model's parameters. If you do this, you will also be entered into the parameterized model challenge. For your model to be judged legible, you will be asked to describe the rationale underlying your model choices. The whole exercise can take 20 minutes (or longer, if you wish).
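The difference between the two kinds of submission can be sketched in code. The following Python example is hypothetical and assumes a linear functional form: a general model names its predictors only, while a parameterized model also supplies guessed coefficients and can therefore produce predictions straight away. The class name `LinearModel` and the predictor names are invented for illustration and are not taken from the repository.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class LinearModel:
    """A linear model of the outcome in one to three predictors."""
    predictors: Tuple[str, ...]
    intercept: Optional[float] = None           # guessed only if parameterized
    slopes: Optional[Tuple[float, ...]] = None  # one guess per predictor

    def __post_init__(self) -> None:
        if not 1 <= len(self.predictors) <= 3:
            raise ValueError("choose between one and three predictors")
        if (self.intercept is None) != (self.slopes is None):
            raise ValueError("give both an intercept and slopes, or neither")
        if self.slopes is not None and len(self.slopes) != len(self.predictors):
            raise ValueError("need exactly one slope per predictor")

    @property
    def is_parameterized(self) -> bool:
        return self.slopes is not None

    def predict(self, row: Dict[str, float]) -> float:
        """Predict the outcome for one unit from guessed parameters."""
        if not self.is_parameterized:
            raise ValueError("a general model has no parameter values yet")
        return self.intercept + sum(
            b * row[name] for b, name in zip(self.slopes, self.predictors))


# Hypothetical predictor names, for illustration only.
general = LinearModel(("democracy_index", "female_leader"))
parameterized = LinearModel(("democracy_index", "female_leader"),
                            intercept=5.0, slopes=(-0.5, -0.2))
guess = parameterized.predict({"democracy_index": 2.0, "female_leader": 1.0})
```

The general model here fixes only the structure. Its coefficients would be estimated later, once outcome data are available.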

What data can you use?

We encourage all entrants to use data from our central data repository, where we have gathered measures of state capacity, political institutions, political priorities, and social structures. You also have the option to add your own measures to our repository. When you download data from our repository (available at the Data tab at the top of this page), you will receive instructions on how to submit your own variables in order to merge them with ours. Please be sure that you have rights to any data that you submit.

Why should you participate in the challenges?

Participation offers multiple benefits. First, you participate in building collective social scientific capacity. Submission of a legible model will advance understanding of how well social scientists can predict truly important outcomes that are in part affected by social processes and political institutions. Second, we hope you will find the model challenges fun, even if challenging! Finally, when we write up the results of the model challenges and the forecasting exercise that follows, contributors of the 10 models submitted in Round One (held in 2020-21) that receive the most weight from the stacking exercise in each challenge will be asked if they wish to be included as co-authors. In addition, all those who submit a legible model will, with permission, be publicly acknowledged.

What happens when?

Round One was open from 30 November 2020 until 15 January 2021 for submission of global and subnational models predicting COVID-19 deaths as of 31 August 2021.

From 1 to 28 February 2021 social scientists were invited to forecast the success of a subset of submitted models.

On 30 September 2021 or as soon as possible thereafter, we released the rankings of the full set of legible models submitted in Round One to the COVID-19 Model Challenges.

Round Two will be open from 16 to 31 December 2021.

More detailed information about the timeline can be found here.

Who created the challenges?

We are a team of social scientists who came together across institutions and countries to design this platform to help us learn from and share knowledge across social science disciplines. We organized ourselves through a steering committee chaired by Miriam Golden (EUI) and Alexandra Scacco (WZB). Institutional affiliations are provided for informational purposes only. The project is neither sponsored by nor housed in a specific institution.

Where can you find more information?

For detailed information about how to submit a model, how we will assess model performance, how to participate in the forecasting phase of the study, and for answers to other questions, please refer to the FAQs.


Outcome = Cumulative reported deaths per million (logged)
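As a worked illustration of this outcome definition, the short Python sketch below computes logged cumulative deaths per million for one unit. It assumes a natural logarithm and raises an error for zero deaths. The page does not state which log base is used or how zeros are handled, so both choices are assumptions, and the function name is hypothetical.

```python
import math


def logged_deaths_per_million(deaths: float, population: float) -> float:
    """Return log(cumulative deaths per million inhabitants).

    Assumes the natural log. Zero deaths are rejected here because the
    log is undefined; the actual data preparation may treat them differently.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    if deaths <= 0:
        raise ValueError("log is undefined for zero or negative deaths")
    per_million = deaths * 1_000_000 / population
    return math.log(per_million)


# Toy numbers: 500 deaths in a population of one million.
example = logged_deaths_per_million(500, 1_000_000)
```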

1. For what population do you want to build a model?

2. How many predictors do you want to include in your model? (1, 2, or 3)

You are required to enter at least one political or social predictor in order to be considered for a challenge.

3. (Optional) You may add your own predictors, merging on our ID variable before uploading.

To add your own predictors, please go to the Data tab, download the relevant dataset, and follow the instructions provided. They will guide you to correctly merge your data with ours.
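The merge step might look roughly like the following Python sketch. It assumes that both datasets are lists of rows keyed by a shared ID column, here hypothetically called `id`. The instructions that accompany the downloaded data are authoritative; the column names and values below are invented for illustration.

```python
from typing import Dict, List, Optional

Row = Dict[str, Optional[object]]


def merge_on_id(repo_rows: List[Row], own_rows: List[Row],
                id_col: str = "id") -> List[Row]:
    """Left-join your own predictors onto the repository rows by ID."""
    own_by_id: Dict[object, Row] = {}
    for row in own_rows:
        key = row[id_col]
        if key in own_by_id:
            raise ValueError(f"duplicate ID in your data: {key!r}")
        own_by_id[key] = row

    # IDs that do not exist in the repository would be silently lost.
    repo_ids = {row[id_col] for row in repo_rows}
    unknown = sorted(str(k) for k in own_by_id if k not in repo_ids)
    if unknown:
        raise ValueError(f"IDs not found in the repository data: {unknown}")

    new_cols = sorted({c for row in own_rows for c in row if c != id_col})
    merged: List[Row] = []
    for row in repo_rows:
        extra = own_by_id.get(row[id_col], {})
        # Units without a value for a new predictor get None (missing).
        merged.append({**row, **{c: extra.get(c) for c in new_cols}})
    return merged


# Toy rows with hypothetical column names.
repo = [{"id": "A", "state_capacity": 0.4}, {"id": "B", "state_capacity": 0.7}]
own = [{"id": "B", "my_predictor": 12.0}]
merged = merge_on_id(repo, own)
```

Keeping every repository row and marking absent values as missing makes it easy to see how many units your own predictor actually covers.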

Click here to see how the predictors in our repository are defined.

Estimation

We assume a simple linear model by default, but you can instead choose a nonlinear or interactive model, or even specify an arbitrary custom function.
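The Python sketch below illustrates what these functional forms can look like. The exact set of standard options offered by the platform is not listed here, so the three functions are examples under assumed forms (a quadratic term stands in for "nonlinear"), and all names and coefficient values are hypothetical.

```python
def linear(x1: float, x2: float, b0: float, b1: float, b2: float) -> float:
    """Additive model: each predictor shifts the outcome independently."""
    return b0 + b1 * x1 + b2 * x2


def interactive(x1: float, x2: float, b0: float, b1: float, b2: float,
                b3: float) -> float:
    """Adds a product term, so the effect of x1 depends on the level of x2."""
    return b0 + b1 * x1 + b2 * x2 + b3 * x1 * x2


def quadratic(x1: float, b0: float, b1: float, b2: float) -> float:
    """One simple nonlinear form: the effect of x1 changes with its own level."""
    return b0 + b1 * x1 + b2 * x1 ** 2


# Toy evaluations with made-up coefficients.
y_linear = linear(1.0, 2.0, b0=0.5, b1=1.0, b2=2.0)
y_interactive = interactive(1.0, 2.0, b0=0.5, b1=1.0, b2=2.0, b3=3.0)
y_quadratic = quadratic(3.0, b0=1.0, b1=2.0, b2=0.5)
```

An interactive form suits arguments in which one factor conditions another, as in the second rationale example on this page.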

What type of model do you want to specify?

What type of custom model do you want to submit? (general, or general and parameterized)

To be included in a model challenge, we ask that you provide (English-language) text below describing the logic or rationale linking the predictors you selected to COVID-19 deaths as of 31 August 2021. We would like to know why you think the set of predictors you chose matters for the outcome. We encourage you to reference relevant political science literature.

The examples below show the kinds of arguments we have in mind. References are not required in your explanatory logic.

Example 1

Theory Z (Author 1990, 1991) has argued that ... This account was successful in explaining government responses during crises X and Y. Though not examined in a health context to our knowledge, model A (Author B 2005) suggests that similar logics are likely to hold in this case. X1 is reasonably proxied by measure X1' and X2 is reasonably proxied by X2'. Theory ZZ suggests that these two will complement each other, so we include an interactive term in the model.

Example 2

Author A’s (2007) study of responses to the AIDS crisis in country Q highlighted the importance of feature X1. Author B’s (2010) book emphasized the importance of X2 in improving public compliance with health directives during the recent public health crisis in country R. Author C’s (2015) work in country S suggests that the effects of X1 and X2 are actually conditioned by X3. These arguments have not been assessed in a cross-country setting, but we believe the logic should travel. Though imperfect, we think measures X1' and X2' are appropriate to test the arguments advanced by authors A and B. We interact X3' with X1' and X2' to capture Author C's claim.

Please provide a logic for your model:
