Description
Challenge 25 - Regional Reanalysis for Europe with Machine Learning
Stream 2 - Machine Learning for Earth Science
Goal
The main objective of this challenge is to develop a Machine Learning (ML) downscaling technique that generates reanalysis information at finer spatial resolution from coarser-grid reanalysis data.
Mentors and skills
- Mentors: Mohanad Albughdadi, Matthew Chantry, Andras Horanyi, Cornel Soci
- Skills required:
  - Theoretical and practical knowledge of Machine Learning tools
  - Experience handling large datasets
  - Experience writing Machine Learning programs
  - Experience plotting results
Note: Only nationals of European Union (EU) Member States and of countries associated with the EU’s Digital Europe Programme (currently Iceland, Norway, and Liechtenstein) are eligible to participate (see Terms and Conditions).
Challenge description
ECMWF/C3S’s flagship global reanalysis is ERA5, which covers the period from 1940 to the present at a horizontal resolution of 31km with global coverage (it also includes lower-resolution uncertainty information). ERA5 is widely appreciated by its users (more than 100 000 registered users in the CDS), but many of them also want higher resolution and more detail for particular parts of the globe. For Europe, CERRA (Copernicus Regional Reanalysis for Europe) provides detailed information at 5.5km horizontal resolution (CERRA also includes ensemble uncertainty information at 11km horizontal resolution). CERRA covers the period from September 1984 to June 2021. We will provide assistance with pre-processing the ERA5 and CERRA datasets from the CDS.
The CERRA reanalysis system combines a data assimilation system with a limited-area numerical weather prediction model; together they produce high-resolution data using lateral boundary conditions from ERA5. The added value of a regional reanalysis over a global reanalysis comes from the additional surface observations assimilated, the improved (i.e. more locally detailed) description of surface characteristics, and the use of higher-resolution, tailor-made regional numerical weather prediction models.
The ultimate goal of this challenge in Code for Earth is to produce a model capable of downscaling ERA5 towards the CERRA high-resolution analysis using regional forcings (orography or land-sea mask). A successful model would produce accurate CERRA estimates much faster than running the regional reanalysis system. As a proof of concept, we will target a limited number of parameters, starting with 2m temperature. The results will be compared with the original CERRA dataset and with several baseline models. Possible methodologies for downscaling include conditional Generative Adversarial Networks (cGANs) and diffusion models.
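To make the idea of a baseline model concrete, here is a minimal sketch of one that an ML model would be expected to beat: bilinear interpolation of the coarse field, optionally followed by a height correction based on the fine-scale orography. It is not part of the challenge specification. The function names, the integer refinement factor, the aligned regular grids and the constant lapse rate of 6.5 K per km are all assumptions made for illustration; the real ERA5 and CERRA grids differ in projection and do not nest by an integer factor, so a proper regridding step would be needed in practice.

```python
import numpy as np

# Standard-atmosphere lapse rate in K per metre (an assumption, not from the challenge text).
STANDARD_LAPSE_RATE = 0.0065


def bilinear_upsample(coarse: np.ndarray, factor: int) -> np.ndarray:
    """Bilinearly interpolate a 2-D field onto a grid `factor` times finer.

    Assumes a regular grid whose corner points coincide with the fine grid.
    """
    ny, nx = coarse.shape
    # Fine-grid coordinates expressed in coarse-grid index space.
    y_fine = np.linspace(0, ny - 1, (ny - 1) * factor + 1)
    x_fine = np.linspace(0, nx - 1, (nx - 1) * factor + 1)
    # Interpolate along x for every coarse row, then along y for every fine column.
    along_x = np.array([np.interp(x_fine, np.arange(nx), row) for row in coarse])
    fine = np.array([np.interp(y_fine, np.arange(ny), col) for col in along_x.T]).T
    return fine


def lapse_rate_correction(t_interp, z_coarse_interp, z_fine,
                          lapse_rate=STANDARD_LAPSE_RATE):
    """Adjust interpolated 2m temperature (K) for the height difference (m)
    between the fine orography and the interpolated coarse orography."""
    return t_interp - lapse_rate * (z_fine - z_coarse_interp)


if __name__ == "__main__":
    t_coarse = np.array([[280.0, 282.0], [284.0, 286.0]])  # toy 2m temperature, K
    z_coarse = np.zeros((2, 2))                             # toy coarse orography, m
    t_fine = bilinear_upsample(t_coarse, factor=2)
    z_fine = np.full(t_fine.shape, 1000.0)                  # toy fine orography, m
    z_interp = bilinear_upsample(z_coarse, factor=2)
    print(lapse_rate_correction(t_fine, z_interp, z_fine))
```

A baseline of this kind is cheap to compute, which makes it a useful reference point when judging whether a cGAN or diffusion model adds real fine-scale skill.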
A stretch goal of the challenge would be to provide ensemble-based uncertainty estimates for the downscaled fields. That might be achieved using the ensemble uncertainty information available from ERA5 and/or CERRA. A further stretch goal could be to use sparse, noisy, synthetic observations derived from CERRA as an additional predictor, thus mimicking the use of observations in producing CERRA from ERA5.
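One way to picture the synthetic-observation idea is to sample a small number of grid points from a CERRA field and perturb them with noise. The sketch below does only that. The function name, the uniform random sampling, the Gaussian noise model and the NaN-filled output grid are illustrative assumptions; the challenge text does not prescribe how such observations should be generated or fed to the model.

```python
import numpy as np


def synthetic_observations(field: np.ndarray, n_obs: int, noise_std: float, seed: int = 0):
    """Draw `n_obs` noisy point samples from a 2-D field.

    Returns a grid that is NaN everywhere except at the sampled points,
    plus a boolean mask marking where observations exist.
    """
    rng = np.random.default_rng(seed)
    # Pick distinct grid points uniformly at random (an assumed sampling scheme).
    flat_idx = rng.choice(field.size, size=n_obs, replace=False)
    rows, cols = np.unravel_index(flat_idx, field.shape)
    # Add zero-mean Gaussian noise to mimic observation error (an assumed error model).
    values = field[rows, cols] + rng.normal(0.0, noise_std, size=n_obs)
    obs_grid = np.full(field.shape, np.nan)
    obs_grid[rows, cols] = values
    mask = ~np.isnan(obs_grid)
    return obs_grid, mask


if __name__ == "__main__":
    toy_field = 280.0 + np.arange(100, dtype=float).reshape(10, 10) * 0.1  # toy field, K
    obs, mask = synthetic_observations(toy_field, n_obs=5, noise_std=0.5)
    print(int(mask.sum()), "observations sampled")
```

The observation grid and its mask could then be stacked with the ERA5 inputs as extra predictor channels, though other encodings are equally plausible.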
From a practical point of view, the challenge might consist of the following steps (take these points as guidelines):
- Train and validate the Machine Learning model to map from ERA5 fields to CERRA fields on the above grid, thereby creating the fine-scale structure of CERRA.
- Evaluate the produced dataset for the period July 2019 to June 2021 and, if possible, provide data up to the present.
- Compare the produced dataset with the CERRA data (objective and subjective verification) and discuss the strengths and weaknesses of the proposed method and the new data.
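The evaluation and comparison steps above can be sketched as a time-based split plus a few objective scores. This is a minimal illustration, not a prescribed verification protocol. The function names and the choice of bias, RMSE and MAE are assumptions; only the evaluation window (July 2019 to June 2021) comes from the steps above, and treating everything before it as training data is likewise an assumption.

```python
import numpy as np


def split_by_time(times, test_start="2019-07-01", test_end="2021-06-30"):
    """Return boolean masks (train, test) over a sequence of dates.

    Dates inside [test_start, test_end] form the evaluation set;
    earlier dates are treated as training data (an assumed convention).
    """
    times = np.asarray(times, dtype="datetime64[D]")
    start = np.datetime64(test_start)
    end = np.datetime64(test_end)
    test = (times >= start) & (times <= end)
    train = times < start
    return train, test


def verification_scores(pred: np.ndarray, target: np.ndarray) -> dict:
    """Domain-averaged bias, RMSE and MAE of a downscaled field against CERRA."""
    err = np.asarray(pred, dtype=float) - np.asarray(target, dtype=float)
    return {
        "bias": float(err.mean()),
        "rmse": float(np.sqrt((err ** 2).mean())),
        "mae": float(np.abs(err).mean()),
    }


if __name__ == "__main__":
    dates = ["2019-06-30", "2019-07-01", "2021-06-30", "2021-07-01"]
    train_mask, test_mask = split_by_time(dates)
    print(train_mask, test_mask)
    toy_pred = np.array([[281.0, 282.0], [283.0, 284.0]])  # toy downscaled field, K
    toy_cerra = np.full((2, 2), 280.0)                      # toy reference field, K
    print(verification_scores(toy_pred, toy_cerra))
```

Splitting by time rather than at random keeps the evaluation period unseen during training, which matters because consecutive reanalysis time steps are strongly correlated.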
Links and references
- ECMWF operates the Copernicus Climate Change Service (C3S) on behalf of the European Union.
- Copernicus Climate Data Store, the interface and home for all the data provided by C3S.
- The proposed challenge is directly linked to the following datasets in the CDS:
- Popular science introduction to global and regional reanalyses in general
- General information about CERRA
- Example of downscaling earth system data using cGAN
- Example of downscaling earth system data using diffusion models
- Recent summary of the super-resolution field