{ "cells": [ { "cell_type": "markdown", "id": "0", "metadata": {}, "source": [ "# A01 - ERA5 data download\n", "\n", "This notebook downloads ERA5 2 m air temperature from the ECMWF Data Stores API for a\n", "given bounding box and period. It uses `pooch` to manage a local cache and avoid\n", "re-downloading files.\n", "\n", "**TODO**: try Earthmover's arraylake as an alternative way to access ERA5 data: https://docs.earthmover.io/sample-data/era5" ] }, { "cell_type": "code", "execution_count": null, "id": "1", "metadata": {}, "outputs": [], "source": [ "import os\n", "from os import path\n", "\n", "import pooch\n", "import xarray as xr\n", "from ecmwf.datastores import Client" ] }, { "cell_type": "markdown", "id": "2", "metadata": {}, "source": [ "Provide `ecmwf_key` below or set `ECMWF_DATASTORES_KEY`/`CDSAPI_KEY` in the environment.\n", "The default `ecmwf_url` points to the CDS API base (`https://cds.climate.copernicus.eu/api`)." ] }, { "cell_type": "code", "execution_count": null, "id": "3", "metadata": { "tags": [ "parameters" ] }, "outputs": [], "source": [ "# bounding box as [west, south, east, north]\n", "region = [8.34, 47.28, 8.67, 47.54]\n", "# select study period\n", "start_year = 2022\n", "end_year = 2024\n", "# months to consider when querying the data\n", "start_month = 6\n", "end_month = 8\n", "\n", "# ECMWF Data Stores Service\n", "ecmwf_key = None  # optional override (e.g., \":\")\n", "ecmwf_url = \"https://cds.climate.copernicus.eu/api\"\n", "era5_dataset = \"reanalysis-era5-single-levels\"\n", "era5_variable = \"2m_temperature\"\n", "cache_dir = None  # optional, otherwise use pooch default\n", "\n", "# output file\n", "era5_filename = (\n", "    f\"{'-'.join([str(coord) for coord in region])}_era5_{era5_variable}_\"\n", "    f\"m{start_month:02d}-m{end_month:02d}_{start_year}-{end_year}.nc\"\n", ")\n", "dst_dir = \"data\"\n", "dst_filepath = path.join(dst_dir, era5_filename)" ] }, { "cell_type": "markdown", "id": "4", "metadata": {}, "source": [ "Now we will download our requested ERA5 data (or retrieve it from the local cache):" ] },
{ "cell_type": "code", "execution_count": null, "id": "5", "metadata": {}, "outputs": [], "source": [ "_ecmwf_key = ecmwf_key or os.getenv(\"ECMWF_DATASTORES_KEY\") or os.getenv(\"CDSAPI_KEY\")\n", "\n", "if _ecmwf_key is None:\n", "    print(\"ECMWF API key not provided; skipping ERA5 download\")\n", "    era5_ds = None\n", "else:\n", "    west, south, east, north = region\n", "    era5_request = {\n", "        \"product_type\": \"reanalysis\",\n", "        \"variable\": era5_variable,\n", "        \"year\": [str(y) for y in range(start_year, end_year + 1)],\n", "        \"month\": [f\"{m:02d}\" for m in range(start_month, end_month + 1)],\n", "        \"day\": [f\"{d:02d}\" for d in range(1, 32)],\n", "        \"time\": [f\"{h:02d}:00\" for h in range(24)],\n", "        # the API expects the area as [north, west, south, east]\n", "        \"area\": [north, west, south, east],\n", "        \"data_format\": \"netcdf\",\n", "        \"download_format\": \"unarchived\",\n", "    }\n", "\n", "    client = Client(url=ecmwf_url, key=_ecmwf_key)\n", "\n", "    def _era5_downloader(url, output_file, pooch_obj, check_only=False):\n", "        # custom pooch downloader: fetch the request through the ECMWF client\n", "        if check_only:\n", "            return\n", "        client.retrieve(era5_dataset, era5_request, target=str(output_file))\n", "\n", "    era5_path = pooch.retrieve(\n", "        url=ecmwf_url,\n", "        known_hash=None,\n", "        fname=era5_filename,\n", "        path=cache_dir,\n", "        downloader=_era5_downloader,\n", "    )\n", "    era5_ds = xr.open_dataset(era5_path)\n", "\n", "if era5_ds is not None:\n", "    # rename valid_time to time\n", "    era5_ds = era5_ds.rename({\"valid_time\": \"time\"})\n", "    # make sure the output directory exists, then dump the data to a netcdf file:\n", "    os.makedirs(dst_dir, exist_ok=True)\n", "    era5_ds.to_netcdf(dst_filepath)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" } }, "nbformat": 4, "nbformat_minor": 5 }