{ "cells": [ { "cell_type": "markdown", "id": "0", "metadata": {}, "source": [ "# Overview\n", "\n", "This example notebook showcases the main features of Meteora. To that end, we will download and process meteorological observations from [the Automated Surface/Weather Observing Systems (ASOS/AWOS) program](https://www.ncei.noaa.gov/products/land-based-station/automated-surface-weather-observing-systems), which comprises more than 900 automated weather stations in the United States.\n", "\n", "More precisely, we will use the `METARASOSIEMClient` to stream [METAR](https://madis.ncep.noaa.gov/madis_OMO.shtml) observations from the [Iowa Environmental Mesonet](https://mesonet.agron.iastate.edu/request/asos/1min.phtml)." ] }, { "cell_type": "code", "execution_count": null, "id": "1", "metadata": {}, "outputs": [], "source": [ "import contextily as cx\n", "\n", "from meteora import clients, climate_indices, settings, units" ] }, { "cell_type": "markdown", "id": "2", "metadata": {}, "source": [ "## Meteora clients\n", "\n", "Meteora is essentially a collection of \"client\" classes that allow [processing data from different providers](../supported-providers.html) through a standardized interface. The following sections will go through the main aspects of a Meteora client.\n", "\n", "### Selecting the client's region of interest\n", "\n", "All clients are instantiated with at least the `region` argument, which defines the spatial extent of the required data. The `region` argument uses the [pyregeon](https://pyregeon.readthedocs.io/en/latest) library and can be any of the following:\n", "\n", "- A string with a place name (Nominatim query) to geocode.\n", "- A sequence with the west, south, east and north bounds.\n", "- A geometric object, e.g., a shapely geometry, or a sequence of geometric objects. 
In such a case, the region will be passed as the `data` argument of the GeoSeries constructor.\n", "- A geopandas geo-series or geo-data frame.\n", "- A filename or URL, a file-like object opened in binary (`'rb'`) mode, or a `Path` object that will be passed to `geopandas.read_file`.\n", "\n", "In this case, we will use the country of Switzerland as defined by a query to the [Nominatim API](https://nominatim.org) (via [osmnx](https://github.com/gboeing/osmnx)):" ] }, { "cell_type": "code", "execution_count": null, "id": "3", "metadata": {}, "outputs": [], "source": [ "region = \"Switzerland\"" ] }, { "cell_type": "markdown", "id": "4", "metadata": {}, "source": [ "We can now instantiate our client:" ] }, { "cell_type": "code", "execution_count": null, "id": "5", "metadata": {}, "outputs": [], "source": [ "client = clients.METARASOSIEMClient(region)" ] }, { "cell_type": "markdown", "id": "6", "metadata": {}, "source": [ "### Station locations and metadata\n", "\n", "The list of stations maintained by the provider within the selected region can be accessed using the `stations_gdf` property:" ] }, { "cell_type": "code", "execution_count": null, "id": "7", "metadata": {}, "outputs": [], "source": [ "client.stations_gdf.head()" ] }, { "cell_type": "markdown", "id": "8", "metadata": {}, "source": [ "which is essentially a geopandas data frame with the station metadata, including the location, so we can, e.g., plot it on a map:" ] }, { "cell_type": "code", "execution_count": null, "id": "9", "metadata": {}, "outputs": [], "source": [ "ax = client.stations_gdf.plot()\n", "cx.add_basemap(ax, crs=client.stations_gdf.crs, attribution=False)" ] }, { "cell_type": "markdown", "id": "10", "metadata": {}, "source": [ "*(C) OpenStreetMap contributors, Tiles style by Humanitarian OpenStreetMap Team hosted by OpenStreetMap France*\n", "\n", "### Variables\n", "\n", "The list of variables and their metadata is available through the `variables_df` property:" ] }, { "cell_type": "code", 
"execution_count": null, "id": "11", "metadata": {}, "outputs": [], "source": [ "client.variables_df" ] }, { "cell_type": "markdown", "id": "12", "metadata": {}, "source": [ "## Getting a time series of measurements\n", "\n", "Given a list of variables and time range, we can use the `get_ts_df` method to get a time series of station measurements:" ] }, { "cell_type": "code", "execution_count": null, "id": "13", "metadata": {}, "outputs": [], "source": [ "variables = [\"tmpf\", \"dwpf\", \"relh\"]\n", "start = \"2021-08-13\"\n", "end = \"2021-08-16\"\n", "\n", "ts_df = client.get_ts_df(variables, start=start, end=end)\n", "ts_df" ] }, { "cell_type": "markdown", "id": "14", "metadata": {}, "source": [ "### Selecting date range\n", "\n", "While some providers only allow access to the most recent data, e.g., latest 24 hours, others allow querying data for a specific date range. In the latter case, the `start` and `end` arguments can be used to select the date range, which can be any object that can be converted to a [pandas `Timestamp` object](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Timestamp.html), i.e., a string, integer, float or a datetime object from the datetime module or numpy.\n", "\n", "### Selecting variables\n", "\n", "When accessing to time series data (e.g., the `get_ts_df` method of each client), the `variables` argument is used to select the variables to retrieve. The `variables` argument can be either:\n", "\n", "- a string or integer with variable name or code according to the provider's nomenclature, or\n", "- a string referring to an essential climate variable (ECV) using the Meteora nomenclature. 
These canonical ECV keys are defined in `meteora.settings` as `ECV_*` constants; a copy is shown here:\n", "\n", "```python\n", "# precipitation\n", "ECV_PRECIPITATION = \"precipitation\"  # Precipitation\n", "# pressure\n", "ECV_PRESSURE = \"pressure\"  # Pressure (surface)\n", "# radiation budget\n", "ECV_RADIATION_SHORTWAVE = \"radiation_shortwave\"  # Incoming short-wave radiation\n", "ECV_RADIATION_LONGWAVE_INCOMING = (\n", "    \"radiation_longwave_incoming\"  # Incoming long-wave radiation\n", ")\n", "ECV_RADIATION_LONGWAVE_OUTGOING = (\n", "    \"radiation_longwave_outgoing\"  # Outgoing long-wave radiation\n", ")\n", "# temperature\n", "ECV_TEMPERATURE = \"temperature\"  # Air temperature (usually at 2m above ground)\n", "# water vapour\n", "ECV_DEW_POINT_TEMPERATURE = (\n", "    \"dew_point_temperature\"  # Dew point temperature (usually at 2m above ground)\n", ")\n", "ECV_RELATIVE_HUMIDITY = \"relative_humidity\"  # Water vapour/relative humidity\n", "# wind\n", "ECV_WIND_SPEED = \"wind_speed\"  # Surface wind speed\n", "ECV_WIND_DIRECTION = \"wind_direction\"  # Surface wind direction\n", "\n", "```\n", "\n", "See the guidelines by the [World Meteorological Organization](https://gcos.wmo.int/site/global-climate-observing-system-gcos/essential-climate-variables) on ECVs (in the category \"Atmosphere\" > \"Surface\") for more information.\n", "\n", "In the returned time series data frames, **the variable labels will be the same as they have been passed to the `get_ts_df` method**. 
Therefore, if we were to assemble data frames from multiple clients (each with its own nomenclature), it may be better to use the common Meteora nomenclature, e.g., pass the variables as in:" ] }, { "cell_type": "code", "execution_count": null, "id": "15", "metadata": {}, "outputs": [], "source": [ "variables = [\"temperature\", \"relative_humidity\", \"wind_speed\"]\n", "\n", "ts_df = client.get_ts_df(variables, start=start, end=end)\n", "ts_df" ] }, { "cell_type": "markdown", "id": "16", "metadata": {}, "source": [ "### Units\n", "\n", "As you may have noticed, the temperatures above are in degrees Fahrenheit. The time series data frames returned by `get_ts_df` include units metadata (based on [pint](https://pint.readthedocs.io) and [pint-pandas](https://pint-pandas.readthedocs.io)), which can be accessed through the \"units\" key of the data frame's `attrs` attribute:" ] }, { "cell_type": "code", "execution_count": null, "id": "17", "metadata": {}, "outputs": [], "source": [ "ts_df.attrs[\"units\"]" ] }, { "cell_type": "markdown", "id": "18", "metadata": {}, "source": [ "You can convert the data frame to Meteora's canonical ECV units (defined in `settings.ECV_UNIT_DICT`) with `units.convert_units` when needed:" ] }, { "cell_type": "code", "execution_count": null, "id": "19", "metadata": {}, "outputs": [], "source": [ "ts_df_metric = units.convert_units(\n", "    ts_df,\n", "    settings.ECV_UNIT_DICT,\n", ")\n", "ts_df_metric" ] }, { "cell_type": "markdown", "id": "20", "metadata": {}, "source": [ "This makes it easy to combine data from multiple providers.\n", "\n", "We can operate with the resulting objects as we would with any pandas/geopandas data frame, e.g., we can plot the stations by mean temperature over the requested period:" ] }, { "cell_type": "code", "execution_count": null, "id": "21", "metadata": {}, "outputs": [], "source": [ "t_mean_label = \"T$_{mean}$ [°C]\"\n", "\n", "ax = client.stations_gdf.assign(\n", "    **{t_mean_label: 
ts_df_metric.groupby(\"station_id\")[\"temperature\"].mean()}\n", ").plot(\n", "    t_mean_label,\n", "    cmap=\"coolwarm\",\n", "    legend=True,\n", "    legend_kwds={\"label\": t_mean_label, \"shrink\": 0.4},\n", ")\n", "cx.add_basemap(ax, crs=client.stations_gdf.crs, attribution=False)" ] }, { "cell_type": "markdown", "id": "22", "metadata": {}, "source": [ "*(C) OpenStreetMap contributors, Tiles style by Humanitarian OpenStreetMap Team hosted by OpenStreetMap France*\n", "\n", "## Computing climate indices" ] }, { "cell_type": "markdown", "id": "23", "metadata": {}, "source": [ "Meteora integrates with `xclim` through the `meteora.climate_indices` module, so we can compute indices directly from station time series data. For instance, we can compute the number of tropical nights for each station, i.e., the number of days where the daily minimum temperature stays above a threshold:" ] }, { "cell_type": "code", "execution_count": null, "id": "24", "metadata": {}, "outputs": [], "source": [ "tn_df = climate_indices.tn_days_above(ts_df, thresh=\"20 degC\")\n", "tn_df" ] }, { "cell_type": "markdown", "id": "25", "metadata": {}, "source": [ "## Where to go from here\n", "\n", "- See [why Meteora](why-meteora.ipynb) to understand the motivation behind the library and the importance of the spatial coverage of meteorological information.\n", "- Explore further climate indices and xclim in the dedicated [climate indices notebook](climate-indices.ipynb).\n", "- Detect heatwave periods and extract their corresponding weather data for further inspection as reviewed in the [heatwave detection notebook](heatwave-detection.ipynb).\n", "- If you use citizen weather stations (CWS), review the quality control (QC) methods implemented in Meteora in the [Netatmo QC notebook](netatmo-qc.ipynb).\n", "- Learn about the different data structures for meteorological data, their strengths and their weaknesses in the [data structures notebook](data-structures.ipynb)." 
] } ], "metadata": { "jupytext": { "cell_metadata_filter": "-all" }, "kernelspec": { "display_name": "Python (Pixi)", "language": "python", "name": "pixi-kernel-python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.13.11" }, "pixi-kernel": { "environment": "doc" } }, "nbformat": 4, "nbformat_minor": 5 }