{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "0",
   "metadata": {},
   "source": [
    "# Overview\n",
    "\n",
    "This example notebook showcases the main features of Meteora. To that end, we will download and process meteorological observations from [the Automated Surface/Weather Observing Systems (ASOS/AWOS) program](https://www.ncei.noaa.gov/products/land-based-station/automated-surface-weather-observing-systems), which comprises more than 900 automated weather stations in the United States.\n",
    "\n",
    "More precisely, we will use the `METARASOSIEMClient` to retrieve [METAR](https://madis.ncep.noaa.gov/madis_OMO.shtml) observations from the [Iowa Environmental Mesonet](https://mesonet.agron.iastate.edu/request/asos/1min.phtml)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1",
   "metadata": {},
   "outputs": [],
   "source": [
    "import contextily as cx\n",
    "\n",
    "from meteora import clients, climate_indices, settings, units"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2",
   "metadata": {},
   "source": [
    "## Meteora clients\n",
    "\n",
    "Meteora is essentially a collection of \"client\" classes that allow [processing data from different providers](../supported-providers.html) through a standardized interface. The following sections go through the main aspects of a Meteora client.\n",
    "\n",
    "### Selecting the client's region of interest\n",
    "\n",
    "All clients are instantiated with at least the `region` argument, which defines the spatial extent of the required data. The `region` argument uses the [pyregeon](https://pyregeon.readthedocs.io/en/latest) library and can be either:\n",
    "\n",
    "- A string with a place name (Nominatim query) to geocode.\n",
    "- A sequence with the west, south, east and north bounds.\n",
    "- A geometric object, e.g., shapely geometry, or a sequence of geometric objects. In such a case, the region will be passed as the `data` argument of the GeoSeries constructor.\n",
    "- A geopandas geo-series or geo-data frame.\n",
    "- A filename or URL, a file-like object opened in binary (`'rb'`) mode, or a `Path` object that will be passed to `geopandas.read_file`.\n",
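    "\n",
    "As a rough illustration of the second option, the bounds are given in west, south, east, north order. The coordinates below are approximate, hand-picked values for Switzerland in degrees (an assumption made for this sketch, not values taken from Meteora or Nominatim):\n",
    "\n",
    "```python\n",
    "# Approximate bounding box of Switzerland in degrees (illustrative values).\n",
    "west, south, east, north = 5.96, 45.82, 10.49, 47.81\n",
    "region_bounds = (west, south, east, north)\n",
    "\n",
    "# Basic sanity checks on the ordering before using the sequence as `region`.\n",
    "assert west < east and south < north\n",
    "```\n",
    "\n",
    "Such a sequence could then be passed in place of a place name, e.g., `clients.METARASOSIEMClient(region_bounds)`.\n",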
    "\n",
    "In this case, we will use the country of Switzerland as defined by a query to the [Nominatim API](https://nominatim.org) (via [osmnx](https://github.com/gboeing/osmnx)):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3",
   "metadata": {},
   "outputs": [],
   "source": [
    "region = \"Switzerland\""
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4",
   "metadata": {},
   "source": [
    "We can now instantiate our client:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5",
   "metadata": {},
   "outputs": [],
   "source": [
    "client = clients.METARASOSIEMClient(region)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6",
   "metadata": {},
   "source": [
    "### Station locations and metadata\n",
    "\n",
    "The list of stations maintained by the provider within the selected region can be accessed using the `stations_gdf` property:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "7",
   "metadata": {},
   "outputs": [],
   "source": [
    "client.stations_gdf.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8",
   "metadata": {},
   "source": [
    "This is a geopandas geo-data frame that holds the station metadata, including each station's location, so we can, e.g., plot the stations on a map:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "9",
   "metadata": {},
   "outputs": [],
   "source": [
    "ax = client.stations_gdf.plot()\n",
    "cx.add_basemap(ax, crs=client.stations_gdf.crs, attribution=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "10",
   "metadata": {},
   "source": [
    "*(C) OpenStreetMap contributors, Tiles style by Humanitarian OpenStreetMap Team hosted by OpenStreetMap France*\n",
    "\n",
    "### Variables\n",
    "\n",
    "The list of variables and their metadata is available through the `variables_df` property:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "11",
   "metadata": {},
   "outputs": [],
   "source": [
    "client.variables_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "12",
   "metadata": {},
   "source": [
    "## Getting a time series of measurements\n",
    "\n",
    "Given a list of variables and a time range, we can use the `get_ts_df` method to get a time series of station measurements:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "13",
   "metadata": {},
   "outputs": [],
   "source": [
    "variables = [\"tmpf\", \"dwpf\", \"relh\"]\n",
    "start = \"2021-08-13\"\n",
    "end = \"2021-08-16\"\n",
    "\n",
    "ts_df = client.get_ts_df(variables, start=start, end=end)\n",
    "ts_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "14",
   "metadata": {},
   "source": [
    "### Selecting date range\n",
    "\n",
    "While some providers only allow access to the most recent data, e.g., the latest 24 hours, others allow querying data for a specific date range. In the latter case, the `start` and `end` arguments select the date range. Each of them can be any object that can be converted to a [pandas `Timestamp` object](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Timestamp.html), i.e., a string, an integer, a float, or a datetime object from the `datetime` module or numpy.\n",
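    "\n",
    "As a small sketch of what \"convertible to a `Timestamp`\" means, the snippet below passes a string, a `datetime.datetime` and a `numpy.datetime64` to `pd.Timestamp` directly. Integers and floats are left out because pandas reads them as offsets from the Unix epoch (nanoseconds by default), which is rarely what is intended for a date:\n",
    "\n",
    "```python\n",
    "import datetime as dt\n",
    "\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Three equivalent ways of expressing the same start date.\n",
    "candidates = [\n",
    "    \"2021-08-13\",\n",
    "    dt.datetime(2021, 8, 13),\n",
    "    np.datetime64(\"2021-08-13\"),\n",
    "]\n",
    "timestamps = [pd.Timestamp(candidate) for candidate in candidates]\n",
    "\n",
    "# All of them resolve to the same pandas Timestamp.\n",
    "assert all(ts == timestamps[0] for ts in timestamps)\n",
    "```\n",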
    "\n",
    "### Selecting variables\n",
    "\n",
    "When accessing time series data (e.g., through the `get_ts_df` method of each client), the `variables` argument is used to select the variables to retrieve. The `variables` argument can be either:\n",
    "\n",
    "- a string or integer with variable name or code according to the provider's nomenclature, or\n",
    "- a string referring to an essential climate variable (ECV) using the Meteora nomenclature. These canonical ECV keys are defined in `meteora.settings` as `ECV_*` constants; a copy is shown here:\n",
    "\n",
    "```python\n",
    "# precipitation\n",
    "ECV_PRECIPITATION = \"precipitation\"  # Precipitation\n",
    "# pressure\n",
    "ECV_PRESSURE = \"pressure\"  # Pressure (surface)\n",
    "# radiation budget\n",
    "ECV_RADIATION_SHORTWAVE = \"radiation_shortwave\"  # Incoming short-wave radiation\n",
    "ECV_RADIATION_LONGWAVE_INCOMING = (\n",
    "    \"radiation_longwave_incoming\"  # Incoming long-wave radiation\n",
    ")\n",
    "ECV_RADIATION_LONGWAVE_OUTGOING = (\n",
    "    \"radiation_longwave_outgoing\"  # Outgoing long-wave radiation\n",
    ")\n",
    "# temperature\n",
    "ECV_TEMPERATURE = \"temperature\"  # Air temperature (usually at 2m above ground)\n",
    "# water vapour\n",
    "ECV_DEW_POINT_TEMPERATURE = (\n",
    "    \"dew_point_temperature\"  # Dew point temperature (usually at 2m above ground)\n",
    ")\n",
    "ECV_RELATIVE_HUMIDITY = \"relative_humidity\"  # Water vapour/relative humidity\n",
    "# wind\n",
    "ECV_WIND_SPEED = \"wind_speed\"  # Surface wind speed\n",
    "ECV_WIND_DIRECTION = \"wind_direction\"  # Surface wind direction\n",
    "\n",
    "```\n",
    "\n",
    "See the guidelines by the [World Meteorological Organization](https://gcos.wmo.int/site/global-climate-observing-system-gcos/essential-climate-variables) on ECVs (under the category \"Atmosphere\" > \"Surface\") for more information.\n",
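    "\n",
    "To avoid typos in the string keys, the constants listed above can be used to build the `variables` argument. The sketch below redefines three of them locally so that it runs without Meteora; with the library installed, they would be read from `meteora.settings` instead:\n",
    "\n",
    "```python\n",
    "# Local copies of three constants from the listing above (for illustration only).\n",
    "ECV_TEMPERATURE = \"temperature\"\n",
    "ECV_RELATIVE_HUMIDITY = \"relative_humidity\"\n",
    "ECV_WIND_SPEED = \"wind_speed\"\n",
    "\n",
    "# Provider-independent selection of variables using the Meteora nomenclature.\n",
    "variables = [ECV_TEMPERATURE, ECV_RELATIVE_HUMIDITY, ECV_WIND_SPEED]\n",
    "```\n",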
    "\n",
    "In the returned time series data frames, **the variable labels will be the same as they have been passed to the `get_ts_df` method**. Therefore, if we were to assemble data frames for multiple clients (each with its own nomenclature), it may be better to use the common Meteora nomenclature, e.g., pass the variables as in:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "15",
   "metadata": {},
   "outputs": [],
   "source": [
    "variables = [\"temperature\", \"relative_humidity\", \"wind_speed\"]\n",
    "\n",
    "ts_df = client.get_ts_df(variables, start=start, end=end)\n",
    "ts_df"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "16",
   "metadata": {},
   "source": [
    "### Units\n",
    "\n",
    "As you may have noticed, the temperatures above are in degrees Fahrenheit. The time series data frames returned by `get_ts_df` include units metadata (based on [pint](https://pint.readthedocs.io) and [pint-pandas](https://pint-pandas.readthedocs.io)), which can be accessed through the \"units\" key of the data frame's `attrs` attribute:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "17",
   "metadata": {},
   "outputs": [],
   "source": [
    "ts_df.attrs[\"units\"]"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "18",
   "metadata": {},
   "source": [
    "You can convert the data frame to Meteora's canonical ECV units (defined in `settings.ECV_UNIT_DICT`) with `units.convert_units` when needed:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "19",
   "metadata": {},
   "outputs": [],
   "source": [
    "ts_df_metric = units.convert_units(\n",
    "    ts_df,\n",
    "    settings.ECV_UNIT_DICT,\n",
    ")\n",
    "ts_df_metric"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "20",
   "metadata": {},
   "source": [
    "This makes it easy to combine data from multiple providers.\n",
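    "\n",
    "For temperature, and assuming that degrees Celsius is the target unit (as the plot label further down suggests), the conversion amounts to the usual Fahrenheit-to-Celsius relation. The helper below is a hypothetical, plain-Python illustration of that arithmetic; it is not how `units.convert_units` is implemented:\n",
    "\n",
    "```python\n",
    "def fahrenheit_to_celsius(temp_f: float) -> float:\n",
    "    \"\"\"Convert a temperature from degrees Fahrenheit to degrees Celsius.\"\"\"\n",
    "    return (temp_f - 32.0) * 5.0 / 9.0\n",
    "\n",
    "\n",
    "print(fahrenheit_to_celsius(68.0))  # 20.0\n",
    "```\n",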
    "\n",
    "We can operate with the resulting objects as we would with any pandas/geopandas data frame, e.g., we can plot the stations by mean temperature over the requested period:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "21",
   "metadata": {},
   "outputs": [],
   "source": [
    "t_mean_label = \"T$_{mean}$ [°C]\"\n",
    "\n",
    "ax = client.stations_gdf.assign(\n",
    "    **{t_mean_label: ts_df_metric.groupby(\"station_id\")[\"temperature\"].mean()}\n",
    ").plot(\n",
    "    t_mean_label,\n",
    "    cmap=\"coolwarm\",\n",
    "    legend=True,\n",
    "    legend_kwds={\"label\": t_mean_label, \"shrink\": 0.4},\n",
    ")\n",
    "cx.add_basemap(ax, crs=client.stations_gdf.crs, attribution=False)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "22",
   "metadata": {},
   "source": [
    "*(C) OpenStreetMap contributors, Tiles style by Humanitarian OpenStreetMap Team hosted by OpenStreetMap France*\n",
    "\n",
    "## Computing climate indices"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "23",
   "metadata": {},
   "source": [
    "Meteora integrates with `xclim` through the `meteora.climate_indices` module, so we can compute indices directly from station time series data. For instance, we can compute the number of tropical nights for each station, i.e., the number of days on which the daily minimum temperature stays above a threshold:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "24",
   "metadata": {},
   "outputs": [],
   "source": [
    "tn_df = climate_indices.tn_days_above(ts_df, thresh=\"20 degC\")\n",
    "tn_df"
   ]
  },
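  {
   "cell_type": "markdown",
   "id": "24a",
   "metadata": {},
   "source": [
    "To make the definition concrete, the count of tropical nights can be reproduced by hand. The following is a minimal pandas sketch on synthetic data for a single hypothetical station, assuming a 20 °C threshold; it only illustrates the definition and is not how `climate_indices.tn_days_above` or xclim compute the index:\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Synthetic hourly temperatures (°C) for one hypothetical station over two days.\n",
    "index = pd.date_range(\"2021-08-13\", periods=48, freq=pd.Timedelta(hours=1))\n",
    "temperature = pd.Series(22.0, index=index)\n",
    "temperature.iloc[30] = 18.5  # on the second day the minimum drops below 20 °C\n",
    "\n",
    "# Daily minimum temperature, then count the days where it stays above the threshold.\n",
    "daily_min = temperature.resample(\"D\").min()\n",
    "tropical_nights = int((daily_min > 20.0).sum())\n",
    "print(tropical_nights)  # 1\n",
    "```"
   ]
  },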
  {
   "cell_type": "markdown",
   "id": "25",
   "metadata": {},
   "source": [
    "## Where to go from here\n",
    "\n",
    "- See [why Meteora](why-meteora.ipynb) to understand the motivation behind the library and the importance of the spatial coverage of meteorological information.\n",
    "- Explore further climate indices and xclim in the dedicated [climate indices notebook](climate-indices.ipynb).\n",
    "- Detect heatwave periods and extract their corresponding weather data for further inspection as reviewed in the [heatwave detection notebook](heatwave-detection.ipynb).\n",
    "- If you use citizen weather stations (CWS), review the quality control (QC) methods implemented in Meteora in the [Netatmo QC notebook](netatmo-qc.ipynb).\n",
    "- Learn about the different data structures for meteorological data, their strengths and their weaknesses in the [data structures notebook](data-structures.ipynb)."
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "cell_metadata_filter": "-all"
  },
  "kernelspec": {
   "display_name": "Python (Pixi)",
   "language": "python",
   "name": "pixi-kernel-python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.11"
  },
  "pixi-kernel": {
   "environment": "doc"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}