Data API¶

This tutorial is separated into three main parts: the first two parts show how to find and get data for impact calculations and should be enough for most users. The third part provides more detailed information on how the API is built.

Contents¶

Finding Datasets
- Data types and data type groups
- Datasets and Properties
Basic impact calculation
- Wrapper functions to open datasets as CLIMADA objects
- Calculate the impact
Technical Information
- Server
- Client
- Metadata
- Download

Finding datasets¶

[1]:

from climada.util.api_client import Client
client = Client()

Data types and data type groups¶

The datasets are first separated into ‘data_type_groups’, which represent the main classes of CLIMADA (exposures, hazard, vulnerability, …). So far, data is available for exposures and hazard. Within each group, data is further separated into ‘data_types’, representing the different hazards and exposures available in CLIMADA.

[2]:

import pandas as pd
data_types = client.list_data_type_infos()

dtf = pd.DataFrame(data_types)
dtf.sort_values(['data_type_group', 'data_type'])

[2]:

	data_type	data_type_group	status	description	properties
3	crop_production	exposures	active	None	[{'property': 'crop', 'mandatory': True, 'desc...
0	litpop	exposures	active	None	[{'property': 'res_arcsec', 'mandatory': False...
5	centroids	hazard	active	None	[]
2	river_flood	hazard	active	None	[{'property': 'res_arcsec', 'mandatory': False...
4	storm_europe	hazard	active	None	[{'property': 'country_iso3alpha', 'mandatory'...
1	tropical_cyclone	hazard	active	None	[{'property': 'res_arcsec', 'mandatory': True,...
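To see how data types nest under data type groups, the rows of the table above can be grouped with plain Python. This is only an illustrative sketch: the `rows` list is copied by hand from the output above rather than fetched from the API, and `by_group` is a name chosen for this example.

```python
from collections import defaultdict

# (data_type, data_type_group) pairs copied by hand from the table above
rows = [
    ("crop_production", "exposures"),
    ("litpop", "exposures"),
    ("centroids", "hazard"),
    ("river_flood", "hazard"),
    ("storm_europe", "hazard"),
    ("tropical_cyclone", "hazard"),
]

# collect the data types that belong to each data type group
by_group = defaultdict(list)
for data_type, group in rows:
    by_group[group].append(data_type)

for group in sorted(by_group):
    print(group, "->", sorted(by_group[group]))
```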

Datasets and Properties¶

For each data type, the individual datasets are distinguished by their properties. The following function provides a table listing the properties and their possible values. The table does not show which property values can be combined, but the search can be refined to find the combination of properties that identifies a unique dataset. Note that at most 10 values are shown per property here, although many more are available, for example for countries.

[3]:

litpop_dataset_infos = client.list_dataset_infos(data_type='litpop')

[4]:

all_properties = client.get_property_values(litpop_dataset_infos)

[5]:

all_properties.keys()

[5]:

dict_keys(['res_arcsec', 'exponents', 'fin_mode', 'spatial_coverage', 'country_iso3alpha', 'country_name', 'country_iso3num'])

Refining the search:¶

[6]:

# as datasets are usually available per country, choosing a country or global dataset reduces the options
# here we want to see which datasets are available for litpop globally:
client.get_property_values(litpop_dataset_infos, known_property_values = {'spatial_coverage':'global'})

[6]:

{'res_arcsec': ['150'],
 'exponents': ['(0,1)', '(1,1)', '(3,0)'],
 'fin_mode': ['pop', 'pc'],
 'spatial_coverage': ['global']}

[7]:

#and here for Switzerland:
client.get_property_values(litpop_dataset_infos, known_property_values = {'country_name':'Switzerland'})

[7]:

{'res_arcsec': ['150'],
 'exponents': ['(3,0)', '(0,1)', '(1,1)'],
 'fin_mode': ['pc', 'pop'],
 'spatial_coverage': ['country'],
 'country_iso3alpha': ['CHE'],
 'country_name': ['Switzerland'],
 'country_iso3num': ['756']}
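Conceptually, refining the search amounts to keeping only the datasets whose properties match the known values and then collecting the values that remain. The following is a minimal sketch of that idea, not the implementation used by `get_property_values`: the function `property_values` and the hand-written `datasets` list are hypothetical and exist only for this illustration.

```python
def property_values(datasets, known=None):
    """Collect the distinct values of each property among matching datasets."""
    known = known or {}
    # keep only the property dicts that agree with every known value
    matching = [
        props for props in datasets
        if all(props.get(key) == value for key, value in known.items())
    ]
    values = {}
    for props in matching:
        for key, value in props.items():
            values.setdefault(key, set()).add(value)
    return {key: sorted(vals) for key, vals in values.items()}


# hypothetical, hand-written property dicts (not fetched from the server)
datasets = [
    {"fin_mode": "pc", "exponents": "(1,1)", "spatial_coverage": "global"},
    {"fin_mode": "pop", "exponents": "(0,1)", "spatial_coverage": "global"},
    {"fin_mode": "pc", "exponents": "(1,1)", "spatial_coverage": "country",
     "country_name": "Switzerland"},
]

print(property_values(datasets, {"spatial_coverage": "global"}))
```

Each additional known value can only shrink the set of matching datasets, which is why adding properties eventually narrows the query down to a single dataset.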

Basic impact calculation¶

This section shows a basic impact calculation for tropical cyclones in Haiti, for the year 2040 under the rcp4.5 scenario, using a hazard set generated with 10 synthetic tracks. For more technical details on the API, see the Technical Information section.

Wrapper functions to open datasets as CLIMADA objects¶

The wrapper function client.get_hazard()¶

It gets the dataset information, downloads the data and opens it as a Hazard instance.

[8]:

tc_dataset_infos = client.list_dataset_infos(data_type='tropical_cyclone')
client.get_property_values(tc_dataset_infos, known_property_values = {'country_name':'Haiti'})

[8]:

{'res_arcsec': ['150'],
 'climate_scenario': ['rcp26', 'rcp45', 'rcp85', 'historical', 'rcp60'],
 'ref_year': ['2040', '2060', '2080'],
 'nb_synth_tracks': ['50', '10'],
 'spatial_coverage': ['country'],
 'tracks_year_range': ['1980_2020'],
 'country_iso3alpha': ['HTI'],
 'country_name': ['Haiti'],
 'country_iso3num': ['332'],
 'resolution': ['150 arcsec']}

[9]:

client = Client()
tc_haiti = client.get_hazard('tropical_cyclone', properties={'country_name': 'Haiti', 'climate_scenario': 'rcp45', 'ref_year':'2040', 'nb_synth_tracks':'10'})
tc_haiti.plot_intensity(0)

[9]:

<GeoAxesSubplot:title={'center':'TC max intensity at each point'}>

../_images/tutorial_climada_util_api_client_17_1.png

The wrapper function client.get_litpop_default()¶

It gets the default LitPop exposure, with exponents (1,1) and ‘produced capital’ as financial mode. If no country is given, the global dataset will be downloaded.

[10]:

litpop_default = client.get_property_values(litpop_dataset_infos, known_property_values = {'fin_mode':'pc', 'exponents':'(1,1)'})

[11]:

litpop = client.get_litpop_default(country='Haiti')

Get the default impact function for tropical cyclones¶

[12]:

from climada.entity.impact_funcs import ImpactFuncSet, ImpfTropCyclone

imp_fun = ImpfTropCyclone.from_emanuel_usa()
imp_fun.check()
imp_fun.plot()

imp_fun_set = ImpactFuncSet()
imp_fun_set.append(imp_fun)

litpop.impact_funcs = imp_fun_set

2022-01-31 22:30:21,359 - climada.entity.impact_funcs.base - WARNING - For intensity = 0, mdd != 0 or paa != 0. Consider shifting the origin of the intensity scale. In impact.calc the impact is always null at intensity = 0.

../_images/tutorial_climada_util_api_client_22_1.png

Calculate the impact¶

[13]:

from climada.engine import Impact
impact = Impact()
impact.calc(litpop, imp_fun_set, tc_haiti)

Getting other Exposures¶

[14]:

crop_dataset_infos = client.list_dataset_infos(data_type='crop_production')

client.get_property_values(crop_dataset_infos)

[14]:

{'crop': ['whe', 'soy', 'ric', 'mai'],
 'irrigation_status': ['noirr', 'firr'],
 'unit': ['USD', 'Tonnes'],
 'spatial_coverage': ['global']}

[15]:

rice_exposure = client.get_exposures(exposures_type='crop_production', properties = {'crop':'ric', 'unit': 'USD','irrigation_status': 'noirr'})

Technical Information¶

For programmatic access to the CLIMADA data API there is a specific REST call wrapper class: climada.util.api_client.Client.

Server¶

The CLIMADA data file server is hosted at https://data.iac.ethz.ch and can be accessed via a REST API at https://climada.ethz.ch. For REST API details, see the documentation.

Client¶

[16]:

Client?

Init signature: Client()
Docstring:
Python wrapper around REST calls to the CLIMADA data API server.

Init docstring:
Constructor of Client.

Data API host and chunk_size (for download) are configurable values.
Default values are 'climada.ethz.ch' and 8096 respectively.
File:           c:\users\me\polybox\workshop\climada_python\climada\util\api_client.py
Type:           type
Subclasses:

[17]:

client = Client()
client.chunk_size

[17]:

The URL of the API server and the chunk size for the file download can be configured in ‘climada.conf’. Just replace the corresponding default values:

"data_api": {
    "host": "https://climada.ethz.ch",
    "chunk_size": 8192,
    "cache_db": "{local_data.system}/.downloads.db"
}

The other configuration value affecting the data_api client, cache_db, is the path to an SQLite database file, which keeps track of the files that were successfully downloaded from the API server. Before the Client attempts to download any file from the server, it checks whether the file has been downloaded before and, if so, whether the previously downloaded file still looks good (i.e., size and time stamp are as expected). If all of this is the case, the file is simply read from disk without submitting another request.
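The check described above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the client's actual code: the table layout and the helper names `open_cache`, `record_download` and `cached_file` are invented for this example, and the URL is a placeholder.

```python
import sqlite3
import tempfile
from pathlib import Path


def open_cache(db_path):
    """Open (or create) a small table recording finished downloads (hypothetical schema)."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS downloads "
        "(url TEXT PRIMARY KEY, path TEXT, size INTEGER, mtime REAL)"
    )
    return conn


def record_download(conn, url, path):
    """Remember size and modification time of a file that was just downloaded."""
    stat = Path(path).stat()
    conn.execute(
        "INSERT OR REPLACE INTO downloads VALUES (?, ?, ?, ?)",
        (url, str(path), stat.st_size, stat.st_mtime),
    )
    conn.commit()


def cached_file(conn, url):
    """Return the cached path if the file still matches its record, else None."""
    row = conn.execute(
        "SELECT path, size, mtime FROM downloads WHERE url = ?", (url,)
    ).fetchone()
    if row is None:
        return None  # never downloaded
    path = Path(row[0])
    if not path.is_file():
        return None  # file was deleted in the meantime
    stat = path.stat()
    if stat.st_size != row[1] or stat.st_mtime != row[2]:
        return None  # file changed, so download again
    return path


with tempfile.TemporaryDirectory() as tmp:
    conn = open_cache(":memory:")
    target = Path(tmp) / "example.hdf5"
    target.write_bytes(b"dummy content")
    url = "https://example.org/example.hdf5"  # placeholder URL
    print(cached_file(conn, url))             # no record yet
    record_download(conn, url, target)
    print(cached_file(conn, url) == target)   # record and file agree
    conn.close()
```

Comparing size and time stamp is cheap compared with hashing the file, which is presumably why such a check is attractive before deciding to skip a download.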

Metadata¶

Unique Identifiers¶

Any dataset can be identified by data_type, name and version. The combination of the three is unique in the API server’s underlying database. However, sometimes the name alone is already enough for identification. All datasets have a UUID, a universally unique identifier, which is part of their individual URL. E.g., the UUID of the dataset https://climada.ethz.ch/rest/dataset/b1c76120-4e60-4d8f-99c0-7e1e7b7860ec is “b1c76120-4e60-4d8f-99c0-7e1e7b7860ec”. One can retrieve its metadata by:

[18]:

client.get_dataset_info_by_uuid('b1c76120-4e60-4d8f-99c0-7e1e7b7860ec')

[18]:

DatasetInfo(uuid='b1c76120-4e60-4d8f-99c0-7e1e7b7860ec', data_type=DataTypeShortInfo(data_type='litpop', data_type_group='exposures'), name='LitPop_assets_pc_150arcsec_SGS', version='v1', status='active', properties={'res_arcsec': '150', 'exponents': '(3,0)', 'fin_mode': 'pc', 'spatial_coverage': 'country', 'date_creation': '2021-09-23', 'climada_version': 'v2.2.0', 'country_iso3alpha': 'SGS', 'country_name': 'South Georgia and the South Sandwich Islands', 'country_iso3num': '239'}, files=[FileInfo(uuid='b1c76120-4e60-4d8f-99c0-7e1e7b7860ec', url='https://data.iac.ethz.ch/climada/b1c76120-4e60-4d8f-99c0-7e1e7b7860ec/LitPop_assets_pc_150arcsec_SGS.hdf5', file_name='LitPop_assets_pc_150arcsec_SGS.hdf5', file_format='hdf5', file_size=1086488, check_sum='md5:27bc1846362227350495e3d946dfad5e')], doi=None, description="LitPop asset value exposure per country: Gridded physical asset values by country, at a resolution of 150 arcsec. Values are total produced capital values disaggregated proportionally to the cube of nightlight intensity (Lit^3, based on NASA Earth at Night). The following values were used as parameters in the LitPop.from_countries() method:{'total_values': 'None', 'admin1_calc': 'False','reference_year': '2018', 'gpw_version': '4.11'}Reference: Eberenz et al., 2020. https://doi.org/10.5194/essd-12-817-2020", license='Attribution 4.0 International (CC BY 4.0)', activation_date='2021-09-13 09:08:28.358559+00:00', expiration_date=None)

or by filtering:

Data Set Status¶

The datasets of climada.ethz.ch may have the following statuses:
- active: the default for real-life data
- preliminary: when the dataset is already uploaded but some information or file is still missing
- expired: when a dataset is inactivated again
- test_dataset: datasets that are used in unit or integration tests have this status so that they are not taken seriously by accident

When collecting a list of datasets with list_dataset_infos, the default dataset status is ‘active’. With the argument status=None this filter can be turned off.
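The effect of the status argument can be illustrated with a small stand-alone filter. This is a sketch of the behaviour described above, not the client's code: `filter_by_status` and the sample records are hypothetical.

```python
def filter_by_status(datasets, status="active"):
    """Keep datasets with the given status; status=None keeps everything."""
    if status is None:
        return list(datasets)
    return [ds for ds in datasets if ds["status"] == status]


# hypothetical records, for illustration only
datasets = [
    {"name": "a", "status": "active"},
    {"name": "b", "status": "preliminary"},
    {"name": "c", "status": "expired"},
    {"name": "d", "status": "test_dataset"},
]

print([ds["name"] for ds in filter_by_status(datasets)])
print(len(filter_by_status(datasets, status=None)))
```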

DatasetInfo Objects and DataFrames¶

As stated above, get_dataset_info (or get_dataset_info_by_uuid) returns a DatasetInfo object and list_dataset_infos a list thereof.

[19]:

from climada.util.api_client import DatasetInfo
DatasetInfo?

Init signature:
DatasetInfo(
    uuid: str,
    data_type: climada.util.api_client.DataTypeShortInfo,
    name: str,
    version: str,
    status: str,
    properties: dict,
    files: list,
    doi: str,
    description: str,
    license: str,
    activation_date: str,
    expiration_date: str,
) -> None
Docstring:      dataset data from CLIMADA data API.
File:           c:\users\me\polybox\workshop\climada_python\climada\util\api_client.py
Type:           type
Subclasses:

where files is a list of FileInfo objects:

[20]:

from climada.util.api_client import FileInfo
FileInfo?

Init signature:
FileInfo(
    uuid: str,
    url: str,
    file_name: str,
    file_format: str,
    file_size: int,
    check_sum: str,
) -> None
Docstring:      file data from CLIMADA data API.
File:           c:\users\me\polybox\workshop\climada_python\climada\util\api_client.py
Type:           type
Subclasses:
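The check_sum field shown earlier has the form "md5:<hexdigest>". A downloaded file could be verified against such a string as sketched below. Whether and how the client itself performs this check is not stated here, so treat `verify_check_sum` as a hypothetical helper; it assumes the "<algorithm>:<hexdigest>" layout seen in the example output.

```python
import hashlib


def verify_check_sum(data: bytes, check_sum: str) -> bool:
    """Compare bytes against a string of the assumed form '<algorithm>:<hexdigest>'."""
    algorithm, _, expected = check_sum.partition(":")
    digest = hashlib.new(algorithm, data).hexdigest()
    return digest == expected.lower()


payload = b"hello"
check_sum = "md5:" + hashlib.md5(payload).hexdigest()

print(verify_check_sum(payload, check_sum))      # matching content
print(verify_check_sum(b"tampered", check_sum))  # altered content
```

For a real file, the bytes would be read from disk (in chunks for large files) rather than held in memory as in this short example.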

Convert into DataFrame¶

There are convenience functions to easily convert datasets into pandas DataFrames, into_datasets_df and into_files_df:

[21]:

client.into_datasets_df?

Signature: client.into_datasets_df(dataset_infos)
Docstring:
Convenience function providing a DataFrame of datasets with properties.

Parameters
----------
dataset_infos : list of DatasetInfo
     as returned by list_dataset_infos

Returns
-------
pandas.DataFrame
    of datasets with properties as found in query by arguments
File:      c:\users\me\polybox\workshop\climada_python\climada\util\api_client.py
Type:      function

[22]:

from climada.util.api_client import Client
client = Client()
litpop_datasets = client.list_dataset_infos(data_type='litpop', properties={'country_name': 'South Georgia and the South Sandwich Islands'})
litpop_df = client.into_datasets_df(litpop_datasets)
litpop_df

[22]:

	data_type	data_type_group	uuid	name	version	status	doi	description	license	activation_date	expiration_date	res_arcsec	exponents	fin_mode	spatial_coverage	date_creation	climada_version	country_iso3alpha	country_name	country_iso3num
0	litpop	exposures	b1c76120-4e60-4d8f-99c0-7e1e7b7860ec	LitPop_assets_pc_150arcsec_SGS	v1	active	None	LitPop asset value exposure per country: Gridd...	Attribution 4.0 International (CC BY 4.0)	2021-09-13 09:08:28.358559+00:00	None	150	(3,0)	pc	country	2021-09-23	v2.2.0	SGS	South Georgia and the South Sandwich Islands	239
1	litpop	exposures	3d516897-5f87-46e6-b673-9e6c00d110ec	LitPop_pop_150arcsec_SGS	v1	active	None	LitPop population exposure per country: Gridde...	Attribution 4.0 International (CC BY 4.0)	2021-09-13 09:09:10.634374+00:00	None	150	(0,1)	pop	country	2021-09-23	v2.2.0	SGS	South Georgia and the South Sandwich Islands	239
2	litpop	exposures	a6864a65-36a2-4701-91bc-81b1355103b5	LitPop_150arcsec_SGS	v1	active	None	LitPop asset value exposure per country: Gridd...	Attribution 4.0 International (CC BY 4.0)	2021-09-13 09:09:30.907938+00:00	None	150	(1,1)	pc	country	2021-09-23	v2.2.0	SGS	South Georgia and the South Sandwich Islands	239
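In the table above, the entries of each dataset's properties dictionary appear as ordinary columns next to fields such as name and version. The flattening step behind that layout can be sketched with plain dictionaries. This is an illustration only, not the code of `into_datasets_df`: `flatten` is a hypothetical helper and the record is an abbreviated copy of the first row.

```python
def flatten(dataset):
    """Merge top-level fields and the nested 'properties' dict into one flat row."""
    row = {key: value for key, value in dataset.items() if key != "properties"}
    row.update(dataset.get("properties", {}))
    return row


# abbreviated record based on the first row of the table above
dataset = {
    "name": "LitPop_assets_pc_150arcsec_SGS",
    "version": "v1",
    "status": "active",
    "properties": {"res_arcsec": "150", "exponents": "(3,0)", "fin_mode": "pc"},
}

row = flatten(dataset)
print(sorted(row))
```

A list of such rows is the kind of input from which a DataFrame with one column per property can be built.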

Download¶

The wrapper functions get_exposures and get_hazard fetch the dataset information, download the file and open it as a CLIMADA object. But one can also just download dataset files using the method download_dataset, which takes a DatasetInfo object as argument and downloads all files of the dataset to a directory in the local file system.

[23]:

client.download_dataset?

Signature:
client.download_dataset(
    dataset,
    target_dir=WindowsPath('C:/Users/me/climada/data'),
    organize_path=True,
)
Docstring:
Download all files from a given dataset to a given directory.

Parameters
----------
dataset : DatasetInfo
    the dataset
target_dir : Path, optional
    target directory for download, by default `climada.util.constants.SYSTEM_DIR`
organize_path: bool, optional
    if set to True the files will end up in subdirectories of target_dir:
    [target_dir]/[data_type_group]/[data_type]/[name]/[version]
    by default True

Returns
-------
download_dir : Path
    the path to the directory containing the downloaded files,
    will be created if organize_path is True
downloaded_files : list of Path
    the downloaded files themselves

Raises
------
Exception
    when one of the files cannot be downloaded
File:      c:\users\me\polybox\workshop\climada_python\climada\util\api_client.py
Type:      method

Cache¶

The method avoids superfluous downloads by keeping track of all downloads in an SQLite database file. The client will make sure that the same file is never downloaded to the same target twice.

Examples¶

[24]:

ds = litpop_datasets[0]
download_dir, ds_files = client.download_dataset(ds)
ds_files[0], ds_files[0].is_file()

[24]:

(WindowsPath('C:/Users/me/climada/data/exposures/litpop/LitPop_assets_pc_150arcsec_SGS/v1/LitPop_assets_pc_150arcsec_SGS.hdf5'),
 True)
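The location in the output above follows the [target_dir]/[data_type_group]/[data_type]/[name]/[version] pattern documented for organize_path=True. The sketch below rebuilds that directory from its parts. `organized_dir` is a hypothetical helper written for this example, and `PurePosixPath` is used only so that the result prints with forward slashes on any platform.

```python
from pathlib import PurePosixPath


def organized_dir(target_dir, data_type_group, data_type, name, version):
    """Build [target_dir]/[data_type_group]/[data_type]/[name]/[version]."""
    return PurePosixPath(target_dir) / data_type_group / data_type / name / version


download_dir = organized_dir(
    "C:/Users/me/climada/data",        # target_dir from the example above
    "exposures",                       # data_type_group
    "litpop",                          # data_type
    "LitPop_assets_pc_150arcsec_SGS",  # dataset name
    "v1",                              # version
)
print(download_dir)
print(download_dir / "LitPop_assets_pc_150arcsec_SGS.hdf5")
```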
