Miscellaneous CLIMADA conventions

Table of Contents

  • 1   Miscellaneous CLIMADA conventions

    • 1.1   Dependencies (python packages)

    • 1.2   Class inheritance

    • 1.3   Does it belong into CLIMADA?

    • 1.4   Paper repository

    • 1.5   Utility function

    • 1.6   Impact function renaming - if to impf

    • 1.7   Data dependencies

    • 1.8   Side Note on Parameters

Dependencies (python packages)

Python is extremely powerful thanks to the large number of available libraries, packages and modules. However, code that relies on many such packages accumulates dependencies, and these are laborious to maintain. Each package is updated and developed continuously by its own developers. As a result, code can become obsolete over time, stop working altogether, or become incompatible with other packages. Hence, it is crucial to keep to the following philosophy:

As many packages as needed, as few as possible.

Thus, when you are coding, follow these priorities:

  1. Python standard library

  2. Functions and methods already implemented in CLIMADA (do NOT introduce circular imports though)

  3. Packages already included in CLIMADA

  4. Before adding a new dependency:

Hence, first try to solve your problem with the standard library and the functions/methods already implemented in CLIMADA (see in particular the util functions), then use the packages included in CLIMADA, and if this is not enough, propose the addition of a new package. Do not hesitate to propose new packages if your work needs them!
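
As an illustration of the first priority, a small task such as splitting a sequence into fixed-size chunks can often be covered by the standard library alone. The following is only a sketch: the helper name chunked is hypothetical and not part of CLIMADA, and it relies on nothing but itertools.

```python
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items (hypothetical helper)."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(iterable)
    while True:
        # islice consumes up to `size` items from the shared iterator
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


batches = list(chunked(range(7), 3))
print(batches)
```

Only if such a standard-library solution becomes unwieldy would the next priorities (existing CLIMADA functions, then already included packages) come into play.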

Class inheritance

In Python, a class can inherit from other classes, which is a very useful mechanism in certain circumstances. However, it is wise to think about inheritance before implementing it. Most importantly, CLIMADA classes should not inherit from external library classes. For example, Exposure inherited directly from Geopandas, which caused problems in CLIMADA when the Geopandas package was updated.

CLIMADA classes shall NOT inherit classes from external modules
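
One common alternative to inheriting from an external class is composition: the external object is kept as an attribute and accessed through a small set of methods. The sketch below illustrates the idea under stated assumptions. ExternalTable is a stand-in for any third-party class, and ExposuresSketch is a hypothetical name, not the actual CLIMADA implementation.

```python
class ExternalTable:
    """Stand-in for a class from an external library (hypothetical)."""

    def __init__(self, rows):
        self.rows = list(rows)

    def column(self, name):
        return [row[name] for row in self.rows]


class ExposuresSketch:
    """Hypothetical CLIMADA-style class that wraps instead of inheriting."""

    def __init__(self, rows):
        # The external object is an attribute; its API stays behind this class.
        self._table = ExternalTable(rows)

    def total_value(self):
        # Only this method needs adapting if ExternalTable changes its API.
        return sum(self._table.column("value"))


exposures = ExposuresSketch([{"value": 100.0}, {"value": 250.0}])
print(exposures.total_value())
```

With this layout, an update of the external package can only break the few places that touch the wrapped attribute, rather than every method inherited from it.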

Does it belong into CLIMADA?

When developing for CLIMADA, it is important to distinguish between core content and particular applications. Core content is meant to be included in the climada_python repository and will be subject to a code review. Any new addition should first be discussed with one of the repository admins. The purpose of this discussion is to clarify:

  • How does the planned module fit into CLIMADA?

  • What is an optimal architecture for the new module?

  • What parts might already exist in other parts of the code?

Applications made with CLIMADA, such as an ECA study, can be stored in the paper repository once they have been published. For other types of work, consider making a separate repository that imports CLIMADA as an external package.

Paper repository

Authors of applications made with CLIMADA that are published as a paper or a report are strongly encouraged to submit them to the climada/paper repository. You can either:

  • Prepare a well-commented jupyter notebook with the code necessary to reproduce your results and upload it to the climada/paper repository. Note however that the repository cannot be used for storing data files.

  • Upload the code necessary to reproduce your results to a separate repository of your own. Then, add a link to your repository and to your publication to the readme file on the climada/paper repository.

Notes about DOI

Some journals require you to provide a DOI for the code and data used in your publication. In this case, we encourage you to create a separate repository for your code and obtain a DOI using Zenodo or a specific service from your institution (e.g. ETH Zürich).

The CLIMADA releases are also identified with a DOI.

Utility function

In CLIMADA, there is a set of utility functions defined in climada.util. A few examples are:

  • convert large monetary numbers into thousands, millions or billions together with the correct unit name

  • compute distances

  • load hdf5 files

  • convert iso country numbers between formats

Whenever you develop a module or review code, check whether a given functionality has already been implemented as a utility function. In addition, think carefully about whether a given function/method really belongs in its module, or is actually independent of any particular module and should be defined as a utility function.

It is very important not to reinvent the wheel and to avoid unnecessary redundancies in the code. Redundant code makes maintenance and debugging very tedious.
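
To make the idea of a module-independent utility concrete, here is a minimal sketch of the first example listed above (scaling large monetary numbers and naming the unit). The function name scale_money_sketch and its return format are assumptions made for illustration. This is not the climada.util API, which should be looked up and reused instead.

```python
def scale_money_sketch(value):
    """Return (scaled_value, unit_name) for a monetary amount.

    Hypothetical helper for illustration only; not the climada.util API.
    """
    thresholds = [(1e9, "billion"), (1e6, "million"), (1e3, "thousand")]
    for factor, name in thresholds:
        # Pick the largest unit that the absolute value reaches.
        if abs(value) >= factor:
            return value / factor, name
    return value, ""


print(scale_money_sketch(2_500_000))
```

A helper like this depends on no particular hazard, exposure or impact class, which is the kind of function that belongs in a shared utility module rather than being re-implemented in each one.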

Impact function renaming - if to impf

In the original CLIMADA code, the impact function is often referred to as if or if_. This is easy to confuse with the Python keyword if used for conditional statements. Hence, in the future a transition from

if → impf

will be performed. Once the change is active, known developers will be notified and this message updated.
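
The background of this renaming can be checked with the standard library: if is a reserved keyword and therefore cannot be used as a variable or attribute name, which is why the workaround if_ appeared in the first place. The helper is_usable_name below is a hypothetical name used only for this sketch.

```python
import keyword


def is_usable_name(name):
    """True if `name` is a valid identifier and not a reserved keyword."""
    return name.isidentifier() and not keyword.iskeyword(name)


for name in ("if", "if_", "impf"):
    print(name, is_usable_name(name))
```

Both if_ and impf are legal names, but only impf avoids the visual similarity to the keyword.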

Data dependencies

Web APIs

CLIMADA relies on open data available through web APIs such as those of the World Bank, Natural Earth, NASA and NOAA. You can run the test climada_python-x.y.z/test_data_api.py to check that all the APIs used are active. If any of them is out of service (temporarily or permanently), the test will indicate which one.
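
The general idea behind such an availability check can be sketched as follows. This is not the content of test_data_api.py. The function names and the placeholder URLs are assumptions made for illustration, and the demonstration uses a fake probe so that it runs without network access.

```python
from urllib.error import URLError
from urllib.request import urlopen


def probe_url(url, timeout=10):
    """Return True if `url` answers an HTTP request without error."""
    try:
        with urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (URLError, OSError, ValueError):
        # Unreachable host, HTTP error status, timeout or malformed URL
        return False


def failing_services(endpoints, probe=probe_url):
    """Return the names of all endpoints for which `probe` reports failure."""
    return [name for name, url in endpoints.items() if not probe(url)]


# Placeholder endpoints (hypothetical, not real CLIMADA data sources)
endpoints = {
    "service_a": "https://example.org/a",
    "service_b": "https://example.org/b",
}


def fake_probe(url):
    # Offline stand-in for probe_url: pretend only service_a is reachable.
    return url.endswith("/a")


print(failing_services(endpoints, probe=fake_probe))
```

Passing the probe as a parameter keeps the reporting logic testable even when the real services are unreachable.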

Manual download

As indicated in the software and tutorials, other data might need to be downloaded manually by the user. The following table lists these data sources, the version used, their current availability and where they are used within CLIMADA:

| Availability | Name | Version | Link | CLIMADA class | CLIMADA version | CLIMADA tutorial reference |
|---|---|---|---|---|---|---|
| OK | Fire Information for Resource Management System | | FIRMS (https://firms.modaps.eosdis.nasa.gov/download/) | BushFire | > v1.2.5 | climada_hazard_BushFire.ipynb |
| OK | Gridded Population of the World (GPW) | v4.11 | GPW4.11 | LitPop | > v1.2.3 | climada_entity_LitPop.ipynb |
| FAILED | Gridded Population of the World (GPW) | v4.10 | GPW1.10 | LitPop | >= v1.2.0 | climada_entity_LitPop.ipynb |

Side Note on Parameters

Don’t use *args and **kwargs parameters without a very good reason.

There are valid use cases for this kind of parameter notation.
In particular, *args comes in handy when an unknown number of arguments of the same type is passed, as, e.g., in the pathlib.Path constructor.
But if the parameters are expected to have any kind of structure, it is just a bad idea.
[4]:
def f(x, y, z):
    return x + y + z

# bad in most cases
def g(*args, **kwargs):
    x = args[0]
    y = kwargs['y']
    s = f(*args, **kwargs)
    print(x, y, s)

g(1,y=2,z=3)
1 2 6
[ ]:
# usually just fine
def g(x, y, z):
    s = f(x, y, z)
    print(x, y, s)

g(1,y=2,z=3)
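
For contrast, the valid use case mentioned above looks like this: all positional arguments play the same role, so there is no hidden structure to unpack. The sketch uses pathlib.PurePosixPath (so that the output does not depend on the operating system) and a hypothetical helper total_sketch.

```python
from pathlib import PurePosixPath


def total_sketch(*values):
    """Sum any number of numeric arguments (hypothetical helper)."""
    return sum(values)


# Any number of path segments, all of the same kind
print(PurePosixPath("data", "hazard", "tracks.h5"))
print(total_sketch(1, 2, 3))
```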

Decrease the number of parameters.

Though CLIMADA’s pylint configuration .pylintrc allows 7 arguments for any method or function before it complains, it is advisable to aim for fewer. It is quite likely that a function with so many parameters has an inherent design flaw.

There are very well designed command line tools with innumerable optional arguments, e.g., rsync, but these are command line tools. There are also methods like pandas.DataFrame.plot() with countless optional arguments, and there it makes perfect sense.

But within the climada package it probably doesn’t. Divide et impera!

Whenever a method has more than 5 parameters, it is more than likely that it can be refactored fairly easily into two or more methods with fewer parameters and less complexity:

[2]:
def f(a, b, c, d, e, f, g, h):
    print(f'f does many things with a lot of arguments: {a, b, c, d, e, f, g, h}')
    return sum([a, b, c, d, e, f, g, h])

f(1, 2, 3, 4, 5, 6, 7, 8)
f does many things with a lot of arguments: (1, 2, 3, 4, 5, 6, 7, 8)
[2]:
36
[3]:
def f1(a, b, c, d):
    print(f'f1 does fewer things with fewer arguments: {a, b, c, d}')
    return sum([a, b, c, d])

def f2(e, f, g, h):
    print(f'f2 ditto: {e, f, g, h}')
    return sum([e, f, g, h])

def f3(x, y):
    print(f'f3 ditto, but on a higher level: {x, y}')
    return sum([x, y])

f3(f1(1, 2, 3, 4), f2(5, 6, 7, 8))
f1 does fewer things with fewer arguments: (1, 2, 3, 4)
f2 ditto: (5, 6, 7, 8)
f3 ditto, but on a higher level: (10, 26)
[3]:
36
This, of course, makes the case on a strictly formal level only: no real complexity was reduced in the making of this example.
Nevertheless, each smaller function needs fewer test cases, and in real code the split usually reduces actual complexity as well.
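
Another option, not described in the text above and offered here only as a sketch, is to bundle parameters that belong together into a single object. The class GridSpec and the function n_cells are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GridSpec:
    """Hypothetical bundle of parameters that belong together."""
    lat_min: float
    lat_max: float
    lon_min: float
    lon_max: float
    resolution: float


def n_cells(grid):
    """Number of grid cells covered by `grid` (illustrative only)."""
    n_lat = round((grid.lat_max - grid.lat_min) / grid.resolution)
    n_lon = round((grid.lon_max - grid.lon_min) / grid.resolution)
    return n_lat * n_lon


grid = GridSpec(lat_min=0.0, lat_max=10.0, lon_min=0.0, lon_max=20.0, resolution=0.5)
print(n_cells(grid))
```

The function now takes one argument instead of five, and the grouped values can be validated and documented in one place.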