Testing and Continuous Integration

Content

  1. Testing CLIMADA

  2. Notes on Testing

    1. Basic Test Procedure

    2. Testing Types

    3. Unit Tests

    4. Integration Tests

    5. System Tests

    6. Error Messages

    7. Dealing with External Resources

    8. Test Configuration

  3. Continuous Integration

    1. Automated Tests

    2. Test Coverage

    3. Static Code Analysis

    4. Jenkins Projects Overview

1. Testing CLIMADA

  • Installation Test

From the installation directory run make install_test. It lasts about 45 seconds. If it succeeds, CLIMADA is properly installed and ready to use.

  • Unit Tests

From the installation directory run make unit_test. It lasts about 5 minutes and runs unit tests for all modules.

  • Integration Tests

From the installation directory run make integ_test. It lasts about 45 minutes and runs extensive integration tests, during which data from external resources is also read. An open internet connection is required for a successful test run.

2. Notes on Testing

Any programming code that is meant to be used more than once should have a test, i.e., an additional piece of programming code that is able to check whether the original code is doing what it’s supposed to do.

Writing tests is work. As a matter of fact, it can be a lot of work; depending on the program, often more than writing the original code.
Luckily, it essentially always follows the same basic procedure, and there are a lot of tools and frameworks available to facilitate this work.

In CLIMADA we use the Python built-in test runner unittest for execution of the tests and the Jenkins framework for continuous integration, i.e., automated test execution and code analysis.

Why do we write tests?

  • The code is most certainly buggy if it’s not properly tested.

  • Software without tests is worthless. It won’t be trusted and therefore it won’t be used.

When do we write tests?

  • Before implementation. A very good idea. It is called Test Driven Development. (A minimal sketch follows after this list.)

  • During implementation. Test routines can be used to run code even while it’s not fully implemented. This is better than running it interactively, because the full context is set up by the test.
    By command line: python -m unittest climada.x.test_y.TestY.test_z
    Interactively: climada.x.test_y.TestY().test_z()

  • Right after implementation. In case the coverage analysis shows that there are missing tests, see Test Coverage.

  • Later, when a bug was encountered. Whenever a bug gets fixed, also the tests need to be adapted or amended.
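
As a minimal illustration of the test-first idea (all names below are hypothetical and not part of CLIMADA), the test class defines the expected behavior and would be written first, failing until the function is implemented correctly:

import unittest

def saffir_simpson_category(wind_kn):
    # hypothetical function under test: Saffir-Simpson category (0-5)
    # derived from a wind speed in knots
    thresholds = [64, 83, 96, 113, 137]
    return sum(wind_kn >= t for t in thresholds)

class TestCategory(unittest.TestCase):
    # in test driven development this class is written first, initially fails,
    # and drives the implementation of saffir_simpson_category
    def test_category(self):
        self.assertEqual(saffir_simpson_category(50), 0)   # below hurricane strength
        self.assertEqual(saffir_simpson_category(90), 2)   # category 2
        self.assertEqual(saffir_simpson_category(150), 5)  # category 5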

2.A. Basic Test Procedure

  • Test data setup: Creating suitable test data is crucial, but not always trivial. It should be extensive enough to cover all functional requirements and yet as small as possible in order to save resources, both in space and time.

  • Code execution: The main goal of a test is to find bugs before the user encounters them. Ultimately every single line of the program should be subject to test. In order to achieve this, it is necessary to run the code with respect to the whole parameter space. In practice that means that even a simple method may require a lot of test code. (Bear this in mind when designing methods or functions: the number of required tests increases dramatically with the number of function parameters!)

  • Result validation: After the code has been executed, the actual result is compared to the expected result. The expected result depends on test data, state and parametrization, so result validation can be very extensive. In most cases it will be neither practical nor required to validate every single byte. Nevertheless, attention should be paid to validating a range of results wide enough to discover as many thinkable discrepancies as possible. (The sketch after this list illustrates all three steps.)
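
The three steps map one-to-one onto the structure of a typical test method. A minimal sketch (the interpolation function is hypothetical, not a CLIMADA method):

import unittest

def interpolate(xs, ys, x):
    # hypothetical function under test: linear interpolation between two points
    return ys[0] + (ys[1] - ys[0]) * (x - xs[0]) / (xs[1] - xs[0])

class TestInterpolate(unittest.TestCase):
    def test_interpolate_midpoint(self):
        # 1. test data setup: as small as possible, yet covering the requirement
        xs, ys = [0.0, 2.0], [10.0, 20.0]
        # 2. code execution
        result = interpolate(xs, ys, 1.0)
        # 3. result validation: compare the actual against the expected result
        self.assertAlmostEqual(result, 15.0)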

2.B. Testing Types

Despite the common basic procedure, many different kinds of tests are distinguished (see Wikipedia: Software testing). Very commonly, a distinction is made based on levels:
  • Unit Test: tests only a small part of the code, a single function or method, essentially without interaction between modules
  • Integration Test: tests whether different methods and modules work well with each other
  • System Test: tests the whole software at once, using the exposed interface to execute a program

2.C. Unit Tests

Unit tests are meant to check the correctness of program units, i.e., single methods or functions. They are supposed to be fast, simple and easy to write.
  • Each module in CLIMADA has a counterpart containing unit tests.
    Naming suggestion: climada.x.y → climada.x.test.test_y
  • Write a test class for each class of the module, plus a test class for the module itself in case it contains (module) functions.
    Naming suggestion: class X → class TestX(unittest.TestCase), module climada.x.y → class TestY(unittest.TestCase)
  • Ideally, each method or function should have at least one test method.
    Naming suggestion: def xy() → def test_xy(), def test_xy_suffix1(), def test_xy_suffix2()
    Functions that are created for the sole purpose of structuring the code do not necessarily have their own unit test. (A test module skeleton following these naming suggestions is sketched after this list.)
  • Aim at having very fast unit tests!
    There will be hundreds of unit tests, and in general they are called in corpore and expected to finish within a reasonable amount of time. Less than 10 milliseconds is good; 2 seconds is the maximum acceptable duration.
  • A unit test shouldn’t call more than one climada method or function.
    The motivation to combine more than one method in a test is usually the creation of test data. Try to provide test data by other means: define them on the spot (within the code of the test module) or create a file in a test data directory that can be read during the test. If this is too tedious, at least move the data acquisition part to the constructor of the test class.
  • Do not use external resources in unit tests.
    Methods depending on external resources can be skipped from unit tests. See Dealing with External Resources.
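
Applied to a hypothetical module climada.x.y containing a class X with a method xy() and a module function z(), the suggested layout of the test module would be:

# climada/x/test/test_y.py (hypothetical, following the naming suggestions above)
import unittest

class TestX(unittest.TestCase):
    """Tests for class X of module climada.x.y"""
    def test_xy(self):
        pass  # checks method X.xy() in the default case

    def test_xy_empty_input(self):
        pass  # checks method X.xy() for an edge case

class TestY(unittest.TestCase):
    """Tests for the module functions of climada.x.y"""
    def test_z(self):
        pass  # checks module function z()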

2.D. Integration Tests

Integration tests are meant to check the correctness of interaction between units of a module or a package.
As a general rule, more work is required to write integration tests than unit tests, and they have a longer runtime.
  • Write integration tests for all intended use cases.

  • Do not expect external resources to be immutable. If calling on external resources is part of the workflow to be tested, take into account that they may change over time.
    If the according API has means to indicate the precise version of the requested data, make use of it; otherwise, adapt your expectations and leave room for future changes.
    For example: your function ultimately relies on the current GDP retrieved from an online data provider, and you test it for Switzerland, where it is currently about 700 Bio CHF. Leave room for future development and try to be on a reasonably safe side: tolerate a range between 70 Bio CHF and 7000 Bio CHF (see the sketch after this list).
  • Test location: Integration tests are written in modules climada.test.test_xy or in climada.x.test.test_y, like the unit tests.
    For the latter it is required that they do not use external resources and that their runtime does not exceed 2 seconds.
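
For the GDP example above, the tolerant validation could look like this; gdp_of is a hypothetical stand-in for whatever function ultimately queries the online provider:

import unittest

def gdp_of(country):
    # hypothetical stand-in for a function retrieving current GDP from an online provider
    return 700e9  # roughly 700 Bio CHF for 'CHE' at the time of writing

class TestGDP(unittest.TestCase):
    def test_gdp_switzerland(self):
        gdp = gdp_of('CHE')
        # the online value changes over time, so assert a generous range
        # (70 to 7000 Bio CHF) instead of an exact figure
        self.assertGreater(gdp, 70e9)
        self.assertLess(gdp, 7000e9)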

2.E. System Tests

System tests are meant to check whether the whole software package is working correctly.

In CLIMADA, the system test that checks the core functionality of the package is executed by calling make install_test from the installation directory.

2.F. Error Messages

When a test fails, make sure the raised exception contains all information that might be helpful to identify the exact problem.
If the error message is ever going to be read by someone other than you while still developing the test, you’d best assume it will be someone who is completely naive about CLIMADA.

Writing extensive failure messages will eventually save more time than it takes to write them.

Putting the failure information into logs is neither required nor sufficient: the automated tests are built around error messages, not logs.
Anything written to stdout by a test method is useful mainly for the developer of the test.
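
One way to provide such information is the optional msg argument that the assert methods of unittest.TestCase accept; it is included in the failure report. A sketch (the scenario is made up, and the test fails on purpose to show where the message ends up):

import unittest

class TestShape(unittest.TestCase):
    def test_shape(self):
        expected, actual = (100, 20), (100, 19)
        # the msg argument is shown in the failure report and should point
        # a naive reader to the probable cause of the problem
        self.assertEqual(actual, expected,
                         msg=f"intensity matrix has shape {actual}, expected {expected};"
                             " perhaps the centroids file was not read completely")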

2.G. Dealing with External Resources

Methods depending on external resources (e.g., methods calling a URL or querying a database) are ideally atomic, doing nothing other than providing data. If this is the case, they can be skipped in unit tests on safe grounds, provided they are tested at some point in higher level tests.

In CLIMADA there are the utility functions climada.util.files_handler.download_file and climada.util.files_handler.download_ftp, which are assigned exactly this task for the case of external data being available as files.

Any other method that calls such a data providing method can be made compliant with unit test rules by having an option to replace it with another method. This way, one can write a dummy method in the test module that provides data, e.g., from a file or hard coded, and pass it as the optional argument:

[7]:
from pathlib import Path

import climada

def x(download_file=climada.util.files_handler.download_file):
    # the data providing function is an optional argument and can be replaced in tests
    filepath = download_file('http://real_data.ch')
    return Path(filepath).stat().st_size

import unittest

class TestX(unittest.TestCase):
    def download_file_dummy(self, url):
        # dummy replacement: returns the path of a local file instead of downloading anything
        return "phony_data.ch"

    def test_x(self):
        # no download takes place; the test assumes phony_data.ch exists locally with 44 bytes
        self.assertEqual(44, x(download_file=self.download_file_dummy))
  • When introducing a new external resource, add a test method in test_data_api.py.

2.H. Test Configuration

Use the configuration file climada.config in the installation directory to define file paths and external resources used during tests (see the Constants and Configuration Guide).
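
By way of illustration only (the exact access pattern and entry names are described in the Constants and Configuration Guide; the entry name below is an assumption), a test would then read such paths from the configuration instead of hard-coding them:

# a minimal sketch, assuming climada exposes the parsed configuration as CONFIG
# and that climada.config defines a (hypothetical) test_data entry
from climada import CONFIG

def test_file_path():
    # resolve the test data directory from the configuration rather than hard-coding it
    return CONFIG.test_data.dir() / 'my_test_data.h5'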

3. Continuous Integration

The CLIMADA Jenkins server used for continuous integration can be found at https://ied-wcr-jenkins.ethz.ch.

3.A. Automated Tests

On Jenkins tests are executed and analyzed automatically, in an unbiased environment. The results are stored and can be compared with previous test runs.
Jenkins has a GUI for monitoring individual tests, full test runs and test result trends.
Developers are requested to watch it: at first when they push commits to the code repository, but also later on, when other changes in data or sources may make it necessary to review and refactor code that once passed all tests.

Developer guidelines:

  • All tests must pass before submitting a pull request.

  • Integration tests don’t run on feature branches in Jenkins, therefore developers are requested to run them locally.

  • After a pull request was accepted and the changes are merged to the develop branch, integration tests may still fail there and have to be addressed.

3.B. Test Coverage

Jenkins also has an interface for exploring code coverage analysis results.
This shows which part of the code has never been run in any test, by module, by function/method and even by single line of code.

Ultimately, every single line of code should be tested.

  • Make sure the coverage of novel code is at 100% before submitting a pull request.

Be aware that a high code coverage alone does not guarantee that all required tests have been written!
The following artificial example would have 100% coverage and still obviously misses a test for y(False).
[27]:
def x(b:bool):
    if b:
        print('been here')
        return 4
    else:
        print('been there')
        return 0

def y(b:bool):
    print('been everywhere')
    return 1/x(b)


import unittest
class TestXY(unittest.TestCase):
    def test_x(self):
        self.assertEqual(x(True), 4)
        self.assertEqual(x(False), 0)

    def test_y(self):
        self.assertEqual(y(True), 0.25)

unittest.TextTestRunner().run(unittest.TestLoader().loadTestsFromTestCase(TestXY));
..
been here
been there
been everywhere
been here

----------------------------------------------------------------------
Ran 2 tests in 0.003s

OK
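
The missing test could, for instance, look like this (run in the same session as the cell above): since x(False) returns 0, y(False) performs a division by zero, so the expected behavior is the corresponding exception.

class TestYFalse(unittest.TestCase):
    def test_y_false(self):
        # y(False) divides by x(False) == 0, hence an exception is the expected result
        with self.assertRaises(ZeroDivisionError):
            y(False)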

3.C. Static Code Analysis

Finally, Jenkins provides an elaborate GUI for pylint findings, which is especially useful when working in feature branches.

Observe it!

  • High Priority Warnings are as severe as test failures and must be addressed at once.

  • Do not introduce new Medium Priority Warnings.

  • Try to avoid introducing Low Priority Warnings, in any case their total number should not increase.

3.D. Jenkins Projects Overview

  • climada_install_env

    Branch: develop
    Runs every day at 1:30AM CET

    • creates conda environment from scratch

    • runs core functionality system test (make install_test)

  • climada_ci_night

    Branch: develop
    Runs when climada_install_env has finished successfully

    • runs all test modules

    • runs static code analysis

  • climada_branches

    Branch: any
    Runs when a commit is pushed to the repository

    • runs all test modules outside of climada.test

    • runs static code analysis

  • climada_data_api

    Branch: develop
    Runs every day at 0:20AM CET

    • tests availability of external data APIs

  • climada_notebooks

    Branch: develop
    No automated running

    • tests executability of CLIMADA tutorial notebooks.