Testing and Continuous Integration#
From the installation directory run
It lasts about 45 seconds. If it succeeds, CLIMADA is properly installed and ready to use.
From the installation directory run
It lasts about 5 minutes and runs unit tests for all modules.
From the installation directory run
It lasts about 15 minutes and runs extensive integration tests, during which data is also read from external resources. An open internet connection is required for a successful test run.
Notes on Testing#
Any programming code that is meant to be used more than once should have a test, i.e., an additional piece of programming code that is able to check whether the original code is doing what it’s supposed to do.
Writing tests is work. As a matter of fact, it can be a lot of work: depending on the program, often more than writing the original code.
Luckily, it essentially always follows the same basic procedure, and there are a lot of tools and frameworks available to facilitate this work.
Why do we write tests?
The code is most certainly buggy if it’s not properly tested.
Software without tests is worthless. It won’t be trusted and therefore it won’t be used.
When do we write tests?
Before implementation. A very good idea. It is called Test Driven Development.
During implementation. Test routines can be used to run code even while it’s not fully implemented. This is better than running it interactively, because the full context is set up by the test.
By command line:
python -m unittest climada.x.test_y.TestY.test_z
Right after implementation. In case the coverage analysis shows that there are missing tests, see Test Coverage.
Later, when a bug has been encountered. Whenever a bug gets fixed, the tests need to be adapted or amended as well.
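As a minimal sketch of amending the tests together with a bug fix: the function `mean_intensity` and its bug history are hypothetical and only serve as an illustration. The second test method is the one that would be added along with the fix.

```python
import unittest


def mean_intensity(values):
    """Hypothetical helper: average of a list of intensities.

    Assumed bug history: an earlier version raised ZeroDivisionError
    for an empty list; the fixed version returns 0.0 instead.
    """
    if not values:
        return 0.0
    return sum(values) / len(values)


class TestMeanIntensity(unittest.TestCase):
    def test_regular_input(self):
        self.assertAlmostEqual(mean_intensity([1.0, 2.0, 3.0]), 2.0)

    def test_empty_input_regression(self):
        # Added together with the bug fix, so the bug cannot silently return.
        self.assertEqual(mean_intensity([]), 0.0)
```

Keeping such a regression test in the suite documents the bug and guards against reintroducing it.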
Basic Test Procedure#
Test data setup
Creating suitable test data is crucial, but not always trivial. It should be extensive enough to cover all functional requirements and yet as small as possible in order to save resources, both in space and time.
The main goal of a test is to find bugs before the user encounters them. Ultimately every single line of the program should be subject to test.
In order to achieve this, it is necessary to run the code with respect to the whole parameter space. In practice that means that even a simple method may require a lot of test code.
(Bear this in mind when designing methods or functions: the number of required tests increases dramatically with the number of function parameters!)
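To illustrate how quickly the parameter space grows, here is a small sketch using `subTest` from the standard `unittest` module. The function `scale` is hypothetical. With three values, two factors and two flags, the single test method already checks 12 combinations.

```python
import itertools
import unittest


def scale(value, factor=1.0, clip=False):
    """Hypothetical function under test with three parameters."""
    result = value * factor
    return max(result, 0.0) if clip else result


class TestScale(unittest.TestCase):
    def test_parameter_combinations(self):
        values = [-2.0, 0.0, 3.0]
        factors = [0.5, 1.0]
        clips = [False, True]
        # 3 * 2 * 2 = 12 combinations for only three parameters
        for value, factor, clip in itertools.product(values, factors, clips):
            # subTest reports each failing combination separately
            with self.subTest(value=value, factor=factor, clip=clip):
                expected = value * factor
                if clip and expected < 0:
                    expected = 0.0
                self.assertEqual(scale(value, factor, clip), expected)
```

Each additional parameter multiplies the number of combinations, which is one reason to keep function signatures small.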
After the code has been executed, the actual result is compared to the expected result. The expected result depends on test data, state and parametrization.
Therefore result validation can be very extensive. In most cases it will be neither practical nor required to validate every single byte. Nevertheless, attention should be paid to validating a range of results that is wide enough to discover as many conceivable discrepancies as possible.
Despite the common basic procedure, many different kinds of tests are distinguished (see Wikipedia: Software testing). Very commonly, a distinction is made based on levels:
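The procedure described above (set up test data, execute the code, validate the result) could look roughly like this in a `unittest` test method. The function `cumulative_sum` is a hypothetical stand-in for the code under test.

```python
import unittest


def cumulative_sum(values):
    """Hypothetical function under test: running total of a sequence."""
    total = 0.0
    result = []
    for value in values:
        total += value
        result.append(total)
    return result


class TestCumulativeSum(unittest.TestCase):
    def test_cumulative_sum(self):
        # 1. test data setup: small, but with negative and fractional values
        data = [1.0, -0.5, 2.25]

        # 2. code execution
        result = cumulative_sum(data)

        # 3. result validation: check several properties, not just one number
        self.assertEqual(len(result), len(data))
        self.assertAlmostEqual(result[0], 1.0)
        self.assertAlmostEqual(result[-1], sum(data))
        self.assertEqual(result, [1.0, 0.5, 2.75])
```

Checking length, first element, last element and the full list is redundant here, but on larger results a handful of such partial checks is often the practical compromise.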
Unit Test: tests only a small part of the code, a single function or method, essentially without interaction between modules
Integration Test: tests whether different methods and modules work well with each other
System Test: tests the whole software at once, using the exposed interface to execute a program
Unit tests are meant to check the correctness of program units, i.e., single methods or functions. They are supposed to be fast, simple and easy to write.
Each module in CLIMADA has a counterpart containing unit tests.
Write a test class for each class of the module, plus a test class for the module itself in case it contains (module) functions.
class TestX(unittest.TestCase), module
Ideally, each method or function should have at least one test method.
Functions that are created for the sole purpose of structuring the code do not necessarily have their own unit test.
Aim at having very fast unit tests!
There will be hundreds of unit tests, and in general they are run all together and expected to finish within a reasonable amount of time.
Less than 10 milliseconds is good; 2 seconds is the maximum acceptable duration.
A unit test shouldn’t call more than one CLIMADA method or function.
The motivation to combine more than one method in a test is usually creation of test data. Try to provide test data by other means. Define them on the spot (within the code of the test module) or create a file in a test data directory that can be read during the test. If this is too tedious, at least move the data acquisition part to the constructor of the test class.
Do not use external resources in unit tests.
Methods depending on external resources can be skipped from unit tests. See Dealing with External Resources.
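The advice above on providing test data without calling further methods can be sketched as follows. This sketch assumes that `setUpClass` is an acceptable substitute for the constructor of the test class mentioned above, since it is the hook `unittest` offers for one-time setup. The function `total_value` and the data are hypothetical.

```python
import unittest


def total_value(exposures):
    """Hypothetical function under test: sum of the 'value' entries."""
    return sum(item["value"] for item in exposures)


class TestTotalValue(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        # Test data defined on the spot, created once for all test methods.
        cls.exposures = [
            {"region": "A", "value": 10.0},
            {"region": "B", "value": 32.0},
        ]

    def test_total_value(self):
        self.assertEqual(total_value(self.exposures), 42.0)

    def test_empty(self):
        self.assertEqual(total_value([]), 0)
```

Each test method calls only the function under test; no other function is needed to produce the input.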
Integration tests are meant to check the correctness of interaction between units of a module or a package.
As a general rule, more work is required to write integration tests than to write unit tests and they have longer runtime.
Write integration tests for all intended use cases.
Do not expect external resources to be immutable.
If calling on external resources is part of the workflow to be tested, take into account that they may change over time.
If the corresponding API provides a means to specify the precise version of the requested data, make use of it; otherwise, adapt your expectations and leave room for future changes.
For example: your function ultimately relies on the current GDP retrieved from an online data provider, and you test it for Switzerland, where it is about 700 billion CHF at the moment. Leave room for future development and try to be on a reasonably safe side: tolerate a range between 70 billion CHF and 7000 billion CHF.
Integration tests are written in modules
climada.x.test.test_y, like the unit tests.
For the latter it is required that they do not use external resources and that the tests do not have a runtime longer than 2 seconds.
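The GDP example above might translate into a range check like the following sketch. The function `get_current_gdp` is hypothetical and returns a fixed number here so that the sketch runs offline; in a real integration test it would query the data provider.

```python
import unittest


def get_current_gdp(country):
    """Hypothetical stand-in for a function that queries an online provider.

    Returns a fixed value in CHF so that this sketch runs without network.
    """
    return {"CHE": 7.0e11}[country]


class TestGdp(unittest.TestCase):
    def test_gdp_switzerland_plausible(self):
        gdp = get_current_gdp("CHE")
        # Tolerate one order of magnitude in either direction,
        # because the provider's figure changes over time.
        self.assertGreater(gdp, 7.0e10)
        self.assertLess(gdp, 7.0e12)
```

A test like this still catches unit mix-ups or broken parsing, without failing every time the provider updates its figures.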
System tests are meant to check whether the whole software package is working correctly.
In CLIMADA, the system test that checks the core functionality of the package is executed by calling
make install_test from the installation directory.
When a test fails, make sure the raised exception contains all information that might be helpful to identify the exact problem.
If the error message is ever going to be read by anyone other than you while you are developing the test, assume the reader knows nothing about CLIMADA.
Writing extensive failure messages will eventually save more time than it takes to write them.
Putting the failure information into logs is neither required nor sufficient: the automated tests are built around error messages, not logs.
Anything written to
stdout by a test method is useful mainly for the developer of the test.
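One way to attach such information is the `msg` argument that `unittest` assertion methods accept. The following is only a sketch; the file name and the data are hypothetical.

```python
import unittest


class TestFailureMessage(unittest.TestCase):
    def test_event_count(self):
        source_file = "test_data/events.csv"  # hypothetical file name
        expected_count = 3
        events = ["ev_1", "ev_2", "ev_3"]  # stands in for data read from file
        self.assertEqual(
            len(events),
            expected_count,
            msg=(
                f"Expected {expected_count} events from '{source_file}', "
                f"got {len(events)}: {events}. "
                "Check whether the test data file was modified."
            ),
        )
```

By default, `unittest` appends the custom message to its standard failure message, so the compared values remain visible in the report as well.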
Dealing with External Resources#
Methods depending on external resources (e.g., calling a URL or a database) are ideally atomic and do nothing other than provide data. If this is the case, they can safely be skipped in unit tests, provided they are tested at some point in higher level tests.
In CLIMADA there are the utility functions
climada.util.files_handler.download_file and climada.util.files_handler.download_ftp, which are assigned to exactly this task for the case of external data being available as files.
Any other method that calls such a data providing method can be made compliant with unit test rules by having an option to replace it with another method. This way, one can write a dummy method in the test module that provides data, e.g., from a file or hard coded, which can be given as the optional argument.
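from pathlib import Path
import unittest

import climada.util.files_handler

def x(download_file=climada.util.files_handler.download_file):
    # the download function is an optional argument, so tests can replace it
    filepath = download_file('http://real_data.ch')
    return Path(filepath).stat().st_size

class TestX(unittest.TestCase):
    @staticmethod
    def download_file_dummy(url):
        # stand-in for the real download: returns the path of a local test file
        return "phony_data.ch"

    def test_x(self):
        self.assertEqual(44, x(download_file=self.download_file_dummy))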
When introducing a new external resource, add a test method in
Use the configuration file
climada.config in the installation directory to define file paths and external resources used during tests (see the Constants and Configuration Guide).
The CLIMADA Jenkins server used for continuous integration is at https://ied-wcr-jenkins.ethz.ch.
On Jenkins tests are executed and analyzed automatically, in an unbiased environment. The results are stored and can be compared with previous test runs.
Jenkins has a GUI for monitoring individual tests, full test runs and test result trends.
Developers are requested to watch it, at first when they push commits to the code repository, but also later on, when changes in data or sources may make it necessary to review and refactor code that once passed all tests.
All tests must pass before submitting a pull request.
Integration tests don’t run on feature branches in Jenkins, therefore developers are requested to run them locally.
After a pull request has been accepted and the changes have been merged to the develop branch, integration tests may still fail there and have to be addressed.
We adopted test automation via GitHub Actions in an experimental state. See GitHub Actions CI for details.
Jenkins also has an interface for exploring code coverage analysis results.
This shows which part of the code has never been run in any test, by module, by function/method and even by single line of code.
Ultimately every single line of code should be tested.
Make sure the coverage of novel code is at 100% before submitting a pull request.
Be aware that high code coverage alone does not guarantee that all required tests have been written!
The following artificial example would have a 100% coverage and still obviously misses a test for
def x(b:bool):
    if b:
        print('been here')
        return 4
    else:
        print('been there')
        return 0

def y(b:bool):
    print('been everywhere')
    return 1/x(b)

import unittest
class TestXY(unittest.TestCase):
    def test_x(self):
        self.assertEqual(x(True), 4)
        self.assertEqual(x(False), 0)

    def test_y(self):
        self.assertEqual(y(True), 0.25)

unittest.TextTestRunner().run(unittest.TestLoader().loadTestsFromTestCase(TestXY));

been here
been there
been everywhere
been here

----------------------------------------------------------------------
Ran 2 tests in 0.003s

OK
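In the example above, `y` is only ever called with `True`, so the path through `y` that reaches `x`'s `else` branch is never exercised, although every line is counted as covered. A test along the following lines would reveal what happens in that case. It is only a sketch: `x` and `y` are repeated here without the print statements so that the block is self-contained, and whether raising an exception is the desired behavior is a design decision this sketch does not answer.

```python
import unittest


def x(b: bool):
    return 4 if b else 0


def y(b: bool):
    return 1 / x(b)


class TestYMissingCase(unittest.TestCase):
    def test_y_false(self):
        # x(False) returns 0, so y(False) divides by zero
        with self.assertRaises(ZeroDivisionError):
            y(False)
```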
Static Code Analysis#
Finally, Jenkins provides an elaborate GUI for pylint findings, which is especially useful when working in feature branches.
High Priority Warnings are as severe as test failures and must be addressed at once.
Do not introduce new Medium Priority Warnings.
Try to avoid introducing Low Priority Warnings; in any case, their total number should not increase.
Jenkins Projects Overview#
Runs every day at 1:30AM CET
creates conda environment from scratch
runs core functionality system test (
Runs when climada_install_env has finished successfully
runs all test modules
runs static code analysis
Runs when a commit is pushed to the repository
runs all test modules outside of climada.test
runs static code analysis
Runs every day at 0:20AM CET
tests availability of external data APIs
No automated running
tests executability of CLIMADA tutorial notebooks.