Development and Git and CLIMADA¶
Table of Contents
Git and GitHub
Installing CLIMADA for development
Features and branches
Pull Requests
General tips and tricks
Git and GitHub¶
Git’s not that scary
95% of your work on Git will be done with the same handful of commands
(the other 5% will always be done with careful Googling)
Almost everything in Git can be undone by design (but use history-rewriting commands such as rebase and --force with care)
Your favourite IDE (Spyder, PyCharm, …) will have a GUI for working with Git, or you can download a standalone one.
The Git Book is a great introduction to how Git works and to using it on the command line.
Consider using a GUI program such as “GitHub Desktop” or “GitKraken” to have a visual Git interface, particularly at the beginning.
Feel free to ask for help
What I assume you know¶
I’m assuming you’re all familiar with the basics of Git.
What (and why) is version control
How to clone a repository
How to make a commit and push it to GitHub
What a branch is, and how to make one
How to merge two branches
The basics of the GitHub website
If you’re not feeling great about this, I recommend
- sending me a message so we can arrange an introduction with CLIMADA
- exploring the Git Book
Terms we’ll be using today¶
These are terms I’ll be using a lot today, so let’s make sure we know them
local versus remote
Our remote repository is hosted on GitHub. This is the central location where all updates to CLIMADA that we want to share end up. If you’re updating CLIMADA for the community, your code will end up here too.
Your local repository is the copy you have on the machine you’re working on, and where you do your work.
Git calls the (first, default) remote the origin
(It’s possible to set more than one remote repository, e.g. you might set one up on a network-restricted computing cluster)
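The local/remote relationship can be tried out offline. This is a minimal sketch, assuming a throwaway bare repository on disk stands in for GitHub; the directory names and the second remote called cluster are hypothetical.

```shell
set -eu
work=$(mktemp -d)

# A bare repository stands in for the remote hosted on GitHub (assumption).
git init -q --bare "$work/remote.git"

# Cloning creates the local repository and registers the remote as "origin".
git clone -q "$work/remote.git" "$work/local" 2>/dev/null
cd "$work/local"
remote_name=$(git remote)
echo "default remote: $remote_name"

# A second remote can be added under any name, e.g. for a computing cluster.
git init -q --bare "$work/cluster.git"
git remote add cluster "$work/cluster.git"
git remote -v
```

The last command lists every configured remote with the URLs used for fetching and pushing.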
push, pull and pull request
You push your work when you send it from your local machine to the remote repository
You pull from the remote repository to update the code on your local machine
A pull request is a standardised review process on GitHub. Usually it ends with one branch merging into another
Conflict resolution
Sometimes two people have made changes to the same bit of code. Usually this comes up when you’re trying to merge branches. The changes have to be manually compared and the code edited to make sure the ‘correct’ version of the code is kept.
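To see what a conflict looks like in practice, here is a small sketch in a temporary repository. The file settings.txt and the branch feature/a are made up for the illustration, and the resolution shown (keeping the value from feature/a) is only one possible choice.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "threshold = 1" > settings.txt
git add settings.txt
git commit -q -m "Add settings file"

# Two branches change the same line in different ways.
git checkout -q -b feature/a
echo "threshold = 2" > settings.txt
git commit -q -am "Raise threshold to 2"

git checkout -q develop
echo "threshold = 3" > settings.txt
git commit -q -am "Raise threshold to 3"

# The merge stops and reports a conflict instead of guessing.
if git merge feature/a >/dev/null 2>&1; then
    merge_result="clean"
else
    merge_result="conflict"
fi
conflicted_files=$(git diff --name-only --diff-filter=U)
echo "merge: $merge_result, files: $conflicted_files"

# Resolve by editing the file to the version you want, then stage and commit.
echo "threshold = 2" > settings.txt
git add settings.txt
git commit -q -m "Merge feature/a, keep threshold of 2"
```

In real work you would open the conflicted file, compare the two versions between the conflict markers, and edit by hand or with your IDE's merge tool rather than overwriting the file.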
Gitflow is a particular way of using git to organise projects that have
- multiple developers
- working on different features
- with a release cycle
It means that
- there’s always a stable version of the code available to the public
- the chances of two developers’ code conflicting are reduced
- the process of adding and reviewing features and fixes is more standardised for everyone
Gitflow is a convention, so you don’t need any additional software.
- … but if you want you can get some: a popular extension to the git command line tool allows you to issue more intuitive commands for a Gitflow workflow.
- Mac/Linux users can install git-flow from their package manager, and it’s included with Git for Windows
Gitflow works on the develop branch instead of main¶
The critical difference between Gitflow and ‘standard’ git is that almost all of your work takes place on the develop branch, instead of the main branch.
The main branch is reserved for planned, stable product releases, and it’s what the general public download when they install CLIMADA. The developers almost never interact with it.
Gitflow is a feature-based workflow¶
This is common to many workflows: when you want to add something new to the model you start a new branch, work on it locally, and then merge it back into develop with a pull request (which we’ll cover later).
By convention we name all CLIMADA feature branches feature/*
Features can be anything, from entire hazard modules to a smarter way to do one line of a calculation. Most of the work you’ll do on CLIMADA will be features of one size or another.
We’ll talk more about developing CLIMADA features later!
Gitflow enables a regular release cycle¶
A release is usually more complex than merging develop into main.
So for this a release-* branch is created from develop. We’ll all be notified repeatedly of the deadlines to submit (and then to review) pull requests so that your work can be included in a release.
The core developer team (mostly Emanuel) will then make sure tests, bugfixes, documentation and compatibility requirements are met, merging any fixes back into develop.
On release day, the release branch is merged into main, the commit is tagged as a release and the release notes are published on GitHub at https://github.com/CLIMADA-project/climada_python/releases
Everything else is hotfixes¶
The other type of branch you’ll create is a hotfix.
Hotfixes are generally small changes to code that do one thing, fixing typos, small bugs, or updating docstrings. They’re done in much the same way as features, and are usually merged with a pull request.
The difference between features and hotfixes is fuzzy and you don’t need to worry about getting it right.
Hotfixes will occasionally be used to fix bugs on the main branch, in which case they will merge into both main and develop.
Some hotfixes are so simple - e.g. fixing a typo or a docstring - that they don’t need a pull request. Use your judgement, but as a rule, if you change what the code does, or how, you should be merging with a pull request.
Installing CLIMADA for development¶
Clone (or fork) the project on GitHub
From the location where you want to create the project folder, run in your terminal:
git clone https://github.com/CLIMADA-project/climada_python.git
Install the packages in climada_python/requirements/env_developer.yml (see install). You might need to install additional environments contained in climada_python/requirements when using specific functionalities.
Features and branches¶
Planning a new feature¶
Here we’re talking about large features such as new modules, new data sources, or big methodological changes. Any extension to CLIMADA that might affect other developers’ work, modify the CLIMADA core, or need a big code review.
Smaller feature branches don’t need such formalities. Use your judgment, and if in doubt, let people know.
Talk to the group¶
Before starting coding a module, do not forget to coordinate with one of the repo admins (Emanuel, Chahan or David)
This is the chance to work out the Big Picture stuff that is better when it’s planned with the group - possible intersections with other projects, possible conflicts, changes to the CLIMADA core, additional dependencies (see Chahan’s presentation later)
Also talk with others from the core development team (see the GitHub wiki).
Bring it to a developers meeting - people may be able to help/advise and are always interested in hearing about new projects. You can also find reviewers!
Also, keep talking! Your plans will change :)
Planning the work¶
Does the project go in its own repository and import CLIMADA, or does it extend the main CLIMADA repository?
The way this is done is slowly changing, so definitely discuss it with the group.
Chahan will discuss this later!
Find a few people who will help to review your code.
Ask in a developers’ meeting, on Slack (for WCR developers) or message people on the development team (see the GitHub wiki).
Let them know roughly how much code will be in the reviews, and when you’ll be creating pull requests.
How can the work be split into manageable chunks?
A series of smaller pull requests is far more manageable than one big one (and takes off some of the pre-release pressure)
Reviewing and spotting issues/improvements/generalisations early is always a good thing.
It encourages modularisation of the code: smaller self-contained updates, with documentation and tests.
Will there be any changes to the CLIMADA core?
These should be planned carefully
Will you need any new dependencies? Are you sure?
Chahan will discuss this later!
Working on feature branches¶
When developing a big new feature, consider creating a feature branch and merging smaller branches into that feature branch with pull requests, keeping the whole process separate from develop until it’s completed. This makes step-by-step code review nice and easy, and makes the final merge more easily tracked in the history.
e.g. developing the big feature/meteorite module you might write feature/meteorite-hazard and merge it in, then feature/meteorite-stochastic-events etc… before finally merging feature/meteorite into develop. Each of these could be a reviewable pull request.
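The nested-branch pattern described above can be sketched in a temporary repository. Local merges stand in for the pull requests here (an assumption made so the example runs offline), and the file name is invented for the illustration.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "CLIMADA demo" > README.txt
git add README.txt
git commit -q -m "Initial commit"

# Long-lived branch that collects the whole feature.
git checkout -q -b feature/meteorite

# Smaller branch for one reviewable piece of the feature.
git checkout -q -b feature/meteorite-hazard
echo "hazard code" > meteorite_hazard.txt
git add meteorite_hazard.txt
git commit -q -m "Add meteorite hazard"

# Stand-in for a reviewed pull request into the feature branch.
git checkout -q feature/meteorite
git merge -q --no-ff -m "Merge feature/meteorite-hazard" feature/meteorite-hazard

# Once every piece is in, the completed feature goes into develop.
git checkout -q develop
git merge -q --no-ff -m "Merge feature/meteorite" feature/meteorite
merge_commits=$(git rev-list --count --merges develop)
git log --oneline --graph
```

The graph printed at the end shows each merge as its own commit, which is what makes the steps easy to trace later.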
Make a new branch¶
For new features in Git flow:
git flow feature start feature_name
Which is equivalent to (in vanilla git):
git checkout -b feature/feature_name
Or work on an existing branch (without -b, which would create a new branch):
git checkout branch_name
Follow the Python do’s and don’ts and performance guides. Write small readable methods, classes and functions.¶
get the latest data from the remote repository and update your branch
git pull
see your locally modified files
git status
add changes you want to include in the commit
git add climada/modified_file.py climada/test/test_modified_file.py
commit the changes
git commit -m "new functionality of .. implemented"
Make unit and integration tests on your code, preferably during development¶
Pull Requests¶
We want every line of code that goes into the CLIMADA repository to be reviewed!
Code review:
- catches bugs (there are always bugs)
- lets you draw on the experience of the rest of the team
- makes sure that more than one person knows how your code works
- helps to unify and standardise CLIMADA’s code, so new users find it easier to read and navigate
- creates an archived description and discussion of the changes you’ve made
When to make a pull request¶
When you’ve finished writing a big new class or method (and its tests)
When you’ve fixed a bug or made an improvement you want to merge
When you want to merge a change of code into develop
When you want to discuss a bit of code you’ve been working on - pull requests aren’t only for merging branches
Not all pull requests have to be into develop - you can make a pull request into any active branch that suits you.
Pull requests need to be made at the latest two weeks before a release, see releases.
Step by step pull request!¶
Let’s suppose you’ve developed a cool new module on the feature/meteorite branch and you’re ready to merge it into develop.
Checklist before you start¶
Tutorial (if a complete new feature)
Updated dependencies (if need be)
Added your name to the AUTHORS file
(Advanced, optional) interactively rebase/squash recent commits that aren’t yet on GitHub.
Step by step pull request!¶
Make sure the develop branch is up to date on your own machine
git checkout develop
git pull
Merge develop into your feature branch and resolve any conflicts
git checkout feature/meteorite
git merge develop
In the case of more complex conflicts, you may want to speak with others who worked on the same code. Your IDE should have a tool for conflict resolution.
Check all the tests pass locally
make unit_test
make integ_test
Perform a static code analysis using pylint with CLIMADA’s configuration .pylintrc (in the climada root directory). Jenkins executes it after every push. To do it locally, your IDE probably provides a tool, or you can run make lint and inspect the output.
Push to GitHub. If you’re pushing this branch for the first time, use
git push -u origin feature/meteorite
and if you’re updating a branch that’s already on GitHub:
git push
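The difference between the first push and later pushes can be checked offline. This is a sketch in which a bare repository on disk stands in for GitHub (an assumption) and the file name is made up.

```shell
set -eu
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"

echo "demo" > README.txt
git add README.txt
git commit -q -m "Initial commit"
git push -q -u origin develop

git checkout -q -b feature/meteorite
echo "impact" > meteorite.txt
git add meteorite.txt
git commit -q -m "Add meteorite module"

# First push: -u records origin/feature/meteorite as the upstream branch.
git push -q -u origin feature/meteorite
upstream=$(git rev-parse --abbrev-ref --symbolic-full-name "@{u}")
echo "upstream: $upstream"

# Later pushes can omit the remote and branch names.
echo "more impact" >> meteorite.txt
git commit -q -am "Extend meteorite module"
git push -q
remote_tip=$(git ls-remote origin refs/heads/feature/meteorite | cut -f1)
local_tip=$(git rev-parse HEAD)
```

After the second push the remote branch points at the same commit as the local one.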
Check all the tests pass on the WCR Jenkins server (https://ied-wcr-jenkins.ethz.ch). See Emanuel’s presentation for how to do this! You should regularly be pushing your code and checking this!
Create the pull request!
On the CLIMADA GitHub page, navigate to your feature branch (there’s a drop-down menu above the file structure, pointing by default to the repository’s default branch).
Above the file structure is a branch summary and an icon to the right labelled “Pull request”.
Choose which branch you want to merge with. This will usually be develop, but may be another feature branch for more complex feature development.
Give your pull request an informative title (like a commit message).
Write a description of the pull request. This can usually be adapted from your branch’s commit messages (you wrote informative commit messages, didn’t you?), and should give a high-level summary of the changes, specific points you want the reviewers’ input on, and explanations for decisions you’ve made. The code documentation (and any references) should cover the more detailed stuff.
Assign reviewers in the page’s right hand sidebar. Tag anyone who might be interested in reading the code. You should already have found one or two people who are happy to read the whole request and sign it off (they could also be added to ‘Assignees’).
Create the pull request.
Contact the reviewers to let them know the request is live. GitHub’s settings mean that they may not be alerted automatically. Maybe also let people know on the WCR Slack!
Talk with your reviewers
Use the comment/chat functionality within GitHub’s pull requests - it’s useful to have an archive of discussions and the decisions made.
Take comments and suggestions on board, but you don’t need to agree with everything and you don’t need to implement everything.
If you feel someone is asking for too many changes, prioritise, especially if you don’t have time for complex rewrites.
If the suggested changes and/or features don’t block functionality and you don’t have time to fix them, they can be moved to Issues.
Chase people up if they’re slow. People are slow.
Once you implement the requested changes, respond to the comments with the corresponding commit implementing each requested change.
If the review takes a while, remember to merge develop back into the feature branch every now and again (and check the tests are still passing on Jenkins). Anything pushed to the branch is added to the pull request.
Once everyone reviewing has said they’re satisfied with the code you can merge the pull request using the GitHub interface. Delete the branch once it’s merged, there’s no reason to keep it. (Also try not to re-use that branch name later.)
Update the develop branch on your local machine.
How to review a pull request¶
Decide how much time you can spare and the detail you can work in. Tell the author!
Use the comment/chat functionality within GitHub’s pull requests - it’s useful to have an archive of discussions and the decisions made.
Fix the big things first! If there are more important issues, not every style guide has to be stuck to, not every slight increase in speed needs to be pointed out, and test coverage doesn’t have to be 100%.
Make it clear when a change is optional, or is a matter of opinion
At a minimum
- Make sure unit and integration tests are passing on Jenkins
- (For complete modules) Run the tutorial on your local machine and check it does what it says it does
- Check everything is fully documented
At least one reviewer needs to
- Review all the changes in the pull request. Read what it’s supposed to do, check it does that, and make sure the logic is sound.
- Check that the code follows the CLIMADA style guidelines
- If the code is implementing an algorithm it should be referenced in the documentation. Check it’s implemented correctly.
- Try to think of edge cases and ways the code could break. See if there’s appropriate error handling in cases where the function might behave unexpectedly.
- (Optional) suggest easy ways to speed up the code, and more elegant ways to achieve the same goal.
There are a few ways to suggest changes
- As questions and comments on the pull request page
- As code suggestions (max a few lines) in the code review tools on GitHub. The author can then approve and commit the changes from the GitHub pull request page. This is great for typos and little stylistic changes.
- If you decide to help the author with changes, you can either push them to the same branch, or create a new branch and make a pull request with the changes back into the branch you’re reviewing. This lets the author review it and merge.
General tips and tricks¶
Ask for help with Git¶
Git isn’t intuitive, and rewinding or resetting is always work. If you’re not certain what you’re doing, or if you think you’ve messed up, send someone a message.
Don’t push or commit to develop or main¶
Almost all new additions to CLIMADA should be merged into the develop branch with a pull request.
You won’t merge into the main branch, except for emergency hotfixes (which should be communicated to the team).
You won’t merge into the develop branch without a pull request, except for small documentation updates and typos.
The above points mean you should never need to push the main or develop branches.
So if you find yourself on the main or develop branches typing git merge ... or git push, stop and think again - you should probably be making a pull request.
This can be difficult to undo, so contact someone on the team if you’re unsure!
Commit more often than you think, and use informative commit messages¶
Committing often makes mistakes less scary to undo
git reset --hard HEAD
Detailed commit messages make writing pull requests really easy
Yes it’s boring, but trust me, everyone (usually your future self) will love you when they’re rooting through the git history to try and understand why something was changed
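As an illustration of why frequent commits make mistakes cheap to undo, here is a sketch in a temporary repository (the file name is made up). Note that git reset --hard HEAD permanently discards uncommitted changes to tracked files, so it only helps if the good state was committed first.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "good version" > model.txt
git add model.txt
git commit -q -m "Add working model"

# An experiment goes wrong before it is committed.
echo "broken version" > model.txt
git status --short

# Throw away all uncommitted changes to tracked files,
# returning the working tree to the last commit.
git reset -q --hard HEAD
restored=$(cat model.txt)
echo "model.txt now contains: $restored"
```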
Commit message syntax guidelines¶
Basic syntax guidelines taken from here https://chris.beams.io/posts/git-commit/ (on 17.06.2020)
Limit the subject line to 50 characters
Capitalize the subject line
Do not end the subject line with a period
Use the imperative mood in the subject line (e.g. “Add new tests”)
Wrap the body at 72 characters (most editors will do this automatically)
Use the body to explain what and why vs. how
Separate the subject from body with a blank line (This is easy in a GUI or a text editor. On the command line, run git commit without -m to open your editor, or pass -m twice: the second message becomes the body)
Put the name of the function/class/module/file that was edited
When fixing an issue, add the reference gh-ISSUENUMBER to the commit message, e.g. “fixes gh-40.” or “Closes gh-40.” For more information see https://docs.github.com/en/enterprise/2.16/user/github/managing-your-work-on-github/closing-issues-using-keywords#about-issue-references.
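A subject and a body can be written straight from the command line, because git joins repeated -m options as separate paragraphs. This is a minimal sketch in a temporary repository; the file name and message text are placeholders.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "placeholder" > impact.txt
git add impact.txt

# The first -m is the subject line, the second becomes the body,
# separated from the subject by a blank line.
git commit -q \
    -m "Add impact placeholder file" \
    -m "Explain what changed and why here, wrapped at 72 characters."

subject=$(git log -1 --format=%s)
body=$(git log -1 --format=%b)
git log -1 --format=%B
```

For longer bodies it is usually more comfortable to run git commit without -m and write the message in your editor.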
What not to commit¶
There are a lot of things that don’t belong in the Git repository:
- Don’t commit data, except for config files and very small files for tests.
- Don’t commit anything containing passwords or authentication credentials or tokens. (These are annoying to remove from the Git history.) Contact the team if you need to manage authorisations within the code.
- Don’t commit anything that can be created by the CLIMADA code itself
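One way to keep generated output out of commits is a .gitignore file. The sketch below uses hypothetical patterns (a results/ folder and *.log files) in a temporary repository; they are not taken from CLIMADA's actual .gitignore.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo

# Hypothetical ignore rules: an output folder and log files.
printf 'results/\n*.log\n' > .gitignore
mkdir results
echo "big output" > results/impact.csv
echo "debug" > run.log
echo "x = 1" > script.py

# check-ignore exits with 0 for ignored paths and 1 otherwise.
if git check-ignore -q results/impact.csv; then csv_ignored=yes; else csv_ignored=no; fi
if git check-ignore -q run.log; then log_ignored=yes; else log_ignored=no; fi
if git check-ignore -q script.py; then script_ignored=yes; else script_ignored=no; fi

# Only files that are not ignored are listed as candidates for staging.
git ls-files --others --exclude-standard
```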
If files like this are going to be present for other users as well, add them to the repository’s .gitignore.
Jupyter Notebook metadata¶
Git compares file versions by text tokens. Jupyter Notebooks typically contain a lot of metadata, along with binary data like image files. Simply re-running a notebook can change this metadata, which will be reported as file changes by Git. This causes excessive Diff reports that cannot be reviewed conveniently.
To avoid committing changes of unrelated metadata, open Jupyter Notebooks in a text editor instead of your browser renderer. When committing changes, make sure that you indeed only commit things you did change, and revert any changes to metadata that are not related to your code updates.
Several code editors use plugins to render Jupyter Notebooks. Here we collect the instructions to inspect Jupyter Notebooks as plain text when using them:
- VSCode: Open the Jupyter Notebook. Then open the command palette (Ctrl + Shift + P, or Cmd + Shift + P on macOS) and type/select ‘View: Reopen Editor with Text Editor’
Log ideas and bugs as GitHub Issues¶
If there’s a change you might want to see in the code - something that generalises, something that’s not quite right, or a cool new feature - it can be set up as a GitHub Issue. Issues are pages for conversations about changes to the codebase and for logging bugs, and act as a ‘backlog’ for the CLIMADA project.
For a bug, or a question about functionality, make a minimal working example, state which version of CLIMADA you are using, and post it with the Issue.
How not to mess up the timeline¶
Git builds the repository through incremental edits. This means it’s great at keeping track of its history. But there are a few commands that edit this history, and if histories get out of sync on different copies of the repository you’re going to have a bad time.
Don’t rebase any commits that already exist remotely!
Don’t --force anything that exists remotely unless you know what you’re doing!
Otherwise, you’re unlikely to do anything irreversible
You can do what you like with commits that only exist on your machine.
That said, doing an interactive rebase to tidy up your commit history before you push it to GitHub is a nice friendly gesture :)
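Before tidying up history it helps to know which commits exist only on your machine. A sketch, assuming a bare repository on disk stands in for GitHub and using an invented file name:

```shell
set -eu
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/feature/myfeature
git config user.name "Demo User"
git config user.email "demo@example.com"
git remote add origin "$work/remote.git"

echo "1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
git push -q -u origin feature/myfeature

# Two more commits that have not been pushed yet.
echo "2" >> notes.txt
git commit -q -am "Update notes"
echo "3" >> notes.txt
git commit -q -am "Update notes again"

# Commits reachable from HEAD but not from the upstream branch
# exist only locally, so rewriting them affects nobody else.
local_only=$(git rev-list --count "@{u}..HEAD")
git log --oneline "@{u}..HEAD"
```

Anything outside that list already exists remotely and should be left alone.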
Don’t fast forward merges¶
(This shouldn’t be relevant - all your merges into develop should be through pull requests, which don’t fast forward. But:)
Don’t fast forward your merges unless your branch is a single commit. Use
git merge --no-ff ...
The exception is when you’re merging develop into your feature branch.
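What --no-ff changes can be seen in a temporary repository. In this sketch (file names are placeholders) develop has not moved since the branch was created, so a plain merge would fast forward and leave no record that a branch ever existed.

```shell
set -eu
work=$(mktemp -d)
cd "$work"
git init -q demo
cd demo
git symbolic-ref HEAD refs/heads/develop
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > README.txt
git add README.txt
git commit -q -m "Initial commit"

git checkout -q -b feature/myfeature
echo "one" > a.txt
git add a.txt
git commit -q -m "Add a"
echo "two" > b.txt
git add b.txt
git commit -q -m "Add b"

# --no-ff forces a merge commit even though a fast forward is possible.
git checkout -q develop
git merge -q --no-ff -m "Merge feature/myfeature into develop" feature/myfeature

# A merge commit has two parents: the old develop tip and the feature tip.
words=$(git rev-list --parents -n 1 HEAD | wc -w)
parent_count=$((words - 1))
git log --oneline --graph
```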
Merge the remote develop branch into your feature branch every now and again¶
This way you’ll find conflicts early
git checkout develop
git pull
git checkout feature/myfeature
git merge develop
Create frequent pull requests¶
I said this already:
- It structures your workflow
- It’s easier for reviewers
- If you’re going to break something for other people you all know sooner
- It saves work for the rest of the team right before a release
Whenever you do something with CLIMADA, make a new local branch¶
You never know when a quick experiment will become something you want to save for later.
But don’t do everything in the CLIMADA repository¶
If you’re running CLIMADA rather than developing it, create a new folder, initialise a new repository with git init and store your scripts and data there
If you’re writing an extension to CLIMADA that doesn’t change the model core, create a new folder, initialise a new repository with git init and import CLIMADA. You can always add it to the model later if you need to.