Friday, October 14, 2016

Source Control, Git and GitHub

I am distributing my code via the web site GitHub. Although you could just download the latest version (or a particular version) as a zip file, the best method is to download it using the Git source control system (open source -- free!). This makes it much easier to incorporate additions to my libraries, which will be growing rapidly.

This article explains some of the high-level concepts behind using GitHub and source control.

Source Control

A source control system is a necessity for multi-programmer projects. You have a repository of source code, to which you periodically commit changes. In many other source control systems, the repository is centralised; in Git, you have your own local repository, and there are public repositories that users make accessible to others. (My repositories are open source, and anyone can read them.)


  • If the public repository has new code, you pull the updates to your local repository. If you have made changes to your local copy that conflict with the changes to the public repository, you have the joy of merging the changes. (Merging is the messiest part of any source control system.)
  • Once you are ready to share your code, you push your code to a public repository you have write access to. (If you wanted to improve the code on my repositories, you would need to collaborate with me -- see below.)
If you remember to keep committing changes, you will have a good history of your work. If you break something, you can fairly easily roll back those changes. This is much better than trying to undo a mistake by restoring files from a backup. In other words, using source control is programming "best practice."
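
As a rough illustration of committing and rolling back, here is a minimal sketch that runs in a scratch directory. The file name model.txt, its contents, and the identity values are hypothetical placeholders, and "git revert" is only one of several ways to undo a change.

```shell
#!/bin/sh
set -e                                   # stop at the first failing command
work=$(mktemp -d) && cd "$work"          # scratch directory, safe to delete later

git init -q demo && cd demo
git config user.name "Example Author"    # placeholder identity for this repository
git config user.email "author@example.com"

# First commit: a known-good version of a (hypothetical) file.
echo "rate = 0.02" > model.txt
git add model.txt
git commit -q -m "Add model parameters"

# Second commit: a change that turns out to be a mistake.
echo "rate = 20" > model.txt
git commit -q -am "Change rate (a mistake)"

# Roll back by adding a new commit that reverses the last one.
# The mistake stays in the history, so nothing is lost.
git revert --no-edit HEAD

git log --oneline                        # three commits: add, mistake, revert
cat model.txt                            # prints "rate = 0.02"
```

The revert adds a third commit rather than erasing the second, so the history shows both the mistake and its correction.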

The Git project (https://git-scm.com/) has an excellent online manual by Scott Chacon and Ben Straub. As a result, I will not attempt to explain usage here.

My only suggestion is that people who are unfamiliar with vim (a Unix/Linux text editor) learn the command to get out of it: type ":wq" (no quote marks). Or else learn how to change the default editor for commit messages. (If you use PyCharm, you can do a lot of source control commands from within the development environment, but I found that I had better control using the command line.)
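
For those who would rather avoid vim entirely, the sketch below shows two options in a scratch repository. It assumes the nano editor is installed (any editor command could be substituted), and the file name and identity values are placeholders.

```shell
#!/bin/sh
set -e                                   # stop at the first failing command
work=$(mktemp -d) && cd "$work"

git init -q demo && cd demo
git config user.name "Example Author"    # placeholder identity
git config user.email "author@example.com"

# Option 1: use nano for commit messages in this repository.
# Writing "git config --global core.editor nano" instead would apply
# the setting to every repository for the current user.
git config core.editor "nano"
git config core.editor                   # prints "nano"

# Option 2: give the message on the command line, so no editor opens.
echo "hello" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
```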

Integration with GitHub

There is an application (GitHub Desktop) that gives you a GUI to control your interactions with repositories on GitHub. I prefer using the command line (the result of a misspent youth doing a Ph.D. on Unix systems). You just need to get the address of the repository, and then clone it to your working directory.
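
The clone, push and pull cycle can be sketched without a network connection. In the example below, a local bare repository (public.git) stands in for a repository on GitHub; with a real project you would pass the address shown on the GitHub page to "git clone" instead. All directory names, file names and identity values are hypothetical.

```shell
#!/bin/sh
set -e                                   # stop at the first failing command
work=$(mktemp -d) && cd "$work"

# The author's repository, with one commit.
git init -q author
cd author
git config user.name "Example Author"    # placeholder identity
git config user.email "author@example.com"
echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "First version"
git branch -M master                     # make sure the branch is named master
cd ..

# A bare copy stands in for the public repository on GitHub.
git clone -q --bare author public.git
git -C author remote add origin "$work/public.git"

# A reader clones the public repository into a working directory.
git clone -q "$work/public.git" reader

# The author commits a change and pushes it to the public repository.
cd author
echo "version 2" > notes.txt
git commit -q -am "Second version"
git push -q origin master
cd ..

# The reader pulls the update into the local repository.
git -C reader pull -q
cat reader/notes.txt                     # prints "version 2"
```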
 

Branches

The ease of creating branches is the main advantage of git over other source control systems. On SimplePricers, there are two main branches:
  1. master - the "release" version of code. It will always be fully functional.
  2. development - development code; new features are added into development, but they might not be fully functional.
You can switch between branches using the "checkout" command. All the source files will immediately jump to the corresponding version. So long as you commit your changes before doing a checkout (git will not let you change branches without dealing with changed files), you can easily switch between any version.
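
A small sketch of switching branches, using a scratch repository with the same two branch names. The file status.txt and the identity values are hypothetical; the point is only that the file contents follow the branch, and that git refuses a checkout that would overwrite an uncommitted change.

```shell
#!/bin/sh
set -e                                   # stop at the first failing command
work=$(mktemp -d) && cd "$work"

git init -q demo && cd demo
git config user.name "Example Author"    # placeholder identity
git config user.email "author@example.com"

# master holds the "release" version.
echo "stable" > status.txt
git add status.txt
git commit -q -m "Release version"
git branch -M master

# Create development from master and commit a change there.
git checkout -q -b development
echo "experimental" > status.txt
git commit -q -am "Work in progress"

# Switching branches swaps the file contents.
git checkout -q master
cat status.txt                           # prints "stable"
git checkout -q development
cat status.txt                           # prints "experimental"

# An uncommitted edit to a file that differs between branches blocks the switch.
echo "unsaved edit" >> status.txt
if ! git checkout -q master 2>/dev/null; then
    echo "checkout refused: commit or discard the change first"
fi
git checkout -q -- status.txt            # discard the uncommitted edit
```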

The normal procedure to add a new feature is to create a new branch (a "topic branch") that is a copy of the latest development version. You add your features to that branch, and test them there -- without affecting anyone else's development branch. Once complete, you merge the topic branch into development.

For example,
  • Let's say I want to add a dividend discount model to SimplePricers. I create a "dividend" branch that is initially a copy of "development."
  • I work on that version of the code, adding new files and functions. I may need to switch back and forth between other versions over this period.
  • Once satisfied, I merge the dividend branch into development.
  • I can now delete the dividend branch, as the work has been incorporated into development.
  • If this is a big enough feature, I can merge development into master, creating a new release of the production version of SimplePricers.
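
The steps above might look roughly like this on the command line. This is a sketch in a scratch repository rather than the actual SimplePricers history: the file names (bonds.txt, dividend.txt), commit messages and identity values are hypothetical.

```shell
#!/bin/sh
set -e                                   # stop at the first failing command
work=$(mktemp -d) && cd "$work"

git init -q demo && cd demo
git config user.name "Example Author"    # placeholder identity
git config user.email "author@example.com"

# Starting point: a master branch and a development branch.
echo "bond pricer" > bonds.txt
git add bonds.txt
git commit -q -m "Initial release"
git branch -M master
git checkout -q -b development

# 1. Create the topic branch as a copy of development.
git checkout -q -b dividend

# 2. Work on the feature and commit it.
echo "dividend discount model" > dividend.txt
git add dividend.txt
git commit -q -m "Add dividend discount model"

# 3. Merge the topic branch into development.
git checkout -q development
git merge -q dividend

# 4. Delete the topic branch (-d refuses if the work is not merged).
git branch -d dividend

# 5. For a release, merge development into master.
git checkout -q master
git merge -q development
git log --oneline                        # both commits are now on master
```

Because nothing else changed on development or master in the meantime, both merges here are simple fast-forwards; if the branches had diverged, git would create a merge commit and might ask you to resolve conflicts.
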
Collaboration

My code is open source, under the Apache 2.0 license. You are largely free to do what you want with it, although I still have copyright (see the license for more details). 

If you wanted to contribute to the project, there are three main ways of doing so.
  1. For small changes, you could contact me, and supply me with the code modifications. (This can even be done on GitHub via a "Gist" -- a public code fragment.) The GitHub system would identify me as the author of the change.
  2. For larger changes, you would need to do a "pull request." You fork the SimplePricers project on GitHub (that is, make your own copy of it there), add your changes, and then I would pull them to my local copy. If I agree with the changes, I then push them to my public repository. Your changes would be tracked by Git/GitHub as being your contribution.
  3. I could grant some others write access to my repository, but I doubt that I would do that in the short term.
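
The Git side of the pull request route (option 2) can be sketched with local repositories. Here upstream.git stands in for my public repository and fork.git for a contributor's copy on GitHub; on the real site, the fork and the pull request itself are created through the web interface. The branch name improve-pricer, the file name and the identities are all hypothetical.

```shell
#!/bin/sh
set -e                                   # stop at the first failing command
work=$(mktemp -d) && cd "$work"

# The maintainer's local repository and its public copy.
git init -q maintainer
cd maintainer
git config user.name "Maintainer"        # placeholder identity
git config user.email "maintainer@example.com"
echo "pricer" > pricer.txt
git add pricer.txt
git commit -q -m "Initial version"
git branch -M master
cd ..
git clone -q --bare maintainer upstream.git
git -C maintainer remote add origin "$work/upstream.git"

# The contributor forks the public repository, then clones the fork.
git clone -q --bare upstream.git fork.git
git clone -q "$work/fork.git" contributor
cd contributor
git config user.name "Contributor"       # placeholder identity
git config user.email "contributor@example.com"
git checkout -q -b improve-pricer
echo "pricer, improved" > pricer.txt
git commit -q -am "Improve pricer"
git push -q origin improve-pricer        # publish the branch on the fork
cd ..

# The maintainer pulls the proposed branch into the local copy,
# checks it, and then pushes it to the public repository.
cd maintainer
git pull -q "$work/fork.git" improve-pricer
git push -q origin master
git log --format="%an: %s"               # the contributor is recorded as author
```
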
I am not intending to write a full-featured financial library. Instead, I want to have simple code that allows readers to replicate the examples in my writings. This code might be useful for someone who wants to learn programming and proper programming workflow. For this purpose, it makes sense that I write almost all of the code, and readers just download copies.

However, it may be that someone like an academic might be interested in using this library for teaching purposes. I would be happy to collaborate on such a task. In such a case, the development might be mainly done by students, and I would just make sure that the project is properly set up.


(c) Brian Romanchuk 2016
