Version control has become a powerful tool to back up data, keep your work organized, and collaborate with others, all of which are very useful in academic life.
Revision control, also known as version control and source control, is the management of changes to documents, computer programs, large web sites, etc.
Changes are usually identified by a number or letter code, termed the "revision number", "revision level", or simply "revision". For example, an initial set of files is "revision 1". When the first change is made, the resulting set is "revision 2", and so on. Each revision is associated with a timestamp and the person making the change. Revisions can be compared, restored, and with some types of files, merged.
Version control systems (VCS) most commonly run as stand-alone applications, but revision control is also embedded in various types of software such as word processors and spreadsheets. Revision control makes it possible to revert a document to a previous revision, which is critical for allowing editors to track each other's edits and correct mistakes (even if you work alone, version control is useful for that).
"Git has three main states that your files can reside in: committed, modified, and staged.
Committed means that the data is safely stored in your local database. Modified means that you have changed the file but have not committed it to your database yet. Staged means that you have marked a modified file in its current version to go into your next commit snapshot.
This leads us to the three main sections of a Git project: the Git directory, the working directory, and the staging area." (Pro Git Book)
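The local repository consists of three "trees" maintained by Git: the working directory, which holds the actual files; the Index, which acts as a staging area; and the HEAD, which points to the last commit you made.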
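The basic Git workflow is (Pro Git Book): you modify files in your working directory; you stage the files, adding snapshots of them to your staging area; and you do a commit, which takes the files as they are in the staging area and stores that snapshot permanently in your Git directory.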
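The workflow above can be sketched in a few commands. This is only a minimal illustration in a scratch repository; the file name notes.txt and the identity "Demo User" are hypothetical and not part of the original tutorial.

```shell
# Scratch repository in a temporary directory (all names are hypothetical).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # local identity, only for this demo
git config user.email "demo@example.com"

# 1. Modify a file in the working directory.
echo "first draft" > notes.txt
git status --short        # notes.txt is listed as untracked

# 2. Stage it: a snapshot of the file goes to the staging area (Index).
git add notes.txt
git status --short        # notes.txt is now listed as added

# 3. Commit: the staged snapshot is stored in the Git directory.
git commit -q -m "Add notes"
git log --oneline         # a single commit, pointed to by HEAD
```

Running git status between the steps is a convenient way to see in which of the three states a file currently is.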
See http://git-scm.com/downloads for detailed instructions on how to install Git. In short:
On MS Windows, install it using the installer at http://git-scm.com/download/win.
On Linux (Debian/Ubuntu):
$ sudo apt-get install git
On Mac OS X, there are different ways, for example with Homebrew:
$ brew install git
or with MacPorts:
$ sudo port install git
You can also install a graphical user interface (GUI) for Git: GUI Clients.
In MS Windows, if you installed the official Git (cited above), you already installed a GUI (Git GUI); look in the Git folder. If you are in MS Windows or Mac OS X and have a GitHub account, you may want to consider using the GitHub GUI because it integrates easily with your GitHub account.
Let's now see a short tutorial on how to use Git and GitHub for version control from the command line (if you plan to work with a GUI client, only the concepts are important).
After you installed Git, you can check its version using a terminal window or the command ! in the IPython Notebook to access the system shell.
In MS Windows, however, if you installed Git with the recommended default options, the commands below will not work in the regular command prompt window; the only terminal window that works is the Git Bash that was installed with Git. So, open Git Bash and run the commands below (always without the !):
!git --version
git version 2.1.1
And if you type git you get a list of the most common commands in Git:
usage: git [--version] [--help] [-C <path>] [-c name=value]
           [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
           [-p|--paginate|--no-pager] [--no-replace-objects] [--bare]
           [--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>]
           <command> [<args>]

The most commonly used git commands are:
   add        Add file contents to the index
   bisect     Find by binary search the change that introduced a bug
   branch     List, create, or delete branches
   checkout   Checkout a branch or paths to the working tree
   clone      Clone a repository into a new directory
   commit     Record changes to the repository
   diff       Show changes between commits, commit and working tree, etc
   fetch      Download objects and refs from another repository
   grep       Print lines matching a pattern
   init       Create an empty Git repository or reinitialize an existing one
   log        Show commit logs
   merge      Join two or more development histories together
   mv         Move or rename a file, a directory, or a symlink
   pull       Fetch from and integrate with another repository or a local branch
   push       Update remote refs along with associated objects
   rebase     Forward-port local commits to the updated upstream head
   reset      Reset current HEAD to the specified state
   rm         Remove files from the working tree and from the index
   show       Show various types of objects
   status     Show the working tree status
   tag        Create, list, delete or verify a tag object signed with GPG

'git help -a' and 'git help -g' lists available subcommands and some
concept guides. See 'git help <command>' or 'git help <concept>'
to read about a specific subcommand or concept.
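After installation, we need to configure Git (this is only needed once, and the information will be used when we commit changes to the repository).
Let's do that using the following command lines (replace the name and email with your own):
!git config --global user.name "Your Name"
!git config --global user.email "your_email@example.com"
Your email address for Git should be the same one associated with your GitHub account in case you plan to have a repository there.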
To initialize a local Git repository in the current directory, use the command git init.
You would do that in case you want to start tracking an existing local project in Git.
You can also create a new local repository in a new directory with the command git init repository-name.
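As a rough sketch of the two cases just described, the commands below run git init inside an existing folder and then create a new repository by name. The folder existing-project and the file data.txt are hypothetical; repository-name is the placeholder used in the text.

```shell
# Work in a temporary directory so nothing real is touched.
scratch="$(mktemp -d)"
cd "$scratch"

# Case 1: start tracking an existing project folder.
mkdir existing-project
cd existing-project
echo "some data" > data.txt
git init                    # creates the hidden .git directory here
ls -a                       # .git now sits next to data.txt

# Case 2: create a new, empty repository in one step.
cd "$scratch"
git init repository-name    # creates repository-name/ with .git inside
```

In both cases nothing is tracked yet; files only enter the repository after git add and git commit.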
To clone a remote repository, we can use two different protocols to transfer data: HTTPS and SSH.
SSH (Secure Shell) is a cryptographic network protocol for secure data communication, remote command-line login, remote command execution, and other secure network services between two networked computers, which it connects via a secure channel over an insecure network.
HTTPS is simpler to set up, whereas SSH requires a keypair generated on your computer and attached to your GitHub account. See this GitHub help to decide which one to use.
For instance, this is the command to clone the repo of this notebook using HTTPS:
!git clone https://github.com/duartexyz/BMC
Or using SSH (after you created your keypair and registered it in your GitHub account):
!git clone git@github.com:duartexyz/BMC.git
The local repository will be created inside your current directory.
You can first change to the directory where you want the clone to live; there is no need to create a folder with the name of the repo you are cloning, because Git will do that for you.
You should not keep your local repo inside a Dropbox folder, because Dropbox can generate conflict files.
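To see that Git creates the folder for you, here is a small sketch that clones from a local repository standing in for the remote one, so it runs without network access or a GitHub account. The names source-repo and projects are hypothetical.

```shell
scratch="$(mktemp -d)"

# A small local repository standing in for the remote one.
git init -q "$scratch/source-repo"
cd "$scratch/source-repo"
git config user.name "Demo User"          # local identity, only for this demo
git config user.email "demo@example.com"
echo "hello" > README.md
git add README.md
git commit -q -m "Add README"

# Clone it from another directory: Git creates the folder source-repo itself.
mkdir "$scratch/projects"
cd "$scratch/projects"
git clone -q "$scratch/source-repo"
ls source-repo                      # the cloned files
git -C source-repo remote -v        # origin points back to the source
```

With a GitHub URL in place of the local path, the behavior is the same: the new folder takes the name of the repository.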
The Git commands will only work once you are inside the working directory of your local repository.
Use the command cd to change your current directory. In Linux or Mac OS X, after changing to that directory (adapt the path to your case), its contents look like this:
LICENSE.txt* README.md data/ functions/ images/ notebooks/
You can propose changes (add them to the Index) using the command git add <filename>.
For instance, let's change the README.md file and commit it:
!git add README.md
add is a multipurpose command; it allows you to track files, stage files, and mark merge-conflicted files as resolved.
You can add everything in the current directory using the command "git add -A".
To commit this change:
!git commit -m "Commit message"
This commits everything in your staging area and uses an inline commit message.
If the file to commit is not new, only changed, you can skip the add command by using commit -a, which automatically stages every currently tracked file and commits them:
!git commit -a -m "Commit message"
If you created/added a new file (untracked so far), you still need to add it to your staging area with the command git add <filename>.
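The difference between tracked and untracked files can be tried out with the sketch below, run in a scratch repository. The file names and commit messages are hypothetical.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # local identity, only for this demo
git config user.email "demo@example.com"

# A new file has to be added before the first commit.
echo "# Project" > README.md
git add README.md
git commit -q -m "Add README"

# README.md is tracked now, so commit -a stages and commits the change.
echo "More text" >> README.md
git commit -q -a -m "Update README"

# A new file is untracked: commit -a does not pick it up.
echo "x,y" > data.csv
git commit -q -a -m "Try to commit data" || echo "nothing committed: data.csv is untracked"

# After git add, the new file can be committed as usual.
git add data.csv
git commit -q -m "Add data"
git log --oneline
```

Note the plain hyphens in -a and -m; options pasted from a word processor sometimes carry a long dash, which Git does not recognize.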
Now the file is committed to the HEAD of your local repository, but it is not in the remote repository yet.
To send these changes to the remote repository, execute (substitute master by the branch you want to push the changes to, if it is not master):
!git push origin master
You should do that in a regular terminal window because you may be prompted to enter your username and password for your GitHub account in case you didn't store them in your OS (if you are using HTTPS).
If you are using SSH, by default you will not be prompted to enter your credentials (because you created your keypair and registered it in your GitHub account).
Anyway, if you are committing to my repo, this will not work for you because, I hope, you don't have my credentials.
You can create your own copy (a fork) of the BMC repository (go to its website and use the Fork button in the upper-right corner) or create your own repo to experiment with it.
To update the local repository to the newest commit, execute git pull in the working directory (this will fetch and merge remote changes).
To merge another branch into your active branch (e.g. master), use git merge <branch>.
These last two commands try to auto-merge changes.
This might not be possible because of conflicts between the different branches.
If that is the case, we will have to resolve those conflicts manually by editing the files shown by Git.
After editing, we need to mark them as merged with git add <filename> or git add . (the dot means everything in the current directory).
Before merging changes, we can preview them by using git diff <source_branch> <target_branch>.
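The following sketch provokes a conflict on purpose and then resolves it, to illustrate the steps just described. Everything here is hypothetical (the branch feature, the file notes.txt, the identity), and the default branch name is read from Git because it may be master or main depending on your configuration.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # local identity, only for this demo
git config user.email "demo@example.com"
base="$(git symbolic-ref --short HEAD)"   # default branch (master or main)

echo "line from the start" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Change the same line on two branches.
git checkout -q -b feature
echo "line from feature" > notes.txt
git commit -q -a -m "Edit notes on feature"
git checkout -q "$base"
echo "line from $base" > notes.txt
git commit -q -a -m "Edit notes on $base"

# Preview the differences, then try to merge: Git reports a conflict.
git diff "$base" feature
git merge feature || echo "auto-merge failed, resolving by hand"

# Resolve by editing the file (here simply overwritten), then mark it as merged.
echo "merged line" > notes.txt
git add notes.txt
git commit -q -m "Merge feature, resolving conflict in notes.txt"
git log --oneline --graph
```

In real work you would open the conflicted file, choose between the sections marked with <<<<<<< and >>>>>>>, and only then run git add.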
You should not manually remove or delete a file inside your repository; for that, use the command git rm:
!git rm -h
usage: git rm [options] [--] <file>...

    -n, --dry-run         dry run
    -q, --quiet           do not list removed files
    --cached              only remove from the index
    -f, --force           override the up-to-date check
    -r                    allow recursive removal
    --ignore-unmatch      exit with a zero status even if nothing matched
!git rm <filename>
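Here is a short sketch of git rm with and without the --cached option listed in the usage text, run in a scratch repository. The file names old.txt and settings.local are hypothetical.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # local identity, only for this demo
git config user.email "demo@example.com"

echo "temporary" > old.txt
echo "keep me" > settings.local
git add old.txt settings.local
git commit -q -m "Add two files"

git rm -q old.txt                    # removed from the Index and from disk
git rm -q --cached settings.local    # removed from the Index only
git status --short                   # two deletions staged, settings.local untracked
git commit -q -m "Stop tracking old.txt and settings.local"
ls                                   # only settings.local is left on disk
```

Both removals only become part of the history after the commit, like any other staged change.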
To sync a fork of a repository to keep it up-to-date with the upstream repository (for example, if you forked the Demotu/BMC repo), according to the GitHub Help, you have to first configure a remote for a fork and then fetch the commits (the changes) from the upstream repository.
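Follow these steps:
1. Open a terminal window (Git Bash in MS Windows).
2. Change the current working directory to your local project.
3. List the remote repositories currently configured for your fork:
git remote -v
4. Specify a new remote upstream repository that will be synced with the fork:
git remote add upstream https://github.com/demotu/BMC.git
5. Verify the new upstream repository you specified for your fork:
git remote -v
6. Fetch the branches and their commits from the upstream repository:
git fetch upstream
7. Check out your fork's local master branch:
git checkout master
8. Merge the changes from upstream/master into your local master branch:
git merge upstream/master
If your local branch didn't have any unique commits, Git will instead perform a "fast-forward".
To repeat the sync of a fork in the future, you only have to change to your local project (step 2) and start again from step 6.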
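The sync steps can be rehearsed offline with the sketch below, where local repositories stand in for the upstream repository and for the fork on GitHub. All paths and names (upstream, fork.git, local) are hypothetical, and the default branch name is read from Git instead of assuming master.

```shell
work="$(mktemp -d)"

# Stand-in for the upstream repository.
git init -q "$work/upstream"
cd "$work/upstream"
git config user.name "Demo User"          # local identity, only for this demo
git config user.email "demo@example.com"
base="$(git symbolic-ref --short HEAD)"   # default branch (master or main)
echo "one" > file.txt
git add file.txt
git commit -q -m "First upstream commit"

# Stand-in for the fork on GitHub (bare copy) and your local clone of it.
git clone -q --bare "$work/upstream" "$work/fork.git"
git clone -q "$work/fork.git" "$work/local"

# Upstream moves ahead after the fork was made.
echo "two" >> file.txt
git commit -q -a -m "Second upstream commit"

# Sync the local clone of the fork with upstream.
cd "$work/local"
git remote add upstream "$work/upstream"
git remote -v
git fetch -q upstream
git checkout -q "$base"
git merge "upstream/$base"                # fast-forward: no unique local commits
```

After the merge, git push origin would send the synced branch to the fork itself.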
git status: check the status of your files
git add: multipurpose command; track files, stage files, and mark merge conflicted files as resolved
git diff: compare working directory to staging area
git diff --cached: compare staged changes to last commit
git commit -m "message": commit everything in your staging area, uses inline commit message
git commit -a -m "message": automatically stage every currently tracked file and commit them (to skip the "git add" command)
git rm [filename]: untrack the file and remove it from your working directory
git rm --cached [filename]: untrack the file, but keeps it in your working directory - useful if you forgot to include certain files in your .gitignore
git mv [orig_name] [new_name]: change the file's name
git log: show the commit history in reverse chronological order (i.e. most recent first)
"Undoing Things" Commands
git commit --amend: replaces your most recent commit, i.e. it "redoes" your most recent commit using what's currently in your staging area
git reset HEAD [filename]: allows you to unstage a particular file; this file returns back to the modified state
git checkout -- [filename]: allows you to discard any changes you've made to the file since the last commit
Note: use this command carefully - the discarded changes cannot be recovered
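The three undo commands can be tried safely in a scratch repository, as in this sketch. The file names and commit messages are hypothetical.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # local identity, only for this demo
git config user.email "demo@example.com"

echo "v1" > report.txt
git add report.txt
git commit -q -m "Add report"

# Stage a change, then unstage it: the file goes back to the modified state.
echo "v2" > report.txt
git add report.txt
git reset -q HEAD report.txt
git status --short

# Discard the modification: report.txt returns to the committed content.
git checkout -- report.txt
cat report.txt                            # v1

# Amend: replace the last commit with one that also includes a new file.
echo "notes" > notes.txt
git add notes.txt
git commit -q --amend -m "Add report and notes"
git log --oneline                         # still a single commit
```

Amending rewrites the last commit, so it is best kept for commits that were not pushed yet.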
Remote Repository Commands
git pull [remote-name] [branch-name]: automatically fetches data from the remote server (typically called "origin") and attempts to merge it into the code you're working on; branch-name is typically "master" if you haven't created your own branch
git push [remote-name] [branch-name]: push your code from the branch you're on (typically "master" if you haven't created your own branch) upstream to the remote server (typically called "origin")
Merging and Branching Commands
git merge [branch-name]: merge the specified branch with the current working directory
git branch: view all available branches
git branch [branch-name]: create a new branch
git checkout [branch-name]: set current working directory to branch-name
git checkout -b [branch-name]: create a new branch and set current working directory to it
git merge [branch-name]: merge branch-name into the current branch
git branch -d [branch-name]: delete the specified branch
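A typical sequence with these branching commands might look like the sketch below. The branch name methods and the file paper.txt are hypothetical, and the default branch name is read from Git because it may be master or main.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # local identity, only for this demo
git config user.email "demo@example.com"
base="$(git symbolic-ref --short HEAD)"   # default branch (master or main)

echo "intro" > paper.txt
git add paper.txt
git commit -q -m "Start paper"

# Create a branch, switch to it, and commit there.
git checkout -q -b methods
echo "methods" >> paper.txt
git commit -q -a -m "Add methods"
git branch                                # lists both branches, * marks methods

# Go back, merge the branch, and delete it once it is merged.
git checkout -q "$base"
git merge -q methods                      # fast-forward, no conflicts here
git branch -d methods
git branch                                # only the default branch remains
```

git branch -d refuses to delete a branch that was not merged yet, which protects against losing commits by accident.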
Changing to Previous Commits Commands
git revert <prev_commit>: create a new commit with a reverse patch that cancels out the changes introduced by that commit
git checkout -b <branchname> <prev_commit>: return to a previous commit and create a branch using it
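As a small sketch of these two commands, the example below builds three commits, reverts the most recent one, and then starts a branch from the first one. The names log.txt and from-first are hypothetical, and HEAD and HEAD~3 are used here in place of explicit commit identifiers.

```shell
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.name "Demo User"          # local identity, only for this demo
git config user.email "demo@example.com"

echo "a" > log.txt
git add log.txt
git commit -q -m "Add a"
echo "b" >> log.txt
git commit -q -a -m "Add b"
echo "c" >> log.txt
git commit -q -a -m "Add c"

# Revert the most recent commit: a new commit removes the line "c".
git revert --no-edit HEAD
cat log.txt                               # a and b remain
git log --oneline                         # four commits, history is preserved

# Start a new branch from the first commit (three commits behind HEAD).
git checkout -q -b from-first HEAD~3
cat log.txt                               # only a
```

Because revert adds a commit instead of deleting one, it is safe to use on history that was already pushed.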