git and GitHub
git ≠ GitHub
There's been some confusion about the difference between git and GitHub, specifically about why changes you make using git don't show up on GitHub.
git is the program you use for tracking changes
This will be a sped-up version of the git intro from Software Carpentry that hits all the high points you need to remember for working with git in this class.
Let's start with a fresh new repository.
$ cd ~/code
$ mkdir biom262-git-test
$ cd biom262-git-test
$ git init
Initialized empty Git repository in /Users/olga/workspace-git/biom262-git-test/.git/
Let's see what git knows about this repository using git status:
$ git status
On branch master
Initial commit
nothing to commit (create/copy files and use "git add" to track)
There are no files here yet. Just like in the git tutorial, we'll create a few empty files and then see how this changes the git repo.
$ touch pipeline.sh
$ touch statistics.sh
$ touch plots.sh
We should now see a few files, both using ls (which literally tells us which files are there) and git status (which tells us what git knows about the files):
$ ls
pipeline.sh plots.sh statistics.sh
$ git status
On branch master
Initial commit
Untracked files:
(use "git add <file>..." to include in what will be committed)
pipeline.sh
plots.sh
statistics.sh
nothing added to commit but untracked files present (use "git add" to track)
At this point, git has no idea that these files matter to you. Under "Untracked files", it lists the files that git sees but doesn't think you care about, so it's not tracking their changes. Let's git add and git commit these files. I'm going to be lazy and add everything that ends in ".sh".
$ git add *.sh
$ git status
On branch master
Initial commit
Changes to be committed:
(use "git rm --cached <file>..." to unstage)
new file: pipeline.sh
new file: plots.sh
new file: statistics.sh
Oh wait, I don't want to commit plots.sh yet. So let's remove it using git's hint of git rm --cached <file> to "unstage" it. You can think of "staging" as a "staging area", like the pen where you corral cattle before moving them: files wait there until you commit.
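Here's roughly what that unstaging step looks like, as a sketch rather than a transcript of the class session. It runs in a throwaway folder from mktemp so it can't touch your real repo, and the name and email passed to git commit are placeholders made up for the demo.

```shell
set -e
# Work in a throwaway directory (an assumption of this demo, not the class repo).
demo=$(mktemp -d)
cd "$demo"
git init -q
touch pipeline.sh statistics.sh plots.sh
git add *.sh

# Unstage plots.sh. --cached only removes it from the staging area;
# the file itself stays on disk.
git rm --cached plots.sh

# Short status: "A" marks staged files, "??" marks untracked ones.
git status --short

# Commit what is still staged (placeholder identity, just for this demo).
git -c user.name='Demo Student' -c user.email='student@example.com' \
    commit -q -m 'add pipeline and statistics scripts'
```

After this, plots.sh is still in the folder but shows up as untracked again, ready to be added whenever you like.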
So on TSCC you edited some files in a git repository, ran git status to see what had changed, and then used git add and git commit to record those files.
After you've git add'd, git commit'd and everything, GitHub (the place) still has no idea what you've done. It's only when you git push that you tell GitHub you've made some updates, and only when you git pull that you receive updates from GitHub.
$ git pull upstream master
$ git push origin week04
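To see that a commit really does stay local until you push, here is a small sketch you can run anywhere. The big assumption: a bare repository in a temp folder plays the part of GitHub, so nothing is sent over the network. The folder names and the commit identity are made up for the demo; only the branch name week04 comes from the commands above.

```shell
set -e
work=$(mktemp -d)
# A bare repository in a temp folder plays the part of GitHub (demo assumption).
git init -q --bare "$work/fake-github.git"

# Your working copy, e.g. on TSCC or your laptop.
git init -q "$work/my-copy"
cd "$work/my-copy"
git remote add origin "$work/fake-github.git"
echo 'echo hello' > pipeline.sh
git add pipeline.sh
git -c user.name='Demo Student' -c user.email='student@example.com' \
    commit -q -m 'add pipeline'
git checkout -q -b week04

# The commit exists locally, but the remote still lists no branches.
git ls-remote --heads origin

# Only after pushing does the remote know about week04.
git push -q origin week04
git ls-remote --heads origin
```

The first git ls-remote prints nothing; the second shows a refs/heads/week04 entry, which is the moment "GitHub" finds out about your work.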
When you "Fork" a repository (repo), you make a copy on GitHub under your own account that you personally can edit, commit, add, and push changes to. When you "Clone" a repository, you're copying the full history of all code that was ever committed to that project onto your computer. If it's your project (or your fork), then you have write access and can push. Otherwise, no: usually you don't have write access to things you're cloning.
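The relationship between the original repo, a fork, and a clone can be imitated locally. This is only a sketch: real forking happens on GitHub's website, so here a bare clone stands in for the fork, and all folder names and identities are hypothetical. The remote names origin and upstream match the ones used in the pull and push commands above.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"

# Stand-in for the class repository (a local folder, not a real GitHub URL).
git init -q class-repo
(
  cd class-repo
  echo 'course notes' > README
  git add README
  git -c user.name='Instructor' -c user.email='instructor@example.com' \
      commit -q -m 'first commit'
)

# "Fork": GitHub makes a server-side copy under your account.
# A bare clone imitates that here.
git clone -q --bare class-repo my-fork.git

# "Clone": copy the fork, with its whole history, to where you work.
git clone -q my-fork.git my-copy
cd my-copy

# clone sets up the fork as "origin" automatically;
# we add the original repository as "upstream" ourselves.
git remote add upstream "$demo/class-repo"
git remote -v
```

With this layout you pull new material from upstream and push your own work to origin.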
LICENSE = no fun
One thing we didn't go over in class is that if your code is up somewhere publicly available without an explicit license, then technically nobody is allowed to use it. That's why you should use a license for your code. I recommend the UCSD-specific 3-Clause BSD license ("UCSD Software Copyright Notice") we used for the first homework because it allows the most people to use it (academics, companies, industry, non-profit, for-profit) with minimal restrictions.
There's another license called the GPL that many people use and that is (in my opinion) a little idealistic, because anything distributed that builds on GPL code must also be released under the GPL, which makes it hard to use in a proprietary setting (e.g. Microsoft can't ship it inside a closed-source product). If you're interested in open source licenses, I encourage you to browse opensource.org.
Branches are great because you can work on one notebook in one branch, leaving the other code completely untouched, and then switch to another branch and work on some other notebook.
Create a new branch:
git checkout -b newbranchname
Change to a different branch:
git checkout otherbranch
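To convince yourself that a branch really leaves the other code untouched, try this sketch in a throwaway folder. The file name notebook.txt and the identity are invented for the demo; the branch name newbranchname matches the command above.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
# Identity stored only in this throwaway repo (placeholder values).
git config user.name 'Demo Student'
git config user.email 'student@example.com'

echo 'version 1' > notebook.txt
git add notebook.txt
git commit -q -m 'first version'
# Remember the starting branch, whatever your git calls it by default.
start=$(git rev-parse --abbrev-ref HEAD)

# Make an edit on a new branch.
git checkout -q -b newbranchname
echo 'experimental edit' >> notebook.txt
git commit -q -am 'try something on the branch'

# Back on the starting branch, the edit is nowhere to be seen.
git checkout -q "$start"
cat notebook.txt    # prints only: version 1
git branch
```

Switching back with git checkout newbranchname brings the experimental line back, so neither version is lost.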
If you update your repository and you see a merge conflict, it'll look like the thing below. This is totally normal, we can fix it, and there is no need to panic.
[ucsd-train01@tscc-0-63 biom262-2016]$ git pull upstream master
From github.com:biom262/biom262-2016
* branch master -> FETCH_HEAD
Auto-merging weeks/week04/0_alignment_expression_quantification.ipynb
CONFLICT (content): Merge conflict in weeks/week04/0_alignment_expression_quantification.ipynb
Automatic merge failed; fix conflicts and then commit the result.
You'll need to edit the file:
nano weeks/week04/0_alignment_expression_quantification.ipynb
And find the merge conflict that looks like:
<<<<<<< HEAD
Stuff that's in your version (whether you wrote it or it's an old version)
=======
Updates from "upstream"
>>>>>>> 537cb50... add clarification for alignment quantification
If it's an answer you want to keep, you'll want to keep the first section (your answer) and delete the second, but if it's updates from upstream you'll want to remove the first section and keep the second. What you keep in between the markers is what you decide makes sense. Either way, at the end you don't want any of these lines left in the file; they should be removed:
<<<<<<< HEAD
=======
>>>>>>> 537cb50... add clarification for alignment quantification
If you updated your notebook, you're probably seeing this kind of error:
Unreadable Notebook: /home/ucsd-train01/code/biom262-2016/weeks/week04/0_alignment_expression_quantification.ipynb NotJSONError('Notebook does not appear to be JSON: '{\n "cells": [\n {\n "cell_type": "m...',)
That's because there are merge conflicts in the Jupyter notebook that make it incomprehensible to Jupyter, but we can fix that.
Do "git status" to see which files are modified. The ones with "both modified" are the ones with the merge conflict:
[ucsd-train01@tscc-login2 biom262-2016]$ git status
# On branch master
# Your branch is ahead of 'origin/master' by 73 commits.
#
# Unmerged paths:
# (use "git add/rm <file>..." as appropriate to mark resolution)
#
# both modified: weeks/week04/0_alignment_expression_quantification.ipynb
#
# Untracked files:
# (use "git add <file>..." to include in what will be committed)
#
# weeks/week01/data/gencode.v19.annotation.chr22.transcript.promoter.nfkb.gtf
# weeks/week01/data/tf.nkfb.bed
no changes added to commit (use "git add" and/or "git commit -a")
Now you'll need to edit the modified file in question using nano.
nano weeks/week04/0_alignment_expression_quantification.ipynb
(You'll need to replace the above path with the actual path to your conflicted notebook.)
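Notebooks can be long, so it may help to ask grep where the conflict markers are before opening the file. This is just a sketch: the file conflicted.ipynb and its contents are a made-up stand-in, and you'd point grep at your own notebook path instead.

```shell
set -e
demo=$(mktemp -d)
# A tiny stand-in file containing one conflict (hypothetical content).
cat > "$demo/conflicted.ipynb" <<'EOF'
{
<<<<<<< HEAD
 "mine"
=======
 "theirs"
>>>>>>> 537cb50... add clarification
}
EOF

# -n prints line numbers; the pattern matches the three kinds of marker lines.
grep -n -E '^(<<<<<<<|=======|>>>>>>>)' "$demo/conflicted.ipynb"
```

Each line of output starts with the line number of a marker. Inside nano you can also press Ctrl+W and search for <<<<<<< to jump straight to the next conflict.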
Use the arrow keys (up and down) to scroll through the file until you see the merge conflict syntax, which looks like this:
<<<<<<< HEAD
"Now use `ln -s filename newplace` to create soft links of the folders in your `~/scratch/shalek2013/` directory to your `~/projects/shalek2013` directory so when yo$
=======
"Now use `ln -s filename newplace` to create **soft links** (aka \"shortcuts\" or \"pointers\") of the folders in your `~/scratch/shalek2013/` directory to your `~/p$
"\n",
"```\n",
"ln -s /projects/ps-yeolab/biom262-2016/seqdata/shalek2013 $HOME/projects/shalek2013/raw_data\n",
"ln -s $HOME/scratch/shalek2013/processed_data $HOME/projects/shalek2013/processed data\n",
"```\n",
"\n",
"Fix bottom one to:\n",
"```\n",
"ln -s $HOME/scratch/shalek2013/processed_data $HOME/projects/shalek2013/processed_data\n",
"```\n",
"\n",
"If you're seeing black and red links you've done something wrong... remove the `~/projects/shalek2013` directory and start over.\n",
"\n",
"```\n",
"rm -f ~/projects/shalek2013\n",
"```\n",
"\n",
"Then make sure your:\n",
>>>>>>> 537cb50... add clarification for alignment quantification
To fix the conflict, remove the lines:
<<<<<<< HEAD
"Now use `ln -s filename newplace` to create soft links of the folders in your `~/scratch/shalek2013/` directory to your `~/projects/shalek2013` directory so when yo$
=======
and
>>>>>>> 537cb50... add clarification for alignment quantification
If that's the only one you can find, then you're done! Refresh the notebook. If you find more, you'll need to resolve the rest. Remember that the first section, "HEAD", is what you have locally, and the second section is what the remote has.
Now you'll need to add and commit the file.
[ucsd-train01@tscc-login2 biom262-2016]$ git add weeks/week04/0_alignment_expression_quantification.ipynb
[ucsd-train01@tscc-login2 biom262-2016]$ git commit -m 'fix updates from upstream'
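To double-check that nothing is left unmerged, run the command below; it prints nothing once every conflict has been resolved.
git ls-files -u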
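If you'd like to rehearse the whole cycle (conflict, edit, add, commit) before touching a real notebook, here is a sketch that manufactures a conflict in a throwaway repo. Everything in it is an assumption for practice purposes: the file answers.txt, the branch fake-upstream standing in for the real upstream remote, and the placeholder identity.

```shell
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q
# Identity stored only in this throwaway repo (placeholder values).
git config user.name 'Demo Student'
git config user.email 'student@example.com'
echo 'original line' > answers.txt
git add answers.txt
git commit -q -m 'start'
mine=$(git rev-parse --abbrev-ref HEAD)

# Someone else's update, on a branch standing in for "upstream".
git checkout -q -b fake-upstream
echo 'update from upstream' > answers.txt
git commit -q -am 'upstream edit'

# Your own edit to the same line.
git checkout -q "$mine"
echo 'my answer' > answers.txt
git commit -q -am 'my edit'

# The merge stops with a conflict; "|| true" keeps the script going.
git merge fake-upstream || true
git status --short    # "UU" marks a file modified on both sides
cat answers.txt       # shows both versions between the marker lines

# Keep your version: rewrite the file without any marker lines,
# then add and commit, just like the steps above.
echo 'my answer' > answers.txt
git add answers.txt
git commit -q -m 'resolve conflict, keep my answer'

# Prints nothing when no unmerged files remain.
git ls-files -u
```

In real life you'd edit the file in nano rather than overwrite it with echo, but the add and commit steps afterwards are the same.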