To make the bam file smaller and easier to use, we'll use only the alignments from chromosome 11.
samtools view to subset your S10 bam file on chromosome 11
Notice that the usage output of samtools view shows you can specify a "region" such as "chr11" to get a subset of your data:
[ucsd-train01@tscc-login2 processed_data]$ samtools view -h
Usage: samtools view [options] <in.bam>|<in.sam>|<in.cram> [region ...]
Go to your processed data folder and get only the "chr11" region. This will take a little while.
[ucsd-train01@tscc-login2 ~]$ cd ~/projects/shalek2013/processed_data
[ucsd-train01@tscc-login2 processed_data]$ samtools view [flag(s) to output as bam] S10.Aligned.out.sorted.bam [region] > S10.Aligned.out.sorted.chr11.bam
Most programs that use bam files, including IGV, also need the ".bai" index file, so we'll create that too.
[ucsd-train01@tscc-login2 processed_data]$ samtools index S10.Aligned.out.sorted.chr11.bam
You should now have these two files:
[ucsd-train01@tscc-login2 processed_data]$ ls -lha S10.Aligned.out.sorted.chr11.bam*
-rw-r--r-- 1 ucsd-train01 biom262-group 418M Feb 4 06:20 S10.Aligned.out.sorted.chr11.bam
-rw-r--r-- 1 ucsd-train01 biom262-group 119K Feb 4 06:23 S10.Aligned.out.sorted.chr11.bam.bai
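The subsetting and indexing steps above can be sketched as one short script. This is only one way to fill in the placeholders: it assumes that "-b" is the flag you want for bam output and that "chr11" is the region, and it adds two things the tutorial does not ask for (indexing the input if its ".bai" is missing, since region queries need an indexed bam, and an idxstats check at the end). The variable names are made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: file names come from the tutorial; variable names are hypothetical.
in_bam="S10.Aligned.out.sorted.bam"
region="chr11"

# Strip the trailing ".bam" and splice the region into the output name
out_bam="${in_bam%.bam}.${region}.bam"
echo "Will write: ${out_bam}"

# Only run the real commands where samtools and the input file are available
if command -v samtools >/dev/null 2>&1 && [ -f "$in_bam" ]; then
    # A region query needs a coordinate-sorted, indexed input bam
    [ -f "${in_bam}.bai" ] || samtools index "$in_bam"

    # -b writes bam output instead of the default sam text
    samtools view -b "$in_bam" "$region" > "$out_bam"

    # Create the .bai index for the subset
    samtools index "$out_bam"

    # Assumed sanity check (not part of the tutorial): list references
    # that still have mapped reads; only chr11 should appear
    samtools idxstats "$out_bam" | awk '$3 > 0'
fi
```

If idxstats lists any reference other than chr11, the region argument was probably not applied.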
The file transfer takes about a minute.
Use the scp secure copy command to copy files to your laptop. This is just like regular "cp oldfile newplace" except it knows how to deal with outside servers like TSCC. You have to specify the exact location of the file on TSCC. Run these commands in a terminal on your laptop, not on TSCC.
scp ucsd-train##@tscc.sdsc.edu:~/projects/shalek2013/processed_data/S10.Aligned.out.sorted.chr11.bam ~/Desktop
scp ucsd-train##@tscc.sdsc.edu:~/projects/shalek2013/processed_data/S10.Aligned.out.sorted.chr11.bam.bai ~/Desktop
These commands will put the files on your desktop (since each one ends with ~/Desktop).
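Since the two scp commands differ only in the file name, they can be generated in a loop. The sketch below is a dry run: it prints the commands rather than running them, then checks whether the files have arrived. The helper name scp_command and the variable names are hypothetical, and "ucsd-train##" is the tutorial's placeholder that you still need to replace with your own account.

```shell
#!/usr/bin/env bash
# Placeholders copied from the tutorial: replace ## with your own account number
user="ucsd-train##"
host="tscc.sdsc.edu"
# Quoted so the ~ is left for the server to expand, not your laptop
remote_dir="~/projects/shalek2013/processed_data"
bam="S10.Aligned.out.sorted.chr11.bam"
dest="$HOME/Desktop"

# Hypothetical helper: print the scp command for one file name
scp_command() {
    printf 'scp %s@%s:%s/%s %s\n' "$user" "$host" "$remote_dir" "$1" "$dest"
}

# Dry run: show the command for the bam file and for its .bai index
for f in "$bam" "$bam.bai"; do
    scp_command "$f"
done

# After running the printed commands, confirm both files exist and are non-empty
for f in "$bam" "$bam.bai"; do
    if [ -s "$dest/$f" ]; then
        echo "found:   $f"
    else
        echo "missing: $f"
    fi
done
```

Copy and paste the printed commands to do the transfer. IGV needs the bam and its ".bai" in the same folder, so make sure both are reported as found.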
Add the directory with your putty executables to your path:
set PATH=C:\path\to\putty\directory;%PATH%
(You'll need to replace the fake path above with the real one on your computer - maybe it's C:\Olga\Desktop)
Copy the files over, specifying your private key location, the exact location of the file on the server, and the place you want to put it on your computer.
pscp -i C:\path\to\putty\biom262\private\key ucsd-train##@tscc.sdsc.edu:~/projects/shalek2013/processed_data/S10.Aligned.out.sorted.chr11.bam %HOMEPATH%/Desktop
pscp -i C:\path\to\putty\biom262\private\key ucsd-train##@tscc.sdsc.edu:~/projects/shalek2013/processed_data/S10.Aligned.out.sorted.chr11.bam.bai %HOMEPATH%/Desktop
These commands will put the files on your desktop (since each one ends with %HOMEPATH%/Desktop). Let me know if this doesn't work.
Find where you downloaded IGV and double-click it to open. If you haven't downloaded it yet, do that first; the site will ask you to log in or register because they need to track users for their grants.
These data were aligned to the mm10 version of the genome, but the default on IGV is hg19, so we'll need to change the genome. In the upper left where it says "Human hg19", click that and then click "More ..."
In the genome box, start typing "mm10" to search and get the genome.
You won't need to search for the genome every time -- just this first time. But you'll always need to make sure your genome is correct :)
Choose "File > Load from file."
Find the bam file that you just downloaded. Once it has loaded, your screen should look like this:
Two of the genes from the paper, Irgm1 and Irf7, are on chromosome 11. Let's check them out. Search for "Irgm1" in the search box and press "Enter."
Now you should see the Irgm1 locus: