Bioinformatics: Introduction to Unix
First, create a directory. A directory is a folder that holds files and other directories. You create one with the command “mkdir”, short for “make directory”. For example, create a directory to store your scripts:
mkdir scripts
Next, create a directory for your data:
mkdir data
You can list all of your files and directories with the command “ls”. Type it to list the contents of the current directory:
ls
You can move into a directory with the command “cd”, short for “change directory”. For example, change into your data directory:
cd data
Now type “ls” again:
ls
Nothing is listed, because your data directory is still empty. Now check your current “working directory”, which is the directory that you are in, with the command “pwd” (“print working directory”):
pwd
You should now see that you are in your data directory. Your home directory is designated by the symbol “~” (tilde), usually found at the upper left of your keyboard (hold shift while pressing the key). List the contents of your home directory:
ls ~
You should see your scripts directory and your data directory, assuming you created them in your home directory. These commands are the basics of navigating in a Unix shell.
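The same navigation steps can also be carried out from inside Python, which is useful later when scripts need to organize their own files. The following is a minimal sketch using the standard library; the function name demo_navigation is made up for illustration, and it works in a temporary directory (an assumption made here so that nothing in your real home directory is touched).

```python
import os
import tempfile
from pathlib import Path


def demo_navigation():
    """Repeat the mkdir / ls / cd / pwd steps inside a throwaway directory."""
    start = Path.cwd()  # remember where we began, like noting the output of pwd
    with tempfile.TemporaryDirectory() as tmp:
        home = Path(tmp).resolve()  # stands in for "~" (an assumption of this sketch)
        (home / "scripts").mkdir()  # mkdir scripts
        (home / "data").mkdir()     # mkdir data
        listing = sorted(p.name for p in home.iterdir())  # ls
        try:
            os.chdir(home / "data")  # cd data
            working = Path.cwd()     # pwd
            inside = sorted(p.name for p in working.iterdir())  # ls (empty directory)
        finally:
            os.chdir(start)  # go back before the temporary directory is removed
    return listing, working.name, inside


listing, working_name, inside = demo_navigation()
print(listing)       # names created by the two mkdir calls
print(working_name)  # name of the directory we changed into
print(inside)        # contents of the new, still empty, data directory
```

The shell commands remain the quicker way to work interactively; the Python version mainly shows that each command has a direct counterpart.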
Next, let’s learn how to create a file and write a program. First, open the text editor “nano” and create a new Python script:
nano ~/scripts/helloworld.py
The “.py” at the end marks the file as a Python script. Next, type this line of code:
print("hello world!")
To save and exit, press “control-x”, then type “y”, and then hit return.
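Now run the script. Because you are still in your data directory, give the full path to the file:
python ~/scripts/helloworld.py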
Is this what you expected would happen? For some of you this may technically be your first program! Now on to biological sequences…
Biological sequences
To enter the Python shell, simply type “python”:
python
You should now see a new prompt with three greater-than signs “>>>”. First, let’s import some tools for working with biological sequences from the Biopython package, which must already be installed:
>>> from Bio.Seq import Seq
Next, let’s create a DNA sequence:
>>> DNA = Seq('ATGCCAGTGACGTGA')
To view your DNA sequence, simply type “DNA” again:
>>> DNA
Python prints the sequence wrapped in “Seq(...)”. Older versions of Biopython also show an “Alphabet” entry here, while newer versions omit it. Next, let’s transcribe the DNA into RNA, making an RNA copy in which each T (thymine nucleotide) is replaced by a U (uracil nucleotide):
>>> RNA = DNA.transcribe()
We can check our RNA sequence by typing “RNA” again:
>>> RNA
Finally, we can translate this RNA sequence into protein:
>>> Protein = RNA.translate()
Check that the translation worked. The protein should read “MPVT*”, where “*” marks the stop codon:
>>> Protein
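To see what transcription and translation do under the hood, here is a minimal sketch in plain Python that does not use Biopython. It is not how Biopython implements these methods; the names CODON_TABLE, transcribe, and translate are chosen here for illustration. It assumes the standard genetic code, and it silently ignores any leftover bases that do not fill a complete codon.

```python
from itertools import product

# Standard genetic code, written in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each three-letter DNA codon to its one-letter amino acid ("*" = stop).
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def transcribe(dna):
    """Return the RNA copy of a DNA coding strand: every T becomes a U."""
    return dna.upper().replace("T", "U")


def translate(rna):
    """Translate an RNA (or DNA) string codon by codon into a protein string."""
    seq = rna.upper().replace("U", "T")  # look codons up in DNA letters
    usable = len(seq) - len(seq) % 3     # assumption: drop an incomplete final codon
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


dna = "ATGCCAGTGACGTGA"
rna = transcribe(dna)
protein = translate(rna)
print(rna)      # AUGCCAGUGACGUGA
print(protein)  # MPVT*
```

Reading the sequence three letters at a time gives ATG, CCA, GTG, ACG, TGA, which correspond to M, P, V, T, and a stop. A sequence containing any letter other than A, C, G, T, or U would raise a KeyError in this sketch.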