In this project, we will learn how to analyze biological sequences (DNA, RNA, and protein) using online web tools. This will include transcription and translation.
Go to Expasy Tools and take a look at the wide array of web tools that are available. This is just scratching the surface: many other websites are devoted to bioinformatics analysis.
For this project, we will perform some of the transformations and analyses that you did earlier with the DNA models and on paper.
First, let's reverse complement a sequence. To do this, go to the Reverse Complement web tool. What is the reverse complement of "AATTGGCC"?
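If you would like to double-check the web tool's answer, the same operation can be sketched in a few lines of Python. This is a minimal sketch, not the tool's actual implementation: the function name `reverse_complement` is made up for illustration, and it assumes the sequence contains only the letters A, C, G, and T.

```python
# Map each base to its Watson-Crick partner (upper and lower case).
# Assumption: the input contains only A, C, G, and T.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


# A different sequence from the exercise, so the answer is not given away.
print(reverse_complement("ATGC"))  # prints GCAT
```

Try it on "AATTGGCC" only after you have worked out the answer with the web tool.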
Palindromes: A sequence that is its own reverse complement is called a "palindrome". Given what you've learned about the reverse complement, can you make up a 10bp DNA sequence that is a palindrome?
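One way to test a candidate palindrome is to compare the sequence with its own reverse complement. The sketch below does exactly that. The helper names are hypothetical, and the 6bp example "GAATTC" is used instead of a 10bp one so the exercise is still yours to solve.

```python
# Assumption: sequences contain only A, C, G, and T.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the result."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """A DNA palindrome reads the same as its own reverse complement."""
    return seq.upper() == reverse_complement(seq)


print(is_palindrome("GAATTC"))   # prints True
print(is_palindrome("GATTACA"))  # prints False
```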
Recall the sequences discussed earlier today:
..----------------------------------
.-----------------------------------
------------------------------------
GGAUGGUAAGUAGUUUUUGAGAUCCCUCAUCAUAAA->RNA
GGATGGTAAGTAGTTTTTGAGATCCCTCATCATAAA->DNA
||||||||||||||||||||||||||||||||||||
CCTACCATTCATCAAAAACTCTAGGGAGTAGTATTT->DNA
CCUACCAUUCAUCAAAAACUCUAGGGAGUAGUAUUU->RNA
------------------------------------
-----------------------------------.
----------------------------------..
Let's paste the "forward strand" sequence "GGATGGTAAGTAGTTTTTGAGATCCCTCATCATAAA" into the translation tool. Note that the forward strand is the upper DNA sequence in the example from earlier today, and the reverse strand (the lower DNA sequence) is its reverse complement. Demonstrate that the reverse strand is the reverse complement using the reverse complement tool.
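The relationship between the four sequences in the diagram can also be checked with a short script. This is a sketch under two assumptions: the lower DNA strand is drawn left to right underneath the forward strand, so it must be read backwards to get the reverse complement, and transcription is simplified to swapping T for U on the forward strand. The variable and function names are made up for illustration.

```python
# Sequences copied from the diagram above.
forward = "GGATGGTAAGTAGTTTTTGAGATCCCTCATCATAAA"
lower_as_drawn = "CCTACCATTCATCAAAAACTCTAGGGAGTAGTATTT"
rna_as_drawn = "GGAUGGUAAGUAGUUUUUGAGAUCCCUCAUCAUAAA"

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


def transcribe(dna: str) -> str:
    """Simplified transcription: the RNA matches the forward strand with U for T."""
    return dna.replace("T", "U")


# The lower strand is drawn base-paired under the forward strand,
# so reading it right to left gives the reverse complement.
print(reverse_complement(forward) == lower_as_drawn[::-1])  # prints True
print(transcribe(forward) == rna_as_drawn)                  # prints True
```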
Now let's transcribe and translate a DNA sequence. To do this, open the translation tool, preferably in a separate tab. Paste the forward strand into the text box and click "translate".
You will see that there are in fact 6 different translations. This is because there are two strands (forward and reverse) and each of these has 3 possible "frame shifts" (reading frames). A "frame shift" corresponds to the fact that the 3-nucleotide codons can start at position 1, 2, or 3 of the sequence. Starting at position 4 gives the same result as starting at position 1, just without the first codon. Which translation gives the longest protein?
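To see where the 6 translations come from, here is a minimal Python sketch of a six-frame translation. It is not how the web tool is implemented. It assumes the standard genetic code, writes stop codons as "*" (the web tool may display them differently), and writes "X" for any codon it does not recognize. All function names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the result."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop codon."""
    seq = seq.upper().replace("U", "T")
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq: str) -> dict:
    """Translate both strands starting at positions 1, 2, and 3."""
    seq = seq.upper().replace("U", "T")
    frames = {}
    for strand_name, strand in (("forward", seq), ("reverse", reverse_complement(seq))):
        for offset in range(3):
            frames[(strand_name, offset + 1)] = translate(strand[offset:])
    return frames


forward = "GGATGGTAAGTAGTTTTTGAGATCCCTCATCATAAA"
for (strand, frame), protein in six_frame_translations(forward).items():
    print(strand, frame, protein)
```

Compare the six printed lines with the six translations from the web tool, and look for the longest stretch without a "*".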
How well do you understand the codon table? Here is the codon table shown earlier today (codon table from Wikipedia). Modify the sequence given above to produce a sequence of the same length that starts with a start codon (which codes for methionine) and ends with a stop codon.
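Once you have a candidate sequence, the three requirements (same length, start codon first, stop codon last) can be checked mechanically. The sketch below assumes the standard start codon ATG and the standard stop codons TAA, TAG, and TGA. The function name `check_candidate` and the example sequence are made up for illustration and are not meant as the answer to the exercise.

```python
START_CODON = "ATG"
STOP_CODONS = {"TAA", "TAG", "TGA"}
ORIGINAL = "GGATGGTAAGTAGTTTTTGAGATCCCTCATCATAAA"


def check_candidate(candidate: str, original: str = ORIGINAL) -> dict:
    """Report whether a candidate meets each requirement of the exercise."""
    seq = candidate.upper()
    return {
        "same_length": len(seq) == len(original),
        "starts_with_start": seq[:3] == START_CODON,
        "ends_with_stop": seq[-3:] in STOP_CODONS,
    }


# Hypothetical 36-nucleotide sequence, unrelated to the original.
example = "ATG" + "GCC" * 10 + "TAA"
print(check_candidate(example))
print(check_candidate(ORIGINAL))
```

Because the original sequence is 36 nucleotides long, a multiple of 3, the last three bases are in the same reading frame as the first three.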
Now let's look at a real-world example. First, open NCBI's nucleotide search tool, NCBI Nucleotide, in a new window or tab. You can right-click the link and choose to open it in a new tab.
In the search box, search for the gene name "vrs1". You will see that the first hit is a "gene" and the second hit is an mRNA, the transcribed sequence. We want the mRNA. Click the "FASTA" link below the second search result. It should be 1313nt long.
Now let's copy this sequence and paste it into the translation tool. Translate it! Which translation is longest? Which translation is correct? Let's find out. Open NCBI's protein database, NCBI Protein, and search for the same name, "vrs1". The protein we want is the first hit. Click the "FASTA" link. It should be a 222aa protein. Now which translation from before is the correct one?
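The comparison in this last step amounts to asking which of the six translations contains the known protein sequence. Here is a minimal sketch of that idea on a made-up toy sequence (the real vrs1 sequences are not reproduced here). It assumes the standard genetic code and an exact match between the database protein and the translation. All names are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq: str) -> str:
    """Translate complete codons from the first base; '*' marks a stop codon."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def find_matching_frames(mrna: str, protein: str) -> list:
    """Return the (strand, frame) pairs whose translation contains the protein."""
    seq = mrna.upper().replace("U", "T")
    strands = (("forward", seq), ("reverse", seq.translate(COMPLEMENT)[::-1]))
    matches = []
    for strand_name, strand in strands:
        for offset in range(3):
            if protein.upper() in translate(strand[offset:]):
                matches.append((strand_name, offset + 1))
    return matches


# Toy data: the coding region ATG GCC AAA TAA starts at position 3.
toy_mrna = "CCATGGCCAAATAAGG"
toy_protein = "MAK"
print(find_matching_frames(toy_mrna, toy_protein))  # prints [('forward', 3)]
```

With the real data, you would paste the mRNA and protein sequences from the two FASTA records (without their header lines) in place of the toy strings.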