Bioinfo Care

RNA - Transcription

Introduction:

The differences in the composition of RNA and DNA have already been noted. In addition, RNA is usually found as a single strand rather than as a double helix. However, the single polynucleotide strand may fold back on itself to form double-helical regions, giving the RNA a folded shape comparable to the tertiary structure of proteins.

The biosynthesis of RNA, called transcription, proceeds in much the same fashion as the replication of DNA and also follows the base pairing principle. Again, a section of the DNA double helix is uncoiled, and only one of the DNA strands serves as the template that guides the enzyme RNA polymerase in synthesizing the RNA. After the synthesis is complete, the RNA separates from the DNA and the DNA recoils into its helix.

The transcription of a single RNA strand is illustrated in the graphic on the left. One major difference from replication is that the heterocyclic amine adenine on DNA codes for the incorporation of uracil, rather than thymine, in RNA. Remember that thymine is not found in RNA, and take care with the direction of this substitution: thymine on DNA still codes for adenine in RNA, not uracil, while adenine on DNA codes for uracil in RNA.
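The pairing rules above can be sketched as a small lookup table. This is only an illustrative sketch: the names `DNA_TO_RNA` and `transcribe` and the example template sequence are made up for this example and do not come from the text.

```python
# Base-pairing rules for transcription: template DNA base -> RNA base.
# Adenine on DNA pairs with uracil in RNA; thymine on DNA still pairs with adenine.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the RNA (written 5'->3') complementary to a DNA template
    strand that is written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


# Hypothetical template strand, chosen only for illustration.
print(transcribe("TACGGATTC"))  # AUGCCUAAG
```

Note that the output never contains T, since every adenine on the template is answered with uracil.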

RNA Transcription Process:

The RNA transcription process occurs in three stages: initiation, chain elongation, and termination.

The first stage, initiation, occurs when RNA polymerase binds to the promoter region of the DNA, forming the RNA polymerase-promoter complex. This binding also positions the RNA polymerase at the start sequence. The polymerase will not recognize the promoter unless the sigma protein is present (shown in blue in the graphic). Specific sequences on the non-coding strand of DNA are recognized as the signal to start the unwinding process.

The recognition sequences are as follows:
Non-coding DNA, written from the 5' end; the recognition sections (bold in the original graphic) appear to be TTGACA and TATACT:
GGCCGCTTGACAAAAGTGTTAAATTGTGCTATACT
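Locating the recognition sections in this sequence can be sketched as a simple substring search. The bold formatting did not survive in the text, so this sketch assumes the two recognition sections are `TTGACA` and `TATACT`; the function name `locate_boxes` is likewise hypothetical.

```python
# Promoter sequence quoted in the text.
PROMOTER = "GGCCGCTTGACAAAAGTGTTAAATTGTGCTATACT"

# ASSUMPTION: these are the two recognition sections that were bold in the original.
FIRST_BOX = "TTGACA"
SECOND_BOX = "TATACT"


def locate_boxes(seq, first=FIRST_BOX, second=SECOND_BOX):
    """Return (index of first box, index of second box, spacer length),
    or None if either box is missing. Indices are 0-based."""
    i_first = seq.find(first)
    if i_first == -1:
        return None
    # Search for the second box only downstream of the first one.
    i_second = seq.find(second, i_first + len(first))
    if i_second == -1:
        return None
    spacer = i_second - (i_first + len(first))
    return i_first, i_second, spacer


print(locate_boxes(PROMOTER))  # (6, 29, 17)
```

Under that assumption, the two sections are separated by a 17-base spacer.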

Once the process has been initiated, RNA polymerase moves on to chain elongation, which is described in the next section.

RNA Polymerase - Elongation:

Elongation begins when the RNA polymerase "reads" the template DNA. Only one strand of the DNA is read for the base sequence. The RNA that is synthesized is complementary to that template strand.

The RNA (top strand) and DNA (bottom strand) sequences in the model are:
5' -GACCAGGCA-3'
3'-TCTGGTCCGTAAA-5'

In the graphic, the magenta color is the template DNA, while the green is the RNA strand.

In the next reaction step, uridine triphosphate (UTP) is the next nucleotide to be added to the RNA, binding to and pairing with the adenine (A) nucleotide on the template DNA strand. A phosphodiester bond is formed, the RNA chain is thereby elongated to 10 nucleotides, and the leftover diphosphate dissociates.
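The elongation step just described can be traced with the two sequences from the model. This is a minimal sketch, not a model of the enzyme: `elongate` and the `offset` parameter (the template position paired with the first RNA base) are names invented for this example.

```python
# Template DNA base -> RNA base that pairs with it.
PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}

rna = "GACCAGGCA"           # RNA strand from the model, written 5'->3'
template = "TCTGGTCCGTAAA"  # template DNA from the model, written 3'->5'
OFFSET = 1                  # the first RNA base pairs with template position 1 (0-based)


def elongate(rna_5_to_3, template_3_to_5, offset):
    """Append the nucleotide that pairs with the next unread template base."""
    next_pos = offset + len(rna_5_to_3)
    if next_pos >= len(template_3_to_5):
        return rna_5_to_3  # nothing left to read
    return rna_5_to_3 + PAIR[template_3_to_5[next_pos]]


# Check that the existing RNA really is complementary to the template.
assert all(PAIR[template[OFFSET + i]] == base for i, base in enumerate(rna))

longer = elongate(rna, template, OFFSET)
print(longer, len(longer))  # GACCAGGCAU 10
```

The next unread template base is A, so the added nucleotide is U and the chain grows from 9 to 10 nucleotides, matching the description above.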

Types of RNA

Messenger RNA:

Messenger RNA (mRNA) is synthesized from a gene segment of DNA, which ultimately contains the information on the primary sequence of amino acids in the protein to be synthesized. The genetic code, as it is usually written, is given in terms of mRNA, not DNA. The messenger RNA carries the code into the cytoplasm, where protein synthesis occurs.

Genetic Code:

Each gene (or distinct segment) on DNA contains instructions for making one specific protein, with the order of amino acids coded by the precise sequence of heterocyclic amines on the nucleotides. Since proteins have a variety of functions, including those of enzymes, mistakes in the primary sequence of amino acids in proteins may have lethal effects.

How can a polymeric nucleotide with only four different heterocyclic amines specify the sequence of 20 or more different amino acids? If each nucleotide coded for a single amino acid, then obviously only 4 of the 20 amino acids could be accommodated. If the nucleotides were used in groups of two, there are 16 different combinations possible which is still inadequate.

It has been determined that the genetic code is actually based upon triplets of nucleotides, which provide 64 different code words using the 4 nucleotides. During the 1960s, a tremendous effort was devoted to proving that the code is read as triplets, and also to solving the genetic code. The genetic code was originally worked out for the bacterium E. coli, but it has since been found to be nearly universal. The genetic code is "read" from a type of RNA called messenger RNA (mRNA). Each nucleotide triplet, called a codon, can be "read" and translated into an amino acid to be incorporated into the protein being synthesized.
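The counting argument above (4, 16, then 64 combinations) can be checked by enumerating every possible code word of a given length. This is a small illustrative sketch; `code_words` is a name chosen for this example.

```python
from itertools import product

BASES = "UCAG"      # the four RNA bases
AMINO_ACIDS = 20    # number of amino acids the code must cover


def code_words(length):
    """All possible words of the given length over the four bases."""
    return ["".join(p) for p in product(BASES, repeat=length)]


for n in (1, 2, 3):
    count = len(code_words(n))
    verdict = "enough" if count >= AMINO_ACIDS else "not enough"
    print(f"{n} base(s) per word: {count} combinations, {verdict}")
```

Only at a word length of three does the number of combinations (64) exceed the 20 amino acids, which is why several codons can stand for the same amino acid.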

Ribosomal RNA:

In the cytoplasm, ribosomal RNA (rRNA) and protein combine to form a nucleoprotein called a ribosome. The ribosome serves as the site of protein synthesis and carries the enzymes necessary for it. In the graphic on the left, the ribosome is shown as made from two subunits, 50S and 30S. A ribosome of this type is roughly two-thirds rRNA and one-third protein by mass. The far left graphic shows the complete ribosome with three tRNAs attached.

The ribosome attaches itself to mRNA and provides the stabilizing structure to hold all substances in position as the protein is synthesized. Several ribosomes may be attached to a single mRNA at any time. In the upper right corner is the 30S subunit with mRNA and tRNA attached.

Transfer RNA:

Transfer RNA (tRNA) contains about 75 nucleotides, three of which form the anticodon, and carries one amino acid. The tRNA reads the code and carries the amino acid to be incorporated into the developing protein.

There are at least 20 different tRNAs, at least one for each amino acid. The basic structure of a tRNA is shown in the left graphic. Part of the tRNA doubles back upon itself to form several double-helical sections. On one end, the amino acid, here phenylalanine, is attached. On the opposite end, a specific base triplet, called the anticodon, is used to actually "read" the codons on the mRNA.

The tRNA "reads" the mRNA codon by using its own anticodon. The actual "reading" is done by matching the base pairs through hydrogen bonding, following the base pairing principle. Each codon is tried against various tRNAs until the anticodon that matches the codon is found.
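The codon-anticodon matching described above can be sketched as follows. The sketch assumes strict base pairing with the two triplets running antiparallel (so the anticodon, written 5'->3', is the reverse complement of the codon) and ignores any looser pairing at the third position. The function names and the small tRNA pool are hypothetical.

```python
# RNA-RNA base pairing used between codon and anticodon.
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def anticodon_for(codon):
    """Anticodon (5'->3') that pairs with an mRNA codon (5'->3').
    The strands pair antiparallel, hence the reversal."""
    return "".join(RNA_PAIR[base] for base in reversed(codon))


def find_matching_trna(codon, trna_pool):
    """Try each tRNA in turn until one anticodon matches the codon."""
    wanted = anticodon_for(codon)
    for anticodon, amino_acid in trna_pool:
        if anticodon == wanted:
            return amino_acid
    return None  # no tRNA in the pool reads this codon


# Hypothetical pool of (anticodon, amino acid) pairs, for illustration only.
pool = [("GAA", "phe"), ("CAU", "met"), ("CCA", "trp")]

print(find_matching_trna("AUG", pool))  # met
print(find_matching_trna("UUC", pool))  # phe
```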

Genetic Code

First Base	Second Base				Third Base
	U	C	A	G	
U	phe	ser	tyr	cys	U
U	phe	ser	tyr	cys	C
U	leu	ser	stop	stop	A
U	leu	ser	stop	trp	G
C	leu	pro	his	arg	U
C	leu	pro	his	arg	C
C	leu	pro	gln	arg	A
C	leu	pro	gln	arg	G
A	ile	thr	asn	ser	U
A	ile	thr	asn	ser	C
A	ile	thr	lys	arg	A
A	met-start	thr	lys	arg	G
G	val	ala	asp	gly	U
G	val	ala	asp	gly	C
G	val	ala	glu	gly	A
G	val	ala	glu	gly	G
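Reading the table can be sketched as a dictionary lookup, one codon at a time. This is only an illustrative sketch: the names `CODON_TABLE` and `translate` and the example mRNA are invented here. The entries follow the standard genetic code, in which CAA and CAG code for glutamine (gln).

```python
BASES = "UCAG"

# Amino acids listed in the order first base, second base, third base,
# each running U, C, A, G (standard genetic code; CAA and CAG are gln).
AMINO = (
    "phe phe leu leu ser ser ser ser tyr tyr stop stop cys cys stop trp "
    "leu leu leu leu pro pro pro pro his his gln gln arg arg arg arg "
    "ile ile ile met thr thr thr thr asn asn lys lys ser ser arg arg "
    "val val val val ala ala ala ala asp asp glu glu gly gly gly gly"
).split()

CODON_TABLE = {
    first + second + third: AMINO[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def translate(mrna):
    """Translate from the first AUG (start) up to the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return []
    peptide = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "stop":
            break
        peptide.append(amino_acid)
    return peptide


# Hypothetical mRNA, chosen only for illustration.
print(translate("GGAUGUUUCAAUGGUAAGC"))  # ['met', 'phe', 'gln', 'trp']
```

Translation here starts at the first AUG and stops at UAA, so the bases before the start codon and after the stop codon are not read.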