Bioinformatics Unit 2: Discussion


Unit 2: Genomics   Discussion 2

The DNA Alphabet
The Genetic Code
What Does a Genome Encode?
Phylogenetic Footprinting
Genome Annotation


What is Encoded in a Genome?

The DNA Alphabet:

The informatics behind the DNA alphabet is explained in The genome chose its alphabet with care by D Bradley (2002) in Science 297: 1789-1791. The choice of A, T, G, and C may have built in a mechanism for minimizing errors in base pairing.

The Genetic Code:

Deciphering the genetic code was not easy. The history of this process is described in The Invention of the Genetic Code by B Hayes (1998) in the American Scientist January-February issue. Figure 5 compares codes predicted by various schemes with the main code used in chromosomal DNA.

A Nobel Prize was awarded for the decoding effort. See The Nobel Prize in Physiology or Medicine 1968 at the Nobel e-Museum (2003), the official web site of The Nobel Foundation.

What Does a Genome Encode?

About 98 percent of human DNA does not code for proteins. A look at what DNA does encode is in Evolving Genomic Metaphors: A New Look at the Language of DNA by JC Avise (2001) Science 294: 86-87.

Pseudogenes, gene copies that don't seem to have any function, are another mystery, addressed in part in Complicity of gene and pseudogene by JT Lee (2003) Nature 423: 26-28.

Phylogenetic Footprinting:

Annotating short sequence motifs such as transcription factor binding sites is a major problem. One approach is to compare genomic sequences from closely related species and look for short stretches that have been conserved. The article Tracking evolution's footprints in the genome by JB Weitzman (2003) Journal of Biology 2: article 9 describes the why and how of this important method of genome annotation.
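
The comparison idea can be illustrated with a minimal Python sketch. It is not the method from the Weitzman article; it simply assumes the sequences are already aligned to equal length and reports windows in which every species carries the same base. The function name conserved_windows, the window size, and the three toy sequences are all hypothetical and chosen only for illustration.

```python
def conserved_windows(aligned, window=8):
    """Return (start, segment) for each window that is identical in all sequences.

    Assumes `aligned` is a list of pre-aligned, equal-length strings
    ('-' marks a gap). This is a toy scan, not a real footprinting tool.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be pre-aligned to equal length")

    # True for each alignment column where all species share one non-gap base.
    same = [len(set(col)) == 1 and col[0] != "-" for col in zip(*aligned)]

    hits = []
    for start in range(length - window + 1):
        if all(same[start:start + window]):
            hits.append((start, aligned[0][start:start + window]))
    return hits


# Hypothetical aligned upstream regions from three related species.
toy_alignment = [
    "ACGTTGACGTCATTGCA",
    "ATGTTGACGTCATCGCA",
    "ACTTTGACGTCAGTGGA",
]

hits = conserved_windows(toy_alignment, window=8)
for start, segment in hits:
    print(start, segment)
```

Windows that survive in every species are candidate "footprints"; real tools also have to build the alignment, tolerate mismatches, and weigh how far apart the species are.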

Genome Annotation:

The problems that so many data portals and formats create for bioinformaticists are discussed in Creating a bioinformatics nation by L Stein (2002) Nature 417: 119-120.

One solution to these problems has been to develop a consistent set of software applications using a programmer's toolkit. The most popular development package is described in The Bioperl toolkit: Perl modules for the life sciences by JE Stajich et al. (2002) Genome Res. 12(10): 1611-1618.

Updated 09/30/2003 by bchapman@classroomtools.com