Exercise 1: Protein Structure
Objectives:
1. Become familiar with the basics of protein
structure.
- See relationship of primary, secondary, tertiary, and
quaternary structure of proteins.
- Recognize structural motifs and their relationship to
overall structure and to function.
- Recognize a domain and its relationship to overall
structure and function.
2. Become familiar with on-line resources for protein
structure and learn to use a molecular viewer.
- Know how to find a particular protein of
interest.
- Know how to gather information on a particular
protein.
- Know how to use a molecular viewer to gain
information.
3. Understand the principles used to predict and to model
protein structures.
Introduction:
One of the best ways to understand structure-function
relationships is to be able to view the actual structure.
The next best thing is to view a structural model which is
constructed using as much hard data relating to the
structure as possible. The activities in this exercise will
introduce a variety of tools useful in exploring protein
structure and viewing 3-D models. Although it may seem a bit
like skipping to the last chapter of a book to see how the
story turns out before actually reading it, the first
activity is to view 3-D structures of some sample proteins.
The structures are derived from calculations based on X-ray
crystallography data, and frequently NMR data, from purified
samples of the proteins.
After having a chance to become familiar with using
Chime, a 3-D viewer, you will have the opportunity to
explore other tools useful in examining a variety of
characteristics which contribute to the final calculated
structure, such as inter-atomic forces,
hydrophobicity-hydrophilicity, and structural motifs.
Multiple sequence alignments allow for the identification of
conserved regions. These conserved regions contain sequence
patterns which relate to motifs and patterns of secondary
structure. Protein folding to a final tertiary or quaternary
structure is dependent on many factors. Secondary
structures, along with the characters of the individual
amino acids involved, contribute significantly. However,
that is not the whole story. During assembly, polypeptides
are guided into their final structures by chaperonins, along
with the formation of intrachain, and sometimes interchain,
disulfide bridges. If present, glycosylations and prosthetic
groups also contribute to the final structural
conformation. By comparing the sequence of a protein of
interest to sequences of proteins whose structures have been
determined, it is possible to infer the probable structure
of that protein. If the sequence similarity is high and the
function is known to be the same, or at least very similar,
then the inferred structure is likely to be correct or
nearly correct. If the sequence alignments show only certain
homologous regions with low global similarity, then overall
structure cannot be accurately determined without the input
of more data. The ideal is purifying the protein and
crystallizing it for X-ray diffraction analysis. NMR analysis
of a stable pure protein is also possible, but the results
are more ambiguous than with X-ray crystallography.
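The comparison described above rests on how closely two aligned sequences match. As a rough illustration only, the Python sketch below computes percent identity for two sequences that have already been aligned. The function name, the choice to skip gapped columns, and the short sequences are assumptions made up for this example; they are not taken from any particular alignment program, and no specific cutoff for "high" similarity is implied.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are left out of the count
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Made-up, already-aligned fragments (not real protein sequences).
query = "MKV-LAAGIVG"
template = "MKVQLSAGLVG"
print(f"{percent_identity(query, template):.1f}% identity")
```

Alignment tools may define identity differently, for example by dividing by the full alignment length, so treat the number as a guide rather than a standard.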
After exploring the available tools and playing with a
couple of worked examples in this exercise, you will have
the opportunity of exploring a protein of interest in the
associated project.
Reading in Brown:
- Review Chapter 6 pp. 85-98 on multiple sequence
alignments.
- Then read Chapter 7 pp. 99-119 on structure-function
relationships.
Reading in Gibas and Jambeck:
- Review Chapter 8 pp. 193-197 on multiple sequence
alignments and pp. 205-214 on profiles and motifs.
- Review chemistry and protein structure viewers:
Chapter 9 pp. 215-234.
- Prediction of structure from sequence: Chapter 10 pp.
268-293.
There are summary questions at the end of this
section. Read them through before you start browsing. You
can answer them as you go, or answer them after browsing the
following sites. Points = 10. Due 11/13.
Pre-Exercise:
As a warm-up activity to molecular modeling, you'll want
to become familiar both with structure of proteins at
different levels of organization and with the structure
viewers. Here you have some choice for tutorials. If you
have never seen any of these sites before, surf quickly to
see what they have to offer. You might want to spend most of
your time with just one site, and then explore the others to
enrich your experience.
1. The first site is nicely organized as notes
with interactive illustrations. This site presents secondary
and tertiary structures as interactive images. You can click
and drag to move the images in 3D. There are also links to
expand on some topics.
http://www.cm.utexas.edu/CH339K/Hackert/Proteins/Proteins.htm
2. The Lehninger book support site below has well-developed
tutorials. Click on "Lehninger 3D Structure
Models". Select "Protein Architecture". This is a new
tutorial, and not quite complete. It stops before getting to
tertiary structure, but through secondary structure it is
quite good. As you go through the guided explanation, you'll
see different types of displays of the structures, along
with the details of the tutorial. This will give you an idea
of what you can do during this exercise, and after, with
proteins of your choice. [Along the way you'll be
learning more about protein chemistry as well.]
http://www.worthpublishers.com/lehninger/
After doing "Protein Architecture", select one or more
of the other tutorials. The main take-home lesson in
these tutorials is the relationship of
structure-function, with a variety of examples of how
specific structures contribute to general and specific
functions.
3. Another excellent site for macromolecular
tutorials is the OMM site. "Amino Acid Structures" is an
excellent 3D reference site and simple to use. "Chemical
bonds and Protein Structure" goes into detail of bond
formation and how it contributes to structure. Secondary and
tertiary structure are addressed in the other tutorials.
Select a few to get a feel for different motifs and
domains.
http://www.clunet.edu/BioDev/omm/exhibits.htm#displays
Exercise:
Part 1: Viewing molecular models
Note: If you are already familiar with using RasMol or
Chime and feel you don't need the review, you can skip
section A; go to section B.
A. Finding and displaying a protein.
1. Go to NCBI:
http://www.ncbi.nlm.nih.gov/Structure/
2. Type the protein you wish to find in the search
bar. Try one of the following if you are looking for some
place to start:
Protein | # Amino acids
lysozyme - a lytic enzyme found in tears | 129
ferredoxin - involved in electron transport | 97
IP3 - signal transduction protein, opens Ca++ channels | 106
ARF - involved in assembling coated vesicles | 175
- Try to start with smaller proteins, such as those
listed here. When you are comfortable in navigating, try
more complex proteins. You may choose a protein of
interest to you, or if you need ideas to get started try
one of the following: "Fab", "C1q", "reverse
transcriptase", "chaperonin", or "porin".
3. A structure query page will come up with all
documents in the NCBI database that have protein structures
related to your inquiry. Click on "retrieve documents".
4. A page of summaries will appear giving a short
description of the proteins found. Browse the list, then
select one by checking the box. Then click on "view
structure summary".
5. A structure summary page will come up listing
the compound, taxonomy, etc. Scroll down until you see
Options, Viewer, and Complexity at the
bottom of the page.
6. Under "Viewer" select the "RasMol" button. Then
click on "View/Save Structure".
7. Once the protein model comes up, click on the
MDL prompt in the bottom right corner of the
screen.
a. Drag down to "color" and highlight
"groups". This allows you to easily view a polypeptide
from its amino end to its carboxyl terminus through the
change of rainbow hues.
b. Drag down to "display" and click on
"ribbon". How many different ribbons does your protein
have? What does each ribbon represent? Try some of the
other display types.
c. Look for different structural motifs, such
as alpha-helix, beta-sheet,
and hairpin turn. Which ones, and how many of each, did
you find on your selected protein?
d. Place the arrow on the protein, click and
hold down the mouse. As you move the mouse, you can move
the protein. As you move the protein around, do you see
anything that you couldn't see before? Try also stereo
display and rotation.
e. Try marking individual amino acids. For
example, select cysteine to locate disulfide bond sites.
Or try to mark amino acids which you would predict as
being found on hairpin loops. [See "Shorthand
symbols for amino acids" for abbreviations. See a
biochem or molecular biology reference for amino acid
structures. Or go to http://www.clunet.edu/BioDev/omm/exhibits.htm#displays
for an on-line reference (look under Hall of
Introductory Exhibits for Amino Acid
Structures).]
f. For a handy table for mouse controls while
using Chime, go to: http://www.clunet.edu/BioDev/omm/exhibits.htm
g. For color schemes used, see the entry in the
RasMol manual: http://info.bio.cmu.edu/Courses/BiochemMols/RasFrames/SHAPELY.HTM.
To browse the full manual, click on "Table of Contents"
or go to: http://info.bio.cmu.edu/Courses/BiochemMols/RasFrames/TOC.HTM
8. Pull up another protein and compare the two.
See 2 above. Write a brief summary of the comparison.
[See summary questions below.]
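The questions in steps 7b and 7e (how many ribbons, and where the cysteines sit) can also be checked outside the viewer. The sketch below is a minimal illustration, assuming you have the structure as PDB-format text with standard fixed-column ATOM records. The function names, the synthetic demo coordinates, and the 2.5 angstrom cutoff for flagging a possible disulfide are assumptions for this example. A real S-S bond is about 2.05 angstroms long, so the cutoff is deliberately loose.

```python
import math


def parse_atoms(pdb_text):
    """Yield (atom_name, res_name, chain_id, res_seq, (x, y, z)) per ATOM record."""
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        yield (
            line[12:16].strip(),   # atom name, columns 13-16
            line[17:20].strip(),   # residue name, columns 18-20
            line[21],              # chain identifier, column 22
            int(line[22:26]),      # residue number, columns 23-26
            (float(line[30:38]), float(line[38:46]), float(line[46:54])),
        )


def residues_per_chain(pdb_text):
    """Count residues in each chain by counting alpha-carbon (CA) atoms."""
    counts = {}
    for name, _res, chain, _seq, _xyz in parse_atoms(pdb_text):
        if name == "CA":
            counts[chain] = counts.get(chain, 0) + 1
    return counts


def disulfide_candidates(pdb_text, cutoff=2.5):
    """Pair cysteine SG atoms closer than `cutoff` angstroms (assumed cutoff)."""
    sg = [(chain, seq, xyz)
          for name, res, chain, seq, xyz in parse_atoms(pdb_text)
          if res == "CYS" and name == "SG"]
    pairs = []
    for i in range(len(sg)):
        for j in range(i + 1, len(sg)):
            if math.dist(sg[i][2], sg[j][2]) <= cutoff:
                pairs.append((sg[i][:2], sg[j][:2]))
    return pairs


def atom_line(serial, name, res, chain, seq, x, y, z):
    """Build a fixed-column ATOM record for the demo (synthetic coordinates)."""
    return (f"ATOM  {serial:>5} {name:<4} {res:>3} {chain}{seq:>4}    "
            f"{x:>8.3f}{y:>8.3f}{z:>8.3f}")


# Synthetic two-chain fragment, not a real structure.
demo = "\n".join([
    atom_line(1, "CA", "CYS", "A", 6, 0.0, 0.0, 0.0),
    atom_line(2, "SG", "CYS", "A", 6, 1.0, 1.0, 0.0),
    atom_line(3, "CA", "GLY", "A", 7, 3.8, 0.0, 0.0),
    atom_line(4, "CA", "CYS", "B", 30, 4.0, 3.0, 0.0),
    atom_line(5, "SG", "CYS", "B", 30, 2.5, 2.3, 0.0),
])
print(residues_per_chain(demo))
print(disulfide_candidates(demo))
```

Each chain in the count corresponds to one ribbon in the viewer. The disulfide check is distance-only, so confirm any candidate pair by looking at it in the model.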
B. Digging deeper for details on structure- other
sites to explore.
If you want to find out more about your protein, or any
other protein, you need to go to another database or use a
different application. This is easy to do! For your protein,
many of these databases and applications can be accessed
from your "Structure Summary" page.
1. From the "Structure Summary" page, click on the
colored number that is listed after the "PDB Id".
2. You should see a table with information
regarding who studied the protein and methods used for
studying the protein. If you scroll down, you will see
headings on the left that read "Structural Neighbors",
"Geometry", "Other Sources", etc. This is the spot. As you
browse, do check out "Other Sources"- there is much to be
found here. Have fun finding literature and more information
about your protein. There are some guided activities for
some of these sites in Part 2, so exploring now will help
you later. A few sites of interest are listed here.
a. SCOP provides structural
information, including protein classification organized
in a hierarchy, a list of related proteins along with
links to NCBI and structural views.
b. PRODOM identifies domains which can be found
in other proteins. By selecting each identified domain,
you can get an alignment and a basic tree, along with
access to other information and links for other related
proteins.
c. PROSITE is a database of protein
families and domains located at ExPASy. Lots of tools are
available here.
d. STING allows viewing a 3D structure which is
interactively linked to the primary sequence. There are
other features which are handy, such as chain selection,
background color selection, and toggling water and
ligands if present. Chime controls are also active.
e. GRASS allows use of three types of viewers,
with up-front selection of parameters and viewing
specifications. It supports viewing in Chime, GRASP, and
VRML. For VRML, there is a link to downloading
CosmoPlayer on the Quick Start guide page. GRASP requires
an SGI platform, so for now that is not available. However,
click on the link below to examine what you can learn
about surface structure and electrostatic forces:
http://honiglab.cpmc.columbia.edu/grasp/
3. Another good one to try is PDB Sum. You
can get there through "Other Sources" or by using the URL
below: [Save your PDB Id code to enter at the
site.]
http://www.biochem.ucl.ac.uk/bsm/pdbsum
At PDB Sum, you can view secondary structure motifs, and
other details of your sequence.
a. Explore what is available on the main
page that comes up after entering the PDB Id code. Note
that each chain of a multi-chain structure has a separate
display set of links along with the secondary structure
map.
b. Pfam is well worth exploration. It
identifies regions and major motifs in families and
allows direct linking to related structures.
c. FSSP is useful if you are looking for
related proteins with alignment scores. Many of the
returned sequences are linked as well.
Note that you can directly link to many of the
sites which can also be reached on the "Other Sources"
page at PDB. However, some links are not found on the PDB
table, such as Pfam and FSSP. Many of the linked sites
are interlinked so navigation among them is quite
simple.
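When you read a secondary structure map such as the one described in step 3a, it can help to tally the elements rather than count them by eye. The sketch below assumes you have typed or copied the map as a one-letter-per-residue string, using H for helix, E for strand, and - for everything else. That lettering is a common convention but is an assumption here, as are the function name and the example string.

```python
from itertools import groupby


def count_segments(ss: str, min_len: int = 1) -> dict:
    """Count contiguous runs of each secondary-structure letter."""
    counts = {}
    for letter, run in groupby(ss):
        if len(list(run)) >= min_len:
            counts[letter] = counts.get(letter, 0) + 1
    return counts


# Made-up per-residue assignment, not taken from a real entry.
ss = "--HHHHHHH---EEEEE--TT--EEEEE---HHHH-"
segments = count_segments(ss)
print("helices:", segments.get("H", 0), "strands:", segments.get("E", 0))
```

Note that this counts strands, not sheets. A beta-sheet is built from two or more strands, so deciding which strands pair into a sheet still requires looking at the 3D model.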
C. Learning more about the viewers and using them for
presentations.
If you want to learn more about exploring protein
structure, making illustrations of protein models, and
seeing how these methods can be applied, try any of the
following tutorials:
1. The new CN3D has some nice advantages and
features. It is pretty easy to learn, and you'll learn more
about proteins in the process. Some students have preferred
this viewer to Chime/RasMol when viewing primary and
tertiary structure together. [STING does this as well.
See B2d above.]
http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3dtut.shtml
2. The Online Macromolecular Museum's tutorial is
helpful in learning Chime scripting. You can learn how to
organize and execute some of the image types you have seen
in the structure tutorials.
http://www.clunet.edu/BioDev/omm/howdo.htm
Part 2: Exploring protein folding
A. Go to Prion Protein Structure.
Work through this interesting problem of protein
folding and the consequences of alternate folding
patterns.
B. Go to Summary questions.
Part 3: Exploring on your own
A. Select a protein in which you have a particular
interest. Alternatively, select a function or process which
interests you, then find a protein which performs that
function or is involved in the process.
1. Identify the class of protein to which it
belongs. Explore how its structure contributes to its
function.
2. Run a multiple sequence alignment (MSA) on a few closely
related sequences.
Look for patterns in the sequence which relate to secondary
structural elements, such as helices, beta-strands, and
loops. Which regions are highly conserved? Where are they in
the tertiary [and quaternary?] structure? How do
they contribute to function?
3. Look for different types of motifs and domains
in your protein. How do they contribute to function?
4. Examine the literature on the structural
determination of your protein. How was the structure of your
protein determined? What were some of the challenges in
determining the structure? How were they resolved? If there
are still areas unresolved in the structure, explore what
the problems are and how they can be resolved.
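For the question about conserved regions in step 2, one simple way to scan an alignment is to score each column by how many sequences share its most common residue. The sketch below is a minimal illustration, assuming the alignment is given as equal-length strings with - for gaps. The function names, the default threshold, and the toy sequences are assumptions for this example and do not come from any alignment program.

```python
from collections import Counter


def column_conservation(alignment):
    """Per column: (most common residue, fraction of sequences sharing it)."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    result = []
    for column in zip(*alignment):
        residues = [c for c in column if c != "-"]
        if not residues:
            result.append(("-", 0.0))
            continue
        residue, n = Counter(residues).most_common(1)[0]
        # Gaps count against conservation: divide by the number of sequences.
        result.append((residue, n / len(column)))
    return result


def conserved_positions(alignment, threshold=1.0):
    """1-based column numbers whose conservation meets the threshold."""
    return [i + 1
            for i, (_res, frac) in enumerate(column_conservation(alignment))
            if frac >= threshold]


# Toy alignment, not real protein sequences.
msa = [
    "MKCAG-LH",
    "MRCAGTLH",
    "MKCSG-LH",
]
print(conserved_positions(msa))
```

Column numbers refer to the alignment, not to any one sequence, so map them back to residue numbers before locating them in the 3D structure.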
Summary Questions:
Try to limit your answers to 2-3 typed pages [12 pt
font]. This length should be sufficient for your
comments and any appropriate copy/pasted examples. [You
need not retype or copy/paste the questions as part of your
responses, but do number them please.]
1. Using Chime:
a. Which protein did you first pick?
[Give the ID code as well.] How many different
peptides did it have? Did it have any beta-sheets? If so,
how many? Did it have any alpha-helices? If so, how many?
Did it have any hairpin loops? If so, how many? Were
there any other features worth noting? If so, what?
b. What was the second protein you picked?
Please include the ID code. Review its structural
characteristics, as you did for the first one. In what
ways was it similar to your first protein? In what ways
was it different?
2. Exploring other sites:
a. Briefly describe two other
applications you tried which related to learning
something more about the structure of one of your
proteins.
b. To what structural family or families did
your protein belong? Where did you find this
information?
c. What other viewer interface did you try? How
did it help in viewing your protein?
3. Discuss prion protein folding.
a. How do 1E1G and 1E1J differ?
b. How does 1B10 compare to the first two
models?
c. Do you think beta-sheets as seen in 1HQO
could contribute to the stabilization of the dimer? If
so, how? If not, why not?
d. After reading the articles and browsing the
different structure models, what do you think is the most
likely molecular scenario for the formation of
PrPSc from PrPC? Support your
argument with cited evidence.
4. Discuss your protein.
a. Identify your protein, including ID
code. To what class of protein does it belong? Describe
its structure. Describe its function. What parts of the
structure are key to its function?
b. What regions are highly conserved? What
sequence patterns did you find?
c. What types of motifs and domains do you find
in your protein? How do they contribute to function?
d. How was the structure of your protein
determined? What were some of the challenges in
determining the structure? How were they resolved? If
there are still areas unresolved in the structure, what
are the problems? How can they be resolved?
Further exploration:
1. For more on protein architecture:
http://www.kumc.edu/research/medicine/biochemistry/bioc800/pro-lobj.htm
2. For further reading on protein structure, an
excellent book:
Branden, C. and Tooze, J., 1999. Introduction
to Protein Structure, 2e. Garland. ISBN: 0815323050.