Bioinformatics Unit 5: Exercise

SSU Home | SSU Biology | SiteMap | Search | CourseInfo | Forum | Home

Glossary   |   Self Tests   |   Software   |   Objectives   |   Articles


Unit 5: Structure Prediction
Exercise 1

Pre-Exercise

Exercise 1:

Part 1- Viewing molecular models

Part 2- Exploring protein folding

Part 3- Exploring on your own

Summary questions

Further exploration


Exercise 1: Protein Structure


Objectives:

1. Become familiar with the basics of protein structure.

  • See relationship of primary, secondary, tertiary, and quaternary structure of proteins.
  • Recognize structural motifs and their relationship to overall structure and to function.
  • Recognize a domain and its relationship to overall structure and function.

2. Become familiar with on-line resources for protein structure and learn to use a molecular viewer.

  • Know how to find a particular protein of interest.
  • Know how to gather information on a particular protein.
  • Know how to use a molecular viewer to gain information.

3. Understand the principles used to predict and to model protein structures.

Introduction:

One of the best ways to understand structure-function relationships is to be able to view the actual structure. The next best thing is to view a structural model which is constructed using as much hard data relating to the structure as possible. The activities in this exercise will introduce a variety of tools useful in exploring protein structure and viewing 3-D models. Although it may seem a bit like skipping to the last chapter of a book to see how the story turns out before actually reading it, the first activity is to view 3-D structures of some sample proteins. The structures are derived from calculations based on X-ray crystallography data, and frequently NMR data, from purified samples of the proteins.

After you have become familiar with Chime, a 3-D viewer, you will have the opportunity to explore other tools useful in examining a variety of characteristics which contribute to the final calculated structure, such as inter-atomic forces, hydrophobicity-hydrophilicity, and structural motifs. Multiple sequence alignments allow for the identification of conserved regions. These conserved regions contain sequence patterns which relate to motifs and patterns of secondary structure.

Protein folding to a final tertiary or quaternary structure depends on many factors. Secondary structures, along with the character of the individual amino acids involved, contribute significantly. However, that is not the whole story. During assembly, polypeptides are guided into their final structures by chaperonins, and the formation of intrachain, and sometimes interchain, disulfide bridges also shapes the result. If present, glycosylations and prosthetic groups also contribute to the final structural conformation.

By comparing the sequence of a protein of interest to sequences of proteins whose structures have been determined, it is possible to infer the probable structure of that protein. If the sequence homology is high and the function is known to be the same, or at least very similar, then the inferred structure is likely to be correct or nearly correct. If the sequence alignments show only certain homologous regions with low global homology, then the overall structure cannot be accurately determined without more data. The ideal is to purify the protein and crystallize it for X-ray diffraction analysis. NMR analysis of a stable pure protein is also possible, but the results are more ambiguous than with X-ray crystallography.
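
The inference described in the introduction rests on how similar two aligned sequences are. As a minimal sketch, assuming the two sequences have already been aligned (with "-" marking gaps), percent identity can be computed as shown here. The function name and the example fragments are hypothetical and are not part of the course materials.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == "-" or res_b == "-":
            continue  # gap columns are not counted
        compared += 1
        if res_a == res_b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


# Hypothetical pre-aligned fragments, for illustration only.
query = "MKV-LAAGIC"
template = "MKVQLSAG-C"
print(f"{percent_identity(query, template):.1f}% identity")
```

How high the identity must be before an inferred structure can be trusted depends on alignment length and on the protein family, so treat the number as one piece of evidence rather than a verdict.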

After exploring the available tools and playing with a couple of worked examples in this exercise, you will have the opportunity of exploring a protein of interest in the associated project.

Reading in Brown:

  • Review Chapter 6 pp. 85-98 on multiple sequence alignments.
  • Then read Chapter 7 pp. 99-119 on structure-function relationships.

Reading in Gibas and Jambeck:

  • Review Chapter 8 pp. 193-197 on multiple sequence alignments and pp. 205-214 on profiles and motifs.
  • Review chemistry and protein structure viewers Chapter 9 pp. 215-234
  • Prediction of structure from sequence: Chapter 10 pp. 268-293.

There are summary questions at the end of this section. Read them through before you start browsing. You can answer them as you go, or answer them after browsing the following sites. Points = 10. Due 11/13.


Pre-Exercise:

As a warm-up activity to molecular modeling, you'll want to become familiar both with the structure of proteins at different levels of organization and with the structure viewers. Here you have some choice of tutorials. If you have never seen any of these sites before, surf quickly to see what they have to offer. You might want to spend most of your time with just one site, and then explore the others to enrich your experience.

1. The first site is nicely organized as notes with interactive illustrations. This site presents secondary and tertiary structures as interactive images. You can click and drag to move the images in 3D. There are also links to expand on some topics.

http://www.cm.utexas.edu/CH339K/Hackert/Proteins/Proteins.htm

2. The Lehninger book support site below has well developed tutorials. Click on "Lehninger 3D Structure Models". Select "Protein Architecture". This is a new tutorial, and not quite complete. It stops before getting to tertiary structure, but through secondary structure it is quite good. As you go through the guided explanation, you'll see different types of displays of the structures, along with the details of the tutorial. This will give you an idea of what you can do during this exercise, and after, with proteins of your choice. [Along the way you'll be learning more about protein chemistry as well.]

http://www.worthpublishers.com/lehninger/

After doing "Protein Architecture", select one or more of the other tutorials. The main take-home lesson in these tutorials is the relationship of structure-function, with a variety of examples of how specific structures contribute to general and specific functions.

3. Another excellent site for macromolecular tutorials is the OMM site. "Amino Acid Structures" is an excellent 3D reference site and simple to use. "Chemical bonds and Protein Structure" goes into detail of bond formation and how it contributes to structure. Secondary and tertiary structure are addressed in the other tutorials. Select a few to get a feel for different motifs and domains.

http://www.clunet.edu/BioDev/omm/exhibits.htm#displays

Exercise:

Part 1: Viewing molecular models

Note: If you are already familiar with using RasMol or Chime and feel you don't need the review, you can skip section A; go to section B.

A. Finding and displaying a protein.

1. Go to NCBI:

http://www.ncbi.nlm.nih.gov/Structure/

2. Type the protein you wish to find in the search bar. Try one of the following if you are looking for some place to start:

Protein       Description                                          # Amino acids
lysozyme      a lytic enzyme found in tears                        129
ferredoxin    involved in electron transport                       97
IP3           signal transduction protein, opens Ca++ channels     106
ARF           involved in assembling coated vesicles               175

  • Try to start with smaller proteins, such as those listed here. When you are comfortable in navigating, try more complex proteins. You may choose a protein of interest to you, or if you need ideas to get started try one of the following: "Fab", "C1q", "reverse transcriptase", "chaperonin", or "porin".

3. A structure query page will come up with all documents in the NCBI database that have protein structures related to your inquiry. Click on "retrieve documents".

4. A page of summaries will appear giving a short description of the proteins found. Browse the list, then select one by checking the box. Then click on "view structure summary".

5. A structure summary page will come up listing the compound, taxonomy, etc. Scroll down until you see Options, Viewer, and Complexity at the bottom of the page.

6. Under "Viewer", select the "RasMol" option. Then click on "View/Save Structure".

7. Once the protein model comes up, click on the MDL prompt in the bottom right corner of the screen. 

a. Drag down to "color" and highlight "groups". This allows you to easily view a polypeptide from its amino end to its carboxyl terminus through the change of rainbow hues.

b. Drag down to "display" and click on "ribbon". How many different ribbons does your protein have? What does each ribbon represent? Try some of the other display types.

c. Look for different structural motifs, such as alpha-helix, beta-sheet, and hairpin turn. Which ones, and how many of each, did you find on your selected protein?

d. Place the arrow on the protein, click and hold down the mouse. As you move the mouse, you can move the protein. As you move the protein around, do you see anything that you couldn't see before? Try also stereo display and rotation.

e. Try marking individual amino acids. For example, select cysteine to locate disulfide bond sites. Or try to mark amino acids which you would predict as being found on hairpin loops. [See "Shorthand symbols for amino acids" for abbreviations. See a biochem or molecular biology reference for amino acid structures. Or go to http://www.clunet.edu/BioDev/omm/exhibits.htm#displays for an on-line reference (look under Hall of Introductory Exhibits for Amino Acid Structures).]

f. For a handy table for mouse controls while using Chime, go to: http://www.clunet.edu/BioDev/omm/exhibits.htm

g. For color schemes used, see the entry in the RasMol manual: http://info.bio.cmu.edu/Courses/BiochemMols/RasFrames/SHAPELY.HTM. To browse the full manual, click on "Table of Contents" or go to: http://info.bio.cmu.edu/Courses/BiochemMols/RasFrames/TOC.HTM

8. Pull up another protein and compare the two. See 2 above. Write a brief summary of the comparison. [See summary questions below.]
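
The counts asked for in step 7 (chains, helices, sheets, cysteines) can be cross-checked against the text of the structure file itself. The sketch below assumes the structure has been saved in the classic fixed-column PDB format and reads only the ATOM, HELIX, and SHEET records. The function name and the sample records are hypothetical. HELIX records cover all helix types, not only alpha-helices, so the count may differ slightly from what you see in the ribbon display.

```python
def summarize_pdb(pdb_text: str) -> dict:
    """Count chains, helices, sheets, and cysteine residues in PDB-format text."""
    chains = set()
    helices = 0
    sheet_ids = set()
    cysteines = set()
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record == "ATOM" and len(line) > 21:
            chain_id = line[21]                # column 22: chain identifier
            chains.add(chain_id)
            if line[17:20] == "CYS":           # columns 18-20: residue name
                cysteines.add((chain_id, line[22:26].strip()))
        elif record == "HELIX":
            helices += 1                       # one HELIX record per helix
        elif record == "SHEET" and len(line) > 13:
            sheet_ids.add(line[11:14].strip()) # columns 12-14: sheet identifier
    return {
        "chains": sorted(chains),
        "helices": helices,
        "sheets": len(sheet_ids),
        "cysteines": len(cysteines),
    }


# Hypothetical records laid out in PDB columns, for illustration only.
SAMPLE = """\
HELIX    1   1 GLY A    4  HIS A   15  1
SHEET    1   A 3 LYS A   1  THR A   5  0
SHEET    2   A 3 ASN A  44  ARG A  45 -1
ATOM      1  N   LYS A   1
ATOM      2  SG  CYS A   6
ATOM      3  SG  CYS B  30
"""
print(summarize_pdb(SAMPLE))
```

Each SHEET record describes one strand, so the sketch counts distinct sheet identifiers rather than records. Cysteines found this way are only candidate disulfide sites; confirm the actual bonds in the viewer.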


B. Digging deeper for details on structure- other sites to explore.

If you want to find out more about your protein, or any other protein, you need to go to another database or use a different application. This is easy to do! For your protein, many of these databases and applications can be accessed from your "Structure Summary" page.

1. From the "Structure Summary" page, click on the colored number that is listed after the "PDB Id".
 

2. You should see a table with information regarding who studied the protein and methods used for studying the protein. If you scroll down, you will see headings on the left that read "Structural Neighbors", "Geometry", "Other Sources", etc. This is the spot. As you browse, do check out "Other Sources"- there is much to be found here. Have fun finding literature and more information about your protein. There are some guided activities for some of these sites in Part 2, so exploring now will help you later. A few sites of interest are listed here.

a. SCOP provides structural information, including protein classification organized in a hierarchy, a list of related proteins along with links to NCBI and structural views.

b. PRODOM identifies domains which can be found in other proteins. By selecting each identified domain, you can get an alignment and a basic tree, along with access to other information and links for other related proteins.

c. PROSITE is a database of protein families and domains located at ExPASy. Lots of tools are available here.

d. STING allows viewing a 3D structure which is interactively linked to the primary sequence. There are other features which are handy, such as chain selection, background color selection, and toggling water and ligands if present. Chime controls are also active.

e. GRASS allows use of three types of viewers, with up-front selection of parameters and viewing specifications. It supports viewing in Chime, GRASP, and VRML. For VRML, there is a link for downloading CosmoPlayer on the Quick Start guide page. GRASP requires an SGI platform, so for now that is not available. However, click on the link below to examine what you can learn about surface structure and electrostatic forces:

http://honiglab.cpmc.columbia.edu/grasp/

3. Another good one to try is PDB Sum. You can get there through "Other Sources" or by using the URL below: [Save your PDB Id code to enter at the site.]

http://www.biochem.ucl.ac.uk/bsm/pdbsum

At PDB Sum, you can view secondary structure motifs, and other details of your sequence.

a. Explore what is available on the main page that comes up after entering the PDB Id code. Note that each chain of a multi-chain structure has a separate display set of links along with the secondary structure map.

b. Pfam is well worth exploration. It identifies regions and major motifs in families and allows direct linking to related structures.

c. FSSP is useful if you are looking for related proteins with alignment scores. Many of the returned sequences are linked as well.

Note that you can directly link to many of the sites which can also be reached on the "Other Sources" page at PDB. However, some links are not found on the PDB table, such as Pfam and FSSP. Many of the linked sites are interlinked so navigation among them is quite simple.
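
Because the same PDB Id is typed into several of these sites, it helps to normalize it once. The sketch below assumes the classic four-character identifier (a digit followed by three letters or digits). The download address pattern at files.rcsb.org is not mentioned in this exercise and is included as an assumption; check it against the current RCSB documentation before relying on it. Function names are hypothetical.

```python
import re

# Classic PDB identifiers: one digit followed by three alphanumeric characters.
PDB_ID_PATTERN = re.compile(r"^[0-9][A-Za-z0-9]{3}$")


def normalize_pdb_id(raw: str) -> str:
    """Trim whitespace, upper-case, and validate a four-character PDB Id."""
    candidate = raw.strip().upper()
    if not PDB_ID_PATTERN.match(candidate):
        raise ValueError(f"not a four-character PDB Id: {raw!r}")
    return candidate


def rcsb_download_url(pdb_id: str) -> str:
    """Build a download address for the PDB-format file (assumed URL pattern)."""
    return f"https://files.rcsb.org/download/{normalize_pdb_id(pdb_id)}.pdb"


# The prion structures used in the summary questions of this exercise.
for code in ["1e1g", "1E1J", " 1b10 ", "1hqo"]:
    print(normalize_pdb_id(code), rcsb_download_url(code))
```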


C. Learning more about the viewers and using them for presentations.

If you want to learn more about exploring protein structure, making illustrations of protein models, and seeing how these methods can be applied try any of the following tutorials:

1. The new CN3D has some nice advantages and features. It is pretty easy to learn, and you'll learn more about proteins in the process. Some students have preferred this viewer to Chime/RasMol when viewing primary and tertiary structure together. [STING does this as well. See B2d above.]

http://www.ncbi.nlm.nih.gov/Structure/CN3D/cn3dtut.shtml

2. The Online Macromolecular Museum's tutorial is helpful in learning Chime scripting. You can learn how to organize and execute some of the image types you have seen in the structure tutorials.

http://www.clunet.edu/BioDev/omm/howdo.htm

Part 2: Exploring protein folding

A. Go to Prion Protein Structure.

Work through this interesting problem of protein folding and the consequences of alternate folding patterns.

B. Go to Summary questions.

Part 3: Exploring on your own

A. Select a protein in which you have a particular interest. Alternatively, select a function or process which interests you, then find a protein which performs that function or is involved in the process.

1. Identify the class of protein to which it belongs. Explore how its structure contributes to its function.

2. Run an MSA on a few closely related sequences. Look for patterns in the sequence which relate to secondary structural elements, such as helices, beta-strands, and loops. Which regions are highly conserved? Where are they in the tertiary [and quaternary?] structure? How do they contribute to function?

3. Look for different types of motifs and domains in your protein. How do they contribute to function?

4. Examine the literature on the structural determination of your protein. How was the structure of your protein determined? What were some of the challenges in determining the structure? How were they resolved? If there are still areas unresolved in the structure, explore what the problems are and how they can be resolved.
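
For step 2 above, conserved regions can be picked out of a finished alignment column by column. The following is a minimal sketch, assuming the MSA has already been run and exported as equal-length strings with "-" for gaps. The scoring rule (fraction of sequences sharing the most common residue), the default threshold, the minimum run length, and the example sequences are all illustrative assumptions, not a standard conservation measure.

```python
from collections import Counter


def column_conservation(alignment):
    """Per column: fraction of sequences that share the most common residue."""
    if len({len(seq) for seq in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    scores = []
    for column in zip(*alignment):
        residues = [res for res in column if res != "-"]
        if not residues:
            scores.append(0.0)  # column contains only gaps
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))  # gaps lower the score
    return scores


def conserved_runs(scores, threshold=1.0, min_length=3):
    """Return (start, end) index pairs, 0-based inclusive, of conserved stretches."""
    runs = []
    start = None
    for index, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = index
        else:
            if start is not None and index - start >= min_length:
                runs.append((start, index - 1))
            start = None
    if start is not None and len(scores) - start >= min_length:
        runs.append((start, len(scores) - 1))
    return runs


# Hypothetical aligned fragments, for illustration only.
msa = ["MKCGHLLDE", "MRCGHLIDE", "MKCGHL-DE"]
print(conserved_runs(column_conservation(msa)))
```

The reported positions are alignment columns, so map them back to residue numbers in your structure before looking for them in the viewer.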


Summary Questions:

Try to limit your answers to 2-3 typed pages [12 pt font]. This length should be sufficient for your comments and any appropriate copy/pasted examples. [You need not retype or copy/paste the questions as part of your responses, but do number them please.]

1. Using Chime:

a. Which protein did you first pick? [Give the ID code as well.] How many different peptides did it have? Did it have any beta-sheets? If so, how many? Did it have any alpha-helices? If so, how many? Did it have any hairpin loops? If so, how many? Were there any other features worth noting? If so, what?

b. What was the second protein you picked? Please include the ID code. Review its structural characteristics, as you did for the first one. In what ways was it similar to your first protein? In what ways was it different?

2. Exploring other sites:

a. Briefly describe two other applications you tried which related to learning something more about the structure of one of your proteins.

b. To what structural family or families did your protein belong? Where did you find this information?

c. What other viewer interface did you try? How did it help in viewing your protein?

3. Discuss prion protein folding.

a. How do 1E1G and 1E1J differ?

b. How does 1B10 compare to the first two models?

c. Do you think beta-sheets as seen in 1HQO could contribute to the stabilization of the dimer? If so, how? If not, why not?

d. After reading the articles and browsing the different structure models, what do you think is the most likely molecular scenario for the formation of PrPsc from PrPc? Support your argument with cited evidence.

4. Discuss your protein.

a. Identify your protein, including ID code. To what class of protein does it belong? Describe its structure. Describe its function. What parts of the structure are key to its function?

b. What regions are highly conserved? What sequence patterns did you find?

c. What types of motifs and domains do you find in your protein? How do they contribute to function?

d. How was the structure of your protein determined? What were some of the challenges in determining the structure? How were they resolved? If there are still areas unresolved in the structure, what are the problems? How can they be resolved?

 

Further exploration:

1. For more on protein architecture:

http://www.kumc.edu/research/medicine/biochemistry/bioc800/pro-lobj.htm

2. For further reading on protein structure, an excellent book:

Branden, C. and Tooze, J., 1999. Introduction to Protein Structure, 2e. Garland. ISBN: 0815323050.


Updated 11/11/03 by thatcher@sonoma.edu