The paper by SF Altschul et al. is entitled "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs."

The project consists of participation in a class discussion of the paper (10 points), analysis of the PSI-BLAST paper, and a written report (10 points). Total points = 20. The report is due at the beginning of class on Thursday, November 20, 2003.

The paper selected for this project addresses the problem of building, from sequences in a database, a profile that describes the family to which the query sequence belongs.

1. As you read the paper, make note of any terms or concepts that you do not understand.
2. Create an outline or idea map of the concepts you do understand.
3. In exploring questions raised by the paper, use other references in the articles library, resources in the exercises, and glossaries.

Discussion and report summary: The following questions should guide both the discussion and your report.

1. What changes did the authors make to improve BLAST's performance on protein databases?
2. What do the S' and E values represent in the statistical treatment of high-scoring segment pairs?
3. How, in general, does the two-hit scheme work for triggering extensions?
4. Why is the two-hit algorithm better and faster than the one-hit extension trigger?
5. How does gapped BLAST work?
6. What makes the gapped BLAST extension scheme faster and better than the Smith-Waterman dynamic programming implementation in FASTA?
7. How does PSI-BLAST collect sequences for a position-specific scoring matrix for a given query sequence?
8. How does PSI-BLAST create the scoring matrix?
9. What is the purpose of the iteration process?