Bioinformatics Unit 1: Exercise 1



Unit 1: Databases & Queries
Exercise 1


Exercise 1: Introduction to Bioinformatics

Internet as a Research Resource & Search Strategy Basics


Note 1: You should get in the habit of bringing a disk to class to save your log and any other useful files. You may use either a floppy or zip disk.

Note 2: This exercise is in the format and design of later exercises, but it is not graded for points since the basic material should be review. If you turn in answers to the summary questions, you will receive feedback.


Objectives:

1. Gain a basic appreciation of what bioinformatics is and how it can be used.

2. Become familiar with on-line resources helpful in learning bioinformatics.

  • Be able to comfortably navigate on the Web.
  • Become familiar with the class web site.
  • Be able to use search engines efficiently.
  • Identify and use support sites relating to bioinformatics.
  • Be able to evaluate source and quality of information obtained.

3. Learn to use a log as an organizational resource and tool.

  • Keep track of bookmarked sites.
  • Annotate sites and resources; collect information; note location of downloaded files.
  • Use as resource for homework assignments and papers, including but not limited to content and references.

Introduction:

Although these exercises are being introduced in a PC computer lab, the exercises are not platform specific. You may choose to use either Netscape or Explorer as the browser. Please note, however, that you'll want to use Netscape when using plug-ins for viewing molecular models. This introduction is geared [or tries to be] both to those with little experience using the Internet and web browsers, and to those with a significant amount of experience.

If instructions seem to move along too fast, ask for help from other students and from me. If instructions seem too detailed, and you just want to get on with it, please do.

If using the iMac labs, you will need a 100 or 250 MB zip disk to save your work. Floppy drives have been phased out in favor of zip drives. You may also print your work on the printer in the lab where you are working. You need to supply your own paper. For the mobile lab, we do not yet have a printer available.

It is strongly suggested that you keep a log of your sessions on-line. Although you may bookmark sites while using a lab computer, the bookmarks are removed daily. Even if you are using your own computer, try using a log anyway. You can add comments, store specific database sources, and copy/paste whole pages of information for future use. A current version of Word is quite useful for a log, because URLs [Web addresses] can be turned into active link sites. In other words, you can carry your bookmarks with you, without having to retype them. [For more on logs, go to Keeping a log.]
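For those who would rather keep the log as a plain text file than as a Word document, the following is a minimal sketch of what one log entry might record. The `LogEntry` class, its field names, and the visit date are made up for illustration; they are not part of the class materials.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class LogEntry:
    """One visited site, with enough detail to find and cite it again."""
    title: str
    url: str
    visited: date
    notes: str = ""
    downloaded_files: list[str] = field(default_factory=list)

    def as_text(self) -> str:
        """Render the entry as plain text for pasting into a log file."""
        lines = [
            self.title,
            f"  URL: {self.url}",
            f"  Visited: {self.visited.isoformat()}",
        ]
        if self.notes:
            lines.append(f"  Notes: {self.notes}")
        for path in self.downloaded_files:
            lines.append(f"  File: {path}")
        return "\n".join(lines)


# Example entry; the date is hypothetical.
entry = LogEntry(
    title="University of Texas bioinformatics introduction",
    url="http://biotech.icmb.utexas.edu/pages/bioinfo.html",
    visited=date(2003, 9, 2),
    notes="Resource link on the homepage leads to a list of other sites.",
)
print(entry.as_text())
```

Recording the date of the visit along with the URL is the important habit here, since most citation styles for web sites ask for both.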

There are summary questions at the end of this section. Read them through before you start browsing. You can answer them as you go, or answer them after browsing the following sites. Points = 0. Due 9/4.


Pre-Exercise:

1. Spend some time becoming familiar with what is available on the class web site, if you haven't already done so. Feel free to print out sections to use as reference. In the past, many have found that having a hard copy of the exercises for reference was a useful addition to using the online version.

2. Optional. If you are not familiar with what the World Wide Web and the Internet are, or you would like to become a little more familiar, go to

http://www.december.com/web/text/

This online course covers a broad range of topics of interest to beginning users of the Web.


Exercise:

Start a log by opening a Word document. Size and position this window and the browser window so that you can see both and switch between them easily. Alternatively, you may prefer to work with both windows maximized and toggle between them by using the navigation bar.

When using your own computer or one to which you have regular access, I strongly recommend that you bookmark the sites you like. [Bookmarks are regularly removed from school computers, but they are still handy during a single session, so give them a try.]

1. Log on and open a browser. In this first section, there are links to representative resource sites providing useful background support for bioinformatics. Also included is a site introducing an idea for future development in the virtual world of information access. Briefly explore each one, noting what they have to offer locally and what links to other sites may be available.

  • For a basic introduction to some key components of bioinformatics, explore the site at the University of Texas:
    http://biotech.icmb.utexas.edu/pages/bioinfo.html

    Spend a little time now or come back later to explore further. The resource link on their homepage takes you to a great list of links to other sites.

  • See "Further exploration" at the end of this exercise for suggestions to extend this survey of sites further.

2. Before you get too far along in your log, it is good to know what you should be doing to record the necessary information for the sites you visit. This is useful for several reasons: 1) you want to return to a site or find a specific image or piece of information again; 2) you want to share your discovered site with someone else; 3) you need to cite your references in a report, paper, or on a web page. Just as you need to cite print references, web sites need to be cited as well. The related issue of plagiarism is also important. You will need to clearly distinguish your own original work and writing from that of others. Briefly explore the following sites. You can refer back to them as needed.

3. So far you have been visiting sites of static pages. These are being served up from both local and remote servers. It's nice having all these convenient links to click on. But what do you do if you want to find out about something not linked right in front of you, or you want to see what else might be available? You need to do a search. [For those of you rolling your eyes and yawning, feel free to skip ahead. However, just be careful that you don't miss something important.] Being able to conduct efficient and productive searches is a skill crucial to using the Internet and the databases associated with it. You will be learning how to search in a variety of contexts throughout the semester. For those of you not used to general searching on the web, you should try the following as a way of becoming familiar and comfortable in digging up information and resources of interest.

Look at some general browser search engines, such as Google, Dogpile, Ask Jeeves, and Yahoo [among many others]. When looking for something specific, sometimes you don't quite know where to start; or you thought you did, but came up empty. Sometimes, you just want to find out what is available in a general category. The search engines can be very useful, if you know how to use them efficiently. Different search engines will give different results.

  • Try out a broad term, such as "bioinformatics" or "genomics", on three different browser search engines. You can pick them from a list by clicking on "Search" on your browser navigation bar, or you can type in their URLs. I personally like

    [www.google.com; www.yahoo.com; www.ask.com]

    because they happen to work well in the science fields, whereas some of the others tend to have more depth in other areas.

    [www.dogpile.com] searches other search engines as a group, so it can be extremely useful when hunting for hard-to-find items. For each search, note how many hits you get and how relevant the top picks are to what you wanted.

  • Try a short string of words or a phrase, such as "prokaryotic molecular motors" or "DNA-based computer design". Try inserting Boolean operators ["and", "or", "not"] between the words to see what effect they have on your search results. [In some cases, you will need to capitalize the Boolean operators; in other cases, you may need to alter your notation.] Add or subtract some words. Not all search engines work the same way. Some require specific syntax in word strings. Check out their "Help" information on advanced searching. This familiarity will prove useful later on.

    Something to consider. How do you evaluate the quality of the sites found? Say for example you're looking for information on the latest treatments for skin cancer. How do you recognize the difference between a legitimate scientific report on a treatment trial versus a paper posted by someone with a financial interest in a particular approach versus something posted by a quack group? What can you do when performing a search to maximize useful hits while reducing worthless ones?


4. Now that you've had some experience with the Web and in finding sites of interest, it's time to take a step back to look at the bigger picture of how it all works. Try one of the following tutorials if you are unfamiliar with the structure and function of databases.

By combining what you see here with what you've seen browsing in your books, you should have a better appreciation of how the system works as a whole and have an understanding of what some of the challenges are. This perspective should prove useful as you progress in the course.

5. When you access Snoopy in the Schultz Center's University Library or via the Web, you are querying the catalogue database. As a way of introducing databases, the library is a good place to start. Being able to access current literature is an important skill in all fields of science. Bioinformatics is no exception. Browse the following sites to become familiar with what is available and to become comfortable with navigating journal databases. It is useful to try searching for something of interest as a way of testing your search techniques. This will give you an opportunity to practice and extend your search strategies into querying databases. Check your log for ideas of search terms, or come up with something new. How about finding out what's available on using BLAST [something we'll be doing next week] or sequencing genomes [part of Unit 2's focus]? As you visit each site, make note in your log of the site's capabilities and uses for future reference. Copy/pasting information from the web pages is a quick way to do this, along with citations and your own notes.

When searching database resources from a computer on campus, you need do nothing special to visit restricted sites. When accessing these restricted sites from an off-campus computer, you will first need to set up a proxy, using your library code number as the user ID and selecting a password [PIN]. If you have never done this, it is easy to do using the library's on-line form, available under "info & services". If you already have a PIN, you need not redo this unless you want to change your password. If you forgot your password, you need to visit the front desk in the library.

For access to Biosis, you no longer need an additional ID and password to enter Dialog. You can also access Science Citation Index and Biological Abstracts through Dialog's portal. Please restrict your uses to Biological Abstracts, Biosis and Science Citation Index. Other uses are restricted by contract. Many of the "off-limits" resources such as Eric and Medline are available by other means, as you will see. For further information visit the reference desk in the library.

Why use more than one database when conducting a search? As you explore, pay attention to what types of resources are available, the types of journals represented, and how well you do in obtaining useful titles related to your query. Knowing what types of databases to use for any specific query type is something that comes with familiarity with the contents of the databases available.

How do you obtain hard copies of the papers you find? This used to be simple. Go to a library which had the journal [sometimes this required a long trip], request it or find it on a shelf, queue up at the copier, and then finally start loading in coins. If you could not find a library with the journal, too bad. Now, however, we have many more ways of obtaining full text copies, but that adds to the challenge of learning the best approach, prioritizing, and keeping an eye on the budget. Keep this idea of access and acquisition in mind as you visit these different sites. Afterwards, reread the paragraph at the end of this section regarding some things to consider before establishing a priority path you can use to obtain your desired hard copies.

  • A good place to start is to become familiar with the tools available locally through the library to find specific journal papers or journal titles. The library subscribes not only to specific print versions of journals, but also to electronic versions of journals and to databases. [If you simply go looking for hard copies, you will miss out on a huge variety of resources available to you.] Go to
    http://libweb.sonoma.edu/collections/journals.html

    Here you can access Dialog's Biosis, Elsevier's Science Direct and Wiley's Interscience, along with other journal resources, via the Database list. You can use the journal locator and connect to New Jour, a full text resource. Also, periodically check Trial databases for new things. Note that many of these provide full text access.

    Another local site which is quite useful is the Cell Molecular Guide: http://libweb.sonoma.edu/research/subject/cellmol.html

  • After exploring Interscience through the library site, try going directly to it:
    http://www3.interscience.wiley.com/cgi-bin/simplesearch

    See if you find any difference in what you can do and how you can acquire a copy of a paper.

  • In addition to browsing New Jour, a visit to Highwire Press, the other large site for free journal access, is worthwhile. Both of these access sites are expanding and adding more journals. These sites are especially useful after you have found a citation and you want to check the availability of a specific journal. Check back periodically for updates so you can keep your acquisition strategy current.
    http://www.highwire.org/
  • Next give PubMed a try:


    [www.ncbi.nlm.nih.gov/PubMed]

    This is the freely accessible Medline, with full text journal articles available when marked. However, that does not mean they are all free. Some free ones are marked as such up front. Others may be free, but you have to check them out first, since the service providing access may have different agreements with different journals. Even if full text availability is not indicated, you may still be able to get it through some of the resources you have already visited. When searching outside of the library portal, it is a good idea to use your log as much as possible. Collect summaries and/or abstracts of the articles of interest. Then you can try different means of getting your hands on a copy.

  • Two other valuable sites to check are MedBioWorld and BioMedNet. MedBioWorld is a broad range search site, including journals and other reference resources. BioMedNet makes full text journal articles available, and is the home of the free e-journal H.M.S. Beagle. [Quick quiz: What is the significance of the name of this journal? Whose name is associated with it? Give yourself 2 points if you are correct.]

    [www.sciencekomm.at]

    [www.bmn.com]
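The PubMed search in the list above is done through the browser form. For those curious about repeating a search from a script later on, NCBI also offers a programmatic search service called E-utilities, which is not otherwise covered in this exercise. The following is a minimal sketch that only builds the search address and sends nothing over the network; the function name `pubmed_search_url` and the example search term are made up for illustration.

```python
from urllib.parse import urlencode

# Address of NCBI's E-utilities search service (an addition, not from the exercise).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def pubmed_search_url(term: str, max_results: int = 20) -> str:
    """Build a PubMed search address for the given term; no request is sent."""
    params = {"db": "pubmed", "term": term, "retmax": max_results}
    return f"{ESEARCH_URL}?{urlencode(params)}"


# Boolean operators are written in capitals, as in the browser form.
url = pubmed_search_url("BLAST AND sequence alignment", max_results=5)
print(url)
```

Pasting the printed address into your log alongside your notes gives you a record of exactly which query you ran, which is handy when you want to rerun it weeks later.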

Now back to the question regarding deciding how to get a hard copy. If you keep in mind cost, both to you and to a resource provider, such as the library, then it is fairly easy to decide how to set up your priority path for obtaining the articles you want. You might think you could sit down with your log list of desired papers and fill out an Inter-Library Loan request for each one, then sit and wait for days to weeks to get them. Bad idea for several reasons. Often in the time it takes you to fill out the form, you could have the paper ready to print out. If too many requests are made through ILL, the whole system slows down, because someone else is doing your work for you while others really needing this service are delayed in obtaining what they requested. Requests which are determined to be unnecessary are returned, so time is wasted. In addition, you will need to pay for copies through ILL, although they are much less expensive when compared to commercial providers. Therefore, save ILL for when you can't obtain a paper by another method and you really need it. For example, you find a paper you want, but it isn't in the library, it isn't in any full text database you tried, nor is it available directly from the journal without a high service charge. What else is there? What about nearby libraries? A scenic drive might be more desirable than sitting around impatiently waiting for a pick-up notice in your e-mail.


Summary Questions:

Try to limit your answers to 2 typed pages [12 pt font]. This length should be sufficient for your comments and any appropriate copy/pasted examples. [You need not retype or copy/paste the questions as part of your responses.]

1. Plagiarism has become more of an issue with the increased use of electronic media and resources. This is due in part to the ease of access. [It may also be due to the mistaken assumption that if something is freely available, it can be used freely without acknowledgment.] Outline a brief strategy to ensure that your work is your own and that others receive credit when credit is due. Give a citation of a useful site you collected while doing this exercise, perhaps while you were searching the web. Be sure to use an appropriate style format.

2. The following pertain to searching the web.

a. Briefly summarize the results of testing three browser search engines on a general term and on a word string, with and without Boolean operators. Include which search engines you used, which terms you used, the number of hits and their relevance.

b. Discuss how you evaluate the quality of the sites found. What can you do when performing a search to maximize useful hits while reducing worthless ones?

3. Briefly summarize your understanding of databases, including the different types of database and how they are organized. What do you see as features of key importance to using bioinformatics databases and tools? What do you see as key challenges which must be met as this field grows?

4. The following pertain to searching literature databases and resources.

a. Outline what steps you took to obtain a citation for a specific paper of interest.

b. Outline a checklist of resources and options to consider in obtaining a copy of a paper in the least expensive manner, assuming that you don't get it on the first click from a database. Give an ordered strategy you would use in the future to guide you in obtaining literature of interest in the most efficient, lowest cost manner. [A one item list of "Ask professor for copy." is not acceptable!]


Further exploration:

1. If you are looking for interesting resources in biology & chemistry, try the following:

2. Visit the Tutorial page and explore further resource links which may prove useful during the semester. Perhaps you'll even find something helpful for another class or two.

3. For a preview of 3 major sites that you will be using in this course and beyond, try exploring NCBI, EMBL, & Biology Workbench below.


Updated 09/11/2003 by bsc@classroomtools.com, thatcher@sonoma.edu