Limitations of doing text keyword searches in GenBank.


Disadvantages
This example shows a major limitation of searching GenBank with keywords.
A user who is casually searching GenBank may not know which keywords the
database curators chose to identify a given sequence.  For example,
let's say you wish to see whether the database contains a known sequence for
human HIV1 protease.

To start lynx and search the databases at the NCBI:

mbcrr% lynx http://www.ncbi.nlm.nih.gov

Press the down arrow key to select: _Searching GenBank and other Databases_

Now select: * Text Searching

You will see the following menu:

Search with Non-Forms Clients
* GenBank
* GenBank Updates

Select GenBank

Type s to search, as stated at the bottom of the screen.

Enter a database search string:

hiv1 protease

If you wish to find HIV type 1 protease genes, the above search will not give you what you want. It returns every sequence dealing with hiv1 OR protease, which is over 4000 entries!

You are missing the Boolean operator

and

Thus, for a first try, you might want to type

hiv1 and protease

This picks up only a couple of hits. The problem with entering a keyword is that it may have been entered in the database in a different form. For example, you may wish to try "hiv-1" instead of "hiv1" (or even "hiv-i").
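The difference between the implicit OR and the explicit "and" can be sketched with a toy keyword index. This is only an illustration in Python: the record IDs and keyword assignments are made up, and nothing here reflects how the NCBI server is actually implemented.

```python
# Toy inverted index: keyword -> set of record IDs.
# All IDs and keyword assignments are hypothetical.
INDEX = {
    "hiv1": {"R1", "R2"},
    "hiv-1": {"R3", "R4", "R5"},
    "protease": {"R2", "R4", "R5", "R6"},
}

def search_or(terms):
    """Records matching ANY term (the behaviour of 'hiv1 protease')."""
    hits = set()
    for term in terms:
        hits |= INDEX.get(term, set())
    return hits

def search_and(terms):
    """Records matching ALL terms (the behaviour of 'hiv1 and protease')."""
    sets = [INDEX.get(term, set()) for term in terms]
    return set.intersection(*sets) if sets else set()

print(sorted(search_or(["hiv1", "protease"])))    # ['R1', 'R2', 'R4', 'R5', 'R6']
print(sorted(search_and(["hiv1", "protease"])))   # ['R2']
print(sorted(search_and(["hiv-1", "protease"])))  # ['R4', 'R5']
```

Note that the records indexed under "hiv-1" are invisible to any query that spells the keyword "hiv1", which is the ambiguity problem in miniature.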

Try entering hiv-1 and protease

Now you will hit about 350 sequences. You may wish to look at a couple of hits and then refine your query by adding a third keyword

hiv-1 and protease not partial

which will give you about 27 hits (see file key.27).
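A query such as "hiv-1 and protease not partial" can be read as a chain of set operations. The sketch below assumes the operators are applied strictly left to right with no precedence rules; that is an assumption made for illustration, not documented behaviour of the GenBank search server, and the index contents are made up.

```python
# Toy inverted index: keyword -> set of record IDs (all hypothetical).
INDEX = {
    "hiv-1": {"R3", "R4", "R5", "R7"},
    "protease": {"R4", "R5", "R6", "R7"},
    "partial": {"R5", "R8"},
}

def run_query(query):
    """Evaluate 'term op term op term ...' left to right (assumed order)."""
    tokens = query.lower().split()
    if not tokens:
        return set()
    hits = set(INDEX.get(tokens[0], set()))
    # The remaining tokens come in (operator, keyword) pairs;
    # a trailing operator without a keyword is ignored.
    for op, term in zip(tokens[1::2], tokens[2::2]):
        matches = INDEX.get(term, set())
        if op == "and":
            hits &= matches      # keep records that also have this keyword
        elif op == "or":
            hits |= matches      # add records that have this keyword
        elif op == "not":
            hits -= matches      # drop records that have this keyword
        else:
            raise ValueError(f"unknown operator: {op!r}")
    return hits

print(sorted(run_query("hiv-1 and protease")))              # ['R4', 'R5', 'R7']
print(sorted(run_query("hiv-1 and protease not partial")))  # ['R4', 'R7']
```

Each added keyword narrows the result set, which mirrors the drop from about 350 hits to about 27 in the session above.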

One way around this keyword ambiguity problem is to try the Entrez retrieval system. This interface takes a bit more time to learn, but it allows you to type in a keyword such as "hiv1" and then select a mode such as "SELECTION" so that you can see an alphabetized list of keywords:

hiv1 (13/35) ... hiv (46/3048) hiv-1 (285/5534) ...

After looking at the selection list, you can select hiv-1.

You can then add the term protease to restrict your search further.
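The idea of browsing an alphabetized keyword list before committing to a spelling can be sketched as a prefix lookup over a sorted term list. This is a minimal Python illustration, not the Entrez implementation: the terms and counts below are invented, and `browse` is a hypothetical helper name.

```python
from bisect import bisect_left

# Hypothetical keyword index: keyword -> number of records (made-up counts).
TERM_COUNTS = {
    "hepatitis": 40, "hiv": 12, "hiv-1": 90, "hiv-2": 7,
    "hiv1": 3, "hla": 25, "protease": 60,
}
TERMS = sorted(TERM_COUNTS)  # alphabetized, like a selection list

def browse(prefix):
    """Return (keyword, count) pairs for all keywords starting with prefix."""
    start = bisect_left(TERMS, prefix)  # first term >= prefix
    found = []
    for term in TERMS[start:]:
        if not term.startswith(prefix):
            break  # sorted order: no later term can match
        found.append((term, TERM_COUNTS[term]))
    return found

variants = browse("hiv")
print(variants)  # [('hiv', 12), ('hiv-1', 90), ('hiv-2', 7), ('hiv1', 3)]

# Pick the variant that indexes the most records.
best = max(variants, key=lambda pair: pair[1])[0]
print(best)  # hiv-1
```

Seeing the variants side by side makes it obvious which spelling the curators used most often, so the choice no longer depends on guessing.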




James Freeman<jfreeman@darwin.bu.edu>

Last modified: Thu Feb 6 16:45:09 EST 1997