This is an example of running a FASTA search in GCG.
  • FASTA
    1. Disadvantages:

      1. SPEED: BLAST executes much faster than FASTA. A typical BLAST search run locally finishes in less than a minute, whereas a local FASTA search takes 30 to 60 minutes or more.
      2. SENSITIVITY: With both programs at their default parameters, BLAST is usually more sensitive than FASTA at detecting protein sequence similarity, because BLAST does not require a perfect match in the first stage of the search.
      3. DNA vs AA: BLAST can translate a nucleotide sequence directly into six frames and search a protein database. With FASTA, this would require six separate searches.
      4. MASK REPEAT AREAS: BLAST lets you filter repeat regions and areas of low complexity, so that you do not get many false hits just because your sequence contains a short repeat.
    2. Advantages:

      1. WEAK DNA HITS: The long word size in a BLAST DNA sequence similarity search lets the program execute extremely fast, but the price of that speed is a loss in sensitivity. FASTA will show some weak DNA hits that do not appear in your BLAST report.
    3. Conclusion: BLAST is preferred over FASTA for sequence similarity searching. It will give you your answers in a minute or two, whereas you may wait an hour for your FASTA report.
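The six-frame translation mentioned in the list above can be sketched in a few lines of Python. This is only an illustration of the idea, assuming the standard genetic code; it is not the code that BLAST or GCG uses, and the function names (`reverse_complement`, `translate`, `six_frame_translation`) are made up for this example.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string."""
    return dna.upper().translate(COMPLEMENT)[::-1]


def translate(dna, offset):
    """Translate one reading frame, starting at the given offset (0, 1 or 2)."""
    dna = dna.upper()
    protein = []
    for i in range(offset, len(dna) - 2, 3):
        # Codons containing ambiguous bases (for example N) become "X".
        protein.append(CODON_TABLE.get(dna[i:i + 3], "X"))
    return "".join(protein)


def six_frame_translation(dna):
    """Return the three forward and three reverse-strand translations."""
    reverse = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna, offset)
        frames[f"-{offset + 1}"] = translate(reverse, offset)
    return frames


if __name__ == "__main__":
    # Hypothetical toy sequence, not taken from any database.
    for name, protein in six_frame_translation("ATGGCCATTGTAATGGGCCGCTGA").items():
        print(name, protein)
```

Each of the six strings would be a separate protein query, which is why a program that does not translate for you needs six searches to cover the same ground.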
    
    
    cyrus% gcg <Return>
    

    First, fetch a sequence to use as the query. The GCG program "fetch" retrieves a sequence from your local database and saves it in a file whose name has the database name appended. For example, the sequence "t57624" is saved in a file called "t57624.gb_est1" because "t57624" comes from the GenBank Expressed Sequence Tag database.

    cyrus% fetch t57624 <Return>
    

    Now, use the GCG fasta program to perform the sequence search.

    cyrus% fasta <Return>
    
    FastA does a Pearson and Lipman search for similarity between a query
    sequence and a group of sequences of the same type (nucleic acid or
    protein). For nucleotide searches, FastA may be more sensitive than BLAST.
    
     FASTA with what query sequence ?t57624.gb_est1
    
                      Begin (* 1 *) ?
                    End (*   248 *) ?
    
     Search for query in what sequence(s) (* GenEmbl:* *) ?
    
     What word size (* 2 *) ?
    
     Don't show scores whose E() value exceeds: (* 10.0 *):
    
     What should I call the output file (* t57624.fasta *)? 
    
              1 Sequences         924 aa searched    
            101 Sequences      31,577 aa searched    
    
    
    
    
    Note: lines like these will be printed to your screen for about 30 minutes.
    
    ...
         58,801 Sequences  21,121,532 aa searched
         58,901 Sequences  21,165,931 aa searched    
         59,001 Sequences  21,201,087 aa searched
    
    ...
    
    How many alignments would you like to see (* 120 *) ?
    
    Aligning...
    
     CPU time used:
           Database scan:  0:06:36.1
    Post-scan processing:  0:00:58.6
          Total CPU time:  0:07:35.1
     Output File: t57624.fasta
    
    
    cyrus% more t57624.fasta <Return>  
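The "What word size" prompt in the session above ties back to the speed and sensitivity trade-off described earlier: a longer word means fewer exact matches to follow up, so the search runs faster but weak similarities can be missed. The Python sketch below is a toy illustration of that effect only. It is not how FASTA or BLAST is implemented, and the function name `shared_words` and both sequences are hypothetical.

```python
def shared_words(query, subject, word_size):
    """Return the exact words of length word_size that occur in both sequences."""
    def words(seq):
        return {seq[i:i + word_size] for i in range(len(seq) - word_size + 1)}

    return words(query.upper()) & words(subject.upper())


if __name__ == "__main__":
    # Two made-up sequences that differ at a single position.
    query = "ACGTACGTTAGC"
    subject = "ACGTACCTTAGC"
    for word_size in (4, 6, 11):
        hits = shared_words(query, subject, word_size)
        print(word_size, len(hits), sorted(hits))
```

With these two toy sequences, the number of shared words drops as the word size grows, and at a word size of 11 a single mismatch is enough to leave no exact word in common.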
    
    
    

    You can also compare your sequence against EMBL using the WWW FASTA form at http://www.ebi.ac.uk/htbin/fasta.py?request