The following is a short tutorial on how to align a group of sequences with the pileup program.
    cyrus% gcg <Return>

First, fetch a sequence to use as a query sequence. The GCG program "fetch" retrieves a sequence from your local database and saves it in a file whose name ends with the name of the database. For example, the sequence "hba_human" is saved in a file called "hba_human.swissprot" because it comes from the Swissprot database.

    cyrus% fetch hba_human <Return>

Now do the same for the rest of your sequences.
You should see the following files:

    cyrus% dir *.swissprot
     6 glb5_petma.swissprot    66 hbb_human.swissprot
     6 hba_horse.swissprot      6 lgb2_luplu.swissprot
    38 hba_human.swissprot      8 myg_phyca.swissprot
     4 hbb_horse.swissprot
    cyrus%

Note: Now check whether your directory contains any other sequences you wish to align.

    cyrus% dir *.ig <Return>
      2 1coh.ig     2 2mhb.ig   136 hb.ig

These files are in IntelliGenetics format. Concatenate the ones you want and convert them to GCG format with "fromig":

    cyrus% cat 1coh.ig 2mhb.ig > temp.ig <Return>
    cyrus% fromig temp.ig <Return>

    FromIG reformats one or more sequences from IntelliGenetics format
    into individual files in GCG format.

    1coha 141 aa.
    2mhba 141 aa.

    Finished FROMIG with 2 files written.
    282 bases were reformatted.

Here is an example of a GCG-formatted file:

    cyrus% more 1coha

To use pileup, you should first create a file containing the names of the sequences you wish to align. You can either use an editor such as "emacs", "pico", or "vi", or you can create the file with the Unix "cat" command. To create the file with pico:

    cyrus% pico seqlist<Return>

    1coha
    2mhba
    sw:glb5_petma
    sw:hba_horse
    hba_human.swissprot
    hbb_horse.swissprot
    hbb_human.swissprot
    lgb2_luplu.swissprot
    myg_phyca.swissprot
    ^x

(Control-x ends creation of the file "seqlist"; then type "y" to save your changes.)

Note that in the above example, "sw:glb5_petma" retrieves the sequence called glb5_petma directly from the Swissprot database, whereas "hba_human.swissprot" refers to your own sequence file in your directory. Therefore, you did not really have to fetch the sequences from Swissprot beforehand.

    cyrus% pileup <Return>

    PileUp creates a multiple sequence alignment from a group of related
    sequences using progressive, pairwise alignments. It can also plot a
    tree showing the clustering relationships used to create the alignment.

    PileUp of what sequences ? @seqlist<Return>

Note: at the "PileUp of what sequences ?" prompt, enter "@seqlist". The "@" symbol tells GCG that this is not a sequence, but rather a file containing the names of sequences.
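The text above mentions that the list file can also be created with the Unix "cat" command instead of an editor. A minimal sketch of that route, assuming a shell that supports here-documents (for example sh or bash); the file name "seqlist" and its entries are the ones from the pico example, and the final line count is only an illustrative sanity check, not part of GCG:

```shell
# Write the list file in one step; each line names one sequence for pileup.
# Entries starting with "sw:" refer to the Swissprot database, the others
# to files in the current directory (as described in the tutorial).
cat > seqlist <<'EOF'
1coha
2mhba
sw:glb5_petma
sw:hba_horse
hba_human.swissprot
hbb_horse.swissprot
hbb_human.swissprot
lgb2_luplu.swissprot
myg_phyca.swissprot
EOF

# Count the entries to confirm that nothing was dropped.
wc -l < seqlist
```

The resulting file is used in the same way as one created with pico: enter "@seqlist" at the pileup sequence prompt.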
As an alternative to a list file, you could enter a "wildcard" expression such as "*.swiss*" at the sequence prompt, but in this example you would have missed two of the sequences that you wished to align (1coha and 2mhba).

Pileup then asks for the remaining parameters. Press <Return> to accept a default value, which is shown between the asterisks:

    What is the gap creation penalty (* 12 *) ? <Return>
    What is the gap extension penalty (* 4 *) ? <Return>

    This program can display the clustering relationships graphically.
    Do you want to:

      A) Plot to a FIGURE file called "pileup.figure"
      B) Plot graphics on LN03-SCRIPTPRINTER attached to |lpr -Plj
      C) Suppress the plot

    Please choose one (* A *): B<Return>

    The minimum density for a one-page plot is 7.3 sequences/100 platen units.
    What density do you want (* 7.3 *) ? <Return>

    What should I call the output file name (* seqlist.msf *) ? <Return>

    cyrus% more seqlist.msf

To display a consensus sequence from pileup output:

    cyrus% pretty -con -case seqlist.msf{*}<Return>

    Pretty displays multiple sequence alignments and calculates a consensus
    sequence. It does not create the alignment; it simply displays it.

    seqlist.msf{1coha}       len: 167  wgt: 1.00
    seqlist.msf{hba_human}   len: 167  wgt: 1.00
    seqlist.msf{2mhba}       len: 167  wgt: 1.00
    seqlist.msf{HBA_HORSE}   len: 167  wgt: 1.00
    seqlist.msf{hbb_horse}   len: 167  wgt: 1.00
    seqlist.msf{hbb_human}   len: 167  wgt: 1.00
    seqlist.msf{GLB5_PETMA}  len: 167  wgt: 1.00
    seqlist.msf{myg_phyca}   len: 167  wgt: 1.00
    seqlist.msf{lgb2_luplu}  len: 167  wgt: 1.00

    Begin (* 1 *) ? <Return>
    End (* 167 *) ? <Return>

    Find consensus to what minimum plurality (* 2.00 *) ? 9<Return>

    What should I call the output file (* pretty.pretty *) ? seqlist.pretty<Return>

    cyrus% more seqlist.pretty

NOTE: You might also want to add the "-ident" option when running pretty. The "-ident" qualifier shows the consensus only where ALL of the sequences agree, which may give you the motif you are looking for.

Type "genhelp pileup" for online help on pileup.

You can create publication-quality figures with the boxshade program.
You can access a comprehensive list of multiple alignment programs at the VSNS BioComputing Division Multiple Alignment Resource Page: http://www.techfak.uni-bielefeld.de/bcd/Curric/MulAli/welcome.html
You can learn more about multiple sequence alignments from the "Algorithms for Multiple Sequence Alignments" page: http://ben.vub.ac.be/embnet.news/vol2_1/align.html