The tbl format is used at the BMERC to describe DNA and protein sequences. Its format in sed/perl notation is:
[A-Z_a-z0-9]+\t[ACDEFGHIKLMNPQRSTVWYacdefghiklmnpqrstvwy]+\n$ [A-Z_a-z0-9]+\t[ACDEFGHIKLMNPQRSTVWYacdefghiklmnpqrstvwy]+\n$ ...
In English, the format is:
One record per line: a sequence id, followed by a tab, then the sequence, ending with a newline.
Here is an example:
5ADH STAGKVIKCKAAVLWEEKKPFSIEEVEVAPPKAHEVRIKMVATGICRSDDHVVSGTLVTPLPVIAGHEAAGIVESIGEGVTTVRPGDKVIPLFTPQCGKCRVCKHPEGNFCLKNDLSMPRGTMQDGTSRFTCRGKPIHHFLGTSTFSQYTVVDEISVAKIDAASPLEKVCLIGCGFSTGYGSAVKVAKVTQGSTCAVFGLGGVGLSVIMGCKAAGAARIIGVDINKDKFAKAKEVGATECVNPQDYKKPIQEVLTEMSNGGVDFSFEVIGRLDTMVTALSCCQEAYGVSVIVGVPPDSQNLSMNPMLLLSGRTWKGAIFGGFKSKDSVPKLVADFMAKKFALDPLITHVLPFEKINEGFDLLRSGESIRTILTF
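A minimal sketch of reading tbl records in Python, assuming the whole file is already in memory as a string. The function name parse_tbl is hypothetical and not part of any BMERC tool; the pattern is taken directly from the regular expression above. Blank lines are rejected because the format does not provide for them.

```python
import re

# One tbl record, as in the regular expression above. The trailing
# newline is removed by splitlines() before the pattern is applied.
TBL_RECORD = re.compile(
    r"([A-Z_a-z0-9]+)\t([ACDEFGHIKLMNPQRSTVWYacdefghiklmnpqrstvwy]+)"
)


def parse_tbl(text):
    """Return a list of (sequence_id, sequence) pairs from tbl text.

    Hypothetical helper: raises ValueError on the first line that is
    not a valid tbl record.
    """
    records = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        match = TBL_RECORD.fullmatch(line)
        if match is None:
            raise ValueError(f"line {line_number} is not a valid tbl record")
        records.append((match.group(1), match.group(2)))
    return records
```

For example, parse_tbl("5ADH\tSTAGKVIK\n") yields a single pair holding the id 5ADH and the sequence STAGKVIK.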
A yeast clique is a group of related yeast genes that have been clustered by a common BLAST score or some other heuristic. At present, some yeast genes appear in more than one cluster or clique, which may indicate a multi-domain protein. The information contained in these cliques has come primarily from the GeneQuiz Consortium and the MIPS Genome Commission.
A raw protein format is used at the BMERC to describe a single protein sequence. Its format in sed/perl notation is:
^[ACDEFGHIKLMNPQRSTVWYacdefghiklmnpqrstvwy \t\n]+$
In English, the format is:
A string containing only the 20 allowed amino acid letters, with optional whitespace
(space, tab, and newline) characters included within it.
Here is an example:
MVVFKNIGHIITKALALGSSTVMMGGMLAGTTESPGEYLYQDGKRLKAYRGMGSIDAMQKTGTKGNAST SRYFSESDSVLVAQGVSGAVVDKGSIKKFIPYLYNGLQHSCQDIGCRSLTLLKENVQSGKVRFEFRTAS AQLEGGVNNLHSYEKRLHN
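A minimal sketch of checking a raw protein string against the regular expression above and returning the bare sequence, in Python. The function name read_raw_protein is hypothetical. Note that the pattern as written also accepts a string made only of whitespace, in which case this sketch returns an empty string.

```python
import re

# The raw protein pattern from above: the 20 amino acid letters in
# either case, plus space, tab, and newline.
RAW_PROTEIN = re.compile(r"[ACDEFGHIKLMNPQRSTVWYacdefghiklmnpqrstvwy \t\n]+")


def read_raw_protein(text):
    """Validate raw protein text and return it with whitespace removed.

    Hypothetical helper: raises ValueError if the text contains any
    character the format does not allow.
    """
    if RAW_PROTEIN.fullmatch(text) is None:
        raise ValueError("not a valid raw protein sequence")
    # After validation the only whitespace left is space, tab, and
    # newline, so split() with no arguments removes all of it.
    return "".join(text.split())
```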
A raw DNA format is used at the BMERC to describe a single DNA sequence. Its format in sed/perl notation is:
^[ACGTacgt \t\n]+$
In English, the format is:
A string containing only the four allowed DNA letters, with optional whitespace
(space, tab, and newline) characters included within it.
Here is an example:
ATGGCATCCACCGATTTCTCCAAGATTGAAACTTTGAAACAATTAAACGCTTCTTTGGCTGACAAGTCATACATTGAAG GGTATGTTCCGATTTAGTTTACTTTATAGATCGTTGTTTTTCTTTCTTTTTTTTTTTTCCTATGGTTACATGTAAAGGG AAGTTAACTAATAATGATTACTTTTTTTCGCTTATGTGAATGATGAATTTAATTCTTTGGTCCGTGTTTATGATGGGAA GTAAGACCCCCGATATGAGTGACAAAAGAGATGTGGTTGACTATCACAGTATCTGACGATAGCACAGAGCAGAGTATCA TTATTAGTTATCTGTTATTTTTTTTTCCTTTTTTGTTCAAAAAAAGAAAGACAGAGTCTAAAGATTGCATTACAAGAAA AAAGTTCTCATTACTAACAAGCAAAATGTTTTGTTTCTCCTTTTAAAATAGTACTGCTGTTTCTCAAGCTGACGTCACT GTCTTCAAGGCTTTCCAATCTGCTTACCCAGAATTCTCCAGATGGTTCAACCACATCGCTTCCAAGGCCGATGAATTCG ACTCTTTCCCAGCTGCCTCTGCTGCCGCTGCCGAAGAAGAAGAAGATGACGATGTCGATTTATTCGGTTCCGACGATGA AGAAGCTGACGCTGAAGCTGAAAAGTTGAAGGCTGAAAGAATTGCCGCATACAACGCTAAGAAGGCTGCTAAGCCAGCT AAGCCAGCTGCTAAGTCCATTGTCACTCTAGATGTCAAGCCATGGGATGATGAAACCAATTTGGAAGAAATGGTTGCTA ACGTCAAGGCCATCGAAATGGAAGGTTTGACCTGGGGTGCTCACCAATTTATCCCAATTGGTTTCGGTATCAAGAAGTT GCAAATTAACTGTGTTGTCGAAGATGACAAGGTTTCCTTGGATGACTTGCAACAAAGCATTGAAGAAGACGAAGACCAC GTCCAATCTACCGATATTGCTGCTATGCAAAAATTA
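Because the four DNA letters are a subset of the letters the tbl format accepts, a raw DNA sequence can be rewritten as a tbl record once it has an id. The following Python sketch shows one way to do that. The function name raw_dna_to_tbl and the idea of supplying the id separately are assumptions made for illustration, not a description of any BMERC program.

```python
import re

# The raw DNA pattern from above, and the sequence id pattern used
# by the tbl format.
RAW_DNA = re.compile(r"[ACGTacgt \t\n]+")
SEQUENCE_ID = re.compile(r"[A-Z_a-z0-9]+")


def raw_dna_to_tbl(sequence_id, raw_text):
    """Return one tbl record (id, tab, sequence, newline) for raw DNA.

    Hypothetical helper: raises ValueError if the id or the raw text
    does not match its pattern, or if the text holds no bases.
    """
    if SEQUENCE_ID.fullmatch(sequence_id) is None:
        raise ValueError("not a valid sequence id")
    if RAW_DNA.fullmatch(raw_text) is None:
        raise ValueError("not a valid raw DNA sequence")
    # Drop the optional whitespace so the sequence fits on one line.
    sequence = "".join(raw_text.split())
    if not sequence:
        raise ValueError("raw DNA text contains no bases")
    return f"{sequence_id}\t{sequence}\n"
```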
last modified: Thursday, January 21, 1999 5:51:56 PM