The homolog or .hlg file contains homolog sequences aligned to a single core sequence; the homologs may then be counted in that core's environments. The aligned sequences are kept in a structured sequence table file in which the first "sequence" encodes the core secondary structure, the second is the core sequence itself, and the third and subsequent sequences are the homologs. The alphabet for the secondary structure is "ehlt", where the letters stand for extended (strand), helix, loop, and turn respectively, plus "-" for gaps. [Not sure if case matters; there is no extant code that actually uses this information. -- rgr, 22-Apr-97.] The pattern of gaps in the secondary structure string must be identical to the pattern of gaps in the core sequence string.
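The two constraints on the first two rows (the "ehlt" alphabet plus "-", and matching gap patterns) can be checked mechanically. What follows is only a sketch in Python, not code from the needle tools: the names check_core_rows and gap_positions are hypothetical, case is folded because the note above leaves case sensitivity open, and the two rows are assumed to have equal length.

```python
SS_ALPHABET = set("ehlt-")  # strand, helix, loop, turn, plus the gap character


def gap_positions(s):
    """Return the set of indices at which the string has a gap ('-')."""
    return {i for i, ch in enumerate(s) if ch == "-"}


def check_core_rows(ss, core):
    """Return a list of problems found in the first two rows (empty if none).

    ss   -- the secondary structure string (first row of the file)
    core -- the core sequence string (second row of the file)
    """
    problems = []
    # Assumption: fold case, since the format description leaves this open.
    bad = sorted(set(ss.lower()) - SS_ALPHABET)
    if bad:
        problems.append("unexpected secondary structure letters: " + "".join(bad))
    # Assumption: both rows span the same alignment columns.
    if len(ss) != len(core):
        problems.append("length mismatch: %d vs %d" % (len(ss), len(core)))
    elif gap_positions(ss) != gap_positions(core):
        problems.append("gap patterns differ")
    return problems
```

For instance, check_core_rows("--llee", "--DTTV") returns an empty list, while check_core_rows("--llee", "-DTTVS") reports that the gap patterns differ. The rule about gaps in core elements is not checked here, because the description of it is still incomplete.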
The alignment is somewhat more constrained than a standard multiple alignment in that gaps are not permitted in core elements. [fill this out. -- rgr, 10-Jan-97.]
Here is an example: the original file for 1hoe (from the ~thread/alignment/1hoe.hlg file, dated July 1994). The file has four lines, each of which has been wrapped here with a backslash ("\") that does not appear in the data.
1hoe ------------------------------lllllllllleeeeeettteeeeeeettt\
     eeeeeeeettteeeeeeeettteeeeeellllllllleeeeeeel
1hoe ------------------------------DTTVSEPAPSCVTLYQSWRYSQADNGCAE\
     TVTVKVVYEDDTEGLCYAVAPGQITTVGDGYIGSHGHARYLARCL
1hoe ------------------------------DTTVSEPAPSCVTLYQSWRYSQADNGCAE\
     TVTVKVVYEDDTEGLCYAVAPGQITTVGDGYIGSHGHARYLARCL
1hoe MRVRALRLAALVGAGAALALSPLAAGPASADTTVSEPAPSCVTLYQSWRYSQADNGCAQ\
     TVTVKVVYEDDTEGLCYAVAPGQITTVGDGYIGSHGHARYLARCL
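A file laid out like this example could be read along the following lines. This is a hedged sketch in Python rather than the actual needle tools reader: it assumes, from the example alone, that each unwrapped line holds a name and a sequence separated by whitespace, and the function name parse_hlg is hypothetical.

```python
def parse_hlg(text):
    """Split .hlg text into (secondary_structure, core, homologs).

    Each returned row is a (name, sequence) tuple; homologs is a list of rows.
    Assumes one "name sequence" pair per line, as in the 1hoe example.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError("expected 'name sequence', got: %r" % line)
        rows.append((fields[0], fields[1]))
    if len(rows) < 2:
        raise ValueError("need a secondary structure row and a core row")
    # Row 1 is the secondary structure, row 2 the core, the rest are homologs.
    return rows[0], rows[1], rows[2:]


# Made-up miniature input for illustration; this is not real 1hoe data.
sample = "demo --llee\ndemo --DTTV\ndemo --DTTV\ndemo MRDTTV\n"
ss, core, homologs = parse_hlg(sample)
```

Here ss and core hold the first two rows and homologs holds the remaining two. The backslash wrapping shown in the example is a display convention only, so the sketch does not try to rejoin wrapped lines.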