BMERC : needle tools : Appendices : perl subroutines
[This is an experiment. These subroutines are not documented, except inline. So I've snarfed the inline comments out of the source code. Is this useful to anyone? -- rgr, 2-Nov-99.]
Note: I still consider these subroutines to be internal; their specifications may change without notice at my whim.
The original version of this algorithm attempted to correct both aligned sequences at the same time; this version gives an equivalent result if called twice, with the second call using the result of the first call in reverse order. Since we never want to give a false appearance of alignment when both sequences happen to have residues inserted in the same place, we need to treat the sequences independently, i.e. by doing:
    insert1----------
    -------insertion2
rather than trying to deal with "insertion-in-both" as a special case. This independence of insertions into each of the sequences is what made the asymmetric reimplementation possible.
The arguments are:
($total_vv_up_to_14A_res1 > $vv_exp_tr, $total_vv_up_to_14A_res2 > $vv_exp_tr, $cb_distance, ($vv_up_to_7_5A_res1_w1 > ($ss1 ? $vv_tr_e : $vv_tr_h) && $vv_up_to_7_5A_res2_w2 > ($ss2 ? $vv_tr_e : $vv_tr_h)), $seg1, $seg2).
[The history of this code is lost in the mist. When I got it, it had already been converted to C from an earlier FORTRAN implementation. Neither version was documented at all. -- rgr, 8-Aug-96.] [Now further converted to perl. -- rgr, 1-Aug-97.]
This doesn't have to be really spiffy, since it isn't needed much.
These are also known as "raw VV" files.
This has a library file of its own so that extract_all_pdb_sequences can be shared between the check-pdb-seqs.pl and pdb-domain-seq.pl scripts. [check-pdb-seqs.pl is not distributed presently. -- rgr, 9-Jul-99.]
This is a rule-based revision that doesn't support -use-plus.
&exposure_bin_max_exposure($env-1) <= $exp, and $exp < &exposure_bin_max_exposure($env).
In a sense, this is the inverse of the compute_exposure_bin function.
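To make the relationship between the two functions concrete, here is a minimal sketch. The boundary values in @bin_max are invented for illustration (the real thresholds are not given here), and the bodies of both subroutines are assumptions written only to satisfy the inequality above; the actual needle implementations may well differ.

```perl
use strict;
use warnings;

# HYPOTHETICAL bin boundaries: $bin_max[$env] is the (exclusive) upper
# exposure limit of bin $env.  Index 0 is only a floor for bin 1, and the
# last bin is open-ended (9**9**9 evaluates to infinity in Perl).
my @bin_max = (0, 10, 30, 60, 9**9**9);

# Upper exposure limit of bin $env.
sub exposure_bin_max_exposure {
    my ($env) = @_;
    return $bin_max[$env];
}

# Smallest bin $env whose upper limit exceeds $exp, so that
# exposure_bin_max_exposure($env-1) <= $exp
#   and $exp < exposure_bin_max_exposure($env).
# Assumes $exp >= 0.
sub compute_exposure_bin {
    my ($exp) = @_;
    for my $env (1 .. $#bin_max) {
        return $env if $exp < exposure_bin_max_exposure($env);
    }
    return $#bin_max;
}

# Show the bin chosen for a few sample exposure values.
printf "%s -> bin %d\n", $_, compute_exposure_bin($_) for (0, 25, 60);
```

With boundaries arranged this way, an exposure that falls exactly on a boundary belongs to the higher bin, which is what the "<=" on the lower side of the inequality implies.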
Usage: dssp4.pl [-locus name] [-chain L] [-min-strand-length slen] [-min-helix-length hlen] [-keep-short-strands] [ -t | -pdb | -ss ] [filename]
Where: filename is the name of a dssp-format file (stdin is used if not supplied).
See the http://bmerc-www.bu.edu/needle-doc/latest/ss-tools.html#dssp4 page for detailed documentation.
The -chain and -ss options are mutually exclusive.
Default output is tab-delimited residue ("pdbres"), AA, secondary structure, and exposure, one line per residue.
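As a rough illustration of consuming the default output, the following sketch splits one such line into its four fields. The subroutine name parse_dssp4_line and the sample values are made up for this example; only the field order (pdbres, AA, secondary structure, exposure) comes from the description above.

```perl
use strict;
use warnings;

# Split one tab-delimited line of default output into named fields.
# (parse_dssp4_line is a hypothetical helper, not part of the needle tools.)
sub parse_dssp4_line {
    my ($line) = @_;
    chomp $line;
    # A limit of -1 keeps trailing empty fields; a leading empty field
    # (an empty pdbres) is preserved by split as well.
    my ($pdbres, $aa, $ss, $exposure) = split /\t/, $line, -1;
    return {
        pdbres   => $pdbres,
        aa       => $aa,
        ss       => $ss,
        exposure => $exposure,
    };
}

# Illustrative input only; these values do not come from a real file.
my $rec = parse_dssp4_line("12\tA\tH\t0.35\n");
print join(', ', map { "$_=$rec->{$_}" } qw(pdbres aa ss exposure)), "\n";
```

Keeping empty fields matters if the same parser is later pointed at output in which some residues have no residue number.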
Given 'smoothed DSSP' -t output on the standard input and the full protein sequence for a given chain, interpolate residues with missing coordinates as loop residues with empty residue numbers (pdbres fields), producing the same format on output. See the http://bmerc-www.bu.edu/needle-doc/latest/misc-tools.html#expand-dssp page for details. Operates on a single chain, which may be specified; the default chain ID is a space. For an explanation of the input format, see the http://bmerc-www.bu.edu/needle-doc/latest/dssp-progs.html#dssp4-default-output-format page.
Usage: expand-dssp.pl [-chain L] full-sequence < dssp-in > dssp-out
Where: filename is the name of an abbreviated-DSSP-format file, and the chain ID is a single letter (defaults to ' ').
Note that the chain may be considered of interest if the following expression evaluates to true (nonzero):
@desired_chains == 0
|| defined($chain_start_interesting_p{$chain});
This should be the case if we expect any of the chain's residues to pass
the interesting_residue_p test. (But $chain_start_interesting_p{$chain}
will be zero if all of the subranges on that chain had explicit start
PDBRES values.)
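The test above can be wrapped in a small predicate, sketched below. The name chain_interesting_p and the sample data are assumptions for illustration; only the expression itself and the two variable names are taken from the comment.

```perl
use strict;
use warnings;

# State as described in the comment above; the contents are sample data.
my @desired_chains = ();
my %chain_start_interesting_p = ();

# True if no chains were requested at all, or if $chain has an entry.
# (chain_interesting_p is a hypothetical wrapper name.)
sub chain_interesting_p {
    my ($chain) = @_;
    return @desired_chains == 0
        || defined($chain_start_interesting_p{$chain});
}

# With no chains requested, every chain counts as interesting.
print "B interesting with no request\n" if chain_interesting_p('B');

# Request chain A only, with an explicit start (so the flag value is zero).
@desired_chains = ('A');
$chain_start_interesting_p{'A'} = 0;
print "A interesting\n" if chain_interesting_p('A');
print "B not interesting\n" unless chain_interesting_p('B');
```

Note the use of defined() rather than a plain truth test: an entry whose value is zero still marks the chain as being of interest, which is the case the parenthetical remark describes.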
install - install a program, script, or datafile
This comes from X11R5 (mit/util/scripts/install.sh). Copyright 1991 by the Massachusetts Institute of Technology.
$program is the pathname of the thing where it lives now, $installed_program_name is its "new" name when in place, and $program_pretty_name is for use in messages.