Exposure programs
The programs documented below produce Eisenberg "fat alanine" (EFA) exposure files in the .nexp (exposure) file format from PDB atomic coordinates. The original version, generate-exposure, has a fixed argument pattern and implements "standard" EFA exposure. efa.pl is a newer version that does the same thing by default, but provides additional options that allow variations on the EFA theme.
efa.pl
For example, the command
efa.pl < 2mhr.ent > 2mhr.nexp
writes 2mhr.nexp in the current directory, in the .nexp (exposure) file format, which is the format the mrf-envs program expects.
Usage:
efa.pl [-locus locus] [-pdb-file pdb-file-name] [-chain L]
[-output-file output-file-name]
[-dssp-file dssp-file-name] [-verbose]
[-radius [ss] atom res radius]
[-default-radius atom radius]
Arguments:
efa.pl first passes the PDB data through filter-pdb-atoms.pl in order to standardize atom variants and catch anomalies, as well as to convert all residues to alanine and "hallucinate" beta carbons for native glycines. Accordingly, you may see error messages at the top of the transcript, such as those below:
gamow% efa.pl -locus 3rub -pdb-file 3rub.ent
filter-pdb-atoms.pl: 3rub.ent: Chain break: 10.3110037338758 A between
THR L 63 and VAL L 69 .
filter-pdb-atoms.pl: 3rub.ent: Missing beta carbon for MET L 405 .
filter-pdb-atoms.pl: 3rub.ent: Missing backbone atom CA for ASN L 468
filter-pdb-atoms.pl: 3rub.ent: Missing backbone atom C for ASN L 468
filter-pdb-atoms.pl: 3rub.ent: Missing backbone atom O for ASN L 468
filter-pdb-atoms.pl: 3rub.ent: Missing beta carbon for ASN L 468 .
. . .
If the -chain argument is specified, filter-pdb-atoms.pl will also extract the specified chain(s) or chain subrange(s). See the filter-pdb-atoms.pl documentation for more details.
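The alanine conversion and chain selection described above can be illustrated with a minimal sketch. This is not the code of filter-pdb-atoms.pl; the function name to_alanine, the kept atom set, and the single-letter chain argument are assumptions made for illustration. It relies only on the fixed-column layout of PDB ATOM records, and it does not attempt to "hallucinate" beta carbons for glycines or to standardize atom variants.

```python
# Hypothetical sketch: reduce PDB ATOM records to an alanine-like atom set.
# Assumption: backbone atoms plus CB are what an "all alanine" model keeps.
KEEP_ATOMS = {"N", "CA", "C", "O", "CB"}


def to_alanine(pdb_lines, chain=None):
    """Return ATOM records restricted to KEEP_ATOMS, renamed to ALA.

    pdb_lines: iterable of PDB-format lines.
    chain:     optional one-letter chain ID; None keeps every chain.
    """
    out = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue                      # skip HETATM, TER, remarks, etc.
        atom_name = line[12:16].strip()   # PDB columns 13-16
        chain_id = line[21]               # PDB column 22
        if chain is not None and chain_id != chain:
            continue
        if atom_name not in KEEP_ATOMS:
            continue                      # drop side-chain atoms beyond CB
        # Residue name occupies PDB columns 18-20.
        out.append(line[:17] + "ALA" + line[20:])
    return out
```

A call such as to_alanine(open("3rub.ent"), chain="L") would then yield only the reduced records for chain L.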
After filtering, efa.pl next determines which atom radius to use for each selected atom, by using the following precedence rules.
By default, the exception "-radius CB ALA 2.1" is built in; this is the "fat" in "fat alanine", since the standard carbon radius is 1.9Å. One may explicitly specify "-radius CB ALA 1.9" to reinstate the standard value (obtaining "nonfat alanine", one presumes). [Using "-radius E CB ALA 2.5", which might be termed "high-fat alanine", may become the new standard. By contrast, the original recipe might better be called "lowfat alanine". -- rgr, 20-May-98.]
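As a rough illustration of how exception radii and default radii might combine, the following sketch looks up a radius for one atom. The lookup order shown (secondary-structure-specific exception, then structure-independent exception, then the default for the atom type) is an assumption, not a statement of efa.pl's actual precedence rules. The function name atom_radius and the idea of keying the default table on the first letter of the atom name are also hypothetical. The numeric values are the ones given in this document: the built-in "CB ALA 2.1" exception and the table under "Default atom radii".

```python
# Default radii in angstroms, as listed under "Default atom radii".
DEFAULT_RADII = {"C": 1.9, "N": 1.7, "O": 1.4, "S": 1.8,
                 "P": 1.8, "M": 1.7, "I": 2.0}

# Exceptions keyed by (ss, atom, res); ss is None when the exception
# applies regardless of secondary structure.
# Built-in exception corresponding to "-radius CB ALA 2.1".
EXCEPTIONS = {(None, "CB", "ALA"): 2.1}


def atom_radius(atom, res, ss=None,
                exceptions=EXCEPTIONS, defaults=DEFAULT_RADII):
    """Return the radius for one atom.

    Assumed order: most specific exception first, default last.
    """
    for key in ((ss, atom, res), (None, atom, res)):
        if key in exceptions:
            return exceptions[key]
    # Assumption: the default table is keyed by the first letter of the
    # atom name (for example "CA" -> "C").
    return defaults[atom[0]]


# Example: the "high-fat alanine" variant, "-radius E CB ALA 2.5".
high_fat = dict(EXCEPTIONS)
high_fat[("E", "CB", "ALA")] = 2.5
```

With the high_fat table, a CB atom in a residue whose DSSP state is E gets 2.5, while a CB atom in any other state still gets the built-in 2.1.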
Note that if any exception specifies an ss, then the -dssp-file argument is required. Use "-dssp-file -" to read DSSP data from the standard input. Either "raw" abbreviated DSSP or "filtered" (dssp4.pl output in the default format) is acceptable. To use dssp4.pl, it is convenient to read the DSSP data (rather than the PDB file) from the standard input:
gamow% dssp4.pl -clean 1rcf.ent.out \
| efa.pl -radius E CB ALA 2.5 -dssp-file - \
-pdb-file 1rcf.ent > 1rcf.nexp
[Must add a note on the *.eng data files -- when I figure out what they mean. They are copied by the hydro script into the current directory, except for those that already exist, which allows some customization. -- rgr, 24-May-96.]
[Some programs used by efa.pl still insist that locus be
exactly four characters long. For that reason, efa.pl uses the
locus "tmp0" internally. This will become apparent if the script fails
for any reason. -- rgr, 20-May-96.]
Default atom radii
These values are used when none of the exceptional values apply.
Atom    Radius (Å)
C       1.9
N       1.7
O       1.4
S       1.8
P       1.8
M       1.7
I       2.0
Compatibility with generate-exposure
efa.pl is backward compatible with generate-exposure; if you give it the same arguments, you get the same result:
gamow% efa.pl 1rcf 1rcf.ent
gamow% cmp 1rcf.nexp 1rcf.nexp.orig
gamow%
Notice how all of the old generate-exposure output has been silenced. If the code had encountered an error (or if the -verbose option had been given), some of it would have been echoed to the standard error stream. In the normal course of events, this output is discarded.
Instead of specifying the locus and PDB file name arguments positionally, it is preferable to give them as keyword arguments:
gamow% efa.pl -locus 1rcf -pdb-file 1rcf.ent
Done this way, either can be omitted. If the locus is omitted, the output is written to the standard output by default. If the PDB file name is omitted, the PDB data are read from the standard input. The following is therefore equivalent (except for the presence of the locus in any error messages):
gamow% efa.pl > 1rcf.nexp < 1rcf.ent
It is also more readable, since it doesn't rely on hidden naming conventions.
Known bugs:
forrtl: error (65): floating invalid
The resulting exposure file will be incomplete (leading to further complaints). -- rgr, 18-Feb-98. [Fixed in Release 1.1. -- rgr, 8-Jun-98.]
generate-exposure
The generate-exposure script takes the locus name and the full (i.e. loops intact) PDB file as its arguments, and produces a file describing the Eisenberg "fat alanine" exposure values for each residue. [need reference. -- rgr, 20-Dec-96.] For example, the command
generate-exposure 2mhr 2mhr.ent
generates a file called 2mhr.nexp in the current directory, in the .nexp (exposure) file format, which is the format the mrf-envs program expects.
Note: The default behavior of efa.pl produces the same results, so generate-exposure is considered obsolescent.
Usage:
generate-exposure locus pdb-file-name
Arguments:
The first thing that generate-exposure does is to pass the PDB file through filter-pdb-atoms.pl in order to standardize atom variants and catch anomalies. Accordingly, you may see error messages at the top of the transcript, such as those below:
generate-exposure 3rub 3rub.ent
filter-pdb-atoms.pl: 3rub.ent: Chain break: 10.3110037338758 A between
THR L 63 and VAL L 69 .
filter-pdb-atoms.pl: 3rub.ent: Missing beta carbon for MET L 405 .
filter-pdb-atoms.pl: 3rub.ent: Missing backbone atom CA for ASN L 468
filter-pdb-atoms.pl: 3rub.ent: Missing backbone atom C for ASN L 468
filter-pdb-atoms.pl: 3rub.ent: Missing backbone atom O for ASN L 468
filter-pdb-atoms.pl: 3rub.ent: Missing beta carbon for ASN L 468 .
. . .
See the filter-pdb-atoms.pl
documentation for more details.
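To make the "Chain break" messages above easier to interpret, here is a minimal sketch of one way such a check could work: flag consecutive residues whose alpha carbons are farther apart than a cutoff. Which atoms and which cutoff filter-pdb-atoms.pl actually uses is not stated here, so the CA-to-CA measure, the 4.2 Å default, and the function name chain_breaks are all assumptions.

```python
import math


def chain_breaks(ca_atoms, max_dist=4.2):
    """Report gaps between consecutive residues.

    ca_atoms: list of (label, (x, y, z)) pairs in chain order, where
              label is a string such as "THR L 63".
    max_dist: assumed cutoff in angstroms; adjacent CA atoms in an
              unbroken chain are normally about 3.8 A apart.
    Returns a list of (label_before, label_after, distance) tuples.
    """
    breaks = []
    for (lab1, p1), (lab2, p2) in zip(ca_atoms, ca_atoms[1:]):
        d = math.dist(p1, p2)  # Euclidean distance (Python 3.8+)
        if d > max_dist:
            breaks.append((lab1, lab2, d))
    return breaks
```

Each returned tuple carries the same information as a message of the form "Chain break: d A between RES1 and RES2".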
[Must add a note on the *.eng data files -- when I figure out what they mean. They are copied by the hydro script into the current directory, except for those that already exist, which allows some customization. -- rgr, 24-May-96.]
[Some programs used by generate-exposure still insist that locus be exactly four characters long. For that reason, generate-exposure uses the locus "tmp0" internally. This will become apparent if the script fails for any reason. -- rgr, 20-May-96.]
Known bugs:
forrtl: error (65): floating invalid
The resulting exposure file will be incomplete (leading to
further complaints). -- rgr, 18-Feb-98. [Fixed in Release 1.1.
-- rgr, 8-Jun-98.]