CNS has a powerful atom selection syntax that allows one to select
atoms without reference to the internal index and to construct
arbitrary logical expressions of selected atoms. The number of
selected atoms from the last executed selection statement is stored in
the symbol $SELECT. Atom selections are generally fragile: the
information associated with a selection is lost when the molecular
structure changes. Some selections are only partially fragile, because
CNS maps the selected atoms onto the new molecular structure after it
has been modified. This holds for all atom properties except the
internal stores, which are fully fragile, and also for the atom-based
parameter statements. Note, however, that the selections are not
re-evaluated for newly created atoms, so these atoms are not
selected. (The reason is that CNS does not store the selection strings
themselves.)
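As a minimal sketch of how $SELECT might be used, the following lines count the CA atoms and keep the count in a symbol. The symbol name $nca is chosen here for illustration only, and the sketch assumes that a molecular structure has already been read.

```
{ any statement that evaluates an atom selection sets $SELECT }
show sum ( 1 ) ( name ca )
evaluate ( $nca = $select )
display number of CA atoms: $nca
```

Because $SELECT is overwritten by the next selection, it is safest to copy it into another symbol right away, as done here.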
An atom selection consists of one or more expressions which are
combined with the logical operators AND and OR.
Parentheses are used to group expressions (see the preceding
documentation for further examples). The following selection
statements are possible:
- ALL selects all atoms.
- expression AROUnd real selects all atoms that are within
the specified real cutoff value around any atom selected in the
expression.
- ATOM *segment-name* *residue-number* *atom* selects all
atoms that match the specified segment name, residue number, and atom
name or wildcards of them.
- ATTRibute [ABS] atom-property < | = | # | > real
selects all atoms that have (optionally absolute) properties less
than, equal to, not equal to, or greater than the specified real
number.
- BONDedto expression selects all atoms which are covalently
bonded to any atom selected by the expression. The connectivity is
found by analysis of the bonds defined in the molecular topology.
- BYGRoup expression selects all atoms that belong to groups
containing at least one atom that has been selected in the expression.
- BYRes expression selects all atoms that belong to residues
containing at least one atom that has been selected in the expression.
- CHEMical *type* selects all atoms that match the specified
type or a wildcard of it.
- CHEMical type:type selects all atoms that have types
greater than or equal to the first type but less than or equal to the
second type in alphanumeric order.
- FBOX real real real real real real selects all atoms that
lie within the specified fractional limits (xmin xmax ymin ymax zmin
zmax). The crystallographic unit cell must have been defined.
- HYDRogen selects all atoms with masses (atom-property MASS)
below approximately 3.5 amu.
- ID integer selects all atoms that match the specified
internal atom number. It should be used with caution. The main
application is in conjunction with the "FOR symbol IN ID"
statement.
- KNOWn selects all atoms with known coordinates.
- NAME *atom* selects all atoms that match the specified atom
name or a wildcard of it.
- NAME atom:atom selects all atoms that have atom names
greater than or equal to the first atom name but less than or equal to
the second atom name.
- NONE selects no atoms.
- NOT expression selects all atoms that have not been
selected in the expression.
- POINt vector CUT real selects all atoms that are within the
specified real cutoff value around the specified 3d-vector.
- PREVious selects all atoms that have been selected in a
previous selection in application statements that contain multiple
selections.
- RESIdue *residue-number* selects all atoms that match the
specified residue number or a wildcard of it.
- RESIdue residue-number:residue-number selects all atoms
that have residue numbers greater than or equal to the first residue
number but less than or equal to the second residue number.
- RESName *residue-name* selects all atoms that match the
specified residue name or a wildcard of it.
- RESName residue-name:residue-name selects all atoms that
have residue names greater than or equal to the first residue name but
less than or equal to the second residue name.
- expression SAROund real selects all atoms that are within
the specified real cutoff value around any atoms selected in the
expression or any of its crystallographic or non-crystallographic
symmetry mates.
- SEGId *segment-name* selects all atoms that match the
specified segment name or a wildcard of it.
- SEGId segment-name:segment-name selects all atoms that have
segment names greater than or equal to the first segment name but less
than or equal to the second segment name.
- SFBOX real real real real real real selects all atoms
including symmetry related atoms that lie within the specified
fractional limits (xmin xmax ymin ymax zmin zmax). The
crystallographic unit cell and spacegroup symmetry must have been
defined.
- STORE1|STORE2|STORE3|STORE4|STORE5|STORE6|STORE7|STORE8|STORE9
selects all atoms for which the value of STOREi is greater than 0;
e.g., STORE2 is shorthand for "ATTRibute STORE2 > 0", etc. The STOREi
can be defined by the IDENtity statement or the DO statement.
- TAG selects exactly one atom from each residue. These
selected atoms may be used to "tag" all residues without having to
refer to residue numbers or identifiers. The sequence of selected
atoms is determined by the order in which the residues have been
created through the segment statement. The identity of the atom
selected per residue is not well defined.
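The keywords above can be combined freely. The following selections are illustrative sketches only: the segment name A, the atom name SG, the residue range, and the cutoff are hypothetical values, not taken from a particular structure.

```
{ atoms covalently bonded to any SG atom }
( bondedto ( name sg ) )

{ complete residues outside segment A that have at least one atom
  within 4 Å of any atom of segment A }
( ( byres ( ( segid A ) around 4.0 ) ) and not ( segid A ) )

{ non-hydrogen atoms with known coordinates in residues 10 through 20 }
( known and not hydrogen and resid 10:20 )
```

The extra parentheses are not always required, but they make the grouping explicit and avoid relying on the order in which the logical operators are evaluated.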
The atomic properties which can be used in the attribute statement
are:
- B B-factors of main coordinate set in Å^2 (real)
- BCOMp B-factors of comparison coordinate set in Å^2 (real)
- CHARge electric charge in electronic charges (real)
- DX x component of first derivatives in kcal mole^-1 Å^-1 (real)
- DY y component of first derivatives in kcal mole^-1 Å^-1 (real)
- DZ z component of first derivatives in kcal mole^-1 Å^-1 (real)
- FBETa friction coefficient in psec^-1 (real)
- HARMonic energy constants of harmonic restraints in kcal mole^-1 Å^-2 (real)
- MASS mass in amu (real)
- Q occupancies of main coordinate set (real)
- QCOMp occupancies of comparison coordinate set (real)
- REFX x component of reference coordinate set in Å (real)
- REFY y component of reference coordinate set in Å (real)
- REFZ z component of reference coordinate set in Å (real)
- RMSD array used by various modules, e.g., the COOR RMS statement
- SCATTER_A1 atomic form-factor coefficient a1 (real)
- SCATTER_A2 atomic form-factor coefficient a2 (real)
- SCATTER_A3 atomic form-factor coefficient a3 (real)
- SCATTER_A4 atomic form-factor coefficient a4 (real)
- SCATTER_B1 atomic form-factor coefficient b1 (real)
- SCATTER_B2 atomic form-factor coefficient b2 (real)
- SCATTER_B3 atomic form-factor coefficient b3 (real)
- SCATTER_B4 atomic form-factor coefficient b4 (real)
- SCATTER_C atomic form-factor coefficient c (real)
- SCATTER_FP atomic f' coefficient (real)
- SCATTER_FDP atomic f'' coefficient (real)
- VX x component of current velocities in Å psec^-1 (real)
- VY y component of current velocities in Å psec^-1 (real)
- VZ z component of current velocities in Å psec^-1 (real)
- X x component of main coordinate set in Å (real)
- XCOMp x component of comparison coordinate set in Å (real)
- Y y component of main coordinate set in Å (real)
- YCOMp y component of comparison coordinate set in Å (real)
- Z z component of main coordinate set in Å (real)
- ZCOMp z component of comparison coordinate set in Å (real)
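As a hedged illustration of the ATTRibute selection with some of the properties listed above, the following sketches use arbitrary example thresholds. The form with ABS follows the syntax given in the selection list.

```
{ atoms with B-factors above 60 Å^2 }
( attribute b > 60.0 )

{ atoms with zero occupancy }
( attribute q = 0.0 )

{ atoms whose charge magnitude exceeds 0.5 electronic charges }
( attribute abs charge > 0.5 )

{ atoms heavier than 3.5 amu; roughly the complement of HYDRogen }
( attribute mass > 3.5 )
```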
Examples
The first example selects all CA carbon atoms between residue
number 40 and 100:
( name ca and resid 40:100 )
The next example selects all heavy side-chain atoms:
( not ( name ca or name n or name c or name o or hydrogen ) )
The next example selects all atoms in Phe residues that are within
20Å around residue 1:
( resname phe and ( residue 1 around 20.0 ) )
The next example is similar to the previous one, except that all
atoms of a particular Phe residue are selected once any atom of this
residue is within 20Å from residue 1:
( byresidue ( resname phe and ( residue 1 around 20.0 ) ) )
Suppose that we want to get the stereochemistry as a function of
residue number. The following example tags each residue by using its
Cα carbon atom and then computes the rms deviation of bond lengths
from ideality for all atoms of the selected residue.
for $1 in id ( name ca ) loop main
   igroup
      interaction ( byresidue ( id $1 ) ) ( byresidue ( id $1 ) )
   end
   print threshold=0.1 bonds
   display $result
end loop main
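A variant of the loop above can use the TAG selection instead of a named atom, which also covers residues that have no CA atom. This is a sketch under assumptions: the symbol names $id and $resid and the loop label res are chosen for illustration, and the average B-factor stands in for any per-residue quantity.

```
for $id in id ( tag ) loop res
   { look up the residue number of the tagged atom }
   show element ( resid ) ( id $id )
   evaluate ( $resid = $result )
   { average B-factor over all atoms of that residue }
   show ave ( b ) ( byresidue ( id $id ) )
   display residue $resid average B = $result
end loop res
```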
Do, show and identity statements
The do, show and identity statements allow the manipulation and
query of atomic properties, such as masses, charges, coordinates,
forces, and atom names. Mathematical expressions can be constructed
that involve atomic properties. The show statement can also be used to
analyze atomic properties and to transfer the information to the
$RESULT symbol. The identity statement can be used to define and store
an atom selection that can be recalled later.
- DO manipulates atom properties. The operations are carried
out component by component for all selected atoms.
- IDENtity is used to define and store an atom selection. The
output array will contain the sequential number of the selected atoms
or otherwise zero.
- SHOW show-property can be used to analyze atom
properties. The possible show-properties are:
  - AVE shows the arithmetic average of selected elements and stores it in $RESULT.
  - ELEMent shows selected elements and stores the last element in $RESULT.
  - MAX shows the maximum of selected elements and stores it in $RESULT.
  - MIN shows the minimum of selected elements and stores it in $RESULT.
  - NORM shows the norm (sqrt(sum(x^2)/N)) of selected elements and stores it in $RESULT.
  - RMS shows the root-mean-square (rms) deviation (sqrt(sum((x - <x>)^2)/N),
    where <x> is the average) of selected elements and stores it in $RESULT.
  - SUM shows the arithmetic sum of selected elements and stores it in $RESULT.
Atom properties can be reassigned the result of an operation using
the DO statement:
DO ( atom-property = operation ) ( atom-selection )
The SHOW statement shows only the result of an operation:
SHOW ( operation ) ( atom-selection )
The IDENtity statement does not use an operation, only an atomic
STORE array and an atom selection:
IDENtity ( store[i] ) ( atom-selection )
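To illustrate how a show-property and $RESULT work together, the following sketch collects two statistics of the backbone B-factors. The symbol names $bmax and $brms are hypothetical, and the backbone atom names are the same ones used in the selection examples earlier in this document.

```
{ largest B-factor among backbone atoms }
show max ( b ) ( name ca or name n or name c or name o )
evaluate ( $bmax = $result )

{ rms deviation of the same B-factors from their average }
show rms ( b ) ( name ca or name n or name c or name o )
evaluate ( $brms = $result )

display backbone B-factors: max= $bmax  rms= $brms
```

Each SHOW statement overwrites $RESULT, so the value is copied into its own symbol before the next SHOW is issued.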
Operations are expressions built from the following operands:
atom-property | function | integer | real | string | symbol
The operators (in order of increasing precedence) are:
- + denotes addition; concatenation for strings.
- - denotes subtraction; unary minus or negative concatenation for strings.
- * denotes multiplication.
- / denotes division.
- ^ denotes exponentiation.
- ** denotes exponentiation (same as ^).
Operators with the highest precedence are executed first. Operators
with the same precedence are executed from left to right. Operations
have to be meaningful; i.e., the data type of the operands has to
match the operation. For strings, only the "+" and "-" operations are
allowed.
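The precedence rules can be checked with a small numerical sketch. STORE1 serves here only as a scratch array, and the numbers are arbitrary.

```
{ exponentiation is evaluated before multiplication, and multiplication
  before addition, so this assigns 2 + ( 3 * ( 4 ^ 2 ) ) = 50 }
do ( store1 = 2 + 3 * 4 ^ 2 ) ( all )

{ parentheses override the default order: ( 2 + 3 ) * 4 = 20 }
do ( store1 = ( 2 + 3 ) * 4 ) ( all )
```

When in doubt, explicit parentheses are a cheap way to make the intended order unambiguous.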
Functions are as follows. Use of a string requires enclosure in
double quotes " ". The data type of the function arguments has to
match the data type of the operands.
- ABS() expects one argument and returns its absolute
value. Argument restrictions: no string.
- ACOS() denotes arc cosine. Argument restrictions: no string
or complex; the result is returned in degrees.
- ASIN() denotes arc sine. Argument restrictions: no string
or complex; the result is returned in degrees.
- CAPITALIZE() converts all characters in a string to uppercase.
- COS() denotes cosine. Argument restrictions: no string;
expects argument in degrees.
- DECODE() converts a character string to a number,
if possible.
- ENCODE() converts a number to a character string.
- EXP() is an exponentiation function. Argument restrictions:
no string.
- GAUSS() is a Gaussian distribution random-number function;
it has one argument, the desired standard deviation. The mean of the
distribution is always zero. Argument restrictions: no string or
complex.
- IMOD( a, b ) returns the nearest integer remainder of the
first argument divided by the second. Argument restrictions: no string
or complex.
- INT() is a truncation. Argument restrictions: no string.
- LOG10() is a base-10 logarithmic function. Argument
restrictions: no string or complex; argument must be greater than
zero.
- LOG() is a natural logarithmic function. Argument
restrictions: no string or complex; real numbers must be greater than
zero.
- MAX( list ) is a maximum-value function; it must have at
least two arguments, and it returns the value of the argument with the
maximum value. Argument restrictions: no string or complex.
- MAXW() is a Maxwellian distribution random-number function;
it has one argument, the desired standard deviation. The mean of the
distribution is always zero. Argument restrictions: no string or
complex.
- MIN( list ) is a minimum-value function; it must have at
least two arguments, and it returns the value of the argument with the
minimum value. Argument restrictions: no string or complex.
- MOD( a, b ) returns the remainder of the first argument
divided by the second. Argument restrictions: no string or complex.
- NINT() returns nearest integer. Argument restrictions: no
string.
- NORM() is a normalization function; it has one argument, which
must be an atom property. This function calculates the sum of the
squares of all selected elements in the argument array. Then it
divides each selected element by the square root of the sum of the
squares. Argument restrictions: no string or complex.
- RANDom() is a random-number function; it has no
argument. It returns a uniform distribution between 0 and 1.
- SIGN() is a transfer of sign. If the argument is >= 0, it
returns +1; if the argument is < 0, it returns -1. Argument
restrictions: no string or complex.
- SIN() denotes sine. Argument restrictions: no string;
expects degrees.
- SQRT() returns the square root of the given
argument. Argument restrictions: no string; no negative real numbers.
- STEP() is a step function; it expects one real-number
argument. If the argument is greater than zero, step returns a one;
otherwise step returns a zero. Argument restrictions: no string or
complex.
- TAN() denotes tangent. Argument restrictions: no
string or complex; expects argument in degrees.
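A few of these functions are sketched below in DO statements. The thresholds and the offset of 100 are arbitrary example values, STORE1 and STORE2 are used as scratch arrays, and the last statement assumes that all residue numbers are purely numeric (no insertion characters), since DECODE can convert only strings that represent a number.

```
{ clamp B-factors to the range 2 to 100 Å^2 }
do ( b = max( 2.0, min( 100.0, b ) ) ) ( all )

{ flag atoms with B > 50 Å^2: STEP returns 1 for a positive argument, else 0 }
do ( store1 = step( b - 50.0 ) ) ( all )

{ add Gaussian noise with a standard deviation of 0.1 Å to the x coordinates }
do ( x = x + gauss( 0.1 ) ) ( all )

{ trigonometric functions take degrees, so this assigns 0.5 }
do ( store2 = sin( 30.0 ) ) ( all )

{ shift residue numbers by 100 via DECODE and ENCODE }
do ( resid = encode( decode( resid ) + 100 ) ) ( all )
```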
The complete set of atomic properties is:
- B B-factors of main coordinate set in Å^2 (real)
- BCOMp B-factors of comparison coordinate set in Å^2 (real)
- CHARge electric charge in electronic charges (real)
- CHEMical chemical atom type (string)
- DX x component of first derivatives in kcal mole^-1 Å^-1 (real)
- DY y component of first derivatives in kcal mole^-1 Å^-1 (real)
- DZ z component of first derivatives in kcal mole^-1 Å^-1 (real)
- FBETa friction coefficient in psec^-1 (real)
- HARMonic energy constants of harmonic restraints in kcal mole^-1 Å^-2 (real)
- MASS mass in amu (real)
- NAME atom name (string)
- Q occupancies of main coordinate set (real)
- QCOMp occupancies of comparison coordinate set (real)
- REFX x component of reference coordinate set in Å (real)
- REFY y component of reference coordinate set in Å (real)
- REFZ z component of reference coordinate set in Å (real)
- RESId residue number (string)
- RESName residue name (string)
- RMSD array used by various modules, e.g., the COOR RMS statement
- SEGId segment or chain identifier (string)
- SCATTER_A1 atomic form-factor coefficient a1 (real)
- SCATTER_A2 atomic form-factor coefficient a2 (real)
- SCATTER_A3 atomic form-factor coefficient a3 (real)
- SCATTER_A4 atomic form-factor coefficient a4 (real)
- SCATTER_B1 atomic form-factor coefficient b1 (real)
- SCATTER_B2 atomic form-factor coefficient b2 (real)
- SCATTER_B3 atomic form-factor coefficient b3 (real)
- SCATTER_B4 atomic form-factor coefficient b4 (real)
- SCATTER_C atomic form-factor coefficient c (real)
- SCATTER_FP atomic f' coefficient (real)
- SCATTER_FDP atomic f'' coefficient (real)
- STORE1 1st internal store, is fragile (real)
- STORE2 2nd internal store, is fragile (real)
- STORE3 3rd internal store, is fragile (real)
- STORE4 4th internal store, is fragile (real)
- STORE5 5th internal store, is fragile (real)
- STORE6 6th internal store, is fragile (real)
- STORE7 7th internal store, is fragile (real)
- STORE8 8th internal store, is fragile (real)
- STORE9 9th internal store, is fragile (real)
- VX x component of current velocities in Å psec^-1 (real)
- VY y component of current velocities in Å psec^-1 (real)
- VZ z component of current velocities in Å psec^-1 (real)
- X x component of main coordinate set in Å (real)
- XCOMp x component of comparison coordinate set in Å (real)
- Y y component of main coordinate set in Å (real)
- YCOMp y component of comparison coordinate set in Å (real)
- Z z component of main coordinate set in Å (real)
- ZCOMp z component of comparison coordinate set in Å (real)
Requirements
The molecular structure has to be present. Upon modification of the
molecular structure (e.g., delete or patch statement), the contents of
the internal stores are destroyed; i.e., they are fragile. However,
all other atom properties are conserved; i.e., they are only partially
fragile.
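The practical consequence is that a stored selection has to be defined again after the structure has been modified. The following sketch assumes the usual form of the delete statement and uses HOH as a hypothetical water residue name.

```
identity ( store1 ) ( name ca )

{ modifying the structure destroys the contents of the internal stores }
delete selection=( resname hoh ) end

{ define the selection again before using ( store1 ) }
identity ( store1 ) ( name ca )
```

Other properties, such as coordinates or B-factors, are carried over to the modified structure and need no such step.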
Examples
The first example divides the coordinate array Z by the derivative
array DX, adds the quotient to the coordinate array Y, and stores the
result in the coordinate array X. The operations are carried out
component by component for all atoms.
do ( X = Y + Z / DX ) ( all )
The next example computes a Gaussian distribution with standard
deviation 1.0 and stores the result in the coordinate array x for all
Cα atoms:
do ( X = GAUSS( 1.0 ) ) ( name ca )
The next example provides a listing of the X coordinates of all Tyr
residues:
show element ( X ) ( resname tyr )
The next example computes the average of all electric charges in
residue 34. This average value is then stored in the symbol $1 by
using the evaluate statement.
show ave ( charge ) ( residue 34 )
evaluate ($1=$RESULT)
The next example stores the specified atom selection in the array
STORE1:
identity ( store1 ) ( attribute mass > 30.0 )
The array STORE1 will be nonzero for the selected atoms and zero
otherwise. The values for the selected atoms represent the sequential
number of the selected atoms. The array STORE1 can be recalled by
using ( store1 ) in a selection statement.
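As a brief sketch of recalling the stored selection, the lines below count the stored atoms and then reuse the selection together with a second criterion. The occupancy value and the residue range are arbitrary example values.

```
identity ( store1 ) ( attribute mass > 30.0 )

{ count the stored atoms }
show sum ( 1 ) ( store1 )
display number of stored atoms: $result

{ reuse the stored selection, combined with a residue range }
do ( q = 0.5 ) ( store1 and resid 1:10 )
```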