Crystallography & NMR System

CNS has a powerful atom selection syntax that allows one to select atoms without reference to the internal index and to construct arbitrary logical expressions of selected atoms. The number of selected atoms from the last executed selection statement is stored in the symbol $SELECT. Atom selections are generally fragile, which means that information associated with atom selections is lost when changing the molecular structure. Certain selections are rendered only partially fragile by mapping the selected atoms to the new molecular structure after it has been modified. This applies to all atom properties except for the internal stores, which are fragile. It also applies to the atom-based parameter statements. Note, however, that the atom selections are not applied to any newly created atoms and these atoms are then not selected. (The reason for this is that CNS does not store the strings associated with the selections.)

An atom selection consists of one or more expressions which are combined with the logical operators AND and OR. Parentheses are used to group expressions (see the preceding documentation for further examples). The following selection statements are possible:

ALL selects all atoms.
expression AROUnd real selects all atoms that are within the specified real cutoff value around any atom selected in the expression.
ATOM *segment-name* *residue-number* *atom* selects all atoms that match the specified segment name, residue number, and atom name or wildcards of them.
ATTRibute [ABS] atom-property < | = | # | > real selects all atoms that have (optionally absolute) properties less than, equal to, not equal to, or greater than the specified real number.
BONDedto expression selects all atoms which are covalently bonded to any atom selected by the expression. The connectivity is found by analysis of the bonds defined in the molecular topology.
BYGRoup expression selects all atoms that belong to groups containing at least one atom that has been selected in the expression.
BYRes expression selects all atoms that belong to residues containing at least one atom that has been selected in the expression.
CHEMical *type* selects all atoms that match the specified type or a wildcard of it.
CHEMical type:type selects all atoms that have types greater than or equal to the first type but less than or equal to the second type in alphanumeric order.
FBOX real real real real real real selects all atoms that lie within the specified fractional limits (xmin xmax ymin ymax zmin zmax). The crystallographic unit cell must have been defined.
HYDRogen selects all atoms with masses (atom-property MASS) below approximately 3.5 amu.
ID integer selects all atoms that match the specified internal atom number. It should be used with caution. The main application is in conjunction with the "FOR symbol IN ID" statement.
KNOWn selects all atoms with known coordinates.
NAME *atom* selects all atoms that match the specified atom name or a wildcard of it.
NAME atom:atom selects all atoms that have atom names greater than or equal to the first atom name but less than or equal to the second atom name.
NONE selects no atoms.
NOT expression selects all atoms that have not been selected in the expression.
POINt vector CUT real selects all atoms that are within the specified real cutoff value around the specified 3d-vector.
PREVious selects all atoms that have been selected in a previous selection in application statements that contain multiple selections.
RESIdue *residue-number* selects all atoms that match the specified residue number or a wildcard of it.
RESIdue residue-number:residue-number selects all atoms that have residue numbers greater than or equal to the first residue number but less than or equal to the second residue number.
RESName *residue-name* selects all atoms that match the specified residue name or a wildcard of it.
RESName residue-name:residue-name selects all atoms that have residue names greater than or equal to the first residue name but less than or equal to the second residue name.
expression SAROund real selects all atoms that are within the specified real cutoff value around any atoms selected in the expression or any of its crystallographic or non-crystallographic symmetry mates.
SEGId *segment-name* selects all atoms that match the specified segment name or a wildcard of it.
SEGId segment-name:segment-name selects all atoms that have segment names greater than or equal to the first segment name but less than or equal to the second segment name.
SFBOX real real real real real real selects all atoms including symmetry related atoms that lie within the specified fractional limits (xmin xmax ymin ymax zmin zmax). The crystallographic unit cell and spacegroup symmetry must have been defined.
STORE1|STORE2|STORE3|STORE4|STORE5|STORE6|STORE7|STORE8|STORE9 selects all atoms for which the value of STOREi is greater than 0; e.g., STORE2 is short hand for "ATTRibute STORE2 > 0", etc. The STOREi can be defined by the IDENtity statement or the DO statement.
TAG selects exactly one atom from each residue. These selected atoms may be used to "tag" all residues without having to refer to residue numbers or identifiers. The sequence of selected atoms is determined by the order in which the residues have been created through the segment statement. The identity of the atom selected per residue is not well defined.
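
The logical structure of a selection can be pictured with ordinary set operations. The following Python sketch is not CNS code; it models a selection as a set of atom indices over a small hypothetical atom table (the `Atom` record, the toy values, and the helper names `select` and `attribute` are assumptions made for illustration). AND corresponds to intersection, OR to union, NOT to the complement with respect to ALL, and the size of the resulting set plays the role of $SELECT. The HYDRogen test is modelled as a plain `mass < 3.5` comparison.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    segid: str
    resid: int
    resname: str
    name: str
    mass: float
    charge: float


# Hypothetical toy structure; the numbers are illustrative only.
ATOMS = [
    Atom("A", 1, "GLY", "N", 14.007, -0.36),
    Atom("A", 1, "GLY", "CA", 12.011, 0.00),
    Atom("A", 1, "GLY", "HA1", 1.008, 0.10),
    Atom("A", 2, "PHE", "CA", 12.011, 0.00),
    Atom("A", 2, "PHE", "CZ", 12.011, -0.12),
]


def select(predicate):
    """Return the set of atom indices for which the predicate holds."""
    return {i for i, atom in enumerate(ATOMS) if predicate(atom)}


def attribute(prop, op, value, absolute=False):
    """Model of: ATTRibute [ABS] atom-property < | = | # | > real."""
    compare = {
        "<": lambda x: x < value,
        "=": lambda x: x == value,
        "#": lambda x: x != value,  # '#' means "not equal"
        ">": lambda x: x > value,
    }[op]

    def predicate(atom):
        x = getattr(atom, prop)
        return compare(abs(x) if absolute else x)

    return select(predicate)


ALL = select(lambda a: True)
name_ca = select(lambda a: a.name == "CA")
hydrogen = attribute("mass", "<", 3.5)

ca_in_phe = name_ca & select(lambda a: a.resname == "PHE")  # AND
ca_or_h = name_ca | hydrogen                                # OR
heavy = ALL - hydrogen                                      # NOT
charged = attribute("charge", ">", 0.05, absolute=True)     # ABS variant

n_selected = len(heavy)  # plays the role of $SELECT
```

In this model every selection keyword is just another predicate, which is why arbitrary nesting with parentheses works without reference to internal atom numbers.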

The atomic properties which can be used in the attribute statement are:

B B-factors of main coordinate set in Å^2 (real)
BCOMp B-factors of comparison coordinate set in Å^2 (real)
CHARge electric charge in electronic charges (real)
DX x component of first derivatives in kcal mole^-1 Å^-1 (real)
DY y component of first derivatives in kcal mole^-1 Å^-1 (real)
DZ z component of first derivatives in kcal mole^-1 Å^-1 (real)
FBETa friction coefficient in psec^-1 (real)
HARMonic energy constants of harmonic restraints in kcal mole^-1 Å^-2 (real)
MASS mass in amu (real)
Q occupancies of main coordinate set (real)
QCOMp occupancies of comparison coordinate set (real)
REFX x component of reference coordinate set in Å (real)
REFY y component of reference coordinate set in Å (real)
REFZ z component of reference coordinate set in Å (real)
RMSD array used by various modules, e.g., the COOR RMS statement
SCATTER_A1 atomic form-factor coefficient a1 (real)
SCATTER_A2 atomic form-factor coefficient a2 (real)
SCATTER_A3 atomic form-factor coefficient a3 (real)
SCATTER_A4 atomic form-factor coefficient a4 (real)
SCATTER_B1 atomic form-factor coefficient b1 (real)
SCATTER_B2 atomic form-factor coefficient b2 (real)
SCATTER_B3 atomic form-factor coefficient b3 (real)
SCATTER_B4 atomic form-factor coefficient b4 (real)
SCATTER_C atomic form-factor coefficient c (real)
SCATTER_FP atomic f' coefficient (real)
SCATTER_FDP atomic f'' coefficient (real)
VX x component of current velocities in Å psec^-1 (real)
VY y component of current velocities in Å psec^-1 (real)
VZ z component of current velocities in Å psec^-1 (real)
X x component of main coordinate set in Å (real)
XCOMp x component of comparison coordinate set in Å (real)
Y y component of main coordinate set in Å (real)
YCOMp y component of comparison coordinate set in Å (real)
Z z component of main coordinate set in Å (real)
ZCOMp z component of comparison coordinate set in Å (real)

Examples

The first example selects all CA carbon atoms between residue number 40 and 100:

  ( name ca and resid 40:100 )

The next example selects all heavy side-chain atoms:

  ( not ( name ca or name n or name c or name o or hydrogen ) )

The next example selects all atoms in Phe residues that are within 20 Å of residue 1:

  ( resname phe and ( residue 1 around 20.0 ) )

The next example is similar to the previous one, except that all atoms of a particular Phe residue are selected as soon as any atom of that residue is within 20 Å of residue 1:

  ( byresidue ( resname phe and ( residue 1 around 20.0 ) ) )
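
The difference between the last two selections can be made concrete with a small model. The Python sketch below is not CNS code: the coordinates are hypothetical, residues are identified by residue number alone (no segment name), and the cutoff test is assumed to be inclusive. It shows that the plain AROUnd selection returns only the Phe atoms inside the cutoff, while wrapping it in BYRes extends the result to every atom of the residues that were hit.

```python
import math

# (residue number, residue name, atom name, (x, y, z) in Å); hypothetical values
ATOMS = [
    (1, "ALA", "CA", (0.0, 0.0, 0.0)),
    (2, "PHE", "CA", (15.0, 0.0, 0.0)),
    (2, "PHE", "CZ", (22.0, 0.0, 0.0)),
    (3, "PHE", "CA", (40.0, 0.0, 0.0)),
]


def around(selected, cutoff):
    """Atoms within `cutoff` of any atom in `selected` (inclusive test assumed)."""
    return {
        i
        for i, atom in enumerate(ATOMS)
        if any(math.dist(atom[3], ATOMS[j][3]) <= cutoff for j in selected)
    }


def byres(selected):
    """All atoms of every residue that contains at least one selected atom."""
    residues = {ATOMS[i][0] for i in selected}
    return {i for i, atom in enumerate(ATOMS) if atom[0] in residues}


residue_1 = {i for i, atom in enumerate(ATOMS) if atom[0] == 1}
phe = {i for i, atom in enumerate(ATOMS) if atom[1] == "PHE"}

# ( resname phe and ( residue 1 around 20.0 ) )
near_phe_atoms = phe & around(residue_1, 20.0)

# ( byresidue ( resname phe and ( residue 1 around 20.0 ) ) )
near_phe_residues = byres(near_phe_atoms)
```

Here the CZ atom of residue 2 lies 22 Å away and is missed by the first selection, but it is picked up by the second because its residue contains a CA atom inside the cutoff.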

Suppose that we want to get the stereochemistry as a function of residue number. The following example tags each residue by using its Cα carbon atom and then computes the rms deviation of bond lengths from ideality for all atoms of the selected residue.

  for $1 in id ( name ca ) loop main 
     igroup 
       interaction ( byresidue ( id $1 ) ) ( byresidue ( id $1 ) )
     end 
     print threshold=0.1 bonds 
     display $result 
  end loop main

Do, show and identity statements

The do, show and identity statements allow the manipulation and query of atomic properties, such as masses, charges, coordinates, forces, and atom names. Mathematical expressions can be constructed that involve atomic properties. The show statement can also be used to analyze atomic properties and to transfer the information to the $RESULT symbol. The identity statement can be used to define and store an atom selection that can be recalled later.

DO manipulates atom properties. The operations are carried out component by component for all selected atoms.
IDENtity is used to define and store an atom selection. The output array will contain the sequential number of the selected atoms or otherwise zero.
SHOW show-property can be used to analyze atom properties. The possible show-properties are:
- AVE shows the arithmetic average of selected elements and stores it in $RESULT.
- ELEMent shows selected elements and stores the last element in $RESULT.
- MAX shows the maximum of selected elements and stores it in $RESULT.
- MIN shows the minimum of selected elements and stores it in $RESULT.
- NORM shows the norm (sqrt(sum x^2/N)) of selected elements and stores it in $RESULT.
- RMS shows the root-mean-square (rms) deviation (sqrt(sum (x-<x>)^2/N), where <x> is the arithmetic average) of selected elements and stores it in $RESULT.
- SUM shows the arithmetic sum of selected elements and stores it in $RESULT.
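
The three averaging properties differ only in what is summed. As a minimal sketch in Python (not CNS code; the function names and the sample values are made up for illustration, and the rms deviation is assumed to be taken about the arithmetic average), the formulas can be written as:

```python
import math


def show_ave(values):
    """Arithmetic average: sum(x) / N."""
    return sum(values) / len(values)


def show_norm(values):
    """Norm as defined above: sqrt(sum(x^2) / N)."""
    return math.sqrt(sum(x * x for x in values) / len(values))


def show_rms(values):
    """RMS deviation about the average: sqrt(sum((x - <x>)^2) / N)."""
    mean = show_ave(values)
    return math.sqrt(sum((x - mean) ** 2 for x in values) / len(values))


# Hypothetical property values for four selected atoms.
selected_values = [1.0, 2.0, 3.0, 4.0]

ave = show_ave(selected_values)    # 2.5
norm = show_norm(selected_values)  # sqrt(7.5)
rms = show_rms(selected_values)    # sqrt(1.25)
```

With these definitions the three quantities are related by norm^2 = rms^2 + ave^2, which is a convenient consistency check when comparing SHOW outputs.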

Atom properties can be reassigned the result of an operation using the DO statement:

  DO ( atom-property = operation ) ( atom-selection )

The SHOW statement shows only the result of an operation:

  SHOW ( operation ) ( atom-selection )

The IDENtity statement does not use an operation, only an atomic STORE array and an atom selection:

  IDENtity ( store[i] ) ( atom-selection )

Operations are expressions using:

  atom-property | function | integer | real | string | symbol

The operators, in order of increasing precedence, are:

+ denotes addition; concatenation for strings.
- denotes subtraction; unary minus or negative concatenation for strings.
* denotes multiplication.
/ denotes division.
^ denotes exponentiation.
** denotes exponentiation (same as ^).

Operators with the highest precedence are executed first. Operators with the same precedence are executed from left to right. Operations have to be meaningful; i.e., the data type of the operands has to match the operation. For strings, only the "+" and "-" operations are allowed.

Functions are as follows. Use of a string requires enclosure in double quotes " ". The data type of the function arguments has to match the data type of the operands.

ABS() expects one argument and returns its absolute value. Argument restrictions: no string.
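ACOS() denotes arc cosine. Argument restrictions: no string or complex; the argument must lie between -1 and 1, and the result is returned in degrees.
ASIN() denotes arc sine. Argument restrictions: no string or complex; the argument must lie between -1 and 1, and the result is returned in degrees.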
CAPITALIZE() converts all characters in a string to uppercase.
COS() denotes cosine. Argument restrictions: no string; expects argument in degrees.
DECODE() converts a character string to a number if possible.
ENCODE() converts a number to a character string.
EXP() is the exponential function. Argument restrictions: no string.
GAUSS() is a Gaussian distribution random-number function; it has one argument, the desired standard deviation. The mean of the distribution is always zero. Argument restrictions: no string or complex.
IMOD( a, b ) returns the nearest integer remainder of the first argument divided by the second. Argument restrictions: no string or complex.
INT() is a truncation. Argument restrictions: no string.
LOG10() is a base-10 logarithmic function. Argument restrictions: no string or complex; argument must be greater than zero.
LOG() is a natural logarithmic function. Argument restrictions: no string or complex; real numbers must be greater than zero.
MAX( list ) is a maximum-value function; it must have at least two arguments, and it returns the value of the argument with the maximum value. Argument restrictions: no string or complex.
MAXW() is a Maxwellian distribution random-number function; it has one argument, the desired standard deviation. The mean of the distribution is always zero. Argument restrictions: no string or complex.
MIN( list ) is a minimum-value function; it must have at least two arguments, and it returns the value of the argument with the minimum value. Argument restrictions: no string or complex.
MOD( a, b ) returns the remainder of the first argument divided by the second. Argument restrictions: no string or complex.
NINT() returns nearest integer. Argument restrictions: no string.
NORM() is a normalization function; it has one argument, which must be an atom property. This function calculates the sum of the squares of all selected elements in the argument array. Then it divides each selected element by the square root of the sum of the squares. Argument restrictions: no string or complex.
RANDom() is a random-number function; it has no argument. It returns a uniform distribution between 0 and 1.
SIGN() is a transfer of sign. If the argument is >= 0, it returns +1; if the argument is < 0, it returns -1. Argument restrictions: no string or complex.
SIN() denotes sine. Argument restrictions: no string; expects degrees.
SQRT() returns the square root of the given argument. Argument restrictions: no string; no negative real numbers.
STEP() is a step function; it expects one real-number argument. If the argument is greater than zero, step returns a one; otherwise step returns a zero. Argument restrictions: no string or complex.
TAN() denotes tangent. Argument restrictions: no string or complex; expects argument in degrees.
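
A few of these definitions are easy to misread at the boundary cases, so here is a minimal Python model of them (not CNS code; the function names are chosen for illustration, and `norm` takes the values of the selected atoms as a plain list). It follows the descriptions above literally: trigonometric arguments are in degrees, SIGN returns +1 at zero, STEP returns 0 at zero, and NORM divides each selected element by the square root of the sum of squares.

```python
import math


def cos_deg(angle):
    """COS with the argument given in degrees."""
    return math.cos(math.radians(angle))


def sign(x):
    """SIGN: +1 if the argument is >= 0, otherwise -1."""
    return 1.0 if x >= 0 else -1.0


def step(x):
    """STEP: 1 if the argument is greater than zero, otherwise 0."""
    return 1.0 if x > 0 else 0.0


def norm(values):
    """NORM: divide each selected element by sqrt(sum of squares)."""
    length = math.sqrt(sum(v * v for v in values))
    return [v / length for v in values]


half = cos_deg(60.0)           # approximately 0.5
at_zero = (sign(0.0), step(0.0))
unit = norm([3.0, 4.0])        # approximately [0.6, 0.8]
```

Note that SIGN and STEP treat zero differently, which matters when they are used to build masks from a property that can be exactly zero, such as an unset store array.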

The complete set of atomic properties is:

B B-factors of main coordinate set in Å^2 (real)
BCOMp B-factors of comparison coordinate set in Å^2 (real)
CHARge electric charge in electronic charges (real)
CHEMical chemical atom type (string)
DX x component of first derivatives in kcal mole^-1 Å^-1 (real)
DY y component of first derivatives in kcal mole^-1 Å^-1 (real)
DZ z component of first derivatives in kcal mole^-1 Å^-1 (real)
FBETa friction coefficient in psec^-1 (real)
HARMonic energy constants of harmonic restraints in kcal mole^-1 Å^-2 (real)
MASS mass in amu (real)
NAME atom name (string)
Q occupancies of main coordinate set (real)
QCOMp occupancies of comparison coordinate set (real)
REFX x component of reference coordinate set in Å (real)
REFY y component of reference coordinate set in Å (real)
REFZ z component of reference coordinate set in Å (real)
RESId residue number (string)
RESName residue name (string)
RMSD array used by various modules, e.g., the COOR RMS statement
SEGId segment or chain identifier (string)
SCATTER_A1 atomic form-factor coefficient a1 (real)
SCATTER_A2 atomic form-factor coefficient a2 (real)
SCATTER_A3 atomic form-factor coefficient a3 (real)
SCATTER_A4 atomic form-factor coefficient a4 (real)
SCATTER_B1 atomic form-factor coefficient b1 (real)
SCATTER_B2 atomic form-factor coefficient b2 (real)
SCATTER_B3 atomic form-factor coefficient b3 (real)
SCATTER_B4 atomic form-factor coefficient b4 (real)
SCATTER_C atomic form-factor coefficient c (real)
SCATTER_FP atomic f' coefficient (real)
SCATTER_FDP atomic f'' coefficient (real)
STORE1 1st internal store, is fragile (real)
STORE2 2nd internal store, is fragile (real)
STORE3 3rd internal store, is fragile (real)
STORE4 4th internal store, is fragile (real)
STORE5 5th internal store, is fragile (real)
STORE6 6th internal store, is fragile (real)
STORE7 7th internal store, is fragile (real)
STORE8 8th internal store, is fragile (real)
STORE9 9th internal store, is fragile (real)
VX x component of current velocities in Å psec^-1 (real)
VY y component of current velocities in Å psec^-1 (real)
VZ z component of current velocities in Å psec^-1 (real)
X x component of main coordinate set in Å (real)
XCOMp x component of comparison coordinate set in Å (real)
Y y component of main coordinate set in Å (real)
YCOMp y component of comparison coordinate set in Å (real)
Z z component of main coordinate set in Å (real)
ZCOMp z component of comparison coordinate set in Å (real)

Requirements

The molecular structure has to be present. Upon modification of the molecular structure (e.g., delete or patch statement), the contents of the internal stores are destroyed; i.e., they are fragile. However, all other atom properties are conserved; i.e., they are only partially fragile.

Examples

The first example divides the coordinate array Z by the derivative array DX, adds the quotient to the coordinate array Y, and stores the result in the coordinate array X. The operations are carried out component by component for all atoms.

  do ( X = Y + Z / DX ) ( all )
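
To illustrate what "component by component" means here, the following Python sketch (not CNS code; the helper `do_statement` and the array values are hypothetical) evaluates the operation once per selected atom and leaves unselected atoms untouched. The division is evaluated before the addition, matching the precedence rules given above.

```python
def do_statement(target, operation, selected):
    """Assign operation(i) to target[i] for every selected atom index i."""
    for i in selected:
        target[i] = operation(i)
    return target


# Hypothetical per-atom arrays for a three-atom structure.
x = [0.0, 0.0, 0.0]
y = [1.0, 2.0, 3.0]
z = [4.0, 9.0, 8.0]
dx = [2.0, 3.0, 4.0]

# do ( X = Y + Z / DX ) ( all )
do_statement(x, lambda i: y[i] + z[i] / dx[i], range(len(x)))
after_all = list(x)

# The same statement form with a selection containing only the first atom:
# only x[0] changes.
do_statement(x, lambda i: 0.0, [0])
after_partial = list(x)
```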

The next example draws random numbers from a Gaussian distribution with standard deviation 1.0 and stores the result in the coordinate array X for all Cα atoms:

  do ( X = GAUSS( 1.0 ) ) ( name ca )

The next example provides a listing of the X coordinates of all atoms in Tyr residues:

  show element ( X ) ( resname tyr )

The next example computes the average of all electric charges in residue 34. This average value is then stored in the symbol $1 by using the evaluate statement.

  show ave ( charge ) ( residue 34 ) 
  evaluate ($1=$RESULT)

The next example stores the specified atom selection in the array STORE1:

  identity ( store1 ) ( attribute mass > 30.0 )

The array STORE1 will be nonzero for the selected atoms and zero otherwise. The values for the selected atoms represent the sequential number of the selected atoms. The array STORE1 can be recalled by using ( store1 ) in a selection statement.
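
The contents of the store array after an identity statement can be modelled as follows. This Python sketch is not CNS code: the masses are hypothetical, and it assumes that the sequential numbers start at 1 and follow the internal atom order. Recalling the store in a later selection is modelled by the "greater than 0" test described for the STOREi selection keywords.

```python
def identity(n_atoms, selected):
    """Return a store array: sequential number for selected atoms, else zero."""
    store = [0.0] * n_atoms
    for sequential, index in enumerate(sorted(selected), start=1):
        store[index] = float(sequential)
    return store


# Hypothetical masses (amu) for a five-atom structure.
masses = [12.011, 32.06, 1.008, 55.845, 15.999]

# identity ( store1 ) ( attribute mass > 30.0 )
selected = {i for i, m in enumerate(masses) if m > 30.0}
store1 = identity(len(masses), selected)

# ( store1 ) in a later selection, i.e. "ATTRibute STORE1 > 0"
recalled = {i for i, value in enumerate(store1) if value > 0}
```

Because the recalled selection depends only on the stored numbers, it is lost whenever the store array is reset, which is the sense in which the internal stores are fragile.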
