The segment statement generates the molecular structure either by interpreting the coordinate file to obtain the residue sequence or by taking an explicitly specified residue sequence. The residues have to be defined in the topology statement. Appropriate patches are applied to generate covalent links between or within molecules, such as disulfide bridges.
Syntax
The segment statement is used to generate the molecular topology information. It is invoked from the main level of CNS:
SEGMent { segment-statement } END
The possible segment statements are as follows:
The chain statement is most often used to generate molecular topology from the sequence read from a coordinate file. The possible chain statements are as follows:
The molecular topology has to be defined. Execution of the segment statement destroys fragile atom selections.
Example: A Polypeptide Chain
The following example file shows how to set up a polypeptide segment (Tyr, Ala, Glu, Lys, Ile, Ala), assuming that the topology and parameters have been previously read. For completeness, the definitions of the patch residues PEPT, PEPP, NTER, PROP, and CTER have been included. Normally, these patch residues are already defined in the topology files.
topology

  PRESidue PEPT
    ADD BOND -C +N
    ADD ANGLE -CA -C +N
    ADD ANGLE -O -C +N
    ADD ANGLE -C +N +CA
    ADD ANGLE -C +N +H
    ADD DIHEdral -C +N +CA +C
    ADD DIHEdral -N -CA -C +N
    ADD DIHEdral -CA -C +N +CA
    ADD IMPRoper -C -CA +N -O {planar -C}
    ADD IMPRoper +N -C +CA +H {planar +N}
  END

  PRESidue PEPP
    ADD BOND -C +N
    ADD ANGLE -CA -C +N
    ADD ANGLE -O -C +N
    ADD ANGLE -C +N +CA
    ADD ANGLE -C +N +CD
    ADD DIHEdral -C +N +CA +C
    ADD DIHEdral -N -CA -C +N
    ADD DIHEdral -CA -C +N +CA
    ADD IMPRoper -C -CA +N -O {planar -C}
    ADD IMPRoper +N +CA +CD -C {planar +N}
  END

  PRESidue NTER
    GROUp
    ADD ATOM +HT1 TYPE=HC CHARge=0.35 END
    ADD ATOM +HT2 TYPE=HC CHARge=0.35 END
    MODIfy ATOM +N TYPE=NH3 CHARge=-0.30 END
    ADD ATOM +HT3 TYPE=HC CHARge=0.35 END
    DELETE ATOM +H END
    MODIfy ATOM +CA CHARge=0.25 END
    ADD BOND +HT1 +N
    ADD BOND +HT2 +N
    ADD BOND +HT3 +N
    ADD ANGLe +HT1 +N +HT2
    ADD ANGLe +HT2 +N +HT3
    ADD ANGLe +HT2 +N +CA
    ADD ANGLe +HT1 +N +HT3
    ADD ANGLe +HT1 +N +CA
    ADD ANGLe +HT3 +N +CA
    ADD DIHEdral +HT2 +N +CA +C
    ADD DIHEdral +HT1 +N +CA +C
    ADD DIHEdral +HT3 +N +CA +C
  END

  PRESidue PROP
    GROUp
    ADD ATOM +HT1 TYPE=HC CHARge= 0.35 END
    ADD ATOM +HT2 TYPE=HC CHARge= 0.35 END
    MODIfy ATOM +N TYPE=NH3 CHARge=-0.20 END
    MODIfy ATOM +CD CHARge= 0.25 END
    MODIfy ATOM +CA CHARge= 0.25 END
    ADD BOND +HT1 +N
    ADD BOND +HT2 +N
    ADD ANGLe +HT1 +N +HT2
    ADD ANGLe +HT2 +N +CA
    ADD ANGLe +HT1 +N +CD
    ADD ANGLe +HT1 +N +CA
    ADD ANGLe +CD +N +HT2
    ADD DIHEdral +HT2 +N +CA +C
    ADD DIHEdral +HT1 +N +CA +C
  END

  PRESidue CTER
    GROUp
    MODIfy ATOM -C CHARge= 0.14 END
    ADD ATOM -OT1 TYPE=OC CHARge=-0.57 END
    ADD ATOM -OT2 TYPE=OC CHARge=-0.57 END
    DELETE ATOM -O END
    ADD BOND -C -OT1
    ADD BOND -C -OT2
    ADD ANGLe -CA -C -OT1
    ADD ANGLe -CA -C -OT2
    ADD ANGLe -OT1 -C -OT2
    ADD DIHEdral -N -CA -C -OT2
    ADD IMPRoper -C -CA -OT2 -OT1
  END

end

segment
  name="PROT"
  chain
    link pept head - * tail + * end
    first prop tail + pro end ! special n-ter for PRO
    first nter tail + * end
    last cter head - * end
    sequence TYR ALA GLU LYS ILE ALA end
  end
end
Here is an example generating the molecular topology from a coordinate file using the chain identifier to generate the segid and recognising each chain as a separate molecule:
segment
  chain
    convert=true
    separate=true
    @@CNS_TOPPAR:protein.link
    coordinates @@model.pdb
  end
end

coordinates convert=true @@model.pdb
The information about the patch residues is normally included in the topology file. The information about the peptide linkages is also included in the "protein.link" file in the "libraries/toppar" directory. Wildcards allow one to use the same patch residues ("PEPT") for all combinations of amino acids. The only exception is Pro, which needs a special patch. The residues are numbered consecutively, starting with 1. To use a particular numbering for residues, one should use the COOR option to read the sequence from the coordinate file.
Patching the Molecular Structure
The patch statement refers to a patch residue (PRESidue), which consists of additions of atoms by the add statement, modifications by the modify statement, or deletions by the delete statement. A patch can establish chain linkages, such as peptide bonds, chain termini, disulfide bridges, and covalent links to ligands.
Syntax
The patch statement is invoked from the main level of CNS:
PATCh patch-residue REFErence=NIL|patch-character=(atom-selection) .... END
Patches the specified selections of atoms using the patch residue named patch-residue. The patch character corresponds to the first character of the atom names in the PRESidue specification (for example, the 1 and 2 in 1SG and 2SG of a disulfide patch residue). The specification of NIL implies that the patch characters are omitted in the PRESidue. Multiple references are used to establish links to more than one residue.
Requirements
The molecular structure has to be well defined. Execution of the patch statement destroys fragile atom selections; e.g., the atom properties are conserved during patching, except for the internal stores (STORE1, STORE2, ... ). The atom properties of additional atoms are set to default values. Other information, such as information for the xray or noe statements, is lost during patching.
Example: Incorporation of Disulfide Bridges
The DISU patch residue defines the covalent link between two cysteine residues. To make a disulfide bridge, the following patch statement can be used:
topology
  presidue DISU
    group
      modify atom 1CB charge= 0.19 END
      modify atom 1SG type=S charge=-0.19 END
    group
      modify atom 2CB charge= 0.19 END
      modify atom 2SG type=S charge=-0.19 END
    add bond 1SG 2SG
    add angle 1CB 1SG 2SG
    add angle 1SG 2SG 2CB
    add dihedral 1CA 1CB 1SG 2SG
    add dihedral 1CB 1SG 2SG 2CB
    add dihedral 1SG 2SG 2CB 2CA
  end
end

patch disu
  reference=1=( resid 15 )
  reference=2=( resid 25 )
end
This will make a disulfide bridge between residue number 15 and residue number 25. (For completeness, the definition of the patch residue DISU is listed here as well. Normally, this patch residue is already included in the topology files.)
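When the two cysteines sit in different segments, the residue numbers alone may not be unique, and each selection can be narrowed with a segment identifier. The following is only a sketch: the segment names "A" and "B" are hypothetical and would have to match the segids actually present in the molecular structure.

```cns
patch disu
  reference=1=( segid "A" and resid 15 )   ! first cysteine, hypothetical segment A
  reference=2=( segid "B" and resid 25 )   ! second cysteine, hypothetical segment B
end
```

Each selection should resolve to exactly one residue, since the patch characters 1 and 2 each stand for a single residue in the DISU definition.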
Example: Modification of the Protonation Degree of Histidines
The protonation degree of HIS can be changed by using the patch residues HISE and HISD, which are defined in the file "protein.top" in the "libraries/toppar" directory. The histidine patches can be invoked by using the patch statement:
patch HISE reference=nil=( resid 14 ) end
This changes the protonation degree of His-14 to that of a singly HE2-protonated histidine.
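The HISD patch residue named above would be applied with the same pattern. This is a sketch by analogy; it assumes that HISD, like HISE, is written without patch characters (hence reference=nil) and that it yields the HD1-protonated form, which should be checked against the definition in "protein.top". The residue number 37 is purely illustrative.

```cns
patch HISD
  reference=nil=( resid 37 )   ! illustrative residue number
end
```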
Deleting Atoms
The delete statement removes the selected atoms from the current molecular structure. It will also delete any connections such as bonds, bond angles, or dihedral angles involving an atom that is deleted.
Syntax
DELEte SELEction=(atom-selection) END
Requirements
The molecular structure has to be well defined. Execution of the delete statement destroys fragile atom selections; e.g., the atom properties are conserved during deletions, except for the internal stores (STORE1, STORE2, ... ). Other information, such as information for the xray or noe statements, is lost during deletions.
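Because the statement takes an arbitrary atom selection, it can also remove a whole class of atoms in one step. The lines below are a sketch under two assumptions: that the hydrogen keyword of the atom-selection syntax is used to pick all hydrogen atoms, and that a segment named "WAT" (a hypothetical name) exists in the structure.

```cns
delete selection=( hydrogen ) end       ! remove all hydrogen atoms
delete selection=( segid "WAT" ) end    ! remove a hypothetical solvent segment
```

Any atom selections stored before these statements should be regenerated afterwards, in line with the requirement above.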
Example: Delete One Atom
To remove the hydrogen of the peptide nitrogen of residue 1, one can specify:
delete selection=(resid 1 and name HN) end
Duplicating the Molecular Structure
The duplicate statement allows one to duplicate the molecular structure or selected atoms of it. The statement duplicates all atom properties, coordinates, connectivities, bonds, angles, etc. It has a renaming feature that allows one to specify a new segment or residue name for the duplicated atoms.
Syntax
The duplicate statement is invoked from the main level of CNS:
DUPLicate { duplicate-statement } END
The possible duplicate statements are as follows:
The molecular structure has to be well defined. Execution of the duplicate statement destroys fragile atom selections.
Example: Duplication of Side-Chain Atoms
The following example duplicates the atoms of a side chain. The duplicated atoms are identical to the original ones except for the segment name.
duplicate
  selection=( resid 40 and not ( name ca or name n or name c or name o ) )
  segid="NEW"
end
Structure Statement
The structure statement allows one to read a molecular structure file that has been written previously by the write structure statement.
Syntax
The structure statement is invoked from the main level of CNS:
STRUcture { structure-statement } END
The possible structure statements are as follows:
Execution of the structure statement destroys fragile atom selections.
Example: How to Read a Molecular Structure File
The following example shows how to read the molecular structure file "molecule1.mtf":
structure @molecule1.mtf end
Example: Append Two Molecular Structure Files
Molecular structure information can be appended:
structure @molecule1.mtf @molecule2.mtf end
but care should be taken to avoid duplicate atom definitions.
Writing a Molecular Topology File
The write structure statement writes the current molecular topology to the specified output file. The molecular topology file (MTF file) contains information that was generated by the segment statement and modified by the patch statement or the delete statement. Specifically, the molecular topology file contains the following information: atom names, types, charges, and masses; residue names and segment names; and a list of bond terms, angle terms, dihedral terms, improper terms, explicit nonbonded exclusions, and nonbonded group partitions. It does not contain atomic coordinates, parameters, constraints, restraints, or any other information that is specific to effective energy terms, such as diffraction data. The molecular topology file is in STAR format (like the mmCIF format).
Syntax
WRITe STRUcture OUTPut=filename END
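As a minimal sketch of the syntax above, the current molecular topology could be saved once all segment, patch, and delete statements have been executed. The file name "molecule1.mtf" is only an illustration, chosen to match the name used in the structure statement examples.

```cns
write structure output=molecule1.mtf end   ! save the current molecular topology
```

A later CNS run could then restore the same topology with the structure statement instead of repeating the generation steps. Coordinates are not part of the file and have to be read separately.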