Generating and manipulating the molecular topology


The segment statement generates the molecular structure either by interpreting a coordinate file to obtain the residue sequence or from an explicitly specified residue sequence. The residues have to be defined in the topology statement. Appropriate patches are applied to generate covalent links between or within molecules, such as disulfide bridges.

Syntax

The segment statement is used to generate the molecular topology information. This is invoked from the main level of CNS:

  SEGMent { segment-statement } END

The possible segment statements are as follows:

The chain statement is most often used to generate molecular topology from the sequence read from a coordinate file. The possible chain statements are as follows:

Requirements

The molecular topology has to be defined. Execution of the segment statement destroys fragile atom selections.

Example: A Polypeptide Chain

The following example file shows how to set up a polypeptide segment (Tyr, Ala, Glu, Lys, Ile, Ala), assuming that the topology and parameters have been previously read. For completeness, the definitions of the patch residues PEPT, PEPP, NTER, PROP, and CTER are included. Normally, these patch residues are already defined in the topology files.

topology 

   PRESidue PEPT 
     ADD BOND -C +N 
     ADD ANGLE -CA -C +N 
     ADD ANGLE -O  -C +N 
     ADD ANGLE -C  +N +CA 
     ADD ANGLE -C  +N +H 

     ADD DIHEdral  -C +N +CA +C 
     ADD DIHEdral  -N -CA -C +N 
     ADD DIHEdral  -CA -C +N +CA 

     ADD IMPRoper  -C -CA +N -O  {planar -C} 
     ADD IMPRoper  +N -C +CA +H  {planar +N} 

   END 

   PRESidue PEPP 

     ADD BOND -C +N 
     ADD ANGLE -CA -C +N 
     ADD ANGLE -O  -C +N 
     ADD ANGLE -C  +N +CA 
     ADD ANGLE -C  +N +CD 

     ADD DIHEdral  -C +N +CA +C 
     ADD DIHEdral  -N -CA -C +N 
     ADD DIHEdral  -CA -C +N +CA 

     ADD IMPRoper  -C -CA +N -O  {planar -C} 
     ADD IMPRoper  +N +CA +CD -C  {planar +N} 

  END 

  PRESidue NTER  

    GROUp 

    ADD    ATOM +HT1  TYPE=HC   CHARge=0.35  END 
    ADD    ATOM +HT2  TYPE=HC   CHARge=0.35  END 
    MODIfy ATOM +N    TYPE=NH3  CHARge=-0.30 END 
    ADD    ATOM +HT3  TYPE=HC   CHARge=0.35  END 
    DELETE ATOM +H                           END 
    MODIfy ATOM +CA             CHARge=0.25  END 

    ADD BOND +HT1 +N 
    ADD BOND +HT2 +N 
    ADD BOND +HT3 +N 

    ADD ANGLe +HT1  +N    +HT2 
    ADD ANGLe +HT2  +N    +HT3 
    ADD ANGLe +HT2  +N    +CA 
    ADD ANGLe +HT1  +N    +HT3 
    ADD ANGLe +HT1  +N    +CA 
    ADD ANGLe +HT3  +N    +CA 

    ADD DIHEdral +HT2  +N    +CA   +C 
    ADD DIHEdral +HT1  +N    +CA   +C 
    ADD DIHEdral +HT3  +N    +CA   +C 

  END 

  PRESidue PROP     

    GROUp 
    ADD    ATOM +HT1  TYPE=HC   CHARge= 0.35   END 
    ADD    ATOM +HT2  TYPE=HC   CHARge= 0.35   END 
    MODIfy ATOM +N    TYPE=NH3  CHARge=-0.20   END 
    MODIfy ATOM +CD             CHARge= 0.25   END 
    MODIfy ATOM +CA             CHARge= 0.25   END 

    ADD BOND +HT1  +N 
    ADD BOND +HT2  +N 

    ADD ANGLe +HT1  +N    +HT2 
    ADD ANGLe +HT2  +N    +CA 
    ADD ANGLe +HT1  +N    +CD 
    ADD ANGLe +HT1  +N    +CA 
    ADD ANGLe +CD   +N    +HT2 

    ADD DIHEdral +HT2  +N    +CA   +C 
    ADD DIHEdral +HT1  +N    +CA   +C 

  END 

  PRESidue CTER     

    GROUp 
    MODIfy ATOM -C             CHARge= 0.14  END 
    ADD    ATOM -OT1  TYPE=OC  CHARge=-0.57  END 
    ADD    ATOM -OT2  TYPE=OC  CHARge=-0.57  END 
    DELETE ATOM -O                           END 

    ADD BOND -C    -OT1 
    ADD BOND -C    -OT2 

    ADD ANGLe -CA   -C   -OT1 
    ADD ANGLe -CA   -C   -OT2 
    ADD ANGLe -OT1  -C   -OT2 

    ADD DIHEdral -N    -CA    -C   -OT2 
    ADD IMPRoper -C    -CA    -OT2 -OT1 

  END 

end 

segment 
   name="PROT" 
   chain 
      link pept head - * tail + * end             
      first prop tail + pro end     ! special n-ter for PRO 
      first nter tail + *   end 
      last cter head - * end 
      sequence TYR ALA GLU LYS ILE ALA end 
   end 
end 
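
As a further sketch using the same chain statements, a sequence that begins with proline illustrates why the PROP rule is listed separately from the wildcard NTER rule. The segment name "PEP2" and the sequence are illustrative assumptions, and the patch residues are assumed to be defined as above. Going by the comment in the example, the N-terminal proline is intended to be handled by PROP rather than NTER:

segment
   name="PEP2"                      ! hypothetical segment name
   chain
      link pept head - * tail + * end
      first prop tail + pro end     ! special n-ter for PRO
      first nter tail + *   end
      last cter head - * end
      sequence PRO ALA GLU end      ! illustrative sequence starting with PRO
   end
end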

The following example generates the molecular topology from a coordinate file, using the chain identifier as the segid (convert=true) and treating each chain as a separate molecule (separate=true):

segment
  chain
    convert=true
    separate=true
    @@CNS_TOPPAR:protein.link
    coordinates @@model.pdb
  end
end

coordinates
  convert=true
  @@model.pdb

The information about the patch residues is normally included in the topology file, and the peptide linkage statements are also provided in the "protein.link" file in the "libraries/toppar" directory. Wildcards allow the same patch residue ("PEPT") to be used for all combinations of amino acids. The only exception is Pro, which needs a special patch. When the sequence is specified explicitly, the residues are numbered consecutively, starting with 1. To keep a particular residue numbering, one should use the COOR option to read the sequence from the coordinate file.

Patching the Molecular Structure

The patch statement refers to a patch residue (PRESidue), which consists of additions of atoms by the add statement, modifications by the modify statement, or deletions by the delete statement. A patch can establish chain linkages, such as peptide bonds, chain termini, disulfide bridges, and covalent links to ligands.

Syntax

The patch statement is invoked from the main level of CNS:

  PATCh patch-residue
    REFErence=NIL|patch-character=(atom-selection)
    ....
  END

The patch statement applies the patch residue named patch-residue to the specified atom selections. Each patch character corresponds to the first character of the atom names in the PRESidue specification (for example, 1 and 2 in 1SG and 2SG, or - and + in -C and +N). Specifying NIL implies that the atom names in the PRESidue carry no patch character. Multiple references are used to establish links to more than one residue.

Requirements

The molecular structure has to be well defined. Execution of the patch statement destroys fragile atom selections. The atom properties are conserved during patching, except for the internal stores (STORE1, STORE2, ...). The atom properties of added atoms are set to default values. Other information, such as information for the xray or noe statements, is lost during patching.

Example: Incorporation of Disulfide Bridges

The DISU patch residue defines the covalent link between two cysteine residues. To make a disulfide bridge, the following patch statement can be used:

topology 
   presidue DISU     
      group 
      modify atom 1CB           charge= 0.19  END 
      modify atom 1SG  type=S   charge=-0.19  END 
      group 
      modify atom 2CB           charge= 0.19  END 
      modify atom 2SG  type=S   charge=-0.19  END 

      add bond 1SG 2SG 

      add angle  1CB 1SG 2SG 
      add angle  1SG 2SG 2CB 

      add dihedral   1CA 1CB 1SG 2SG 
      add dihedral   1CB 1SG 2SG 2CB 
      add dihedral   1SG 2SG 2CB 2CA 

   end 
end 

patch disu
   reference=1=( resid 15 ) reference=2=( resid 25 ) 
end 

This will make a disulfide bridge between residue number 15 and residue number 25. (For completeness, the definition of the patch residue DISU is listed here as well. Normally, this patch residue is already included in the topology files.)
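
When the two cysteines belong to different segments, the atom selections can presumably be narrowed with segid. A minimal sketch, assuming hypothetical segment names "A" and "B":

patch disu
   reference=1=( segid "A" and resid 15 )   ! supplies the atoms written as 1CB, 1SG
   reference=2=( segid "B" and resid 25 )   ! supplies the atoms written as 2CB, 2SG
end

Here the patch characters 1 and 2 tie each selection to the correspondingly prefixed atom names in the DISU patch residue.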

Example: Modification of the Protonation Degree of Histidines

The protonation degree of HIS can be changed by using the patch residues HISE and HISD, which are defined in the file "protein.top" in the "libraries/toppar" directory. The histidine patches can be invoked by using the patch statement:

patch HISE 
   reference=nil=( resid 14 ) 
end 

This changes the protonation degree of His-14 to that of a singly HE2-protonated histidine.
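
The HISD patch residue would be invoked in the same way. The sketch below assumes a hypothetical residue number 57 and that, by analogy with HISE, HISD yields the singly HD1-protonated form; the definition in "protein.top" should be checked before relying on this:

patch HISD
   reference=nil=( resid 57 )   ! hypothetical residue number
end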

Deleting Atoms

The delete statement removes the selected atoms from the current molecular structure. It will also delete any connections such as bonds, bond angles, or dihedral angles involving an atom that is deleted.

Syntax

  DELEte 
    SELEction=(atom-selection)
  END

Requirements

The molecular structure has to be well defined. Execution of the delete statement destroys fragile atom selections. The atom properties are conserved during deletions, except for the internal stores (STORE1, STORE2, ...). Other information, such as information for the xray or noe statements, is lost during deletions.

Example: Delete One Atom

To remove the hydrogen bonded to the peptide nitrogen of residue 1, one can specify:

  delete 
    selection=(resid 1 and name HN) 
 end
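
Larger selections work the same way. The following sketch removes all hydrogen atoms and then all atoms of a water residue type. It assumes that the hydrogen keyword of the atom-selection syntax is available and that waters carry the residue name HOH, which depends on the topology in use:

  delete
    selection=( hydrogen )       ! all hydrogen atoms
  end

  delete
    selection=( resname HOH )    ! assumed residue name for water
  end
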

Duplicating the Molecular Structure

The duplicate statement allows one to duplicate the molecular structure or selected atoms of it. The statement duplicates all atom properties, coordinates, connectivities, bonds, angles, etc. It has a renaming feature that allows one to specify a new segment or residue name for the duplicated atoms.

Syntax

The duplicate statement is invoked from the main level of CNS:

  DUPLicate { duplicate-statement } END

The possible duplicate statements are as follows:

Requirements

The molecular structure has to be well defined. Execution of the duplicate statement destroys fragile atom selections.

Example: Duplication of Side-Chain Atoms

The following example duplicates the atoms of a side chain. The duplicated atoms are identical to the original ones except for the segment name.

duplicate 
   selection=( resid 40 and not 
             ( name ca or name n or name c or name o ) ) 
   segid="NEW" 
end 
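
The same statement can presumably be applied to a whole segment. A minimal sketch, assuming an existing segment with the hypothetical name "A" and an unused segment name "B":

duplicate
   selection=( segid "A" )   ! hypothetical existing segment
   segid="B"                 ! hypothetical name for the copy
end

Because coordinates are duplicated along with the other atom properties, the copy initially sits exactly on top of the original.
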

Structure Statement

The structure statement allows one to read a molecular structure file that has been written previously by the write structure statement.

Syntax

The structure statement is invoked from the main level of CNS:

  STRUcture { structure-statement } END

The possible structure statements are as follows:

Requirements

Execution of the structure statement destroys fragile atom selections.

Example: How to Read a Molecular Structure File

The following example shows how to read the molecular structure file "molecule1.mtf":

structure 
   @molecule1.mtf 
end 

Example: Append Two Molecular Structure Files

Molecular structure information can be appended:

structure 
   @molecule1.mtf 
   @molecule2.mtf 
end 

but care should be taken to avoid duplicate atom definitions.

Writing a Molecular Topology File

The write structure statement writes the current molecular topology to the specified output file. The molecular topology file (MTF file) contains information that was generated by the segment statement and modified by the patch statement or the delete statement. Specifically, the molecular topology file contains the following information: atom names, types, charges, and masses; residue names and segment names; and a list of bond terms, angle terms, dihedral terms, improper terms, explicit nonbonded exclusions, and nonbonded group partitions. It does not contain atomic coordinates, parameters, constraints, restraints, or any other information that is specific to effective energy terms, such as diffraction data. The molecular topology file is in STAR format (like the mmCIF format).

Syntax

  WRITe STRUcture OUTPut=filename END
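
As an illustration, the following sketch reads two molecular structure files and writes the combined topology to a new file. The output file name "combined.mtf" is a hypothetical choice, and the two input files are assumed not to contain duplicate atom definitions:

structure
   @molecule1.mtf
   @molecule2.mtf
end

write structure output="combined.mtf" end   ! hypothetical output file name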
