A heterotrimer with N-terminal acetyl and C-terminal amide capping
groups, a chloride ion and water molecules
This tutorial shows how to generate a molecular topology file
(.mtf) and a CNS coordinate file (.pdb)
using the generate.inp task file, starting with a standard
coordinate file as obtained from the PDB.
The structure with the PDB ID code
1bb1 is
a heterotrimer with a few particularities:
- The protein chains have N-terminal acetyl and C-terminal amide
capping groups. These groups require the use of custom
topology and parameter files. The files capping.top,
capping.param and protein_capping.link
from another tutorial are used.
- The original PDB file includes coordinates for a chloride ion.
This requires the use of the standard topology and parameter
files ion.top and ion.param.
- The original PDB file includes coordinates for a number of
water molecules.
This requires the use of the standard topology and parameter
files water.top and water.param.
- Some atoms are missing from five residues (residue identifier
15 in chain B, residue identifiers 2, 9, 29, and 34 in chain C).
To conform to the naming conventions adopted in the creation of
capping.top, capping.param and
protein_capping.link, a few simple modifications must be
made to the original PDB file:
- The original residue name for the amide group is NH2 and
has to be renamed to NHH.
- The original name for the terminal carbon of the acetyl group is
CH3 and has to be renamed to CA.
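The two renaming steps above can be done in any text editor. As an illustration only, the following Python sketch applies them using the fixed-column PDB layout (atom name in columns 13-16, residue name in columns 18-20). The function name rename_capping_groups is hypothetical, and the sketch assumes that the acetyl group carries the residue name ACE in the original file; check this against your copy of 1bb1.pdb before relying on it.

```python
def rename_capping_groups(pdb_lines):
    """Return PDB lines with residue NH2 renamed to NHH and the
    acetyl atom CH3 renamed to CA.

    Assumption: the acetyl residue is named ACE in the input file.
    Only ATOM and HETATM records are touched; lines are expected
    without trailing newlines.
    """
    fixed = []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            atom_name = line[12:16].strip()  # columns 13-16
            res_name = line[17:20].strip()   # columns 18-20
            if res_name == "NH2":
                # Rename the amide capping residue.
                line = line[:17] + "NHH" + line[20:]
            elif res_name == "ACE" and atom_name == "CH3":
                # Rename the terminal carbon of the acetyl group.
                line = line[:12] + " CA " + line[16:]
        fixed.append(line)
    return fixed
```

The function keeps every line at its original length, so the coordinate columns are not shifted. Other record types that may carry residue names (for example TER or ANISOU) are deliberately left alone in this sketch.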
The generate.inp task file also needs to be modified:
- The correct name for the protein coordinate file
must be defined (1bb1.pdb).
- The original PDB file (like most PDB files) does
not use segment identifiers (columns 73-76), but only the
one-character chain identifiers (column 22). CNS, however,
uses only segment identifiers and ignores chain identifiers.
Therefore it is necessary to convert the chain identifiers to
segment identifiers. This is requested by setting the flag labeled
convert chainid to segid if chainid is non-blank
to true.
- Because of the presence of the capping groups, the customized
linkage file protein_capping.link must be used instead of
the standard library file CNS_TOPPAR:protein.link.
- The topology and parameter files for the capping groups are
defined as prosthetic group topology and parameter
files.
- There are no chain identifiers or segment identifiers for the
chloride ion and the water molecules in the original PDB
file. For clarity, it is desirable to assign segment identifiers.
This can be done by inserting CNS commands in the space
labeled any final patches can be applied here
(although strictly speaking this is not a patch):
do (segid=I) (resname CL)
do (segid=W) (resname HOH)
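To make the chain identifier conversion described above concrete, here is a minimal Python sketch of the idea: a non-blank chain identifier in column 22 is copied into the segment identifier field in columns 73-76. This is not how CNS implements the flag; the function name chainid_to_segid is hypothetical, and left-justifying the identifier in the four-character field is an assumption made for the illustration.

```python
def chainid_to_segid(line):
    """Copy a non-blank chain identifier (column 22) into the
    segment identifier field (columns 73-76) of one PDB line.

    Illustration only, not the CNS implementation. The line is
    expected without a trailing newline.
    """
    if not line.startswith(("ATOM", "HETATM")):
        return line
    chain_id = line[21:22]
    if chain_id.strip() == "":
        # Blank chain identifier (e.g. ion or water): leave as is.
        return line
    padded = line.ljust(76)  # make sure the segid columns exist
    # Assumption: the identifier is left-justified in the field.
    return padded[:72] + chain_id.ljust(4) + padded[76:]
```

Records with a blank chain identifier are returned unchanged, which corresponds to the situation of the ion and the water molecules in this tutorial: they receive their segment identifiers from the two do statements instead.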
In generate.inp, the standard library topology and parameter
files for ions and water molecules are included by default. Please note
that most parameter files are not included by default in the refinement
task files. In general, the user has to take care that all the
necessary parameter files are included.
The command to generate the files generate.mtf and
generate.pdb is:
cns_solve < generate.inp > generate.out [6 seconds]
In the output file generate.pdb, the missing atoms mentioned
above are included. A list of the atoms built is shown as
REMARK statements at the top of the file.
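A quick way to look at these REMARK statements without opening the whole file is to collect the header lines up to the first non-REMARK record. The following Python sketch does this; the function name leading_remarks is hypothetical, and no assumption is made about the exact wording of the REMARK lines written by CNS.

```python
def leading_remarks(lines):
    """Return the REMARK lines at the top of a PDB file,
    stopping at the first line that is not a REMARK."""
    remarks = []
    for line in lines:
        if not line.startswith("REMARK"):
            break
        remarks.append(line.rstrip("\n"))
    return remarks
```

The function accepts any iterable of lines, so an open file handle for generate.pdb can be passed directly and the returned list printed line by line.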
Inspection of the generate.pdb output file shows that the
first character of the CNS segment identifier is also used as
chain identifier. In CNS, segment identifiers can be up to
four characters long. This means it would be possible to use, for
example, the identifiers ION and WAT instead of just
I and W. However, many other programs (including the
graphics program O) can only handle the one-character chain
identifiers and ignore the segment identifiers. For compatibility, it
is therefore often more convenient to use only one-character
segment identifiers.
See also: Tools for building coordinates of hetero compounds