The structure with the PDB ID code 2bop is a protein monomer with a DNA fragment, a ytterbium ion, and water molecules. The nomenclature used for the DNA residue names and some atom names is significantly different from that used in CNS. A utility program called fix_dna_rna can be used to convert a PDB-format DNA/RNA file into a form more suitable for CNS:
fix_dna_rna < 2bop.pdb > 2bop_fix1.pdb [< 1 second]
For the methyl carbon of thymine, the atom name used in the original PDB file is C5M, while CNS expects the name C5A. The atom names are changed with the UNIX command:
sed 's/ C5M THY / C5A THY /' 2bop_fix1.pdb > 2bop_fix2.pdb [< 1 second]
It would also be possible to manually change the atom names by using a standard text editor.
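A quick way to confirm that the rename took effect is to count the lines that still carry the old atom name. The following is only a sketch: it runs the same substitution on a two-line stand-in file (sample_fix1.pdb, with hypothetical atoms and coordinates) rather than on the real 2bop_fix1.pdb, and the file names are assumptions made for illustration.

```shell
# Stand-in for 2bop_fix1.pdb: two hypothetical thymine atoms, made-up coordinates.
cat > sample_fix1.pdb <<'EOF'
ATOM      1  C5  THY B   1      10.000  20.000  30.000  1.00 20.00
ATOM      2  C5M THY B   1      11.000  21.000  31.000  1.00 20.00
EOF

# Same substitution as in the tutorial, applied to the stand-in file.
sed 's/ C5M THY / C5A THY /' sample_fix1.pdb > sample_fix2.pdb

# grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
old_count=$(grep -c ' C5M THY ' sample_fix2.pdb || true)
new_count=$(grep -c ' C5A THY ' sample_fix2.pdb || true)
echo "C5M left: $old_count, C5A present: $new_count"
```

On the real file, the same two grep commands can be pointed at 2bop_fix2.pdb; a non-zero count of C5M lines would suggest that the spacing in the file differs from what the sed pattern assumes.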
In the original PDB file, the chain identifier used for the water molecules is identical to that of the protein. This can lead to complicated selection statements in other task files and is also generally confusing. It is recommended to change the chain identifiers, for example with the UNIX command:
sed 's/ HOH A / HOH W /' 2bop_fix2.pdb > 2bop_fix3.pdb [< 1 second]
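To check the result, the chain identifier can be read directly from column 22 of each record. The sketch below uses a small stand-in file with hypothetical atoms (sample_water_in.pdb is an assumed name, not part of the tutorial). Note that the sed pattern requires a blank after the chain identifier, so it would presumably miss any water whose residue number occupies all four digits.

```shell
# Stand-in for 2bop_fix2.pdb: one protein atom and two waters, all on chain A.
cat > sample_water_in.pdb <<'EOF'
ATOM      1  CA  ALA A   1      10.000  20.000  30.000  1.00 20.00
HETATM    2  O   HOH A 401      11.000  21.000  31.000  1.00 20.00
HETATM    3  O   HOH A 402      12.000  22.000  32.000  1.00 20.00
EOF

# Same substitution as in the tutorial.
sed 's/ HOH A / HOH W /' sample_water_in.pdb > sample_water_out.pdb

# Residue name is in columns 18-20, chain identifier in column 22.
water_chains=$(awk 'substr($0, 18, 3) == "HOH" { print substr($0, 22, 1) }' \
  sample_water_out.pdb | sort -u | tr -d '\n')
other_chains=$(awk 'substr($0, 18, 3) != "HOH" { print substr($0, 22, 1) }' \
  sample_water_out.pdb | sort -u | tr -d '\n')
echo "water chains: $water_chains, other chains: $other_chains"
```

If the water list still contains the protein's chain identifier on the real file, some water records did not match the pattern and need to be handled separately.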
The command to generate the files generate_easy.mtf and generate_easy.pdb is:
cns_solve < generate_easy.inp > generate_easy.out [3 seconds]
Inspection of the generate_easy.pdb output file shows that the first character of the CNS segment identifier is also used as the chain identifier. In CNS, segment identifiers can be up to four characters long. However, many other programs (including the graphics program O) can only handle one-character chain identifiers and ignore the segment identifiers. For compatibility, it is therefore often more convenient to use only one-character segment identifiers.
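The relationship between the two identifiers can be inspected by listing each distinct pair found in the file. The sketch below assumes the usual fixed-column PDB layout, with the chain identifier in column 22 and the segment identifier in columns 73-76, and builds a hypothetical three-atom stand-in (sample_generate.pdb) instead of reading the real generate_easy.pdb.

```shell
# Build the stand-in with printf so the fixed PDB columns are exact.
# All atoms, residues, and coordinates here are hypothetical.
fmt='%-6s%5s %-4s %-3s %1s%4s    %8s%8s%8s%6s%6s      %-4s\n'
{
  printf "$fmt" ATOM 1 ' CA ' ALA A 1   10.000 20.000 30.000 1.00 20.00 A
  printf "$fmt" ATOM 2 ' P  ' THY B 1   11.000 21.000 31.000 1.00 20.00 B
  printf "$fmt" ATOM 3 ' O  ' HOH W 401 12.000 22.000 32.000 1.00 20.00 W
} > sample_generate.pdb

# Collect each distinct "chain=segid" pair from the coordinate records.
pairs=$(awk '/^(ATOM|HETATM)/ {
  chain = substr($0, 22, 1)     # chain identifier, column 22
  segid = substr($0, 73, 4)     # segment identifier, columns 73-76
  gsub(/ /, "", segid)          # strip padding blanks
  print chain "=" segid
}' sample_generate.pdb | sort -u | paste -sd ' ' -)
echo "$pairs"
```

With one-character segment identifiers, every pair has the same value on both sides, which is the situation that programs reading only the chain identifier can handle without loss of information.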