Although the model was obtained by molecular replacement, and should therefore have reasonable geometry, it is sensible to check it before starting refinement. In this example the CNS task file model_stats_start.inp is used to analyse the model geometry and diffraction statistics. The results will show whether the model has poor geometry, which can arise during manual rebuilding or from an error at the generate stage (see previous section). Locating any problems before refinement will save time later on.
cns_solve < model_stats_start.inp > model_stats_start.out [2 minutes]
The output listing file (model_stats_start.list) contains a variety of information which is self-explanatory. Important things to note are:
Initial R-values:
=================================== summary ==================================
 resolution range: 500 - 1.8 A
   R-values:
   initial                                        r= 0.5112 free_r= 0.4890
   after B-factor and/or bulk solvent correction  r= 0.4306 free_r= 0.4424
Initial R-values that are very different from expected are usually the result of a simple mistake such as incorrect space group, unit cell dimensions, input diffraction data or input model.
R-values versus resolution:
================================= R-values ===================================
=======> R-values with |Fobs|/sigma cutoff= 0.0

 Test set (test = 1):

 #bin | resolution range | #refl |
    1     3.88   500.01      354      0.3723
    2     3.08     3.88      354      0.4350
    3     2.69     3.08      375      0.4688
    4     2.44     2.69      394      0.4178
    5     2.27     2.44      349      0.4616
    6     2.13     2.27      384      0.4603
    7     2.03     2.13      349      0.4796
    8     1.94     2.03      348      0.5447
    9     1.86     1.94      374      0.4543
   10     1.80     1.86      393      0.4623

 Working set:

 #bin | resolution range | #refl |
    1     3.88   500.01     3463      0.3662
    2     3.08     3.88     3468      0.4065
    3     2.69     3.08     3422      0.4435
    4     2.44     2.69     3431      0.4527
    5     2.27     2.44     3444      0.4629
    6     2.13     2.27     3416      0.4555
    7     2.03     2.13     3438      0.4647
    8     1.94     2.03     3443      0.4569
    9     1.86     1.94     3400      0.4575
   10     1.80     1.86     3323      0.4782
This distribution of R-values is reasonable: there is no dramatic increase in R-value (in particular free R-value) as resolution increases. If there were resolution shells with R-values significantly higher than the rest, this might indicate problems with the data processing (ice rings, for example). However, the R-values measure the fit to the experimental data and should not be manipulated by removing data, in particular by using sigma cutoffs to exclude weak data.
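The per-shell check described above can also be done mechanically. The following is a minimal sketch, assuming the rows have been copied out of the listing as five whitespace-separated columns (bin, the two resolution limits, number of reflections, R-value). The function names and the 0.10 margin are illustrative assumptions, not part of CNS; what counts as "significantly higher" remains a matter of judgement.

```python
from statistics import median

# Test-set rows copied from the listing: bin, resolution limits, #refl, R-value.
TEST_SET = """\
 #bin | resolution range | #refl |
    1   3.88  500.01   354  0.3723
    2   3.08    3.88   354  0.4350
    3   2.69    3.08   375  0.4688
    4   2.44    2.69   394  0.4178
    5   2.27    2.44   349  0.4616
    6   2.13    2.27   384  0.4603
    7   2.03    2.13   349  0.4796
    8   1.94    2.03   348  0.5447
    9   1.86    1.94   374  0.4543
   10   1.80    1.86   393  0.4623
"""


def parse_bins(text):
    """Return (bin, limit_1, limit_2, n_refl, r_value) tuples from table rows."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # header, separator or blank line
        try:
            rows.append((int(fields[0]), float(fields[1]), float(fields[2]),
                         int(fields[3]), float(fields[4])))
        except ValueError:
            continue  # five tokens, but not a numeric data row
    return rows


def outlier_bins(rows, margin=0.10):
    """Return rows whose R-value exceeds the median R-value by more than margin.

    margin is an arbitrary illustrative threshold (an assumption).
    """
    typical = median(row[4] for row in rows)
    return [row for row in rows if row[4] - typical > margin]


rows = parse_bins(TEST_SET)
for row in outlier_bins(rows):
    print("check resolution shell", row[0], "R =", row[4])
```

For the test set shown here nothing is printed at the default margin, in line with the assessment that the distribution is reasonable. The result is only a pointer to shells worth inspecting in the data processing; it is not a reason to exclude data.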
The overall geometry:
rmsd bonds= 0.009109 with 2 bond violations > 0.05
rmsd angles= 1.57655 with 9 angle violations > 8.0
rmsd dihedrals= 24.18230 with 0 angle violations > 60.0
rmsd improper= 1.36747 with 37 angle violations > 3.0
If the model has been through extensive manual rebuilding, the initial geometry may deviate significantly from ideality. Any major problems can be found in the detailed geometry analysis (below).
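As a rough illustration of how the overall geometry lines might be screened automatically, the sketch below parses the four rmsd lines and compares bonds and angles against limits. The limits (0.02 A for bonds, 2.0 degrees for angles) are assumed rules of thumb chosen for this example, not values taken from CNS, and the function names are hypothetical.

```python
import re

# The overall geometry lines as reported in the listing.
SUMMARY = """\
rmsd bonds= 0.009109 with 2 bond violations > 0.05
rmsd angles= 1.57655 with 9 angle violations > 8.0
rmsd dihedrals= 24.18230 with 0 angle violations > 60.0
rmsd improper= 1.36747 with 37 angle violations > 3.0
"""

RMSD_RE = re.compile(
    r"rmsd\s+(\w+)\s*=\s*(\d+\.\d+)\s+with\s+(\d+)\s+\w+\s+violations"
)

# Assumed rule-of-thumb limits (not from CNS): bonds in A, angles in degrees.
LIMITS = {"bonds": 0.02, "angles": 2.0}


def parse_rmsd(text):
    """Map each geometry term to (rmsd, number of violations)."""
    return {name: (float(rmsd), int(count))
            for name, rmsd, count in RMSD_RE.findall(text)}


def over_limit(stats, limits=LIMITS):
    """Return the names of terms whose rmsd exceeds the assumed limit."""
    return sorted(name for name, (rmsd, _count) in stats.items()
                  if name in limits and rmsd > limits[name])


stats = parse_rmsd(SUMMARY)
print(over_limit(stats))
```

With the values above, neither bonds nor angles exceed the assumed limits, so the printed list is empty.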
The geometry in detail:
================================= geometry ===================================

=======> bond violations

 (atom-i |atom-j )   dist.   equil.   delta   energy   const.

 (A 159 CG1 |A 159 CD1 )   1.569   1.513   0.056   1.222   389.218
 (B 159 CG1 |B 159 CD1 )   1.568   1.513   0.055   1.181   389.218

=======> angle violations

 (atom-i |atom-j |atom-k )   angle   equil.   delta   energy   const.

 (A 111 N |A 111 CA |A 111 C )   103.152   111.200    -8.048    4.891   247.886
 (A 112 N |A 112 CA |A 112 C )   103.195   111.200    -8.005    4.839   247.886
 (A 182 N |A 182 CA |A 182 C )   101.879   111.200    -9.321    6.561   247.886
 (A 198 N |A 198 CA |A 198 C )    99.371   111.200   -11.829   10.566   247.886
 (B 111 N |B 111 CA |B 111 C )   103.139   111.200    -8.061    4.907   247.886
 (B 112 N |B 112 CA |B 112 C )   103.134   111.200    -8.066    4.913   247.886
 (B 182 N |B 182 CA |B 182 C )   101.841   111.200    -9.359    6.614   247.886
 (B 198 N |B 198 CA |B 198 C )    99.325   111.200   -11.875   10.648   247.886
 (B 206 N |B 206 CA |B 206 C )   103.168   111.200    -8.032    4.872   247.886

=======> improper angle violations

 (atom-i |atom-j |atom-k |atom-L )   angle   equil.   delta   energy   const.   period

 (A 114 C  |A 114 CA |A 115 N |A 114 O  )    3.997    0.000   -3.997   3.649   750.000   0
 (A 117 CA |A 117 N  |A 117 C |A 117 CB )   31.678   35.264    3.586   2.938   750.000   0
 (A 120 CA |A 120 N  |A 120 C |A 120 CB )   38.700   35.264   -3.435   2.696   750.000   0
 (B 115 CA |B 115 N  |B 115 C |B 115 CB )   32.248   35.264    3.017   2.079   750.000   0
 (B 114 C  |B 114 CA |B 115 N |B 114 O  )    4.034    0.000   -4.034   3.717   750.000   0
 (B 117 CA |B 117 N  |B 117 C |B 117 CB )   31.640   35.264    3.624   3.001   750.000   0
 (B 120 CA |B 120 N  |B 120 C |B 120 CB )   38.645   35.264   -3.380   2.611   750.000   0

=======> dihedral angle violations

 (atom-i |atom-j |atom-k |atom-L )   angle   equil.   delta   energy   const.   period
Specific problems with the model geometry can be identified here. The bond length deviations in particular should be checked. Very long bonds usually point to a mistake at the generate stage, in particular a missing TER or BREAK card between separate chains, which leaves the end of one chain bonded to the start of the next. If present, these unphysical bonds will cause serious problems in refinement.
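A search for unphysically long bonds can be sketched as follows. This is an illustration only: it assumes the bond violation entries have the two-atom layout shown in the listing, with non-blank segids and numeric residue numbers, and the 0.5 A excess used to flag a bond is an arbitrary threshold chosen for the example, not a CNS setting.

```python
import re

# The two bond violations reported in the listing.
BOND_LINES = """\
(A 159 CG1 |A 159 CD1 ) 1.569 1.513 0.056 1.222 389.218
(B 159 CG1 |B 159 CD1 ) 1.568 1.513 0.055 1.181 389.218
"""

# Two-atom entry: (segid resid name |segid resid name ) dist. equil. ...
# Assumes non-blank segids and numeric residue numbers, as in this listing.
BOND_RE = re.compile(
    r"\(\s*(\S+)\s+(\d+)\s+(\S+)\s*\|\s*(\S+)\s+(\d+)\s+(\S+)\s*\)"
    r"\s+(\d+\.\d+)\s+(\d+\.\d+)"
)


def long_bonds(text, max_excess=0.5):
    """Return bonds longer than their equilibrium value by more than max_excess (A).

    max_excess is an arbitrary illustrative threshold (an assumption).
    """
    hits = []
    for m in BOND_RE.finditer(text):
        seg_i, res_i, atom_i, seg_j, res_j, atom_j = m.groups()[:6]
        dist, equil = float(m.group(7)), float(m.group(8))
        if dist - equil > max_excess:
            hits.append((f"{seg_i} {res_i} {atom_i}",
                         f"{seg_j} {res_j} {atom_j}", dist))
    return hits


print(long_bonds(BOND_LINES))
```

Both bonds in this listing are only about 0.06 A longer than ideal, so the list is empty. A bond joining the last residue of one chain to the first residue of the next would show up with a distance of many Angstroms.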
Non-trans peptide bonds:
============================ non-trans peptides ==============================
 cis-peptide: segid=A resid=186 resname=PRO current dihedral value= -0.443
 cis-peptide: segid=B resid=186 resname=PRO current dihedral value= -0.262
Non-trans peptides will cause problems in refinement unless they are proline residues. They can be identified, and an appropriate parameter file created, using the CNS task file general/cis_peptide.inp. The parameter file generated is then read into subsequent refinement task files.
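To pick out the entries that need attention, the non-trans peptide lines can be filtered for residues other than proline. The sketch below assumes every entry follows the "label: segid=... resid=... resname=..." layout seen in the listing; only the cis-peptide label appears there, so treating other labels the same way is an assumption. The function name is hypothetical.

```python
import re

# The non-trans peptide entries reported in the listing.
LISTING = """\
cis-peptide: segid=A resid=186 resname=PRO current dihedral value= -0.443
cis-peptide: segid=B resid=186 resname=PRO current dihedral value= -0.262
"""

# Assumed layout: "<label>: segid=<segid> resid=<resid> resname=<resname> ..."
PEPTIDE_RE = re.compile(
    r"([\w-]+):\s+segid=(\S*)\s+resid=(\S+)\s+resname=(\S+)"
)


def non_proline_peptides(text):
    """Return (label, segid, resid, resname) for every non-proline entry."""
    return [m.groups() for m in PEPTIDE_RE.finditer(text)
            if m.group(4).upper() != "PRO"]


print(non_proline_peptides(LISTING))
```

Both entries here are prolines, so the list is empty. Any residue that is returned should be inspected in the model before refinement.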