The model was obtained by manual building and may therefore have distorted geometry, so it is sensible to check it before starting refinement. In this example the CNS task file model_stats_start.inp is used to analyse the model geometry and diffraction statistics. The results will show whether the model has poor geometry, which can arise during manual rebuilding or from an error at the generate stage (see previous section). Locating any problems before refinement will save time later on.
cns_solve < model_stats_start.inp > model_stats_start.out [2 minutes]
The output listing file (model_stats_start.list) contains a variety of information, most of which is self-explanatory. Important things to note are:
Initial R-values:
=================================== summary ==================================
resolution range: 500 - 1.9 A
R-values:
initial                                        r= 0.4304 free_r= 0.4381
after B-factor and/or bulk solvent correction  r= 0.3222 free_r= 0.3503
Initial R-values that are very different from expected are usually the result of a simple mistake such as incorrect space group, unit cell dimensions, input diffraction data or input model.
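As a rough illustration of how this first check could be scripted, the sketch below pulls the R-value pairs out of the summary text and warns when the free R-value exceeds a limit. The function names and the 0.50 limit are assumptions made for this example; they are not part of CNS, and the value to expect depends on the model and the data.

```python
import re

# Summary text copied from the listing above.
SUMMARY = """\
resolution range: 500 - 1.9 A
R-values:
initial r= 0.4304 free_r= 0.4381
after B-factor and/or bulk solvent correction r= 0.3222 free_r= 0.3503
"""

# Matches "r= <number> free_r= <number>". The \b stops the "r=" inside
# "free_r=" from matching on its own, because "_" counts as a word character.
R_PAIR = re.compile(r"\br=\s*([0-9.]+)\s+free_r=\s*([0-9.]+)")


def parse_r_values(text):
    """Return (r, free_r) pairs in the order they appear in the text."""
    return [(float(r), float(free_r)) for r, free_r in R_PAIR.findall(text)]


def check_r_values(pairs, max_free_r=0.50):
    """Return a warning string for each pair whose free R exceeds max_free_r.

    max_free_r is a hypothetical limit chosen for illustration only.
    """
    warnings = []
    for r, free_r in pairs:
        if free_r > max_free_r:
            warnings.append(f"free_r={free_r:.4f} (r={r:.4f}) exceeds {max_free_r}")
    return warnings


pairs = parse_r_values(SUMMARY)
for r, free_r in pairs:
    print(f"r={r:.4f}  free_r={free_r:.4f}")
for message in check_r_values(pairs):
    print("WARNING:", message)
```

If a warning does appear, the space group, unit cell dimensions, input diffraction data and input model are the first things to recheck.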
R-values versus resolution:
================================= R-values ===================================
=======> R-values with |Fobs|/sigma cutoff= 0.0

Test set (test = 1):
#bin | resolution range | #refl |
   1    4.09   500.01      529     0.2983
   2    3.25     4.09      439     0.3047
   3    2.84     3.25      538     0.3758
   4    2.58     2.84      514     0.3993
   5    2.39     2.58      517     0.3738
   6    2.25     2.39      484     0.4019
   7    2.14     2.25      456     0.3951
   8    2.05     2.14      461     0.4313
   9    1.97     2.05      393     0.4059
  10    1.90     1.97      382     0.4383

Working set:
#bin | resolution range | #refl |
   1    4.09   500.01     4660     0.2644
   2    3.25     4.09     4764     0.2869
   3    2.84     3.25     4636     0.3492
   4    2.58     2.84     4532     0.3591
   5    2.39     2.58     4460     0.3647
   6    2.25     2.39     4336     0.3694
   7    2.14     2.25     4185     0.3759
   8    2.05     2.14     3993     0.3786
   9    1.97     2.05     3754     0.3983
  10    1.90     1.97     3412     0.4272
This distribution of R-values is reasonable: there is no dramatic increase in R-value (in particular free R-value) as resolution increases. If some resolution shells had R-values significantly higher than the rest, this might indicate problems with the data processing (ice rings, for example). However, the R-values measure the fit to the experimental data and should not be manipulated by removing data, in particular by using sigma cutoffs to exclude weak data.
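One way to look for an isolated bad shell is to compare each shell's R-value with the mean of its two neighbours. The sketch below does this for the test-set rows of the listing. The function names and the 0.05 threshold are assumptions made for this example, not CNS conventions, and the heuristic only flags shells for inspection; it does not diagnose the cause.

```python
# Test-set rows copied from the listing: bin, d_min, d_max, #refl, R-value.
TEST_SET = """\
1 4.09 500.01 529 0.2983
2 3.25 4.09 439 0.3047
3 2.84 3.25 538 0.3758
4 2.58 2.84 514 0.3993
5 2.39 2.58 517 0.3738
6 2.25 2.39 484 0.4019
7 2.14 2.25 456 0.3951
8 2.05 2.14 461 0.4313
9 1.97 2.05 393 0.4059
10 1.90 1.97 382 0.4383
"""


def parse_shells(text):
    """Return (bin_number, r_value) for every five-column row in the text."""
    shells = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip headers and blank lines
        shells.append((int(fields[0]), float(fields[4])))
    return shells


def local_spikes(shells, threshold=0.05):
    """Return bin numbers whose R-value exceeds the mean of both neighbours
    by more than threshold (a hypothetical value chosen for illustration).

    The first and last shells have only one neighbour and are not tested.
    """
    flagged = []
    for i in range(1, len(shells) - 1):
        neighbour_mean = (shells[i - 1][1] + shells[i + 1][1]) / 2
        if shells[i][1] - neighbour_mean > threshold:
            flagged.append(shells[i][0])
    return flagged


shells = parse_shells(TEST_SET)
print("shells flagged:", local_spikes(shells))
```

For the listing above no shell is flagged, which agrees with the assessment that the distribution is reasonable. A flagged shell is a reason to re-examine the data processing, not to exclude the reflections.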
The overall geometry:
rmsd bonds=      0.012550 with 8 bond violations > 0.05
rmsd angles=     1.41936 with 8 angle violations > 8.0
rmsd dihedrals= 21.99313 with 1 angle violations > 60.0
rmsd improper=   0.85765 with 8 angle violations > 3.0
If the model has been through extensive manual rebuilding the initial geometry may have significant deviations from ideality. Any major problem can be detected by the detailed geometry analysis (below).
The geometry in detail:
=======> bond violations
(atom-i   |atom-j   )     dist.    equil.    delta    energy    const.
( 100 SD  | 100 CE  )     1.659    1.791    -0.132     2.948   170.066
( 112 SD  | 112 CE  )     1.710    1.791    -0.081     1.129   170.066
( 182 SD  | 182 CE  )     1.509    1.791    -0.282    13.508   170.066
( 184 CG  | 184 SD  )     1.743    1.803    -0.060     1.862   512.111
( 184 SD  | 184 CE  )     1.643    1.791    -0.148     3.738   170.066
( 239 SD  | 239 CE  )     1.653    1.791    -0.138     3.218   170.066
( 243 SD  | 243 CE  )     1.658    1.791    -0.133     3.029   170.066
( 247 SD  | 247 CE  )     1.706    1.791    -0.085     1.239   170.066

=======> angle violations
(atom-i   |atom-j   |atom-k   )     angle    equil.    delta    energy    const.
(  28 N   |  28 CA  |  28 C   )    96.638  111.200   -14.562    16.011   247.886
(  39 C   |  40 N   |  40 CA  )   114.478  122.600    -8.122     1.562    77.737
(  58 N   |  58 CA  |  58 C   )   119.825  111.200     8.625     5.618   247.886
(  68 N   |  68 CA  |  68 C   )   103.162  111.200    -8.038     4.879   247.886
( 162 N   | 162 CA  | 162 C   )   122.939  111.800    11.139    11.752   310.948
( 196 N   | 196 CA  | 196 C   )   100.964  111.200   -10.236     7.912   247.886
( 197 N   | 197 CA  | 197 C   )    99.938  111.200   -11.262     9.577   247.886
( 228 N   | 228 CA  | 228 C   )   120.181  111.200     8.981     6.091   247.886

=======> improper angle violations
(atom-i   |atom-j   |atom-k   |atom-L   )     angle    equil.    delta    energy    const.   period
(  40 N   |  40 CA  |  40 CD  |  39 C   )    -8.339    0.000     8.339     2.118   100.000   0
(  62 N   |  62 CA  |  62 CD  |  61 C   )     4.895    0.000    -4.895     0.730   100.000   0
(  90 N   |  90 CA  |  90 CD  |  89 C   )    -3.091    0.000     3.091     0.291   100.000   0
( 140 N   | 140 CA  | 140 CD  | 139 C   )    -5.569    0.000     5.569     0.945   100.000   0
( 143 CA  | 143 N   | 143 C   | 143 CB  )    38.823   35.264    -3.558     2.893   750.000   0
( 143 N   | 143 CA  | 143 CD  | 142 C   )    -3.823    0.000     3.823     0.445   100.000   0
( 162 N   | 162 CA  | 162 CD  | 161 C   )     4.980    0.000    -4.980     0.755   100.000   0
( 195 CA  | 195 N   | 195 C   | 195 CB  )    38.604   35.264    -3.340     2.548   750.000   0

=======> dihedral angle violations
(atom-i   |atom-j   |atom-k   |atom-L   )     angle    equil.    delta    energy    const.   period
( 100 CB  | 100 CG  | 100 SD  | 100 CE  )  -174.558  -90.000    84.558     9.910     5.000   2
Specific problems with the model geometry can be identified here. In particular, the bond length deviations should be checked. Very long bonds can result from a mistake at the generate stage, in particular forgetting to include a TER or BREAK card between separate chains, and the model should be checked wherever they appear. If present, these unphysical bonds will cause serious problems in refinement.
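The check for very long bonds can be sketched as a small script that reads the bond violation lines and reports any bond whose length exceeds its equilibrium value by a wide margin. The function names, the 0.5 A margin and the final example line are assumptions made for this illustration; the example line is invented and does not come from the listing. The pattern assumes each line has the two-atom layout shown above.

```python
import re

# Three bond violation lines copied from the listing above.
BOND_LINES = """\
( 100 SD | 100 CE ) 1.659 1.791 -0.132 2.948 170.066
( 182 SD | 182 CE ) 1.509 1.791 -0.282 13.508 170.066
( 184 CG | 184 SD ) 1.743 1.803 -0.060 1.862 512.111
"""

# Invented line (NOT from the listing): a peptide bond drawn across a chain
# break. Only the first three numeric columns are given; the rest are omitted.
INVENTED_BREAK = "( 120 C | 121 N ) 9.874 1.329 8.545"

# Two atoms separated by exactly one "|" inside parentheses, followed by the
# observed distance and the equilibrium distance. Angle and improper lines
# contain more than one "|" and therefore do not match.
BOND = re.compile(r"\(([^|()]*)\|([^|()]*)\)\s+(-?\d+\.\d+)\s+(-?\d+\.\d+)")


def parse_bonds(text):
    """Return (atom_i, atom_j, dist, equil) for each bond line in the text.

    Each atom is a (resid, name) tuple taken from the last two tokens of its
    field, so a leading segid, if present, is ignored.
    """
    bonds = []
    for match in BOND.finditer(text):
        atom_i = tuple(match.group(1).split()[-2:])
        atom_j = tuple(match.group(2).split()[-2:])
        bonds.append((atom_i, atom_j, float(match.group(3)), float(match.group(4))))
    return bonds


def long_bonds(bonds, max_excess=0.5):
    """Return bonds longer than equilibrium by more than max_excess Angstroms
    (a hypothetical margin chosen for illustration)."""
    return [b for b in bonds if b[2] - b[3] > max_excess]


bonds = parse_bonds(BOND_LINES)
print("long bonds in listing:", long_bonds(bonds))
print("long bonds in invented line:", long_bonds(parse_bonds(INVENTED_BREAK)))
```

All bond violations in the listing above are shorter than their equilibrium values, so none is reported; only the invented line is flagged.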
Non-trans peptide bonds:
============================ non-trans peptides ==============================
cis-peptide: segid= resid=143 resname=PRO current dihedral value= -0.115
The presence of non-trans peptides, unless they are proline residues, will cause problems in refinement. They can be identified, and an appropriate parameter file created, using the CNS task file general/cis_peptide.inp. The generated parameter file is then read into subsequent refinement task files.
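A quick way to see whether any non-proline residue appears in this part of the listing is sketched below. It assumes every entry follows the same single-line layout as the cis-peptide line above; the function names and the second example line are invented for this illustration and do not come from CNS or from the listing.

```python
import re

# Line copied from the listing above.
LISTING = "cis-peptide: segid= resid=143 resname=PRO current dihedral value= -0.115"

# Invented line (NOT from the listing): a cis-peptide at a non-proline residue.
INVENTED = "cis-peptide: segid= resid=57 resname=ALA current dihedral value= 3.200"

# Captures segid (possibly empty), resid, resname and the dihedral value.
PEPTIDE = re.compile(
    r"cis-peptide:\s*segid=\s*(\S*?)\s*resid=\s*(\S+)\s+resname=\s*(\S+)"
    r"\s+current dihedral value=\s*(-?\d+\.\d+)"
)


def parse_cis_peptides(text):
    """Return (segid, resid, resname, dihedral) for each cis-peptide line."""
    return [
        (m.group(1), m.group(2), m.group(3), float(m.group(4)))
        for m in PEPTIDE.finditer(text)
    ]


def non_proline(peptides):
    """Return the entries whose residue name is not PRO."""
    return [p for p in peptides if p[2] != "PRO"]


peptides = parse_cis_peptides(LISTING)
print("cis-peptides:", peptides)
print("non-proline cis-peptides:", non_proline(peptides))
print("invented example:", non_proline(parse_cis_peptides(INVENTED)))
```

The only cis-peptide in the listing is at proline 143, so nothing is reported for it; the invented alanine entry shows what a residue needing attention would look like.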