Checking the initial model


The model was obtained by manual building and may therefore have distorted geometry, so it is sensible to check it before starting refinement. In this example the CNS task file model_stats_start.inp is used to analyse the model geometry and diffraction statistics. The results will show whether the model has poor geometry, which can arise during manual rebuilding or from an error at the generate stage (see previous section). Locating problems before refinement saves time later on.

      cns_solve < model_stats_start.inp > model_stats_start.out [2 minutes]

The output listing file (model_stats_start.list) contains a variety of information, most of which is self-explanatory. Important things to note are:

Initial R-values:

=================================== summary ==================================

resolution range: 500 - 1.9 A
  R-values:
  initial                                        r= 0.4304 free_r= 0.4381
  after B-factor and/or bulk solvent correction  r= 0.3222 free_r= 0.3503

Initial R-values that differ greatly from what is expected are usually the result of a simple mistake, such as an incorrect space group, incorrect unit cell dimensions, or the wrong input diffraction data or input model.
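The summary lines can also be pulled out of the listing with a short script, which is convenient when comparing several runs. The following is a minimal Python sketch, not part of CNS: the function name parse_r_summary is hypothetical, and the regular expression assumes the line layout shown in the summary above.

```python
import re

# Summary block copied from the listing above.
SUMMARY = """\
resolution range: 500 - 1.9 A
  R-values:
  initial                                        r= 0.4304 free_r= 0.4381
  after B-factor and/or bulk solvent correction  r= 0.3222 free_r= 0.3503
"""

# Assumed layout: "<label>  r= <value> free_r= <value>".
R_LINE = re.compile(
    r"^\s*(?P<label>.*?)\s+r=\s*(?P<r>\d*\.\d+)\s+free_r=\s*(?P<free>\d*\.\d+)\s*$"
)


def parse_r_summary(text):
    """Return a list of (label, R, free R) tuples found in the text."""
    rows = []
    for line in text.splitlines():
        m = R_LINE.match(line)
        if m:
            rows.append((m.group("label"), float(m.group("r")), float(m.group("free"))))
    return rows


rows = parse_r_summary(SUMMARY)
for label, r, free_r in rows:
    # The gap between free R and R is printed alongside the two values.
    print(f"{label}: R={r:.4f}  free R={free_r:.4f}  gap={free_r - r:.4f}")
```

In practice the text would be read from model_stats_start.list rather than from an embedded string.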

R-values versus resolution:

================================= R-values ===================================


=======> R-values with |Fobs|/sigma cutoff= 0.0

 Test set (test = 1):

 #bin | resolution range | #refl |
    1   4.09  500.01        529      0.2983
    2   3.25    4.09        439      0.3047
    3   2.84    3.25        538      0.3758
    4   2.58    2.84        514      0.3993
    5   2.39    2.58        517      0.3738
    6   2.25    2.39        484      0.4019
    7   2.14    2.25        456      0.3951
    8   2.05    2.14        461      0.4313
    9   1.97    2.05        393      0.4059
   10   1.90    1.97        382      0.4383

 Working set:

 #bin | resolution range | #refl |
    1   4.09  500.01       4660      0.2644
    2   3.25    4.09       4764      0.2869
    3   2.84    3.25       4636      0.3492
    4   2.58    2.84       4532      0.3591
    5   2.39    2.58       4460      0.3647
    6   2.25    2.39       4336      0.3694
    7   2.14    2.25       4185      0.3759
    8   2.05    2.14       3993      0.3786
    9   1.97    2.05       3754      0.3983
   10   1.90    1.97       3412      0.4272

This distribution of R-values is reasonable: there is no dramatic increase in R-value (in particular the free R-value) as resolution increases. Resolution shells with R-values significantly higher than the rest might indicate problems with the data processing (ice rings, for example). However, the R-values measure the fit to the experimental data and should not be manipulated by removing data, in particular by using sigma cutoffs to exclude weak data.
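The check for outlying resolution shells can be sketched in Python. This is an illustration only, not a CNS feature: the helper names parse_bins and flag_shells are hypothetical, and the 0.05 jump threshold is an arbitrary assumption chosen for the example, not a recommended value. A shell is flagged when its R-value exceeds the mean of its neighbouring shells by more than the threshold.

```python
import re

# Test-set table copied from the listing above.
TEST_SET = """\
 #bin | resolution range | #refl |
    1   4.09  500.01        529      0.2983
    2   3.25    4.09        439      0.3047
    3   2.84    3.25        538      0.3758
    4   2.58    2.84        514      0.3993
    5   2.39    2.58        517      0.3738
    6   2.25    2.39        484      0.4019
    7   2.14    2.25        456      0.3951
    8   2.05    2.14        461      0.4313
    9   1.97    2.05        393      0.4059
   10   1.90    1.97        382      0.4383
"""

# Assumed columns: bin, high-res limit, low-res limit, reflections, R-value.
BIN_LINE = re.compile(r"^\s*(\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+)\s+(\d+\.\d+)\s*$")


def parse_bins(text):
    """Return (bin, d_min, d_max, n_refl, r_value) tuples, one per shell."""
    bins = []
    for line in text.splitlines():
        m = BIN_LINE.match(line)
        if m:
            b, d_min, d_max, n, r = m.groups()
            bins.append((int(b), float(d_min), float(d_max), int(n), float(r)))
    return bins


def flag_shells(r_values, jump=0.05):
    """Return 1-based positions of shells whose R exceeds the neighbour mean by > jump."""
    flagged = []
    for i, r in enumerate(r_values):
        neighbours = r_values[max(i - 1, 0):i] + r_values[i + 1:i + 2]
        if neighbours and r - sum(neighbours) / len(neighbours) > jump:
            flagged.append(i + 1)
    return flagged


bins = parse_bins(TEST_SET)
r_free = [b[4] for b in bins]
print("shells flagged:", flag_shells(r_free))
```

With the values shown above no shell is flagged, in line with the description of this distribution as reasonable.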

The overall geometry:

rmsd bonds= 0.012550 with 8 bond violations > 0.05
rmsd angles=  1.41936 with 8 angle violations >  8.0
rmsd dihedrals= 21.99313 with 1 angle violations >  60.0
rmsd improper=  0.85765 with 8 angle violations >  3.0

If the model has been through extensive manual rebuilding the initial geometry may have significant deviations from ideality. Any major problem can be detected by the detailed geometry analysis (below).

The geometry in detail:


=======> bond violations

 (atom-i        |atom-j        )    dist.   equil.   delta    energy   const.

 (     100  SD  |     100  CE  )    1.659    1.791   -0.132    2.948  170.066
 (     112  SD  |     112  CE  )    1.710    1.791   -0.081    1.129  170.066
 (     182  SD  |     182  CE  )    1.509    1.791   -0.282   13.508  170.066
 (     184  CG  |     184  SD  )    1.743    1.803   -0.060    1.862  512.111
 (     184  SD  |     184  CE  )    1.643    1.791   -0.148    3.738  170.066
 (     239  SD  |     239  CE  )    1.653    1.791   -0.138    3.218  170.066
 (     243  SD  |     243  CE  )    1.658    1.791   -0.133    3.029  170.066
 (     247  SD  |     247  CE  )    1.706    1.791   -0.085    1.239  170.066

=======> angle violations

 (atom-i        |atom-j        |atom-k        )  angle    equil.     delta    energy  const.

 (     28   N   |     28   CA  |     28   C   )   96.638  111.200  -14.562   16.011  247.886
 (     39   C   |     40   N   |     40   CA  )  114.478  122.600   -8.122    1.562   77.737
 (     58   N   |     58   CA  |     58   C   )  119.825  111.200    8.625    5.618  247.886
 (     68   N   |     68   CA  |     68   C   )  103.162  111.200   -8.038    4.879  247.886
 (     162  N   |     162  CA  |     162  C   )  122.939  111.800   11.139   11.752  310.948
 (     196  N   |     196  CA  |     196  C   )  100.964  111.200  -10.236    7.912  247.886
 (     197  N   |     197  CA  |     197  C   )   99.938  111.200  -11.262    9.577  247.886
 (     228  N   |     228  CA  |     228  C   )  120.181  111.200    8.981    6.091  247.886

=======> improper angle violations

 (atom-i        |atom-j        |atom-k        |atom-L        )    angle    equil.   delta    energy   const.   period

 (     40   N   |     40   CA  |     40   CD  |     39   C   )   -8.339    0.000    8.339    2.118  100.000   0
 (     62   N   |     62   CA  |     62   CD  |     61   C   )    4.895    0.000   -4.895    0.730  100.000   0
 (     90   N   |     90   CA  |     90   CD  |     89   C   )   -3.091    0.000    3.091    0.291  100.000   0
 (     140  N   |     140  CA  |     140  CD  |     139  C   )   -5.569    0.000    5.569    0.945  100.000   0
 (     143  CA  |     143  N   |     143  C   |     143  CB  )   38.823   35.264   -3.558    2.893  750.000   0
 (     143  N   |     143  CA  |     143  CD  |     142  C   )   -3.823    0.000    3.823    0.445  100.000   0
 (     162  N   |     162  CA  |     162  CD  |     161  C   )    4.980    0.000   -4.980    0.755  100.000   0
 (     195  CA  |     195  N   |     195  C   |     195  CB  )   38.604   35.264   -3.340    2.548  750.000   0

=======> dihedral angle violations

 (atom-i        |atom-j        |atom-k        |atom-L        )    angle    equil.   delta    energy   const.   period

 (     100  CB  |     100  CG  |     100  SD  |     100  CE  ) -174.558  -90.000   84.558    9.910    5.000   2

Specific problems with the model geometry can be identified here. In particular, the bond length deviations should be checked. Very long bonds can result from a mistake at the generate stage, in particular forgetting to include a TER or BREAK card between separate chains. If present, these unphysical bonds will cause serious problems in refinement.
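For a large model the bond violation table can be scanned with a script instead of by eye. The Python sketch below is an illustration under stated assumptions: the names parse_bonds and long_bonds are hypothetical, the 0.5 A stretch threshold is arbitrary, and the regular expression assumes the two-atom line layout shown above with an empty segid. In the listing, delta is the observed distance minus the equilibrium value, so a large positive delta marks an over-long bond.

```python
import re

# A few lines copied from the bond violation table above.
BOND_BLOCK = """\
 (atom-i        |atom-j        )    dist.   equil.   delta    energy   const.

 (     100  SD  |     100  CE  )    1.659    1.791   -0.132    2.948  170.066
 (     182  SD  |     182  CE  )    1.509    1.791   -0.282   13.508  170.066
 (     184  CG  |     184  SD  )    1.743    1.803   -0.060    1.862  512.111
"""

# Two atoms inside the parentheses, then dist, equil and delta as decimals.
BOND_LINE = re.compile(
    r"^\s*\((?P<i>[^|)]*)\|(?P<j>[^|)]*)\)"
    r"\s+(?P<dist>-?\d+\.\d+)\s+(?P<equil>-?\d+\.\d+)\s+(?P<delta>-?\d+\.\d+)"
)


def parse_bonds(text):
    """Return one dict per bond violation line; header lines are skipped."""
    bonds = []
    for line in text.splitlines():
        m = BOND_LINE.match(line)
        if m:
            bonds.append({
                "atom_i": tuple(m.group("i").split()),  # (resid, atom name)
                "atom_j": tuple(m.group("j").split()),
                "dist": float(m.group("dist")),
                "equil": float(m.group("equil")),
                "delta": float(m.group("delta")),
            })
    return bonds


def long_bonds(bonds, max_stretch=0.5):
    """Return bonds longer than equilibrium by more than max_stretch (assumed, in A)."""
    return [b for b in bonds if b["delta"] > max_stretch]


bonds = parse_bonds(BOND_BLOCK)
worst = max(bonds, key=lambda b: abs(b["delta"]))
print("largest deviation:", worst["atom_i"], worst["atom_j"], worst["delta"])
print("over-long bonds:", long_bonds(bonds))
```

All deviations in this example are negative (bonds slightly too short), so nothing is reported as over-long.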

Non-trans peptide bonds:

============================ non-trans peptides ==============================

cis-peptide: segid= resid=143 resname=PRO
             current dihedral value=   -0.115

The presence of non-trans peptides, unless they are proline residues, will cause problems in refinement. They can be identified, and an appropriate parameter file created, using the CNS task file general/cis_peptide.inp. The generated parameter file is then read into subsequent refinement task files.
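Picking out the non-proline entries from this part of the listing can be sketched in Python. This is only an illustration: the names parse_cis_peptides and non_proline are hypothetical, and the regular expression assumes entries begin with "cis-peptide:" in the layout shown above. It does not replace general/cis_peptide.inp, which creates the parameter file.

```python
import re

# Entry copied from the listing above.
NON_TRANS = """\
cis-peptide: segid= resid=143 resname=PRO
             current dihedral value=   -0.115
"""

# Assumed layout: "cis-peptide: segid=<segid> resid=<resid> resname=<name>";
# the segid may be empty, as in the example.
CIS_LINE = re.compile(
    r"cis-peptide:\s*segid=\s*(?P<segid>\S*?)\s*resid=\s*(?P<resid>\S+)"
    r"\s+resname=\s*(?P<resname>\S+)"
)


def parse_cis_peptides(text):
    """Return one dict (segid, resid, resname) per cis-peptide entry."""
    return [m.groupdict() for m in CIS_LINE.finditer(text)]


def non_proline(entries):
    """Return the entries whose residue is not proline."""
    return [e for e in entries if e["resname"] != "PRO"]


entries = parse_cis_peptides(NON_TRANS)
print("cis-peptides found:", entries)
print("non-proline cis-peptides:", non_proline(entries))
```

Here the only cis-peptide is at a proline, so the second list is empty.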


Script to run this tutorial
