Scoring Functions

From Rizzo_Lab

DOCK uses several types of scoring functions to discriminate among orientations and molecules. Scoring is requested using the score_molecules parameter. Many, but not all, of the scoring functions can also be called through Descriptor Score. Those with Descriptor Score functionality are denoted in bold.

Continuous/DOCK Cartesian Energy Score

Continuous scoring may be used to evaluate a ligand:receptor complex without the investment of a grid calculation, or to perform a more detailed calculation without the numerical approximation of the grid. Continuous score is also implemented under Descriptor Score. It is a Cartesian-based score and can be combined with other Cartesian-based scores at very little additional computational expense.

Continuous/DCE Score Parameters

Parameter Description Default Value
continuous_score_primary Does the user want to perform continuous non-grid scoring as the primary scoring function? no
continuous_score_secondary Does the user want to perform continuous non-grid scoring as the secondary scoring function? no
cont_score_rec_filename File that contains receptor coordinates receptor.mol2
cont_score_att_exp VDW Lennard-Jones potential attractive exponent 6
cont_score_rep_exp VDW Lennard-Jones potential repulsive exponent 12
cont_score_rep_rad_scale Scalar multiplier of the radii for the repulsive portion of the vdw energy component only 1.0
cont_score_use_dist_dep_dielectric Distance dependent dielectric switch yes
cont_score_dielectric Dielectric constant for the electrostatic term
cont_score_vdw_scale Scalar multiplier of vdw energy component 1
cont_score_turn_off_vdw Flag to turn off vdw portion of the scoring function when cont_score_vdw_scale = 0 yes
cont_score_es_scale Scalar multiplier of electrostatic energy component 1.0
cont_score_turn_off_es Flag to turn off es portion of the scoring function when cont_score_es_scale = 0 yes

Continuous/DCE Score Output Components

Output Component Description
Continuous_score sum of the van der Waals and electrostatic interactions
Continuous_vdw_energy VDW interaction between ligand and receptor
Continuous_es_energy ES interaction between the ligand and receptor
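
The way the parameters above feed into the three output components can be sketched in Python. This is only an illustrative sketch, not DOCK's implementation: the atom dictionaries, the geometric-mean well depth, the sum-of-radii minimum distance, and the 332.0 kcal/mol conversion factor are assumptions made for the example, and the dielectric constant is passed explicitly because no default is listed above.

```python
import math

def continuous_score(ligand, receptor, dielectric, att_exp=6, rep_exp=12,
                     rep_rad_scale=1.0, dist_dep_dielectric=True,
                     vdw_scale=1.0, es_scale=1.0):
    """Return (score, vdw, es) summed over all ligand:receptor atom pairs.

    Atoms are hypothetical dicts with keys xyz (angstroms), q (partial charge),
    radius (vdW radius) and eps (well depth).
    """
    a, b = att_exp, rep_exp
    vdw = 0.0
    es = 0.0
    for lig in ligand:
        for rec in receptor:
            d = math.dist(lig["xyz"], rec["xyz"])
            rmin = lig["radius"] + rec["radius"]       # assumed position of the minimum
            eps = math.sqrt(lig["eps"] * rec["eps"])   # assumed geometric-mean well depth
            # rep_rad_scale rescales the radii in the repulsive term only
            rep = (a / (b - a)) * (rep_rad_scale * rmin / d) ** b
            att = (b / (b - a)) * (rmin / d) ** a
            vdw += eps * (rep - att)
            # distance-dependent dielectric: effective dielectric = dielectric * d
            denom = dielectric * d * d if dist_dep_dielectric else dielectric * d
            es += 332.0 * lig["q"] * rec["q"] / denom
    vdw *= vdw_scale
    es *= es_scale
    return vdw + es, vdw, es
```

With the default 6-12 exponents, a pair of atoms separated by exactly the sum of their radii contributes -eps to the VDW term, which is a quick sanity check on the functional form.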

Grid-Based Score

DOCK needs a fast scoring function to evaluate poses rapidly during growth. The energy grid is used for this. The grid stores the non-bonded Molecular Mechanics Potential of the receptor at each grid point.

[Equation image: Mm-equation.png]

Grid Score can be called under Descriptor Score, but mixing grid-based and Cartesian-space scoring functions is not recommended because it dramatically increases computing time.
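
The idea of precomputing the receptor potential once and then looking it up for every pose can be sketched as follows. This is a simplified illustration covering only the electrostatic term; the function names, the (xyz, charge) atom tuples, the distance-dependent dielectric, and the use of trilinear interpolation are assumptions for the example rather than a description of the DOCK grid file format.

```python
import math

def build_es_grid(receptor, origin, spacing, shape, dielectric=4.0):
    """Precompute the receptor ES potential (per unit charge) at every grid point."""
    nx, ny, nz = shape
    grid = [[[0.0] * nz for _ in range(ny)] for _ in range(nx)]
    for i in range(nx):
        for j in range(ny):
            for k in range(nz):
                point = (origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + k * spacing)
                for xyz, q in receptor:
                    d = max(math.dist(point, xyz), 1e-3)  # avoid division by zero
                    grid[i][j][k] += 332.0 * q / (dielectric * d * d)
    return grid

def interpolate(grid, origin, spacing, xyz):
    """Trilinear interpolation; xyz must lie inside the grid, below its upper faces."""
    frac = [(xyz[n] - origin[n]) / spacing for n in range(3)]
    base = [int(math.floor(v)) for v in frac]
    t = [frac[n] - base[n] for n in range(3)]
    value = 0.0
    for di in (0, 1):
        for dj in (0, 1):
            for dk in (0, 1):
                weight = ((t[0] if di else 1.0 - t[0]) *
                          (t[1] if dj else 1.0 - t[1]) *
                          (t[2] if dk else 1.0 - t[2]))
                value += weight * grid[base[0] + di][base[1] + dj][base[2] + dk]
    return value

def grid_es_energy(grid, origin, spacing, ligand):
    """Sum charge times interpolated potential over all ligand atoms."""
    return sum(q * interpolate(grid, origin, spacing, xyz) for xyz, q in ligand)
```

The expensive loop over receptor atoms runs once in build_es_grid; scoring a pose afterwards costs only eight lookups per ligand atom, which is why the grid is suited to rapid evaluation during growth.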

Grid Score Parameters

Parameter Description Default Value
grid_score_primary Does the user want to perform grid-based energy scoring as the primary scoring function? yes
grid_score_rep_rad_scale Scalar multiplier of the radii for the repulsive portion of the VDW energy component only when grid score is turned on 1.0
grid_score_vdw_scale Scalar multiplier of the VDW energy component 1
grid_score_turn_off_vdw Flag to turn off the VDW portion of the scoring function when grid_score_vdw_scale = 0 yes
grid_score_es_scale Scalar multiplier of the ES energy component 1
grid_score_turn_off_es Flag to turn off the ES portion of the scoring function when grid_score_es_scale = 0 yes
grid_score_grid_prefix The prefix to the grid files containing the desired nrg/bmp grid grid

Grid Score Output Components

These values are printed in the header of the output mol2 file after the DOCK run.

Output Component Description
Grid_score Sum of the VDW and ES interactions
Grid_vdw_energy VDW interaction between the ligand and the grid
Grid_es_energy ES interaction between the ligand and the grid

MultiGrid Score

The MultiGrid Score is similar to the Footprint Score described below, except that the pair-wise interaction energies are computed over multiple grids rather than in Cartesian space. This is done to improve the tractability of FPS calculations and to make it simple to combine FPS with the standard Grid Score. If the multiple grids are prepared as recommended, then the sum of the interactions with each grid should equal the interaction with a standard DOCK grid representing the entire target. By default, MultiGrid score equals the sum of the interactions with all grids plus an FPS component generated by treating each grid as a protein residue. User-defined scaling factors allow MultiGrid score to be set equal to Grid score, FPS score, or any combination thereof.

Generally, "important" receptor residues are identified beforehand based on the magnitude of their interaction with the reference ligand, and a unique grid is generated to represent each of those residues. A "remainder" grid is then generated to represent all remaining receptor residues. The scoring function calculates intermolecular VDW and ES energies for the reference ligand and the pose ligand on each of the grids (the resulting per-grid energies are also called footprints), then calculates the footprint similarity using the Standard Euclidean, Normalized Euclidean, or Pearson Correlation similarity metric.
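
The default combination described above (summed per-grid energies plus a footprint-similarity component) can be sketched in a few lines. This is a hedged illustration using the Standard Euclidean metric; the function name and the four scaling arguments are hypothetical stand-ins, not the actual DOCK parameter handling.

```python
import math

def multigrid_score(pose_vdw, pose_es, ref_vdw, ref_es,
                    vdw_scale=1.0, es_scale=1.0,
                    vdw_fps_scale=1.0, es_fps_scale=1.0):
    """Combine per-grid energies with a Euclidean footprint-similarity term.

    Each list holds one energy per grid, with the remainder grid last.
    """
    # Energy part: summing over all grids should reproduce a single-grid score
    energy = vdw_scale * sum(pose_vdw) + es_scale * sum(pose_es)
    # Footprint part: each grid is treated like one receptor residue
    vdw_fps = math.dist(pose_vdw, ref_vdw)
    es_fps = math.dist(pose_es, ref_es)
    return energy + vdw_fps_scale * vdw_fps + es_fps_scale * es_fps
```

Setting both footprint scales to zero reduces the result to a plain energy sum, while setting both energy scales to zero leaves only the footprint similarity, which mirrors the "any combination thereof" behaviour described in the text.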

MultiGrid Score can be called under Descriptor Score, but mixing grid-based and Cartesian-space scoring functions is not recommended because it dramatically increases computing time.

MultiGrid Score Parameters

Parameter Description Default Value
multigrid_score_primary Does the user want to perform multigrid energy scoring as the primary scoring function? yes
multigrid_score_rep_rad_scale Scalar multiplier of the radii for the repulsive portion of the VDW energy component only when grid score is turned on 1.0
multigrid_score_vdw_scale Scalar multiplier of the VDW energy component 1
multigrid_score_es_scale Scalar multiplier of the ES energy component 1
multigrid_score_number_of_grids Number of grids to use, including the remainder grid.
multigrid_score_grid_prefix0 Provide prefixes to identify the grids. Note that the first grid starts at '0'. The last grid should be the remainder grid. This must be done for each grid. multigrid0
multigrid_score_individual_rec_ensemble Flag for individual receptor (standard) or multiple receptor (not implemented yet). no
multigrid_score_weights_text Flag for providing a textfile as input for the reference footprint. no
multigrid_score_footprint_text Name of the reference footprint input text file, when multigrid_score_weights_text is turned on. reference.txt
multigrid_score_fp_ref_mol Flag for providing a MOL2 as input for the reference footprint. no
multigrid_score_footprint_ref Name of the reference footprint input MOL2 file, when multigrid_score_fp_ref_mol is turned on. reference.mol2
multigrid_score_foot_compare_type Footprint similarity calculation method (Options: Euclidean, Pearson). If Pearson, the correlation coefficient is used as the metric to compare the footprints: a value of 1 indicates perfect agreement between the two footprints and a value of 0 indicates poor agreement. If Euclidean, the Euclidean distance is used as the metric: a value of 0 indicates perfect agreement, and the value increases as the agreement gets worse. Euclidean
multigrid_score_normalize_foot Flag to normalize the footprints. Normalization is used only with Euclidean distance. no
multigrid_score_vdw_euc_scale Scaling factor for the VDW term when using Euclidean distance 1.0
multigrid_score_es_euc_scale Scaling factor for the ES term when using Euclidean distance 1.0
multigrid_score_vdw_norm_scale Scaling factor for the VDW term when using normalized Euclidean distance 10.0
multigrid_score_es_norm_scale Scaling factor for the ES term when using normalized Euclidean distance 10.0
multigrid_score_vdw_cor_scale Scaling factor for the VDW term when using the Pearson Correlation similarity metric -10.0
multigrid_score_es_cor_scale Scaling factor for the ES term when using the Pearson Correlation similarity metric -10.0

MultiGrid Score Output Components

These values are printed in the header of the output mol2 file after the DOCK run.

Output Component Description
MultiGrid_score Sum of the VDW and ES interactions
MultiGrid_vdw_energy VDW interaction between the ligand and the grids
MultiGrid_es_energy ES interaction between the ligand and the grids
MGS_vdw+es_energy Sum of the VDW and ES components
MGS_vdw_fps VDW footprint similarity score
MGS_es_fps ES footprint similarity score
MGS_vdw+es_fps Sum of the VDW and ES footprint similarity scores

Footprint Similarity Score

The Footprint Similarity Score is a scoring function that calculates intermolecular hydrogen bonds and footprint comparisons, in addition to standard intermolecular energies (VDW and ES).

Intermolecular Energies (VDW, ES) are calculated the same way as in Continuous Score.

A geometric definition of Hydrogen bonds is employed. We define 3 atoms XD, HD, and XA as the heavy atom donor, donated hydrogen, and heavy atom acceptor, respectively. There is a hydrogen bond present if the following two conditions are met:

  1. The distance between HD and XA is less than or equal to 2.5 angstroms;
  2. The angle defined by XD, HD, and XA is between 120 and 180 degrees.
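
The two geometric conditions can be checked directly from the three atom positions. The following is a minimal sketch, assuming coordinates in angstroms, that the angle is measured at HD, and that both limits are inclusive; the function name is hypothetical.

```python
import math

def is_hbond(xd, hd, xa, max_dist=2.5, min_angle=120.0):
    """Return True if donor XD, hydrogen HD and acceptor XA form a hydrogen bond."""
    v_don = [d - h for d, h in zip(xd, hd)]   # vector HD -> XD
    v_acc = [a - h for a, h in zip(xa, hd)]   # vector HD -> XA
    len_don = math.hypot(*v_don)
    len_acc = math.hypot(*v_acc)              # HD...XA distance
    # Condition 1: HD...XA distance of at most 2.5 angstroms
    if len_don == 0.0 or len_acc == 0.0 or len_acc > max_dist:
        return False
    # Condition 2: XD-HD-XA angle of at least 120 degrees (it cannot exceed 180)
    cos_angle = sum(p * q for p, q in zip(v_don, v_acc)) / (len_don * len_acc)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    return math.degrees(math.acos(cos_angle)) >= min_angle
```

A perfectly linear arrangement gives an angle of 180 degrees, so only the lower bound needs an explicit test.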

Footprints are a per-residue decomposition of interactions between the ligand and the receptor. This can be performed for all three terms: VDW, ES, and HB. Two footprints can be compared in three ways: Standard Euclidean, Normalized Euclidean, and Pearson Correlation.
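
The three comparison methods can be sketched on two footprints stored as equal-length lists of per-residue energies. This is an illustration only; in particular, treating "normalized" as scaling each footprint to unit length before taking the distance is an assumption of this sketch.

```python
import math

def euclidean(fp_a, fp_b):
    """Standard Euclidean distance; 0 means identical footprints."""
    return math.dist(fp_a, fp_b)

def normalized_euclidean(fp_a, fp_b):
    """Euclidean distance between footprints scaled to unit length (assumed definition)."""
    def unit(fp):
        norm = math.sqrt(sum(x * x for x in fp))
        return [x / norm for x in fp] if norm > 0.0 else list(fp)
    return math.dist(unit(fp_a), unit(fp_b))

def pearson(fp_a, fp_b):
    """Pearson correlation coefficient; 1 means perfect agreement.

    Undefined (division by zero) if either footprint is constant.
    """
    n = len(fp_a)
    mean_a = sum(fp_a) / n
    mean_b = sum(fp_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(fp_a, fp_b))
    dev_a = math.sqrt(sum((a - mean_a) ** 2 for a in fp_a))
    dev_b = math.sqrt(sum((b - mean_b) ** 2 for b in fp_b))
    return cov / (dev_a * dev_b)
```

For a pose whose footprint has the same shape as the reference but twice the magnitude, the standard Euclidean distance is non-zero, whereas the normalized distance is 0 and the Pearson coefficient is 1. The metrics therefore differ in whether they reward matching the pattern alone or the pattern and the magnitude.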

Footprints are used to gauge how similar two poses or two molecules are to one another. For virtual screening applications, a reference is required.

There are two choices for a reference:

  1. One can give a mol2 file containing a reference molecule, and footprints will be calculated.
  2. One can pass a text file containing VDW, ES, and H-bond footprints.

There are different choices for selection of residues:

  1. All residues.
  2. Residues chosen using a threshold (union of the sets of reference and pose). The VDW, ES, and HB footprints may have different residues chosen in this case.
  3. Selected residues.

Note that for (2) and (3) the remaining residue interaction may be placed in a remainder value included in the footprint.
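
Threshold-based selection with a remainder term can be sketched as follows. This is a hedged example: the function name is hypothetical, and "exceeds the threshold" is interpreted as a strict comparison on the absolute value of the per-residue interaction.

```python
def select_by_threshold(ref_fp, pose_fp, threshold, use_remainder=True):
    """Keep residues whose interaction magnitude exceeds the threshold in
    either the reference or the pose footprint (union of the two sets).

    Returns (residue_ids, ref_selected, pose_selected); residue ids start at 1.
    """
    keep = [i for i, (r, p) in enumerate(zip(ref_fp, pose_fp))
            if abs(r) > threshold or abs(p) > threshold]
    kept = set(keep)
    ref_sel = [ref_fp[i] for i in keep]
    pose_sel = [pose_fp[i] for i in keep]
    if use_remainder:
        # Lump every residue that was not selected into one remainder value
        ref_sel.append(sum(v for i, v in enumerate(ref_fp) if i not in kept))
        pose_sel.append(sum(v for i, v in enumerate(pose_fp) if i not in kept))
    return [i + 1 for i in keep], ref_sel, pose_sel
```

Because the selection is run separately for each term, calling it on the VDW, ES, and HB footprints with their own thresholds can return different residue lists, as noted in choice (2).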

Footprint Similarity Score can be called under Descriptor Score.

Footprint Similarity Score Parameters

Parameter Description Default Value
footprint_similarity_score_primary Flag to perform footprint scoring as the primary scoring function no
footprint_similarity_score_secondary Flag to perform footprint scoring as the secondary scoring function. no
fps_score_use_footprint_reference_mol2 Use a molecule to calculate footprint reference. no
fps_score_footprint_reference_mol2_filename Path to the reference mol2 file - only used when footprint reference mol2 is turned on. ligand_footprint.mol2
fps_score_use_footprint_reference_txt Use a pre-calculated footprint reference in text format. no
fps_score_footprint_reference_txt_filename Path to the reference txt file - only used when footprint reference txt is turned on. ligand_footprint.txt
fps_score_foot_compare_type Footprint similarity calculation method (Options: Euclidean, Pearson). If Pearson, the correlation coefficient is used as the metric to compare the footprints: a value of 1 indicates perfect agreement between the two footprints and a value of 0 indicates poor agreement. If Euclidean, the Euclidean distance is used as the metric: a value of 0 indicates perfect agreement, and the value increases as the agreement gets worse. Euclidean
fps_score_normalize_foot Flag to normalize the footprints. Normalization is used only with Euclidean distance. no
fps_score_foot_comp_all_residue If yes all residues are used for calculating the footprint. yes
fps_score_choose_foot_range_type How the residue range of the footprint is determined (Options: specify_range, threshold). If specify_range, the user supplies a residue range and all footprints are evaluated only on that range; the first residue id is 1, not 0. If threshold, the range consists only of residues whose magnitudes exceed the specified thresholds. specify_range
fps_score_vdw_threshold Specify threshold for van der Waals energy, when threshold is turned on. 1
fps_score_es_threshold Specify threshold for electrostatic energy, when threshold is turned on. 1
fps_score_hb_threshold Specify threshold for hydrogen bonds (integer counts), when threshold is turned on. A value of 0.5 means that all non-zero values are used. 0.5
fps_score_use_remainder Interaction remainder is all remaining residues not included individually yes
fps_score_rec_filename File that contains receptor coordinates receptor.mol2
fps_score_att_exp VDW Lennard-Jones potential attractive exponent 6
fps_score_rep_exp VDW Lennard-Jones potential repulsive exponent 12
fps_score_rep_rad_scale Scalar multiplier of the radii for the repulsive portion of the VDW energy component ONLY 1
fps_score_use_distance_dependent_dielectric Distance dependent dielectric switch yes
fps_score_dielectric Dielectric constant for electrostatic term 4.0
fps_score_vdw_scale Scalar multiplier of vdw energy component 1
fps_score_es_scale Scalar multiplier of es energy component 1
fps_score_hb_scale Scalar multiplier of hb energy component 0
fps_score_internal_scale Scalar multiplier of internal energy component 0
fps_score_fp_vdw_scale Scalar multiplier of vdw footprint component 0
fps_score_fp_es_scale Scalar multiplier of es footprint component 0
fps_score_fp_hb_scale Scalar multiplier of hb footprint component 0

Footprint Similarity Score Output Components

Output Component Description
Footprint_similarity_score sum of the van der Waals, electrostatic, and hbond footprint similarity scores
FPS_vdw_energy VDW interaction between ligand and receptor
FPS_es_energy ES interaction between the ligand and receptor
FPS_num_hbond number of hydrogen bonds
FPS_vdw+es_energy sum of the van der Waals and electrostatic components
FPS_vdw_fps vdw footprint similarity score
FPS_es_fps ES footprint similarity score
FPS_hb_fps hbond footprint similarity score
FPS_vdw_fp_numres number of residues in the receptor considered during the calculation
FPS_es_fp_numres number of residues in the receptor considered during the calculation
FPS_hb_fp_numres number of residues in the receptor considered during the calculation

Pharmacophore Matching Similarity Score

The Pharmacophore Matching Similarity score is a scoring function that calculates the level of pharmacophore overlap between a reference molecule and a candidate molecule in three-dimensional space. The functional form for quantifying the pharmacophore overlap in a virtual screening experiment using DOCK, termed pharmacophore matching similarity (FMS), is as follows:

[Equation image: Fms equation.jpg]

Pharmacophore Matching Similarity Score can be called under Descriptor Score.

Pharmacophore Matching Similarity Score Parameters

Parameter Description Default Value
pharmacophore_score_primary Flag to perform FMS scoring as the primary scoring function no
fms_score_use_ref_mol2 Use a molecule to calculate pharmacophore reference no
fms_score_ref_mol2_filename molecule reference input file name. Ph4.mol2
fms_score_use_ref_txt Use a text format pharmacophore reference. no
fms_score_ref_txt_filename text reference input file name. Ph4.txt
fms_score_write_reference_pharmacophore_mol2 Flag to write the reference pharmacophore model as a mol2 output file. no
fms_score_write_reference_ph4_txt Flag to write the reference pharmacophore model as a txt output file. no
fms_score_reference_output_mol2_filename reference pharmacophore mol2 output file name. ref_ph4.mol2
fms_score_reference_output_txt_filename Reference pharmacophore txt output file name. ref_ph4.txt
fms_score_write_candidate_pharmacophore Flag to write the candidate pharmacophore model as a mol2 output file. no
fms_score_candidate_output_filename Candidate pharmacophore output file name cad_ph4.mol2
fms_score_write_matched_pharmacophore Flag to write the matched pharmacophore model as a mol2 output file. The matched pharmacophore model, which consists of the pharmacophore points well matched to any reference pharmacophore point, is a subset of the candidate pharmacophore model. no
fms_score_matched_output_filename matched pharmacophore output file name. mat_ph4.mol2
fms_score_compare_type Comparison method between the reference and candidate pharmacophores (ph4) (Options: overlap, compatible). If overlap, a ligand-based reference is used for computing the FMS: a value of 0 indicates a perfect overlap, a negative value indicates multi-matched ph4, and a positive value indicates matches with residual. If compatible (under development and not currently available), a receptor-based reference is used for computing the FMS: a value of X indicates a perfect overlap, a value of Y indicates multi-matched ph4, and a value of Z indicates matches with residual. overlap
fms_score_full_match Flag to determine if full match is desired. Currently only full match is considered. yes
fms_score_match_rate_weight Specify the constant parameter k (weight on the match rate term) in FMS score 5
fms_score_match_proj_cutoff Specify the scalar projection cutoff σ in the pharmacophore matching protocol. The default value cos(45°) ≈ 0.7071 corresponds to a vector angle cutoff of 45°. 0.7071
fms_score_max_score Specify the FMS score value for pharmacophore model pairs with no matches. This maximum FMS score depends on k, r and σ. 20
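
The relationship between the scalar projection cutoff σ and the 45° angle cutoff can be illustrated with a short sketch. It assumes that the cutoff is applied to the dot product of the unit direction vectors of two pharmacophore points; that reading, and the function name, are assumptions of this example rather than a description of the full matching protocol.

```python
import math

# Default cutoff from the table above: cos(45 degrees)
SIGMA = math.cos(math.radians(45.0))

def directions_match(u, v, cutoff=SIGMA):
    """Return True if the scalar projection of unit(u) onto unit(v) reaches the cutoff."""
    len_u = math.sqrt(sum(x * x for x in u))
    len_v = math.sqrt(sum(x * x for x in v))
    if len_u == 0.0 or len_v == 0.0:
        return False  # a zero-length vector has no direction to compare
    projection = sum(a * b for a, b in zip(u, v)) / (len_u * len_v)
    return projection >= cutoff
```

Under this reading, two direction vectors about 27° apart pass the default cutoff, while vectors about 63° apart do not.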

Descriptor Score

The Descriptor Score is a newly developed scoring function formed as a linear combination of existing and newly developed DOCK scoring functions. It allows users to guide sampling and evaluate docked molecules with one or more scoring criteria that emphasize different properties of the molecule. It takes the following form:

[Equation image: Descriptorscore.jpg]

Here the total score can consist of different scoring functions including grid-based score, multigrid FPS score, continuous score, footprint score, pharmacophore matching similarity score, Tanimoto score, Hungarian matching similarity score and volume overlap score.
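
As a linear combination, the total is simply a weighted sum of the component scores. The sketch below illustrates this; the component names and weights are hypothetical labels chosen for the example, not DOCK parameter names.

```python
def descriptor_score(components, weights):
    """Weighted sum of component scores.

    components: dict mapping a component label to its score
    weights:    dict mapping the same labels to user-chosen weights
    """
    # A missing weight raises KeyError, so every component must be weighted explicitly
    return sum(weights[name] * value for name, value in components.items())

# Hypothetical example: one interaction term plus two similarity terms
example = descriptor_score(
    {"grid_score": -40.0, "fms_score": 2.0, "tanimoto": 0.5},
    {"grid_score": 1.0, "fms_score": 5.0, "tanimoto": -10.0},
)
```

A negative weight is used here for the Tanimoto term on the assumption that higher similarity should lower (improve) the total score.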

These scoring functions can be categorized into two groups, interaction-based scoring functions and similarity-based scoring functions.

Interaction-based scoring functions include grid score, multigrid FPS score, continuous score, and footprint score. These four scoring functions report the interaction energy between the ligand and the receptor in either Cartesian space (continuous score and footprint score) or grid space (grid score and multigrid FPS score), and the program does not allow users to combine Cartesian-space scoring functions with grid-space ones. Users can, however, combine continuous score and footprint score. Grid score is inherently computed in multigrid FPS score, so these two scores are typically not combined. Descriptor score can also be used without choosing any interaction-based scoring function.
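
The rule about not combining Cartesian-space and grid-space terms can be expressed as a small validation step. This is a sketch of the rule as stated in the text, with hypothetical labels for the four interaction-based scoring functions; it does not reproduce DOCK's own input checking.

```python
# Hypothetical labels for the four interaction-based scoring functions
CARTESIAN_SPACE = {"continuous_score", "footprint_similarity_score"}
GRID_SPACE = {"grid_score", "multigrid_score"}

def check_interaction_terms(selected):
    """Raise ValueError if Cartesian-space and grid-space terms are mixed."""
    chosen = set(selected)
    if chosen & CARTESIAN_SPACE and chosen & GRID_SPACE:
        raise ValueError("cannot combine Cartesian-space and grid-space scoring functions")
    return True
```

Selecting no interaction-based term at all passes the check, matching the option of using only similarity-based terms.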

Similarity-based scoring functions include pharmacophore matching similarity score, Tanimoto score, Hungarian matching similarity score, and volume overlap score. This set of scoring functions always requires a reference for comparison, and users can choose any number of scoring functions from this group for the desired descriptor score formula.