2021 DOCK tutorial 4 with PDBID 1EFY

From Rizzo_Lab
Revision as of 07:30, 5 April 2021 by Stonybrook (talk | contribs) (Single Molecule Docking and Pose Reproduction)
Jump to: navigation, search



DOCK is a member of a large suite of molecular docking packages. Notably, it was the first of its kind, and it was created by Irwin Kuntz’s group in the 1980s. The original authors of DOCK include Brian K. Shoichet, David A. Case, and Robert C.Rizzo (link respective group pages). Molecular docking plays a key role in computer aided drug design (CADD). Generally, molecular docking allows us to explore the interactions between small molecules, or ligands, and proteins by evaluating the shape and energetics of the ligand. Add more about the specific algorithm. We will be using the 6.9 release of DOCK, the latest as of November 2018, for this tutorial. New features of DOCK6.9 include DOCK_DN, a new ligand searching method for ligand de novo design and fragment library generation, which allows users to generate their own fragment libraries from mol2 files. For a comprehensive list of what’s new in DOCK6.9, see: http://dock.compbio.ucsf.edu/DOCK_6/new_in_6.9.txt (link).


UCSF Chimera is a standalone program that allows us to visualize and analyze macromolecules. General features of Chimera include automatic identification of atom, hydrogen addition and partial charge assignment, measurements of distances, angles, surface area, volume, and many more. For a complete description of everything Chimera can do, see https://www.rbvi.ucsf.edu/chimera/.


In this tutorial, we will be working with the crystal structure of a member of the Poly (ADP-Ribose) polymerase (PARP) family. Specifically, we will be working with the catalytic fragment of PARP along with a benzimidazole inhibitor. PARP family proteins are involved in DNA repair. It has been shown that PARP may allow cancer cells to repair their DNA after exposure to chemotherapy, which allows for resistance to develop. As such, it is of particular interest to develop novel inhibitors for PARP. Goal for this tutorial: The goal for this tutorial is to demonstrate an end to end example of performing a virtual screen

Directory Setup

Before we begin our molecular docking and virtual screen, we should setup our directory structure:


Preparing the Ligand and Receptor

PDB Structure

Visit https://www.rcsb.org/ and search 1EFY. Click on Download files > PDB Format This file contains the crystal structure of the catalytic fragment of Poly (ADP-ribose) Polymerase complexed with a benzimidazole inhibitor. We will be visualizing the structure in Chimera. Open Chimera and click File > Open Select the PDB file that you have just downloaded

Unedited pdb.png


In order to prepare our system for DOCK, we must perform some preprocessing steps. Broadly speaking, we will be creating two new mol2 files out of the original PDB file: one for the receptor and one for the ligand. Both of these new structures will be “clean,” that is, they will not contain any of the extra ions that were present during the experiment but not biologically relevant. We will first create the mol2 file for the receptor. Using the original PDB structure, we will remove everything but the receptor. There are many ways to do this, but the easiest way is the following: Click Select > Residue > all nonstandard The benzimidazole inhibitor and the two water molecules should now be highlighted like so

Selected residues.png

Now select Actions > Atoms/Bonds > delete

1efy deleted residues.png

This leaves us with a “cleaned” version of the receptor which we can now save as a mol2 file. Click File > Save Mol2… and save the file as “1EFY_rec_woH.mol2” where “woH” indicates that this is the receptor without hydrogens. Next we must add hydrogens and charges so that … // add why it’s important Select Tools > Structure Editing > AddH A window should pop up like this:

1efy add hydrogens.png

Select “OK.” Now each residue in the receptor should be properly protonated. However, it is important that we validate the hydrogens that Chimera has added, as they may not be correct. Similarly, we must add partial charges to each atom: Select Tools > Structure Editing > Add Charge. Make sure that AM1-BCC is selected in the window that pops up and select “OK”: Now each atom of the receptor should have the appropriate partial charge. As with the protonation states, it is important to doublecheck Chimera’s work. This can be done by // This receptor, which has now been prepared for DOCK, can be saved as a mol2 file named “1EFY_rec_dockprep.mol2”

Ligand Preparation

Just as we preprocessed the receptor, we will now prepare the ligand. Open the original PDB structure in Chimera Now, we will be deleting all of the constituents in the structure except for the benzimidazole inhibitor. Click Select > Residue > BZC. This should highlight the inhibitor. Click Select > Invert (all models). This should highlight everything except the inhibitor. Click Actions > Atoms/Bonds > delete Only the ligand should now be present:

1efy ligand noH.png

We will save this mol2 file File > save as mol2 > 1EFY_ligand_noH.mol2” where noH indicates no hydrogens Just as before, we must now add the hydrogens and charge.

Bzc wH.png

Add charge: All four of the files should now be scp'd to the 001 directory.

Surface Generation and Spehere Selection

Identification of the binding site on a receptor is driven by identifying its concave surfaces. Once the molecular surface has been calculated, we can generate overlapping spheres to describe the negative image of the molecular surface. The centers of the spheres constituting the largest cluster represents possible positions where the ligand can bind. In Chimera, we can calculate the molecular surface according to the following steps: Open “1EFY_rec_woH.mol2” Select Actions > Surface > Show

1efy surface.png

This shows the solvent excluded molecular surface of the receptor. Now we can use DMS, an open source program, to calculate the molecular surface of the receptor. Tools > Structure Editing > Write DMS and save the file name as “1EFY_rec_surphace.dms” This file can be moved to 002 Now that the molecular surface has been calculated and written to the DMS file, we can construct spheres to fill in any concave regions of the surface of the receptor by using the sphgen program in DOCK. The centers of the spheres represent positions of ligand atoms. The template for sphere generation using sphgen looks like the following:

rec.ms #molecular surface file
R #sphere outside of surface (R) or inside surface (L)
X #specifies subset of surface points to be used (X=all points)
0.0 #prevents generation of large spheres with close surface contacts (default=0.0)
4.0 #maximum sphere radius in angstroms (default=4.0)
1.4 #minimum sphere radius in angstroms (default=radius of probe)
rec.sph #clustered spheres file

Our file looked like this:


You can run the sphgen program like this:

sphgen -i INSPH -o OUTSPH

Chimera can read the output of the sphgen program

1efy spheres.png
1efy wspheres.png

Generating Box and Grid

Many individual floating point calculations are required for the computation of Lennard-Jones and Coulombic Interactions. Calculating the nonbonded interactions for a macromolecule with 100,000 atoms would be prohibitively expensive. We can calculate the overlapping volume of two sets of atoms using a grid based method. The intuition behind this method is the following: Generate a bounding box that contains all of the atoms. Generate grid points with a predefined separation. Tradeoff between grid separation and computational resources We will be using the accessory programs showbox and grid in order to generate the grids required for grid based scoring (which is based on the non-bonded terms of the molecular mechanic force field). The first step is to generate a box enclosing the active site of the receptor. Create an input file for showbox: Touch showbox.in

compute_grids                  yes
grid_spacing                   0.4
output_molecule                no
contact_score                  no
energy_score                   yes
energy_cutoff_distance         9999
atom_model                     a
attractive_exponent            6
repulsive_exponent             9
distance_dielectric            yes
dielectric_factor              4
bump_filter                    yes
bump_overlap                   0.75
receptor_file                  ../00.files/1efy_receptor.mol2
box_file                       1EFY.box.pdb
vdw_definition_file            /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
score_grid_prefix              grid

We can pipe in the input file into the showbox with the command “showbox < box.in” The output of the program will be a PDB file containing the structure of the bounding box

  • The grid calculation can take up to 45 minutes*

The Grid program generates the grid that allows for rapid score evaluation in DOCK

1efy box.png

Single Molecule Energy Minimization

In order to meaningfully dock new ligands into the receptor, the current model must be verified for its accuracy. To accomplish this, the co-crystallized ligand will be energy minimized and docked into the 1efy receptor. The resulting pose should closely resemble the crystal structure. If significant deviations are noted, the ligand and receptor should be scrutinized for correct charge, protonation, and resonance.

First the input file should be constructed

touch min.in
dock6 -i min.in

Input the following

conformer_search_type                                        rigid
use_internal_energy                                          yes
internal_energy_rep_exp                                      12
internal_energy_cutoff                                       100.0
ligand_atom_file                                             ../001.dockprep/3jqz_lig_withH.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               yes
use_rmsd_reference_mol                                yes      
rmsd_reference_filename                               ../001.dockprep/3jqz_lig_withH.mol2
use_database_filter                                          no
orient_ligand                                                no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           yes
grid_score_secondary                                         no
grid_score_rep_rad_scale                                     1
grid_score_vdw_scale                                         1
grid_score_es_scale                                          1
grid_score_grid_prefix                                       ../003.gridbox/grid
multigrid_score_secondary                                    no
dock3.5_score_secondary                                      no
continuous_score_secondary                                   no
footprint_similarity_score_secondary                         no
pharmacophore_score_secondary                                no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_score_secondary                                         no
amber_score_secondary                                        no
minimize_ligand                                              yes
simplex_max_iterations                                       1000
simplex_tors_premin_iterations                               0
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1.0
simplex_trans_step                                           1.0
simplex_rot_step                                             0.1
simplex_tors_step                                            10.0
simplex_random_seed                                          0
simplex_restraint_min                                        yes
simplex_coefficient_restraint                                10.0
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex_drive.tbl
ligand_outfile_prefix                                        3jqz.lig.min
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no

Virtual Screen

Cartesian Minimization