Difference between revisions of "2021 DOCK tutorial 4 with PDBID 1EFY"

From Rizzo_Lab
Jump to: navigation, search
(Energy Minimization)
 
(82 intermediate revisions by the same user not shown)
Line 1: Line 1:
 
=Introduction=
 
=Introduction=
 
==DOCK==
 
==DOCK==
DOCK is a member of a large suite of molecular docking packages. Notably, it was the first of its kind, and it was created by Irwin Kuntz’s group in the 1980s. The original authors of DOCK include Brian K. Shoichet, David A. Case, and Robert C.Rizzo (link respective group pages). Molecular docking plays a key role in computer aided drug design (CADD). Generally, molecular docking allows us to explore the interactions between small molecules, or ligands, and proteins by evaluating the shape and energetics of the ligand. Add more about the specific algorithm. We will be using the 6.9 release of DOCK, the latest as of November 2018, for this tutorial. New features of DOCK6.9 include DOCK_DN, a new ligand searching method for ligand de novo design and fragment library generation, which allows users to generate their own fragment libraries from mol2 files. For a comprehensive list of what’s new in DOCK6.9, see: http://dock.compbio.ucsf.edu/DOCK_6/new_in_6.9.txt (link).
+
DOCK is a member of a large suite of molecular docking packages. Notably, it was the first molecular docking package of its kind, and it was created by [http://dock.compbio.ucsf.edu/group/group.html Irwin Kuntz’s group] in the 1980s. The original authors of DOCK include [http://shoichetlab.compbio.ucsf.edu/ Brian K. Shoichet], [http://casegroup.rutgers.edu/ David A. Case], and [http://ringo.ams.sunysb.edu/~rizzo/StonyBrook/index.html Robert C.Rizzo]. Molecular docking plays a key role in computer aided drug design (CADD). Generally, molecular docking allows us to explore the interactions between small molecules, or ligands, and proteins by evaluating the shape and energetics of the ligand. We will be using the 6.9 release of DOCK, the latest as of November 2018, for this tutorial. New features of DOCK6.9 include DOCK_DN, a new ligand searching method for ligand de novo design and fragment library generation, which allows users to generate their own fragment libraries from mol2 files. For a comprehensive list of what’s new in DOCK6.9, [http://dock.compbio.ucsf.edu/DOCK_6/new_in_6.9.txt click here.]
 +
 
 +
In this tutorial, we will be using DOCK to predict the interactions between a small molecule inhibitor and a protein target. Here is the general workflow that we will be following:
 +
*Download a PDB file from the [https://www.rcsb.org/ Protein Data Bank]
 +
*Add protons and partial charges to the small molecule inhibitor and the target (independently).
 +
*Calculate the molecular surface of the target protein.
 +
*Generate spheres surrounding the binding site.
 +
*Calculate the bounding box and energy grid for the target protein.
 +
*Match the ligand atoms to the sphere centers.
 +
*Score the putative ligand atom centers using the energy grid.
 +
*Rank the orientations by score.
 +
 
 
==Chimera==
 
==Chimera==
 
UCSF Chimera is a standalone program that allows us to visualize and analyze macromolecules. General features of Chimera include automatic identification of atom, hydrogen addition and partial charge assignment, measurements of distances, angles, surface area, volume, and many more. For a complete description of everything Chimera can do, see https://www.rbvi.ucsf.edu/chimera/.  
 
UCSF Chimera is a standalone program that allows us to visualize and analyze macromolecules. General features of Chimera include automatic identification of atom, hydrogen addition and partial charge assignment, measurements of distances, angles, surface area, volume, and many more. For a complete description of everything Chimera can do, see https://www.rbvi.ucsf.edu/chimera/.  
 
==1EFY==
 
==1EFY==
In this tutorial, we will be working with the crystal structure of a member of the Poly (ADP-Ribose) polymerase (PARP) family. Specifically, we will be working with the catalytic fragment of PARP along with a benzimidazole inhibitor. PARP family proteins are involved in DNA repair. It has been shown that PARP may allow cancer cells to repair their DNA after exposure to chemotherapy, which allows for resistance to develop. As such, it is of particular interest to develop novel inhibitors for PARP.  
+
In this tutorial, we will be working with the crystal structure of a member of the Poly (ADP-Ribose) polymerase (PARP) family. Specifically, we will be working with the catalytic fragment of PARP along with a benzimidazole inhibitor. This crystal structure has a resolution of 2.20 Å and an R-Value free of 0.274. PARP family proteins are involved in DNA repair, and it has been shown that they may allow cancer cells to repair their DNA after exposure to chemotherapy, which allows for resistance to develop [https://pubs.acs.org/doi/10.1021/jm000950v]. As such, it is of particular interest to develop novel inhibitors for PARP. The goal for this tutorial is to demonstrate an end-to-end example of docking the benzimidazole inhibitor against the PARP target, as well as perform a virtual screen to find other small molecule inhibitors of PARP.
Goal for this tutorial: The goal for this tutorial is to demonstrate an end to end example of performing a virtual screen  
+
 
 +
==Software Requirements==
 +
In order to successfully follow this tutorial, please make sure to have the following programs installed:
 +
*[https://mobaxterm.mobatek.net/ MobaXterm] (for Windows users)
 +
*[https://www.cgl.ucsf.edu/chimera/download.html Chimera 1.15] (Please choose the right specifications for your OS)
 +
*DOCK 6.9
 +
 
 +
We will be using a shell environment (either the built-in terminal for MacOS/Linux users or MobaXTerm for Windows users) to run several of the programs in this tutorial. As such, it may be useful to familiarize yourself with running [https://lweb.cfa.harvard.edu/rtdc/shell_cmds/basic/ basic UNIX commands] and [https://www.openvim.com/ VIM].
 +
 
 +
In addition, this tutorial assumes you have access to a High Performance Computing cluster, i.e. [https://it.stonybrook.edu/help/kb/understanding-seawulf Seawulf].
 +
 
 
==Directory Setup==
 
==Directory Setup==
Before we begin our molecular docking and virtual screen, we should setup our directory structure:
+
Before we begin our molecular docking and virtual screen, it will be useful to establish our directory structure.
  mkdir INSERT DIRECTORY NAMES
+
 
 +
First ssh into your account on Seawulf:
 +
ssh <your netid>@login.seawulf.stonybrook.edu
 +
Navigate into your student directory using the '''cd''' command.
 +
Once you are in your student directory, create a new directory for this tutorial and name it 1EFY. The shell command to make a new directory is '''mkdir'''.
 +
mkdir 1EFY
 +
Now move into the 1EFY directory. Next, we will be creating the following directories:
 +
  mkdir 001.structure 002.surface_spheres 003.gridbox 004.dock 005.virtual_screen 006.virtual_screen_mpi 007.cartesian_min 008.rescore
 +
In order to move back up the tree out of the current directory, use:
 +
cd ../
 +
 
 
=Preparing the Ligand and Receptor=
 
=Preparing the Ligand and Receptor=
 
==PDB Structure==
 
==PDB Structure==
Visit https://www.rcsb.org/ and search 1EFY.
+
For this tutorial, we will be utilizing the Protein Data Bank for information about our target protein. The Protein Databank is a database containing the structural data for biological macromolecules. Visit the [https://www.rcsb.org/ Protein Data Bank] and search for 1EFY. You should be redirected to a page containing the X-ray crystal structure of our protein, as well as the primary literature about the protein.
Click on Download files > PDB Format
+
 
This file contains the crystal structure of the catalytic fragment of Poly (ADP-ribose) Polymerase complexed with a benzimidazole inhibitor.
+
Click on '''Download files > PDB Format.'''
 +
This file contains the crystal structure of the catalytic fragment of Poly (ADP-ribose) Polymerase complexed with a benzimidazole inhibitor. It contains information about the atomic coordinates, secondary structure assignments, and atomic connectivity of the protein and ligand.
 +
 
 
We will be visualizing the structure in Chimera.
 
We will be visualizing the structure in Chimera.
Open Chimera and click File > Open
+
Open Chimera and click '''File > Open.'''
Select the PDB file that you have just downloaded
+
Once we select the PDB file that we have just downloaded, we should see the following.
[[File:Unedited_pdb.png|thumb|center|800px]]
+
[[File:Unedited_pdb.png|thumb|center|700px]]
==Receptor==
+
We can see the receptor, the benzimidazole inhibitor, and two water molecules. In order to see the complex from different angles or positions, we may perform three dimensional manipulations. It is possible to rotate, translate, or scale the the view of the structures using the mouse. Please refer to [https://www.cgl.ucsf.edu/chimera/docs/UsersGuide/mouse.html this documentation] to see which mouse controls are appropriate for your setup.
In order to prepare our system for DOCK, we must perform some preprocessing steps. Broadly speaking, we will be creating two new mol2 files out of the original PDB file: one for the receptor and one for the ligand. Both of these new structures will be “clean,” that is, they will not contain any of the extra ions that were present during the experiment but not biologically relevant.
+
 
We will first create the mol2 file for the receptor. Using the original PDB structure, we will remove everything but the receptor. There are many ways to do this, but the easiest way is the following:
+
We will next describe how to separate the receptor and ligand into two separate MOL2 formatted files.
Click Select > Residue > all nonstandard
+
 
The benzimidazole inhibitor and the two water molecules should now be highlighted like so  
+
==Preparing the Receptor==
[[File:Selected_residues.png|thumb|center|800px]]
+
In order to prepare our system for DOCK, we must perform some preprocessing steps, since DOCK can only process Tripos MOL2 formatted files as input. Luckily, we can use Chimera to create two new MOL2 files out of the original PDB file: one for the receptor and one for the ligand. In addition, prior to docking, we must also use Chimera to specify the charges for every atom in the receptor and ligand.  
 +
 
 +
Let's first create the MOL2 file for the receptor. Using the original PDB structure, we will strip everything but the receptor. In Chimera, choose
 +
Select > Residue > all nonstandard
 +
The benzimidazole inhibitor and the two water molecules should now be highlighted like this:  
 +
[[File:Selected_residues.png|thumb|center|600px]]
 +
Again in Chimera, select
 +
Actions > Atoms/Bonds > delete
 +
This leaves us with a “cleaned” version of the receptor:
 +
[[File:1efy_deleted_residues.png|thumb|center|600px]]
 +
We can now save the receptor as a MOL2 file. Click
 +
File > Save Mol2… 
 +
Save the file as '''1EFY_rec_woH.mol2''' where '''woH''' indicates the receptor without hydrogens.
 +
Next we must add hydrogens and charges, which is required for docking. Select
 +
Tools > Structure Editing > AddH
 +
A window should pop up like this:
 +
[[File:1efy_add_hydrogens.png|thumb|center|600px]]
 +
Select '''OK.''' Now each residue in the receptor should be properly protonated. However, it is important that we validate the hydrogens that Chimera has added, as they may not be correct.
 +
 
 +
Similarly, we must add partial charges to each atom in the receptor. Select
 +
Tools > Structure Editing > Add Charge.
 +
Make sure that '''AM1-BCC''' is selected in the window that pops up and select '''OK.'''
 +
Now each atom of the receptor should have the appropriate partial charge. As with the protonation states, it is important to double check Chimera’s work.
 +
 
 +
This receptor, which has now been prepared for DOCK, can be saved as a MOL2 file named '''1EFY_rec_dockprep.mol2.'''
 +
 
 +
==Ligand Preparation==
 +
We will now prepare the ligand in a similar fashion.
 +
Open the original PDB structure in Chimera.
 +
In this case, we will be deleting all of the constituents in the structure ''except'' for the benzimidazole inhibitor. Click
 +
Select > Residue > BZC.
 +
This should highlight the inhibitor. Click
 +
Select > Invert (all models).
 +
This should highlight everything ''except'' the inhibitor. Click
 +
Actions > Atoms/Bonds > delete
 +
Only the ligand should now be present:
 +
[[File:1efy_ligand_noH.png|thumb|center|600px]]
 +
We will save this structure as MOL2 file.
 +
File > save as mol2 > 1EFY_ligand_noH.mol2
 +
where noH indicates no hydrogens.
 +
Just as before, we must now add the hydrogens and charge.
 +
Tools > Structure Editing > AddH
 +
The ouput should look like:
 +
[[File:bzc_wH.png|thumb|center|600px]]
 +
In its current state, the ligand protonation is incorrect and will lead to complications during docking. To fix this it is important to forcibly protonate the N4 atom.
 +
Tool > Structure Editing > Build Structure
 +
Set the number of bonds at position N4 to 3
 +
[[File:1efyBuildStructure.PNG|thumb|center|600px]]
 +
Click Apply. Add charges as shown previously
 +
We will save this charged and protonated structure as a MOL2 file:
 +
File > save as mol2 > 1EFY_ligand_H.mol2
 +
[[File:1efyligandwH.PNG|thumb|center|600px]]
 +
Now that we have prepared our files for DOCK, we can scp or rsync all four of the files into the 001.structure directory.
 +
 
 +
=Surface Generation and Sphere Selection=
 +
Identification of the binding site on a receptor is driven by identifying its concave surfaces. Once the molecular surface has been identified, we can generate overlapping spheres to describe the negative image of the molecular surface. The centers of the spheres composing the largest cluster represent potential positions where the ligand can bind.
 +
==Calculate the Molecular Surface==
 +
In Chimera, we can calculate the molecular surface according to the following steps:
 +
Open '''1EFY_rec_woH.mol2''' in Chimera. Select
 +
Actions > Surface > Show
 +
[[File:1efy_surface.png|thumb|center|600px]]
 +
This shows the solvent excluded molecular surface of the receptor.
 +
We can use DMS, an open source program, to calculate the molecular surface of the receptor. Select
 +
Tools > Structure Editing > Write DMS
 +
Save the file name as '''1EFY_rec_surphace.dms'''
 +
This file can be uploaded to 002.surface_spheres.
 +
==Generate Spheres==
 +
Now that the molecular surface has been calculated and written to the DMS file, we can construct spheres to fill in any concave regions of the surface of the receptor by using the '''sphgen''' program in DOCK. The centers of the spheres represent putative positions of ligand atoms. Navigate to the 002.surface_spheres directory.
 +
The template for sphere generation using sphgen looks like the following:
 +
INPUT FILE*:
 +
rec.ms #molecular surface file
 +
R #sphere outside of surface (R) or inside surface (L)
 +
X #specifies subset of surface points to be used (X=all points)
 +
0.0 #prevents generation of large spheres with close surface contacts (default=0.0)
 +
4.0 #maximum sphere radius in angstroms (default=4.0)
 +
1.4 #minimum sphere radius in angstroms (default=radius of probe)
 +
rec.sph #clustered spheres file
 +
Our file looked like the following:
 +
1efy_rec_surface.dms
 +
R
 +
X
 +
0.0
 +
4.0
 +
1.4
 +
1EFY_rec_woH.sph
 +
 
 +
You can run the sphgen program like this:
 +
sphgen -i INSPH -o OUTSPH
 +
Chimera can read the output of the sphgen program.
 +
 
 +
The generated spheres should look like the following:
 +
[[File:1efy_spheres.png|thumb|center|600px]]
 +
 
 +
Now, the juxtaposition of the surface spheres with the receptor should look like the following:
 +
Notice how the spheres are clustered near the active site.
 +
[[File:1efy_wspheres.png|thumb|center|600px]]
 +
 
 +
=Generating Box and Grid=
 +
Many individual floating point calculations are required for the computation of Lennard-Jones and Coulombic Interactions requisite for docking.
 +
In order to make these calculations feasible, we can calculate the  overlapping volume of two sets of atoms using a grid based method. The intuition behind this method is the following:
 +
*Generate a bounding box that contains all of the atoms.
 +
*Generate grid points with a predefined separation.
 +
 
 +
==Generating the Box==
 +
We will be using the DOCK's accessory programs '''showbox''' and '''grid''' in order to generate the grids required for grid based scoring (which is based on the non-bonded terms of the molecular mechanic force field).
 +
The first step is to generate a box enclosing the active site of the receptor.
 +
Create an input file for showbox in the 003.gridbox directory.
 +
touch showbox.in
 +
 
 +
10.0
 +
../002.surface_spheres/selected_spheres.sph
 +
1
 +
receptor.box.pdb
 +
 
 +
 
 +
We can pipe in the input file into the showbox with the command '''showbox < box.in'''
 +
The output of the program will be a PDB file containing the structure of the bounding box, which can be visualized in Chimera.
 +
 
 +
==Generating the Grid==
 +
Next we will be generating the grid. In the same directory, create an input file called '''grid.in'''
 +
touch grid.in
 +
Now, edit the input file such that it looks like the following:
 +
compute_grids                            yes
 +
grid_spacing                              0.4
 +
output_molecule                          no
 +
contact_score                            no
 +
energy_score                              yes
 +
energy_cutoff_distance                    9999
 +
atom_model                                a
 +
attractive_exponent                      6
 +
repulsive_exponent                        9
 +
distance_dielectric                      yes
 +
dielectric_factor                        4
 +
bump_filter                              yes
 +
bump_overlap                              0.75
 +
receptor_file                            ../001.structure/1EFY_rec_dockprep.mol2
 +
box_file                                  1EFY.box.pdb
 +
vdw_definition_file                      /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
 +
score_grid_prefix                        grid
 +
In order to run the grid program, use the following command:
 +
grid -i grid.in -o gridinfo.out
 +
Note that the grid calculation can take up to 45 minutes.
 +
 
 +
The output of the showbox and grid calculation can be visualized in Chimera as the following:
 +
[[File:1efy_box.png|thumb|center|600px]]
 +
 
 +
=Single Molecule Docking and Pose Reproduction=
 +
==Energy Minimization==
 +
In order to meaningfully dock new ligands into the receptor, the current model must be verified for its accuracy. To accomplish this, the co-crystallized ligand will be energy minimized and docked into the 1efy receptor. The resulting pose should closely resemble the crystal structure. If significant deviations are noted, the ligand and receptor should be scrutinized for correct charge, protonation, and resonance.
 +
 
 +
Move to the 004.dock directory and create the input file:
 +
touch min.in
 +
 
 +
dock6 -i min.in
 +
 
 +
Input the following
 +
conformer_search_type                                        rigid
 +
use_internal_energy                                          yes
 +
internal_energy_rep_exp                                      9
 +
internal_energy_cutoff                                      100.0
 +
ligand_atom_file                                            ../001.structure/1EFY_ligand_H.mol2
 +
limit_max_ligands                                            no
 +
skip_molecule                                                no
 +
read_mol_solvation                                          no
 +
calculate_rmsd                                              yes
 +
use_rmsd_reference_mol                                      yes     
 +
rmsd_reference_filename                                      ../001.structure/1EFY_ligand_H.mol2
 +
use_database_filter                                          no
 +
orient_ligand                                                no
 +
bump_filter                                                  no
 +
score_molecules                                              yes
 +
contact_score_primary                                        no
 +
contact_score_secondary                                      no
 +
grid_score_primary                                          yes
 +
grid_score_secondary                                        no
 +
grid_score_rep_rad_scale                                    1
 +
grid_score_vdw_scale                                        1
 +
grid_score_es_scale                                          1
 +
grid_score_grid_prefix                                      ../003.gridbox/grid
 +
multigrid_score_secondary                                    no
 +
dock3.5_score_secondary                                      no
 +
continuous_score_secondary                                  no
 +
footprint_similarity_score_secondary                        no
 +
pharmacophore_score_secondary                                no
 +
descriptor_score_secondary                                  no
 +
gbsa_zou_score_secondary                                    no
 +
gbsa_hawkins_score_secondary                                no
 +
SASA_score_secondary                                        no
 +
amber_score_secondary                                        no
 +
minimize_ligand                                              yes
 +
simplex_max_iterations                                      1000
 +
simplex_tors_premin_iterations                              0
 +
simplex_max_cycles                                          1
 +
simplex_score_converge                                      0.1
 +
simplex_cycle_converge                                      1.0
 +
simplex_trans_step                                          1.0
 +
simplex_rot_step                                            0.1
 +
simplex_tors_step                                            10.0
 +
simplex_random_seed                                          0
 +
simplex_restraint_min                                        yes
 +
simplex_coefficient_restraint                                10.0
 +
atom_model                                                  all
 +
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
 +
flex_defn_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
 +
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
 +
ligand_outfile_prefix                                        1EFY_ligand_min
 +
write_orientations                                          no
 +
num_scored_conformers                                        1
 +
rank_ligands                                                no
 +
 
 +
==Rigid Docking==
 +
Rigid docking is the simplest and lowest cost of the three main docking methods, as it essentially attempts to dock the desired ligand in its native pose, only allowing 3d transformation and rotations, as well as very minor torsional corrections.
 +
 
 +
Create the input file
 +
touch rigid.in
 +
 
 +
dock6 -i rigid.in
 +
 
 +
conformer_search_type                                        rigid
 +
use_internal_energy                                          yes
 +
internal_energy_rep_exp                                      12
 +
internal_energy_cutoff                                      100.0
 +
ligand_atom_file                                            1EFY_ligand_min_scored.mol2
 +
limit_max_ligands                                            no
 +
skip_molecule                                                no
 +
read_mol_solvation                                          no
 +
calculate_rmsd                                              yes
 +
use_rmsd_reference_mol                                      yes
 +
rmsd_reference_filename                                      1EFY_ligand_min_scored.mol2
 +
use_database_filter                                          no
 +
orient_ligand                                                yes
 +
automated_matching                                          yes
 +
receptor_site_file                                          ../002.surface_spheres/selected_spheres.sph
 +
max_orientations                                            1000
 +
critical_points                                              no
 +
chemical_matching                                            no
 +
use_ligand_spheres                                          no
 +
bump_filter                                                  no
 +
score_molecules                                              yes
 +
contact_score_primary                                        no
 +
contact_score_secondary                                      no
 +
grid_score_primary                                          yes
 +
grid_score_secondary                                        no
 +
grid_score_rep_rad_scale                                    1
 +
grid_score_vdw_scale                                        1
 +
grid_score_es_scale                                          1
 +
grid_score_grid_prefix                                      ../003.gridbox/grid
 +
multigrid_score_secondary                                    no
 +
dock3.5_score_secondary                                      no
 +
continuous_score_secondary                                  no
 +
footprint_similarity_score_secondary                        no
 +
pharmacophore_score_secondary                                no
 +
descriptor_score_secondary                                  no
 +
gbsa_zou_score_secondary                                    no
 +
gbsa_hawkins_score_secondary                                no
 +
SASA_score_secondary                                        no
 +
amber_score_secondary                                        no
 +
minimize_ligand                                              yes
 +
simplex_max_iterations                                      1000
 +
simplex_tors_premin_iterations                              0
 +
simplex_max_cycles                                          1
 +
simplex_score_converge                                      0.1
 +
simplex_cycle_converge                                      1.0
 +
simplex_trans_step                                          1.0
 +
simplex_rot_step                                            0.1
 +
simplex_tors_step                                            10.0
 +
simplex_random_seed                                          0
 +
simplex_restraint_min                                        no
 +
atom_model                                                  all
 +
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
 +
flex_defn_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
 +
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
 +
ligand_outfile_prefix                                        rigid.out
 +
write_orientations                                          no
 +
num_scored_conformers                                        1
 +
rank_ligands                                                no
 +
 
 +
At this point, it is important to visualize the resulting pose in comparison to the crystallized ligand. Open both in Chimera along with the receptor file to ensure no major changes have occurred
 +
 
 +
[[File:1efy rigid.png|thumb|center|800px]]
 +
 
 +
==Fixed Anchor Docking==
 +
In fixed anchor docking, the ligand of interest is "anchored" into the binding site of the receptor, and rotatable bonds and torsions are adjusted to minimize the energy of the structure. As this method allows for greater variance of the ligand, but still maintains a particular location in the receptor, this method blends accuracy and efficiency when compared to rigid and flexible methods.
 +
 
 +
Create an input file
 +
touch fixed.in
 +
 +
 
 +
dock6 -i fixed.in
 +
 
 +
conformer_search_type                                        flex
 +
write_fragment_libraries                                    no
 +
user_specified_anchor                                        no
 +
limit_max_anchors                                            no
 +
min_anchor_size                                              5
 +
pruning_use_clustering                                      yes
 +
pruning_max_orients                                          1000
 +
pruning_clustering_cutoff                                    100
 +
pruning_conformer_score_cutoff                              100.0
 +
pruning_conformer_score_scaling_factor                      1.0
 +
use_clash_overlap                                            no
 +
write_growth_tree                                            no
 +
use_internal_energy                                          yes
 +
internal_energy_rep_exp                                      12
 +
internal_energy_cutoff                                      100.0
 +
ligand_atom_file                                            1EFY_ligand_min_scored.mol2
 +
limit_max_ligands                                            no
 +
skip_molecule                                                no
 +
read_mol_solvation                                          no
 +
calculate_rmsd                                              yes
 +
use_rmsd_reference_mol                                      yes
 +
rmsd_reference_filename                                      1EFY_ligand_min_scored.mol2
 +
use_database_filter                                          no
 +
orient_ligand                                                no
 +
bump_filter                                                  no
 +
score_molecules                                              yes
 +
contact_score_primary                                        no
 +
contact_score_secondary                                      no
 +
grid_score_primary                                          yes
 +
grid_score_secondary                                        no
 +
grid_score_rep_rad_scale                                    1
 +
grid_score_vdw_scale                                        1
 +
grid_score_es_scale                                          1
 +
grid_score_grid_prefix                                      ../003.gridbox/grid
 +
multigrid_score_secondary                                    no
 +
dock3.5_score_secondary                                      no
 +
continuous_score_secondary                                  no
 +
footprint_similarity_score_secondary                        no
 +
pharmacophore_score_secondary                                no
 +
descriptor_score_secondary                                  no
 +
gbsa_zou_score_secondary                                    no
 +
gbsa_hawkins_score_secondary                                no
 +
SASA_score_secondary                                        no
 +
amber_score_secondary                                        no
 +
minimize_ligand                                              yes
 +
minimize_anchor                                              yes
 +
minimize_flexible_growth                                    yes
 +
use_advanced_simplex_parameters                              no
 +
simplex_max_cycles                                          1
 +
simplex_score_converge                                      0.1
 +
simplex_cycle_converge                                      1
 +
simplex_trans_step                                          1
 +
simplex_rot_step                                            0.1
 +
simplex_tors_step                                            10.0
 +
simplex_anchor_max_iterations                                500
 +
simplex_grow_max_iterations                                  500
 +
simplex_grow_tors_premin_iterations                          0
 +
simplex_random_seed                                          0
 +
simplex_restraint_min                                        no
 +
atom_model                                                  all
 +
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
 +
flex_defn_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
 +
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
 +
ligand_outfile_prefix                                        fixed.out
 +
write_orientations                                          no
 +
num_scored_conformers                                        1
 +
rank_ligands                                                no
 +
 
 +
[[File:1efyfixedanchor.PNG|thumb|center|800px]]
 +
 
 +
==Flexible Docking==
 +
In flexible docking, the rotatable bonds and torsions, as well as the ligand location is searched within the defined box and grid. For this reason, flexible methods are typically the most meaningful of the three methods when docking new ligands to the receptor of interest.
 +
 
 +
Create and input file:
 +
touch flex.in
 +
 
 +
dock6 -i flex.in
 +
 
 +
conformer_search_type                                        flex
 +
write_fragment_libraries                                    no
 +
user_specified_anchor                                        no
 +
limit_max_anchors                                            no
 +
min_anchor_size                                              5
 +
pruning_use_clustering                                      yes
 +
pruning_max_orients                                          1000
 +
pruning_clustering_cutoff                                    100
 +
pruning_conformer_score_cutoff                              100.0
 +
pruning_conformer_score_scaling_factor                      1.0
 +
use_clash_overlap                                            no
 +
write_growth_tree                                            no
 +
use_internal_energy                                          yes
 +
internal_energy_rep_exp                                      12
 +
internal_energy_cutoff                                      100.0
 +
ligand_atom_file                                            1EFY_ligand_min_scored.mol2
 +
limit_max_ligands                                            no
 +
skip_molecule                                                no
 +
read_mol_solvation                                          no
 +
calculate_rmsd                                              yes
 +
use_rmsd_reference_mol                                      yes
 +
rmsd_reference_filename                                      1EFY_ligand_min_scored.mol2
 +
use_database_filter                                          no
 +
orient_ligand                                                yes
 +
automated_matching                                          yes
 +
receptor_site_file                                          ../002.surface_spheres/selected_spheres.sph
 +
max_orientations                                            1000
 +
critical_points                                              no
 +
chemical_matching                                            no
 +
use_ligand_spheres                                          no
 +
bump_filter                                                  no
 +
score_molecules                                              yes
 +
contact_score_primary                                        no
 +
contact_score_secondary                                      no
 +
grid_score_primary                                          yes
 +
grid_score_secondary                                        no
 +
grid_score_rep_rad_scale                                    1
 +
grid_score_vdw_scale                                        1
 +
grid_score_es_scale                                          1
 +
grid_score_grid_prefix                                      ../003.gridbox/grid
 +
multigrid_score_secondary                                    no
 +
dock3.5_score_secondary                                      no
 +
continuous_score_secondary                                  no
 +
footprint_similarity_score_secondary                        no
 +
pharmacophore_score_secondary                                no
 +
descriptor_score_secondary                                  no
 +
gbsa_zou_score_secondary                                    no
 +
gbsa_hawkins_score_secondary                                no
 +
SASA_score_secondary                                        no
 +
amber_score_secondary                                        no
 +
minimize_ligand                                              yes
 +
minimize_anchor                                              yes
 +
minimize_flexible_growth                                    yes
 +
use_advanced_simplex_parameters                              no
 +
simplex_max_cycles                                          1
 +
simplex_score_converge                                      0.1
 +
simplex_cycle_converge                                      1.0
 +
simplex_trans_step                                          1.0
 +
simplex_rot_step                                            0.1
 +
simplex_tors_step                                            10.0
 +
simplex_anchor_max_iterations                                500
 +
simplex_grow_max_iterations                                  500
 +
simplex_grow_tors_premin_iterations                          0
 +
simplex_random_seed                                          0
 +
simplex_restraint_min                                        no
 +
atom_model                                                  all
 +
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
 +
flex_defn_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
 +
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
 +
ligand_outfile_prefix                                        flex.out
 +
write_orientations                                          no
 +
num_scored_conformers                                        1
 +
rank_ligands                                                no
 +
 
 +
[[File:1efyflex.PNG|thumb|center|800px]]
 +
 
 +
==Molecular Footprint==
 +
A typical molecular footprint plot shows the van der waals and electrostatic interaction between the ligand and nearby amino acid residues. This is useful for determining which interactions are important for suitable affinity of the ligand. This also serves as an additional method to ensure the ligand wasn't significantly distorted in the minimization process.
 +
 
 +
First, generate an input file
 +
 
 +
touch footprint.in
 +
 
 +
dock6 -i footprint.in
 +
 
 +
conformer_search_type                                        rigid
 +
use_internal_energy                                          no
 +
ligand_atom_file                                            1EFY_ligand_min_scored.mol2
 +
limit_max_ligands                                            no
 +
skip_molecule                                                no
 +
read_mol_solvation                                          no
 +
calculate_rmsd                                              no
 +
use_database_filter                                          no
 +
orient_ligand                                                no
 +
bump_filter                                                  no
 +
score_molecules                                              yes
 +
contact_score_primary                                        no
 +
contact_score_secondary                                      no
 +
grid_score_primary                                          no
 +
grid_score_secondary                                        no
 +
multigrid_score_primary                                      no
 +
multigrid_score_secondary                                    no
 +
dock3.5_score_primary                                        no
 +
dock3.5_score_secondary                                      no
 +
continuous_score_primary                                    no
 +
continuous_score_secondary                                  no
 +
footprint_similarity_score_primary                          yes
 +
footprint_similarity_score_secondary                        no
 +
fps_score_use_footprint_reference_mol2                      yes
 +
fps_score_footprint_reference_mol2_filename                  ../001.structure/1EFY_ligand_H.mol2
 +
fps_score_foot_compare_type                                  Euclidean
 +
fps_score_normalize_foot                                    no
 +
fps_score_foot_comp_all_residue                              yes
 +
fps_score_receptor_filename                                  ../001.structure/1efy_rec_dockprep.mol2
 +
fps_score_vdw_att_exp                                        6
 +
fps_score_vdw_rep_exp                                        9
 +
fps_score_vdw_rep_rad_scale                                  1
 +
fps_score_use_distance_dependent_dielectric                  yes
 +
fps_score_dielectric                                        4.0
 +
fps_score_vdw_fp_scale                                      1
 +
fps_score_es_fp_scale                                        1
 +
fps_score_hb_fp_scale                                        0
 +
pharmacophore_score_secondary                                no
 +
descriptor_score_secondary                                  no
 +
gbsa_zou_score_secondary                                    no
 +
gbsa_hawkins_score_secondary                                no
 +
SASA_score_secondary                                        no
 +
amber_score_secondary                                        no
 +
minimize_ligand                                              no
 +
atom_model                                                  all
 +
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
 +
flex_defn_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
 +
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
 +
ligand_outfile_prefix                                        footprint.out
 +
write_footprints                                            yes
 +
write_hbonds                                                yes
 +
write_orientations                                          no
 +
num_scored_conformers                                        1
 +
rank_ligands                                                no
 +
 
 +
Upon completion, several .txt files should be produced. In order to convert these to .pdf files for easy viewing, use the plot_footprint_single_magnitude.py python script to show the results for the 50 highest scoring residues. The result should look similar to this:
 +
 
 +
[[File:1efyfootprint.PNG|thumb|center|800px]]
 +
 
 +
=Virtual Screen=
 +
Virtual screening is a widely used method for screening many drug-like molecules from a curated library for binding affinity via Dock in order to identify potential lead molecules. In this example we will be utilizing the VS_library_5k.mol2 library which contains over 5,000 drug-like molecules from the ZINC database. This library can be found in the /gpfs/projects/AMS536/2021/ directory. Copy this file into the 005.virtual_screen directory.
 +
 
 +
Create an input file:
 +
 
 +
touch virtual.in
 +
 
 +
dock6 -i virtual.in
 +
 
 +
conformer_search_type                                        flex
 +
write_fragment_libraries                                    no
 +
user_specified_anchor                                        no
 +
limit_max_anchors                                            no
 +
min_anchor_size                                              5
 +
pruning_use_clustering                                      yes
 +
pruning_max_orients                                          1000
 +
pruning_clustering_cutoff                                    100
 +
pruning_conformer_score_cutoff                              100.0
 +
pruning_conformer_score_scaling_factor                      1.0
 +
use_clash_overlap                                            no
 +
write_growth_tree                                            no
 +
use_internal_energy                                          yes
 +
internal_energy_rep_exp                                      9
 +
internal_energy_cutoff                                      100.0
 +
ligand_atom_file                                            VS_library_25K.mol2
 +
limit_max_ligands                                            no
 +
skip_molecule                                                no
 +
read_mol_solvation                                          no
 +
calculate_rmsd                                              no
 +
use_database_filter                                          no
 +
orient_ligand                                                yes
 +
automated_matching                                          yes
 +
receptor_site_file                                          ../002.surface_spheres/selected_spheres.sph
 +
max_orientations                                            1000
 +
critical_points                                              no
 +
chemical_matching                                            no
 +
use_ligand_spheres                                          no
 +
bump_filter                                                  no
 +
score_molecules                                              yes
 +
contact_score_primary                                        no
 +
contact_score_secondary                                      no
 +
grid_score_primary                                          yes
 +
grid_score_secondary                                        no
 +
grid_score_rep_rad_scale                                    1
 +
grid_score_vdw_scale                                        1
 +
grid_score_es_scale                                          1
 +
grid_score_grid_prefix                                      ../003.gridbox/grid
 +
multigrid_score_secondary                                    no
 +
dock3.5_score_secondary                                      no
 +
continuous_score_secondary                                  no
 +
footprint_similarity_score_secondary                        no
 +
pharmacophore_score_secondary                                no
 +
descriptor_score_secondary                                  no
 +
gbsa_zou_score_secondary                                    no
 +
gbsa_hawkins_score_secondary                                no
 +
SASA_score_secondary                                        no
 +
amber_score_secondary                                        no
 +
minimize_ligand                                              yes
 +
minimize_anchor                                              yes
 +
minimize_flexible_growth                                    yes
 +
use_advanced_simplex_parameters                              no
 +
simplex_max_cycles                                          1
 +
simplex_score_converge                                      0.1
 +
simplex_cycle_converge                                      1.0
 +
simplex_trans_step                                          1.0
 +
simplex_rot_step                                            0.1
 +
simplex_tors_step                                            10.0
 +
simplex_anchor_max_iterations                                500
 +
simplex_grow_max_iterations                                  500
 +
simplex_grow_tors_premin_iterations                          0
 +
simplex_random_seed                                          0
 +
simplex_restraint_min                                        no
 +
atom_model                                                  all
 +
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
 +
flex_defn_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
 +
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
 +
ligand_outfile_prefix                                        virtual.out
 +
write_orientations                                          no
 +
num_scored_conformers                                        1
 +
rank_ligands                                                no
 +
 
 +
 
 +
Allow the job to run for approximately a minute to ensure that it is working properly. '''Important: After ensuring the job is working, press ctrl+c to stop the job''' As docking several thousand molecules to the 1efy receptor requires a significant amount of computing power, we do not want to run the job on the head node. We will use slurm scripts to submit the job to the seawulf cluster. Before determining which queue to use, it may be beneficial to determine their availability. Use the command ''sinfo'' and look for queues with idle nodes. In addition, to prevent your job from being automatically rejected use [https://it.stonybrook.edu/help/kb/seawulf-queues] to determine the max time, min/max nodes, min/max cores, etc. for each particular queue. SCP the VS_library_5k.mol2 and virtual.in file into the 006.virtual_screen_mpi directory.
 +
 
 +
To submit the job to the queue create the following input file:
 +
vi virtual.sh
 +
 
 +
#!/bin/bash
 +
#SBATCH --time=1-8:00:00
 +
#SBATCH --nodes=1
 +
#SBATCH --ntasks=40
 +
#SBATCH --job-name=1EFY_vs
 +
#SBATCH --output=1EFY_vs.out
 +
#SBATCH -p extended-40core
 +
#SBATCH --mail-type=BEGIN,END
 +
#SBATCH --mail-user=EXAMPLE@stonybrook.edu
 +
 
 +
cd $SLURM_SUBMIT_DIR
 +
mpirun -np 40 dock6.mpi -i virtual.in -o virtual.out
 +
 
 +
This script will submit the job to the extended-40core queue, using 1 node, 40 cores, for a maximum of 8 hours. In addition, you can input your email address in the line '''#SBATCH --mail-user=EXAMPLE@stonybrook.edu''' , and you will be emailed at the start and end of the job.
 +
 
 +
Submit the job using :
 +
qsub virtual.sh
 +
 
 +
You can check the status of the job using:
 +
squeue -u NETID
 +
 
 +
And inputting your netid where the command says NETID.
 +
 
 +
=Cartesian Minimization=
 +
Once we have docked the library to the receptor, it is then necessary to perform a cartesian minimization of them. Move to the 007.cartesian_min directory.
 +
 
 +
Create the following input file:
 +
touch min.in
 +
 
 +
dock6 -i min.in
 +
 
 +
conformer_search_type                                        rigid
 +
use_internal_energy                                          yes
 +
internal_energy_rep_exp                                      9
 +
internal_energy_cutoff                                      100.0
 +
ligand_atom_file                                            ../006.virtual_screen_mpi/virtual.out_scored.mol2
 +
limit_max_ligands                                            no
 +
skip_molecule                                                no
 +
read_mol_solvation                                          no
 +
calculate_rmsd                                              no
 +
use_database_filter                                          no
 +
orient_ligand                                                no
 +
bump_filter                                                  no
 +
score_molecules                                              yes
 +
contact_score_primary                                        no
 +
contact_score_secondary                                      no
 +
grid_score_primary                                          no
 +
grid_score_secondary                                        no
 +
multigrid_score_primary                                      no
 +
multigrid_score_secondary                                    no
 +
dock3.5_score_primary                                        no
 +
dock3.5_score_secondary                                      no
 +
continuous_score_primary                                    yes
 +
continuous_score_secondary                                  no
 +
cont_score_rec_filename                                      ../001.structures/1EFY_rec_dockprep.mol2
 +
cont_score_att_exp                                          6
 +
cont_score_rep_exp                                          9
 +
cont_score_rep_rad_scale                                    1
 +
cont_score_use_dist_dep_dielectric                          yes
 +
cont_score_dielectric                                        4.0
 +
cont_score_vdw_scale                                        1
 +
cont_score_es_scale                                          1
 +
footprint_similarity_score_secondary                        no
 +
pharmacophore_score_secondary                                no
 +
descriptor_score_secondary                                  no
 +
gbsa_zou_score_secondary                                    no
 +
gbsa_hawkins_score_secondary                                no
 +
SASA_score_secondary                                        no
 +
amber_score_secondary                                        no
 +
minimize_ligand                                              yes
 +
simplex_max_iterations                                      1000
 +
simplex_tors_premin_iterations                              0
 +
simplex_max_cycles                                          1
 +
simplex_score_converge                                      0.1
 +
simplex_cycle_converge                                      1.0
 +
simplex_trans_step                                          1.0
 +
simplex_rot_step                                            0.1
 +
simplex_tors_step                                            10.0
 +
simplex_random_seed                                          0
 +
simplex_restraint_min                                        no
 +
atom_model                                                  all
 +
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
 +
flex_defn_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
 +
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
 +
ligand_outfile_prefix                                        1efy.virtual_screen.min
 +
write_orientations                                          no
 +
num_scored_conformers                                        1
 +
rank_ligands                                                no
 +
 
 +
Run the cartesian min by submitting to the seawulf queue as shown in the previous section. Be sure to properly edit the names of the input and output files.
 +
 
 +
=Rescore=
 +
Finally, the energy minimized structures will be ranked according to several different parameters. This will allow us to choose potential leads based on a variety of factors such as footprint overlap, hungarian, pharmacophore, tanimoto, etc. Navigate to the 008.rescore directory.
 +
 
 +
First create the input file:
 +
touch rescore.in
 +
 
 +
dock6 -i rescore.in
 +
 
 +
conformer_search_type                                        rigid
 +
use_internal_energy                                          yes
 +
internal_energy_rep_exp                                      9
 +
internal_energy_cutoff                                      100.0
 +
ligand_atom_file                                            ../007.cartesian_min/1efy.virtual_screen.min_scored.mol2
 +
limit_max_ligands                                            no
 +
skip_molecule                                                no
 +
read_mol_solvation                                          no
 +
calculate_rmsd                                              no
 +
use_database_filter                                          no
 +
orient_ligand                                                no
 +
bump_filter                                                  no
 +
score_molecules                                              yes
 +
contact_score_primary                                        no
 +
contact_score_secondary                                      no
 +
grid_score_primary                                          no
 +
grid_score_secondary                                        no
 +
multigrid_score_primary                                      no
 +
multigrid_score_secondary                                    no
 +
dock3.5_score_primary                                        no
 +
dock3.5_score_secondary                                      no
 +
continuous_score_primary                                    no
 +
continuous_score_secondary                                  no
 +
footprint_similarity_score_primary                          no
 +
footprint_similarity_score_secondary                        no
 +
pharmacophore_score_primary                                  no
 +
pharmacophore_score_secondary                                no
 +
descriptor_score_primary                                    yes
 +
descriptor_score_secondary                                  no
 +
descriptor_use_grid_score                                    no
 +
descriptor_use_multigrid_score                              no
 +
descriptor_use_continuous_score                              no
 +
descriptor_use_footprint_similarity                          yes
 +
descriptor_use_pharmacophore_score                          yes
 +
descriptor_use_tanimoto                                      yes
 +
descriptor_use_hungarian                                    yes
 +
descriptor_use_volume_overlap                                yes
 +
descriptor_fps_score_use_footprint_reference_mol2            yes
 +
descriptor_fps_score_footprint_reference_mol2_filename      ../004.dock/1EFY_ligand_min_scored.mol2
 +
descriptor_fps_score_foot_compare_type                      Euclidean
 +
descriptor_fps_score_normalize_foot                          no
 +
descriptor_fps_score_foot_comp_all_residue                  yes
 +
descriptor_fps_score_receptor_filename                      ../001.structure/1EFY_rec_dockprep.mol2
 +
descriptor_fps_score_vdw_att_exp                            6
 +
descriptor_fps_score_vdw_rep_exp                            9
 +
descriptor_fps_score_vdw_rep_rad_scale                      1
 +
descriptor_fps_score_use_distance_dependent_dielectric      yes
 +
descriptor_fps_score_dielectric                              4.0
 +
descriptor_fps_score_vdw_fp_scale                            1
 +
descriptor_fps_score_es_fp_scale                            1
 +
descriptor_fps_score_hb_fp_scale                            0
 +
descriptor_fms_score_use_ref_mol2                            yes
 +
 
 +
Submit to the seawulf queue as shown previously.
 +
 
 +
For example, the following image depicts a comparison of the crystallized ligand with the molecule with the highest Footprint similarity score - ZINC02877436
 +
 
 +
[[File:1eftVScomparison.PNG|thumb|center|800px]]

Latest revision as of 20:52, 3 May 2021

Introduction

DOCK

DOCK is a member of a large suite of molecular docking packages. Notably, it was the first molecular docking package of its kind, and it was created by Irwin Kuntz’s group in the 1980s. The original authors of DOCK include Brian K. Shoichet, David A. Case, and Robert C.Rizzo. Molecular docking plays a key role in computer aided drug design (CADD). Generally, molecular docking allows us to explore the interactions between small molecules, or ligands, and proteins by evaluating the shape and energetics of the ligand. We will be using the 6.9 release of DOCK, the latest as of November 2018, for this tutorial. New features of DOCK6.9 include DOCK_DN, a new ligand searching method for ligand de novo design and fragment library generation, which allows users to generate their own fragment libraries from mol2 files. For a comprehensive list of what’s new in DOCK6.9, click here.

In this tutorial, we will be using DOCK to predict the interactions between a small molecule inhibitor and a protein target. Here is the general workflow that we will be following:

  • Download a PDB file from the Protein Data Bank
  • Add protons and partial charges to the small molecule inhibitor and the target (independently).
  • Calculate the molecular surface of the target protein.
  • Generate spheres surrounding the binding site.
  • Calculate the bounding box and energy grid for the target protein.
  • Match the ligand atoms to the sphere centers.
  • Score the putative ligand atom centers using the energy grid.
  • Rank the orientations by score.

Chimera

UCSF Chimera is a standalone program that allows us to visualize and analyze macromolecules. General features of Chimera include automatic identification of atom, hydrogen addition and partial charge assignment, measurements of distances, angles, surface area, volume, and many more. For a complete description of everything Chimera can do, see https://www.rbvi.ucsf.edu/chimera/.

1EFY

In this tutorial, we will be working with the crystal structure of a member of the Poly (ADP-Ribose) polymerase (PARP) family. Specifically, we will be working with the catalytic fragment of PARP along with a benzimidazole inhibitor. This crystal structure has a resolution of 2.20 Å and an R-Value free of 0.274. PARP family proteins are involved in DNA repair, and it has been shown that they may allow cancer cells to repair their DNA after exposure to chemotherapy, which allows for resistance to develop [1]. As such, it is of particular interest to develop novel inhibitors for PARP. The goal for this tutorial is to demonstrate an end-to-end example of docking the benzimidazole inhibitor against the PARP target, as well as perform a virtual screen to find other small molecule inhibitors of PARP.

Software Requirements

In order to successfully follow this tutorial, please make sure to have the following programs installed:

We will be using a shell environment (either the built-in terminal for MacOS/Linux users or MobaXTerm for Windows users) to run several of the programs in this tutorial. As such, it may be useful to familiarize yourself with running basic UNIX commands and VIM.

In addition, this tutorial assumes you have access to a High Performance Computing cluster, i.e. Seawulf.

Directory Setup

Before we begin our molecular docking and virtual screen, it will be useful to establish our directory structure.

First ssh into your account on Seawulf:

ssh <your netid>@login.seawulf.stonybrook.edu

Navigate into your student directory using the cd command. Once you are in your student directory, create a new directory for this tutorial and name it 1EFY. The shell command to make a new directory is mkdir.

mkdir 1EFY

Now move into the 1EFY directory. Next, we will be creating the following directories:

mkdir 001.structure 002.surface_spheres 003.gridbox 004.dock 005.virtual_screen 006.virtual_screen_mpi 007.cartesian_min 008.rescore

In order to move back up the tree out of the current directory, use:

cd ../

Preparing the Ligand and Receptor

PDB Structure

For this tutorial, we will be utilizing the Protein Data Bank for information about our target protein. The Protein Databank is a database containing the structural data for biological macromolecules. Visit the Protein Data Bank and search for 1EFY. You should be redirected to a page containing the X-ray crystal structure of our protein, as well as the primary literature about the protein.

Click on Download files > PDB Format. This file contains the crystal structure of the catalytic fragment of Poly (ADP-ribose) Polymerase complexed with a benzimidazole inhibitor. It contains information about the atomic coordinates, secondary structure assignments, and atomic connectivity of the protein and ligand.

We will be visualizing the structure in Chimera. Open Chimera and click File > Open. Once we select the PDB file that we have just downloaded, we should see the following.

Unedited pdb.png

We can see the receptor, the benzimidazole inhibitor, and two water molecules. In order to see the complex from different angles or positions, we may perform three dimensional manipulations. It is possible to rotate, translate, or scale the the view of the structures using the mouse. Please refer to this documentation to see which mouse controls are appropriate for your setup.

We will next describe how to separate the receptor and ligand into two separate MOL2 formatted files.

Preparing the Receptor

In order to prepare our system for DOCK, we must perform some preprocessing steps, since DOCK can only process Tripos MOL2 formatted files as input. Luckily, we can use Chimera to create two new MOL2 files out of the original PDB file: one for the receptor and one for the ligand. In addition, prior to docking, we must also use Chimera to specify the charges for every atom in the receptor and ligand.

Let's first create the MOL2 file for the receptor. Using the original PDB structure, we will strip everything but the receptor. In Chimera, choose

Select > Residue > all nonstandard

The benzimidazole inhibitor and the two water molecules should now be highlighted like this:

Selected residues.png

Again in Chimera, select

Actions > Atoms/Bonds > delete

This leaves us with a “cleaned” version of the receptor:

1efy deleted residues.png

We can now save the receptor as a MOL2 file. Click

File > Save Mol2…  

Save the file as 1EFY_rec_woH.mol2 where woH indicates the receptor without hydrogens. Next we must add hydrogens and charges, which is required for docking. Select

Tools > Structure Editing > AddH 

A window should pop up like this:

1efy add hydrogens.png

Select OK. Now each residue in the receptor should be properly protonated. However, it is important that we validate the hydrogens that Chimera has added, as they may not be correct.

Similarly, we must add partial charges to each atom in the receptor. Select

Tools > Structure Editing > Add Charge. 

Make sure that AM1-BCC is selected in the window that pops up and select OK. Now each atom of the receptor should have the appropriate partial charge. As with the protonation states, it is important to double check Chimera’s work.

This receptor, which has now been prepared for DOCK, can be saved as a MOL2 file named 1EFY_rec_dockprep.mol2.

Ligand Preparation

We will now prepare the ligand in a similar fashion. Open the original PDB structure in Chimera. In this case, we will be deleting all of the constituents in the structure except for the benzimidazole inhibitor. Click

Select > Residue > BZC. 

This should highlight the inhibitor. Click

Select > Invert (all models). 

This should highlight everything except the inhibitor. Click

Actions > Atoms/Bonds > delete

Only the ligand should now be present:

1efy ligand noH.png

We will save this structure as MOL2 file.

File > save as mol2 > 1EFY_ligand_noH.mol2 

where noH indicates no hydrogens. Just as before, we must now add the hydrogens and charge.

Tools > Structure Editing > AddH

The ouput should look like:

Bzc wH.png

In its current state, the ligand protonation is incorrect and will lead to complications during docking. To fix this it is important to forcibly protonate the N4 atom.

Tool > Structure Editing > Build Structure

Set the number of bonds at position N4 to 3

1efyBuildStructure.PNG

Click Apply. Add charges as shown previously We will save this charged and protonated structure as a MOL2 file:

File > save as mol2 > 1EFY_ligand_H.mol2
1efyligandwH.PNG

Now that we have prepared our files for DOCK, we can scp or rsync all four of the files into the 001.structure directory.

Surface Generation and Sphere Selection

Identification of the binding site on a receptor is driven by identifying its concave surfaces. Once the molecular surface has been identified, we can generate overlapping spheres to describe the negative image of the molecular surface. The centers of the spheres composing the largest cluster represent potential positions where the ligand can bind.

Calculate the Molecular Surface

In Chimera, we can calculate the molecular surface according to the following steps: Open 1EFY_rec_woH.mol2 in Chimera. Select

Actions > Surface > Show
1efy surface.png

This shows the solvent excluded molecular surface of the receptor. We can use DMS, an open source program, to calculate the molecular surface of the receptor. Select

Tools > Structure Editing > Write DMS 

Save the file name as 1EFY_rec_surphace.dms This file can be uploaded to 002.surface_spheres.

Generate Spheres

Now that the molecular surface has been calculated and written to the DMS file, we can construct spheres to fill in any concave regions of the surface of the receptor by using the sphgen program in DOCK. The centers of the spheres represent putative positions of ligand atoms. Navigate to the 002.surface_spheres directory. The template for sphere generation using sphgen looks like the following:

INPUT FILE*:
rec.ms #molecular surface file
R #sphere outside of surface (R) or inside surface (L)
X #specifies subset of surface points to be used (X=all points)
0.0 #prevents generation of large spheres with close surface contacts (default=0.0)
4.0 #maximum sphere radius in angstroms (default=4.0)
1.4 #minimum sphere radius in angstroms (default=radius of probe)
rec.sph #clustered spheres file

Our file looked like the following:

1efy_rec_surface.dms
R
X
0.0
4.0
1.4
1EFY_rec_woH.sph

You can run the sphgen program like this:

sphgen -i INSPH -o OUTSPH

Chimera can read the output of the sphgen program.

The generated spheres should look like the following:

1efy spheres.png

Now, the juxtaposition of the surface spheres with the receptor should look like the following: Notice how the spheres are clustered near the active site.

1efy wspheres.png

Generating Box and Grid

Many individual floating point calculations are required for the computation of Lennard-Jones and Coulombic Interactions requisite for docking. In order to make these calculations feasible, we can calculate the overlapping volume of two sets of atoms using a grid based method. The intuition behind this method is the following:

  • Generate a bounding box that contains all of the atoms.
  • Generate grid points with a predefined separation.

Generating the Box

We will be using the DOCK's accessory programs showbox and grid in order to generate the grids required for grid based scoring (which is based on the non-bonded terms of the molecular mechanic force field). The first step is to generate a box enclosing the active site of the receptor. Create an input file for showbox in the 003.gridbox directory.

touch showbox.in
10.0
../002.surface_spheres/selected_spheres.sph
1
receptor.box.pdb


We can pipe in the input file into the showbox with the command showbox < box.in The output of the program will be a PDB file containing the structure of the bounding box, which can be visualized in Chimera.

Generating the Grid

Next we will be generating the grid. In the same directory, create an input file called grid.in

touch grid.in

Now, edit the input file such that it looks like the following:

compute_grids                             yes
grid_spacing                              0.4
output_molecule                           no
contact_score                             no
energy_score                              yes
energy_cutoff_distance                    9999
atom_model                                a
attractive_exponent                       6
repulsive_exponent                        9
distance_dielectric                       yes
dielectric_factor                         4
bump_filter                               yes
bump_overlap                              0.75
receptor_file                             ../001.structure/1EFY_rec_dockprep.mol2
box_file                                  1EFY.box.pdb
vdw_definition_file                       /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
score_grid_prefix                         grid

In order to run the grid program, use the following command:

grid -i grid.in -o gridinfo.out

Note that the grid calculation can take up to 45 minutes.

The output of the showbox and grid calculation can be visualized in Chimera as the following:

1efy box.png

Single Molecule Docking and Pose Reproduction

Energy Minimization

In order to meaningfully dock new ligands into the receptor, the current model must be verified for its accuracy. To accomplish this, the co-crystallized ligand will be energy minimized and docked into the 1efy receptor. The resulting pose should closely resemble the crystal structure. If significant deviations are noted, the ligand and receptor should be scrutinized for correct charge, protonation, and resonance.

Move to the 004.dock directory and create the input file:

touch min.in
dock6 -i min.in

Input the following

conformer_search_type                                        rigid
use_internal_energy                                          yes
internal_energy_rep_exp                                      9
internal_energy_cutoff                                       100.0
ligand_atom_file                                             ../001.structure/1EFY_ligand_H.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               yes
use_rmsd_reference_mol                                       yes      
rmsd_reference_filename                                      ../001.structure/1EFY_ligand_H.mol2
use_database_filter                                          no
orient_ligand                                                no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           yes
grid_score_secondary                                         no
grid_score_rep_rad_scale                                     1
grid_score_vdw_scale                                         1
grid_score_es_scale                                          1
grid_score_grid_prefix                                       ../003.gridbox/grid
multigrid_score_secondary                                    no
dock3.5_score_secondary                                      no
continuous_score_secondary                                   no
footprint_similarity_score_secondary                         no
pharmacophore_score_secondary                                no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_score_secondary                                         no
amber_score_secondary                                        no
minimize_ligand                                              yes
simplex_max_iterations                                       1000
simplex_tors_premin_iterations                               0
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1.0
simplex_trans_step                                           1.0
simplex_rot_step                                             0.1
simplex_tors_step                                            10.0
simplex_random_seed                                          0
simplex_restraint_min                                        yes
simplex_coefficient_restraint                                10.0
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
ligand_outfile_prefix                                        1EFY_ligand_min
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no

Rigid Docking

Rigid docking is the simplest and lowest cost of the three main docking methods, as it essentially attempts to dock the desired ligand in its native pose, only allowing 3d transformation and rotations, as well as very minor torsional corrections.

Create the input file

touch rigid.in
dock6 -i rigid.in
conformer_search_type                                        rigid
use_internal_energy                                          yes
internal_energy_rep_exp                                      12
internal_energy_cutoff                                       100.0
ligand_atom_file                                             1EFY_ligand_min_scored.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               yes
use_rmsd_reference_mol                                       yes
rmsd_reference_filename                                      1EFY_ligand_min_scored.mol2
use_database_filter                                          no
orient_ligand                                                yes
automated_matching                                           yes
receptor_site_file                                           ../002.surface_spheres/selected_spheres.sph
max_orientations                                             1000
critical_points                                              no
chemical_matching                                            no
use_ligand_spheres                                           no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           yes
grid_score_secondary                                         no
grid_score_rep_rad_scale                                     1
grid_score_vdw_scale                                         1
grid_score_es_scale                                          1
grid_score_grid_prefix                                       ../003.gridbox/grid
multigrid_score_secondary                                    no
dock3.5_score_secondary                                      no
continuous_score_secondary                                   no
footprint_similarity_score_secondary                         no
pharmacophore_score_secondary                                no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_score_secondary                                         no
amber_score_secondary                                        no
minimize_ligand                                              yes
simplex_max_iterations                                       1000
simplex_tors_premin_iterations                               0
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1.0
simplex_trans_step                                           1.0
simplex_rot_step                                             0.1
simplex_tors_step                                            10.0
simplex_random_seed                                          0
simplex_restraint_min                                        no
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
ligand_outfile_prefix                                        rigid.out
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no

At this point, it is important to visualize the resulting pose in comparison to the crystallized ligand. Open both in Chimera along with the receptor file to ensure no major changes have occurred

1efy rigid.png

Fixed Anchor Docking

In fixed anchor docking, the ligand of interest is "anchored" into the binding site of the receptor, and rotatable bonds and torsions are adjusted to minimize the energy of the structure. As this method allows for greater variance of the ligand, but still maintains a particular location in the receptor, this method blends accuracy and efficiency when compared to rigid and flexible methods.

Create an input file

touch fixed.in

dock6 -i fixed.in
conformer_search_type                                        flex
write_fragment_libraries                                     no
user_specified_anchor                                        no
limit_max_anchors                                            no
min_anchor_size                                              5
pruning_use_clustering                                       yes
pruning_max_orients                                          1000
pruning_clustering_cutoff                                    100
pruning_conformer_score_cutoff                               100.0
pruning_conformer_score_scaling_factor                       1.0
use_clash_overlap                                            no
write_growth_tree                                            no
use_internal_energy                                          yes
internal_energy_rep_exp                                      12
internal_energy_cutoff                                       100.0
ligand_atom_file                                             1EFY_ligand_min_scored.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               yes
use_rmsd_reference_mol                                       yes
rmsd_reference_filename                                      1EFY_ligand_min_scored.mol2
use_database_filter                                          no
orient_ligand                                                no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           yes
grid_score_secondary                                         no
grid_score_rep_rad_scale                                     1
grid_score_vdw_scale                                         1
grid_score_es_scale                                          1
grid_score_grid_prefix                                       ../003.gridbox/grid
multigrid_score_secondary                                    no
dock3.5_score_secondary                                      no
continuous_score_secondary                                   no
footprint_similarity_score_secondary                         no
pharmacophore_score_secondary                                no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_score_secondary                                         no
amber_score_secondary                                        no
minimize_ligand                                              yes
minimize_anchor                                              yes
minimize_flexible_growth                                     yes
use_advanced_simplex_parameters                              no
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1
simplex_trans_step                                           1
simplex_rot_step                                             0.1
simplex_tors_step                                            10.0
simplex_anchor_max_iterations                                500
simplex_grow_max_iterations                                  500
simplex_grow_tors_premin_iterations                          0
simplex_random_seed                                          0
simplex_restraint_min                                        no
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
ligand_outfile_prefix                                        fixed.out
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no
1efyfixedanchor.PNG

Flexible Docking

In flexible docking, the rotatable bonds and torsions, as well as the ligand location is searched within the defined box and grid. For this reason, flexible methods are typically the most meaningful of the three methods when docking new ligands to the receptor of interest.

Create and input file:

touch flex.in
dock6 -i flex.in
conformer_search_type                                        flex
write_fragment_libraries                                     no
user_specified_anchor                                        no
limit_max_anchors                                            no
min_anchor_size                                              5
pruning_use_clustering                                       yes
pruning_max_orients                                          1000
pruning_clustering_cutoff                                    100
pruning_conformer_score_cutoff                               100.0
pruning_conformer_score_scaling_factor                       1.0
use_clash_overlap                                            no
write_growth_tree                                            no
use_internal_energy                                          yes
internal_energy_rep_exp                                      12
internal_energy_cutoff                                       100.0
ligand_atom_file                                             1EFY_ligand_min_scored.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               yes
use_rmsd_reference_mol                                       yes
rmsd_reference_filename                                      1EFY_ligand_min_scored.mol2
use_database_filter                                          no
orient_ligand                                                yes
automated_matching                                           yes
receptor_site_file                                           ../002.surface_spheres/selected_spheres.sph
max_orientations                                             1000
critical_points                                              no
chemical_matching                                            no
use_ligand_spheres                                           no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           yes
grid_score_secondary                                         no
grid_score_rep_rad_scale                                     1
grid_score_vdw_scale                                         1
grid_score_es_scale                                          1
grid_score_grid_prefix                                       ../003.gridbox/grid
multigrid_score_secondary                                    no
dock3.5_score_secondary                                      no
continuous_score_secondary                                   no
footprint_similarity_score_secondary                         no
pharmacophore_score_secondary                                no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_score_secondary                                         no
amber_score_secondary                                        no
minimize_ligand                                              yes
minimize_anchor                                              yes
minimize_flexible_growth                                     yes
use_advanced_simplex_parameters                              no
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1.0
simplex_trans_step                                           1.0
simplex_rot_step                                             0.1
simplex_tors_step                                            10.0
simplex_anchor_max_iterations                                500
simplex_grow_max_iterations                                  500
simplex_grow_tors_premin_iterations                          0
simplex_random_seed                                          0
simplex_restraint_min                                        no
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
ligand_outfile_prefix                                        flex.out
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no
1efyflex.PNG

Molecular Footprint

A typical molecular footprint plot shows the van der waals and electrostatic interaction between the ligand and nearby amino acid residues. This is useful for determining which interactions are important for suitable affinity of the ligand. This also serves as an additional method to ensure the ligand wasn't significantly distorted in the minimization process.

First, generate an input file

touch footprint.in
dock6 -i footprint.in
conformer_search_type                                        rigid
use_internal_energy                                          no
ligand_atom_file                                             1EFY_ligand_min_scored.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               no
use_database_filter                                          no
orient_ligand                                                no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           no
grid_score_secondary                                         no
multigrid_score_primary                                      no
multigrid_score_secondary                                    no
dock3.5_score_primary                                        no
dock3.5_score_secondary                                      no
continuous_score_primary                                     no
continuous_score_secondary                                   no
footprint_similarity_score_primary                           yes
footprint_similarity_score_secondary                         no
fps_score_use_footprint_reference_mol2                       yes
fps_score_footprint_reference_mol2_filename                  ../001.structure/1EFY_ligand_H.mol2
fps_score_foot_compare_type                                  Euclidean
fps_score_normalize_foot                                     no
fps_score_foot_comp_all_residue                              yes
fps_score_receptor_filename                                  ../001.structure/1efy_rec_dockprep.mol2
fps_score_vdw_att_exp                                        6
fps_score_vdw_rep_exp                                        9
fps_score_vdw_rep_rad_scale                                  1
fps_score_use_distance_dependent_dielectric                  yes
fps_score_dielectric                                         4.0
fps_score_vdw_fp_scale                                       1
fps_score_es_fp_scale                                        1
fps_score_hb_fp_scale                                        0
pharmacophore_score_secondary                                no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_score_secondary                                         no
amber_score_secondary                                        no
minimize_ligand                                              no
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
ligand_outfile_prefix                                        footprint.out
write_footprints                                             yes
write_hbonds                                                 yes
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no

Upon completion, several .txt files should be produced. In order to convert these to .pdf files for easy viewing, use the plot_footprint_single_magnitude.py python script to show the results for the 50 highest scoring residues. The result should look similar to this:

1efyfootprint.PNG

Virtual Screen

Virtual screening is a widely used method for screening many drug-like molecules from a curated library for binding affinity via Dock in order to identify potential lead molecules. In this example we will be utilizing the VS_library_5k.mol2 library which contains over 5,000 drug-like molecules from the ZINC database. This library can be found in the /gpfs/projects/AMS536/2021/ directory. Copy this file into the 005.virtual_screen directory.

Create an input file:

touch virtual.in
dock6 -i virtual.in
conformer_search_type                                        flex
write_fragment_libraries                                     no
user_specified_anchor                                        no
limit_max_anchors                                            no
min_anchor_size                                              5
pruning_use_clustering                                       yes
pruning_max_orients                                          1000
pruning_clustering_cutoff                                    100
pruning_conformer_score_cutoff                               100.0
pruning_conformer_score_scaling_factor                       1.0
use_clash_overlap                                            no
write_growth_tree                                            no
use_internal_energy                                          yes
internal_energy_rep_exp                                      9
internal_energy_cutoff                                       100.0
ligand_atom_file                                             VS_library_25K.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               no
use_database_filter                                          no
orient_ligand                                                yes
automated_matching                                           yes
receptor_site_file                                           ../002.surface_spheres/selected_spheres.sph
max_orientations                                             1000
critical_points                                              no
chemical_matching                                            no
use_ligand_spheres                                           no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           yes
grid_score_secondary                                         no
grid_score_rep_rad_scale                                     1
grid_score_vdw_scale                                         1
grid_score_es_scale                                          1
grid_score_grid_prefix                                       ../003.gridbox/grid
multigrid_score_secondary                                    no
dock3.5_score_secondary                                      no
continuous_score_secondary                                   no
footprint_similarity_score_secondary                         no
pharmacophore_score_secondary                                no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_score_secondary                                         no
amber_score_secondary                                        no
minimize_ligand                                              yes
minimize_anchor                                              yes
minimize_flexible_growth                                     yes
use_advanced_simplex_parameters                              no
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1.0
simplex_trans_step                                           1.0
simplex_rot_step                                             0.1
simplex_tors_step                                            10.0
simplex_anchor_max_iterations                                500
simplex_grow_max_iterations                                  500
simplex_grow_tors_premin_iterations                          0
simplex_random_seed                                          0
simplex_restraint_min                                        no
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
ligand_outfile_prefix                                        virtual.out
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no


Allow the job to run for approximately a minute to ensure that it is working properly. Important: After ensuring the job is working, press ctrl+c to stop the job As docking several thousand molecules to the 1efy receptor requires a significant amount of computing power, we do not want to run the job on the head node. We will use slurm scripts to submit the job to the seawulf cluster. Before determining which queue to use, it may be beneficial to determine their availability. Use the command sinfo and look for queues with idle nodes. In addition, to prevent your job from being automatically rejected use [2] to determine the max time, min/max nodes, min/max cores, etc. for each particular queue. SCP the VS_library_5k.mol2 and virtual.in file into the 006.virtual_screen_mpi directory.

To submit the job to the queue create the following input file:

vi virtual.sh
#!/bin/bash
#SBATCH --time=1-8:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=40
#SBATCH --job-name=1EFY_vs
#SBATCH --output=1EFY_vs.out
#SBATCH -p extended-40core
#SBATCH --mail-type=BEGIN,END
#SBATCH --mail-user=EXAMPLE@stonybrook.edu
cd $SLURM_SUBMIT_DIR
mpirun -np 40 dock6.mpi -i virtual.in -o virtual.out

This script will submit the job to the extended-40core queue, using 1 node, 40 cores, for a maximum of 8 hours. In addition, you can input your email address in the line #SBATCH --mail-user=EXAMPLE@stonybrook.edu , and you will be emailed at the start and end of the job.

Submit the job using :

qsub virtual.sh

You can check the status of the job using:

squeue -u NETID

And inputting your netid where the command says NETID.

Cartesian Minimization

Once we have docked the library to the receptor, it is then necessary to perform a cartesian minimization of them. Move to the 007.cartesian_min directory.

Create the following input file:

touch min.in
dock6 -i min.in
conformer_search_type                                        rigid
use_internal_energy                                          yes
internal_energy_rep_exp                                      9
internal_energy_cutoff                                       100.0
ligand_atom_file                                             ../006.virtual_screen_mpi/virtual.out_scored.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               no
use_database_filter                                          no
orient_ligand                                                no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           no
grid_score_secondary                                         no
multigrid_score_primary                                      no
multigrid_score_secondary                                    no
dock3.5_score_primary                                        no
dock3.5_score_secondary                                      no
continuous_score_primary                                     yes
continuous_score_secondary                                   no
cont_score_rec_filename                                      ../001.structures/1EFY_rec_dockprep.mol2
cont_score_att_exp                                           6
cont_score_rep_exp                                           9
cont_score_rep_rad_scale                                     1
cont_score_use_dist_dep_dielectric                           yes
cont_score_dielectric                                        4.0
cont_score_vdw_scale                                         1
cont_score_es_scale                                          1
footprint_similarity_score_secondary                         no
pharmacophore_score_secondary                                no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_score_secondary                                         no
amber_score_secondary                                        no
minimize_ligand                                              yes
simplex_max_iterations                                       1000
simplex_tors_premin_iterations                               0
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1.0
simplex_trans_step                                           1.0
simplex_rot_step                                             0.1
simplex_tors_step                                            10.0
simplex_random_seed                                          0
simplex_restraint_min                                        no
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
ligand_outfile_prefix                                        1efy.virtual_screen.min
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no

Run the cartesian min by submitting to the seawulf queue as shown in the previous section. Be sure to properly edit the names of the input and output files.

Rescore

Finally, the energy minimized structures will be ranked according to several different parameters. This will allow us to choose potential leads based on a variety of factors such as footprint overlap, hungarian, pharmacophore, tanimoto, etc. Navigate to the 008.rescore directory.

First create the input file:

touch rescore.in
dock6 -i rescore.in
conformer_search_type                                        rigid
use_internal_energy                                          yes
internal_energy_rep_exp                                      9
internal_energy_cutoff                                       100.0
ligand_atom_file                                             ../007.cartesian_min/1efy.virtual_screen.min_scored.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               no
use_database_filter                                          no
orient_ligand                                                no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           no
grid_score_secondary                                         no
multigrid_score_primary                                      no
multigrid_score_secondary                                    no
dock3.5_score_primary                                        no
dock3.5_score_secondary                                      no
continuous_score_primary                                     no
continuous_score_secondary                                   no
footprint_similarity_score_primary                           no
footprint_similarity_score_secondary                         no
pharmacophore_score_primary                                  no
pharmacophore_score_secondary                                no
descriptor_score_primary                                     yes
descriptor_score_secondary                                   no
descriptor_use_grid_score                                    no
descriptor_use_multigrid_score                               no
descriptor_use_continuous_score                              no
descriptor_use_footprint_similarity                          yes
descriptor_use_pharmacophore_score                           yes
descriptor_use_tanimoto                                      yes
descriptor_use_hungarian                                     yes
descriptor_use_volume_overlap                                yes
descriptor_fps_score_use_footprint_reference_mol2            yes
descriptor_fps_score_footprint_reference_mol2_filename       ../004.dock/1EFY_ligand_min_scored.mol2
descriptor_fps_score_foot_compare_type                       Euclidean
descriptor_fps_score_normalize_foot                          no
descriptor_fps_score_foot_comp_all_residue                   yes
descriptor_fps_score_receptor_filename                       ../001.structure/1EFY_rec_dockprep.mol2
descriptor_fps_score_vdw_att_exp                             6
descriptor_fps_score_vdw_rep_exp                             9
descriptor_fps_score_vdw_rep_rad_scale                       1
descriptor_fps_score_use_distance_dependent_dielectric       yes
descriptor_fps_score_dielectric                              4.0
descriptor_fps_score_vdw_fp_scale                            1
descriptor_fps_score_es_fp_scale                             1
descriptor_fps_score_hb_fp_scale                             0
descriptor_fms_score_use_ref_mol2                            yes

Submit to the seawulf queue as shown previously.

For example, the following image depicts a comparison of the crystallized ligand with the molecule with the highest Footprint similarity score - ZINC02877436

1eftVScomparison.PNG