2021 DOCK tutorial 3 with PDBID 1S19

From Rizzo_Lab
Revision as of 15:55, 24 February 2021 by Stonybrook (talk | contribs) (Energy Minimization)
Jump to: navigation, search



Developed by Irwin D. Kuntz, Jr. and colleagues at UCSF, DOCK is a program used to dock molecules to one another. Docking is a process in which given a small molecule or ligand in the active site of a receptor, the program will try to predict the lowest-energy binding mode of the ligand to the receptor. This process is very important in drug discovery as small molecules that bind to or interact with the active site of a receptor associated with a disease could inhibit its function, acting as a drug. DOCK is a particularly helpful tool as it is used to screen massive libraries of molecules, containing millions of compounds, against a receptor to identify the most promising drug lead compounds.

DOCK historically used a geometric matching algorithm to superimpose the ligand onto a negative image of the active site of the receptor. Over the years, features were added that would improve the programs ability to find the lowest-energy binding mode. Somem of these features include force-field based scoring, on-the-fly optimization and an algorithm for flexible ligand docking.

In this tutorial DOCK6 will be used. New features for DOCK6 include: additional scoring options during minimization; DOCK 3.5 scoring-including Delphi electrostatics, ligand conformational entropy corrections, ligand desolvation, receptor desolvation; Hawkins-Cramer-Truhlar GB/SA solvation scoring with optional salt screening and more (see UCSF DOCK). These new features improved the programs ability to predict binding poses.


This tutorial will use PDB code: 1S19. 1S19 is the crystal structure of VDR ligand binding domain complexed to calcipotriol (find structure here. The resolution is 2.10 Å.


UCSF Chimera was developed by Resource for Biocomputing, Visualization, and Informatics (RBVI) at the University of California, San Francisco. Chimera is a program made for the interactive visualization and analysis of molecular structures. Some features of Chimera include general structure analysis (automatic identification of an atom, hydrogen addition and partial charge assignment, structure building and bond rotation, etc.) presenting images and movies (high-res images, visual effects, standard molecular representations, etc.) and sequence structure tools (sequence alignments, structure superposition, etc.)

Making Directories

To make it easier for us to locate certain files, we are going to create directories for each step of the docking process. The mkdir command creates a new directory in which new files can be saved. The cd command allows for you to navigate between directories.

In your Bash Shell environment, create a directory containing all of the information for the project by typing:

  mkdir 1S19

To change into that directory type:

  cd 1S19

To create the directories for each step:

  mkdir 001.structure 002.surface_spheres 003.gridbox 004.dock 005.virtual_screen 006.virtual_screen_mpi 007.cartesianmin 008.rescore

To confirm that the directories have been created:


If you made a mistake and need to delete a directory:

  rm *insert name of directory to be deleted*

Preparing the Ligand and Receptor

PDB Structure

Download the structure of 1S19 from the Protein Data Bank (PDB).

  Download Files -> PDB Format

This file contains the coordinates for the 3D structure of the receptor, ligand and any other molecules present during the experiment (typically water or metal ions). To visualize the structure, we will be using Chimera.

Visualization with Chimera To open the newly downloaded PDB coordinates:

  File -> Open 

The protein when you first open the file should look like the image below. You can change the view of the structure by rotating it with your mouse or touchpad. Currently, some of the side chains of the backbone are shown, there are no hydrogens or partial charges, and there are some water molecules present.

File:1s19 unedited.png
Figure 1. Unedited PDB file of 1S19 visualized in Chimera.


In order to dock, the ligand and receptor have to be separated and saved into different files. To do this is a simple process and can be done in Chimera. To hide all sidechain bonds:

  Select -> standard amino acids 
  Actions -> Atoms/Bonds -> hide

To prepare the receptor, we are going to want to delete everythign except the protein from the PDB file. To do this in Chimera:

  Select -> Residue -> All nonstandard
  Actions -> Atoms/Bonds -> Delete

You will now be left with only the desired protein. To save this file, we are going to save it as a mol2 file. This can be done by:

  File -> Save Mol2 -> "1s19_receptor_woH.mol2" 

Adding Hydrogens and Charge PDB structures are reported without hydrogens, so it is important that we add them to the receptor in order to gain accurate calculations for the interactions between the protein and ligand. This can also be done in Chimera by doing the following:

  Tools -> Structure Editing -> Add H -> Ok

We will also need to add charges to the receptor

  Tools -> Structure Editing -> Add Charge -> (have Amber ff14SB and AM1-BCC selected) -> Ok

Once this is done it is very important to check that the charges of the receptor match that of the experiments. Chimera adds the standard protonation states to the amino acids, so it's important to read the paper associated with the PDB file (see here) to make sure that there are no amino acids that are specifically protonated or deprotonated.

Once you have checked to make sure that the protonation states are okay, save this as a mol2 file:

  File -> Save Mol2 -> "1s19_receptor_dockprep.mol2"


Like the receptor, we will need to save the ligand as a separate mol2 file in order to perform the docking. For this model, the ligand is named as MC9 in Chimera.

To isolate the ligand:

  Select -> Residue -> MC9
  Select -> Invert (all models)
  Actions -> Atoms/Bonds -> Delete

We are now left with just the ligand as pictured below.

Figure 2. Ligand MC9 without Hydrogrens

We will save this as a mol2 file by:

  File -> Save Mol2 -> 1s19_lig_woH.mol2

Adding Hydrogens and Charges In the same steps as the receptor, we will add hydrogens and partial charges to the ligand. For the Hydrogens:

  Tools -> Structure Editing -> Add H

After doing so, the structure should look like the one below.

Figure 2. Ligand MC9 with Hydrogrens

Now we have to add the charges to the ligand. Remember to double check the crystal structure and validate the charge. In this case the charge is neutral (+/- 0), and Chimera was able to model it correctly. However this is not always the case and you will sometimes be required to manually change the charges.

Save this final structure as a mol2 file with the name "1s19_lig_dockprep.mol2"

Generating Receptor Surface and Spheres

Generating Surface

In Chimera we can generate a representation of the receptor that creates a negative image of the protein. This surface will guide the ligand to the active site of the receptor during docking. To do this, we are going to open the receptor without hydrogens in Chimera, 1s19_receptor_woH.mol2. Then in Chimera, do:

  Actions -> Surface -> Show
  Tools -> Structure Editing -> Write DMS -> "1s19_rec_surface.dms"

Move this to the "002.surface_spheres" directoy in the Bash Shell environment. THe following Figure is the depiction of the surface in Chimera.

Figure 4. Receptor surface

Generating Spheres

Spheres in docking represent empty space in the receptor. Now that we have the surface representation of our receptor, we are going to use the sphgen ("sphere generation") script to create the largest possible sphere for any given empty space. It is okay that spheres overlap with each other, however we do not want any spheres overlapping with the protein.

Still in the "002.surface_spheres" directory, we are going to create a file calle INSPH (short for "in spheres"). To do this type:

  vim INSPH

In this new file, we are going to write a script in order to automate the process of generating the spheres. Below is an example of the script that we used.

  1s19_rec_surface.dms    # your receptor as DMS file
  R                       # <R flag> - enables sphere generation outside the protein surface (no eclipsing)
  X                       # <X flag  - uses all coordinates
  0.0                     # <double> - distance that steric interactions are checked
  4.0                     # <double> - Maximum sphere radius of generated sphere
  1.4                     #  <double> - Size of sphere that rolls over dms file surface for cavities
  1s19_receptor_woH.sph   # the name of your output file

With this script saved, you can now run it through sphgen by the following:

  sphgen -i INSPH -o OUTSPH

When this is run, the program will generate the "1s19_receptor_woH.sph" file and also an "OUTSPH" file. Looking at 1s19_receptor_woH.sph in Chimera will give you the following image.

Figure 5. Generated spheres by sphgen program

Selecting Spheres We are more concerned with the empty space near the active site of the receptor, since that is where we are going to be docking our ligand. To only select particular spheres within 10 angstroms of the ligand, we executed the following command:

  sphere_selector 3vjk_receptor_woH.sph 3vjk_ligand_H.mol2 10.0

This generated a new file called "selected_spheres.sph" in our working directory. The result of the sphere selection is show in the image below.

Figure 6. Selected spheres within 10 angstrom of ligand and receptor

Creating Box and Grid

To make the amount of calculations we have to do smaller, and thus more efficient, we will create a grid in which relevent residues/atoms are selected. This method of creating the box and grid will eliminate any long-distance interactions that the ligand may have.

Generating Box

We will create the box by using the Showbox program. To do this, we will create the following file in our "003.gridbox" directory.

  vim showbox.in

We put the following script inside the showbox.in file:

  Y                                           #generate box#
  8.0                                         #how many angstroms the box edges should be from the spheres
  ./../002.surface_spheres/select_spheres.sph #the location of the selected spheres
  1s19.box.pdb                                #name of the output file

To execute this script, type the following:

  showbox < showbox.in

Once this program has run, the 1s19.box.pdb file should be in the working directory.

Generating Grid

Next, we will be generating the grid. To do so, we will create a file called grid.in

  vim grid.in

The following script will be copied into the grid.in file

  compute_grids                             yes
  grid_spacing                              0.4
  output_molecule                           no
  contact_score                             no
  energy_score                              yes
  energy_cutoff_distance                    9999
  atom_model                                a
  attractive_exponent                       6
  repulsive_exponent                        9
  distance_dielectric                       yes
  dielectric_factor                         4.   
  bump_filter                               yes
  bump_overlap                              0.75
  receptor_file                             ./../001.structure/1s19_receptor_dockprep.mol2
  box_file                                  1s19.box.pbd
  vdw_definition_file                       /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
  score_grid_prefix                         grid

To run this script, type:

  grid -i grid.in -o gridinfo.out

When this program finishes running, you should have three new files: gridinfo.out, grid.nrg, and grid.bmp

Energy Minimization

The ligand pose in the crystal structure may not accurately represent the lowest energy conformation of the ligand in a biological system. Therefore, to accurately dock the ligand, we first need to reduce the overall potential energy of the ligand to its global minimum. To do this, we are going to create an energy minimization script:

  vim min.in

We used the following script for the minimizaiton:

 conformer_search_type                                        rigid
 use_internal_energy                                          yes
 internal_energy_rep_exp                                      12
 internal_energy_cutoff                                       100.0
 ligand_atom_file                                             ./../001.build/1s19_ligand_dockprep.mol2
 limit_max_ligands                                            no
 skip_molecule                                                no
 read_mol_solvation                                           no
 calculate_rmsd                                               yes
 use_rmsd_reference_mol                                       yes
 rmsd_reference_filename                                      ./../001.build/1s19_ligand_dockprep.mol2
 use_database_filter                                          no
 orient_ligand                                                no
 bump_filter                                                  no
 score_molecules                                              yes
 contact_score_primary                                        no
 contact_score_secondary                                      no
 grid_score_primary                                           yes
 grid_score_secondary                                         no
 grid_score_rep_rad_scale                                     1
 grid_score_vdw_scale                                         1
 grid_score_es_scale                                          1
 grid_score_grid_prefix                                       ./../003.gridbox/grid
 multigrid_score_secondary                                    no
 dock3.5_score_secondary                                      no
 continuous_score_secondary                                   no
 footprint_similarity_score_secondary                         no
 pharmacophore_score_secondary                                no
 descriptor_score_secondary                                   no
 gbsa_zou_score_secondary                                     no
 gbsa_hawkins_score_secondary                                 no
 SASA_score_secondary                                         no
 amber_score_secondary                                        no
 minimize_ligand                                              yes
 simplex_max_iterations                                       1000
 simplex_tors_premin_iterations                               0
 simplex_max_cycles                                           1
 simplex_score_converge                                       0.1
 simplex_cycle_converge                                       1.0
 simplex_trans_step                                           1.0
 simplex_rot_step                                             0.1
 simplex_tors_step                                            10.0
 simplex_random_seed                                          0 
 simplex_restraint_min                                        yes
 simplex_coefficient_restraint                                10.0
 atom_model                                                   all
 ligand_outfile_prefix                                        1s19.lig.min
 write_orientations                                           no
 num_scored_conformers                                        1
 rank_ligands                                                 no

Now that the script is made, we can run the energy minimization by the following line

  dock6 -i min.in -o min.out

Running this command will yield two new files in the direectory: min.out and 1s19.lig.min_scored.mol2

You can visualize the minimized ligand and the original ligand in Chimera to observe the difference between the two.

Figure 7. Difference in ligands after minimization

The beige structure is the minimized ligand and the blue structure is the original ligand. The difference between the two is not large. To see the RMSD (room mean squared deviation) between the two, you can vim into the 1s19.lig.min_scored.mol2 file that was generated. The picture below shows the top of that file, and RMSD between the two molecules is 0.2281.

Figure 7. Top of the output mol2 file from minimization