Introduction

DOCK

DOCK 6 is a molecular modeling software that is used for investigating ligand binding geometry and ligand interactions[1]. Consequently, its relevance in the field of drug discovery is clear. This program was initially developed by Dr. Irwin Kuntz and colleagues at the University of California San Francisco. A major feature of DOCK 6 is the search algorithm that is used: “anchor-and-grow”; this sets the software apart from its counterparts [2]. This method first identifies the rigid structure of a particular ligand--anchors-- then the program docks the ligand using its geometry. Following the docking, a partial conformational search is performed. In simple terms, the positions of the anchor are allowed independently however once a favored conformation is found it is retained. Once this step is completed, energy minimization is carried out Full details here.

3VJK

3VJK is the PDB code for the crystal structure of human dipeptidyl peptidase IV, also known as DPP-4, with MP-513, which is called Teneligliptin [3]. DPP-4 is a symmetrical dimer and has 729 residues per chain. The crystal has a resolution of 2.49 Å, a R-value of 0.279, and a R-free value of 0.225. To add on, the molecule in the crystal--Teneligliptin--has been approved for the treatment of type II Diabetes in Japan and has shown promising results in vivo [4].

Software packages

To follow this tutorial you will need to have the following programs installed:

DOCK

Chimera

This tutorial used Dock 6.9 & Chimera 1.13.1

At several points this tutorial will reference these programs as commands in a shell environment. The students who did this ran their programs on a UNIX (CoreOS or Ubuntu) server, although this process should generalize to your specific setup. For help, please reference available documentation.

Directory Organization

The following tutorial will use the organization of directories prepared below. This specific organization is not required but is recommended. The "mkdir" command will be employed which creates a new folder in which files can be saved. To navigate into a directory use the command "cd" followed by the directory name. To change to the directory the next level up, use the command "cd .." .

Within the Bash Shell environment:

   mkdir 3VJK
   cd 3VJK
   mkdir 001.structure 002.surface_spheres 003.gridbox 004.dock 005.virtual_screen 006.virtual_screen_mpi 007.cartesianmin 008.rescore

All eight directories should be created now and this can be visually confirmed with the command "ls".

Receptor Preparation

Preparing the Structure for Docking

Downloading and Opening PDB File

Download the PDB Format file from the associated rcsb page here. This web page includes associated articles, files, and other meta data.

   Download files -> PDB Format

This file provides information on the 3D orientation of the atoms within the protein and ligand as well as any co-factors (any other molecules present during the crystallization experiment, typically water and metal ions). The file can be opened up and manipulated in the program Chimera.

Open Chimera:

   File -> Open -> (Location where you downloaded PDB file)

The protein should appear the same as the image above. The image can be rotated to view from different angles. This is called a Ribbon diagram and shows the backbone of the protein, however some amino acid side chains are shown by default. Also shown explicitly are NAG amino acid modifications, the Oxygen of several water molecules and M51 (the ligand that is complexed with the protein). There are no Hydrogen atoms represented anywhere. This is because PDB files do not contain information on Hydrogen atoms.

Preparation of the Protein Receptor for Docking

Docking requires that the protein receptor and ligand be separated into different files. First, the receptor file will be prepared. This particular protein is a homo-dimer (two identical units of the same peptide). For simplicity and to avoid possible complications in later steps, only one of the peptide chains will be retained. This step should be applied judiciously in protein systems where the ligand is at the interface of two dimers.

   Select -> Chain -> B 
   Actions -> Atoms/ Bonds -> Delete

Only one monomer of protein should remain now.

Next the NAG amino acid modifications, waters and ligand will be removed. They are not crucial for the Docking experiment, and may be problematic and cause failure if retained.

   Select -> Residue -> All nonstandard
   Actions -> Atoms/ Bonds -> Delete

The receptor is now "clean" and should be saved prior to the next step.

   File -> Save Mol2 -> "3VJK_rec_woH.mol2"

It is important to give files a logical naming scheme. The woH portion is to specify Hydrogens have not yet been added. Move this file to the directory "001.structure"

Adding Hydrogens and Charge

In order to calculate interactions between the protein and ligand, Hydrogens must be added to the receptor. Chimera will apply standard protonation states to the amino acids. It is important to check these protonation states afterwards, as they may not match the crystallization experiment. For example, the paper associated with the PDB being worked with may specify a certain residue is protonated. It would then be crucial to check this after the following step, and if it is incorrect, to adjust it manually.

   Structure Editing -> Add H -> Ok

Next partial charges will be added to each atom in the receptor.

   Structure Editing -> Add Charge -> (AM1BCC charges should be selected) -> Ok

Now save this as a mol2 file "3VJK_rec_dockprep.mol2" and move it to the directory "001.structure"

Ligand Preparation

Preparing Ligand We will now need to prepare the Ligand, M-513. In a similar manner to receptor preparation, open the PDB file on Chimera. Likewise, you will also need to delete Chain B as previously stated. Now, you will be able to isolate the ligand by doing the following:

  select->residue->M-51
  select->invert
  Actions->Atoms/Bonds->Delete

You should be left with the following:

This figure shows what should be left after deleting everything but the ligand. This should show no hydrogens.

Next, we will save this as a mol2 file:

  File->save as mol2 ->3VJK_ligand_noH.mol2

Add Hydrogens and Charge The crystal structure does not have any hydrogens because of technical limitations; hydrogen electron densities are too small to be detected. Consequently, we must add hydrogens to the ligand.

  Tools->Structure editing-> add H

this figure shows how the ligand will appear once hydrogen are added

In a similar fashion, DOCK will need charges to perform calculations.

  Tools->Structure editing-> Add Charge

It is important to make a note about the net charge of the ligand. you should not assume that chimera has the correct charge''. You should look at the ligand and attempt to validate the charge, which should be +2. You can now save this as a mol2 file and name it: 3VJK_ligand_with_H.mol2

Surface Generation & Sphere Selection

Surface Generation In Chimera a file which represents the surface of the protein will be created. The surface will be used to create a negative image of the protein (spheres which occupy the cavities and external face of the protein). These spheres are used to guide the ligand during docking.

In Chimera open "3VJK_rec_woH.mol2" :

   Actions -> Surface -> Show
   Tools -> Structure Editing -> Write DMS -> "3VJK_rec_surface.dms"

Move this to the directory "002.surface_spheres"

Sphere Selection By this step, you should have the mol2 extractions of ligand and protein, in both hydrogenated and unhydrogenated forms (4 files). The next activity is to create an efficient representation of empty space inside the protein. This is done with the sphgen script, which tries to generate the largest possible sphere for any given empty space. In general, it is desirable for the spheres will eclipse with each other, but not with the protein itself.

The sphgen software takes in a series of inputs from prompts to the user, but we can automate this by piping these arguments through a file. We shall can this file INSPH. Generate your INSPH file with the following syntax:

   [your_receptor].dms
   <R flag> - enables sphere generation outside the protein surface (no eclipsing)
   <X flag  - uses all coordinates 
   <double> - distance that steric interactions are checked (units?)
   <double> - Maximum sphere radius of generated sphere (units?)
   <double> - Size of sphere that rolls over dms file surface for cavities (units?)
   [your_receptor].sph

This is an example of how we wrote our file:

   3VJK_rec_surface.dms
   R 
   X 
   0.0 
   4.0 
   1.4 
   3VJK_receptor_woH.sph

Does it matter if the dms is generated with the hydrogens?

This should produce an sph file that you can then run through sphgen

sphgen -i INSPH -o OUTSPH

3vjk selected sphere -- The spheres represent empty space in the protein structure

Sphere Selection

Using dock's sphere_selector script, we are able to produce a subset of spheres that are close (within 10 angstroms) to the ligand

sphere_selector 3vjk_receptor.sph 3vjk_ligand_H.mol2 10.0

3vjk receptor with desired empty spaces highlighted

Generation of the Box and Grid

Energy calculations can be computationally expensive. Consequently, steps must be performed in order to reduce the number of calculations that are performed. In more detail, DOCK will be calculating the energy using a grid. We will be generating the grid; anything that is beyond the grid generated will not be in the calculation. This means that will ignore long distant interactions with ligand.

To start, we will be making a directory for the grid and the box

  mkdir 003.gridbox

Generating the box

Next, we will be creating an input file that contains information for the Showbox programs. This file will contain parameters for the box.

  vi showbox.in

we will put the following in to the file:

  Y #generate box#
  8.0 #how many angstroms the box edges should be from the spheres#
  ./../002.surface_spheres/select_spheres.sph #the location of the selected spheres#
  1 
  3VJK.box.pdb #name of the output file#

In order to run this you do this:

  showbox < showbox.in

After you run this command a file called 3VJK.box.pdb will be generated. This files contains the grid. And can be visualized.

Generating the grid

In a similar manner, we will have to generate the grid. In order to do this we will need to make the input file for the grid program that contains:

  compute_grids                             yes
  grid_spacing                              0.4
  output_molecule                           no
  contact_score                             no
  energy_score                              yes
  energy_cutoff_distance                    9999
  atom_model                                a
  attractive_exponent                       6
  repulsive_exponent                        9
  distance_dielectric                       yes
  dielectric_factor                         4.   
  bump_filter                               yes
  bump_overlap                              0.75
  receptor_file                             d
  box_file                                  d.box.pbd
  vdw_definition_file                       /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
  score_grid_prefix                         grid

we will called this file grid.in. In order to generate the grid do the following:

  grid -i grid.in -o gridinfo.out

the "-o" flag is used to specify the name of the output file. Once the program is completed there should be three files generated: gridinfo.out, grid.nrg, and grid.bmp. It is a good idea to make sure that gridinfo.out matches with the known information of the system. In other words, this is a good spot to double check your work.

Energy Minimization

Before running any dock calculations, we must take a moment to minimize the ligand. This is important because the current state of the ligand may not be at its lowest energy. We must take into consideration that crystallization can result in packing and other discrepancies that can impact our results. By minimizing the structure, we can make sure that none of the byproducts of crystalization will impact the results of the calculation.

We will make a new directory for Energy Minimization.

  mkdir 004.energy_min

We will move into this directory. Now, we can conduct the first step to conducting energy minimization is to create an input file. We will call this file min.in:

  conformer_search_type                                        rigid
  use_internal_energy                                          yes
  internal_energy_rep_exp                                      12
  internal_energy_cutoff                                       100.0
  ligand_atom_file                                             ./../001.build/3VJK_ligand_hydrogens.mol2
  limit_max_ligands                                            no
  skip_molecule                                                no
  read_mol_solvation                                           no
  calculate_rmsd                                               yes
  use_rmsd_reference_mol                                       yes
  rmsd_reference_filename                                      ./../001.build/3VJK_ligand_hydrogens.mol2
  use_database_filter                                          no
  orient_ligand                                                no
  bump_filter                                                  no
  score_molecules                                              yes
  contact_score_primary                                        no
  contact_score_secondary                                      no
  grid_score_primary                                           yes
  grid_score_secondary                                         no
  grid_score_rep_rad_scale                                     1
  grid_score_vdw_scale                                         1
  grid_score_es_scale                                          1
  grid_score_grid_prefix                                       ./../003.gridbox/grid
  multigrid_score_secondary                                    no
  dock3.5_score_secondary                                      no
  continuous_score_secondary                                   no
  footprint_similarity_score_secondary                         no
  pharmacophore_score_secondary                                no
  descriptor_score_secondary                                   no
  gbsa_zou_score_secondary                                     no
  gbsa_hawkins_score_secondary                                 no
  SASA_score_secondary                                         no
  amber_score_secondary                                        no
  minimize_ligand                                              yes
  simplex_max_iterations                                       1000
  simplex_tors_premin_iterations                               0
  simplex_max_cycles                                           1
  simplex_score_converge                                       0.1
  simplex_cycle_converge                                       1.0
  simplex_trans_step                                           1.0
  simplex_rot_step                                             0.1
  simplex_tors_step                                            10.0
  simplex_random_seed                                          0 

  simplex_restraint_min                                        yes
  simplex_coefficient_restraint                                10.0
  atom_model                                                   all
  vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
  flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
  flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
  ligand_outfile_prefix                                        3VJK.lig.min
  write_orientations                                           no
  num_scored_conformers                                        1
  rank_ligands                                                 no

Now that our input file is made we can now start running minimization.

  dock6 -i min.in -o min.out

once this command is run two files will be generated: min.out and 3VJK.lig.min.mol2. The mol2 file that is generated can be visualized on chimera.

This figure shows the minimized ligand on top of the original ligand. There are some slight differences between the two structures.

Short Cut Using Bash Scripting

Running the previous steps can become tedious when one is working with a massive set of systems. Moreover, the use of a bash script can help to keep a record of what parameters were used for an experiment. A quick way to run these steps is with the following script:

  #!/bin/sh
  echo PDB name
  read pdb
  echo receptor file with hydrogen
  read receptor
  echo receptor file DMA
  read receptor_DMS
  echo ligand file 
  read ligand
  EOF
  echo generating spheres
   cat > INSPH << EOF
  $receptor_DMS
  R 
  X 
  0.0 
  4.0 
  1.4
  ${pdb}_receptor.sph 
  EOF
  echo selecting spheres 
  sphgen -i INSPH -o OUTSPH
  sphere_selector pdb_receptor.sph $ligand 10.0
  #generate grid 
  echo generating grid 
  cat > showbox.in << EOF
  Y
  8.0
  ${pdb}_receptor.sph
  1 
  ${pdb}.box.pdb
  EOF
  showbox<showbox.in
  cat > grid.in << EOF
  compute_grids                             yes
  grid_spacing                              0.4
  output_molecule                           no
  contact_score                             no
  energy_score                              yes
  energy_cutoff_distance                    9999
  atom_model                                a
  attractive_exponent                       6
  repulsive_exponent                        9
  distance_dielectric                       yes
  dielectric_factor                         4
  bump_filter                               yes
  bump_overlap                              0.75
  receptor_file                             $receptor
  box_file                                  pdb.box.pdb
  vdw_definition_file                       /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
  score_grid_prefix                         grid
  EOF
  grid -i grid.in -o gridinfo.out
  cat > min.in << EOF
  conformer_search_type                                        rigid
  use_internal_energy                                          yes
  internal_energy_rep_exp                                      12
  enternal_energy_cutoff                                       100.0
  ligand_atom_file                                             $ligand 
  limit_max_ligands                                            no
  skip_molecule                                                no
  read_mol_solvation                                           no
  calculate_rmsd                                               yes
  use_rmsd_reference_mol                                       yes     
  rmsd_reference_filename                                      $ligand
  use_database_filter                                          no
  orient_ligand                                                no
  bump_filter                                                  no
  score_molecules                                              yes
  contact_score_primary                                        no
  contact_score_secondary                                      no
  grid_score_primary                                           yes
  grid_score_secondary                                         no
  grid_score_rep_rad_scale                                     1
  grid_score_vdw_scale                                         1
  grid_score_es_scale                                          1
  grid_score_grid_prefix                                       grid
  multigrid_score_secondary                                    no
  dock3.5_score_secondary                                      no
  continuous_score_secondary                                   no
  footprint_similarity_score_secondary                         no
  pharmacophore_score_secondary                                no
  descriptor_score_secondary                                   no
  gbsa_zou_score_secondary                                     no
  gbsa_hawkins_score_secondary                                 no
  SASA_score_secondary                                         no
  amber_score_secondary                                        no
  minimize_ligand                                              yes
  simplex_max_iterations                                       1000
  simplex_tors_premin_iterations                               0
  simplex_max_cycles                                           1
  simplex_score_converge                                       0.1
  simplex_cycle_converge                                       1.0
  simplex_trans_step                                           1.0
  simplex_rot_step                                             0.1
  simplex_tors_step                                            10.0
  simplex_random_seed                                          0
  simplex_restraint_min                                        yes
  simplex_coefficient_restraint                                10.0
  atom_model                                                   all
  vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
  flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
  flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
  ligand_outfile_prefix                                        pdb.lig.min
  write_orientations                                           no
  num_scored_conformers                                        1
  rank_ligands                                                 no
  EOF
  echo minimized 
  dock6 -i min.in -o min.out

This can all be copied into a document named dock_setup.sh. This can become an executable file by using the command:

  chmod u+x dock_setup.sh

Footprint Analysis

Using the electrostatic interactions and Van Der Waals Interactions, the molecular footprint can be used to understand how the ligand binds the receptor. Residues that are more negative are predominantly involved in the interaction.

First we will create a directory for the molecular footprint

  mkdir 005.footprint

In this space, we will be creating the files that will be used for the molecular footprint. The first thing we have to do is make the footprint.in file:

  conformer_search_type                                        rigid
  use_internal_energy                                          no
  ligand_atom_file                                             ./../004.energy_min/3VJK.lig.min_scored.mol2
  limit_max_ligands                                            no
  skip_molecule                                                no
  read_mol_solvation                                           no
  calculate_rmsd                                               no
  use_database_filter                                          no
  orient_ligand                                                no
  bump_filter                                                  no
  score_molecules                                              yes
  contact_score_primary                                        no
  contact_score_secondary                                      no
  grid_score_primary                                           no
  grid_score_secondary                                         no
  multigrid_score_primary                                      no
  multigrid_score_secondary                                    no
  dock3.5_score_primary                                        no
  dock3.5_score_secondary                                      no
  continuous_score_primary                                     no
  continuous_score_secondary                                   no
  footprint_similarity_score_primary                           yes
  footprint_similarity_score_secondary                         no
  fps_score_use_footprint_reference_mol2                       yes
  fps_score_footprint_reference_mol2_filename                  ./../001.build/3VJK_ligand_hydrogens.mol2
  fps_score_foot_compare_type                                  Euclidean
  fps_score_normalize_foot                                     no
  fps_score_foot_comp_all_residue                              yes
  fps_score_receptor_filename                                  ./../001.build/3VJK_hydrogen_protein.mol2
  fps_score_vdw_att_exp                                        6
  fps_score_vdw_rep_exp                                        9
  fps_score_vdw_rep_rad_scale                                  1
  fps_score_use_distance_dependent_dielectric                  yes
  fps_score_dielectric                                         4.0
  fps_score_vdw_fp_scale                                       1
  fps_score_es_fp_scale                                        1
  fps_score_hb_fp_scale                                        0
  pharmacophore_score_secondary                                no
  descriptor_score_secondary                                   no
  gbsa_zou_score_secondary                                     no
  gbsa_hawkins_score_secondary                                 no
  SASA_score_secondary                                         no
  amber_score_secondary                                        no 
  minimize_ligand                                              no
  atom_model                                                   all
  vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
  flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
  flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
  ligand_outfile_prefix                                        footprint.out
  write_footprints                                             yes 
  write_hbonds                                                 yes 
  write_orientations                                           no
  num_scored_conformers                                        1
  rank_ligands                                                 no

Now you can use the following command to make the footprint:

  dock6 -i footprint.in

Once this is completed you should have three new files: footprint.out_footprint_scored.txt footprint.out_hbond_scored.txt and footprint.out_scored.mol2

We can visualize the results using a python script: /gpfs/projects/AMS536/zzz.programs/plot_footprint_single_magnitude.py . This should be copied to the current directory.

  cp /gpfs/projects/AMS536/2020/536_class/steve_ta/footprint_test_1.21.2020/plot_footprint_single_magnitude.py

This command can run by doing the following:

  python plot_footprint_single_magnitude.py footprint.out_footprint_scored.txt  50

This will show the energetic distribution for the 50 most significant residues.

This distribution shows the relative stability of each residue with respect to the ligand before and after the docking process. In general, the docking process should arrange the ligand such that the global stability of the compound has increased.

Docking

Rigid Docking

This is the least computationally expensive docking method. This docking routine does not sample internal degrees of freedom (bond angles). The ligand is treated as a rigid object which is why minimization was performed in the prior step.

The first step is to create a dictory--where we will be putting are input and output files.

  mkdir 006.rigid_docking

Now that we have the directory ready, we can make documents within this. The first thing we will want to do is set up the input file

  touch rigid.in 
  dock6 -i rigid.in

This will bring up an interactive input screen. Unlike the prior sets, it is highly suggested to manually complete this task because some options may be moved around depending on the version used. Below you can find the parameters that were used for this tutorial.

  conformer_search_type                                        flex
  write_fragment_libraries                                     no
  user_specified_anchor                                        no
  limit_max_anchors                                            no
  min_anchor_size                                              5
  pruning_use_clustering                                       yes
  pruning_max_orients                                          1000
  pruning_clustering_cutoff                                    100
  pruning_conformer_score_cutoff                               100.0
  pruning_conformer_score_scaling_factor                       1.0
  use_clash_overlap                                            no
  write_growth_tree                                            no
  use_internal_energy                                          yes
  internal_energy_rep_exp                                      12
  internal_energy_cutoff                                       100.0
  ligand_atom_file                                             ./../004.energy_min/3VJK.lig.min_scored.mol2
  limit_max_ligands                                            no
  skip_molecule                                                no
  read_mol_solvation                                           no
  calculate_rmsd                                               yes
  use_rmsd_reference_mol                                       yes
  rmsd_reference_filename                                      ./../004.energy_min/3VJK.lig.min_scored.mol2
  use_database_filter                                          no
  orient_ligand                                                yes
  automated_matching                                           yes
  receptor_site_file                                           ./../002.surface_spheres/selected_spheres.sph   
  max_orientations                                             1000
  critical_points                                              no
  chemical_matching                                            no
  use_ligand_spheres                                           no   
  bump_filter                                                  no
  score_molecules                                              yes
  contact_score_primary                                        no
  contact_score_secondary                                      no
  grid_score_primary                                           yes
  grid_score_secondary                                         no
  grid_score_rep_rad_scale                                     1
  grid_score_vdw_scale                                         1
  grid_score_es_scale                                          1
  grid_score_grid_prefix                                       ./../003.gridbox/grid
  multigrid_score_secondary                                    no
  dock3.5_score_secondary                                      no
  continuous_score_secondary                                   no
  footprint_similarity_score_secondary                         no
  pharmacophore_score_secondary                                no
  descriptor_score_secondary                                   no
  gbsa_zou_score_secondary                                     no
  gbsa_hawkins_score_secondary                                 no
  SASA_score_secondary                                         no
  amber_score_secondary                                        no
  minimize_ligand                                              yes
  minimize_anchor                                              yes
  minimize_flexible_growth                                     yes
  use_advanced_simplex_parameters                              no
  simplex_max_cycles                                           1
  simplex_score_converge                                       0.1
  simplex_cycle_converge                                       1.0
  simplex_trans_step                                           1.0
  simplex_rot_step                                             0.1
  simplex_tors_step                                            10.0
  simplex_anchor_max_iterations                                500
  simplex_grow_max_iterations                                  500
  simplex_grow_tors_premin_iterations                          0
  simplex_random_seed                                          0
  simplex_restraint_min                                        no
  atom_model                                                   all
  vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
  flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
  flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
  ligand_outfile_prefix                                        rigid.out
  write_orientations                                           no
  num_scored_conformers                                        1
  rank_ligands                                                 no

Once the input is complete, there should be two output files: rigid.out and rigid.out_scored.mol2. Like the previous set with minimization, the results can be seen with chimera.

the figure shows the protein with the minimized ligand and the ligand from the rigid dock experiment

Fixed Anchor Docking

In this type of docking, the largest molecular fragment (the anchor) is oriented as the first step. The anchor is defined as the moiety containing no internal rotatable bonds possessing the most atoms. Subsequent rigid portions are attached and are oriented to be as energetically favorable as possible. This allows a larger space of sampling, but is still confined.

Make a directory called 007.fixed_anchor_docking/

In this directory, we will be creating an input file (like we did for the rigid docking) that is called flex.in

  conformer_search_type                                        flex
  write_fragment_libraries                                     no
  user_specified_anchor                                        no
  limit_max_anchors                                            no
  min_anchor_size                                              5
  pruning_use_clustering                                       yes
  pruning_max_orients                                          1000
  pruning_clustering_cutoff                                    100
  pruning_conformer_score_cutoff                               100.0
  pruning_conformer_score_scaling_factor                       1.0
  use_clash_overlap                                            no
  write_growth_tree                                            no
  use_internal_energy                                          yes
  internal_energy_rep_exp                                      12
  internal_energy_cutoff                                       100.0
  ligand_atom_file                                             ./../004.energy_min/3VJK.lig.min_scored.mol2
  limit_max_ligands                                            no
  skip_molecule                                                no
  read_mol_solvation                                           no
  calculate_rmsd                                               yes
  use_rmsd_reference_mol                                       yes
  rmsd_reference_filename                                      ./../004.energy_min/3VJK.lig.min_scored.mol2
  use_database_filter                                          no
  orient_ligand                                                no
  bump_filter                                                  no
  score_molecules                                              yes
  contact_score_primary                                        no
  contact_score_secondary                                      no
  grid_score_primary                                           yes
  grid_score_secondary                                         no
  grid_score_rep_rad_scale                                     1
  grid_score_vdw_scale                                         1
  grid_score_es_scale                                          1
  grid_score_grid_prefix                                       ./../003.gridbox/grid
  multigrid_score_secondary                                    no
  dock3.5_score_secondary                                      no
  continuous_score_secondary                                   no
  footprint_similarity_score_secondary                         no
  pharmacophore_score_secondary                                no
  descriptor_score_secondary                                   no
  gbsa_zou_score_secondary                                     no
  gbsa_hawkins_score_secondary                                 no
  SASA_score_secondary                                         no
  amber_score_secondary                                        no
  minimize_ligand                                              yes 
  minimize_anchor                                              yes
  minimize_flexible_growth                                     yes
  use_advanced_simplex_parameters                              no
  simplex_max_cycles                                           1
  simplex_score_converge                                       .1
  simplex_cycle_converge                                       1
  simplex_trans_step                                           1
  simplex_rot_step                                             .1
  simplex_tors_step                                            10.0
  simplex_anchor_max_iterations                                500
  simplex_grow_max_iterations                                  500
  simplex_grow_tors_premin_iterations                          0
  simplex_random_seed                                          0
  simplex_restraint_min                                        no
  atom_model                                                   all
  vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
  flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn

This can be run by doing the following:

   dock6 -i fixed.in -o fixed.out

Once this is completed you should have two output files: flex.out and 2vjk_fad_scored.mol2.

the figure shows the protein with the minimized ligand and the ligand from the fixed anchor dock experiment

Flexible Docking

Finally, this is the most expensive docking. This will sample all the internal degrees of freedom within the domain of bond angles.

Once again create a directory for this docking experiment. We will call it 008.flexible_docking

conformer_search_type                                        flex
write_fragment_libraries                                     no 
user_specified_anchor                                        no
limit_max_anchors                                            no
min_anchor_size                                              5
pruning_use_clustering                                       yes
pruning_max_orients                                          1000
pruning_clustering_cutoff                                    100
pruning_conformer_score_cutoff                               100.0
pruning_conformer_score_scaling_factor                       1.0
use_clash_overlap                                            no
write_growth_tree                                            no
use_internal_energy                                          yes
internal_energy_rep_exp                                      12
internal_energy_cutoff                                       100.0
ligand_atom_file                                             ./../004.energy_min/3VJK.lig.min_scored.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               yes
use_rmsd_reference_mol                                       yes
rmsd_reference_filename                                      ./../004.energy_min/3VJK.lig.min_scored.mol2
use_database_filter                                          no
orient_ligand                                                yes
automated_matching                                           yes
receptor_site_file                                           ./../002.surface_spheres/selected_spheres.sph
max_orientations                                             1000
critical_points                                              no
chemical_matching                                            no
use_ligand_spheres                                           no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           yes
grid_score_secondary                                         no
grid_score_rep_rad_scale                                     1
grid_score_vdw_scale                                         1
grid_score_es_scale                                          1 
grid_score_grid_prefix                                       ./../003.gridbox/grid
multigrid_score_secondary                                    no
dock3.5_score_secondary                                      no 
continuous_score_secondary                                   no
footprint_similarity_score_secondary                         no
pharmacophore_score_secondary                                no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_score_secondary                                         no
amber_score_secondary                                        no
minimize_ligand                                              yes 
minimize_anchor                                              yes
minimize_flexible_growth                                     yes 
use_advanced_simplex_parameters                              no
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1.0
simplex_trans_step                                           1.0
simplex_rot_step                                             0.1
simplex_tors_step                                            10.0
simplex_anchor_max_iterations                                500
simplex_grow_max_iterations                                  500
simplex_grow_tors_premin_iterations                          0
simplex_random_seed                                          0
simplex_restraint_min                                        no 
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
ligand_outfile_prefix                                        flex.out
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no

Like the previous docking experiments we will do the following to make it run:

  dock6 -i flex.in -o flex.out

Two files should have been generated: flex.out and flex.out_scored.mol2

the figure shows the protein with the minimized ligand and the ligand from the flexible dock experiment

Virtual Screening

Virtual screening is the process of passing a library of molecules through DOCK to a single receptor. This library of molecules can be tailored by the user to match the desired size and constituents. The advantage of a virtual screen is that many molecules can be analyzed with minimal user input, and the most strong binding molecules can be selected for further analysis. The screen performed for this tutorial used the library file "VS_library_5K.mol2" which can be found within the AMS536 directory. If desired to use a different library, all of the molecules must be provided within a single multi-mol2 file.

     touch virtual.in

Input the following prompts:

     conformer_search_type                                        flex
     write_fragment_libraries                                     no
     user_specified_anchor                                        no
     limit_max_anchors                                            no
     min_anchor_size                                              5
     pruning_use_clustering                                       yes
     pruning_max_orients                                          1000
     pruning_clustering_cutoff                                    100
     pruning_conformer_score_cutoff                               100.0
     pruning_conformer_score_scaling_factor                       1.0
     use_clash_overlap                                            no
     write_growth_tree                                            no
     use_internal_energy                                          yes
     internal_energy_rep_exp                                      9
     internal_energy_cutoff                                       100.0
     ligand_atom_file                                             VS_library_5K.mol2
     limit_max_ligands                                            no
     skip_molecule                                                no
     read_mol_solvation                                           no
     calculate_rmsd                                               no
     use_database_filter                                          no
     orient_ligand                                                yes
     automated_matching                                           yes
     receptor_site_file                                           ../002.surface_spheres/selected_spheres.sph
     max_orientations                                             1000
     critical_points                                              no
     chemical_matching                                            no
     use_ligand_spheres                                           no
     bump_filter                                                  no
     score_molecules                                              yes
     contact_score_primary                                        no
     contact_score_secondary                                      no
     grid_score_primary                                           yes
     grid_score_secondary                                         no
     grid_score_rep_rad_scale                                     1
     grid_score_vdw_scale                                         1
     grid_score_es_scale                                          1
     grid_score_grid_prefix                                       ../003.gridbox/grid
     multigrid_score_secondary                                    no
     dock3.5_score_secondary                                      no
     continuous_score_secondary                                   no
     footprint_similarity_score_secondary                         no
     pharmacophore_score_secondary                                no
     descriptor_score_secondary                                   no
     gbsa_zou_score_secondary                                     no
     gbsa_hawkins_score_secondary                                 no
     SASA_score_secondary                                         no
     amber_score_secondary                                        no
     minimize_ligand                                              yes
     minimize_anchor                                              yes
     minimize_flexible_growth                                     yes
     use_advanced_simplex_parameters                              no
     simplex_max_cycles                                           1
     simplex_score_converge                                       0.1
     simplex_cycle_converge                                       1.0
     simplex_trans_step                                           1.0
     simplex_rot_step                                             0.1
     simplex_tors_step                                            10.0
     simplex_anchor_max_iterations                                500
     simplex_grow_max_iterations                                  500
     simplex_grow_tors_premin_iterations                          0
     simplex_random_seed                                          0
     simplex_restraint_min                                        no
     atom_model                                                   all
     vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
     flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
     flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
     ligand_outfile_prefix                                        virtual.out
     write_orientations                                           no
     num_scored_conformers                                        1
     rank_ligands                                                 no

Thus far all work has likely been done on a local machine. Running Dock on 5000 molecules could take an extended time if done on a local machine. To maximize the use of resources, a virtual screen can be done on the Seawolf Cluster. In order to do this, a submission script must be written which gives the cluster information on how to allocate resources.

Name the submission script virtual_screen.sh

     vi virtual_screen.sh

     #!/bin/sh 
     #SBATCH --partition=long-40core
     #SBATCH --time=48:00:00
     #SBATCH --nodes=1
     #SBATCH --ntasks=1
     #SBATCH --job-name=536.vs 
     #SBATCH --output=536.vs.out

     dock6 -i virtual.in -o virtual.out

To submit the script in the bash shell:

     sbatch virtual_screen.sh

You may check the status of you job:

   squeue  -u {username}

Once the job is completed you should read the output files.

  cat 536.vs.out

Since only one node and 1 core, it is expected that the job was canceled due to reaching the time limit.

the image shows the error that will most likely occur when a virtual screening of 5k molecules is conducted on this system

We can also see how many molecules were processed using this command:

  grep Molecule: virtual_screening.out| wc -l

You may also visualize the results of the virtual screening by downloading virtual.out_scored.mol2 and using chimera

  Tools-> Surface/BindingAnalysis->View Dock-> Dock6

This will allow you to visualize each molecule that was screened. Next, you can look at the properties of the molecules by downing the following:

  Column->show

You can select from a variety of properties and rank the molecules based on these properties. This is useful for analyzing which molecules are favored in the active site

Consequently, we must re-run the job with a large number of nodes and cores using MPI.

Virtual Screening with MPI

In the previous section, it was found that the time limit was reached-- however--not all the molecules were screened. We simply cannot resubmit the job to complete the analysis. To compensate, we can submit the job to run on multiple cores. We can do this by using MPI.

You should first start off by making a new directory for the MPI Virtual Screening. This will be called 012.MPI_VS. You can simply copy the input file from the previous run. The only change will be in the submission script:

   #!/bin/sh 
   #SBATCH --partition=long-40core 
   #SBATCH --time=48:00:00
   #SBATCH --nodes=4
   #SBATCH --ntasks=32 
   #SBATCH --job-name=tutorial_run 
   #SBATCH --output=%x-%j.o
     
   echo "starting DOCK6.9 Simulation"
   module load intel/mpi/64/2018/18.0.3
 
   mpirun -np 32 dock6.mpi -i virtual_screening.in -o virtual_screening.out
   echo "done"

In this script, I indicated that I will be using 4 nodes and a total of 32 cores. Additionally, it is required to load the module for mpi and use the dock6.mpi executable. Once the job has finish running you should find that a major of the molecules were calculated. Once again you may check using this command:

  grep Molecule: virtual_screening.out| wc -l

In this tutorial, the 32 cores allowed for the calculation of 5665 out of the 5730 small molecules in the library. Thus, in the future, it may be required to allocate more resources to run through the whole library. Nevertheless, we may proceed using the available data.

Cartesian Minimization

We will now conduct minimization on the molecules that were docked. To do this, we must first create an input file:

  conformer_search_type                                        rigid
  use_internal_energy                                          yes
  internal_energy_rep_exp                                      12
  internal_energy_cutoff                                       100
  ligand_atom_file                                             ./../012.mpi_job/virtual.out_scored.mol2
  limit_max_ligands                                            no
  skip_molecule                                                no
  read_mol_solvation                                           no
  calculate_rmsd                                               no
  use_database_filter                                          no
  orient_ligand                                                no
  bump_filter                                                  no
  score_molecules                                              yes
  contact_score_primary                                        no
  contact_score_secondary                                      no
  grid_score_primary                                           no
  grid_score_secondary                                         no
  multigrid_score_primary                                      no
  multigrid_score_secondary                                    no
  dock3.5_score_primary                                        no
  dock3.5_score_secondary                                      no
  continuous_score_primary                                     yes
  continuous_score_secondary                                   no
  cont_score_rec_filename                                      ./../001.build/3VJK_hydrogen_protein.mol2
  cont_score_att_exp                                           6
  cont_score_rep_exp                                           12
  cont_score_rep_rad_scale                                     1
  cont_score_use_dist_dep_dielectric                           yes
  cont_score_dielectric                                        4.0
  cont_score_vdw_scale                                         1.0
  cont_score_es_scale                                          1.0
  footprint_similarity_score_secondary                         no
  pharmacophore_score_secondary                                no
  descriptor_score_secondary                                   no
  gbsa_zou_score_secondary                                     no
  gbsa_hawkins_score_secondary                                 no
  SASA_score_secondary                                         no
  amber_score_secondary                                        no
  minimize_ligand                                              yes
  simplex_max_iterations                                       1000
  simplex_tors_premin_iterations                               0
  simplex_max_cycles                                           1.0
  simplex_score_converge                                       0.1
  simplex_cycle_converge                                       1.0
  simplex_trans_step                                           1.0
  simplex_rot_step                                             0.1
  simplex_tors_step                                            10.0
  simplex_random_seed                                          0
  simplex_restraint_min                                        no
  atom_model                                                   all
  vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
  flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
  flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
  ligand_outfile_prefix                                        3vjk.virtual_screen.min 
  write_orientations                                           no
  num_scored_conformers                                        1
  rank_ligands                                                 no

We have named this min.in. It is important to make sure that the directory paths reflect your own paths. Once the input file has been made you should create a submission script:

  #!/bin/sh 
  #SBATCH --partition=long-40core 
  #SBATCH --time=48:00:00
  #SBATCH --nodes=3
  #SBATCH --ntasks=24 
  #SBATCH --job-name=tutorial_run 
  #SBATCH --dependency=afterok:227294
  #SBATCH --output=%x-%j.o
   
   echo "starting Cartesian Minimization"
   dock6 -i min.in -o min.out 
   echo "Minimization done"

You may name this job.sh. And now you should be able to submit the job.

In a similar manner to the previous section, you can download the output .mol2 files and load it using viewdock on chimera. This can be used to compare the unminimized ligands to the minimized ligands. You should expect to find that the energy is more negative after minimization.

the image shows the protein with both the minimized ligand and unminimized ligand. The minimized ligand is colored yellow

Rescoring Molecules

We will now rescore the molecules based on several factors. This includes footprint, pharmacophore, Tanimoto, Hungarian, and volume_overlap score. The first step is to make a directory. This can be called rescore. Now you must make the input file (rescore.in):

  conformer_search_type                                        rigid
  use_internal_energy                                          yes
  internal_energy_rep_exp                                      12 
  internal_energy_cutoff                                       100
  ligand_atom_file                                             ./../009.virtual_screening/virtual.out_scored.mol2
  limit_max_ligands                                            no
  skip_molecule                                                no
  read_mol_solvation                                           no
  calculate_rmsd                                               no
  use_database_filter                                          no
  orient_ligand                                                no
  bump_filter                                                  no
  score_molecules                                              yes
  contact_score_primary                                        no
  contact_score_secondary                                      no
  grid_score_primary                                           no
  grid_score_secondary                                         no
  multigrid_score_primary                                      no
  multigrid_score_secondary                                    no
  dock3.5_score_primary                                        no
  dock3.5_score_secondary                                      no
  continuous_score_primary                                     no
  continuous_score_secondary                                   no
  footprint_similarity_score_primary                           no
  footprint_similarity_score_secondary                         no
  pharmacophore_score_primary                                  no
  pharmacophore_score_secondary                                no
  descriptor_score_primary                                     yes
  descriptor_score_secondary                                   no
  descriptor_use_grid_score                                    no
  descriptor_use_multigrid_score                               no
  descriptor_use_continuous_score                              no
  descriptor_use_footprint_similarity                          yes
  descriptor_use_pharmacophore_score                           yes
  descriptor_use_hungarian                                     yes
  descriptor_use_volume_overlap                                yes
  descriptor_fps_score_use_footprint_reference_mol2            yes
  descriptor_fps_score_footprint_reference_mol2_filename       ./../004.energy_min/3VJK.lig.min_scored.mol2
  descriptor_fps_score_foot_compare_type                       Euclidean
  descriptor_fps_score_normalize_foot                          no
  descriptor_fps_score_foot_comp_all_residue                   yes
  descriptor_fps_score_receptor_filename                       ./../001.build/3VJK_hydrogen_protein.mol2
  descriptor_fps_score_vdw_att_exp                             6
  descriptor_fps_score_vdw_rep_exp                             12
  descriptor_fps_score_vdw_rep_rad_scale                       1
  descriptor_fps_score_use_distance_dependent_dielectric       yes
  descriptor_fps_score_dielectric                              4.0
  descriptor_fps_score_vdw_fp_scale                            1
  descriptor_fps_score_es_fp_scale                             1
  descriptor_fps_score_hb_fp_scale                             0
  descriptor_fms_score_use_ref_mol2                            yes
  descriptor_fms_score_ref_mol2_filename                       ./../004.energy_min/3VJK.lig.min_scored.mol2
  descriptor_fms_score_write_reference_pharmacophore_mol2      no
  descriptor_fms_score_write_reference_pharmacophore_txt       no
  descriptor_fms_score_write_candidate_pharmacophore           no
  descriptor_fms_score_write_matched_pharmacophore             no
  descriptor_fms_score_compare_type                            overlap
  descriptor_fms_score_full_match                              yes
  descriptor_fms_score_match_rate_weight                       5.0
  descriptor_fms_score_match_dist_cutoff                       1.0
  descriptor_fms_score_match_proj_cutoff                       .7071
  descriptor_fms_score_max_score                               20
  descriptor_fingerprint_ref_filename                          ./../004.energy_min/3VJK.lig.min_scored.mol2
  descriptor_hms_score_ref_filename                            ./../004.energy_min/3VJK.lig.min_scored.mol2
  descriptor_hms_score_matching_coeff                          -5
  descriptor_hms_score_rmsd_coeff                              1
  descriptor_volume_score_reference_mol2_filename              ./../004.energy_min/3VJK.lig.min_scored.mol2
  descriptor_volume_score_overlap_compute_method               analytical
  descriptor_weight_fps_score                                  1
  descriptor_weight_pharmacophore_score                        1
  descriptor_weight_fingerprint_tanimoto                       -1
  descriptor_weight_hms_score                                  1
  descriptor_weight_volume_overlap_score                       -1
  gbsa_zou_score_secondary                                     no
  gbsa_hawkins_score_secondary                                 no
  SASA_score_secondary                                         no
  amber_score_secondary                                        no
  minimize_ligand                                              no
  atom_model                                                   all
  vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
  flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
  flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
  chem_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/chem.defn
  pharmacophore_defn_file                                      /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/ph4.defn
  ligand_outfile_prefix                                        descriptor.output
  write_footprints                                             yes
  write_hbonds                                                 yes
  write_orientations                                           no
  num_scored_conformers                                        1
  rank_ligands                                                 no

In a similar fashion to submitting the virtual screening, you may chose to submit the job using script:

  #!/bin/sh 
  #SBATCH --partition=long-40core 
  #SBATCH --time=48:00:00
  #SBATCH --nodes=4
  #SBATCH --ntasks=32 
  #SBATCH --job-name=tutorial_run 
  #SBATCH --output=%x-%j.o
  
  echo "starting DOCK6.9 Simulation"
  module load intel/mpi/64/2018/18.0.3
  
  mpirun -np 32 dock6.mpi -i rescore.in -o rescore.out
  echo "done"

This can be submitted by doing the following

  sbatch mpi.sh

Once the job has run you should be able to see the results by loading descriptor.output_scored.mol2 into chimera using view docks. You can select which scoring properties to look at by using the column feature.

2020 DOCK tutorial 1 with PDBID 3VJK

Contents

Introduction

DOCK

3VJK

Software packages

Directory Organization

Receptor Preparation

Ligand Preparation

Surface Generation & Sphere Selection

Generation of the Box and Grid

Generating the box

Generating the grid

Energy Minimization

Short Cut Using Bash Scripting

Footprint Analysis

Docking

Rigid Docking

Fixed Anchor Docking

Flexible Docking

Virtual Screening

Virtual Screening with MPI

Cartesian Minimization

Rescoring Molecules

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

Rizzo Lab

Courses

Toolbox