This tutorial presents a step-by-step approach to docking a known ligand into a known receptor.

I. Introduction

DOCK

DOCK is a molecular docking program used in drug discovery. It was developed by Irwin D. Kuntz, Jr. and colleagues at UCSF (see UCSF DOCK). This program, given a protein binding site and a small molecule, tries to predict the correct binding mode of the small molecule in the binding site, and the associated binding energy. Small molecules with highly favorable binding energies could be new drug leads. This makes DOCK a valuable drug discovery tool. DOCK is typically used to screen massive libraries of millions of compounds against a protein to isolate potential drug leads. These leads are then further studied, and could eventually result in a new, marketable drug. DOCK works well as a screening procedure for generating leads, but is not currently as useful for optimization of those leads.

DOCK 6 uses an incremental construction algorithm called anchor and grow. It is described by a three-step process:

Rigid portion of ligand (anchor) is docked by geometric methods.
Non-rigid segments added in layers; energy minimized.
The resulting configurations are 'pruned' and energy re-minimized, yielding the docked configurations.

2NNQ

The tutorial will be based on the PDB file 2NNQ downloaded from the PDB Database. 2NNQ is the crystal structure for a human adipocyte fatty acid binding protein in complex with ((2'-(5-ethyl-3,4-diphenyl-1H-pyrazol-1-yl)-3-biphenylyl)oxy)acetic acid.

Organization of Directories

Maintaining a clearly organized set of folders makes it easier to find specific files, to reference files from input files and, most importantly, to keep track of everything you do. We recommend maintaining the following set of directories throughout the tutorial.

             0.files
             1.dockprep
             2.surface_spheres
             3.boxgrid
             4.dock
             5.
             6.footprint
             7.
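The layout above can also be created with a short script. This is a minimal sketch, not part of the original workflow: the helper name make_tutorial_dirs is hypothetical, and it only creates the directories whose full names appear in this tutorial (3.boxgrid, 4.dock and 6.footprint are the names used by the later steps).

```python
from pathlib import Path

# Directory names that are spelled out in this tutorial.
# The unnamed entries (5. and 7.) are intentionally left out.
TUTORIAL_DIRS = [
    "0.files",
    "1.dockprep",
    "2.surface_spheres",
    "3.boxgrid",
    "4.dock",
    "6.footprint",
]


def make_tutorial_dirs(root):
    """Create the tutorial directories under root and return their paths."""
    root = Path(root)
    created = []
    for name in TUTORIAL_DIRS:
        path = root / name
        # exist_ok=True makes the call safe to repeat
        path.mkdir(parents=True, exist_ok=True)
        created.append(path)
    return created
```

Calling make_tutorial_dirs(".") from the project root creates any directories that are missing and leaves existing ones untouched.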

II. Preparation of the ligand and receptor

Download the PDB file 2NNQ from the PDB database and save it in the 0.files folder.

Checking the structure

 - Read the article related to the PDB file to understand protonation states, charges, environmental conditions and other important information regarding the receptor and the ligand.
 - Open the PDB file in Chimera and look at the structure. Identify the main components of the model (receptor, ligand, solvent, surfactants, metal ions).
 - Check carefully for missing residues or missing loops. (This particular PDB file has neither.)

Preparation of receptor

 - Open the PDB file (2NNQ.pdb) via Chimera
 - Isolate the receptor using select tool and delete tool in Chimera.
 - Save the isolated receptor as a mol2 file. (2nnq_rec_noH.mol2)

 - Open the 2nnq_rec_noH.mol2 file again in Chimera and use the following instructions to prepare the receptor file for DOCK.
          Tools -> Structure Editing -> Add H (to add hydrogen atoms)
          Tools -> Structure Editing -> Add Charge (to add charges, use the latest AMBER force field available for standard residues; here we used AMBER ff14SB)
          Save as a mol2 file. (2nnq_rec_withH.mol2)

 - Alternatively, the Dock Prep tool below walks through all of the steps above, one after the other, to prepare the receptor.
          Tools -> Surface/Binding Analysis -> Dock Prep

Preparation of ligand

 - Open the PDB file via Chimera.
 - Using Chimera, isolate the ligand, add H atoms, add charge and save it as a mol2 file by following the same steps followed for the receptor.

Once all the files are prepared, save them in the 1.dockprep folder.

III. Generating receptor surface and spheres

Preparation of DMS file

 - Open 2nnq_rec_noH.mol2 in Chimera.
 - Actions -> Surface -> Show
 - Tools -> Structure Editing -> Write DMS
 - Save the file as 2nnq_rec_noH.dms in the 2.surface_spheres folder.

Transfer all the folders created so far to the Seawulf cluster, where DOCK will be run.

Generating spheres

 - Go to the 2.surface_spheres folder.
 - Create a new input file for sphere generation by typing vim INSPH, and enter the following lines in the file.

2nnq_rec_noH.dms
R
X
0.0
4.0
1.4
2nnq_rec.sph

The first line, 2nnq_rec_noH.dms, specifies the input surface file. R indicates that the spheres will be generated outside of the receptor surface. X specifies that all surface points will be used. 0.0 is the distance in angstroms used to avoid steric clashes. 4.0 is the maximum sphere radius and 1.4 is the minimum sphere radius, both in angstroms. The last line, 2nnq_rec.sph, names the output sph file that will contain the clustered spheres.
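Because sphgen reads INSPH purely by line position, it is easy to swap two values by accident. The following is a minimal sketch of a helper that builds the file from named parameters; the function name build_insph and its argument names are hypothetical and are not part of DOCK.

```python
def build_insph(dms_file, sph_file, side="R", points="X",
                clash_distance=0.0, max_radius=4.0, min_radius=1.4):
    """Return the text of an INSPH file, one value per line, in sphgen order."""
    lines = [
        dms_file,             # input molecular surface file
        side,                 # R: spheres outside the receptor surface
        points,               # X: use all surface points
        str(clash_distance),  # distance used to avoid steric clashes (angstroms)
        str(max_radius),      # maximum sphere radius (angstroms)
        str(min_radius),      # minimum sphere radius (angstroms)
        sph_file,             # output file for the clustered spheres
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Prints the same seven lines as the INSPH file used in this tutorial.
    print(build_insph("2nnq_rec_noH.dms", "2nnq_rec.sph"), end="")
```

Writing the returned text to a file named INSPH gives the same input as typing the lines by hand.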

Once the INSPH file is ready, type the following command to generate the spheres.

 sphgen -i INSPH -o OUTSPH

Once the sphgen command succeeds, the file 2nnq_rec.sph will be created. Open it in Chimera along with the 2nnq_rec_noH.mol2 file. The output should look similar to the image below.

All the spheres generated for 2nnq receptor

Selecting Spheres

Here we select the spheres that define the binding pocket of the ligand, because we want to direct the ligand towards that binding site rather than all over the receptor. To select the spheres, type the following command.

 sphere_selector 2nnq_rec.sph ../1.dockprep/2nnq_lig_withH.mol2 10.0

This command selects all of the spheres within 10.0 angstroms of the ligand and writes them to selected_spheres.sph. Visualize the selected spheres in Chimera to make sure the correct spheres were selected. Notice that the spheres around the ligand binding site are kept and all the other spheres are deleted in the image below.

2nnq receptor and selected spheres
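The selection criterion can be illustrated with a few lines of Python. This is a sketch of the idea only, under the assumption that a sphere is kept when its center lies within the cutoff of at least one ligand atom; it is not the sphere_selector source code, and the function name select_spheres is hypothetical.

```python
import math


def select_spheres(sphere_centers, ligand_atoms, cutoff=10.0):
    """Keep sphere centers that lie within cutoff angstroms of any ligand atom.

    sphere_centers and ligand_atoms are sequences of (x, y, z) tuples.
    """
    selected = []
    for center in sphere_centers:
        # math.dist gives the Euclidean distance between two points
        if any(math.dist(center, atom) <= cutoff for atom in ligand_atoms):
            selected.append(center)
    return selected
```

With a single ligand atom at (1, 0, 0), a sphere centered at (20, 0, 0) is 19 angstroms away and is dropped, while spheres at the origin and at (5, 5, 5) are kept.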

IV. Generating box and grid

Generating box

Move to the 3.boxgrid directory. Create a new file named showbox.in and write the following lines in it.

 Y
 8.0
 ../2.surface_spheres/selected_spheres.sph
 1
 2nnq.box.pdb

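Each of the above lines indicates, in order:

 We intend to generate a box automatically around the spheres
 The box should extend an extra margin of 8 angstroms beyond the spheres in every direction
 Use the selected_spheres file in the designated location
 Use cluster number 1 from the sphere file
 The name of the output file that will contain the generated box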

Use the following command to generate the box.

 showbox < showbox.in

If this step is successful, you should see a new file (2nnq.box.pdb) in 3.boxgrid folder.
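To get a feeling for the size of the resulting box, the geometry can be sketched in Python. This sketch treats the 8.0 value as an extra margin added on every side of the selected spheres, and it assumes that only the sphere centers matter (sphere radii are ignored); the function name box_around_spheres is hypothetical and the code is not taken from showbox.

```python
def box_around_spheres(centers, margin=8.0):
    """Return (lower_corner, upper_corner) of an axis-aligned box.

    The box encloses every (x, y, z) sphere center and is padded by
    margin angstroms in each direction.
    """
    if not centers:
        raise ValueError("at least one sphere center is required")
    xs, ys, zs = zip(*centers)
    lower = (min(xs) - margin, min(ys) - margin, min(zs) - margin)
    upper = (max(xs) + margin, max(ys) + margin, max(zs) + margin)
    return lower, upper
```

A larger margin gives the ligand more room during docking but increases the number of grid points that have to be computed in the next step.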

Generating grid

Create a new file (grid.in)

Use the following command to generate the grid.

 grid -i grid.in -o gridinfo.out

Answer the prompted questions with the answers given below. Alternatively, put the following lines in the grid.in file before entering the above command; the questions will then be answered automatically from grid.in instead of being prompted.

compute_grids                             yes
grid_spacing                              0.4
output_molecule                           no
contact_score                             no
energy_score                              yes
energy_cutoff_distance                    9999
atom_model                                a
attractive_exponent                       6
repulsive_exponent                        12
distance_dielectric                       yes
dielectric_factor                         4
bump_filter                               yes
bump_overlap                              0.75
receptor_file                             ../1.dockprep/2nnq_rec_withH.mol2
box_file                                  2nnq.box.pdb
vdw_definition_file                       /gpfs/projects/AMS536/zzz.programs/dock6/parameters/vdw_AMBER_parm99.defn
score_grid_prefix                         grid

If the command is successful, three new files will be generated (gridinfo.out, grid.nrg, grid.bmp). Go through the gridinfo.out file to make sure the receptor information it reports (e.g. total charge, residues and their charges) matches the original information for the receptor. If the information does not match, an error was made in one of the earlier steps.
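For the total-charge check, an independent value can be computed directly from the receptor mol2 file. The sketch below sums the partial-charge column of the @<TRIPOS>ATOM section; it assumes the standard Tripos mol2 layout, in which the charge is the ninth whitespace-separated field of each atom record. The function name mol2_total_charge is hypothetical.

```python
def mol2_total_charge(mol2_text):
    """Sum the partial charges found in the @<TRIPOS>ATOM section."""
    total = 0.0
    in_atoms = False
    for line in mol2_text.splitlines():
        if line.startswith("@<TRIPOS>"):
            # Only the ATOM section carries the per-atom charges
            in_atoms = line.strip() == "@<TRIPOS>ATOM"
            continue
        if in_atoms:
            fields = line.split()
            # id, name, x, y, z, type, subst_id, subst_name, charge
            if len(fields) >= 9:
                total += float(fields[8])
    return total


if __name__ == "__main__":
    with open("../1.dockprep/2nnq_rec_withH.mol2") as handle:
        print("total charge: %.4f" % mol2_total_charge(handle.read()))
```

The printed value should agree, within rounding, with the total charge reported in gridinfo.out.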

V. Docking a single molecule for pose reproduction

In this section, the ligand from 2nnq.pdb will be re-docked into the receptor. Three methods will be used to achieve this.

1. rigid docking

2. fixed anchor docking

3. flexible docking

Energy minimization

Before docking, the ligand is subjected to energy minimization in order to remove unfavorable clashes. These clashes affect rigid docking in particular, because there the ligand is docked as a complete molecule, whereas in the other docking methods the ligand is broken into fragments and rebuilt step by step, considering favorable orientations and torsion angles after each fragment is added.

Go to the directory 4.dock, create a new file (min.in) and enter the command below.

 dock6 -i min.in

Answer the prompted questions using the answers given below, or include the following lines in the min.in file before entering the above command to avoid answering the questions manually.

conformer_search_type                                        rigid
use_internal_energy                                          yes
internal_energy_rep_exp                                      12
internal_energy_cutoff                                       100.0
ligand_atom_file                                             ../1.dockprep/2nnq_lig_withH.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               yes
use_rmsd_reference_mol                                       ../1.dockprep/2nnq_lig_withH.mol2
use_database_filter                                          no
orient_ligand                                                no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           yes
grid_score_secondary                                         no
grid_score_rep_rad_scale                                     1
grid_score_vdw_scale                                         1
grid_score_es_scale                                          1
grid_score_grid_prefix                                       ../3.boxgrid/grid
multigrid_score_secondary                                    no
dock3.5_score_secondary                                      no
continuous_score_secondary                                   no
footprint_similarity_score_secondary                         no
pharmacophore_score_secondary                                no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_score_secondary                                         no
amber_score_secondary                                        no
minimize_ligand                                              yes
simplex_max_iterations                                       1000
simplex_tors_premin_iterations                               0
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1.0
simplex_trans_step                                           1.0
simplex_rot_step                                             0.1
simplex_tors_step                                            10.0
simplex_random_seed                                          0
simplex_restraint_min                                        yes
simplex_coefficient_restraint                                10.0
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex_drive.tbl
ligand_outfile_prefix                                        2nnq.lig.min
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no

If the process is successful, a new file (2nnq.lig.min_scored.mol2) will be generated. You can see how much the ligand changed from its initial structure by checking the RMSD value reported in the file. Visualize the new mol2 file along with the receptor and the initial ligand mol2 files in Chimera to see the differences.

2nnq_receptor with the original ligand and the minimized ligand
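The RMSD reported by DOCK measures how far the atoms moved during minimization. As a minimal sketch of the quantity itself, the function below computes the root-mean-square deviation between two coordinate lists. It assumes both lists contain the same atoms in the same order and performs no superposition; the name rmsd is hypothetical, and the code is not DOCK's own implementation.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two lists of (x, y, z) tuples."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    # Sum of squared atom-to-atom distances, then mean and square root
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))
```

If every atom is shifted by exactly 1 angstrom, the function returns 1.0; a value close to zero means the minimized ligand is essentially unchanged from the starting structure.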

Molecular Footprint

Molecular footprints can be used to determine how a ligand interacts with the receptor. Usually, the molecular footprint shows the electrostatic and van der Waals interactions on a per-residue basis. Here, the molecular footprint will be used to compare how the ligand interacts with the receptor before and after minimization. To generate molecular footprints, use the following steps.

Go to directory 6.footprint

Generate an input file by typing:

touch footprint.in

Use DOCK6 to generate footprints

dock6 -i footprint.in

Use the following lines to answer the prompted questions.

conformer_search_type                                        rigid
use_internal_energy                                          no
ligand_atom_file                                             2nnq_lig_min.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               no
use_database_filter                                          no
orient_ligand                                                no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           no
grid_score_secondary                                         no
multigrid_score_primary                                      no
multigrid_score_secondary                                    no
dock3.5_score_primary                                        no
dock3.5_score_secondary                                      no
continuous_score_primary                                     no
continuous_score_secondary                                   no
footprint_similarity_score_primary                           yes
footprint_similarity_score_secondary                         no
fps_score_use_footprint_reference_mol2                       yes
fps_score_footprint_reference_mol2_filename                  2nnq_lig_withH.mol2
fps_score_foot_compare_type                                  Euclidean
fps_score_normalize_foot                                     no
fps_score_foot_comp_all_residue                              yes
fps_score_receptor_filename                                  ../1.dockprep/2nnq_rec_withH.mol2
fps_score_vdw_att_exp                                        6
fps_score_vdw_rep_exp                                        12
fps_score_vdw_rep_rad_scale                                  1
fps_score_use_distance_dependent_dielectric                  yes
fps_score_dielectric                                         4.0
fps_score_vdw_fp_scale                                        1
fps_score_es_fp_scale                                        1
fps_score_hb_fp_scale                                        0
pharmacophore_score_secondary                                no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_score_secondary                                         no
amber_score_secondary                                        no
minimize_ligand                                              no
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex_drive.tbl
ligand_outfile_prefix                                        footprint.out
write_footprints                                             yes
write_hbonds                                                 yes
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no

Once everything is successful these output files should be generated. (footprint.out_footprint_scored.txt, footprint.out_hbond_scored.txt, footprint.out_scored.mol2)
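The input above sets fps_score_foot_compare_type to Euclidean, so the two footprints are compared as vectors of per-residue energies. The following is a minimal sketch of that comparison, assuming one energy value per residue in the same residue order for the reference and the pose; the function name footprint_distance is hypothetical, and the code is an illustration rather than DOCK's implementation.

```python
import math


def footprint_distance(reference, pose):
    """Euclidean distance between two per-residue energy vectors."""
    if len(reference) != len(pose):
        raise ValueError("footprints must cover the same residues")
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, pose)))
```

A distance of zero means the two footprints are identical; larger values mean the pose interacts with the receptor residues differently from the reference.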

Use the following script to visualize the molecular footprint.

#!/usr/bin/python

## Run: python plot_footprint_single_magnitude.py {footprint_similarity_output_text_file} 50
##      This plots the residues with the 50 highest peaks, the rest of residues are combined into Remaining

import sys
import math
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator, FormatStrFormatter

####################################################################################################

def identify_residues(filename, max_res):

    ### Read in the footprint txt file
    fp_file = open(filename,'r')
    lines = fp_file.readlines()
    fp_file.close()

    ### Count the number of residues in footprint.txt file
    num_res = 0
    
    for line in lines:
        linesplit = line.split()
        if (len(linesplit) == 8):
            if (linesplit[0] != 'resname'):
                num_res += 1

    ### Create an array to analyze the footprint info
    fp_array = [[0 for i in range(2)] for j in range(num_res)]

    for i in range(num_res):
        fp_array[i][0] = i


    ### Go through again and save the larger one between VDW and ES in the array

    count = 0
    for line in lines:
        linesplit = line.split()
        if (len(linesplit) == 8 and linesplit[0] != 'resname'):
            fp_array[count][1] = max(math.fabs(float(linesplit[2])), math.fabs(float(linesplit[3])), math.fabs(float(linesplit[5])), math.fabs(float(linesplit[6])))
            count += 1

    ### Sort the list by number of hits in count
    fp_array.sort(key=lambda x: x[1])
    resindex_selected = []
    resindex_remainder = []

    ### Get the index list for the residues with the highest counts and the remainder
    for i in range(max_res):
        resindex_selected.append(fp_array[(num_res-1)-i][0])

    for i in range(num_res - max_res):
        resindex_remainder.append(fp_array[i][0])

    resindex_selected.sort()
    resindex_remainder.sort()
    del fp_array[:][:]

    return resindex_selected, resindex_remainder

####################################################################################################

def plot_footprints(filename, resindex_selected, resindex_remainder):

    ### Plot the footprint
        footprint = open(filename,'r')
        lines = footprint.readlines()
        footprint.close()

        ### Store the resname, resid, and fp information appropriately
        resname = []; resid = []; vdw_ref = []; es_ref = []; vdw_pose = []; es_pose = []
        vdw_score = ""; es_score = ""
        vdw_energy = ""; es_energy = ""

        for line in lines:
            linesplit = line.split()
            if (len(linesplit) == 3):
                if (linesplit[1] == 'vdw_fp:'):
                    vdw_score = 'd = '+linesplit[2]
                if (linesplit[1] ==  'es_fp:'):
                    es_score = 'd = '+linesplit[2]
                if (linesplit[1] == 'vdw:'):
                    vdw_energy = 'vdw = '+linesplit[2]+' kcal/mol'
                if (linesplit[1] == 'es:'):
                    es_energy = 'es = '+linesplit[2]+' kcal/mol'
            if (len(linesplit) == 8):
                if (linesplit[0] != 'resname'):
                    resname.append(linesplit[0])
                    resid.append(linesplit[1])
                    vdw_ref.append(float(linesplit[2]))
                    es_ref.append(float(linesplit[3]))
                    vdw_pose.append(float(linesplit[5]))
                    es_pose.append(float(linesplit[6]))

        ### Put the selected residues onto a selected array
        resname_selected = []
        vdw_ref_selected = []; es_ref_selected = []; vdw_pose_selected = []; es_pose_selected = []

        for i in (resindex_selected):
            resname_selected.append(resname[i]+resid[i])
            vdw_ref_selected.append(vdw_ref[i])
            es_ref_selected.append(es_ref[i])
            vdw_pose_selected.append(vdw_pose[i])
            es_pose_selected.append(es_pose[i])

        ### Compute the sums for the remainder residues
        vdw_ref_remainder = 0; es_ref_remainder = 0; vdw_pose_remainder = 0; es_pose_remainder = 0

        for i in (resindex_remainder):
            vdw_ref_remainder += vdw_ref[i]
            es_ref_remainder += es_ref[i]
            vdw_pose_remainder += vdw_pose[i]
            es_pose_remainder += es_pose[i]

        ### Append the remainders to the end of the selected arrays
        resname_selected.append('REMAIN')
        vdw_ref_selected.append(vdw_ref_remainder)
        es_ref_selected.append(es_ref_remainder)
        vdw_pose_selected.append(vdw_pose_remainder)
        es_pose_selected.append(es_pose_remainder)
        
        ### Create an index for plotting
        residue = []
        
        for i in range(len(resname_selected)):
            residue.append(i)

        ### Check for self consistency
        #print (sum(vdw_pose_selected) - sum(vdw_pose)) + (sum(vdw_ref_selected) - sum(vdw_ref)) + (sum(es_pose_selected) - sum(es_pose)) + (sum(es_ref_selected) - sum (es_ref))

        ### Plot the figure
        fig = plt.figure(figsize=(12, 11))
        ax1 = fig.add_subplot(2,1,1)
        ax1.set_title(filename.strip())
        plt.plot(residue, vdw_ref_selected, 'b', linewidth=3)
        plt.plot(residue, vdw_pose_selected, 'r', linewidth=3)
        ax1.set_ylabel('VDW Energy')
        ax1.set_ylim(-10, 5)
        ax1.set_xlim(0, len(resname_selected))
        ax1.xaxis.set_major_locator(MultipleLocator(1))
        ax1.xaxis.set_major_formatter(FormatStrFormatter('%s'))
        ax1.set_xticks(residue)
        ax1.xaxis.grid(which='major', color='black', linestyle='solid')
        ax1.set_xticklabels(resname_selected, rotation=90)
        ax1.legend(['Reference', 'Pose'])
        ax1.annotate(vdw_score, xy=(37,-8), backgroundcolor='white', bbox={'facecolor':'white', 'alpha':1.0, 'pad':10})
        ax1.annotate(vdw_energy, xy=(37,-9), backgroundcolor='white', bbox={'facecolor':'white', 'alpha':1.0, 'pad':10})
       
        ax2 = fig.add_subplot(2,1,2)
        plt.plot(residue, es_ref_selected, 'b', linewidth=3)
        plt.plot(residue, es_pose_selected, 'r', linewidth=3)
        ax2.set_ylabel('ES Energy')
        ax2.set_ylim(-10, 5)
        ax2.set_xlim(0, len(resname_selected))
        ax2.xaxis.set_major_locator(MultipleLocator(1))
        ax2.xaxis.set_major_formatter(FormatStrFormatter('%s'))
        ax2.set_xticks(residue)
        ax2.xaxis.grid(which='major', color='black', linestyle='solid')
        ax2.set_xticklabels(resname_selected, rotation=90)
        ax2.legend(['Reference', 'Pose'])
        ax2.annotate(es_score, xy=(37,-8), backgroundcolor='white', bbox={'facecolor':'white', 'alpha':1.0, 'pad':10})
        ax2.annotate(es_energy, xy=(37,-9), backgroundcolor='white', bbox={'facecolor':'white', 'alpha':1.0, 'pad':10})

        plt.show()

        #plt.savefig(filename.strip()+'.pdf')
        plt.close()

        del resname[:]
        del resid[:]
        del vdw_ref[:]
        del es_ref[:]
        del vdw_pose[:]
        del es_pose[:]
        del resname_selected[:]
        del vdw_ref_selected[:]
        del es_ref_selected[:]
        del vdw_pose_selected[:]
        del es_pose_selected[:]
        del residue[:]

        return

####################################################################################################

def main():

    ### Get the command line arguments
    filename = sys.argv[1]
    max_res = int(sys.argv[2])

   ### Go through the first time to identify interactions above the threshold
    (resindex_selected, resindex_remainder) = identify_residues(filename, max_res)

    #print resindex_selected; print "\n"; print resindex_remainder
    #print "\n"; print len(resindex_selected); print len(resindex_remainder)

    ### Go through a second time to write plots to file
    plot_footprints(filename, resindex_selected, resindex_remainder)


    return

####################################################################################################

main()

Rigid Docking

Create an input file for rigid docking

touch rigid.in

Run dock using the created input file.

dock6 -i rigid.in

As with minimization, either answer the prompted questions manually using the answers below, or include the following lines in the input file before running dock.

conformer_search_type                                        rigid
use_internal_energy                                          yes
ligand_atom_file                                             2nnq.lig.min_scored.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               yes
use_rmsd_reference_mol                                       2nnq.lig.min_scored.mol2
use_database_filter                                          no
orient_ligand                                                yes
automated_matching                                           yes
receptor_site_file                                           ../2.surface_spheres/selected_spheres.sph
max_orientations                                             1000
critical_points                                              no
chemical_matching                                            no
use_ligand_spheres                                           no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           yes
grid_score_secondary                                         no
grid_score_rep_rad_scale                                     1
grid_score_vdw_scale                                         1
grid_score_es_scale                                          1
grid_score_grid_prefix                                       ../3.boxgrid/grid
multigrid_score_secondary                                    no
dock3.5_score_secondary                                      no
continuous_score_secondary                                   no
footprint_similarity_score_secondary                         no
pharmacophore_score_secondary                                no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_score_secondary                                         no
amber_score_secondary                                        no
minimize_ligand                                              yes
simplex_max_iterations                                       1000
simplex_tors_premin_iterations                               0
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1.0
simplex_trans_step                                           1.0
simplex_rot_step                                             0.1
simplex_tors_step                                            10.0
simplex_random_seed                                          0
simplex_restraint_min                                        no
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex_drive.tbl
ligand_outfile_prefix                                        rigid.out
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no

Once rigid docking is successful, you will get an output file (rigid.out_scored.mol2). Visualize the output file in Chimera using the following steps to check whether rigid docking succeeded.

Open Chimera
File -> Open -> 2nnq_rec_withH.mol2
File -> Open -> 2nnq_lig_withH.mol2
Tools -> Surface/Binding Analysis -> ViewDock -> Select the rigid docking output file. (rigid.out_scored.mol2)
In the dialog box that opens, select Dock 4, 5 or 6

Once everything is loaded, go to the ViewDock window and use its menu to view all the calculated properties of the rigidly docked ligand by following the steps below.

Column -> Show -> gridscore
Column -> Show -> HA_RMSDs
Follow the same steps to get all the properties

Your visualized structure should be similar to the image below.

Rigid docking results for 2nnq
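The properties shown in ViewDock can also be pulled out of the scored mol2 file with a short script. This sketch assumes that the file stores each property on a header line that starts with ########## and has the form "key: value"; check your own rigid.out_scored.mol2 before relying on it. The function name parse_dock_header is hypothetical.

```python
def parse_dock_header(mol2_text):
    """Collect 'key: value' pairs from header lines starting with ##########."""
    properties = {}
    for line in mol2_text.splitlines():
        if not line.startswith("##########"):
            continue
        body = line.lstrip("#").strip()
        if ":" not in body:
            continue
        key, value = body.split(":", 1)
        # Values are kept as strings; convert with float() where needed
        properties[key.strip()] = value.strip()
    return properties


if __name__ == "__main__":
    with open("rigid.out_scored.mol2") as handle:
        for key, value in parse_dock_header(handle.read()).items():
            print(key, value)
```

Run from the directory that holds the docking output, this prints one property per line, which is convenient for comparing several docking runs without opening Chimera.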

2018 DOCK tutorial 1 with PDBID 2NNQ
