2020 DOCK tutorial 2 with PDBID 2GQG

From Rizzo_Lab
Revision as of 15:33, 24 February 2020 by Stonybrook (talk | contribs) (V. Single-Molecule Docking)

I. Introduction


DOCK is a commonly used computational tool that samples a library of small molecules (ligands) and attempts to dock each one into a target site, typically on a rigid protein, in its most energetically favorable position. To accomplish this, DOCK first searches the receptor for the locations that are most favorable for placing a ligand; these serve as the anchor positions. Each ligand is then sampled in a variety of geometric poses to find its most conformationally favorable placement. This software is commonly used in the hit-to-lead stage of drug discovery to narrow the candidates from as many as hundreds of millions of compounds to a few hundred. The drug discovery process then continues from this reduced set to further select and refine the potential drug candidates.


The rigid protein used in this tutorial is the ABL kinase domain, and the ligand is dasatinib. The structure was determined by X-ray crystallography at 2.4 angstrom resolution; lower values (that is, higher resolution) are preferred.


For this part of the experiment, create an initial working directory on your Linux system:

          mkdir 2GQG_Experiment

Then change into this directory:

          cd 2GQG_Experiment

Use the mkdir command again inside this directory to create the subdirectories that will store the files for each stage of the experiment (for example 01.dock_prep, used in the next section).
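A minimal sketch of the directory setup follows. The names 01.dock_prep, 03.grid_box and 04.dock are taken from the paths used later in this tutorial; 02.surface_spheres is an assumed name for the surface and sphere stage.

```shell
# Create one subdirectory per stage of the experiment.
# 01.dock_prep, 03.grid_box and 04.dock appear in later paths of this tutorial;
# 02.surface_spheres is an assumed name for the surface/sphere stage.
mkdir -p 01.dock_prep 02.surface_spheres 03.grid_box 04.dock
```

The -p flag makes the command safe to re-run, since it does not fail when a directory already exists.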


II. Protein and Ligand Preparation

For the first step, open Chimera and use File->Fetch by ID to retrieve the PDB entry 2GQG.

The structure contains two copies of the kinase as separate chains, so delete one of these chains to produce the structure shown in Figure 1.

Check the structure

Read over the information for this protein-ligand complex to determine the charge and environment of the structure and to make sure everything is correct. Make sure the molecules contain nothing that is physically impossible, because such errors would mean the model does not accurately represent the experimentally determined structure.

Receptor Preparation

Open the 2GQG.pdb file and delete all nonstandard residues in the structure, including the solvent molecules, water, and the ligand. To do this in Chimera, use Select->Residue->all nonstandard, then delete the selected atoms.

Two features are specific to this receptor. First, delete the phosphorus atom, because it is not close to the docking site and is difficult to parameterize. Second, swap two residues in the protein structure: replace the PTR residue at position 172 with TYR, and the ARG residue at position 164 with another ARG.

Use the following command to swap a residue. As written, it replaces residue 172 on chain A with TYR. Use the same command, with arg and residue 164, to swap the arginine residue.

          swapaa tyr :172.a

Delete all the hydrogens remaining on the receptor: select them with the Chimera menu Select->Chemistry->element->H, then delete them with Actions->Atoms/Bonds->delete. Save this receptor as 2GQG_receptor_noH.mol2 using File->Save Mol2.

Starting from 2GQG_receptor_noH.mol2, add hydrogens to the receptor with Tools->Structure Editing->AddH. Then add charges with Tools->Structure Editing->Add Charge. Add Charge assigns AMBER force-field charges to standard residues; choose AM1-BCC for any nonstandard residues. Without these charges, the Coulomb (electrostatic) interactions could not be calculated by the docking software.

Save the receptor as 2GQG_receptor_wH.mol2.

Ligand Preparation

Open the original PDB file in Chimera again and identify which nonstandard residue is the ligand using the Select->Residue menu. Once the ligand is selected, use Select->Invert (selected models) to select everything except the ligand, and delete that selection.

Add hydrogens and AM1-BCC charges to the ligand structure. Save this structure as 2GQG_ligand_wH.mol2.

Place all of these files in the 01.dock_prep directory.

III. Generating Surface and Spheres

The purpose of this part is to determine the anchor positions on the protein, which are where the ligands will grow from.

DMS File Preparation

Open 2GQG_receptor_noH.mol2 in Chimera; make sure to use the receptor file without hydrogens.

First calculate the molecular surface of the receptor using Actions->Surface->show.

Save this surface as a DMS file using Tools->Structure Editing->Write DMS, and name it 2GQG_receptor_noH.dms.

Generating Spheres

Change into the 02.surface_spheres directory:

      cd 02.surface_spheres

Create a new file titled INSPH with vim and enter the sphgen parameters, one per line:

      vim INSPH

This file tells sphgen how to generate all the spheres that can fit against the surface of the protein.
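A minimal sketch of INSPH, written here with a shell heredoc instead of vim and laid out to match the line-by-line description that follows. It assumes the DMS file is in the current directory, and it names the output 2GQG_receptor_noH.sph because that is the file passed to sphere_selector later; adjust both if your layout differs.

```shell
# Write the sphgen input file, one parameter per line.
# Assumption: 2GQG_receptor_noH.dms is in the current directory.
cat > INSPH << 'EOF'
2GQG_receptor_noH.dms
R
X
0.0
4.0
1.4
2GQG_receptor_noH.sph
EOF

# Print the file back so each parameter can be checked by eye.
cat INSPH
```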


The first line specifies the surface file of the receptor. The R tells the program to generate external spheres, not internal ones; the X represents all points on the surface; 0.0 specifies the minimum distance between spheres and the surface - 0.0 avoids steric clashes; 4.0 and 1.4 are the maximum and minimum sphere radii. Once the INSPH file is ready, generate the spheres themselves using the following command:

sphgen -i INSPH -o OUTSPH

The sphgen program will calculate the spheres and write them to the file specified in INSPH, and print a description of the program running to OUTSPH.

Selecting Spheres

This step selects, from the sphere file generated above, only the spheres within 10.0 angstroms of the ligand. The purpose is to narrow the docking search to the binding pocket occupied by the ligand. Without this selection, DOCK would attempt to dock the ligand into unfavorable binding sites, both wasting computational resources and producing inaccurate results.

sphere_selector 2GQG_receptor_noH.sph ../01.dock_prep/2GQG_ligand_wH.mol2 10.0

The selected spheres will be written to the file selected_spheres.sph
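A quick sanity check on the selection is to read the sphere count from the file header. The sketch below assumes the header contains a line of the form "cluster 1 number of spheres in cluster N"; count_spheres is a hypothetical helper, not part of DOCK.

```shell
# count_spheres FILE: print the sphere count reported in the cluster header.
# Assumption: the header has a line like
#   "cluster     1   number of spheres in cluster    45"
count_spheres() {
    awk '/number of spheres in cluster/ { print $NF; exit }' "$1"
}

# Report the count only if the selection has already been run here.
if [ -f selected_spheres.sph ]; then
    echo "selected spheres: $(count_spheres selected_spheres.sph)"
fi
```

A count of zero, or no output at all, suggests that the ligand path or the cutoff passed to sphere_selector should be rechecked.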

IV. Box and Energy Grid

When docking small molecules to proteins, DOCK software uses a pre-generated energy grid to calculate the energy score between any given ligand pose and the static, rigid receptor. DOCK also often ignores any part of the receptor that is too far from the ligand binding site, to avoid excessive long-distance contacts and focus on local contacts near the binding site. This cutoff is created in the form of a box around the selected spheres.

Generating the Box

First, move to the appropriate directory for grid and box preparation:

cd ../03.grid_box

Similar to sphgen above, the showbox program uses an input file with parameters. Create this file:

vim showbox.in

and fill it with these lines:
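A minimal sketch of the file contents, written here with a shell heredoc instead of vim. The sphere path assumes the selected spheres were written in ../02.surface_spheres, and the fourth line (cluster number 1) is an assumption based on the usual order of showbox prompts; the description that follows covers the other four lines.

```shell
# Write the showbox input file, one answer per line.
# Assumptions: selected_spheres.sph lives in ../02.surface_spheres,
# and the fourth line selects cluster number 1.
cat > showbox.in << 'EOF'
Y
8.0
../02.surface_spheres/selected_spheres.sph
1
2GQG.box.pdb
EOF

# Print the file back so each answer can be checked by eye.
cat showbox.in
```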


The Y tells the program to generate a box; 8.0 is how far out (in angstroms) the box edges should be from the spheres; the third line is the path to the selected spheres file; the last line is the path for the box output as a PDB file. Run the showbox command with showbox.in as an input file:

showbox < showbox.in

This will generate the file 2GQG.box.pdb.

Grid Calculation

Next, we are going to use the program 'grid' to generate a grid. This program needs an input file. This file can either be empty or pre-filled with parameters; if using an empty file, the program will prompt the user for the value of each parameter.

Create the input file:

touch grid.in

Then run the program, with this file after the input flag:

grid -i grid.in

Answer the prompts with the following parameters, or fill the grid.in file in advance using vim:

compute_grids                             yes
grid_spacing                              0.4
output_molecule                           no
contact_score                             no
energy_score                              yes
energy_cutoff_distance                    9999
atom_model                                a
attractive_exponent                       6
repulsive_exponent                        12
distance_dielectric                       yes
dielectric_factor                         4
bump_filter                               yes
bump_overlap                              0.75
receptor_file                             ../01.dock_prep/2GQG_receptor_wH.mol2
box_file                                  2GQG.box.pdb
vdw_definition_file                       /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
score_grid_prefix                         grid

The parameters generally have descriptive names. Essentially, this generates an energy grid with a potential evaluated at points spaced 0.4 angstroms apart. No contact score is computed; the energy score uses an all-atom model with a 6-12 Lennard-Jones potential and a distance-dependent dielectric with a factor of 4. The last three lines specify the paths to the box defining the boundaries of the grid, the force field parameters, and the prefix for the output files.

The calculation is non-trivial and takes a few minutes to run. When it finishes successfully, the program will have created grid.bmp and grid.nrg, along with the log gridinfo.out (written when the run is started with -o gridinfo.out; otherwise the same information is printed to the terminal). Check the log to make sure the program was successful; make sure the overall charge is an integer and matches the charge of the original receptor file.
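The check described above can be sketched as a small shell helper. check_grid is a hypothetical name, and the grep assumes that the log was saved as gridinfo.out and reports the charge on lines containing the word "charge".

```shell
# check_grid PREFIX: succeed only if PREFIX.nrg and PREFIX.bmp exist and are non-empty.
check_grid() {
    for ext in nrg bmp; do
        if [ ! -s "$1.$ext" ]; then
            echo "missing or empty: $1.$ext" >&2
            return 1
        fi
    done
    echo "grid files for prefix '$1' are present"
}

# Show any charge lines from the log (assumed file name and wording).
if [ -f gridinfo.out ]; then
    grep -i "charge" gridinfo.out || true
fi

# "grid" is the score_grid_prefix used in grid.in.
check_grid grid || echo "re-run grid before docking"
```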

V. Single-Molecule Docking

To test the grid and other files prepared, the ligand from the crystal structure will be docked back into the receptor for pose reproduction.

Energy Minimization

First, energy minimization of the ligand is necessary, to make sure the ligand file used in docking is appropriate for the force field and will behave well. Move to the docking directory:

cd ../04.dock

From here, dock6 is the main program we are using. Similar to grid, this program can accept an input file full of parameters or an empty file filled based on answers to prompts on the command line. Create an input file for the minimization parameters:

touch min.in

Then run the DOCK program:

dock6 -i min.in

Answer the prompts or pre-fill the input file with the following parameters:

conformer_search_type                                        rigid                                          
use_internal_energy                                          yes                                            
internal_energy_rep_exp                                      12                                             
internal_energy_cutoff                                       100.0                                          
ligand_atom_file                                             ../01.dock_prep/2GQG_ligand_wH.mol2
limit_max_ligands                                            no                                             
skip_molecule                                                no                                             
read_mol_solvation                                           no                                             
calculate_rmsd                                               yes                                            
use_rmsd_reference_mol                                       yes                                            
rmsd_reference_filename                                      ../01.dock_prep/2GQG_ligand_wH.mol2
use_database_filter                                          no                                             
orient_ligand                                                no                                             
bump_filter                                                  no                                             
score_molecules                                              yes                                            
contact_score_primary                                        no                                             
contact_score_secondary                                      no                                             
grid_score_primary                                           yes                                            
grid_score_secondary                                         no                                             
grid_score_rep_rad_scale                                     1                                              
grid_score_vdw_scale                                         1                                              
grid_score_es_scale                                          1                                              
grid_score_grid_prefix                                       ../03.grid_box/grid                                         
multigrid_score_secondary                                    no                                             
dock3.5_score_secondary                                      no                                             
continuous_score_secondary                                   no                                             
footprint_similarity_score_secondary                         no                                             
pharmacophore_score_secondary                                no                                             
descriptor_score_secondary                                   no                                             
gbsa_zou_score_secondary                                     no                                             
gbsa_hawkins_score_secondary                                 no                                             
SASA_score_secondary                                         no                                             
amber_score_secondary                                        no                                             
minimize_ligand                                              yes                                            
simplex_max_iterations                                       1000                                           
simplex_tors_premin_iterations                               0                                              
simplex_max_cycles                                           1                                              
simplex_score_converge                                       0.1                                            
simplex_cycle_converge                                       1.0                                            
simplex_trans_step                                           1.0                                            
simplex_rot_step                                             0.1                                            
simplex_tors_step                                            10.0                                           
simplex_random_seed                                          0                                              
simplex_restraint_min                                        yes                                            
simplex_coefficient_restraint                                10.0                                           
atom_model                                                   all                                            
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
ligand_outfile_prefix                                        2GQG_ligand_min                                   
write_orientations                                           no                                             
num_scored_conformers                                        1                                              
rank_ligands                                                 no

This program will calculate the energy-minimized ligand by relaxing internal degrees of freedom within the energy grid, and generate the new file 2GQG_ligand_min_scored.mol2. The RMSD between the original and new ligand will also be reported; it should be less than 2.0 angstroms. Visualize the differences in Chimera.
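The RMSD threshold can also be checked from the command line. The sketch below assumes the scored mol2 file carries a header comment such as "##########   HA_RMSDs:   0.42"; rmsd_ok is a hypothetical helper, and the label should be adjusted if your output uses a different one.

```shell
# rmsd_ok FILE [LIMIT]: succeed if the first reported RMSD is below LIMIT (default 2.0).
# Assumption: the header contains a line like "##########   HA_RMSDs:   0.42".
# Exit status: 0 = below limit, 1 = at or above limit, 2 = no RMSD line found.
rmsd_ok() {
    awk -v limit="${2:-2.0}" '
        /HA_RMSDs:/ { found = 1; value = $NF; exit }
        END {
            if (!found) exit 2
            exit !(value + 0 < limit + 0)
        }
    ' "$1"
}

# Run the check only if the minimization output exists in this directory.
if [ -f 2GQG_ligand_min_scored.mol2 ]; then
    if rmsd_ok 2GQG_ligand_min_scored.mol2 2.0; then
        echo "RMSD is below 2.0 angstroms"
    else
        echo "RMSD is 2.0 angstroms or more, or no RMSD line was found"
    fi
fi
```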