Difference between revisions of "2015 DOCK tutorial with Poly(ADP-ribose) polymerase (PARP)"

 
# Non-rigid segments added in layers; energy minimized.
 
 
# The resulting configurations are 'pruned' and energy re-minimized, yielding the docked configurations.
 
[[Image:2014_1HVR_receptor_surface.png|thumb|center|375px|1HVR Receptor surface]]
  
 
===Poly ADP Ribose Polymerase (PARP)===
 
 
In addition, most of the important files that are derived from the original crystal structure will be given a prefix that is the same as the PDB code, '4TKG'. The following sections in this tutorial will adhere to this directory structure/naming scheme.
 
  
===Generating Receptor(4TKG) Surface and Spheres===
 
  
The spheres generated in this section guide the placement of the ligand and, as a result, reduce the search space in the binding site.
==II. Preparing the Receptor and Ligand==

Go to the Protein Data Bank website (pdb.org) and search for 4TKG. This is the code for the crystal structure of the PARP protein in complex with olaparib. Download the PDB (text) file for this protein. Then go to your 00.files directory and copy the file there using the command below.

  cp ~/Downloads/4TKG.pdb ./

We will then create four files in the 01.dockprep/ directory:
  4TKG.dockprep.mol2
  4TKG.lig.mol2
  4TKG.rec.mol2
  4TKG.rec.noH.pdb

''Create the dockprep file''

To create a temp.pdb file, first open 4TKG.pdb in Chimera. You will notice that there are four copies of this protein-ligand complex in the original crystal structure. Since we only want to work with one of these, select chain A (Select -> Chain -> A). Once chain A is selected, invert the selection by going to the 'Select' tab -> Invert (all models). Next, delete the other chains by going to the 'Actions' tab -> Atoms/Bonds -> delete. Save this file as temp.pdb in your 00.files directory.

For the "4TKG.dockprep.mol2" file: open temp.pdb in Chimera and delete the water molecules. To prepare the file for docking, follow the workflow "Tools -> Surface/Binding Analysis -> Dock Prep". Use all of the default settings and select the charge model AMBER ff14SB. Save this file as 4TKG.dockprep.mol2 in your 01.dockprep directory.

Next, we will make the ligand file. Open your 4TKG.dockprep.mol2 file in Chimera. Select the ligand and invert the selection as before. You can then delete everything except the ligand. Save this as 4TKG.lig.mol2 in your 01.dockprep directory.

Once again, open the 4TKG.dockprep.mol2 file in Chimera. Select the ligand and delete it. Save this as 4TKG.rec.mol2. From here, select all the hydrogens and delete them. Save this as 4TKG.rec.noH.pdb.

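If you would rather script the chain selection than click through the Chimera menus, the idea can be sketched as follows. This is only a minimal sketch, assuming a standard fixed-column PDB file in which the chain identifier sits in column 22; the helper names <code>keep_chain</code> and <code>fake_record</code> are hypothetical, and the record used here is made up rather than taken from 4TKG. Chimera remains the route this tutorial follows.

```python
def fake_record(chain_id, serial=1):
    """Build a minimal fixed-column ATOM record (made-up atom, for illustration only)."""
    return "{:<6}{:>5} {:<4}{:1}{:>3} {:1}{:>4}".format(
        "ATOM", serial, " N", "", "ALA", chain_id, 1)


def keep_chain(pdb_lines, chain_id="A"):
    """Return the ATOM/HETATM records of one chain (chain ID is column 22, index 21)."""
    kept = []
    for line in pdb_lines:
        is_atom = line.startswith(("ATOM", "HETATM"))
        if is_atom and len(line) > 21 and line[21] == chain_id:
            kept.append(line)
    return kept


# One record per chain; only the chain A record should survive.
records = [fake_record("A", 1), fake_record("B", 2)]
chain_a = keep_chain(records)
print(len(chain_a))
```

Header and connectivity records are dropped by this filter, so the result is not a complete replacement for the temp.pdb file saved from Chimera.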
==III. Generating Receptor Surface and Spheres==

=== Generating the Receptor Surface ===
The following steps will be done in the 02.surface-spheres directory:
  cd 02.surface-spheres

The receptor surface is generated with Chimera. Launch it from the terminal command line:
  Chimera
* Choose File --> Open --> 4TKG.rec.noH.pdb (the PDB file of 4TKG from which the hydrogens were removed, saved in 01.dockprep).
* Choose Actions --> Surface --> show (the surface should now be displayed).

Then choose Tools --> Structure Editing --> Write DMS, and save the file as 4TKG.rec.dms

=== Placing Spheres ===

'''Sphgen''' is a program that generates sets of overlapping spheres that describe the shape of a molecule or molecular surface. Spheres are generated over the entire receptor and ligand surface.
For further information on how '''Sphgen''' functions, please refer to the latest version of the DOCK manual:

''<http://dock.compbio.ucsf.edu/DOCK_6/dock6_manual.htm>''

To generate spheres using Sphgen, follow the steps below:

'''Step 1.''' Create an input file named '''INSPH''' with the following information:

 vim INSPH

 4TKG.rec.dms #surface file generated above, used as the input file
 R            #flag to place spheres outside (R) or inside (L) of the surface
 X            #flag that tells sphgen which subset of surface points to use (X = all points)
 0.0          #flag that prevents the generation of large spheres with close surface contacts (default = 0.0)
 4.0          #maximum radius of the generated spheres (default = 4.0 Angstroms)
 1.4          #minimum radius of the generated spheres (default = radius of probe)
 4TKG.rec.sph #output file that will contain the clustered spheres

'''Step 2.''' Run the program Sphgen using the command sphgen:

 sphgen -i INSPH -o OUTSPH

* '''-i''' is the flag that gives sphgen the input file '''INSPH'''
* '''INSPH''' is the file created above that gives sphgen its instructions
* '''-o''' is the flag that names the output file
* '''OUTSPH''' is the output file with information about the spheres generated by sphgen

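Instead of typing the file in vim, the INSPH file from Step 1 could be written by a short script. The sketch below writes the seven values one per line, in the order listed above, and leaves out the "#" annotations; leaving them out is a cautious assumption, since this tutorial does not state whether sphgen tolerates trailing comments. The function name <code>write_insph</code> and the use of a temporary directory are illustrative only.

```python
import tempfile
from pathlib import Path

# Values in the order listed in Step 1; the annotations are kept as Python comments only.
INSPH_VALUES = [
    "4TKG.rec.dms",  # input surface file
    "R",             # place spheres outside the surface
    "X",             # use all surface points
    "0.0",           # close-contact flag
    "4.0",           # maximum sphere radius (Angstroms)
    "1.4",           # minimum sphere radius (Angstroms)
    "4TKG.rec.sph",  # output file for the clustered spheres
]


def write_insph(path):
    """Write one bare value per line and return the path that was written."""
    path = Path(path)
    path.write_text("\n".join(INSPH_VALUES) + "\n")
    return path


# Written to a scratch directory here; in practice you would use 02.surface-spheres/INSPH.
insph = write_insph(Path(tempfile.mkdtemp()) / "INSPH")
print(insph.read_text())
```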
'''Step 3.''' Visualization of the spheres generated:

Visualization of the spheres can be done directly with '''Chimera''' or with the program '''showsphere'''.

'''3 a.''' Visualization directly with '''Chimera''':

* Launch '''Chimera''', choose ''File'' -> ''Open'', choose '''4TKG.rec.mol2'''
* Choose ''File'' -> ''Open'', choose '''4TKG.rec.sph'''

You should have an image like this:
[[Image:4tkg.rec_allsph.png|thumb|center|375px|4TKG Receptor surface (light gray) with all spheres (various colors) generated]]

'''3 b.''' Visualization with '''showsphere''':

'''showsphere''' converts the '''.sph''' file into PDB format.

(i) Run '''showsphere''' by typing showsphere into the terminal:

 showsphere

You will be prompted with the following questions:

 Enter name of sphere cluster file:
       '''4TKG.rec.sph'''
 Enter cluster number to process (<0 = all):
       '''-1'''
 Generate surfaces as well as pdb files (<N>/Y)?
       '''N'''
 Enter name for output file prefix:
       '''output_spheres'''
 Process cluster 0 (contains ALL spheres) (<N>/Y)?
       '''N'''

'''-1''' is a flag that allows you to see all possible spheres.

(ii) Open '''Chimera'''
* Launch '''Chimera''', choose ''File'' -> ''Open'', choose '''4TKG.rec.noH.pdb'''
* Choose ''File'' -> ''Open'', choose '''output_spheres.pdb'''
You should see many spheres placed all over the receptor surface.

[[Image:4tkg.rec_allsph_selectsphere.png|thumb|center|375px|4TKG Receptor surface (light gray) with all spheres (various colors) using the program showsphere for visualization. The yellow spheres are located in the binding site of the receptor]]

'''Step 4.''' Selecting spheres of interest:

To select spheres of interest, run a program named '''sphere_selector''' in the terminal. The program selects the spheres that are within a user-defined radius (in this case, 8.0 angstroms) of a target molecule or a known binding site:

 sphere_selector 4TKG.rec.sph ../01.dockprep/4TKG.lig.mol2 8.0

A new file named '''selected_spheres.sph''' will be generated.

'''Step 5.''' Visualize the spheres using '''showsphere''' as previously done:

 showsphere

When prompted on the command line, answer the questions as follows:

 Enter name of sphere cluster file:
       '''selected_spheres.sph'''
 Enter cluster number to process (<0 = all):
       '''-1'''
 Generate surfaces as well as pdb files (<N>/Y)?
       '''N'''
 Enter name for output file prefix:
       '''output_spheres_selected'''
 Process cluster 0 (contains ALL spheres) (<N>/Y)?
       '''N'''

View spheres in '''Chimera''':

* Launch '''Chimera''', choose ''File'' -> ''Open'', choose '''4TKG.rec.noH.pdb'''
* Choose ''File'' -> ''Open'', choose '''output_spheres_selected.pdb'''
* Choose ''Select'' -> ''Residue'' -> ''SPH''
* Choose ''Actions'' -> ''Atoms/Bonds'' -> ''sphere''

[[Image:4tkg.rec_selectedsph.png|thumb|center|375px|4TKG Receptor surface (light gray) with spheres (blue) within 8 angstroms of the ligand]]

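The selection idea from Step 4 can be illustrated with a few lines of Python. This is a sketch of the distance criterion only, not the actual sphere_selector code: it assumes that a sphere is kept when its center lies within the cutoff of at least one ligand atom, and that the comparison is inclusive. The function name <code>select_spheres</code> and all coordinates are made up for illustration.

```python
import math


def select_spheres(sphere_centers, ligand_atoms, cutoff=8.0):
    """Keep every sphere whose center is within `cutoff` Angstroms of any ligand atom."""
    selected = []
    for center in sphere_centers:
        if any(math.dist(center, atom) <= cutoff for atom in ligand_atoms):
            selected.append(center)
    return selected


# Made-up coordinates: two ligand atoms and three candidate sphere centers.
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
spheres = [(3.0, 4.0, 0.0), (9.0, 0.0, 0.0), (20.0, 0.0, 0.0)]
near = select_spheres(spheres, ligand)
print(near)
```

The second sphere is more than 8.0 angstroms from the first ligand atom but only 7.5 angstroms from the second, so it is kept; the third is far from both and is dropped.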
==IV. Generating Box and Grid==

=== Box Generation===

Before the grid can be calculated and used for grid-based scoring in DOCK, a box must be generated. The interactive program showbox is used to define and visualize the location and size of the region over which grid will calculate the grid. The output of showbox is in PDB format and can be visualized with any program capable of displaying PDB files, such as Chimera. The box is generated step by step as follows:

* Go to the directory: '''03.box-grid/'''
    cd ../03.box-grid

* Make a new file in this directory and name it '''showbox.in'''
    vim showbox.in

* This will automatically open the file '''showbox.in'''. Edit the file '''showbox.in''' as follows:

    Y                                              # Yes, generate a box
    8.0                                            # Extra margin (in Angstroms) added around the spheres
    ../02.surface-spheres/selected_spheres.sph      # Sphere.sph file
    1                                              # Cluster number
    4TKG.box.pdb                                    # Name of the output file

* Save the file using the command:
    :wq

* Run the command:
    showbox < showbox.in

View box in '''Chimera''':

* Launch '''Chimera''', choose ''File'' -> ''Open'', choose '''output_spheres_selected.pdb''' (in 02.surface-spheres)
* Choose ''File'' -> ''Open'', choose '''4TKG.box.pdb'''

[[Image:box.png|thumb|center|375px|Selected sphere surface with generated box]]

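To get a feel for how the 8.0 value in showbox.in affects the box, the geometry can be sketched in Python. This sketch assumes that the value is a margin added on every side of the selected spheres and, for simplicity, uses only the sphere centers; whether showbox also accounts for the sphere radii is not checked here. The function name <code>enclosing_box</code> and the coordinates are hypothetical.

```python
def enclosing_box(sphere_centers, margin=8.0):
    """Axis-aligned box around the sphere centers, grown by `margin` on every side."""
    xs, ys, zs = zip(*sphere_centers)
    lower = (min(xs) - margin, min(ys) - margin, min(zs) - margin)
    upper = (max(xs) + margin, max(ys) + margin, max(zs) + margin)
    size = tuple(u - l for l, u in zip(lower, upper))
    return lower, upper, size


# Two made-up sphere centers; a real run would read them from selected_spheres.sph.
centers = [(0.0, 0.0, 0.0), (4.0, 2.0, 1.0)]
lower, upper, size = enclosing_box(centers)
print(lower, upper, size)
```

Under these assumptions each box edge is the extent of the sphere centers along that axis plus twice the margin, so a larger margin increases the grid volume, and with it the grid calculation time, quickly.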
=== Grid computing===
To save computational resources and speed up docking, we let DOCK pre-calculate the potential energy over the docking region defined in the previous section before we perform the docking calculation.

The grid program offers two ways to evaluate the potential energy in the docking region: contact scoring and energy scoring. You can enable either method independently by answering "yes" or "no" in the corresponding entry of the input file grid.in. Once the grid calculation has finished, the results are saved in files with the corresponding extensions: *.cnt and *.nrg. Another important option in the grid program is the bump grid, which flags positions where an atom would overlap too closely with the receptor atoms. It is switched on and off in the same way as contact and energy scoring.

In this tutorial, we use only the energy scoring option to evaluate the potential energy in the docking region. Mathematically, grid uses the empirical Lennard-Jones model and the Coulomb electrostatic interaction function to approximate the potential energy at each grid point. For more details about the theoretical basis of the grid calculation, please refer to: [[Energy Scoring Method in Grid]]

[[Image:Energy scoring general.png|thumb|center|650px|Energy scoring function]]

The form of the Coulomb term is fixed. However, you can specify the exponents of the van der Waals term (a and b) by setting the attractive_exponent and repulsive_exponent variables. The other coefficients of the Lennard-Jones model are taken from the vdw_AMBER_parm99.defn file.

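As a rough illustration of the energy scoring function, the contribution of a single atom pair can be written out in Python. This is a sketch under stated assumptions: it uses the commonly quoted form with a repulsive term A/r^n, an attractive term B/r^m, and a Coulomb term scaled by 332 to give kcal/mol, with the dielectric taken as dielectric_factor times r when distance_dielectric is "yes". The function name <code>pair_energy</code> and the A, B, and charge values are made up; real coefficients come from vdw_AMBER_parm99.defn.

```python
COULOMB_CONSTANT = 332.0  # assumed conversion factor to kcal/mol


def pair_energy(r, A, B, q_i, q_j, attractive_exponent=6, repulsive_exponent=9,
                dielectric_factor=4.0, distance_dielectric=True):
    """Energy of one atom pair at distance r (Angstroms): van der Waals plus Coulomb."""
    vdw = A / r ** repulsive_exponent - B / r ** attractive_exponent
    # With a distance-dependent dielectric, D grows linearly with r.
    dielectric = dielectric_factor * r if distance_dielectric else dielectric_factor
    electrostatic = COULOMB_CONSTANT * q_i * q_j / (dielectric * r)
    return vdw + electrostatic


# Made-up coefficients chosen so the two van der Waals terms cancel at r = 2.0.
energy = pair_energy(r=2.0, A=512.0, B=64.0, q_i=0.5, q_j=-0.5)
print(energy)
```

With the 6-9 exponents used in this tutorial the repulsive wall is softer than with the more familiar 6-12 form, which makes the score more forgiving of small clashes.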
 
In practice, you have two ways to calculate the grid.

The clearer and more efficient way:

* Create the grid input file:
  vi grid.in

* Set all variables in grid.in and run the program:
  grid -i grid.in

A more interactive and user-friendly way:

* Run the grid program:
  grid
Then follow the program's instructions and set all variables by answering each question. If this is your first time running DOCK, we strongly recommend the second method.
For this tutorial, we summarize all the parameters that will be needed and give a brief description of each.
{| border="1" cellpadding="5" cellspacing="0" align="center"
|+ '''grid.in'''
|-
! style="background: #efefef;" | Parameter
! style="background: #efefef;" | Value
! style="background: #efefef;" | Description
|-
| compute_grids
| yes
| compute scoring grids (yes)
|-
| grid_spacing
| 0.4
| distance between grid points along each axis (in &Aring;)
|-
| output_molecule
| no
| write the coordinates of the receptor to a new file
|-
| contact_score
| no
| compute the contact grid? (default is no)
|-
| energy_score
| yes
| compute the energy grid? Yes; we use this method to compute the force field on probe atoms
|-
| energy_cutoff_distance
| 9999
| maximum distance between atoms for their energy contribution to be computed
|-
| atom_model
| a
| u selects the united-atom model, in which hydrogens attached to carbons are folded into the carbon atoms; a selects the all-atom model, in which hydrogens attached to carbons are treated separately
|-
| attractive_exponent
| 6
| exponent of the attractive Lennard-Jones term in the van der Waals potential
|-
| repulsive_exponent
| 9
| exponent of the repulsive Lennard-Jones term in the van der Waals potential
|-
| distance_dielectric
| yes
| make the dielectric constant linearly dependent on distance
|-
| dielectric_factor
| 4
| coefficient of the distance-dependent dielectric
|-
| bump_filter
| yes
| screen each orientation for clashes before scoring and minimization
|-
| bump_overlap
| 0.75
| fraction of allowed van der Waals overlap with the receptor atoms, where 1 corresponds to no allowed overlap and 0 corresponds to full overlap being permitted
|-
| receptor_file
| ../01.dockprep/4TKG.rec.mol2
| our receptor file
|-
| box_file
| 4TKG.box.pdb
| the box file we generated in the Box section
|-
| vdw_definition_file
| ../zzz.parameters/vdw_AMBER_parm99.defn
| van der Waals parameters file
|-
| style="border-bottom: 3px solid grey;" | score_grid_prefix
| style="border-bottom: 3px solid grey;" | grid
| style="border-bottom: 3px solid grey;" | prefix for the grid file names; all the extensions will be generated automatically
|}

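The table above maps directly onto a grid.in file with one "parameter value" pair per line. The following sketch renders that file from Python; the function name <code>render_grid_in</code> and the column padding are illustrative choices, and receptor_file is set to 4TKG.rec.mol2, the name under which the receptor was saved in 01.dockprep. Treat it as one possible way to keep the settings in one place, not as the required workflow.

```python
# Parameters and values as listed in the grid.in table, in the same order.
GRID_PARAMETERS = [
    ("compute_grids", "yes"),
    ("grid_spacing", "0.4"),
    ("output_molecule", "no"),
    ("contact_score", "no"),
    ("energy_score", "yes"),
    ("energy_cutoff_distance", "9999"),
    ("atom_model", "a"),
    ("attractive_exponent", "6"),
    ("repulsive_exponent", "9"),
    ("distance_dielectric", "yes"),
    ("dielectric_factor", "4"),
    ("bump_filter", "yes"),
    ("bump_overlap", "0.75"),
    ("receptor_file", "../01.dockprep/4TKG.rec.mol2"),
    ("box_file", "4TKG.box.pdb"),
    ("vdw_definition_file", "../zzz.parameters/vdw_AMBER_parm99.defn"),
    ("score_grid_prefix", "grid"),
]


def render_grid_in(parameters=GRID_PARAMETERS):
    """Return the text of grid.in: one padded 'name value' pair per line."""
    width = max(len(name) for name, _ in parameters) + 4
    return "".join(f"{name:<{width}}{value}\n" for name, value in parameters)


grid_in_text = render_grid_in()
print(grid_in_text)
```

Saving the printed text as grid.in in 03.box-grid and running grid -i grid.in corresponds to the first of the two ways described above.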
For more details, please refer to:

http://dock.compbio.ucsf.edu/DOCK_6/tutorials/grid_generation/generating_grid.html

http://dock.compbio.ucsf.edu/DOCK_6/dock6_manual.htm#GridOverview

==V. Docking a Single Molecule for Pose Reproduction==

'''Minimizations'''

You can use the DOCK software to perform an energy minimization. For example, say you wish to minimize a ligand in a receptor's binding site. One way to do this is to first generate the grid for the receptor alone, without the ligand. Then, you minimize the ligand in this grid.

To accomplish this, first generate the grid for the receptor (if you haven't already done so). Next, create a text file by using the following command:

 touch min.in

This command creates a text file named min.in but leaves it empty. Once you have done this, run the following command:

 dock6 -i min.in

The dock program will start and ask you a series of questions. Answer the questions according to the outline below.

Note: as you answer these questions, the answers will be saved to the min.in file so that next time you don't have to type the answers in by hand.

 ligand_atom_file                                             ../01.dockprep/4TKG.lig.mol2
 limit_max_ligands                                            no
 skip_molecule                                                no
 read_mol_solvation                                           no
 calculate_rmsd                                               yes
 use_rmsd_reference_mol                                       no
 use_database_filter                                          no
 orient_ligand                                                no
 use_internal_energy                                          yes
 internal_energy_rep_exp                                      12
 flexible_ligand                                              no
 bump_filter                                                  no
 score_molecules                                              yes
 contact_score_primary                                        no
 contact_score_secondary                                      no
 grid_score_primary                                           yes
 grid_score_secondary                                         no
 grid_score_rep_rad_scale                                     1
 grid_score_vdw_scale                                         1
 grid_score_es_scale                                          1
 grid_score_grid_prefix                                       ../03.box-grid/grid
 multigrid_score_secondary                                    no
 dock3.5_score_secondary                                      no
 continuous_score_secondary                                   no
 descriptor_score_secondary                                   no
 gbsa_zou_score_secondary                                     no
 gbsa_hawkins_score_secondary                                 no
 SASA_descriptor_score_secondary                              no
 amber_score_secondary                                        no
 minimize_ligand                                              yes
 simplex_max_iterations                                       1000
 simplex_tors_premin_iterations                               0
 simplex_max_cycles                                           1
 simplex_score_converge                                       0.1
 simplex_cycle_converge                                       1.0
 simplex_trans_step                                           1.0
 simplex_rot_step                                             0.1
 simplex_tors_step                                            10.0
 simplex_random_seed                                          0
 simplex_restraint_min                                        yes
 simplex_coefficient_restraint                                10.0
 atom_model                                                   all
 vdw_defn_file                                                ../zzz.parameters/vdw_AMBER_parm99.defn
 flex_defn_file                                               ../zzz.parameters/flex.defn
 flex_drive_file                                              ../zzz.parameters/flex_drive.tbl
 ligand_outfile_prefix                                        4TKG.min
 write_orientations                                           no
 num_scored_conformers                                        1
 rank_ligands                                                 no

Once you have answered all the questions, dock will perform the minimization and create an output file named 4TKG.min_scored.mol2

You can view your minimized ligand by opening it in Chimera. To do this, open Chimera and then open the receptor mol2 file. Then open the output mol2 file that you got from the minimization (4TKG.min_scored.mol2).

A few points about the min.in input file:

Each line in the input file tells dock what to do next. For example, the following lines serve the following purposes:

* ligand_atom_file specifies the path to the ligand so that dock can retrieve it.
* minimize_ligand tells dock to perform an energy minimization.
* grid_score_grid_prefix specifies the path to the grid so that dock can retrieve and use it for the minimization.
* ligand_outfile_prefix tells dock what prefix to use for the output file.

Descriptions of the other lines can be found in the dock user manual.

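Because every line of the input file is a parameter name followed by its value, the file is easy to inspect from a script. The sketch below reads a few of the min.in lines shown above into a dictionary; the function name <code>parse_dock_input</code> is hypothetical, and the assumption that the first whitespace-separated token is the name and the remainder is the value is taken from the layout of the listing, not from the DOCK source.

```python
def parse_dock_input(text):
    """Map each parameter name to its value (first token is the name, the rest the value)."""
    settings = {}
    for line in text.splitlines():
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings[parts[0]] = parts[1].strip()
    return settings


# A few lines copied from the min.in listing.
sample = """\
ligand_atom_file        ../01.dockprep/4TKG.lig.mol2
minimize_ligand         yes
grid_score_grid_prefix  ../03.box-grid/grid
ligand_outfile_prefix   4TKG.min
"""
settings = parse_dock_input(sample)
scored_file = settings["ligand_outfile_prefix"] + "_scored.mol2"
print(scored_file)
```

A check like this can catch a mistyped path or prefix before a long docking run is started.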
'''Rigid docking'''

Rigid docking docks a ligand to its receptor without changing the ligand's internal conformation; the ligand is treated as an inflexible, rigid molecule. This is much faster than flexible docking and is used in cases where the ligand molecule has too many rotatable bonds. To run rigid docking with DOCK 6, you must first create an input file:

 touch rgd.in

Now, run:

 dock6 -i rgd.in

You will now be asked a series of questions about the location of the input files, the docking parameters, and the output file prefix.

Answer all the questions as follows:

 ligand_atom_file ../01.dockprep/4TKG.lig.mol2
 limit_max_ligands no
 skip_molecule no
 read_mol_solvation no
 calculate_rmsd yes
 use_rmsd_reference_mol no
 use_database_filter no
 orient_ligand yes
 automated_matching yes
 receptor_site_file ../02.surface-spheres/selected_spheres.sph
 max_orientations 1000
 critical_points no
 chemical_matching no
 use_ligand_spheres no
 use_internal_energy yes
 internal_energy_rep_exp 12
 flexible_ligand no
 bump_filter no
 score_molecules yes
 contact_score_primary no
 contact_score_secondary no
 grid_score_primary yes
 grid_score_secondary no
 grid_score_rep_rad_scale 1
 grid_score_vdw_scale 1
 grid_score_es_scale 1
 grid_score_grid_prefix ../03.box-grid/grid
 multigrid_score_secondary no
 dock3.5_score_secondary no
 continuous_score_secondary no
 descriptor_score_secondary no
 gbsa_zou_score_secondary no
 gbsa_hawkins_score_secondary no
 SASA_descriptor_score_secondary no
 amber_score_secondary no
 minimize_ligand yes
 simplex_max_iterations 1000
 simplex_tors_premin_iterations 0
 simplex_max_cycles 1
 simplex_score_converge 0.1
 simplex_cycle_converge 1.0
 simplex_trans_step 1.0
 simplex_rot_step 0.1
 simplex_tors_step 10.0
 simplex_random_seed 0
 simplex_restraint_min no
 atom_model all
 vdw_defn_file ../zzz.parameters/vdw_AMBER_parm99.defn
 flex_defn_file ../zzz.parameters/flex.defn
 flex_drive_file ../zzz.parameters/flex_drive.tbl
 ligand_outfile_prefix 4TKG.rgd
 write_orientations no
 num_scored_conformers 5000
 write_conformations no
 cluster_conformations yes
 cluster_rmsd_threshold 2.0
 rank_ligands no

Once the program has run, an output file will be generated in the specified folder; it can be viewed using Chimera.
Open the receptor mol2 file (4TKG.rec.mol2) in Chimera and show its surface representation.
Now go to:

 Tools > Surface/Binding Analysis > ViewDock > 4TKG.rgd_scored.mol2

to view the rigidly docked conformations of the ligand.

'''Flexible Docking'''

Flexible docking takes longer than rigid docking. However, it is very important for virtual screening. Some input parameters change for flexible docking, and these changes cause additional questions to be asked.

First, create a new file flx.in:

 touch flx.in

Then run DOCK6:

 dock6 -i flx.in

Several additional questions will be asked; answer them as listed below:

 ligand_atom_file ../01.dockprep/4TKG.lig.mol2
 limit_max_ligands no
 skip_molecule no
 read_mol_solvation no
 calculate_rmsd yes
 use_rmsd_reference_mol no
 use_database_filter no
 orient_ligand yes
 automated_matching yes
 receptor_site_file ../02.surface-spheres/selected_spheres.sph
 max_orientations 1000
 critical_points no
 chemical_matching no
 use_ligand_spheres no
 use_internal_energy yes
 internal_energy_rep_exp 12
 flexible_ligand yes
 user_specified_anchor no
 limit_max_anchors no
 min_anchor_size 5
 pruning_use_clustering yes
 pruning_max_orients 1000
 pruning_clustering_cutoff 100
 pruning_conformer_score_cutoff 100
 use_clash_overlap no
 write_growth_tree no
 bump_filter no
 score_molecules yes
 contact_score_primary no
 contact_score_secondary no
 grid_score_primary yes
 grid_score_secondary no
 grid_score_rep_rad_scale 1
 grid_score_vdw_scale 1
 grid_score_es_scale 1
 grid_score_grid_prefix ../03.box-grid/grid
 multigrid_score_secondary no
 dock3.5_score_secondary no
 continuous_score_secondary no
 descriptor_score_secondary no
 gbsa_zou_score_secondary no
 gbsa_hawkins_score_secondary no
 SASA_descriptor_score_secondary no
 amber_score_secondary no
 minimize_ligand yes
 minimize_anchor yes
 minimize_flexible_growth yes
 use_advanced_simplex_parameters no
 simplex_max_cycles 1
 simplex_score_converge 0.1
 simplex_cycle_converge 1.0
 simplex_trans_step 1.0
 simplex_rot_step 0.1
 simplex_tors_step 10.0
 simplex_anchor_max_iterations 500
 simplex_grow_max_iterations 500
 simplex_grow_tors_premin_iterations 0
 simplex_random_seed 0
 simplex_restraint_min no
 atom_model all
 vdw_defn_file ../zzz.parameters/vdw_AMBER_parm99.defn
 flex_defn_file ../zzz.parameters/flex.defn
 flex_drive_file ../zzz.parameters/flex_drive.tbl
 ligand_outfile_prefix 4TKG.flx
 write_orientations no
 num_scored_conformers 5000
 write_conformations no
 cluster_conformations yes
 cluster_rmsd_threshold 2.0
 rank_ligands no

After that, we can open Chimera to see the result of flexible docking. Below are two examples of a final docked ligand in the receptor binding site pocket: the best and the worst energy-scored ligands. The original ligand is shown in cyan and the docked ligand is colored by heteroatom.

[[Image:dock_bestligand_and original.png|thumb|375px|center|Best scored ligand result]]

[[Image:dock_worstligand_and original.png|thumb|375px|center|Worst scored ligand result]]

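To pick out the best and worst scored poses without opening every one in Chimera, the scores can be pulled from the scored mol2 file. The sketch below assumes that each pose is preceded by a header line of the form "########## Grid Score: value" (or "Grid_Score" in other DOCK versions); that header layout is an assumption you should check against your own 4TKG.flx_scored.mol2 file. The function name <code>grid_scores</code> and the sample text are made up.

```python
import re

# Assumed header layout; adjust the pattern if your output file differs.
SCORE_LINE = re.compile(r"^#+\s*Grid[ _]Score:\s*(-?\d+(?:\.\d+)?)")


def grid_scores(mol2_text):
    """Return the grid score of every pose, in file order."""
    scores = []
    for line in mol2_text.splitlines():
        match = SCORE_LINE.match(line)
        if match:
            scores.append(float(match.group(1)))
    return scores


# Made-up excerpt with two poses; real files contain full molecule records.
sample = """\
##########    Grid Score:    -52.10
@<TRIPOS>MOLECULE
##########    Grid Score:    -38.75
@<TRIPOS>MOLECULE
"""
scores = grid_scores(sample)
best, worst = min(scores), max(scores)
print(best, worst)
```

Since more negative scores are more favorable, the minimum is the best pose and the maximum the worst.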
==VI. Virtual Screening==

george jones

Sam Chiappone

First, copy your Dock directory onto seawulf. If you are logged in to seawulf, run:

  scp -r USERNAME@silver.mathlab.sunysb.edu:~/Dock .

===QSUB File===

The qsub file is used to submit a job to the queue of the cluster.

Create a file named qsub.csh in the 06.large-screening directory on seawulf. Its contents should be:

----

 #! /bin/tcsh
 #PBS -l nodes=2:ppn=2
 #PBS -l walltime=24:00:00
 #PBS -o zzz.qsub.out
 #PBS -e zzz.qsub.err
 #PBS -N large-vs
 #PBS -V
 cd /nfs/user03/username/Dock/06.large-screening
 mpirun -n 4 /nfs/user03/fochtman/dock6/bin/dock6.mpi -i large-vs.in -o large-vs.out

----

* -l = comma-separated list of the resources required to run the job
* -o = defines the standard output stream
* -e = defines the standard error stream
* -N = declares the name of the job
* nodes=2:ppn=2 = the format of the nodes request is <number of nodes>:ppn=<number of processors per node>, so this asks for 2 nodes with 2 processors each
* walltime = sets the limit on how long the job can run on the cluster

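The number passed to mpirun -n in the script matches the resources requested on the nodes line: 2 nodes with 2 processors each gives 4 processes. A small sketch of that arithmetic, with a hypothetical helper name <code>total_processes</code>, is:

```python
def total_processes(resource):
    """Parse a 'nodes=<n>:ppn=<m>' request and return n * m."""
    fields = dict(item.split("=", 1) for item in resource.split(":"))
    return int(fields["nodes"]) * int(fields["ppn"])


# The request used in qsub.csh above.
print(total_processes("nodes=2:ppn=2"))
```

If you change the nodes or ppn values, it is reasonable to change the -n value of mpirun to the same product so that the job uses exactly the processors it reserved.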
===Large-vs.in===

You must also create a large-vs.in file as the input file, with the following contents:

 ligand_atom_file                                             small-vs-21.mol2
 limit_max_ligands                                            no
 skip_molecule                                                no
 read_mol_solvation                                           no
 calculate_rmsd                                               no
 use_database_filter                                          no
 orient_ligand                                                yes
 automated_matching                                           yes
 receptor_site_file                                           ../02.surface-spheres/selected_spheres.sph
 max_orientations                                             1000
 critical_points                                              no
 chemical_matching                                            no
 use_ligand_spheres                                           no
 use_internal_energy                                          yes
 internal_energy_rep_exp                                      12
 flexible_ligand                                              yes
 user_specified_anchor                                        no
 limit_max_anchors                                            no
 min_anchor_size                                              5
 pruning_use_clustering                                       yes
 pruning_max_orients                                          1000
 pruning_clustering_cutoff                                    100
 pruning_conformer_score_cutoff                               100
 use_clash_overlap                                            no
 write_growth_tree                                            no
 bump_filter                                                  no
 score_molecules                                              yes
 contact_score_primary                                        no
 contact_score_secondary                                      no
 grid_score_primary                                           yes
 grid_score_secondary                                         no
 grid_score_rep_rad_scale                                     1
 grid_score_vdw_scale                                         1
 grid_score_es_scale                                          1
 grid_score_grid_prefix                                       ../03.box-grid/grid
 multigrid_score_secondary                                    no
 dock3.5_score_secondary                                      no
 continuous_score_secondary                                   no
 descriptor_score_secondary                                   no
 gbsa_zou_score_secondary                                     no
 gbsa_hawkins_score_secondary                                 no
 SASA_descriptor_score_secondary                              no
 amber_score_secondary                                        no
 minimize_ligand                                              yes
 minimize_anchor                                              yes
 minimize_flexible_growth                                     yes
 use_advanced_simplex_parameters                              no
 simplex_max_cycles                                           1
 simplex_score_converge                                       0.1
 simplex_cycle_converge                                       1.0
 simplex_trans_step                                           1.0
 simplex_rot_step                                             0.1
 simplex_tors_step                                            10.0
 simplex_anchor_max_iterations                                500
 simplex_grow_max_iterations                                  500
 simplex_grow_tors_premin_iterations                          0
 simplex_random_seed                                          0
 simplex_restraint_min                                        no
 atom_model                                                   all
 vdw_defn_file                                                ../test-dock/zzz.parameters/vdw_AMBER_parm99.defn
 flex_defn_file                                               ../test-dock/zzz.parameters/flex.defn
 flex_drive_file                                              ../test-dock/zzz.parameters/flex_drive.tbl
 ligand_outfile_prefix                                        large-vs
 write_orientations                                           no
 num_scored_conformers                                        1
 rank_ligands                                                 no

==VIII. Frequently Encountered Problems==

Latest revision as of 13:16, 25 March 2015

For additional Rizzo Lab tutorials see DOCK Tutorials. Use this link Wiki Formatting as a reference for editing the wiki. This tutorial was developed collaboratively by the AMS 536 class of 2014, using DOCK v6.6.

I. Introduction

DOCK

DOCK is a molecular docking program used in drug discovery. It was developed by Irwin D. Kuntz, Jr. and colleagues at UCSF (see UCSF DOCK). This program, given a protein binding site and a small molecule, tries to predict the correct binding mode of the small molecule in the binding site, and the associated binding energy. Small molecules with highly favorable binding energies could be new drug leads. This makes DOCK a valuable drug discovery tool. DOCK is typically used to screen massive libraries of millions of compounds against a protein to isolate potential drug leads. These leads are then further studied, and could eventually result in a new, marketable drug. DOCK works well as a screening procedure for generating leads, but is not currently as useful for optimization of those leads.

DOCK 6 uses an incremental construction algorithm called anchor and grow. It is described by a three-step process:

  1. Rigid portion of ligand (anchor) is docked by geometric methods.
  2. Non-rigid segments added in layers; energy minimized.
  3. The resulting configurations are 'pruned' and energy re-minimized, yielding the docked configurations.
1HVR Receptor surface

Poly ADP Ribose Polymerase (PARP)

Poly (ADP-ribose) polymerase (PARP) is a family of proteins involved in a number of cellular processes, mainly DNA repair and programmed cell death (Wikipedia: http://en.wikipedia.org/wiki/Poly_ADP_ribose_polymerase). The particular PARP family member we will focus on is PARP5b (also known as Tankyrase 2), whose catalytic domain contains 227 amino acid residues. Olaparib (AZD-2281, trade name Lynparza) is an FDA-approved chemotherapeutic agent, developed by KuDOS Pharmaceuticals and later by AstraZeneca. It is an inhibitor of PARP, an enzyme involved in DNA repair. It acts against cancers in people with hereditary BRCA1 or BRCA2 mutations, which include many ovarian, breast, and prostate cancers (Wikipedia: http://en.wikipedia.org/wiki/Olaparib).

In this class, we will perform docking experiments and virtual screening on a crystallographic structure of PARP5b in complex with a small-molecule inhibitor, olaparib (PDB ID: 4TKG).

Organizing Directories

While performing docking, it is convenient to adopt a standard directory structure / naming scheme, so that files are easy to find / identify. For this tutorial, we will use something similar to the following:

~username/AMS536/dock-tutorial/00.files/
                              /01.dockprep/
                              /02.surface-spheres/
                              /03.box-grid/
                              /04.dock/
                              /05.mini-virtual-screen/
                              /06.virtual-screen/
                             

In addition, most of the important files that are derived from the original crystal structure will be given a prefix that is the same as the PDB code, '4TKG'. The following sections in this tutorial will adhere to this directory structure/naming scheme.
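
The directory layout above can be created in one step. The following is a minimal sketch: the function name make_dock_dirs and the DOCK_TUTORIAL_DIR variable are hypothetical helpers introduced here for illustration, and the default location is only an assumption based on the listing above.

```shell
# make_dock_dirs BASE
# Creates the tutorial's numbered subdirectories under BASE.
make_dock_dirs() {
  for d in 00.files 01.dockprep 02.surface-spheres 03.box-grid \
           04.dock 05.mini-virtual-screen 06.virtual-screen; do
    mkdir -p "$1/$d"
  done
}

# The default location is an assumption; replace it with your own path.
make_dock_dirs "${DOCK_TUTORIAL_DIR:-$HOME/AMS536/dock-tutorial}"
```

Because mkdir -p is used, running the snippet again leaves existing directories and their contents untouched.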


II. Preparing the Receptor and Ligand

Go to the Protein Data Bank website (pdb.org) and search for 4TKG. This is the code for the crystal structure of PARP in complex with olaparib. Download the PDB (text) file for this structure. Then go to your 00.files directory and copy the file there using the command below.

 cp ~/Downloads/4TKG.pdb ./

We will then create four files in the 01.dockprep/ directory:

 4TKG.dockprep.mol2  
 4TKG.lig.mol2  
 4TKG.rec.mol2  
 4TKG.rec.noH.pdb

Create the dockprep file

To create a temp.pdb file, first open 4TKG.pdb in Chimera. You will notice that there are four copies of this protein-ligand complex in the original crystal structure. Since we only want to work with one of these, select chain A (Select -> Chain -> A). Once chain A is selected, invert the selection via Select -> Invert (all models). Next, delete the other chains via Actions -> Atoms/Bonds -> delete. Save this file as temp.pdb in your 00.files directory.

For the "4TKG.dockprep.mol2" file: open temp.pdb in Chimera and delete the water molecules. To prepare the file for docking, go to Tools -> Surface/Binding Analysis -> Dock Prep. Use all of the default settings and select charge model AMBER ff14SB. Save this file as 4TKG.dockprep.mol2 in your 01.dockprep directory.

Next, we will want to make the ligand file. To accomplish this, you will open your 4TKG.dockprep.mol2 file in Chimera. Select the ligand and 'invert selection' as before. You can then proceed to delete everything except the ligand. Save this as 4TKG.lig.mol2 in your 01.dockprep directory.

Once again, open up the 4TKG.dockprep.mol2 file in Chimera. Select the ligand and delete it. Save this as 4TKG.rec.mol2. From here select all the hydrogens and delete them. Save this as 4TKG.rec.noH.pdb.


III. Generating Receptor Surface and Spheres

Generating the Receptor Surface

The following steps will be done in the 02.surface-spheres directory:

 cd 02.surface-spheres

The receptor surface is generated with Chimera, which can be launched from the terminal:

 Chimera

  • Go File -> Open and choose 4TKG.rec.noH.pdb (the receptor PDB file with hydrogens removed, saved in 01.dockprep).
  • Go Actions -> Surface -> show (the surface should be displayed).
  • Go Tools -> Structure Editing -> Write DMS and save the file as 4TKG.rec.dms.

Placing Spheres

Sphgen is a program that generates sets of overlapping spheres that describe the shape of a molecule or molecular surface. Spheres are generated over the entire surface supplied as input (here, the receptor). For further information on how Sphgen functions, please refer to the latest version of the DOCK manual:

<http://dock.compbio.ucsf.edu/DOCK_6/dock6_manual.htm>

To generate spheres using Sphgen follow the steps below:

Step 1. Create an input file named INSPH with the following information:

vim INSPH
4TKG.rec.dms #surface file generated above, used as the input file
R            #flag to place spheres outside (R) or inside (L) of the surface
X            #flag that tells sphgen which subset of surface points to use (X = all points)
0.0          #flag that prevents the generation of large spheres with close surface contacts (default = 0.0)
4.0          #maximum radius of the generated spheres (default = 4.0 Angstroms)
1.4          #minimum radius of the generated spheres (default = radius of probe)
4TKG.rec.sph #output file that will contain the clustered spheres
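
As an alternative to typing the file in an editor, INSPH can be written from the shell. This is a minimal sketch that assumes the text after each # in the listing is explanatory annotation and is not meant to be part of the file, so only the seven values are written.

```shell
# Write INSPH with one value per line and no annotations.
# Assumption: the surface file is named 4TKG.rec.dms, as in the listing above.
cat > INSPH <<'EOF'
4TKG.rec.dms
R
X
0.0
4.0
1.4
4TKG.rec.sph
EOF
```

The quoted 'EOF' delimiter stops the shell from expanding anything inside the here-document, so the values are written exactly as shown.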

Step 2. Run the program Sphgen using the command sphgen:

sphgen -i INSPH -o OUTSPH
-i is the flag that gives sphgen the input file
INSPH is the file created above that gives sphgen its instructions
-o is the flag that names the output file
OUTSPH is the output file with information about the spheres generated by sphgen

Step 3. Visualization of the spheres generated:

Visualization of the spheres can be done directly with chimera or with the program showsphere

3 a. Visualization directly with Chimera:

  • Launch Chimera, choose File -> Open, choose 4TKG.rec.mol2
  • choose File -> Open, choose 4TKG.rec.sph

You should have an image like this:

4TKG Receptor surface (light gray) with all spheres (various colors) generated

3 b. Visualization with showsphere:

showsphere converts the .sph file into PDB format.

(i) Run showsphere, by typing showsphere into the terminal:

showsphere

You will be prompted with the following questions:

Enter name of sphere cluster file:
     4TKG.rec.sph
Enter cluster number to process (<0 = all):
     -1
Generate surfaces as well as pdb files (<N>/Y)?
     N
Enter name for output file prefix:
     output_spheres
Process cluster 0 (contains ALL spheres) (<N>/Y)? 
     N
Entering -1 processes all clusters, so that all possible spheres are shown.

(ii) Open Chimera

  • Launch Chimera, choose File -> Open, choose 4TKG.rec.noH.pdb
  • Go File -> Open, choose output_spheres.pdb

You should see many spheres placed all over the receptor surface.

4TKG Receptor surface (light gray) with all spheres (various colors) using the program showsphere for visualization. The Yellow spheres are located in the binding site of the receptor

Step 4. Selecting spheres of interest:

To select spheres of interest, run the program named sphere_selector in the terminal. The program selects the spheres that lie within a user-defined radius (in this case, 8.0 angstroms) of a target molecule or a known binding site:

sphere_selector 4TKG.rec.sph ../01.dockprep/4TKG.lig.mol2 8.0

A new file named selected_spheres.sph will be generated.
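
The selection criterion can be illustrated with a short awk sketch. This is not how sphere_selector is implemented and it does not read .sph or mol2 files. It assumes two hypothetical plain-text files holding one "x y z" coordinate triple per line, and it keeps every sphere centre that lies within the cutoff of at least one ligand atom.

```shell
# within_cutoff SPHERES LIGAND CUTOFF
# SPHERES and LIGAND are hypothetical plain-text files with "x y z" per line.
# Prints the sphere centres that lie within CUTOFF of any ligand atom.
within_cutoff() {
  awk -v cutoff="$3" '
    # First file (ligand): store the atom coordinates.
    NR == FNR { lx[NR] = $1; ly[NR] = $2; lz[NR] = $3; n = NR; next }
    # Second file (spheres): compare squared distances with the squared cutoff.
    {
      for (i = 1; i <= n; i++) {
        dx = $1 - lx[i]; dy = $2 - ly[i]; dz = $3 - lz[i]
        if (dx*dx + dy*dy + dz*dz <= cutoff*cutoff) { print; break }
      }
    }
  ' "$2" "$1"
}
```

Comparing squared distances avoids a square root for every sphere-atom pair, and the break stops the inner loop as soon as one ligand atom is close enough.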

Step 5. Visualize the spheres using showsphere as previously done:

showsphere

When prompted on the command line, answer the questions as follows:

Enter name of sphere cluster file:
     selected_spheres.sph
Enter cluster number to process (<0 = all):
     -1
Generate surfaces as well as pdb files (<N>/Y)?
     N
Enter name for output file prefix:
     output_spheres_selected
Process cluster 0 (contains ALL spheres) (<N>/Y)? 
     N

View spheres in Chimera:

  • Launch Chimera, choose File -> Open, choose 4TKG.rec.noH.pdb
  • Go File -> Open, choose output_spheres_selected.pdb
  • Go Select -> Residue -> SPH
  • Go Actions -> Atoms/Bonds -> sphere

4TKG Receptor surface (light gray) with spheres (blue) within 8 Å

IV. Generating Box and Grid

Box Generation

Before the grid can be calculated for grid-based scoring in DOCK, a box enclosing the docking region must be defined. The interactive program showbox defines the location and size of the region over which the grid program will compute the grid. The output of showbox is in PDB format and can be visualized with any program capable of displaying PDB files, such as Chimera. The box is generated step by step as follows:

  • Go to the directory: 03.box-grid/
   cd ../03.box-grid


  • Make a new file in this directory and name it showbox.in
   vim showbox.in
  • This will automatically open the file showbox.in. Edit the file showbox.in as follows:
   Y                                               # Yes, automatically construct a box around the spheres
   8.0                                             # Extra margin around the spheres, in Angstroms
   ../02.surface-spheres/selected_spheres.sph      # Sphere file
   1                                               # Cluster number
   4TKG.box.pdb                                    # Name of the output file


  • Save the file using the command:
    :wq
  • Run the command:
    showbox < showbox.in
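
Since showbox reads its answers from standard input, showbox.in can also be written directly from the shell. This is a minimal sketch that assumes the annotations after # in the listing above are explanatory only, so each line of the file holds just the answer to one prompt.

```shell
# Write showbox.in with one answer per line and no annotations.
cat > showbox.in <<'EOF'
Y
8.0
../02.surface-spheres/selected_spheres.sph
1
4TKG.box.pdb
EOF
```

The file is then passed to the program exactly as in the last step, with showbox reading it through input redirection.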


View box in Chimera:

  • Launch Chimera, choose File -> Open, choose output_spheres_selected.pdb
  • Go File -> Open, choosing 4TKG.box.pdb


Selected sphere surface with generated box

Grid computing

To save computational resources and speed up docking, we let DOCK pre-calculate the potential energy over the docking region (the box defined in the previous section) before performing the docking calculation.

The grid program offers two ways to evaluate the potential energy in the docking region: contact scoring and energy scoring. Either method can be enabled independently by answering “yes” or “no” for the corresponding parameter in the input file grid.in. Once the grid calculation finishes, the results are saved in files with the corresponding extensions: *.cnt and *.nrg. Another important option is the bump grid, which flags positions where a ligand atom would sterically overlap with receptor atoms. It is switched on or off in the same way as contact or energy scoring.

In this tutorial, we use only the energy scoring option. The grid program uses the empirical Lennard-Jones model and the Coulomb electrostatic interaction function to approximate the potential energy at each grid point. For more details about the theoretical basis of the grid calculation, please refer to: Energy Scoring Method in Grid

Energy scoring function

The functional form of the Coulomb potential is fixed. However, you can specify the exponents of the van der Waals terms (a and b) by setting the attractive_exponent and repulsive_exponent variables. The other coefficients in the Lennard-Jones model are specified by the vdw_AMBER_parm99.defn file.
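
For reference, the energy score can be written out as below. This is a sketch based on the form given in the DOCK 6 manual, not taken from this page. Reading a as the repulsive exponent and b as the attractive exponent is an assumption about the notation used above.

```latex
E = \sum_{i=1}^{\mathrm{lig}} \sum_{j=1}^{\mathrm{rec}}
    \left(
      \frac{A_{ij}}{r_{ij}^{a}}
      - \frac{B_{ij}}{r_{ij}^{b}}
      + 332\,\frac{q_i\, q_j}{D\, r_{ij}}
    \right)
```

Here r_ij is the distance between ligand atom i and receptor atom j, A_ij and B_ij are the van der Waals repulsion and attraction parameters, q_i and q_j are partial charges, D is the dielectric function, and 332 converts the electrostatic term to kcal/mol. With the settings used in this tutorial, a = 9 and b = 6, and with a distance-dependent dielectric and a factor of 4, D = 4 r_ij.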

In practice, you have two ways to calculate the grid.

The more direct and efficient way:

  • Create the grid input file:
 vi grid.in
  • Set all variables in grid.in, then run the program:
 grid -i grid.in

A more interactive and user-friendly way:

  • Run the grid program
 grid

Then follow the program's prompts and set each variable by answering the questions. If this is your first time running DOCK, we strongly recommend the second method. For this tutorial, all the required parameters are summarized below with a brief description.

grid.in

Parameter               Value                                     Description
compute_grids           yes                                       compute scoring grids
grid_spacing            0.4                                       distance between grid points along each axis (in Å)
output_molecule         no                                        write the coordinates of the receptor to a new file
contact_score           no                                        compute the contact grid (default is no)
energy_score            yes                                       compute the energy grid; this is the scoring method used in this tutorial
energy_cutoff_distance  9999                                      maximum distance between atoms for which the energy contribution is computed
atom_model              a                                         a = all-atom model, in which all hydrogens are represented explicitly; u = united-atom model, in which hydrogens bonded to carbons are merged into the carbon
attractive_exponent     6                                         exponent of the attractive Lennard-Jones term in the van der Waals potential
repulsive_exponent      9                                         exponent of the repulsive Lennard-Jones term in the van der Waals potential
distance_dielectric     yes                                       make the dielectric constant linearly dependent on distance
dielectric_factor       4                                         coefficient of the distance-dependent dielectric
bump_filter             yes                                       compute a bump grid, used to screen orientations for clashes before scoring and minimization
bump_overlap            0.75                                      fraction of allowed overlap between ligand and receptor atoms, where 1 means no overlap is allowed and 0 means full overlap is permitted
receptor_file           ../01.dockprep/4TKG.rec.mol2              our receptor file
box_file                4TKG.box.pdb                              the box file generated in the Box Generation section
vdw_definition_file     ../zzz.parameters/vdw_AMBER_parm99.defn   van der Waals parameter file
score_grid_prefix       grid                                      prefix for the grid file names; the extensions are added automatically
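
The parameters summarized above can be collected into grid.in from the shell. This is a minimal sketch using the values from this section. The receptor file name 4TKG.rec.mol2 is an assumption that follows the naming used when the receptor was saved in Section II.

```shell
# Write grid.in with the parameter values summarized in this section.
# Assumption: the receptor was saved as 4TKG.rec.mol2 in 01.dockprep.
cat > grid.in <<'EOF'
compute_grids                  yes
grid_spacing                   0.4
output_molecule                no
contact_score                  no
energy_score                   yes
energy_cutoff_distance         9999
atom_model                     a
attractive_exponent            6
repulsive_exponent             9
distance_dielectric            yes
dielectric_factor              4
bump_filter                    yes
bump_overlap                   0.75
receptor_file                  ../01.dockprep/4TKG.rec.mol2
box_file                       4TKG.box.pdb
vdw_definition_file            ../zzz.parameters/vdw_AMBER_parm99.defn
score_grid_prefix              grid
EOF
```

The resulting file is used with the non-interactive invocation shown earlier in this section (grid -i grid.in).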

For more detail, please refer to:

http://dock.compbio.ucsf.edu/DOCK_6/tutorials/grid_generation/generating_grid.html

http://dock.compbio.ucsf.edu/DOCK_6/dock6_manual.htm#GridOverview

V. Docking a Single Molecule for Pose Reproduction

Minimizations

One can use the DOCK software to perform an energy minimization. For example, say you wish to minimize a ligand in a receptor's binding site. One way to do this is to first generate the grid for the receptor alone, without the ligand. Then, you minimize the ligand in this grid.

To accomplish this, first generate the grid for the receptor (if you haven't already done so). Next, create a text file by using the following command:

touch min.in

This command creates a text file named min.in but leaves it empty. Once you have done this, run the following command:

dock6 -i min.in

The dock program will start and it will begin to ask you a series of questions. Answer the questions according to the outline below:

Note: as you answer these questions, the answers will be saved to the min.in file so that next time you don't have to type the answers in by hand.


ligand_atom_file ../01.dockprep/4TKG.lig.mol2

limit_max_ligands no

skip_molecule no

read_mol_solvation no

calculate_rmsd yes

use_rmsd_reference_mol no

use_database_filter no

orient_ligand no

use_internal_energy yes

internal_energy_rep_exp 12

flexible_ligand no

bump_filter no

score_molecules yes

contact_score_primary no

contact_score_secondary no

grid_score_primary yes

grid_score_secondary no

grid_score_rep_rad_scale 1

grid_score_vdw_scale 1

grid_score_es_scale 1

grid_score_grid_prefix ../03.box-grid/grid

multigrid_score_secondary no

dock3.5_score_secondary no

continuous_score_secondary no

descriptor_score_secondary no

gbsa_zou_score_secondary no

gbsa_hawkins_score_secondary no

SASA_descriptor_score_secondary no

amber_score_secondary no

minimize_ligand yes

simplex_max_iterations 1000

simplex_tors_premin_iterations 0

simplex_max_cycles 1

simplex_score_converge 0.1

simplex_cycle_converge 1.0

simplex_trans_step 1.0

simplex_rot_step 0.1

simplex_tors_step 10.0

simplex_random_seed 0

simplex_restraint_min yes

simplex_coefficient_restraint 10.0

atom_model all

vdw_defn_file ../zzz.parameters/vdw_AMBER_parm99.defn

flex_defn_file ../zzz.parameters/flex.defn

flex_drive_file ../zzz.parameters/flex_drive.tbl

ligand_outfile_prefix 4TKG.min

write_orientations no

num_scored_conformers 1

rank_ligands no


Once you have answered all the questions, dock will perform the minimization and will create an output file named 4TKG.min_scored.mol2

You can view your minimized ligand by opening it in chimera. To do this, open chimera and then open the receptor mol2 file. Then, open the output mol2 file which you got from the minimization (4TKG.min_scored.mol2).

A few points about the min.in input file:

Each line in the input file tells dock what to do next. For example, the following lines serve the following purposes:

ligand_atom_file specifies the path to the ligand so that dock can retrieve it.

minimize_ligand tells dock to perform an energy minimization.

grid_score_grid_prefix specifies the path to the grid so that dock can retrieve and use it for the minimization.

ligand_outfile_prefix tells dock what prefix to use for the output file.

Descriptions for the other lines can be found in the dock user manual.
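
Since several lines of the input file are paths, a quick check that they resolve can save a failed run. The following is a minimal sketch. The function check_dock_paths is a hypothetical helper and not part of DOCK, and it assumes each line of the input file has the form "parameter value" and that the energy grid carries the .nrg extension mentioned in the grid section.

```shell
# check_dock_paths INPUT_FILE
# Prints every file-valued parameter whose path does not exist.
# Returns non-zero if anything is missing. Run it from the directory
# in which dock6 will be started, because the paths are relative.
check_dock_paths() {
  missing=0
  while read -r key value || [ -n "$key" ]; do
    case "$key" in
      ligand_atom_file|receptor_site_file|vdw_defn_file|flex_defn_file|flex_drive_file)
        if [ ! -e "$value" ]; then
          echo "missing: $key -> $value"
          missing=1
        fi
        ;;
      grid_score_grid_prefix)
        # The value is a prefix, so test for the energy grid file itself.
        if [ ! -e "$value.nrg" ]; then
          echo "missing: $key -> $value.nrg"
          missing=1
        fi
        ;;
    esac
  done < "$1"
  return "$missing"
}
```

A typical use would be to run check_dock_paths min.in before starting dock6 and to fix any path it reports.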



Rigid docking

Rigid docking docks a ligand to its receptor without changing the ligand's internal conformation; the ligand is treated as an inflexible, rigid body. This is much faster than flexible docking and is used in cases where the ligand has too many rotatable bonds. To run rigid docking in DOCK 6, first create an input file:

touch rgd.in

Now, run:

dock6 -i rgd.in

You will now be required to answer a series of questions regarding location of input file, docking parameters and output file prefix.

Answer all the questions as follows:

ligand_atom_file ../01.dockprep/4TKG.lig.mol2

limit_max_ligands no

skip_molecule no

read_mol_solvation no

calculate_rmsd yes

use_rmsd_reference_mol no

use_database_filter no

orient_ligand yes

automated_matching yes

receptor_site_file ../02.surface-spheres/selected_spheres.sph

max_orientations 1000

critical_points no

chemical_matching no

use_ligand_spheres no

use_internal_energy yes

internal_energy_rep_exp 12

flexible_ligand no

bump_filter no

score_molecules yes

contact_score_primary no

contact_score_secondary no

grid_score_primary yes

grid_score_secondary no

grid_score_rep_rad_scale 1

grid_score_vdw_scale 1

grid_score_es_scale 1

grid_score_grid_prefix ../03.box-grid/grid

multigrid_score_secondary no

dock3.5_score_secondary no

continuous_score_secondary no

descriptor_score_secondary no

gbsa_zou_score_secondary no

gbsa_hawkins_score_secondary no

SASA_descriptor_score_secondary no

amber_score_secondary no

minimize_ligand yes

simplex_max_iterations 1000

simplex_tors_premin_iterations 0

simplex_max_cycles 1

simplex_score_converge 0.1

simplex_cycle_converge 1.0

simplex_trans_step 1.0

simplex_rot_step 0.1

simplex_tors_step 10.0

simplex_random_seed 0

simplex_restraint_min no

atom_model all

vdw_defn_file ../zzz.parameters/vdw_AMBER_parm99.defn

flex_defn_file ../zzz.parameters/flex.defn

flex_drive_file ../zzz.parameters/flex_drive.tbl

ligand_outfile_prefix 4TKG.rgd

write_orientations no

num_scored_conformers 5000

write_conformations no

cluster_conformations yes

cluster_rmsd_threshold 2.0

rank_ligands no


Once the program has run, an output file is generated in the specified folder and can be viewed using Chimera. Open the receptor mol2 file in Chimera and show its surface representation. Now go to:

Tools>Surface/Binding Analysis>ViewDock>4TKG.rgd_scored.mol2

to view the rigid docked conformations of the ligand.


Flexible Docking


Flexible docking takes longer than rigid docking, but it is very important for virtual screening. Some input parameters change for flexible docking, and these changes cause additional questions to be prompted.

First, create a new file flx.in:

touch flx.in

Then run DOCK 6:

dock6 -i flx.in

Several additional questions will be asked; answer them as listed below:

ligand_atom_file ../01.dockprep/4TKG.lig.mol2

limit_max_ligands no

skip_molecule no

read_mol_solvation no

calculate_rmsd yes

use_rmsd_reference_mol no

use_database_filter no

orient_ligand yes

automated_matching yes

receptor_site_file ../02.surface-spheres/selected_spheres.sph

max_orientations 1000

critical_points no

chemical_matching no

use_ligand_spheres no

use_internal_energy yes

internal_energy_rep_exp 12

flexible_ligand yes

user_specified_anchor no

limit_max_anchors no

min_anchor_size 5

pruning_use_clustering yes

pruning_max_orients 1000

pruning_clustering_cutoff 100

pruning_conformer_score_cutoff 100

use_clash_overlap no

write_growth_tree no

bump_filter no

score_molecules yes

contact_score_primary no

contact_score_secondary no

grid_score_primary yes

grid_score_secondary no

grid_score_rep_rad_scale 1

grid_score_vdw_scale 1

grid_score_es_scale 1

grid_score_grid_prefix ../03.box-grid/grid

multigrid_score_secondary no

dock3.5_score_secondary no

continuous_score_secondary no

descriptor_score_secondary no

gbsa_zou_score_secondary no

gbsa_hawkins_score_secondary no

SASA_descriptor_score_secondary no

amber_score_secondary no

minimize_ligand yes

minimize_anchor yes

minimize_flexible_growth yes

use_advanced_simplex_parameters no

simplex_max_cycles 1

simplex_score_converge 0.1

simplex_cycle_converge 1.0

simplex_trans_step 1.0

simplex_rot_step 0.1

simplex_tors_step 10.0

simplex_anchor_max_iterations 500

simplex_grow_max_iterations 500

simplex_grow_tors_premin_iterations 0

simplex_random_seed 0

simplex_restraint_min no

atom_model all

vdw_defn_file ../zzz.parameters/vdw_AMBER_parm99.defn

flex_defn_file ../zzz.parameters/flex.defn

flex_drive_file ../zzz.parameters/flex_drive.tbl

ligand_outfile_prefix 4TKG.flx

write_orientations no

num_scored_conformers 5000

write_conformations no

cluster_conformations yes

cluster_rmsd_threshold 2.0

rank_ligands no

After that, we can open Chimera to see the result of flexible docking. Below are two examples of a final docked ligand in the receptor binding site pocket: the best and the worst energy-scored ligands. The original ligand is in cyan and the docked ligand is colored by heteroatom.

Best scored ligand result
Worst scored ligand result

VI. Virtual Screening

george jones

Sam Chiappone


First, copy your Dock directory to seawulf. If you are logged in to seawulf, run:

 scp -r USERNAME@silver.mathlab.sunysb.edu:~/Dock .


QSUB File

The qsub file is used to submit a job to the queue of the cluster.

Create a file named qsub.csh in the 06.large-screening directory on seawulf with the following contents:



#! /bin/tcsh
#PBS -l nodes=2:ppn=2
#PBS -l walltime=24:00:00
#PBS -o zzz.qsub.out
#PBS -e zzz.qsub.err
#PBS -N large-vs
#PBS -V
cd /nfs/user03/username/Dock/06.large-screening
mpirun -n 4 /nfs/user03/fochtman/dock6/bin/dock6.mpi -i large-vs.in -o large-vs.out



-l = comma-separated list of the resources required to run the job

-o = defines the file for the standard output stream

-e = defines the file for the standard error stream

-N = declares the name of the job

-V = exports the environment variables of the submitting shell to the job

nodes=2:ppn=2 = the format is <number of nodes>:ppn=<number of processors per node>, so this requests 2 nodes with 2 processors each

walltime = sets the limit on how long the job can run on the cluster
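
The process count passed to mpirun should match the resources requested from the queue: 2 nodes with 2 processors each gives the 4 processes used in the script above. A minimal sketch of that arithmetic follows. The function name mpi_ranks is a hypothetical helper, not a PBS or MPI command.

```shell
# mpi_ranks NODES PPN
# Prints the total number of MPI processes for a nodes/ppn request.
mpi_ranks() {
  echo $(( $1 * $2 ))
}

mpi_ranks 2 2   # matches nodes=2:ppn=2 and prints 4
```

If the nodes or ppn values in the #PBS -l line are changed, the value after mpirun -n should be updated to the product of the two.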

Large-vs.in

You must also create a large-vs.in file as the input file, with the following contents:



ligand_atom_file                                             small-vs-21.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               no
use_database_filter                                          no
orient_ligand                                                yes
automated_matching                                           yes
receptor_site_file                                           ../02.surface-spheres/selected_spheres.sph
max_orientations                                             1000
critical_points                                              no
chemical_matching                                            no
use_ligand_spheres                                           no
use_internal_energy                                          yes
internal_energy_rep_exp                                      12
flexible_ligand                                              yes
user_specified_anchor                                        no
limit_max_anchors                                            no
min_anchor_size                                              5
pruning_use_clustering                                       yes
pruning_max_orients                                          1000
pruning_clustering_cutoff                                    100
pruning_conformer_score_cutoff                               100
use_clash_overlap                                            no
write_growth_tree                                            no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           yes
grid_score_secondary                                         no
grid_score_rep_rad_scale                                     1
grid_score_vdw_scale                                         1
grid_score_es_scale                                          1
grid_score_grid_prefix                                       ../03.box-grid/grid
multigrid_score_secondary                                    no
dock3.5_score_secondary                                      no
continuous_score_secondary                                   no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_descriptor_score_secondary                              no
amber_score_secondary                                        no
minimize_ligand                                              yes
minimize_anchor                                              yes
minimize_flexible_growth                                     yes
use_advanced_simplex_parameters                              no
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1.0
simplex_trans_step                                           1.0
simplex_rot_step                                             0.1
simplex_tors_step                                            10.0
simplex_anchor_max_iterations                                500
simplex_grow_max_iterations                                  500
simplex_grow_tors_premin_iterations                          0
simplex_random_seed                                          0
simplex_restraint_min                                        no
atom_model                                                   all
vdw_defn_file                                                ../test-dock/zzz.parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               ../test-dock/zzz.parameters/flex.defn
flex_drive_file                                              ../test-dock/zzz.parameters/flex_drive.tbl
ligand_outfile_prefix                                        large-vs
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no

VIII. Frequently Encountered Problems