2016 DOCK tutorial with Beta Trypsin

From Rizzo_Lab
Revision as of 14:37, 29 March 2016 by Stonybrook (talk | contribs) (VI. Virtual Screening)
Jump to: navigation, search

For additional Rizzo Lab tutorials see DOCK Tutorials. Use this link Wiki Formatting as a reference for editing the wiki. This tutorial was developed collaboratively by the AMS 536 class of 2014, using DOCK v6.6.

I. Introduction

DOCK

DOCK is a molecular docking program used in drug discovery. It was developed by Irwin D. Kuntz, Jr. and colleagues at UCSF (see UCSF DOCK). This program, given a protein binding site and a small molecule, tries to predict the correct binding mode of the small molecule in the binding site, and the associated binding energy. Small molecules with highly favorable binding energies could be new drug leads. This makes DOCK a valuable drug discovery tool. DOCK is typically used to screen massive libraries of millions of compounds against a protein to isolate potential drug leads. These leads are then further studied, and could eventually result in a new, marketable drug. DOCK works well as a screening procedure for generating leads, but is not currently as useful for optimization of those leads.

DOCK 6 uses an incremental construction algorithm called anchor and grow. It is described by a three-step process:

  1. Rigid portion of ligand (anchor) is docked by geometric methods.
  2. Non-rigid segments added in layers; energy minimized.
  3. The resulting configurations are 'pruned' and energy re-minimized, yielding the docked configurations.

Beta Trypsin

Beta trypsin is a serine protease involved in degrading peptides and proteins. The development of inhibitors of trypsin-like serine proteases has been justified because of the wide processes these enzymes regulate, among them blood coagulation, complement activation, digestion, reproduction and phagocytosis. Early developments of inhibitors for these enzymes were based on the benzamidine chemical structure. Crystal structures of these benzamidine based inhibitors in complex with beta trypsin gave showed the key resides interactions with the inhibitors and pointed out the importance of the "oxyanion hole" within the catalytic site. This gave new insights of the reaction mechanism of the enzyme and revealed what interactions are required to develop inhibitors with increased potency. [1]

In this tutorial we will use PDB website 1BJU deposited crystal structure of beta-trypsin in complex with inhibitor GP6.

Beta trypsin in complex with GP6 inhibitor. Residues interacting with the inhibitor are shown in spheres.

Organizing Directories

While performing docking, it is convenient to adopt a standard directory structure / naming scheme, so that files are easy to find / identify.For this tutorial, we will use something similar to the following:

~username/AMS536-Spring2016/dock-tutorial/00.files/
                                         /01.dockprep/
                                         /02.surface-spheres/
                                         /03.box-grid/
                                         /04.dock/
                                         /05.large-virtual-screen/
                                         /06.virtual-screen/
                                         /07.footprint/
                                         /08.print_fps

In addition, most of the important files that are derived from the original crystal structure will be given a prefix that is thsame as the PDB code, '1BJU'.The following sections in this tutorial will adhere to this directory structure/naming scheme.

II. Preparing the Receptor and Ligand

Download the PDB File (1BJU)

Go to the PDB website to download 1BJU.pdb file(PDB code: 1BJU). Using command or WinSCP to transfer 1BJU.pdb to 00.files. If you are in 00.files now, the command is:

cp ~/Downloads/1BJU.pdb .

Edit the PDB File (1BJU)

Using vi command to open 1BJU.pdb, then deleting bottom and top lines which do not have ATOM or ATPATM in its start. Then changing all ATPATM to ATOM using command:

%s /ATPATM/ATOM  /gc

In addition, changing all GP6 to LIG B using command:

%s /GP6  /LIG B  /gc

Saving this as raw_1BJU.pdb in 00.files.

Prepare ligand and receptor files for dock

Create dockprep file

Open raw_1BJU.pdb file in chimera, in Tools, find structure editing, then click AddH to add hydrogen in this system.

Also in structure editing, click Add charge, then, changing AMBER ff14SB to AMBER ff99SB and changing net charge to +1.

Saving it as 1BJU.dockprep.mol2 in 01.dockprep.

Create receptor file

Open 1BJU.dockprep.mol2 in chimera, then click select to choose ligand, click action in Atoms/Bonds click delete.

Saving this molecule as 1BJU.rec.mol2 in 01.dockprep.

Create ligand file

Open 1BJU.dockprep.mol2 in chimera, using the same method delete the protein molecule.

Saving it as 1BJU.lig.mol2 in 01.dockprep.

Create no hydrogen receptor file

Open 1BJU.rec.mol2 in chimera, click select, then choosing chemistry H to select hydrogen in protein, Using action button to delete H.

Saving it as 1BJU.rec.noH.pdb in 01.dockprep.

III. Generating Receptor Surface and Spheres

Generating the Receptor Surface

Create the directory where the following steps will be done:

  mkdir 02.surface-sphere
  cd 02.surface-sphere

Open Chimera in the terminal to view the receptor surface:

 Chimera

In Chimera complete the following steps to open the receptor file in the 01.dockprep directory:

  File -> Open -> 1BJU.rec.noH.pdb 

To show the surface receptor:

  Action -> Surface -> Show (The surface should look like the picture shown below.)
Surface of the receptor Beta Trypsin without hydrogens as generated by Chimera.

To save a DMS file as 1BJU.rec.dms in 02.surface-sphere directory:

 Tools -> Structure editing -> Write DMS


Creating Spheres

Using the Sphgen program complete following steps:

1. Create an input file name INSPH to be read by Sphgen:

vi INSPH
1BJU.rec.dms #specifies the input file
R            #spheres generated will be outside of teh receptor surface 
X            #specifies that all points won the receptor will be used
0.0          #distance in angstroms (avoids steric clashes)
4.0          #max surface radius of the spheres in angstroms
1.4          #min surface radius of the spheres in angstroms
1BJU.rec.sph #the specified outfile containing all generated spheres

2. Run the Sphgen using the input file INSPH with the command:

sphgen -i INSPH -o OUTSPH
-i is the flag that specifies the input file INSPH
INSPH is int eh specified input file
-o is the flag to specifiy the output file
OUTSPH is the file containing the generated spheres

3. Visualization generated spheres with Chimera:

  • Launch Chimera in termal,
  • Choose File -> Open, choose 1BJU.rec.mol2
  • choose File -> Open, choose 1BJU.rec.sph

You should have an image like this:

1BJU Receptor surface with the generated surface spheres

Selecting Spheres

To select spheres of interest run the program sphere_selector in the terminal at 8.0 angstrom radius of the target molecule or known binding site:

  sphere_selector 1BJU.rec.sph ../01.dockprep/1BJU.lig.mol2 8.0

The output file generated will be selected_spheres.sph

To visualize the spheres using Chimera as previously done:

  • Launch Chimera, choose File -> Open, choose 1BJU.rec.noH.pdb
  • File -> Open, choose output_spheres_selected.pdb
  • Select -> Residue -> SPH
  • Actions -> Atoms/Bonds -> sphere

The selected spheres with the receptor surface should look similar to that as seen below:

1BJU receptor surface with the selected spheres within 8.0 Angstroms


IV. Generating Box and Grid

Monaf Awwa

Box Generation

>Start Chimera and open the RECEPTOR files 1BJU.rec.noH.pdb Next, select the ACTION subheader, then select SURFACE, then SHOW

Once you can see the van der Waals surface of the protein, select the TOOLS subheader, then STRUCTURE EDITING, and lastly select WRITE DMS and save the file as 1BJU.dms

Now that we have a surface, we can more effectively calculate regions of the protein where a small molecule can bind. To further accelerate calculations, we will specify which atoms to calculate the interactions for via the program sphgen

Switch over to the 02.surfacespheres directory with the following command

cd 02.surfacespheres

Next, create an input file for sphgen to read.

vim INSPH

The file should have the following information

1BJU.rec.dms #the file corresponding to the surface is the input file R #we want the spheres outside the protein surface X #we want all subsets of spheres to be generated 0.0 #default size of proteins with surface contacts 4.0 #default maximum of the sphere radius generated (Angstroms) 1.4 #default minimum of the sphere radius generated (Angstroms) Also corresponds to probe radius 1BJU.rec.sph #output file name

Now we can enter the command to run sphgen

sphgen -i INSPH -o OUTSPH

Once completed, check the OUTSPH file to ensure no errors occured

If you'd like to visually inspect the spheres generated, start Chimera and load the 1BJU.rec.mol2 file, then load the 1BJU.rec.sph file.

Now we will use the showsphere program to convert our .sph file back to a .pdb format.

In the same directory, enter the command

showsphere

The following questions will appear. Answer as follows

Enter name of sphere cluster file:

    1BJU.rec.sph

Enter cluster number to process (<0 = all):

    -1

Generate surfaces as well as pdb files (<N>/Y)?

    N

Enter name for output file prefix:

    output_spheres

Process cluster 0 (contains ALL spheres) (<N>/Y)?

    N

You should now have a file named output_spheres.pdb in your directory. You can visualize the spheres by loading chimera and the 1BJU.rec.noH.mol2, then loading the output_spheres.pdb file.

For the docking calculation, we are most interesting in a subset of the spheres. To select the spheres most important to ligand binding, we will use the command

sphere_selector 1BJU.rec.sph ../01.dockprep/1BJU.lig.mol2 8.0

The 8.0 Angstrom value indicates we want to select spheres within 8 angstroms of our ligand. This hypothetically means we are designing a competitive inhibitor.

Now lets create the proper output file for our spheres. Enter the command

showsphere

and answer the following questions

Enter name of sphere cluster file:

    selected_spheres.sph

Enter cluster number to process (<0 = all):

    -1

Generate surfaces as well as pdb files (<N>/Y)?

    N

Enter name for output file prefix:

    output_spheres_selected

Process cluster 0 (contains ALL spheres) (<N>/Y)?

    N

This will ensure the docking calculation samples the most important regions of the protein surface

Grid computing

Since the electronic interactions between two objects in space can be decomposed into pairwise interactions, it is highly efficient to fully calculate a proteins electric contribution to binding energy before DOCKing a ligand.

switch to the 03.box-grid/ directory

cd 03.box-grid/

create an input file named showbox.in

vim showbox.in

enter the following information

  Y                                               # Do you want to calculate a box? (YES)
  8.0                                             # How long should the box length be?
  ../02.surface-spheres/selected_spheres.sph      # which sphere file is our input?
  1                                               # How many clusters for the file?
  1BJU.box.pdb                                    # What do we want to name the output file?

save the file, then enter the following command to generate the box

showbox > showbox.in

The box can be visualized in Chimera by loading the output_spheres_selected.pdb file, then loading the 1BJU.box.pdb file

Next, we will calculate the grid energies

Create the input file for grid. Enter the command

vim grid.in

Run grid with the command

grid - grid.in

This allows you to go back and fix the input file in case there is a mistake when answer the questions prompted by grid.


grid.in Parameter Value Description

compute_grids yes we want to compute grids

grid_spacing 0.4 resolution of grid energetic contributions (Angstroms). The smaller the value, the more accurate the calculation

output_molecule no we don't want to rewrite ligand output file

contact_score no we don't want to deal with contact score

energy_score yes we want to compute energetic interaction based on force field

energy_cutoff_distance 9999 what distance should we stop computing atomic interactions in angstroms

atom_model a corresponds to all atom model rather than united atom

attractive_exponent 6 see Lennard-Jones attraction term

repulsive_exponent 12 see Lennard-Jones repulsion term

distance_dielectric yes linear decrease in electric force

dielectric_factor 4 how much will electric force decrease based on atomic distance in angstroms

bump_filter yes

bump_overlap 0.75

receptor_file ../01.dockprep/1BJU.rec.mol2

box_file 4TKG.box.pdb

vdw_definition_file ../zzz.parameters/vdw_AMBER_parm99.defn characteristic van der Waal radii parameter file

score_grid_prefix grid ONLY THE PREFIX OF THE GRID FILE NAME

V. Docking a Single Molecule for Pose Reproduction

Agatha Lyczek & Haoyue Guo

Minimization

Dock Minimization Input file: min.in

conformer_search_type                                        rigid
use_internal_energy                                          yes
internal_energy_rep_exp                                      12
ligand_atom_file                                             ../01.dockprep/1BJU.lig.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               yes
use_rmsd_reference_mol                                       yes
rmsd_reference_filename                                      ../01.dockprep/1BJU.lig.mol2
use_database_filter                                          no
orient_ligand                                                no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           yes
grid_score_secondary                                         no
grid_score_rep_rad_scale                                     1
grid_score_vdw_scale                                         1
grid_score_es_scale                                          1
grid_score_grid_prefix                                       ../03.box-grid/1BJU.grid
multigrid_score_secondary                                    no
dock3.5_score_secondary                                      no
continuous_score_secondary                                   no
footprint_similarity_score_secondary                         no
ph4_score_secondary                                          no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_descriptor_score_secondary                              no
amber_score_secondary                                        no
minimize_ligand                                              yes
simplex_max_iterations                                       1000
simplex_tors_premin_iterations                               0
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1.0
simplex_trans_step                                           1.0
simplex_rot_step                                             0.1
simplex_tors_step                                            10.0
simplex_random_seed                                          0
simplex_restraint_min                                        yes
simplex_coefficient_restraint                                10.0
atom_model                                                   all
vdw_defn_file                                                ../zzz.parameters/vdw.defn
flex_defn_file                                               ../zzz.parameters/flex.defn
flex_drive_file                                              ../zzz.parameters/flex_drive.tbl
ligand_outfile_prefix                                        1BJU.min
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no

Result:

Minimization result;RMSD = 0.3177; blue: experiment; green: predicted

Rigid docking

To run the input file:

dock6 -i rigid_nr.in

Rigid docking input file: rigid_nr.in

conformer_search_type                                        rigid
use_internal_energy                                          yes
internal_energy_rep_exp                                      12
ligand_atom_file                                             ../01.dockprep/1BJU.lig.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               yes
use_rmsd_reference_mol                                       yes
rmsd_reference_filename                                      ../01.dockprep/1BJU.lig.mol2
use_database_filter                                          no
orient_ligand                                                yes
automated_matching                                           yes
receptor_site_file                                           ../02.surface-sphere/selected_spheres.sph
max_orientations                                             1000
critical_points                                              no
chemical_matching                                            no
use_ligand_spheres                                           no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           yes
grid_score_secondary                                         no
grid_score_rep_rad_scale                                     1
grid_score_vdw_scale                                         1
grid_score_es_scale                                          1
grid_score_grid_prefix                                       ../03.box-grid/1BJU.grid
multigrid_score_secondary                                    no
dock3.5_score_secondary                                      no
continuous_score_secondary                                   no
footprint_similarity_score_secondary                         no
ph4_score_secondary                                          no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_descriptor_score_secondary                              no
amber_score_secondary                                        no
minimize_ligand                                              yes
simplex_max_iterations                                       1000
simplex_tors_premin_iterations                               0
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1.0
simplex_trans_step                                           1.0
simplex_rot_step                                             0.1
simplex_tors_step                                            10.0
simplex_random_seed                                          0
simplex_restraint_min                                        yes
atom_model                                                   all
vdw_defn_file                                                ../zzz.parameters/vdw.defn
flex_defn_file                                               ../zzz.parameters/flex.defn
flex_drive_file                                              ../zzz.parameters/flex_drive.tbl
ligand_outfile_prefix                                        1BJU.rigid_nr
write_orientations                                           no

Best scored rigid docking result:

Rigid docking result;RMSD = 0.7946; blue: experiment; pink: predicted

Flexible Docking

To run the input file:

dock6 -i flex.in

Flexible docking input file: flex.in

conformer_search_type                                        flex
user_specified_anchor                                        no
limit_max_anchors                                            no
min_anchor_size                                              5
pruning_use_clustering                                       yes
pruning_max_orients                                          1000
pruning_clustering_cutoff                                    100
pruning_conformer_score_cutoff                               100.0
use_clash_overlap                                            no
write_growth_tree                                            no
use_internal_energy                                          yes
internal_energy_rep_exp                                      12
ligand_atom_file                                             ../01.dockprep/1BJU.lig.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               yes
use_rmsd_reference_mol                                       yes
rmsd_reference_filename                                      ../01.dockprep/1BJU.lig.mol2
use_database_filter                                          no
orient_ligand                                                yes
automated_matching                                           yes
receptor_site_file                                           ../02.surface-sphere/selected_spheres.sph
max_orientations                                             1000
critical_points                                              no
chemical_matching                                            no
use_ligand_spheres                                           no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           yes
grid_score_secondary                                         no
grid_score_rep_rad_scale                                     1
grid_score_vdw_scale                                         1
grid_score_es_scale                                          1
grid_score_grid_prefix                                       ../03.box-grid/1BJU.grid
multigrid_score_secondary                                    no
dock3.5_score_secondary                                      no
continuous_score_secondary                                   no
footprint_similarity_score_secondary                         no
ph4_score_secondary                                          no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_descriptor_score_secondary                              no
amber_score_secondary                                        no
minimize_ligand                                              yes
minimize_anchor                                              yes
minimize_flexible_growth                                     yes
use_advanced_simplex_parameters                              no
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1.0
simplex_trans_step                                           1.0

Best scored flexible docking result:

Flexible docking result;RMSD = 0.45; blue: experiment; pink: predicted

VI. Virtual Screening

Katie Maffucci Virtual screening uses libraries of small molecules to identify novel chemical structures that are most likely to bind a drug target. The search is narrowed to a few "hit" compounds that can be synthesized or purchased and tested against the target. Structure-based virtual screening is computationally demanding, and requires a parallel computing infrastructure to handle the task. LIRED is a Linux-based computing cluster well suited for this purpose.

DOCK Preparation

To set up the virtual screen, make a directory using the command: "mkdir 05.largevirtualscreen" Print your working directory using the "pwd" command and copy this dock directory into your LIRED account: "scp -r /gfps/home/username/AMS536/docktutorial/05.largevirtualscreen ."


QSUB File

To submit your job to the cluster, a qsub c chell script must be made so that the job can be executed. Its contents should be as follows:

#! /bin/tcsh

  #PBS -l nodes=2:ppn=2
  #PBS -l walltime=24:00:00
  #PBS -o zzz.qsub.out
  #PBS -e zzz.qsub.err
  #PBS -N large_vs
  #PBS -M your.name@stonybrook.edu
  #PBS -V

cd /gfps/home/username/AMS536/docktutorial/

/gpfs/software/intel_2016_update1/impi/5.1.2.150/bin64/mpirun np 4 \ /home/tmcgee/local/AMS_536/dock6/bin/dock6.mpi \"


Input File

An input file must also be made. "vi large_vs.in":


=Post Processing

VIII. Frequently Encountered Problems

EVERYONE