2014 DOCK tutorial with HIV Protease

From Rizzo_Lab
Revision as of 15:23, 3 March 2014 by Stonybrook (talk | contribs) (Preparing the ligand and receptor in Chimera)

For additional Rizzo Lab tutorials see DOCK Tutorials. Use this link Wiki Formatting as a reference for editing the wiki. This tutorial was developed collaboratively by the AMS 536 class of 2013, using DOCK v6.6.

I. Introduction


DOCK is a molecular docking program used in drug discovery. It was developed by Irwin D. Kuntz, Jr. and colleagues at UCSF (see UCSF DOCK). This program, given a protein binding site and a small molecule, tries to predict the correct binding mode of the small molecule in the binding site, and the associated binding energy. Small molecules with highly favorable binding energies could be new drug leads. This makes DOCK a valuable drug discovery tool. DOCK is typically used to screen massive libraries of millions of compounds against a protein to isolate potential drug leads. These leads are then further studied, and could eventually result in a new, marketable drug. DOCK works well as a screening procedure for generating leads, but is not currently as useful for optimization of those leads.

DOCK 6 uses an incremental construction algorithm called anchor and grow. It is described by a three-step process:

  1. Rigid portion of ligand (anchor) is docked by geometric methods.
  2. Non-rigid segments added in layers; energy minimized.
  3. The resulting configurations are 'pruned' and energy re-minimized, yielding the docked configurations.

HIV Protease


Organizing Directories

While performing docking, it is convenient to adopt a standard directory structure / naming scheme, so that files are easy to find / identify. For this tutorial, we will use something similar to the following:


In addition, most of the important files that are derived from the original crystal structure will be given a prefix that is the same as the PDB code, '1HVR'. The following sections in this tutorial will adhere to this directory structure / naming scheme.
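
As an illustration, a layout consistent with the directory names used in the sections of this tutorial could be created as sketched below. Only the directories that are named in this tutorial are included; the names of any further directories (for example, for docking or virtual screening) are not given here and are therefore left out.

```shell
# Create the tutorial tree under dock-tutorial/.
# Directory names are the ones used in the sections of this tutorial.
mkdir -p dock-tutorial/00.files \
         dock-tutorial/01.dockprep \
         dock-tutorial/02.surface-spheres \
         dock-tutorial/03.box-grid

# List the result to confirm the layout.
ls dock-tutorial
```

Using mkdir -p means the command can be re-run safely if some of the directories already exist.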

II. Preparing the Receptor and Ligand

Downloading the PDB Structure (1HVR)

Go to the Protein Data Bank (PDB) website (http://www.rcsb.org/pdb/home/home.do), search for the PDB ID 1HVR, and download the structure as a text file in PDB format.

Preparing the ligand and receptor in Chimera

Put the 1HVR PDB file in the 00.files/ folder. If you are already in the 00.files/ directory, type the command:

 cp ~/Downloads/1HVR.pdb ./

Before the PDB file can be used, the original file needs some modifications. For example, we changed the residue name from "CSO" to "CYS" and deleted the "OD" and "HD" atom lines. When you finish the modifications, save the file as "1HVR.modified.pdb" in 00.files/. We will then create four files in the 01.dockprep/ directory.
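
The manual edits to the PDB file described above (renaming CSO to CYS and dropping the OD and HD lines) could also be scripted. The following is only a minimal sketch run on a toy fragment: the file names toy.pdb and toy.modified.pdb, the residue numbers, and the zeroed coordinates are placeholders, not data from 1HVR. It relies on the fixed-width PDB layout (atom name in columns 13-16, residue name in columns 18-20) and does not change the record type (HETATM vs ATOM), which the text above does not mention.

```shell
# Toy fragment standing in for 1HVR.pdb; residue numbers and coordinates are placeholders.
cat > toy.pdb <<'EOF'
HETATM    1  N   CSO A  10       0.000   0.000   0.000
HETATM    2  SG  CSO A  10       0.000   0.000   0.000
HETATM    3  OD  CSO A  10       0.000   0.000   0.000
HETATM    4  HD  CSO A  10       0.000   0.000   0.000
ATOM      5  N   ALA A  11       0.000   0.000   0.000
EOF

# PDB is fixed-width: atom name in columns 13-16, residue name in columns 18-20.
awk '
  substr($0, 18, 3) == "CSO" {
    name = substr($0, 13, 4); gsub(/ /, "", name)
    if (name == "OD" || name == "HD") next          # drop the OD and HD atom lines
    $0 = substr($0, 1, 17) "CYS" substr($0, 21)     # rename the residue to CYS
  }
  { print }
' toy.pdb > toy.modified.pdb

cat toy.modified.pdb
```

Working on columns rather than on a plain text substitution avoids touching lines where "OD" or "CSO" happens to appear in another field.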


III. Generating Receptor Surface and Spheres

Generating the Receptor Surface

Check that the 02.surface-spheres directory exists under dock-tutorial. If it does not, create it and move into it:

mkdir 02.surface-spheres
cd 02.surface-spheres

The following steps will be carried out to generate the receptor surface using Chimera:

1. Open Chimera by typing chimera into the terminal window.

2. Go to File -> Open and choose the PDB file of the protein containing no hydrogens (1HVR.receptor.noH.pdb) from 01.dockprep.

3. Go to Actions -> Surface -> Show.

4. Go to Tools -> Structure Editing -> Write DMS to obtain a dms file, which we will need in order to place spheres.

5. In the new window, save the surface as 1HVR.receptor.dms.

1HVR Receptor surface

Placing Spheres

We will use SPHGEN to generate spheres; see the online DOCK user manual for additional information.


The following steps will be used to place the spheres on the receptor surface:

1. Create a file called INSPH, fill it out as follows, and save it. This input file tells SPHGEN what to do; the details of each line are given below:


Input File Details:

 1HVR.receptor.dms - surface file from the previous step
 R - tells SPHGEN to place spheres outside of the surface (R); use L to place them inside the surface
 X - tells SPHGEN the subset of surface points to be used (X=all points)
 0.0 - prevents generation of large spheres with close surface contacts (default=0.0)
 4.0 - maximum sphere radius in angstroms (default=4.0)
 1.4 - minimum sphere radius in angstroms (default=radius of probe)
 1HVR.receptor.sph - clustered spheres file that we want to generate
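
Putting the line-by-line details above together, the INSPH file can be written from the terminal as sketched here. The values are exactly the ones listed above, one per line and in the same order; the sketch assumes SPHGEN expects no extra lines or comments in this file.

```shell
# Write INSPH with one setting per line, in the order described above.
cat > INSPH <<'EOF'
1HVR.receptor.dms
R
X
0.0
4.0
1.4
1HVR.receptor.sph
EOF

# Show the file to double-check its contents.
cat INSPH
```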

2. Run the sphgen program from the terminal:

sphgen -i INSPH -o OUTSPH
-i tells sphgen where the input file INSPH is
INSPH tells sphgen what to do
-o tells sphgen what to call the output file
OUTSPH is the output file containing the sphere information

3. (optional) To look at the spheres generated, you need to put them into PDB format.

Run showsphere by typing the following into the terminal:


You will be prompted with the following questions:

Enter name of sphere cluster file:
Enter cluster number to process (<0 = all):
Generate surfaces as well as pdb files (<N>/Y)?
Enter name for output file prefix:
Process cluster 0 (contains ALL spheres) (<N>/Y)? 

You can then open the receptor file in Chimera as well as the output_spheres.pdb file.

IV. Generating Box and Grid

Mosavverul Arkin

1.) Make a new directory and name it: 03.box-grid/

      mkdir 03.box-grid

2.) Make a new file in this directory and name it showbox.in

     vim showbox.in

3.) This will automatically open the file showbox.in. Edit the file showbox.in as follows:

   Y                                               #Yes, generate a box
   8.0                                             #Extra margin around the spheres, in Angstroms
   ../02.surface-spheres/selected_spheres.sph      #Sphere.sph file
   1                                               #Cluster number
   1HVR.box.pdb                                    #Name of the output file
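
As an alternative to editing the file in vim, showbox.in can be written directly from the terminal. The sketch below assumes that the # annotations above are explanatory only and that the file itself should contain just one answer per line; it also assumes the sphere directory is named 02.surface-spheres as in Section III. The invocation in the final comment is the usual way of feeding such a file to showbox, but it is not executed here because it requires a DOCK installation.

```shell
# Create the directory and write showbox.in with one answer per line.
mkdir -p 03.box-grid
cat > 03.box-grid/showbox.in <<'EOF'
Y
8.0
../02.surface-spheres/selected_spheres.sph
1
1HVR.box.pdb
EOF

cat 03.box-grid/showbox.in

# Assumed invocation, run from inside 03.box-grid with DOCK's showbox on the PATH:
#   showbox < showbox.in
```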

V. Docking a Single Molecule for Pose Reproduction

Jess Junjie Kai

Best scored ligand result
Worst scored ligand result

VI. Virtual Screening

Virtual Screening Introduction

Virtual screening is a method used to predict which ligands in a database bind most favorably to a target protein. It also allows both qualitative (e.g. position in binding site) and quantitative (e.g. grid scores, internal energy) data for each screened ligand to be compared with the originally docked molecule.

To perform virtual screening, we use HIVPR.ligands.005.mol2, a mol2 file containing 5 small molecules, as the virtual library. The computational cost of this quick search is modest, so we can run it on our own computer. Afterwards, we can conduct virtual screening against a larger database, HIVPR.ligands.100.mol2. Since the computational cost of that larger screen is much higher, it is better to run it on Seawulf.
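
Before launching a screen, it can be useful to confirm how many molecules a multi-molecule mol2 library actually contains. In the mol2 format each molecule starts with an @<TRIPOS>MOLECULE record, so counting those lines gives the library size. The sketch below uses a toy file (toy.ligands.mol2, a hypothetical name containing headers only) rather than the real HIVPR libraries.

```shell
# Toy stand-in for a multi-molecule mol2 library: three molecule headers, no atoms.
printf '@<TRIPOS>MOLECULE\nlig_%s\n' 1 2 3 > toy.ligands.mol2

# Count the molecules by counting the record headers.
grep -c '^@<TRIPOS>MOLECULE' toy.ligands.mol2
```

Run against HIVPR.ligands.005.mol2, the same grep command would be expected to report 5 if the file matches the description above.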

VII. Running DOCK in Parallel on Seawulf

Fengfei Lu

VIII. Frequently Encountered Problems













