Difference between revisions of "2013 DOCK tutorial with Orotodine Monophosphate Decarboxylase"

From Rizzo_Lab

Revision as of 15:54, 4 March 2013

For additional Rizzo Lab tutorials see DOCK Tutorials. Use this link Wiki Formatting as a reference for editing the wiki. This tutorial was developed collaboratively by the AMS 536 class of 2013, using DOCK v6.6.

I. Introduction

DOCK

DOCK is a molecular docking program used in drug discovery. It was developed by Irwin D. Kuntz, Jr. and colleagues at UCSF (see UCSF DOCK). This program, given a protein binding site and a small molecule, tries to predict the correct binding mode of the small molecule in the binding site, and the associated binding energy. Small molecules with highly favorable binding energies could be new drug leads. This makes DOCK a valuable drug discovery tool. DOCK is typically used to screen massive libraries of millions of compounds against a protein to isolate potential drug leads. These leads are then further studied, and could eventually result in a new, marketable drug. DOCK works well as a screening procedure for generating leads, but is not currently as useful for optimization of those leads.

DOCK 6 uses an incremental construction algorithm called anchor and grow. It is described by a three-step process:

  1. Rigid portion of ligand (anchor) is docked by geometric methods.
  2. Non-rigid segments added in layers; energy minimized.
  3. The resulting configurations are 'pruned' and energy re-minimized, yielding the docked configurations.


Orotidine Monophosphate Decarboxylase

The protein receptor that is the subject of this tutorial is orotidine monophosphate decarboxylase (OMP decarboxylase), a homodimeric protein from the organism Methanobacterium thermoautotrophicum. OMP decarboxylase is involved in pyrimidine biosynthesis, where it converts orotidine monophosphate (OMP) to uridine monophosphate (UMP), the ligand used in this tutorial. UMP is a pyrimidine-based nucleotide monomer of RNA. The structure used for this tutorial can be found at the Protein Data Bank under accession number 1LOQ.


Organizing Directories

While performing docking, it is convenient to adopt a standard directory structure / naming scheme, so that files are easy to find / identify. For this tutorial, we will use something similar to the following:

~username/AMS536/dock-tutorial/00.files/
                              /01.dockprep/
                              /02.surface-sphgen/
                              /03.box-grid/
                              /04.dock/
                              /05.mini-virtual-screen/
                              /06.database-filter/
                              /07.virtual-screen/
                             

In addition, most of the important files that are derived from the original crystal structure will be given a prefix that is the same as the PDB code, '1LOQ'. The following sections in this tutorial will adhere to this directory structure / naming scheme.
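
As a rough sketch, the directory layout above could be created in one step from the shell. The helper name make_dock_dirs and its argument are assumptions made for illustration; adjust the folder names to whatever scheme you actually adopt.

```shell
# Sketch: create the tutorial directory layout under a base directory.
# make_dock_dirs is a hypothetical helper; $1 is the base directory,
# for example ~/AMS536/dock-tutorial.
make_dock_dirs() {
    base=$1
    for d in 00.files 01.dockprep 02.surface-sphgen 03.box-grid \
             04.dock 05.mini-virtual-screen 06.database-filter \
             07.virtual-screen
    do
        # -p creates missing parent directories and ignores existing ones
        mkdir -p "$base/$d"
    done
}
```

For example, make_dock_dirs ~/AMS536/dock-tutorial would build the whole tree, and running it a second time is harmless.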

II. Preparing the Receptor and Ligand

(Ye and Weiliang)

Downloading the PDB file (1LOQ): go to the PDB homepage (http://www.rcsb.org/pdb/home/home.do), enter the protein ID (1LOQ) in the search box, click Download Files in the top-right of the webpage, then select PDB File (text). In the new window, save the file in Downloads.

Generating files for ligand and receptor:


In this section, we will create four new files in the 00-original-files folder:

1LOQ.dockprep.mol2 - prepared receptor-ligand complex, with hydrogens and charges (AM1-BCC partial charges on the ligand).

1LOQ.receptor.mol2 - receptor molecule, with hydrogens and amber charges.

1LOQ.receptor.noH.mol2 - receptor without hydrogen atoms.

1LOQ.ligand.mol2 - ligand only.


First, copy the PDB file into the 00-original-files folder, then open it at the prompt with "vim 1LOQ.pdb". In the PDB file, change the residue name of every ligand atom from "U" to "LIG", starting at line 2082, so that the ligand can be read by Chimera. The following vim command performs the substitution:

 :%s/  U/LIG/gc

The g flag (global) replaces every match on a line, and the c flag (confirm) asks for confirmation before each change is made.
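
For a non-interactive alternative, the same renaming could be sketched with awk. This is only an illustration under stated assumptions: the helper name rename_ligand is hypothetical, the ligand atoms are assumed to be on ATOM or HETATM records, and the residue name is taken from columns 18-20 as defined by the PDB format. Unlike the vim c flag, nothing is confirmed interactively, so write to a new file and compare it with the original.

```shell
# Sketch: rename residue "  U" (PDB columns 18-20) to "LIG".
# rename_ligand is a hypothetical helper; it prints the edited file to
# standard output and leaves the input file untouched.
rename_ligand() {
    awk '/^(ATOM  |HETATM)/ && substr($0, 18, 3) == "  U" {
             # Rebuild the record with the new residue name in place
             $0 = substr($0, 1, 17) "LIG" substr($0, 21)
         }
         { print }' "$1"
}
```

A possible use would be rename_ligand 1LOQ.pdb > 1LOQ.lig.pdb followed by diff 1LOQ.pdb 1LOQ.lig.pdb to review the changed lines; the output filename here is an assumption.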

The preparation is carried out in Chimera. If you are on "herbie", open Chimera by typing chimera at the prompt. Click Open in the File menu and find the file "1LOQ.pdb".

To delete water molecules and other ligands, click Structure Editing in the Tools menu and click Dock Prep. Check all boxes and click OK through the remaining dialogs.

Alternatively, go to Select -> Residue -> HOH, then go to Actions -> Atoms/Bonds -> delete to remove the water from the original molecule.

Then add hydrogens to the molecule: Tools -> Structure Editing -> AddH.

Next, to add charges to the ligand, go to Select -> Residue -> LIG, then go to Tools -> Structure Editing -> Add Charge. Choose "AMBER ff99SB" as the charge model, select AM1-BCC as the charge method, and assign residue LIG a net charge of -1. Save the prepared structure as 1LOQ.dockprep.mol2, since the steps below start from this file.


To create a receptor file: Open 1LOQ.dockprep.mol2, click Select -> Residue -> LIG. Then click Actions -> Atoms/Bonds -> delete. Save the file as 1LOQ.receptor.mol2.


To create a receptor file with no hydrogen atoms: Open 1LOQ.dockprep.mol2, click Select -> Chemistry -> element -> H. Then click Actions -> Atoms/Bonds -> delete. Save the file as 1LOQ.receptor.noH.mol2.


To create a ligand file: Open 1LOQ.dockprep.mol2, use Select -> Chain to select the receptor. Then click Actions -> Atoms/Bonds -> delete. Save the file as 1LOQ.ligand.mol2.


III. Generating Receptor Surface and Spheres

Nick and Artem

IV. Generating Box and Grid

mkdir 03.box-grid
cd 03.box-grid

V. Docking a Single Molecule for Pose Reproduction

Jiaye, He, and Natalie


VI. Virtual Screening

Brian and Koushik


VII. Running DOCK in Serial and in Parallel on Seawulf

The Seawulf Cluster is a 470-processor Linux cluster capable of highly parallel processing. This parallel processing allows DOCK virtual screens to be completed in a fraction of the time required on a single processor.

If you are docking multiple ligands, you can use more than one processor in parallel mode, but you should never use more processors than you have ligands.

Before we can run DOCK on Seawulf, we need to copy the proper files from Herbie to Seawulf. After changing (cd) into the AMS536 folder, the following command, run from the mathlab computer, copies all of the dock-tutorial files. The destination must be your account on the machine you are copying to, so substitute the Seawulf login address if it differs from the host shown here:

scp -r dock-tutorial username@herbie.mathlab.sunysb.edu:~/AMS536/

Now we have all of our DOCK preparation files and folders on the Seawulf cluster.

Running DOCK in Serial on a Single Processor

Running on a single processor is very similar to running DOCK on the mathlab computer.

Make a file called qsub.csh with the following text:

#!/bin/tcsh                 
#PBS -l nodes=1:ppn=1        
#PBS -l walltime=01:00:00   
#PBS -N dock6           
#PBS -M user@ic.sunysb.edu 
#PBS -j oe                   
#PBS -o pbs.out            

cd /nfs/user03/zfoda/AMS536/dock-tutorial/07.virtscreen
/nfs/user03/wjallen/local/dock6/bin/dock6 -i dock.in -o dock.out

An explanation of the commands:

#!/bin/tcsh                                                 #Execute script with tcsh
#PBS -l nodes=1:ppn=1                                       #Use one node and one processor per node, so a single processor
#PBS -l walltime=01:00:00                                   #Allow 1 hour for your job to run
#PBS -N dock6                                               #Name of your job
#PBS -M user@ic.sunysb.edu                                  #Email address for job notifications (add "#PBS -m e" to be emailed when the job ends)
#PBS -j oe                                                  #Combine the output and error streams into a single output file
#PBS -o pbs.out                                             #Name of your output file

cd /nfs/user03/zfoda/AMS536/dock-tutorial/07.virtscreen       #Change to the folder containing your DOCK files
/nfs/user03/wjallen/local/dock6/bin/dock6 -i dock.in -o dock.out #Specifies the path to the DOCK executable and provides the input and output filenames


To submit the experiment use the command:

qsub qsub.csh

This submits the DOCK experiment to the Seawulf queue.

See also PBS commands.
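
Because the working directory in qsub.csh has to point at your own folder, it can help to generate the script rather than edit it by hand. The following is a minimal sketch; write_qsub is a hypothetical helper (not part of DOCK or PBS), and the DOCK executable path is simply the one used earlier in this tutorial.

```shell
# Sketch: write the serial PBS script for a given working directory.
# write_qsub is a hypothetical helper: $1 is the folder holding dock.in,
# $2 is the script to write (defaults to qsub.csh).
write_qsub() {
    workdir=$1
    outfile=${2:-qsub.csh}
    # Unquoted heredoc so that $workdir is expanded into the script
    cat > "$outfile" <<EOF
#!/bin/tcsh
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00
#PBS -N dock6
#PBS -M user@ic.sunysb.edu
#PBS -j oe
#PBS -o pbs.out

cd $workdir
/nfs/user03/wjallen/local/dock6/bin/dock6 -i dock.in -o dock.out
EOF
}
```

For instance, write_qsub /nfs/user03/username/AMS536/dock-tutorial/07.virtscreen would write qsub.csh in the current directory, ready to be submitted with qsub qsub.csh.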

Running DOCK in Parallel using MPI

In order to run DOCK in parallel you have to use a slightly different build of DOCK6 called dock6.mpi. The Message Passing Interface (MPI) is a standard, implemented by libraries such as MPICH and Open MPI, that lets the processes of a program like DOCK communicate with each other while running in parallel.

So, make another file called qsub.vs.csh with the contents:

#!/bin/tcsh
#PBS -l nodes=4:ppn=2 
#PBS -l walltime=24:00:00
#PBS -N screen
#PBS -o qsub.log
#PBS -j oe
#PBS -V

cd /nfs/user03/username/AMS536/dock-tutorial/07.virtscreen
mpirun -np 8 /nfs/user03/wjallen/local/dock6/bin/dock6.mpi -i dockvs.in -o dockvs.out

There are two major changes from the serial script. The first is line 2:

#PBS -l nodes=4:ppn=2 #Use 4 nodes, and 2 processors per node, so 8 processors 

Note: since one processor is used to distribute the processes, this will run DOCK as 7 (n-1) parallel processes.

The second is the last line:

mpirun -np 8 /nfs/user03/wjallen/local/dock6/bin/dock6.mpi -i dockvs.in -o dockvs.out #This line uses MPI to run dock6.mpi on multiple processors; the value given to -np should equal nodes x ppn

And then we can run:

 qsub qsub.vs.csh
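
The processor arithmetic above, together with the earlier rule of never using more processors than ligands, can be checked before submitting. The following is a small sketch, not part of DOCK: count_ligands and check_np are hypothetical helpers, and the ligand count relies on each molecule in a multi-molecule MOL2 file starting with an @<TRIPOS>MOLECULE record.

```shell
# Sketch: compare the MPI process count with the number of ligands.
# count_ligands and check_np are hypothetical helpers for illustration.

# Count molecules in a multi-molecule MOL2 file (one MOLECULE record each).
count_ligands() {
    grep -c '@<TRIPOS>MOLECULE' "$1"
}

# Print the total and worker process counts for nodes x ppn, and return
# non-zero with a warning if the total exceeds the ligand count.
check_np() {
    nodes=$1; ppn=$2; ligands=$3
    np=$((nodes * ppn))
    workers=$((np - 1))      # one process distributes the work
    echo "np=$np workers=$workers ligands=$ligands"
    if [ "$np" -gt "$ligands" ]; then
        echo "warning: more processors than ligands" >&2
        return 1
    fi
}
```

With the settings used here, check_np 4 2 "$(count_ligands ligands.mol2)" would report np=8 and workers=7; the filename ligands.mol2 is a placeholder for whatever ligand file your dockvs.in refers to.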

VIII. Frequently Encountered Problems

Artem

Brian

He

Jiahui

Jiaye

Koushik

Natalie

Nikolay

Weiliang

Ye

Yuan

Zach