Difference between revisions of "DOCK6 POSE Reproduction"

Latest revision as of 11:16, 16 August 2020

This tutorial explains how to perform pose reproduction, a commonly used test that evaluates DOCK6's ability to reproduce known experimental results.

I. Introduction

DOCK6 is a commonly used docking protocol that assesses the affinity of a ligand (a drug candidate) for a target site (protein, enzyme, or RNA). To evaluate a software's ability to accurately reproduce experimental results, an experiment called pose reproduction is used.

Pose reproduction takes an experimentally known ligand and protein complex from the PDB database and attempts to dock the ligand back into its original location. If the lowest energy ligand pose (the most energetically favorable) is within 2.0 Å RMSDh of the original ligand position, this is referred to as a docking success. If any of the ligand poses, but not the lowest energy one, is within 2.0 Å RMSDh of the original position, this is referred to as a scoring failure. If none of the ligand poses are within 2.0 Å RMSDh of the original position, this is referred to as a sampling failure.
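The three outcomes defined above can be expressed as a small decision rule. The following is a minimal sketch, not the lab's actual script: the function name, the (score, RMSDh) tuple layout, and treating "within 2.0" as less than or equal to the cutoff are all assumptions made for illustration.

```python
from typing import Sequence, Tuple

# Cutoff from the tutorial text (assumed to be in Angstroms).
RMSD_CUTOFF = 2.0


def classify_outcome(
    poses: Sequence[Tuple[float, float]], cutoff: float = RMSD_CUTOFF
) -> str:
    """Classify one system from its docked poses.

    Each pose is a hypothetical (score, rmsdh) pair, where a lower score
    means a more energetically favorable pose.
    """
    if not poses:
        raise ValueError("at least one docked pose is required")

    # The lowest energy pose decides between success and scoring failure.
    _, best_rmsd = min(poses, key=lambda pose: pose[0])
    if best_rmsd <= cutoff:
        return "success"

    # Some other pose was close enough, but it was not ranked first.
    if any(rmsd <= cutoff for _, rmsd in poses):
        return "scoring_failure"

    # No pose came close to the original ligand position.
    return "sampling_failure"
```

For example, a system whose best-scoring pose is 4.0 Å away while a worse-scoring pose is 1.5 Å away would be classified as a scoring failure.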

Pose Reproduction Intro.JPG

II. Preparing DOCK directories

The first step is to prepare a DOCK6 directory for each PDB system. The test set used in this docking protocol was the SB2012 test set with 1043 systems. The test set should be organized as shown below, with a separate directory for each PDB system.

PDB Directories

 121P/
 181L/
 182L/
 183L/
 184L/
 etc.

Each PDB system directory then needs five files to function properly: the original ligand file, the original receptor file, the generated spheres file, and the two grid files. These files are named as shown below.

Files in 121P directory

 121P.lig.am1bcc.mol2
 121P.rec.clean.mol2
 121P.sphere.mol2
 121P.grid.nrg
 121P.grid.bmp
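Before docking, it can help to confirm that every system directory contains the five files listed above. The sketch below assumes the naming pattern shown for 121P applies to every system; the function names and the idea of passing the test set root directory are illustrative and not part of the lab's scripts.

```python
from pathlib import Path
from typing import Dict, List

# File suffixes taken from the 121P example above.
REQUIRED_SUFFIXES = [
    "lig.am1bcc.mol2",
    "rec.clean.mol2",
    "sphere.mol2",
    "grid.nrg",
    "grid.bmp",
]


def missing_files(system_dir: Path) -> List[str]:
    """Return the expected file names that are absent from one PDB directory."""
    pdb_id = system_dir.name
    expected = [f"{pdb_id}.{suffix}" for suffix in REQUIRED_SUFFIXES]
    return [name for name in expected if not (system_dir / name).is_file()]


def check_testset(root: Path, systems: List[str]) -> Dict[str, List[str]]:
    """Map each incomplete system to its missing files; complete systems are omitted."""
    report = {}
    for pdb_id in systems:
        missing = missing_files(root / pdb_id)
        if missing:
            report[pdb_id] = missing
    return report
```

An empty dictionary returned by `check_testset` means every listed system has all five files.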

These files are used to perform the flexible docking of the ligand into the receptor in the following step. Prepared scripts are provided for the remaining steps.

III. Performing Pose Reproduction

Next, a script named FLX.sh is used to generate a new directory within the directory, with any name you prefer. If you need to modify the script, it is recommended that you make your own copy of it first.

 bash FLX.sh name_of_run List_of_PDB_Files
 bash FLX.sh FLX clean.systems.all

This script both creates a DOCK parameter file and docks the ligand into the receptor target site. It is recommended to submit this script to Slurm.

NOTE: The script that runs DOCK6 with simplex minimization is located at /gpfs/projects/rizzo/slaverty/dock_pose_4_15_20_copy/FLX_pak.sh

IV. Generating a CSV and DAT file of Results

Next, a CSV file is generated containing the results for each of the scores, which can be analyzed further. Three dat files are also generated, listing which systems are successes, scoring failures, and sampling failures.

Run the script from the directory that contains the links to all the PDB directories.

NOTE: The key difference between these scripts is where the receptors are read from when assigning ions. This script reads from yuchzhuo: /gpfs/projects/rizzo/slaverty/dock_pose_4_15_20_copy/bickel_laverty_calculate_dock6_results_yuchzhuo.py

python /gpfs/projects/rizzo/yuchzhou/RCR/DOCK_testset/bickel_laverty_calculate_dock6_results.py FLX /gpfs/projects/rizzo/yuchzhou/RCR/DOCK_testset/clean.systems.all FOR_Tutorial

This creates FOR_Tutorial.csv, which covers all of the systems.
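The three dat files described in this section could be produced along the following lines. This is only a sketch of the idea: the output file names, the one-ID-per-line layout, and the outcome labels are assumptions, and the actual results script may format its output differently.

```python
from pathlib import Path
from typing import Dict, List

# Hypothetical outcome labels; the real script may use different names.
OUTCOMES = ("success", "scoring_failure", "sampling_failure")


def group_by_outcome(outcomes: Dict[str, str]) -> Dict[str, List[str]]:
    """Group PDB IDs by outcome, keeping the IDs in sorted order."""
    groups: Dict[str, List[str]] = {label: [] for label in OUTCOMES}
    for pdb_id, outcome in sorted(outcomes.items()):
        groups[outcome].append(pdb_id)
    return groups


def write_dat_files(
    groups: Dict[str, List[str]], run_name: str, out_dir: Path
) -> List[Path]:
    """Write one dat file per outcome, with one PDB ID per line."""
    paths = []
    for outcome, pdb_ids in groups.items():
        # Assumed naming scheme: <run_name>_<outcome>.dat
        path = out_dir / f"{run_name}_{outcome}.dat"
        path.write_text("".join(f"{pdb_id}\n" for pdb_id in pdb_ids))
        paths.append(path)
    return paths
```

Keeping one ID per line makes each dat file directly reusable as a system list for a follow-up run.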

V. Generating graphs of the results

These scripts analyze the CSV files generated in the previous step. Three different graphing options are available, each created using subplots, for analyzing the docking experiment.

NOTE: The analysis scripts are now up to v6. The latest analysis script is /gpfs/projects/rizzo/yuchzhou/RCR/DOCK_testset/DOCK6_Pose_Reproduction_analysis_v6.py

python DOCK6_Pose_Reproduction_analysis_v3.py CSV_File 1

This produces a pie chart of the success rate and prints the average time across all the docking experiments.
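The numbers behind the pie chart and the average time can be computed as shown below. This is a sketch under stated assumptions: the (outcome, seconds) row layout and the function name are hypothetical, since the column layout of the generated CSV is not described here, and plotting is left out.

```python
from collections import Counter
from typing import Dict, Sequence, Tuple


def summarize(rows: Sequence[Tuple[str, float]]) -> Tuple[Dict[str, float], float]:
    """Return the fraction of systems per outcome and the mean docking time.

    Each row is a hypothetical (outcome, seconds) pair for one system.
    """
    if not rows:
        raise ValueError("no results to summarize")

    total = len(rows)
    counts = Counter(outcome for outcome, _ in rows)

    # Fractions sum to 1.0 and correspond to the pie chart slices.
    fractions = {outcome: count / total for outcome, count in counts.items()}
    mean_time = sum(seconds for _, seconds in rows) / total
    return fractions, mean_time
```

The returned fractions could be passed to any plotting library to draw the pie chart.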

python DOCK6_Pose_Reproduction_analysis_v3.py CSV_File 2

FOR Tutorial 2.png

python DOCK6_Pose_Reproduction_analysis_v3.py CSV_File 3

FOR Tutorial 3.png