Database Enrichment SB2024 V1 DOCK6.10 A
The purpose of this tutorial is to develop a uniform method to test ligand enrichment across the Rizzo lab with the DOCK software.
I.Introduction
Ligand enrichment is a common experiment used to evaluate how accurately a docking program reproduces in vitro results. The experiment uses active ligands and decoy ligands to assess a docking program's ability to dock successfully to a target site. The active and decoy ligands are roughly the same size but differ in chemical structure. If the docking software works properly, the active ligands should bind more favorably (have a lower energy score) than the decoy ligands.
This experiment has 3 major outcomes. The first is early enrichment, indicating that the active ligands score better than the decoys (the goal for all docking programs). The second is random enrichment, indicating that the docking program cannot differentiate between actives and decoys. The third is late enrichment, indicating that the docking software gives the lowest energy scores to the decoys, which is the worst outcome. The other factor to consider is the degree of early or late enrichment.
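The three outcomes above can be summarized with a single number, the area under a ROC curve (AUC). The following is a minimal sketch, not the lab's analysis script: the function names `roc_points` and `auc` and the example scores are hypothetical, and it assumes each ligand has one score where lower is better. An AUC near 1.0 corresponds to early enrichment, near 0.5 to random enrichment, and near 0.0 to late enrichment.

```python
def roc_points(active_scores, decoy_scores):
    """Return ROC points (false positive rate, true positive rate).

    Ligands are ranked from the lowest (best) score to the highest.
    Tied scores are not treated specially in this sketch.
    """
    ranked = sorted(
        [(s, True) for s in active_scores] + [(s, False) for s in decoy_scores],
        key=lambda pair: pair[0],
    )
    n_act, n_dec = len(active_scores), len(decoy_scores)
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, is_active in ranked:
        if is_active:
            tp += 1  # an active was recovered at this rank
        else:
            fp += 1  # a decoy was ranked here instead
        points.append((fp / n_dec, tp / n_act))
    return points


def auc(points):
    """Trapezoidal area under the ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical scores: actives mostly, but not always, outscore the decoys.
print(auc(roc_points([-50.0, -30.0], [-40.0, -20.0])))  # 0.75
```

The degree of early or late enrichment can then be read from how far the AUC lies from 0.5.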
II.Prepping systems
-The first step is to create a directory for the system you are preparing
mkdir 1Q4X
-The next step is to obtain the active and decoy ligand test sets, which can be found on the Shoichet DUD-E test set website http://dude.docking.org/targets
-Once these files are obtained, decompress them using gzip -d actives_final.mol2.gz for the actives and gzip -d decoys_final.mol2.gz for the decoys
-Prepare the target receptor either by using the official test set files or by manually preparing a receptor target from scratch
After following all these steps, your directory should look like the following, using the 1Q4X system
actives_final.mol2 decoys_final.mol2 1Q4X.rec.clean.mol2
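Before docking, it can help to confirm that both ligand files decompressed correctly and to see how many molecules each contains. The sketch below is an optional alternative to the gzip -d commands, with hypothetical helper names `decompress` and `count_molecules`. Unlike gzip -d, it keeps the original .gz file. It counts molecules by the `@<TRIPOS>MOLECULE` record header that starts each entry in a mol2 file.

```python
import gzip
import shutil
from pathlib import Path


def decompress(gz_path):
    """Write a decompressed copy next to the .gz file and return its path."""
    gz_path = Path(gz_path)
    out_path = gz_path.with_suffix("")  # actives_final.mol2.gz -> actives_final.mol2
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path


def count_molecules(mol2_path):
    """Count molecule records in a (multi-)mol2 file."""
    with open(mol2_path) as handle:
        return sum(1 for line in handle if line.startswith("@<TRIPOS>MOLECULE"))


# Example usage from inside the system directory (for example 1Q4X):
# for name in ("actives_final.mol2.gz", "decoys_final.mol2.gz"):
#     mol2 = decompress(name)
#     print(mol2, count_molecules(mol2))
```

A large difference between the active and decoy counts is expected, since test sets usually contain many more decoys than actives.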
III.Docking molecules
-After completing the preparation steps, conduct a virtual screen using MPI.
-The input parameters are as follows
-Then submit the script with qsub to dock the molecules in parallel. Some of the active and decoy ligand test sets are quite large, so MPI submission is recommended.
IV.Ligand Enrichment Analysis
-Lastly, 2 scripts were developed to analyze the results: one script generates a CSV file, and a second script uses the CSV data to create a graph.
-The script that generates the CSV file takes three parameters: the list of systems, the name of the decoy ligand mol2 file, and the name of the active ligand mol2 file. (NOTE: This script can generate multiple CSV files for different ligand experiments, but the active and decoy mol2 files must be named the same in every experiment.) Example: python roc_curve_lig_enrichment_v2.py 1Q4X.txt decoys_final.mol2 actives_final.mol2
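To illustrate the kind of data a CSV step might hand to a plotting step, here is a minimal sketch. It is not the code or the output format of roc_curve_lig_enrichment_v2.py: the function name `write_enrichment_csv` and the column names are hypothetical, and it assumes each docked ligand has already been reduced to a (score, is_active) pair with lower scores being better and at least one active present.

```python
import csv
import io


def write_enrichment_csv(handle, labeled_scores):
    """Write one row per ranked ligand to an open text handle.

    labeled_scores: iterable of (score, is_active) pairs.
    Columns (hypothetical layout): rank, fraction of the database
    screened so far, fraction of all actives found so far.
    """
    ranked = sorted(labeled_scores, key=lambda pair: pair[0])
    total = len(ranked)
    total_actives = sum(1 for _, is_active in ranked if is_active)
    writer = csv.writer(handle)
    writer.writerow(["rank", "fraction_screened", "fraction_actives_found"])
    found = 0
    for rank, (_, is_active) in enumerate(ranked, start=1):
        if is_active:
            found += 1
        writer.writerow([rank, rank / total, found / total_actives])


# Hypothetical scores for two actives and two decoys.
buffer = io.StringIO()
write_enrichment_csv(
    buffer,
    [(-50.0, True), (-40.0, False), (-30.0, True), (-20.0, False)],
)
print(buffer.getvalue())
```

Plotting the third column against the second gives an enrichment curve. A curve that rises steeply at low fractions screened indicates early enrichment.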