Difference between revisions of "Database Enrichment SB2024 V1 DOCK6.10 A"
Revision as of 19:25, 24 January 2024
The purpose of this tutorial is to develop a uniform method to test ligand enrichment across the Rizzo lab with the DOCK software.
I.Introduction
Ligand enrichment is an experiment used to evaluate how well a docking program ranks experimentally known binders (termed actives) above decoy molecules for a given target. The active and decoy ligands are ideally property matched, meaning each active has decoys with similar physicochemical properties. If the docking program accurately models the interactions between the binding site and the ligands, the actives should bind more favorably (have a lower energy score) than the decoys.
The three possible outcomes of this experiment are early enrichment, random enrichment, and late enrichment. Early enrichment means the actives rank ahead of the decoys, which is the goal for all docking programs. Random enrichment means the docking program cannot differentiate between actives and decoys. Late enrichment means the docking software gives the lowest energy scores to the decoys, which is the worst outcome.
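To make the three outcomes concrete, the sketch below ranks a small list of molecules by score and reports the fraction of actives found among the top N. It is only an illustration, not the lab's analysis script: the function name `actives_in_top`, the two-column "label score" input format, and the toy scores are all assumptions made for this example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of DOCK or the Rizzo lab scripts).
# Reads "<label> <score>" lines on stdin, where label is "active" or "decoy"
# and a lower score is better, then prints the fraction of all actives that
# appear among the top N ranked molecules.
actives_in_top() {
    local top_n="$1"
    LC_ALL=C sort -k2,2n | LC_ALL=C awk -v n="$top_n" '
        $1 == "active" { total++; if (NR <= n) found++ }
        END {
            if (total > 0) printf "%.2f\n", found / total
            else print "0.00"
        }'
}

# Toy data, made up for illustration (not real docking output).
printf '%s\n' \
    "active -62.1" \
    "decoy -40.3" \
    "active -55.0" \
    "decoy -58.7" \
    "decoy -35.2" \
    "active -30.9" | actives_in_top 3
```

In this toy list two of the three actives land in the top three, so the script prints 0.67. With early enrichment this fraction rises quickly as N grows, with random enrichment it stays close to N divided by the total number of molecules, and with late enrichment it stays near zero until the end of the list.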
II.Prepping systems
-The first step is to create a directory for the test set and move into it.
mkdir testset
cd testset
-Create a subdirectory for each system you will run.
mkdir 1Q4X
- Then obtain the active and decoy ligands from the Shoichet lab's DUD-E test set website, http://dude.docking.org/targets. Once the files for a target are downloaded, move them into the appropriate subdirectory and decompress them with the gzip command.
cd 1Q4X
gzip -d actives_final.mol2.gz
gzip -d decoys_final.mol2.gz
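When several systems are prepared at once, the directory and decompression steps can be wrapped in a small function. This is a minimal sketch: `prep_system` is a hypothetical name, and it assumes the DUD-E `.gz` files have already been downloaded into each system subdirectory.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create <testset>/<system> and decompress the DUD-E
# ligand files there. Assumes the .gz files were already moved into place.
prep_system() {
    local testset="$1" system="$2" name gz
    mkdir -p "$testset/$system"
    for name in actives_final decoys_final; do
        gz="$testset/$system/$name.mol2.gz"
        if [ -f "$gz" ]; then
            gzip -d "$gz"    # replaces the .gz with $name.mol2
        fi
    done
}

# Example usage (system names other than 1Q4X are placeholders):
# for system in 1Q4X 1BCD 1SJ0; do prep_system "$HOME/testset" "$system"; done
```

Files that are missing or already decompressed are skipped, so the function can be rerun safely.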
-Prepare the target receptor, either by using the official SB2023 test set files (to be published) or by preparing the receptor associated with the PDB entry using run000 to run004 in https://github.com/rizzolab/Testset_Protocols, and move the relevant files into the directory ~/testset/1Q4X
Following all these steps you should have a separate subdirectory for each system with the following files:
actives_final.mol2
decoys_final.mol2
1Q4X.rec.clean.mol2
1Q4X.rec.clust.close.sph
1Q4X.rec.nrg
1Q4X.rec.bmp
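Before docking, it can help to confirm that every system subdirectory contains the six files listed above. The sketch below does that check. `check_system` is a hypothetical name, and the receptor file names are assumed to follow the `<system>.rec.*` pattern shown for 1Q4X.

```shell
#!/usr/bin/env bash
# Hypothetical helper: list any expected file missing from <testset>/<system>.
# Returns 0 if all six files are present, 1 otherwise.
check_system() {
    local testset="$1" system="$2" missing=0 f
    for f in actives_final.mol2 decoys_final.mol2 \
             "$system.rec.clean.mol2" "$system.rec.clust.close.sph" \
             "$system.rec.nrg" "$system.rec.bmp"; do
        if [ ! -f "$testset/$system/$f" ]; then
            echo "missing: $testset/$system/$f"
            missing=1
        fi
    done
    return "$missing"
}

# Example usage:
# check_system "$HOME/testset" 1Q4X && echo "1Q4X is ready for docking"
```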
III.Docking molecules
-Now that the files are ready for docking, a virtual screen will be conducted separately for the active ligands and the decoy ligands.
-Pull Database Enrichment scripts from https://github.com/rizzolab/Benchmarking_and_Validation
- 001.submit.sh has #SBATCH header for submitting to an HPC, such as seawulf. If not using an HPC, delete #SBATCH lines.
- Enter required parameters in script
testset=" Path to folder with all system subdirectories"
system_file=" List of systems to run" (e.g. 1Q4X 1BCD 1SJ0 ...)
dock=" Path to dock uppermost folder"
mpi="Yes / No" (whether to run in parallel)
processes=" Number of processes" (only set if mpi = Yes)
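As an illustration, a filled-in parameter block might look like the sketch below. Every value is a placeholder chosen for this example: the paths, the `systems.txt` file name, and the process count are assumptions, and whether `system_file` expects a file path or an inline list should be checked against the script itself.

```shell
# Hypothetical values only; replace each one with your own setup.
testset="$HOME/testset"                  # folder holding 1Q4X/, 1BCD/, ...
system_file="$HOME/testset/systems.txt"  # assumed to be a file listing the systems
dock="$HOME/dock6"                       # placeholder path to the DOCK top-level folder
mpi="Yes"                                # run in parallel
processes="16"                           # only used when mpi="Yes"
```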
- Submit the script with sbatch 001.submit.sh on an HPC, or run it directly with bash 001.submit.sh
IV.Ligand Enrichment Analysis