Cross Docking SB2024 V1 DOCK6.10 A

!!!!!!Under Construction!!!!!!

The purpose of this tutorial is to establish a uniform method for testing cross docking with the DOCK software across the Rizzo lab. Note: any data shown in this tutorial is for illustration only.

I.Introduction

- Cross docking is fundamentally similar to pose reproduction. If you have not yet run pose reproduction, begin with:

    https://ringo.ams.stonybrook.edu/index.php/Pose_Reproduction_Tutorial

- Cross docking measures pose reproduction accuracy with differing protein conformations/structures as an additional variable. It translates better to "real world" virtual screening because it tests the ability to identify native poses even when the protein conformation and sidechain packing were not induced by the particular ligand being docked. In virtual screening with a rigid receptor, the chosen conformation will not be ideal for all binder chemotypes, but it is still desirable to predict near-native poses.

- The outcomes for cross docking are the same as for pose reproduction, plus a fourth outcome termed "incompatible". This occurs when a ligand in its native pose is energetically incompatible with an alternative, rigid structure of the protein.

II.Necessary files

Scripts to run Cross Docking in batch mode are found at:

    https://github.com/rizzolab/Benchmarking_and_Validation

This tutorial rebuilds systems in an aligned frame. Standard test set files are therefore not applicable, because structures within a protein family will not necessarily be aligned along the protein backbone. Instead of directly using a finalized, prepared test set as Pose Reproduction does, this workflow prepares files from the initial preparatory files. The necessary files are:

    ${pdb_id}.lig.moe.mol2
    ${pdb_id}.cof.moe.mol2 (if applicable)
    ${pdb_id}.rec.foramber.pdb

For further explanation of these 3 files see section "Preliminary File Preparation":

   https://ringo.ams.stonybrook.edu/index.php/Test_Set_Tutorial_V1
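Before starting, it can help to confirm that the required files exist for every structure. The following is a minimal sketch, not part of the repository scripts. The function name check_inputs and the assumption that all files for one system sit in a single directory are hypothetical; adjust them to your own layout.

```bash
# Hypothetical helper: report which required input files are missing for one PDB ID.
# Assumes (not stated in the tutorial) that all files for a system sit in one directory.
check_inputs() {
    local dir="$1" pdb_id="$2" missing=0 f

    # The ligand and receptor files are always required and must be non-empty.
    for f in "${pdb_id}.lig.moe.mol2" "${pdb_id}.rec.foramber.pdb"; do
        if [ ! -s "${dir}/${f}" ]; then
            echo "MISSING: ${dir}/${f}"
            missing=1
        fi
    done

    # The cofactor file is optional, so only note its presence.
    if [ -s "${dir}/${pdb_id}.cof.moe.mol2" ]; then
        echo "NOTE: ${pdb_id} has a cofactor file"
    fi

    return "$missing"
}
```

For example, `check_inputs /path/to/files 1EVE` returns a non-zero status and prints the missing file names if either required file is absent.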

III.Preparing protein families

Enter CrossDocking Directory:

    cd Benchmarking_and_Validation/CrossDocking/

A protein family is a set of structures of the same protein under different conditions. This tutorial does not cover how protein families are determined, although one option is to restrict a family to structures that share a single "UniProt ID", have no differing mutations within the active site, and have co-crystal ligands occupying the same active site. Lists of protein families can be found at the link below (see the corresponding paper for how each set of families was determined):

    https://ringo.ams.stonybrook.edu/index.php/Rizzo_Lab_Downloads

A list of PDB codes for each protein family needs to be provided:

    cd zzz.family_lists

For each protein family, create a file with the name of the protein family, and list all PDB structures for that family:

    vi Acetylcholinesterase.txt

The file should look like this (Note: the first PDB listed is the reference to which all other proteins are aligned, so choose a consistent criterion for which structure goes first):

    1EVE
    1GPK
    1GPN
    ...
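A malformed or duplicated entry in a family list is easy to miss by eye. The sketch below is one possible sanity check and is not part of the repository scripts. The function name validate_family_list is hypothetical, and it assumes one classic 4-character PDB ID per line (a digit followed by three alphanumeric characters).

```bash
# Hypothetical helper (not part of the repository scripts): sanity-check one family list.
# Assumes one classic 4-character PDB ID per line (a digit followed by 3 alphanumerics).
validate_family_list() {
    local list="$1" bad dups status=0

    # Lines that do not look like a PDB ID (blank lines are reported too).
    bad=$(grep -n -v -E '^[0-9][A-Za-z0-9]{3}$' "$list" || true)
    if [ -n "$bad" ]; then
        echo "Malformed entries (line:content):"
        echo "$bad"
        status=1
    fi

    # IDs listed more than once, compared case-insensitively.
    dups=$(tr 'a-z' 'A-Z' < "$list" | sort | uniq -d)
    if [ -n "$dups" ]; then
        echo "Duplicate entries:"
        echo "$dups"
        status=1
    fi

    # The first entry is the alignment reference, so print it for confirmation.
    echo "Reference structure: $(head -n 1 "$list")"
    return "$status"
}
```

Running `validate_family_list Acetylcholinesterase.txt` prints the reference structure on the last line, which gives a quick confirmation that the intended PDB is listed first.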

After creating a file listing PDB codes for each protein family, create a file listing protein family names:

    vi zzz.Families.txt

The file should look like this:

    Acetylcholinesterase
    ...
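If there are many families, the list of family names can be generated from the family files instead of typed by hand. This is a minimal sketch under the assumption that every .txt file in zzz.family_lists other than zzz.Families.txt is a family list; the function name write_families_file is hypothetical.

```bash
# Hypothetical helper: build zzz.Families.txt from the family list files in a directory.
# Assumes every *.txt file there, except zzz.Families.txt itself, is a family list.
write_families_file() {
    local dir="$1" out tmp f name
    out="${dir}/zzz.Families.txt"
    tmp="${out}.tmp"

    # Write to a temporary file first so a failed run never leaves a partial list.
    : > "$tmp"
    for f in "${dir}"/*.txt; do
        [ -e "$f" ] || continue            # the glob matched nothing
        name=$(basename "$f" .txt)         # strip the directory and the .txt suffix
        if [ "$name" != "zzz.Families" ]; then
            echo "$name" >> "$tmp"
        fi
    done
    mv "$tmp" "$out"
}
```

From inside zzz.family_lists, `write_families_file .` would write one family name per line, in the shell's sorted glob order.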

IV.Aligning protein families

The first step is to align each protein family along the backbone using the program UCSF Chimera. This is done with the "mmaker" command, which aligns all proteins to a single reference. The co-crystal ligand in each structure undergoes the same transformation. (If Chimera is not available as a module, this can be done in the GUI.)

    module load chimera/1.13.1

Certain variables need to be sourced every time a new session is started to run these scripts:

    source 000.source.env.sh

Edit the Slurm header and set the path to the test set (the same test set used for Pose Reproduction, which should have zzz.master as a subdirectory), then submit the job:

    vi 001.align.submit.sh

    sbatch 001.align.submit.sh

The aligned files for ligand (${pdb_id}.lig.moe.mol2) and receptor (${pdb_id}.rec.foramber.pdb) are found in:

    cd Alignment/Acetylcholinesterase/mol2/

The alignment of each protein family should be checked visually in the Chimera GUI:

Figure: 3 aligned protein and ligand structures from the same protein family

Alignment statistics should also be inspected. They are found in:

    Alignment/Acetylcholinesterase/${pdb_id}

This folder contains the alignment data for that particular structure:

    vi chimera.out

Below is an example of the output for a structure that aligns well to the reference. All pairs were used in the alignment, and the RMSD between the 2 structures is 0.321 angstroms:

Figure: Statistics from a protein well aligned with the reference

In general, a good alignment (i) uses at least 90% of the pairs and (ii) has an RMSD below 2.0 angstroms for those pairs. Any structure that fails either criterion should be removed from the protein family.
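The two criteria can be expressed as a small check. This is a sketch only: the function name alignment_ok is hypothetical, and the three numbers (pairs used, total pairs, RMSD in angstroms) are assumed to be read by hand from chimera.out, since the sketch does not attempt to parse that file.

```bash
# Hypothetical helper: apply the two acceptance criteria to one alignment.
# Arguments: pairs used in the alignment, total pairs, RMSD in angstroms.
# The values are assumed to be copied by hand from chimera.out.
alignment_ok() {
    awk -v used="$1" -v total="$2" -v rmsd="$3" 'BEGIN {
        if (total <= 0) exit 1                  # guard against division by zero
        frac = used / total                     # fraction of pairs used
        # Exit status 0 (success) only if both criteria are met.
        exit !(frac >= 0.90 && rmsd < 2.0)
    }'
}
```

A typical call is `alignment_ok "$used" "$total" "$rmsd" && echo keep || echo reject`. An alignment that uses 108 of 528 pairs (about 20%) fails the first criterion regardless of its RMSD.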

Below is an example of the output for a structure that aligns poorly to the reference. Only about 20% (108 / 528) of the pairs are used in the alignment, which is why the reported RMSD is low. The RMSD over all pairs is higher than the 2.0 angstrom cutoff. This structure should be removed from the family:

Figure: Statistics from a protein poorly aligned with the reference

V.System Preparation of Aligned Structures

If this is a new session remember to run:

    source 000.source.env.sh

Python 2 scripts are required for the next step (the following command should load py/2.7.15 under the current Seawulf setup):

    module load py/

DOCK6 is also used in the next step, so load the compiler modules that match your DOCK6 build (different DOCK compilations use different compilers):

    module load intel-stack

VI.Docking molecules

VII.Cross Docking Analysis

- SEE THE README FILE IN THE GIT REPO FOR ADDITIONAL DETAILS THAT MAY NOT BE COVERED HERE

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Tutorial Written By: Christopher Corbo, Rizzo Lab, Stony Brook University (2024)

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

