2022 DOCK tutorial 2 with PDBID 4ZUD

From Rizzo_Lab
Jump to: navigation, search

Introduction

DOCK

DOCK is a molecular modeling program capable of sampling lower-energy ligand conformations with respect to a binding surface on a given protein. DOCK utilizes and manipulates the geometry of the ligand to find the conformation that yields that most favorable interaction with the respective binding site. With this tool, millions of molecules can be rapidly screened against a target protein for the purposes of identifying new drug molecules that are physiologically relevant.

For more information on DOCK and it's uses, please refer to their online manual: DOCK6 Manual

4ZUD System

4ZUD is an angiotensin II type 1 receptor. It is primarily expressed in liver, heart and kidney cells with its primary function being its ability to regulate blood pressure. Perturbations in chemical cascades that involve this kind of receptor often lead to cardiovascular disease and hypertension; both common ailments.

Since angiotensin II type 1 overstimulation is so common, there has been a large effort to find ways to modulate the protein with drug molecules.

The 4ZUD structure is an angiotesin II type 1 receptor binding domain crystal in complex with Olmesartan, an inverse agonist.

Crystal Structure Paper

Directory Preparation

Before beginning, create the following directories in your space so that all necessary files are organized and can be access quickly:

mkdir 001.structure 002.surface_spheres 003.gridbox 004.dock 005.virtual_screen  006.cartesian_min 007.rescore

You don't have to name your directories the same as they are named here, but be cautious since the files that will be used for this tutorial utilize this naming scheme. They will need to be changed in each file that refers to them if you don't use this naming scheme!

Be sure to have Chimera installed on your system as it will be our primary visualization and system-editing program.

Protein and Ligand Preparation

Download the 4ZUD PDB file from the RCSB PDB website and open the file in Chimera.

Select -> Open -> (pathway to pdb file on your local machine) -> Open
4zud pdb.png

You will notice a few side chain residues are explicitly displayed; those are the ones that directly engage with the ligand. The structure also has some missing regions denoted by the dashed-lines. These regions do not have to be modeled to use the system for docking since the majority of the protein remains restrained during the process (except for the residues of the active site, to a certain extent). You can play around with Chimera and visualize the protein from different angles to get a complete look at the protein to ensure there are no glaring errors in the structure that could have somehow arose from the downloading and opening process (Doesn't usually happen, but it's always good to be sure before moving on!)

Protein Preparation

Many structures deposited in the PDB lack hydrogens due to the difficulty in resolving their electron densities from cryo-EM or X-ray crystallography. The structures also lack formal charges since that information is not captured with out current experimental structure-determining techniques. Both charges and hydrogens are crucial for accurately studying any chemical system, and so they both must be added manually to 4ZUD in order to prime the system for docking.

First we want to delete the ligand from the file since we want to save the protein separately

Select -> Residue -> OLM
Actions -> Atoms/Bonds -> Delete

Adding Hydrogens and Charges

To select the entire protein:

Select -> Chain -> A

To add the hydrogens and charges to the protein we will use Dock Prep:

Tools -> Surface/Binding Analysis -> DockPrep

The first panel should look something like this:

4zud protein dockprep panel1.png

Leave the default setting on and press okay. You will then be prompted to add the hydrogens:

4zud protein dockprep panel2.png

Again, leave the default settings.

Lastly, DockPrep will assign charges:

4zud protein dockprep panel3.png

Make sure that this value of the calculated charge (which will be outputted in the Chimera reply log) is in agreement with what the PDB website or the corresponding paper says about the charge, if that information is provided.

File Saving

Once you have completed all of the aforementioned steps, you have to save the protein as a mol2 file.

File -> Save Mol2

For the purposes of this tutorial the file will be called 4ZUD_protein_hydrogens.mol2

You will want to save a version of this file without hydrogens. To do that, you can select and delete all of the hydrogens like so:

Select -> Chemistry -> element -> H
Actions -> Atoms/Bonds -> Delete

You can then resave the file. 4ZUD_protein_without_hydrogens.mol2

Ligand Preparation

Now we want to focus on preparing the ligand. We can reopen the unmodified 4ZUD PDB in Chimera and delete the protein from the file. You can do this by doing an inverse selection for the protein. First select the ligand:

Select -> Residue -> OLM

On your keyboard, press Shift + Right-Arrow keys simultaneously to invert the selection to the parts that belong to the protein. You can then delete your selection, and you should be left with just the ligand:

Actions -> Atoms/Bonds -> Delete

Adding Hydrogens and Charges

To add hydrogens to the ligand, you can follow the same procedure that you used to add them to the protein § Adding Hydrogens and Charges.

DockPrep predicts a net -1 charge for the ligand. This makes sense since there is a carboxylate group in the molecules. Again, make sure to corroborate this with what the paper states and/or the PDB website states about the charge.

File Saving

You can then save the file the same way you did for the protein mol2. For the purposes of this tutorial, the file will be named 4ZUD_ligand_hydrogens.mol2


Before going any further, make sure to copy the mol2 files that were generated thus far from your local computer to Seawulf. Place these files in the 001.structure directory, or whichever one is the equivalent directory in your records.

Surface Spheres Generation

Grid Box

Grid Generation

UNDER CONSTRUCTION -- FILE NAMES, DIRECTORIES ARE FROM AS. WILL CHANGE THEM TO JA CONVENTIONS.

Next, we'll need to generate the grid we'll use in subsequent steps. Let's start by creating our grid.in file:

   vi grid.in

Into this file copy-paste the following:

 compute_grids                             yes
 grid_spacing                              0.4
 output_molecule                           no
 contact_score                             no
 energy_score                              yes
 energy_cutoff_distance                    9999
 atom_model                                a
 attractive_exponent                       6
 repulsive_exponent                        9
 distance_dielectric                       yes
 dielectric_factor                         4
 bump_filter                               yes
 bump_overlap                              0.75
 receptor_file                             ../001_structure/4zud_receptor_wcharges_wH.mol2
 box_file                                  4zud.box.pdb
 vdw_definition_file                      /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
 score_grid_prefix                        grid

Now that there is a grid.in file, the grid can be produced by running the following:

    grid -i grid.in -o gridinfo.out

Upon successful completion of the grid generation, there should be three new files in your working directory: grid.bmp, grid.nrg, and gridinfo.out.

Box Generation

Before we do anything, we must be in the correct directory:

  cd 003.gridbox

To generate a box around our subject, we need an .in file for Showbox:

  vi showbox.in


Into which we will copy-paste the following:

  Y
  8.0 
  ../002_surface_spheres/4zud_receptor_trunc_spheres_sel.sph
  1
  4zud.box.pdb

Now that we have a .in file, we will produce a .box.pdb file by running the following:

  showbox < showbox.in

After running this, you should have a 4ZUD.box.pdb file in the directory which you ran these commands.

Ligand Minimization

While the ligand in the 4ZUD crystal structure seems to be in a reasonable pose (no steric clashes, etc.), and is of course bound to the receptor, it may not be representative of the lowest energy conformation. Assuming this is the case, the use of the non-minimized ligand conformation will reduce the accuracy of any calculations we perform with it.

One can avoid this problem by performing an energy minimization of the ligand. The first step to do this with 4ZUD will be to cd into the appropriate directory:

   cd 004.dock

We'll be doing some other work in this directory later, so we should create a directory within this one specifically for performing our minimization:

  mkdir min

cd into /min/ and produce an input file:

   vi min.in

Into which the following should be copy-pasted:

  conformer_search_type                                        rigid
  use_internal_energy                                          yes
  internal_energy_rep_exp                                      12
  internal_energy_cutoff                                       100.0
  ligand_atom_file                                             ../001_structure/4zud_ligand_wcharges_wH.mol2
  limit_max_ligands                                            no
  skip_molecule                                                no
  read_mol_solvation                                           no
  calculate_rmsd                                               yes
  use_rmsd_reference_mol                                       yes
  rmsd_reference_filename                                      ../001_structure/4zud_ligand_wcharges_wH.mol2
  use_database_filter                                          no
  orient_ligand                                                no
  bump_filter                                                  no
  score_molecules                                              yes   
  contact_score_primary                                        no
  contact_score_secondary                                      no
  grid_score_primary                                           yes
  grid_score_secondary                                         no
  grid_score_rep_rad_scale                                     1
  grid_score_vdw_scale                                         1
  grid_score_es_scale                                          1  
  grid_score_grid_prefix                                       ../003_gridbox/grid
  multigrid_score_secondary                                    no
  dock3.5_score_secondary                                      no
  continuous_score_secondary                                   no
  footprint_similarity_score_secondary                         no
  pharmacophore_score_secondary                                no
  descriptor_score_secondary                                   no
  gbsa_zou_score_secondary                                     no
  gbsa_hawkins_score_secondary                                 no
  SASA_score_secondary                                         no
  amber_score_secondary                                        no
  minimize_ligand                                              yes
  simplex_max_iterations                                       1000
  simplex_tors_premin_iterations                               0 
  simplex_max_cycles                                           1
  simplex_score_converge                                       0.1
  simplex_cycle_converge                                       1.0
  simplex_trans_step                                           1.0
  simplex_rot_step                                             0.1
  simplex_tors_step                                            10.0
  simplex_random_seed                                          0
  simplex_restraint_min                                        yes
  simplex_coefficient_restraint                                10.0
  atom_model                                                   all
  vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
  flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
  flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
  ligand_outfile_prefix                                        4zud.lig.min

Now that we have our .in file, let's put it to use:

 dock6 -i min.in -o min.out

Once this finishes running, two files will be generated: 4zud.lig.min_scored.mol2 and min.out. Copy 4zud.lig.min_scored.mol2 to your local machine and open it alongside the original ligand in chimera. It should something like this:

Min ligand.png

Where the blue structure is the ligand before minimization, and the beige ligand is after.

Footprint Analysis

Footprint analysis allows one to examine how particular residues within the receptor interact with the ligand. Specifically, we will be looking at electrostatic and Van der Waals interactions between 4ZUD's binding site residues and the OLM ligand. This can allow one to identify which residues of the receptor contribute to the ligand's binding affinity.

To begin footprint analysis, we first need to make a .in file:

 vi footprint.in

Into which the following should be copy-pasted:

 conformer_search_type                                        rigid
 use_internal_energy                                          no
 ligand_atom_file                                             ../../energy_min/4zud.lig.min_scored.mol2
 limit_max_ligands                                            no
 skip_molecule                                                no
 read_mol_solvation                                           no
 calculate_rmsd                                               no
 use_database_filter                                          no
 orient_ligand                                                no
 bump_filter                                                  no
 score_molecules                                              yes
 contact_score_primary                                        no
 contact_score_secondary                                      no
 grid_score_primary                                           no
 grid_score_secondary                                         no
 multigrid_score_primary                                      no
 multigrid_score_secondary                                    no
 dock3.5_score_primary                                        no
 dock3.5_score_secondary                                      no
 continuous_score_primary                                     no
 continuous_score_secondary                                   no
 footprint_similarity_score_primary                           yes
 footprint_similarity_score_secondary                         no
 fps_score_use_footprint_reference_mol2                       yes
 fps_score_footprint_reference_mol2_filename                  ../../../001_structure/4zud_ligand_wcharges_wH.mol2
 fps_score_foot_compare_type                                  Euclidean
 fps_score_normalize_foot                                     no
 fps_score_foot_comp_all_residue                              yes
 fps_score_receptor_filename                                  ../../../001_structure/4zud_receptor_wcharges_wH.mol2
 fps_score_vdw_att_exp                                        6
 fps_score_vdw_rep_exp                                        9
 fps_score_vdw_rep_rad_scale                                  1
 fps_score_use_distance_dependent_dielectric                  yes
 fps_score_dielectric                                         4.0
 fps_score_vdw_fp_scale                                       1
 fps_score_es_fp_scale                                        1
 fps_score_hb_fp_scale                                        0
 pharmacophore_score_secondary                                no
 descriptor_score_secondary                                   no
 gbsa_zou_score_secondary                                     no
 gbsa_hawkins_score_secondary                                 no
 SASA_score_secondary                                         no
 amber_score_secondary                                        no
 minimize_ligand                                              no
 atom_model                                                   all
 vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
 flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
 flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
 ligand_outfile_prefix                                        footprint.out
 write_footprints                                             yes
 write_hbonds                                                 yes
 write_orientations                                           no
 num_scored_conformers                                        1
 rank_ligands                                                 no

Once the footprint.in file has been completed, we can perform the calculations by running the following command:

 dock6 -i footprint.in -o footprint.out

After the calculations have been completed, you should have three new files in the directory where the command was run: footprint.out_footprint_scored.txt footprint.out_hbond_scored.txt and footprint.out_scored.mol2. We can now produce plots of the results using the python_footprint_single_magnitude.py script, which can be found in /gpfs/projects/AMS536/zzz.programs. Run the script by issuing the following command:

 python plot_footprint_single_magnitude.py footprint.out_footprint_scored.txt  50

Note that the "50" argument specifies that we want to plot the 50 residues with the greatest VDW or electrostatic energies, and ignore the rest. Once the script has been run, you should see a footprint.out.pdf file in your directory. If you transfer this file and open it in a PDF viewer, you should see something that looks similar to this:

[FOOTPRINT_PLOT]

From this plot it's readily apparent that there are strong Van der Waals and electrostatic interactions between OLM and ARG254, as well as significant Van der Waals interactions with TRP178.

DOCKING

The main goal of this section is to conduct a pose reproduction of the crystal structure ligand to ensure that we are able to successfully re-produce the low-energy pose that the ligand was crystalized in. If docking is not successful at this, then this system may not be suitable for further testing using DOCK since it is not able to produce a pose that is already known to be sampled by the ligand, based on observations from the crystal. For these reasons, pose reproduction is a worthwhile preliminary step in any docking experiments so that we can be more confident in our results if the system will be used for more complex docking protocols such as virtual screening or de novo design.

In the current implementation of DOCK6, a sampling success is defined as a pose RMSD that is < 2 Å relative to the crystal pose.

Rigid Docking

Rigid docking is the least computationally intensive algorithm in the DOCK program since it does not allow for dihedral rotations, or bond length perturbations. Since we are just focusing on pose reproduction (and since we know that the crystal structure ligand pose is of biological relevance), this docking method would be a good first step.

In your 004.dock directory prepare this file:

vim rigid.in

Insert the following into that file:

conformer_search_type                                        rigid
use_internal_energy                                          yes
internal_energy_rep_exp                                      12
internal_energy_cutoff                                       100
ligand_atom_file                                             ../003.gridbox/4zud.lig.min_scored.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               yes
use_rmsd_reference_mol                                       yes
rmsd_reference_filename                                      ../003.gridbox/4zud.lig.min_scored.mol2
use_database_filter                                          no
orient_ligand                                                yes
automated_matching                                           yes
receptor_site_file                                           ../002.surface_spheres/selected_spheres.sph
max_orientations                                             2000
critical_points                                              no
chemical_matching                                            no
use_ligand_spheres                                           no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           yes
grid_score_secondary                                         no
grid_score_rep_rad_scale                                     1
grid_score_vdw_scale                                         1
grid_score_es_scale                                          1
grid_score_grid_prefix                                       ../003.gridbox/grid
multigrid_score_secondary                                    no
dock3.5_score_secondary                                      no
continuous_score_secondary                                   no
footprint_similarity_score_secondary                         no
pharmacophore_score_secondary                                no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_score_secondary                                         no
amber_score_secondary                                        no
minimize_ligand                                              yes
simplex_max_iterations                                       1000
simplex_tors_premin_iterations                               0
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1.0
simplex_trans_step                                           1.0
simplex_rot_step                                             0.1
simplex_tors_step                                            10.0
simplex_random_seed                                          0
simplex_restraint_min                                        no
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
ligand_outfile_prefix                                        rigid.out
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no

Run the input file with the following command:

dock6 -i rigid.in

Along with an out file, the rigid.out_scored.mol2 should have been created. This is your docked ligand(s). The ligand maybe be superimposed onto the crystal structure ligand using View Dock in Chimera:

Tools -> Surface/Binding Analysis -> ViewDock -> (Path to docked mol2) -> Dock 4, 5, or 6

The crystal reference ligand is colored in brown.

RMSD: 0.4473 Å relative to the minimized crystal pose

Notice how similar the reproduced pose is to the crystal pose. This is largely due to not allowing much position correction/movement during the docking; hence rigid docking. This is the simplest form of docking so you should expect to see very small RMSDs. The RMSD for this DOCK run is below our threshold so it is considered to be a sampling success.

Fixed Anchor Docking

With fixed anchor docking, we can introduce more flexibility into the conformational sampling when trying to re-dock the ligand into the active site. This is done by choosing certain fragments from the ligand as "anchors" which are manually placed in specific locations on the active site, which serve as points from where the other fragments that make up the known crystal ligand can attach to and grow from.

This docking algorithm will allow for the sampling of rotamers of the linker moieties attached to the fixed anchor(s), allowing for a more diverse free energy landscape to explore; possibly yielding more theoretically favorable ligand conformations!

In a new sub directory:

mkdir Fixed_anchor_docking

In a new input file:

vi fixed.in

Insert the following:

conformer_search_type                                        flex
write_fragment_libraries                                     no
user_specified_anchor                                        no
limit_max_anchors                                            no
min_anchor_size                                              5
pruning_use_clustering                                       yes
pruning_max_orients                                          10000
pruning_clustering_cutoff                                    100
pruning_conformer_score_cutoff                               100.0
pruning_conformer_score_scaling_factor                       1.0
use_clash_overlap                                            no
write_growth_tree                                            no
use_internal_energy                                          yes
internal_energy_rep_exp                                      12
internal_energy_cutoff                                       100.0
ligand_atom_file                                             ../../001.structure/4ZUD_ligand_hydrogens.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               yes
use_rmsd_reference_mol                                       yes
rmsd_reference_filename                                      ../../001.structure/4ZUD_ligand_hydrogens.mol2
use_database_filter                                          no
orient_ligand                                                no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           yes
grid_score_secondary                                         no
grid_score_rep_rad_scale                                     1
grid_score_vdw_scale                                         1
grid_score_es_scale                                          1
grid_score_grid_prefix                                       ../../003.gridbox/grid
multigrid_score_secondary                                    no
dock3.5_score_secondary                                      no
continuous_score_secondary                                   no
footprint_similarity_score_secondary                         no
pharmacophore_score_secondary                                no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_score_secondary                                         no
amber_score_secondary                                        no
minimize_ligand                                              yes
minimize_anchor                                              yes
minimize_flexible_growth                                     yes
use_advanced_simplex_parameters                              no
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1
simplex_trans_step                                           1
simplex_rot_step                                             0.1
simplex_tors_step                                            10
simplex_anchor_max_iterations                                500
simplex_grow_max_iterations                                  500
simplex_grow_tors_premin_iterations                          0
simplex_random_seed                                          0
simplex_restraint_min                                        no
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
ligand_outfile_prefix                                        4ZUD_fad
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no

To run the analysis:

dock6 -i fixed.in -O fixed.out

The resulting docked pose:

RMSD: 0.9235 Å relative to the minimized crystal pose

The RMSD in this case is still small, and within the cutoff, but slightly larger than the rigid docking which is also evident in the image. It makes sense that the pose generated from fixed anchor docking is not as close to the crystal pose since we allowed the algorithm to explore different rotational conformations, increasing the likelihood of generating a new conformation.

Flex Docking

In Flex Docking the ligand is fully flexible, as its name would suggest. All degrees of freedom are sampled and all atoms within the ligand have full rotational freedom about their bonds. Due to the rigor of this calculation, however, flex docking is quite resource and time-consuming. This of course means that we will need to submit our run to the queue. Before we do that, we need to make an .in file for our run:

 conformer_search_type                                        flex
 write_fragment_libraries                                     no
 user_specified_anchor                                        no
 limit_max_anchors                                            no
 min_anchor_size                                              5
 pruning_use_clustering                                       yes
 pruning_max_orients                                          5000
 pruning_clustering_cutoff                                    2500
 pruning_conformer_score_cutoff                               100.0
 pruning_conformer_score_scaling_factor                       1.0
 use_clash_overlap                                            no
 write_growth_tree                                            no
 use_internal_energy                                          yes
 internal_energy_rep_exp                                      12
 internal_energy_cutoff                                       100.0
 ligand_atom_file                                             ../../003.gridbox/4zud.lig.min_scored.mol2 #CHANGE ME
 limit_max_ligands                                            no
 skip_molecule                                                no
 read_mol_solvation                                           no
 calculate_rmsd                                               yes
 use_rmsd_reference_mol                                       yes
 rmsd_reference_filename                                      ../../003.gridbox/4zud.lig.min_scored.mol2
 use_database_filter                                          no
 orient_ligand                                                yes
 automated_matching                                           yes
 receptor_site_file                                           ../../002.surface_spheres/selected_spheres.sph #CHANGE ME
 max_orientations                                             1000
 critical_points                                              no
 chemical_matching                                            no
 use_ligand_spheres                                           no
 bump_filter                                                  no
 score_molecules                                              yes
 contact_score_primary                                        no
 contact_score_secondary                                      no
 grid_score_primary                                           yes
 grid_score_secondary                                         no
 grid_score_rep_rad_scale                                     1
 grid_score_vdw_scale                                         1
 grid_score_es_scale                                          1
 grid_score_grid_prefix                                       ../../003.gridbox/grid #CHANGE ME
 multigrid_score_secondary                                    no
 dock3.5_score_secondary                                      no
 continuous_score_secondary                                   no
 footprint_similarity_score_secondary                         no
 pharmacophore_score_secondary                                no
 descriptor_score_secondary                                   no
 gbsa_zou_score_secondary                                     no
 gbsa_hawkins_score_secondary                                 no
 SASA_score_secondary                                         no
 amber_score_secondary                                        no
 minimize_ligand                                              yes
 minimize_anchor                                              yes
 minimize_flexible_growth                                     yes
 use_advanced_simplex_parameters                              no
 simplex_max_cycles                                           1
 simplex_score_converge                                       0.1
 simplex_cycle_converge                                       1.0
 simplex_trans_step                                           1.0
 simplex_rot_step                                             0.1
 simplex_tors_step                                            10.0
 simplex_anchor_max_iterations                                500
 simplex_grow_max_iterations                                  500
 simplex_grow_tors_premin_iterations                          0
 simplex_random_seed                                          0
 simplex_restraint_min                                        no
 atom_model                                                   all
 vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
 flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
 flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
 ligand_outfile_prefix                                        flex.out
 write_orientations                                           yes
 num_scored_conformers                                        1
 rank_ligands                                                 no

Now that we have our .in file, we need to make a slurm script to run it. To make a script, first run the following:

 vi flex.sh

Inside of this file, copy-paste the following:

 #!/bin/bash
 #SBATCH --time=10:00:00
 #SBATCH --nodes=1
 #SBATCH --ntasks=28
 #SBATCH --job-name=4zud_flex_dock
 #SBATCH --output=4zud_flex_dock
 #SBATCH -p long-28core
 #SBATCH --mail-type=BEGIN,END
 #SBATCH --mail-user=<your_netid>@stonybrook.edu
 cd $SLURM_SUBMIT_DIR
 dock6 -i flex.in -o 4zud_flex.out

You should get an email when your run begins and when it ends. For most systems, this should take a few hours to run. If the run is much shorter than that it can be an indicator something has gone wrong. Once complete, three files will appear in the directory where you ran your script: flex.out_scored.mol2, flex.out_conformers.mol2 and flex.out_orients.mol2. Be sure to take a look at your flex.out_scored.mol2. You should see that all RMSD values are less than 2 angstroms -- a desirable result.

RMSD: 1.2088 Å relative to the minimized crystal pose

Virtual Screening

Virtual Screening is a powerful technique in which we dock a library of molecules (often numbering in the tens of thousands) into our receptor and rank them on various parameters. As one might imagine, these calculations are time and resource-intensive, so we won't be running this on the head node.

The molecule library we will use in this tutorial is "VS_library_25K.mol2", which can be found in /gpfs/software/AMS536/zzz.programs. Copy this file to <virtual screening directory> and <cartesian minimization directory>. We'll need it later when we perform our cartesian minimization.

To start off, move into the appropriate directory:

 <directory>

Once there, make an .in file for the screen:

 conformer_search_type                                        flex
 write_fragment_libraries                                     no
 user_specified_anchor                                        no
 limit_max_anchors                                            no
 min_anchor_size                                              5
 pruning_use_clustering                                       yes
 pruning_max_orients                                          1000
 pruning_clustering_cutoff                                    100
 pruning_conformer_score_cutoff                               100.0
 pruning_conformer_score_scaling_factor                       1.0
 use_clash_overlap                                            no
 write_growth_tree                                            no
 use_internal_energy                                          yes
 internal_energy_rep_exp                                      12
 internal_energy_cutoff                                       100.0
 ligand_atom_file                                             VS_library_25K.mol2 
 limit_max_ligands                                            no
 skip_molecule                                                no
 read_mol_solvation                                           no
 calculate_rmsd                                               no
 use_database_filter                                          no
 orient_ligand                                                yes
 automated_matching                                           yes
 receptor_site_file                                           ../002_surface_sphere/selected_spheres.sph #CHANGE ME
 max_orientations                                             1000
 critical_points                                              no
 chemical_matching                                            no
 use_ligand_spheres                                           no
 bump_filter                                                  no
 score_molecules                                              yes
 contact_score_primary                                        no
 contact_score_secondary                                      no
 grid_score_primary                                           yes
 grid_score_secondary                                         no
 grid_score_rep_rad_scale                                     1
 grid_score_vdw_scale                                         1
 grid_score_es_scale                                          1
 grid_score_grid_prefix                                       ../003_gridbox/grid #CHANGE ME
 multigrid_score_secondary                                    no
 dock3.5_score_secondary                                      no
 continuous_score_secondary                                   no
 footprint_similarity_score_secondary                         no
 pharmacophore_score_secondary                                no
 descriptor_score_secondary                                   no
 gbsa_zou_score_secondary                                     no
 gbsa_hawkins_score_secondary                                 no
 SASA_score_secondary                                         no
 amber_score_secondary                                        no
 minimize_ligand                                              yes
 minimize_anchor                                              yes
 minimize_flexible_growth                                     yes
 use_advanced_simplex_parameters                              no
 simplex_max_cycles                                           1
 simplex_score_converge                                       0.1
 simplex_cycle_converge                                       1.0
 simplex_trans_step                                           1.0
 simplex_rot_step                                             0.1
 simplex_tors_step                                            10.0
 simplex_anchor_max_iterations                                500
 simplex_grow_max_iterations                                  500
 simplex_grow_tors_premin_iterations                          0
 simplex_random_seed                                          0
 simplex_restraint_min                                        no
 atom_model                                                   all
 vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
 flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
 flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
 ligand_outfile_prefix                                        virtual.out
 write_orientations                                           no
 num_scored_conformers                                        1
 rank_ligands                                                 yes
 max_ranked_ligands                                           500

To ensure that the .in file has been configured correctly, we will do a test run of the calculation:

 dock6 -i virtual.in 

If the calculation seems to begin without issue, you can press ctrl+c to end the test. We are ready to submit the calculation as a job. Since we can leverage multiple cores for our virtual screen, we will run the command with MPI. The slurm script can be made by running the following:

 vi virtual_screen.sh

Then you need to copy-paste the following into the file:

 #!/bin/bash
 #SBATCH --time=48:00:00
 #SBATCH --nodes=4
 #SBATCH --ntasks=28
 #SBATCH --job-name=4zud_vs
 #SBATCH --output=4zud_vs
 #SBATCH -p long-28core
 #SBATCH --mail-type=BEGIN,END
 #SBATCH --mail-user=andrew.sillato@stonybrook.edu
 cd $SLURM_SUBMIT_DIR
 mpirun -np 112 dock6.mpi -i virtual.in -o virtual.out

To submit this job to the queue, run:

 sbatch virtual_screen.sh

Due to the fact that this job is running on 112 cores, 112 .out files will be produced over the course of the run. Their contents are summarized in virtual.out, which should be transferred to your local machine for viewing in Chimera.

Cartesian Minimization

Once the virtual screen has been completed, we must perform an energy minimization of all of our screened ligands. After this we will rescore them, revealing any differences between the energy minimized and non-minimized poses. This will help us to identify which ligands bind most strongly with our receptor. Before we begin, we must move into the appropriate directory:

 cd <cartmin_directory>

Once there, we must make an .in file:

 vi cartmin.in

Into this .in, copy-paste the following:

 conformer_search_type                                        rigid
 use_internal_energy                                          yes
 internal_energy_rep_exp                                      12
 internal_energy_cutoff                                       100
 ligand_atom_file                                             ../006_virtual_screen_mpi/virtual.out_scored.mol2 #CHANGE ME
 limit_max_ligands                                            no
 skip_molecule                                                no
 read_mol_solvation                                           no
 calculate_rmsd                                               no
 use_database_filter                                          no
 orient_ligand                                                no
 bump_filter                                                  no
 score_molecules                                              yes
 vcontact_score_primary                                        no
 contact_score_secondary                                      no
 grid_score_primary                                           no
 grid_score_secondary                                         no
 multigrid_score_primary                                      no
 multigrid_score_secondary                                    no
 dock3.5_score_primary                                        no
 dock3.5_score_secondary                                      no
 continuous_score_primary                                     yes
 continuous_score_secondary                                   no
 cont_score_rec_filename                                      ../001_structure/1HW9_rec_wH.mol2 #CHANGE ME
 cont_score_att_exp                                           6
 cont_score_rep_exp                                           12
 cont_score_rep_rad_scale                                     1
 cont_score_use_dist_dep_dielectric                           yes
 cont_score_dielectric                                        4.0
 cont_score_vdw_scale                                         1.0
 cont_score_es_scale                                          1.0
 footprint_similarity_score_secondary                         no
 pharmacophore_score_secondary                                no
 descriptor_score_secondary                                   no
 gbsa_zou_score_secondary                                     no
 gbsa_hawkins_score_secondary                                 no
 SASA_score_secondary                                         no
 amber_score_secondary                                        no
 minimize_ligand                                              yes
 simplex_max_iterations                                       1000
 simplex_tors_premin_iterations                               0
 simplex_max_cycles                                           1.0
 simplex_score_converge                                       0.1
 simplex_cycle_converge                                       1.0
 simplex_trans_step                                           1.0
 simplex_rot_step                                             0.1
 simplex_tors_step                                            10.0
 simplex_random_seed                                          0
 simplex_restraint_min                                        no
 atom_model                                                   all
 vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
 flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
 flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
 ligand_outfile_prefix                                        4zud.min 
 write_orientations                                           no
 num_scored_conformers                                        1
 rank_ligands                                                 no

Since this is another rather intensive calculation, we need to submit it to the queue. We need to first make the slurm script:

vi cartmin.sh

Into said file, you should copy the following:

 #!/bin/bash
 #SBATCH --time=48:00:00
 #SBATCH --nodes=2 
 #SBATCH --ntasks=28
 #SBATCH --job-name=mpi_4zud_VS_CM
 #SBATCH --output=4zud_CM_mpi.out
 #SBATCH -p long-28core
 #SBATCH --mail-type=BEGIN,END
 #SBATCH --mail-user=<your_netid>@stonybrook.edu

cd $SLURM_SUBMIT_DIR mpirun -np 56 /gpfs/projects/AMS536/zzz.programs/dock6.9_release/bin/dock6.mpi -i cartmin.in -o cartmin_scored.out

Once the script has been saved, we can use it to submit this job to the queue:

 sbatch cartmin.sh

It should be noted that, like the virtual screen, we can speed this job up by requesting more nodes. Once the job is done, open any of the .out files to check that the job ran successfully.

Rescoring

As mentioned in the previous section, once we have energy-minimized the molecules from the virtual screen, we need to rescore them based on any changes to the conformations of the various ligands. We can rescore and rank them using a variety of scoring functions (pharmacophore score, footprint similarity score, Hungarian score, etc.). To accomplish this, we must first create a .in file in which we will specify the functions with which we want to score our molecules. Change directory into 00X.rescore and run the following:

 vi rescore.in

Into this file, paste the following:

 conformer_search_type                                        rigid
 use_internal_energy                                          yes
 internal_energy_rep_exp                                      12
 internal_energy_cutoff                                       100
 ligand_atom_file                                             ../005_virtual_screen/virtual.out_scored.mol2 #CHANGE ME
 limit_max_ligands                                            no
 skip_molecule                                                no
 read_mol_solvation                                           no
 calculate_rmsd                                               no 
 use_database_filter                                          no
 orient_ligand                                                no
 bump_filter                                                  no
 score_molecules                                              yes
 contact_score_primary                                        no
 contact_score_secondary                                      no
 grid_score_primary                                           no
 grid_score_secondary                                         no
 multigrid_score_primary                                      no
 multigrid_score_secondary                                    no
 dock3.5_score_primary                                        no
 dock3.5_score_secondary                                      no
 continuous_score_primary                                     no
 continuous_score_secondary                                   no
 footprint_similarity_score_primary                           no
 footprint_similarity_score_secondary                         no
 pharmacophore_score_primary                                  no
 pharmacophore_score_secondary                                no
 descriptor_score_primary                                     yes
 descriptor_score_secondary                                   no
 descriptor_use_grid_score                                    no
 descriptor_use_multigrid_score                               no
 descriptor_use_continuous_score                              no
 descriptor_use_footprint_similarity                          yes
 descriptor_use_pharmacophore_score                           yes
 descriptor_use_hungarian                                     yes
 descriptor_use_volume_overlap                                yes
 descriptor_fps_score_use_footprint_reference_mol2            yes
 descriptor_fps_score_footprint_reference_mol2_filename       ../004_dock/energy_min/4zud.lig.min_scored.mol2 #CHANGE ME
 descriptor_fps_score_foot_compare_type                       Euclidean
 descriptor_fps_score_normalize_foot                          no
 descriptor_fps_score_foot_comp_all_residue                   yes
 descriptor_fps_score_receptor_filename                       ../001_structure/4zud_receptor_prepped.mol2 #CHANGE ME
 descriptor_fps_score_vdw_att_exp                             6
 descriptor_fps_score_vdw_rep_exp                             12
 descriptor_fps_score_vdw_rep_rad_scale                       1
 descriptor_fps_score_use_distance_dependent_dielectric       yes
 descriptor_fps_score_dielectric                              4.0
 descriptor_fps_score_vdw_fp_scale                            1
 descriptor_fps_score_es_fp_scale                             1
 descriptor_fps_score_hb_fp_scale                             0
 descriptor_fms_score_use_ref_mol2                            yes
 descriptor_fms_score_ref_mol2_filename                       ../004_dock/energy_min/4zud.lig.min_scored.mol2 #CHANGE ME
 descriptor_fms_score_write_reference_pharmacophore_mol2      no
 descriptor_fms_score_write_reference_pharmacophore_txt       no
 descriptor_fms_score_write_candidate_pharmacophore           no
 descriptor_fms_score_write_matched_pharmacophore             no
 descriptor_fms_score_compare_type                            overlap
 descriptor_fms_score_full_match                              yes
 descriptor_fms_score_match_rate_weight                       5.0
 descriptor_fms_score_match_dist_cutoff                       1.0
 descriptor_fms_score_match_proj_cutoff                       .7071
 descriptor_fms_score_max_score                               20
 descriptor_fingerprint_ref_filename                          ../004_dock/energy_min/4zud.lig.min_scored.mol2 #CHANGE ME
 descriptor_hms_score_ref_filename                            ../004_dock/energy_min/4zud.lig.min_scored.mol2 #CHANGE ME
 descriptor_hms_score_matching_coeff                          -5
 descriptor_hms_score_rmsd_coeff                              1
 descriptor_volume_score_reference_mol2_filename              ../004_dock/energy_min/4zud.lig.min_scored.mol2 #CHANGE ME
 descriptor_volume_score_overlap_compute_method               analytical
 descriptor_weight_fps_score                                  1
 descriptor_weight_pharmacophore_score                        1
 descriptor_weight_fingerprint_tanimoto                       -1
 descriptor_weight_hms_score                                  1
 descriptor_weight_volume_overlap_score                       -1
 gbsa_zou_score_secondary                                     no
 gbsa_hawkins_score_secondary                                 no
 SASA_score_secondary                                         no
 amber_score_secondary                                        no
 minimize_ligand                                              no
 atom_model                                                   all
 vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn
 flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn
 flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl
 chem_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/chem.defn
 pharmacophore_defn_file                                      /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/ph4.defn
 ligand_outfile_prefix                                        descriptor.output #CHANGE ME
 write_footprints                                             yes
 write_hbonds                                                 yes
 write_orientations                                           no
 num_scored_conformers                                        1
 rank_ligands                                                 no

Similar to the previous sections, we will need to run this on a compute node. Create the slurm script to do so by running the following:

vi rescore.sh

Into this file copy the following:

 #!/bin/bash
 #SBATCH --time=48:00:00
 #SBATCH --nodes=2
 #SBATCH --ntasks=28  
 #SBATCH --job-name=mpi_4zud_VS_CM_rescore
 #SBATCH --output=4zud_VS_CM_rescore_mpi.out
 #SBATCH -p long-28core
 #SBATCH --mail-type=BEGIN,END
 #SBATCH --mail-user=<your_netid>@stonybrook.edu
 cd $SLURM_SUBMIT_DIR
 mpirun -np 32 dock6.mpi -i rescore.in -o rescore.out

Once the script has been saved, run it:

 sbatch rescore.sh

After the job is complete you will see many .out files, all of which are summarized in descriptor.out_scored.mol2. This contains our complete rescored dataset, which can be viewed with ViewDock in Chimera.

If, however, we wanted to view a subset of these molecules (i.e., dozens) ranked by the same scoring function all at once, we would not be able to. We can get around this issue by using the zzz.sep_by_val.py script, found in <zzz.sep_by_val.py directory>. In this example, we will select the 25 molecules which have the highest footprint similarity scores when compared to our reference ligand:

 python zzz.sep_by_val.py descriptor.output_scored.mol2 descriptor.output_scored_by_footprint_similarity.mol2 Name: Footprint_Similarity_Score 25

We can then import the descriptor.output_scored_by_footprint_similarity.mol2 file into Chimera. It should look something like this:

[FOOTPRINT_OVERLAP_PICTURE]