2021 DOCK tutorial 3 with PDBID 1S19
- 1 Introduction
- 2 Preparing the Ligand and Receptor
- 3 Generating Receptor Surface and Spheres
- 4 Creating Box and Grid
- 5 Energy Minimization
- 6 Docking
Developed by Irwin D. Kuntz, Jr. and colleagues at UCSF, DOCK is a program used to dock molecules to one another. Docking is a process in which given a small molecule or ligand in the active site of a receptor, the program will try to predict the lowest-energy binding mode of the ligand to the receptor. This process is very important in drug discovery as small molecules that bind to or interact with the active site of a receptor associated with a disease could inhibit its function, acting as a drug. DOCK is a particularly helpful tool as it is used to screen massive libraries of molecules, containing millions of compounds, against a receptor to identify the most promising drug lead compounds.
DOCK historically used a geometric matching algorithm to superimpose the ligand onto a negative image of the active site of the receptor. Over the years, features were added that would improve the programs ability to find the lowest-energy binding mode. Somem of these features include force-field based scoring, on-the-fly optimization and an algorithm for flexible ligand docking.
In this tutorial DOCK6 will be used. New features for DOCK6 include: additional scoring options during minimization; DOCK 3.5 scoring-including Delphi electrostatics, ligand conformational entropy corrections, ligand desolvation, receptor desolvation; Hawkins-Cramer-Truhlar GB/SA solvation scoring with optional salt screening and more (see UCSF DOCK). These new features improved the programs ability to predict binding poses.
This tutorial will use PDB code: 1S19. 1S19 is the crystal structure of VDR ligand binding domain complexed to calcipotriol (find structure here. The resolution is 2.10 Å.
UCSF Chimera was developed by Resource for Biocomputing, Visualization, and Informatics (RBVI) at the University of California, San Francisco. Chimera is a program made for the interactive visualization and analysis of molecular structures. Some features of Chimera include general structure analysis (automatic identification of an atom, hydrogen addition and partial charge assignment, structure building and bond rotation, etc.) presenting images and movies (high-res images, visual effects, standard molecular representations, etc.) and sequence structure tools (sequence alignments, structure superposition, etc.)
To make it easier for us to locate certain files, we are going to create directories for each step of the docking process. The mkdir command creates a new directory in which new files can be saved. The cd command allows for you to navigate between directories.
In your Bash Shell environment, create a directory containing all of the information for the project by typing:
To change into that directory type:
To create the directories for each step:
mkdir 001.structure 002.surface_spheres 003.gridbox 004.dock 005.virtual_screen 006.virtual_screen_mpi 007.cartesianmin 008.rescore
To confirm that the directories have been created:
If you made a mistake and need to delete a directory:
rm *insert name of directory to be deleted*
Preparing the Ligand and Receptor
Download the structure of 1S19 from the Protein Data Bank (PDB).
Download Files -> PDB Format
This file contains the coordinates for the 3D structure of the receptor, ligand and any other molecules present during the experiment (typically water or metal ions). To visualize the structure, we will be using Chimera.
Visualization with Chimera To open the newly downloaded PDB coordinates:
File -> Open
The protein when you first open the file should look like the image below. You can change the view of the structure by rotating it with your mouse or touchpad. Currently, some of the side chains of the backbone are shown, there are no hydrogens or partial charges, and there are some water molecules present.
In order to dock, the ligand and receptor have to be separated and saved into different files. To do this is a simple process and can be done in Chimera. To hide all sidechain bonds:
Select -> standard amino acids Actions -> Atoms/Bonds -> hide
To prepare the receptor, we are going to want to delete everythign except the protein from the PDB file. To do this in Chimera:
Select -> Residue -> All nonstandard Actions -> Atoms/Bonds -> Delete
You will now be left with only the desired protein. To save this file, we are going to save it as a mol2 file. This can be done by:
File -> Save Mol2 -> "1s19_receptor_woH.mol2"
Adding Hydrogens and Charge PDB structures are reported without hydrogens, so it is important that we add them to the receptor in order to gain accurate calculations for the interactions between the protein and ligand. This can also be done in Chimera by doing the following:
Tools -> Structure Editing -> Add H -> Ok
We will also need to add charges to the receptor
Tools -> Structure Editing -> Add Charge -> (have Amber ff14SB and AM1-BCC selected) -> Ok
Once this is done it is very important to check that the charges of the receptor match that of the experiments. Chimera adds the standard protonation states to the amino acids, so it's important to read the paper associated with the PDB file (see here) to make sure that there are no amino acids that are specifically protonated or deprotonated.
Once you have checked to make sure that the protonation states are okay, save this as a mol2 file:
File -> Save Mol2 -> "1s19_receptor_dockprep.mol2"
Like the receptor, we will need to save the ligand as a separate mol2 file in order to perform the docking. For this model, the ligand is named as MC9 in Chimera.
To isolate the ligand:
Select -> Residue -> MC9 Select -> Invert (all models) Actions -> Atoms/Bonds -> Delete
We are now left with just the ligand as pictured below.
We will save this as a mol2 file by:
File -> Save Mol2 -> 1s19_lig_woH.mol2
Adding Hydrogens and Charges In the same steps as the receptor, we will add hydrogens and partial charges to the ligand. For the Hydrogens:
Tools -> Structure Editing -> Add H
After doing so, the structure should look like the one below.
Now we have to add the charges to the ligand. Remember to double check the crystal structure and validate the charge. In this case the charge is neutral (+/- 0), and Chimera was able to model it correctly. However this is not always the case and you will sometimes be required to manually change the charges.
Save this final structure as a mol2 file with the name "1s19_lig_dockprep.mol2"
Generating Receptor Surface and Spheres
In Chimera we can generate a representation of the receptor that creates a negative image of the protein. This surface will guide the ligand to the active site of the receptor during docking. To do this, we are going to open the receptor without hydrogens in Chimera, 1s19_receptor_woH.mol2. Then in Chimera, do:
Actions -> Surface -> Show Tools -> Structure Editing -> Write DMS -> "1s19_rec_surface.dms"
Move this to the "002.surface_spheres" directoy in the Bash Shell environment. THe following Figure is the depiction of the surface in Chimera.
Spheres in docking represent empty space in the receptor. Now that we have the surface representation of our receptor, we are going to use the sphgen ("sphere generation") script to create the largest possible sphere for any given empty space. It is okay that spheres overlap with each other, however we do not want any spheres overlapping with the protein.
Still in the "002.surface_spheres" directory, we are going to create a file calle INSPH (short for "in spheres"). To do this type:
In this new file, we are going to write a script in order to automate the process of generating the spheres. Below is an example of the script that we used.
1s19_rec_surface.dms # your receptor as DMS file R # <R flag> - enables sphere generation outside the protein surface (no eclipsing) X # <X flag - uses all coordinates 0.0 # <double> - distance that steric interactions are checked 4.0 # <double> - Maximum sphere radius of generated sphere 1.4 # <double> - Size of sphere that rolls over dms file surface for cavities 1s19_receptor_woH.sph # the name of your output file
With this script saved, you can now run it through sphgen by the following:
sphgen -i INSPH -o OUTSPH
When this is run, the program will generate the "1s19_receptor_woH.sph" file and also an "OUTSPH" file. Looking at 1s19_receptor_woH.sph in Chimera will give you the following image.
Selecting Spheres We are more concerned with the empty space near the active site of the receptor, since that is where we are going to be docking our ligand. To only select particular spheres within 10 angstroms of the ligand, we executed the following command:
sphere_selector 3vjk_receptor_woH.sph 3vjk_ligand_H.mol2 10.0
This generated a new file called "selected_spheres.sph" in our working directory. The result of the sphere selection is show in the image below.
Creating Box and Grid
To make the amount of calculations we have to do smaller, and thus more efficient, we will create a grid in which relevent residues/atoms are selected. This method of creating the box and grid will eliminate any long-distance interactions that the ligand may have.
We will create the box by using the Showbox program. To do this, we will create the following file in our "003.gridbox" directory.
We put the following script inside the showbox.in file:
Y #generate box# 8.0 #how many angstroms the box edges should be from the spheres ./../002.surface_spheres/select_spheres.sph #the location of the selected spheres 1 1s19.box.pdb #name of the output file
To execute this script, type the following:
showbox < showbox.in
Once this program has run, the 1s19.box.pdb file should be in the working directory.
Next, we will be generating the grid. To do so, we will create a file called grid.in
The following script will be copied into the grid.in file
compute_grids yes grid_spacing 0.4 output_molecule no contact_score no energy_score yes energy_cutoff_distance 9999 atom_model a attractive_exponent 6 repulsive_exponent 9 distance_dielectric yes dielectric_factor 4. bump_filter yes bump_overlap 0.75 receptor_file ./../001.structure/1s19_receptor_dockprep.mol2 box_file 1s19.box.pbd vdw_definition_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn score_grid_prefix grid
To run this script, type:
grid -i grid.in -o gridinfo.out
When this program finishes running, you should have three new files: gridinfo.out, grid.nrg, and grid.bmp
The ligand pose in the crystal structure may not accurately represent the lowest energy conformation of the ligand in a biological system. Therefore, to accurately dock the ligand, we first need to reduce the overall potential energy of the ligand to its global minimum. To do this, we are going to create an energy minimization script:
We used the following script for the minimizaiton:
conformer_search_type rigid use_internal_energy yes internal_energy_rep_exp 12 internal_energy_cutoff 100.0 ligand_atom_file ./../001.build/1s19_ligand_dockprep.mol2 limit_max_ligands no skip_molecule no read_mol_solvation no calculate_rmsd yes use_rmsd_reference_mol yes rmsd_reference_filename ./../001.build/1s19_ligand_dockprep.mol2 use_database_filter no orient_ligand no bump_filter no score_molecules yes contact_score_primary no contact_score_secondary no grid_score_primary yes grid_score_secondary no grid_score_rep_rad_scale 1 grid_score_vdw_scale 1 grid_score_es_scale 1 grid_score_grid_prefix ./../003.gridbox/grid multigrid_score_secondary no dock3.5_score_secondary no continuous_score_secondary no footprint_similarity_score_secondary no pharmacophore_score_secondary no descriptor_score_secondary no gbsa_zou_score_secondary no gbsa_hawkins_score_secondary no SASA_score_secondary no amber_score_secondary no minimize_ligand yes simplex_max_iterations 1000 simplex_tors_premin_iterations 0 simplex_max_cycles 1 simplex_score_converge 0.1 simplex_cycle_converge 1.0 simplex_trans_step 1.0 simplex_rot_step 0.1 simplex_tors_step 10.0 simplex_random_seed 0 simplex_restraint_min yes simplex_coefficient_restraint 10.0 atom_model all vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl ligand_outfile_prefix 1s19.lig.min write_orientations no num_scored_conformers 1 rank_ligands no
Now that the script is made, we can run the energy minimization by the following line
dock6 -i min.in -o min.out
Running this command will yield two new files in the direectory: min.out and 1s19.lig.min_scored.mol2
You can visualize the minimized ligand and the original ligand in Chimera to observe the difference between the two.
The beige structure is the minimized ligand and the blue structure is the original ligand. The difference between the two is not large. To see the RMSD (room mean squared deviation) between the two, you can vim into the 1s19.lig.min_scored.mol2 file that was generated. The picture below shows the top of that file, and RMSD between the two molecules is 0.2281.
Rigid is the least computationally intensive form of docking - the ligand has no rotational freedom around its bonds and is kept static. The first step is to move to the dock folder and make a folder for rigid docking.
cd 004.dock mkdir 1S19_rigid
Then we can make an input file for rigid docking and edit in within Vim or the Dock6.9 program.
touch rigid.in vim rigid.in
Here we will edit the parameters as follows:
conformer_search_type rigid use_internal_energy yes internal_energy_rep_exp 12 internal_energy_cutoff 100 ligand_atom_file ./../004.dock/1S19_lig_min_scored.mol2 limit_max_ligands no skip_molecule no read_mol_solvation no calculate_rmsd yes use_rmsd_reference_mol yes rmsd_reference_filename ./../004.dock/1S19_lig_min_scored.mol2 use_database_filter no orient_ligand yes automated_matching yes receptor_site_file ./../002.surface_spheres/selected_spheres.sph max_orientations 1000 critical_points no chemical_matching no use_ligand_spheres no bump_filter no score_molecules yes contact_score_primary no contact_score_secondary no grid_score_primary yes grid_score_secondary no grid_score_rep_rad_scale 1 grid_score_vdw_scale 1 grid_score_es_scale 1 grid_score_grid_prefix ./../003.gridbox/grid multigrid_score_secondary no dock3.5_score_secondary no continuous_score_secondary no footprint_similarity_score_secondary no pharmacophore_score_secondary no descriptor_score_secondary no gbsa_zou_score_secondary no gbsa_hawkins_score_secondary no SASA_score_secondary no amber_score_secondary no minimize_ligand yes simplex_max_iterations 1000 simplex_tors_premin_iterations 0 simplex_max_cycles 1 simplex_score_converge 0.1 simplex_cycle_converge 1.0 simplex_trans_step 1.0 simplex_rot_step 0.1 simplex_tors_step 10.0 simplex_random_seed 0 simplex_restraint_min no atom_model all vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl ligand_outfile_prefix rigid.out write_orientations no num_scored_conformers 1 rank_ligands no
Then we will input the file into Dock6.9 with the following command:
dock6 -i rigid.in
This will output the file rigid.out_scored.mol2, which can be visualized in Chimera.