Difference between revisions of "2024 DOCK tutorial 2 with PDBID 1NDV"
Stonybrook (talk | contribs) (→Downloading PDB 1NDV) |
Stonybrook (talk | contribs) (→reScore) |
||
(55 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
== Introduction == | == Introduction == | ||
− | DOCK is a powerful computational tool used in drug design. "Docking" refers to the identification of low energy binding conformations of a small molecule, or ligand, in an active site. There are three methods in which DOCK performs docking: rigid, fixed anchor, and flexible. DOCK also has the ability to perform a virtual screen, which is analogous to a high throughput screen often performed in wet labs. This method allows for thousands of molecules to be docked and ranked so that the highest scoring molecules can move to the next stage of testing. A virtual screen can save both time and money when compared to a traditional high throughput screen. This tutorial will guide you through the process of performing each method of docking, a virtual screen, and associated analysis using PDB 1NDV. Replace 1NDV with your target protein in this tutorial. | + | DOCK is a powerful computational tool used in drug design. "Docking" refers to the identification of low energy binding conformations of a small molecule, or ligand, in an active site. There are three methods in which DOCK performs docking: rigid, fixed anchor, and flexible. DOCK also has the ability to perform a virtual screen, which is analogous to a high throughput screen often performed in wet labs. This method allows for thousands of molecules to be docked and ranked so that the highest scoring molecules can move to the next stage of testing. A virtual screen can save both time and money when compared to a traditional high throughput screen. This tutorial will guide you through the process of performing each method of docking, a virtual screen, and associated analysis using PDB 1NDV on a HPC cluster (Seawulf). Replace 1NDV with your target protein in this tutorial. |
=== Required Software and Skills === | === Required Software and Skills === | ||
In order to complete this tutorial [https://www.cgl.ucsf.edu/chimera/download.html UCSF Chimera] is required. Chimera is a powerful visualization tool that will aid in the preparation and analysis of the virtual screen. Please note that the newer version, Chimera X, can not be used for this tutorial as it lacks certain integrations with DOCK. While this tutorial will walk you through all steps necessary to use Chimera, it may be helpful to view this [https://ringo.ams.stonybrook.edu/index.php/Chimera tutorial] for a more in depth guide on the program. | In order to complete this tutorial [https://www.cgl.ucsf.edu/chimera/download.html UCSF Chimera] is required. Chimera is a powerful visualization tool that will aid in the preparation and analysis of the virtual screen. Please note that the newer version, Chimera X, can not be used for this tutorial as it lacks certain integrations with DOCK. While this tutorial will walk you through all steps necessary to use Chimera, it may be helpful to view this [https://ringo.ams.stonybrook.edu/index.php/Chimera tutorial] for a more in depth guide on the program. | ||
Line 11: | Line 11: | ||
===== Creating Directories ===== | ===== Creating Directories ===== | ||
Create a parent directory which will contain all subdirectors for each docking step. For this tutorial we will name the parent directory 1NDV_DOCK_VS. Now create the subdirectories using the following scheme. | Create a parent directory which will contain all subdirectors for each docking step. For this tutorial we will name the parent directory 1NDV_DOCK_VS. Now create the subdirectories using the following scheme. | ||
+ | |||
[[File:1NDV_Chart.png|center|thumb]] | [[File:1NDV_Chart.png|center|thumb]] | ||
− | ===== Downloading PDB | + | |
+ | ===== Downloading PDB File ===== | ||
Visit the RCSB Protein Data Bank (PDB) [https://www.rcsb.org website] and use the search bar to search for your target protein. As stated before, we will be using 1NDV for this tutorial. | Visit the RCSB Protein Data Bank (PDB) [https://www.rcsb.org website] and use the search bar to search for your target protein. As stated before, we will be using 1NDV for this tutorial. | ||
+ | |||
[[File:PDB_SearchBar.png|center|thumb|upright=4.25]] | [[File:PDB_SearchBar.png|center|thumb|upright=4.25]] | ||
+ | |||
+ | |||
+ | The page for the protein will look like this. Click the "Download Files" dropdown menu, and download the "PDB Format". | ||
+ | |||
+ | |||
+ | [[File:1NDV-Download.png|thumb|center|upright=3]] | ||
+ | |||
+ | |||
+ | With the PDB file saved, we are now ready to begin generating files for DOCK! | ||
== Preparing Structure Files == | == Preparing Structure Files == | ||
+ | To prepare structure files, the protein and ligand structures need to be prepped. It is important prepare your structure files using the following steps to ensure your files run properly and are considered correct chemical models of your system. | ||
+ | |||
+ | We will prep the protein and ligand structures using Chimera. | ||
+ | |||
=== Protein === | === Protein === | ||
+ | Required file: *Structure file from the RCSB Protein Data Bank (PDB). *For this tutorial, is it 1ndv.pdb. | ||
====Analyzing the Protein Structure:==== | ====Analyzing the Protein Structure:==== | ||
− | Open | + | Open 1ndv.pdb in Chimera. |
'''Assess the protein structure:''' | '''Assess the protein structure:''' | ||
Line 28: | Line 45: | ||
1. Are there any missing loops? | 1. Are there any missing loops? | ||
− | 2. | + | 2. Is there a metal coordination atom forming key interactions with the protein? |
The .pdb for 1ndv from the RCSB Protein Data Bank is shown below. | The .pdb for 1ndv from the RCSB Protein Data Bank is shown below. | ||
− | |||
[[File:1ndv_raw_pdb.jpg|center|500px]] | [[File:1ndv_raw_pdb.jpg|center|500px]] | ||
− | |||
'''Assessment:''' | '''Assessment:''' | ||
Line 40: | Line 55: | ||
2. Zinc coordination with the protein and a water molecule. This zinc atom must be kept in the protein prep. | 2. Zinc coordination with the protein and a water molecule. This zinc atom must be kept in the protein prep. | ||
+ | [[File:1ndv_coordination.jpg|center|500px]] | ||
+ | |||
+ | ====Protein Prep==== | ||
+ | 1. Select an atom on the protein | ||
− | [[File: | + | 2. Select the entire protein by pressing the "up" arrow key. You will want to keep the Zn in the complex so hold down the "shift" key and click the Zn atom to select it as well. |
+ | |||
+ | 3. Go to the menu bar Select → Invert(all models). This will invert the selection and select waters, the ligand, and other compounds in the crystal structure that wasn't previously selected. | ||
+ | |||
+ | 4. Go to the menu bar Actions → Atoms/Bonds → Delete. | ||
+ | |||
+ | 5. Then save this structure with a a file name (i.e. 1ndv_protein_only.pdb). Go to the menu bar File → Save PBD. Your pdb file will now look similar to this: | ||
+ | |||
+ | [[File:1ndv_protein_only.png|center|500px]] | ||
+ | |||
+ | 6. We also need to generate a .mol2 file. So go to File → Save Mol2 and give it a name 1ndv_protein_only.mol2. The previous file as a .pdb extension and will not be overwritten as a mol2 file is separate from mol2 format. | ||
+ | |||
+ | ====Adding Hydrogens and Charges==== | ||
+ | |||
+ | We need to prepare the protein with hydrogens and charges. We must add hydrogens before adding charges. | ||
+ | |||
+ | 7. Adding hydrogens. To add hydrogen atoms click on: Tools → Structure Editing → AddH. This command will cause the following dialogue box to appear: | ||
+ | |||
+ | [[File:1ndv_protein_addH.png|center|300px]] | ||
+ | |||
+ | a. Click 'OK'. When the program is finished, the bottom left-hand side will say "Hydrogens added". You may not see any changes in the structure. Here we can see hydrogens on residues surround the zinc atom. If so you can make sure by selecting a couple residues and clicking Actions → Atoms/Bonds → Show. You can unselect these residues by clicking on the background. | ||
+ | |||
+ | [[File:1ndv protein hydrogens check.png|center|300px]] | ||
+ | |||
+ | b. Now once this check is done save your protein file with hydrogens as a .pdb (i.e 1ndv_protein_H.pdb) and .mol2 (i.e 1ndv_protein_H.mol2). It is useful to save changes in Chimera in steps as each change is not able to be undone. | ||
+ | |||
+ | 8. Now we can add charges by Tools → Structure Editing → Add Charge and the following dialogue box will show up: | ||
+ | |||
+ | [[File:1ndv_protein_addcharge.png|center|300px]] | ||
+ | |||
+ | You can keep the defaults selected and click "OK". If you have a metal coordination atom in your structure Chimera will ask you to specify the atom's charge. | ||
+ | |||
+ | [[File:1ndv protein addcharge Zn.png|center|500px]] | ||
+ | |||
+ | Once the program is finished in the bottom left hand corner will display what the total charge of the structure is. This number should be an integer. | ||
+ | |||
+ | The protein structure is now completely prepared and ready for docking. | ||
+ | |||
+ | The final step is to save two files, a .pdb and a .mol2 file of this structure (i.e. 1ndv_protein_H_charges.pdb and 1ndv_protein_H_charges.mol2) | ||
=== Ligand === | === Ligand === | ||
+ | We must first create a .pdb file with only the ligand. To do this we must select the ligand in Chimera. There are two ways to do this: | ||
+ | |||
+ | 1.Select an atom on the ligand, and use the up arrow until the entire ligand is selected | ||
+ | |||
+ | -or- | ||
+ | |||
+ | 2. Use the toolbar and click "Select" → "Structure" → "Ligand" | ||
+ | |||
+ | Once the ligand is selected use "Select" → "Invert" to select all other atoms present in the session. | ||
+ | |||
+ | Then use "Actions" → "Atoms/Bonds" → "Delete" to delete everything except the ligand. | ||
+ | |||
+ | It should look like this. | ||
+ | |||
+ | [[File:1ndv-ligand-only.png|thumb|center|upright=1.5]] | ||
+ | |||
+ | You can now save only the ligand in a separate .pbd file. We will name this file 1ndv_ligand_only.pdb. | ||
+ | |||
+ | We will also save the ligand in this state as a .mol2 file. We will name this file 1ndv_no_hydrogens_no_charges.mol2 | ||
+ | |||
+ | We now need to add hydrogens and charges like we did previously for the protein. It is important to be careful and to cross reference the structure of the ligand in the PDB to ensure that hydrogens are being added in the right places, and that the overall charge of the ligand remains the same. After adding hydrogens, your structure should look similar to this: | ||
+ | |||
+ | [[File:1ndv-ligand-only-H.png|thumb|center|upright=1.5]] | ||
+ | |||
+ | We must now look at the structure in the PDB to ensure Chimera is adding the hydrogens in the proper place. To do this, scroll down on the PDB page of the protein until you see the table labeled "Small Molecules". | ||
+ | |||
+ | [[File:1ndv-small-mol.png|center|thumb|upright=1.5]] | ||
+ | |||
+ | Now click on the "2D Diagram" of the ligand to bring up a larger picture. | ||
+ | |||
+ | [[File:1ndv-small-mol-big.png|center|thumb|upright=1.5]] | ||
+ | |||
+ | It appears that Chimera has added an extra Hydrogen to N26. We need to remove that by simply selecting and deleting that hydrogen. The ligand should now look like this: | ||
+ | |||
+ | [[File:1ndv-ligand-with-H.png|thumb|center|upright=1.5]] | ||
+ | |||
+ | We can now add charges to the ligand. When the dialogue box appears, be sure to set the overall charge to 0 for this ligand. With this step complete we can now save a .pdb file and .mol2 with the name 1ndv_ligand_H_charge. | ||
+ | |||
+ | Upload all four of the .mol2 files created during the protein and ligand preparation process to Seawulf for use with DOCK. | ||
+ | |||
== Surface Spheres == | == Surface Spheres == | ||
+ | This section will walk you through the steps necessary to identify the binding site of the protein using a function within DOCK to place surface spheres along the protein. | ||
=== DMS file === | === DMS file === | ||
+ | In Chimera, open the 1ndv_protein_only.pdb file. Go to Actions → Surface → show. This will display the van der Waals surface of your protein. Your image should be similar to: | ||
+ | |||
+ | [[File:1ndv surface.png|center|500px]] | ||
+ | |||
+ | There may be a warning that could pop up about letting you know that a failure for disconnected additional components, such as the zinc atom, occurred. Since we have our desired surface we can continue and close the warning box. | ||
+ | |||
+ | [[File:1ndv surface warning.png|center|300px]] | ||
+ | |||
+ | Next we need to write the .dms file. To do this select Tools → Structure Editing → Write DMS. You will need to give the file a name (i.e. 1ndv_surface.dms). To check if the .dms file represents the surface of the protein well we can overlay the .dms file with the protein_only file. | ||
+ | |||
+ | To do this, close your current session, open the 4s0v_protein_only.pdb and then open the 4s0v_surface.dms. If everything worked as it should you should see an image similar to: | ||
+ | |||
+ | [[File:1ndv surface dms.png|center|500px]] | ||
+ | |||
+ | The small dots (which is the .dms file) is perfectly aligned over the protein structure. Now that we have verified that everything looks good, we can continue. | ||
+ | |||
+ | The 1ndv_surface.dms can be deposited into the folder 002.surface_spheres. | ||
+ | |||
=== sphgen === | === sphgen === | ||
+ | Open your to your 002_surface_spheres directory and create a file to run sphgen: | ||
+ | |||
+ | vi INSP | ||
+ | |||
+ | Then hit the "Insert" key to start inserting text and type: | ||
+ | |||
+ | 1ndv_surface.dms | ||
+ | R | ||
+ | X | ||
+ | 0.0 | ||
+ | 4.0 | ||
+ | 1.4 | ||
+ | 1ndv.sph | ||
+ | |||
+ | Note: If you are running different filename will need to change the name of your files on the first and last line above. Using a text editor like Sublime text or Notepad can make is easier to make these changes to your code and then you can paste into the vi editor when you are ready. | ||
+ | |||
+ | To save your file hit the colon/semi-colon key and type: | ||
+ | |||
+ | :wq! | ||
+ | |||
+ | to save your work. | ||
+ | |||
+ | Then using command line type the command below to run sphgen on your protein. | ||
+ | |||
+ | sphgen -i INSPH -o OUTSPH | ||
+ | |||
+ | Once this is completed you will see the 1ndv_binding_spheres.sph in your directory. To check the binding_sphere.sph file was generated correctly open it in Chimera with the original .pdb file. The two structures should be aligned to each other. The image should be similar to: | ||
+ | |||
+ | [[File:1ndv binding spheres.png|center|500px]] | ||
+ | |||
+ | The binding_spheres.sph overlay well with the original protein pdb and are good to proceed to the next step. | ||
+ | |||
=== sphere selector === | === sphere selector === | ||
+ | This step runs a DOCK program called sphere_selector which is again run on the command line. Return to your 002.surface_spheres directory. Then type: | ||
+ | |||
+ | sphere_selector 1ndv_binding_spheres.sph ../001.structure/1ndv_ligand_H_charge.mol2 10.0 | ||
+ | |||
+ | When this program has successfully completed you'll see a new file in your directory called selected_spheres.sph. This file generated spheres which should line up with the binding site of the protein. To verify this open both the selected_spheres.sph file and your original protein/ligand complex file (1ndv.pdb) in Chimera. You should see the spheres located within the binding site of the protein similar to this picture below: | ||
+ | |||
+ | [[File:1ndv selected spheres.png|center|500px]] | ||
+ | |||
+ | To check the selected spheres are near the ligand you can either: | ||
+ | a. Ctrl-click on a sphere and hit the "Up" arrow until all the spheres are selected. Then go to Actions → Atoms/Bonds → Hide. or | ||
+ | b. On the Model Panel click on the Cluster 1 - selected_spheres.sph and selecte hide. | ||
+ | |||
+ | We can now see our ligand (green) and can confirm the selected_spheres are near our ligand. | ||
+ | |||
+ | [[File:1ndv selected spheres check.png|center|500px]] | ||
+ | |||
== Gridbox == | == Gridbox == | ||
− | === | + | While it is possible to calculate interactions between the ligand and the entire protein surface, this is computationally expensive. In this tutorial we will define a box of interest in the active site of the protein, and DOCK will generate a grid to calculate interaction energies within this box. |
− | === | + | === Generating the Box === |
− | == Energy Minimization | + | To generate the box we use a DOCK program called showbox. This program is accessed through the command line. First navigate to your 003.gridbox directory and create a new file called showbox.in. Place the following commands into the new file: |
+ | Y | ||
+ | 8.0 | ||
+ | ../002.surface_spheres/selected_spheres.sph | ||
+ | 1 | ||
+ | 1ndv.box.pdb | ||
+ | |||
+ | Remember to change "1ndv" to your protein name. This file may need to be modified for your system if you need a larger box. This file currently draws a box of 8.0 angstroms around the selected spheres. To change this, simply change the "8.0" in the input file. | ||
+ | |||
+ | To run this file type: | ||
+ | showbox < showbox.in | ||
+ | If showbox ran successfully you will now have a file named 1ndv.box.pdb in the directory. | ||
+ | |||
+ | === Generating the Grid=== | ||
+ | Now that the box has been defined, we can generate the grid. We do this using a Dock program called grid. Make a new file called grid.in. Place the following commands into the new file: | ||
+ | |||
+ | allow_non_integral_charges no | ||
+ | compute_grids yes | ||
+ | grid_spacing 0.4 | ||
+ | output_molecule no | ||
+ | contact_score no | ||
+ | energy_score yes | ||
+ | energy_cutoff_distance 9999 | ||
+ | atom_model a | ||
+ | attractive_exponent 6 | ||
+ | repulsive_exponent 9 | ||
+ | distance_dielectric yes | ||
+ | dielectric_factor 4. | ||
+ | bump_filter yes | ||
+ | bump_overlap 0.75 | ||
+ | receptor_file ../001.structure/1ndv_protein_only_H_charge.mol2 | ||
+ | box_file 1ndv.box.pdb | ||
+ | vdw_definition_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/vdw_AMBER_parm99.defn | ||
+ | score_grid_prefix grid | ||
+ | |||
+ | Be sure you are using the receptor_file and box_file for your protein. | ||
+ | |||
+ | Run the file with the following command: | ||
+ | grid -i grid.in -o 1ndvGridInfo.out | ||
+ | |||
+ | This program will take a few minutes to run. If it ran succesfully you will see three new files in your direcotry: | ||
+ | #grid.bmp | ||
+ | #grid.nrg | ||
+ | #1ndvGridInfo.out | ||
+ | |||
+ | After successfully generating the box and grid, we can begin energy minimization. | ||
+ | |||
+ | == Ligand Energy Minimization == | ||
+ | DOCK calculates interaction energies between the protein and the ligand. It is important that we first minimize the ligand to a low energy conformer before we dock it into the binding site of the protein to ensure accurate results. | ||
+ | |||
+ | First navigate to the 004.minization directory and create a new file called min.in. Place the follow lines in the new file. | ||
+ | |||
+ | conformer_search_type rigid | ||
+ | use_internal_energy yes | ||
+ | internal_energy_rep_exp 12 | ||
+ | internal_energy_cutoff 100.0 | ||
+ | ligand_atom_file /gpfs/projects/AMS536/2024/students/group_2_1NDV/dock_screen/001.structure/1ndv_ligand_only_H_charge.mol2 | ||
+ | limit_max_ligands no | ||
+ | skip_molecule no | ||
+ | read_mol_solvation no | ||
+ | calculate_rmsd yes | ||
+ | use_rmsd_reference_mol yes | ||
+ | rmsd_reference_filename /gpfs/projects/AMS536/2024/students/group_2_1NDV/dock_screen/001.structure/1ndv_ligand_only_H_charge.mol2 | ||
+ | use_database_filter no | ||
+ | orient_ligand no | ||
+ | bump_filter no | ||
+ | score_molecules yes | ||
+ | contact_score_primary no | ||
+ | grid_score_primary yes | ||
+ | grid_score_rep_rad_scale 1 | ||
+ | grid_score_vdw_scale 1 | ||
+ | grid_score_es_scale 1 | ||
+ | grid_score_grid_prefix /gpfs/projects/AMS536/2024/students/group_2_1NDV/dock_screen/003.gridbox/grid | ||
+ | minimize_ligand yes | ||
+ | simplex_max_iterations 1000 | ||
+ | simplex_tors_premin_iterations 0 | ||
+ | simplex_max_cycles 1 | ||
+ | simplex_score_converge 0.1 | ||
+ | simplex_cycle_converge 1.0 | ||
+ | simplex_trans_step 1.0 | ||
+ | simplex_rot_step 0.1 | ||
+ | simplex_tors_step 10.0 | ||
+ | simplex_random_seed 0 | ||
+ | simplex_restraint_min yes | ||
+ | simplex_coefficient_restraint 10.0 | ||
+ | atom_model all | ||
+ | vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/vdw_AMBER_parm99.defn | ||
+ | flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex.defn | ||
+ | flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex_drive.tbl | ||
+ | ligand_outfile_prefix 1ndv.lig.min | ||
+ | write_orientations no | ||
+ | num_scored_conformers 1 | ||
+ | rank_ligands no | ||
+ | |||
+ | Now run the file with the following command: | ||
+ | dock6 -i min.in -o min.out | ||
+ | |||
+ | If the program ran successfully you will see a 1ndv.lig.min.mol2 file. | ||
+ | |||
+ | Save this file to your local computer and open it in Chimera with the original ligand file. You will see that DOCK has made very slight changes to the strucutre. | ||
+ | |||
+ | [[File:1ndv-min.png|thumb|center|upright=1.5]] | ||
== Footprint Generation == | == Footprint Generation == | ||
+ | Footprint analysis is a method where the intermolecular hydrogen bonds and energy (van der Waals and electrostatics) interactions can be computed between the ligand and each residue of the receptor. It is used to compare how similar two poses or molecules are to each other. | ||
+ | |||
+ | Open your 005.footprint directory and create a footprint input file by the command below: | ||
+ | |||
+ | vi footprint.in | ||
+ | |||
+ | Next enter the following information for your protein. *File names will need to be changed if you are running different files or from different parameter locations. | ||
+ | |||
+ | conformer_search_type rigid | ||
+ | use_internal_energy no | ||
+ | ligand_atom_file .../004.energy_min/1ndv.lig.min_scored.mol2 | ||
+ | limit_max_ligands no | ||
+ | skip_molecule no | ||
+ | read_mol_solvation no | ||
+ | calculate_rmsd no | ||
+ | use_database_filter no | ||
+ | orient_ligand no | ||
+ | bump_filter no | ||
+ | score_molecules yes | ||
+ | contact_score_primary no | ||
+ | grid_score_primary no | ||
+ | multigrid_score_primary no | ||
+ | dock3.5_score_primary no | ||
+ | continuous_score_primary no | ||
+ | footprint_similarity_score_primary yes | ||
+ | fps_score_use_footprint_reference_mol2 yes | ||
+ | fps_score_footprint_reference_mol2_filename .../001.structure/1ndv_ligand_only_H_charge.mol2 | ||
+ | fps_score_foot_compare_type Euclidean | ||
+ | fps_score_normalize_foot no | ||
+ | fps_score_foot_comp_all_residue yes | ||
+ | fps_score_receptor_filename .../001.structure/1ndv_protein_only_H_charge.mol2 | ||
+ | fps_score_vdw_att_exp 6 | ||
+ | fps_score_vdw_rep_exp 9 | ||
+ | fps_score_vdw_rep_rad_scale 1 | ||
+ | fps_score_use_distance_dependent_dielectric yes | ||
+ | fps_score_dielectric 4.0 | ||
+ | fps_score_vdw_fp_scale 1 | ||
+ | fps_score_es_fp_scale 1 | ||
+ | fps_score_hb_fp_scale 0 | ||
+ | minimize_ligand no | ||
+ | atom_model all | ||
+ | vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn | ||
+ | flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn | ||
+ | flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl | ||
+ | ligand_outfile_prefix 1ndv_footprint.out | ||
+ | write_footprints yes | ||
+ | write_hbonds yes | ||
+ | write_orientations no | ||
+ | num_scored_conformers 1 | ||
+ | rank_ligands no | ||
+ | |||
+ | Next use the command below to generate your footprint analysis: | ||
+ | |||
+ | dock6 -i footprint.in | ||
+ | |||
+ | If the program ran successfully the following three files will be now in your directory: | ||
+ | |||
+ | - 1ndv_footprint.out_footprint_scored.txt | ||
+ | |||
+ | - 1ndvv_footprint.out_hbond_scored.txt | ||
+ | |||
+ | - 1ndv_footprint.out_scored.mol2 | ||
+ | |||
+ | There is a python script from the Rizzo group that can be used to generate a graph of the footprint analysis "plot_footprint_single_magnitude.py". This needs to copied into your 005.footprint directory and can be run using the following command: | ||
+ | |||
+ | module load python | ||
+ | |||
+ | python plot_footprint_single_magnitude.py 1ndv_footprint.out_footprint_scored.txt 50 | ||
+ | |||
+ | This will generate a .pdf file of your ligand's footprint in your directory. The 50 variable at the end of the code is adjustable to show the top specified number of residues. | ||
+ | |||
+ | [[File:1ndv footprint analysis.png|center|650px]] | ||
== Rigid Docking == | == Rigid Docking == | ||
+ | The first docking method we will be using is rigid docking. In this method, the ligand is treated as a rigid structure. DOCK attempts to fit the ligand into the binding site using the structure provided, and different ligand conformations are not sampled. For this step we will be working in the 0006.rigid_docking directory and will be using the minimized ligand structure. | ||
+ | |||
+ | Create a new file called rigid.in and place the following lines into it: | ||
+ | |||
+ | conformer_search_type rigid | ||
+ | use_internal_energy yes | ||
+ | internal_energy_rep_exp 12 | ||
+ | internal_energy_cutoff 100.0 | ||
+ | ligand_atom_file ../004.energy_min/1ndv.lig.min_scored.mol2 | ||
+ | limit_max_ligands no | ||
+ | skip_molecule no | ||
+ | read_mol_solvation no | ||
+ | calculate_rmsd yes | ||
+ | use_rmsd_reference_mol yes | ||
+ | rmsd_reference_filename ../004.energy_min/1ndv.lig.min_scored.mol2 | ||
+ | use_database_filter no | ||
+ | orient_ligand yes | ||
+ | automated_matching yes | ||
+ | receptor_site_file ../002.surface_spheres/selected_spheres.sph | ||
+ | max_orientations 1000 | ||
+ | critical_points no | ||
+ | chemical_matching no | ||
+ | use_ligand_spheres no | ||
+ | bump_filter no | ||
+ | score_molecules yes | ||
+ | contact_score_primary no | ||
+ | grid_score_primary yes | ||
+ | grid_score_rep_rad_scale 1 | ||
+ | grid_score_vdw_scale 1 | ||
+ | grid_score_es_scale 1 | ||
+ | grid_score_grid_prefix ../003.gridbox/grid | ||
+ | minimize_ligand yes | ||
+ | simplex_max_iterations 1000 | ||
+ | simplex_tors_premin_iterations 0 | ||
+ | simplex_max_cycles 1 | ||
+ | simplex_score_converge 0.1 | ||
+ | simplex_cycle_converge 1.0 | ||
+ | simplex_trans_step 1.0 | ||
+ | simplex_rot_step 0.1 | ||
+ | simplex_tors_step 10.0 | ||
+ | simplex_random_seed 0 | ||
+ | simplex_restraint_min no | ||
+ | atom_model all | ||
+ | vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/vdw_AMBER_parm99.defn | ||
+ | flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex.defn | ||
+ | flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex_drive.tbl | ||
+ | ligand_outfile_prefix rigid.out | ||
+ | write_orientations no | ||
+ | num_scored_conformers 10 | ||
+ | write_conformations yes | ||
+ | cluster_conformations yes | ||
+ | cluster_rmsd_threshold 2.0 | ||
+ | rank_ligands no | ||
+ | |||
+ | Remember to change paths so that they point to the files on your system. Now run the program using the following command: | ||
+ | |||
+ | dock6 -i rigid.in -o rigid.out | ||
+ | |||
+ | This step may take a few minutes. You will know it ran successfully when you see a file named rigid.out_scored.mol2. This file will contain your docked ligand in 10 orientations. Let's open this file with the minimized ligand file. | ||
+ | |||
+ | First, open the minimized ligand file in Chimera normally. We will use the View Dock tool in Chimera to view the different orientations of the rigid docking results. To do this click "Tools"→ "Surface/Binding Analysis" → "ViewDock". then select "Dock 4,5 or 6" as the file type. Now select "Column" → "Show" → "Grid Score" so we can see how DOCK scored these orientations. You can now view each orientation individually or a selection of them all at once. Here is our top scoring orientation and our minimized ligand. | ||
+ | |||
+ | [[File:1ndv-rigid.png|thumb|center|upright=2.5]] | ||
+ | |||
== Fixed Anchor Docking == | == Fixed Anchor Docking == | ||
+ | Fixed anchor docking allows some heavy atoms to move via rotatable bonds while keeping the majority of the ligand's position fixed during docking. | ||
+ | |||
+ | To model fixed anchor docking, open your 007.fixed_anchor_docking directory and create an input file by the command below: | ||
+ | |||
+ | vi fixed.in | ||
+ | |||
+ | Then type the following lines: | ||
+ | - File names will need to be changed if you are running different files or from different parameter locations. | ||
+ | |||
+ | conformer_search_type flex | ||
+ | write_fragment_libraries no | ||
+ | user_specified_anchor no | ||
+ | limit_max_anchors no | ||
+ | min_anchor_size 5 | ||
+ | pruning_use_clustering yes | ||
+ | pruning_max_orients 1000 | ||
+ | pruning_clustering_cutoff 100 | ||
+ | pruning_conformer_score_cutoff 100.0 | ||
+ | pruning_conformer_score_scaling_factor 1.0 | ||
+ | use_clash_overlap no | ||
+ | write_growth_tree no | ||
+ | use_internal_energy yes | ||
+ | internal_energy_rep_exp 12 | ||
+ | internal_energy_cutoff 100.0 | ||
+ | ligand_atom_file ../004.energy_min/1ndv.lig.min_scored.mol2 | ||
+ | limit_max_ligands no | ||
+ | skip_molecule no | ||
+ | read_mol_solvation no | ||
+ | calculate_rmsd yes | ||
+ | use_rmsd_reference_mol yes | ||
+ | rmsd_reference_filename ../004.energy_min/1ndv.lig.min_scored.mol2 | ||
+ | use_database_filter no | ||
+ | orient_ligand no | ||
+ | bump_filter no | ||
+ | score_molecules yes | ||
+ | contact_score_primary no | ||
+ | grid_score_primary yes | ||
+ | grid_score_rep_rad_scale 1 | ||
+ | grid_score_vdw_scale 1 | ||
+ | grid_score_es_scale 1 | ||
+ | grid_score_grid_prefix ../003.gridbox/grid | ||
+ | minimize_ligand yes | ||
+ | minimize_anchor yes | ||
+ | minimize_flexible_growth yes | ||
+ | use_advanced_simplex_parameters no | ||
+ | minimize_flexible_growth_ramp no | ||
+ | simplex_max_cycles 1 | ||
+ | simplex_score_converge 0.1 | ||
+ | simplex_cycle_converge 1 | ||
+ | simplex_trans_step 1 | ||
+ | simplex_rot_step 0.1 | ||
+ | simplex_tors_step 10.0 | ||
+ | simplex_anchor_max_iterations 500 | ||
+ | simplex_grow_max_iterations 500 | ||
+ | simplex_grow_tors_premin_iterations 0 | ||
+ | simplex_random_seed 0 | ||
+ | simplex_restraint_min no | ||
+ | atom_model all | ||
+ | vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn | ||
+ | flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn | ||
+ | flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl | ||
+ | ligand_outfile_prefix 1ndv_fixed_output | ||
+ | write_orientations no | ||
+ | num_scored_conformers 10 | ||
+ | write_conformations yes | ||
+ | cluster_conformations yes | ||
+ | cluster_rmsd_threshold 2.0 | ||
+ | rank_ligands no | ||
+ | |||
+ | Also note the option below allows for 10 conformers to be generated: | ||
+ | |||
+ | num_scored_conformers 10 | ||
+ | |||
+ | To your input file type: | ||
+ | |||
+ | dock6 -i fixed.in -o fixed.out | ||
+ | |||
+ | Once the job is complete, you will see three new files in your directory: | ||
+ | |||
+ | -fixed.out | ||
+ | |||
+ | -1ndv_fixed_output_scored.mol2 | ||
+ | |||
+ | -1ndv_fixed_output_conformers.mol2 | ||
+ | |||
+ | Next return to your Chimera session and load your original pdb (1ndv.pdb). | ||
+ | |||
+ | To open multiple conformers go to Surface/Binding Analysis → ViewDock and select Dock 6 option to open your multiple conformations in your file 1ndv_fixed_output_scored.mol2. You will see a window like one shown below and can select through the various docking poses. To see Grid_Scores, go to Column → Show → Grid_Score. | ||
+ | |||
+ | [[File:1ndv viewdock.png|center|500px]] | ||
+ | |||
+ | Our top pose is skewed due to hydrogens on the ligand looking for form coordination with the Zn atom, whereas our second best grid score pose recapitulates the crystal structure well. | ||
+ | |||
+ | [[File:1ndv fixed secondGS.png|center|750px]] | ||
+ | |||
== Flex Docking == | == Flex Docking == | ||
+ | The last docking method will we try is called flexible docking. This method is the most computationally expensive, but also samples the most conformational space. In this method DOCK will sample conformations of all rotatable bonds and will attempt to minimize the energy of the ligand. Navigate to the 008.flex_docking directory and create a new file with the name flex.in. Add the following lines to the new file: | ||
+ | conformer_search_type flex | ||
+ | write_fragment_libraries no | ||
+ | user_specified_anchor no | ||
+ | limit_max_anchors no | ||
+ | min_anchor_size 5 | ||
+ | pruning_use_clustering yes | ||
+ | pruning_max_orients 1000 | ||
+ | pruning_clustering_cutoff 100 | ||
+ | pruning_conformer_score_cutoff 100.0 | ||
+ | pruning_conformer_score_scaling_factor 1.0 | ||
+ | use_clash_overlap no | ||
+ | write_growth_tree no | ||
+ | use_internal_energy yes | ||
+ | internal_energy_rep_exp 12 | ||
+ | internal_energy_cutoff 100.0 | ||
+ | ligand_atom_file ../004.energy_min/1ndv.lig.min_scored.mol2 | ||
+ | limit_max_ligands no | ||
+ | skip_molecule no | ||
+ | read_mol_solvation no | ||
+ | calculate_rmsd yes | ||
+ | use_rmsd_reference_mol yes | ||
+ | rmsd_reference_filename | ||
+ | ../004.energy_min/1ndv.lig.min_scored.mol2 | ||
+ | use_database_filter no | ||
+ | orient_ligand yes | ||
+ | automated_matching yes | ||
+ | receptor_site_file | ||
+ | ../002.surface_spheres/selected_spheres.sph | ||
+ | max_orientations 1000 | ||
+ | critical_points no | ||
+ | chemical_matching no | ||
+ | use_ligand_spheres no | ||
+ | bump_filter no | ||
+ | score_molecules yes | ||
+ | contact_score_primary no | ||
+ | grid_score_primary yes | ||
+ | grid_score_rep_rad_scale 1 | ||
+ | grid_score_vdw_scale 1 | ||
+ | grid_score_es_scale 1 | ||
+ | grid_score_grid_prefix ../003.gridbox/grid | ||
+ | minimize_ligand yes | ||
+ | minimize_anchor yes | ||
+ | minimize_flexible_growth yes | ||
+ | use_advanced_simplex_parameters no | ||
+ | minimize_flexible_growth_ramp no | ||
+ | simplex_max_cycles 1 | ||
+ | simplex_score_converge 0.1 | ||
+ | simplex_cycle_converge 1.0 | ||
+ | simplex_trans_step 1.0 | ||
+ | simplex_rot_step 0.1 | ||
+ | simplex_tors_step 10.0 | ||
+ | simplex_anchor_max_iterations 500 | ||
+ | simplex_grow_max_iterations 500 | ||
+ | simplex_grow_tors_premin_iterations 0 | ||
+ | simplex_random_seed 0 | ||
+ | simplex_restraint_min no | ||
+ | atom_model all | ||
+ | vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/vdw_AMBER_parm99.defn | ||
+ | flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex.defn | ||
+ | flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex_drive.tbl | ||
+ | ligand_outfile_prefix flex.out | ||
+ | write_orientations no | ||
+ | num_scored_conformers 1 | ||
+ | rank_ligands no | ||
+ | |||
+ | This file will only save 1 conformation. This can be changed by adjusting the "write_orientations" parameter. Use the following command to run the docking: | ||
+ | |||
+ | dock6 -i flex.in -o flex.out | ||
+ | |||
+ | You will know the program ran successfully when you se a flex.out_scored.mol2 file. Save this file on your local system and open it in Chimera along with the minimized ligand file. We can open both of these files normally as we only saved 1 conformation. This is what our docking result looks like in Chimera: | ||
+ | |||
+ | [[File:1ndv-ligand-flex.png|thumb|center|upright=1.5]] | ||
+ | |||
== Virtual Screening == | == Virtual Screening == | ||
+ | Virtual Screening is a method that can dock several hundred to thousands of compounds into a receptor. | ||
+ | |||
+ | To set up your virtual screening script, open your 009.virtual_screening directory and create an input file by the following command below: | ||
+ | |||
+ | vi virtual.in | ||
+ | |||
+ | Then type the following lines: | ||
+ | - File names will need to be changed if you are running different files or from different parameter locations. | ||
+ | |||
+ | conformer_search_type flex | ||
+ | write_fragment_libraries no | ||
+ | user_specified_anchor no | ||
+ | limit_max_anchors no | ||
+ | min_anchor_size 5 | ||
+ | pruning_use_clustering yes | ||
+ | pruning_max_orients 1000 | ||
+ | pruning_clustering_cutoff 100 | ||
+ | pruning_conformer_score_cutoff 100.0 | ||
+ | pruning_conformer_score_scaling_factor 1.0 | ||
+ | use_clash_overlap no | ||
+ | write_growth_tree no | ||
+ | use_internal_energy yes | ||
+ | internal_energy_rep_exp 12 | ||
+ | internal_energy_cutoff 100.0 | ||
+ | ligand_atom_file /gpfs/projects/AMS536/zzz.programs/VS_libraries/VS_library_100.mol2 | ||
+ | limit_max_ligands no | ||
+ | skip_molecule no | ||
+ | read_mol_solvation no | ||
+ | calculate_rmsd no | ||
+ | use_database_filter no | ||
+ | orient_ligand yes | ||
+ | automated_matching yes | ||
+ | receptor_site_file ../002.surface_spheres/selected_spheres.sph | ||
+ | max_orientations 1000 | ||
+ | critical_points no | ||
+ | chemical_matching no | ||
+ | use_ligand_spheres no | ||
+ | bump_filter no | ||
+ | score_molecules yes | ||
+ | contact_score_primary no | ||
+ | grid_score_primary yes | ||
+ | grid_score_rep_rad_scale 1 | ||
+ | grid_score_vdw_scale 1 | ||
+ | grid_score_es_scale 1 | ||
+ | grid_score_grid_prefix ../003.gridbox/grid | ||
+ | minimize_ligand yes | ||
+ | minimize_anchor yes | ||
+ | minimize_flexible_growth yes | ||
+ | use_advanced_simplex_parameters no | ||
+ | minimize_flexible_growth_ramp yes | ||
+ | simplex_max_cycles 1 | ||
+ | simplex_score_converge 0.1 | ||
+ | simplex_initial_score_coverge 5 | ||
+ | simplex_cycle_converge 1.0 | ||
+ | simplex_trans_step 1.0 | ||
+ | simplex_rot_step 0.1 | ||
+ | simplex_tors_step 10.0 | ||
+ | simplex_anchor_max_iterations 500 | ||
+ | simplex_grow_max_iterations 250 | ||
+ | simplex_grow_tors_premin_iterations 0 | ||
+ | simplex_random_seed 0 | ||
+ | simplex_restraint_min no | ||
+ | atom_model all | ||
+ | vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/vdw_AMBER_parm99.defn | ||
+ | flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex.defn | ||
+ | flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex_drive.tbl | ||
+ | ligand_outfile_prefix virtual.out | ||
+ | write_orientations no | ||
+ | num_scored_conformers 1 | ||
+ | rank_ligands no | ||
+ | |||
=== Slurm === | === Slurm === | ||
+ | Slurm is a scheduler that allows jobs to be managed on a cluster without gumming up the flow or resources of the computing cluster. It is important to use slurm scripts on jobs that will either take longer than a few minutes or need debugging. | ||
+ | |||
+ | To create our slurm script, you can vi your file name: | ||
+ | |||
+ | vi 1ndv_virtual.slurm | ||
+ | |||
+ | and then type the following below: | ||
+ | - File names and queue details will need to be changed if you are running different files or from different parameter locations. | ||
+ | |||
+ | #!/bin/bash | ||
+ | # | ||
+ | #SBATCH --job-name=1ndv_vs_tutorial | ||
+ | #SBATCH --output=vs_output.txt | ||
+ | #SBATCH --ntasks=24 | ||
+ | #SBATCH --nodes=6 | ||
+ | #SBATCH --time=4:00:00 | ||
+ | #SBATCH -p short-28core | ||
+ | |||
+ | module load intel/mpi/64/2018/18.0.3 | ||
+ | |||
+ | mpirun -np 144 dock6.mpi -i virtual.in -o virtual.out | ||
+ | |||
+ | To get this script to run, on the command line, type: | ||
+ | |||
+ | module load slurm | ||
+ | sbatch 1ndv_virtual.slurm | ||
+ | |||
+ | If for any reason you need to cancel your job, you can use: | ||
+ | scancel (followed by your jobid) | ||
== Cartesian Minimization == | == Cartesian Minimization == | ||
+ | In this step we will minimize all the molecules that were docked during the virtual screen. Navigate to the 010.cartesian_minimization directory and create a new file called min.in. Add the following lines to the new file: | ||
+ | |||
+ | conformer_search_type rigid | ||
+ | use_internal_energy yes | ||
+ | internal_energy_rep_exp 12 | ||
+ | internal_energy_cutoff 100.0 | ||
+ | ligand_atom_file ../009.virtual_screening/virtual.out.mol2 | ||
+ | limit_max_ligands no | ||
+ | skip_molecule no | ||
+ | read_mol_solvation no | ||
+ | calculate_rmsd no | ||
+ | use_database_filter no | ||
+ | orient_ligand no | ||
+ | bump_filter no | ||
+ | score_molecules yes | ||
+ | contact_score_primary no | ||
+ | grid_score_primary no | ||
+ | multigrid_score_primary no | ||
+ | dock3.5_score_primary no | ||
+ | continuous_score_primary yes | ||
+ | cont_score_rec_filename ../001.structure/1ndv_protein_only_H_charge.mol2 | ||
+ | cont_score_att_exp 6 | ||
+ | cont_score_rep_exp 12 | ||
+ | cont_score_rep_rad_scale 1 | ||
+ | cont_score_use_dist_dep_dielectric yes | ||
+ | cont_score_dielectric 4.0 | ||
+ | cont_score_vdw_scale 1 | ||
+ | cont_score_es_scale 1 | ||
+ | minimize_ligand yes | ||
+ | simplex_max_iterations 1000 | ||
+ | simplex_tors_premin_iterations 0 | ||
+ | simplex_max_cycles 1 | ||
+ | simplex_score_converge 0.1 | ||
+ | simplex_cycle_converge 1.0 | ||
+ | simplex_trans_step 1.0 | ||
+ | simplex_rot_step 0.1 | ||
+ | simplex_tors_step 10.0 | ||
+ | simplex_random_seed 0 | ||
+ | simplex_restraint_min no | ||
+ | atom_model all | ||
+ | vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/vdw_AMBER_parm99.defn | ||
+ | flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex.defn | ||
+ | flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex_drive.tbl | ||
+ | ligand_outfile_prefix 1ndv.virtual_screen.min | ||
+ | write_orientations no | ||
+ | num_scored_conformers 1 | ||
+ | rank_ligands no | ||
+ | |||
+ | Do NOT run this on the login node. Create a slurm script similar to this one which we tiltled run.min: | ||
+ | |||
+ | #!/bin/bash | ||
+ | # | ||
+ | #SBATCH --job-name=1ndv_minimization_tutorial | ||
+ | #SBATCH --output=vs_output.txt | ||
+ | #SBATCH --ntasks=28 | ||
+ | #SBATCH --nodes=6 | ||
+ | #SBATCH --time=48:00:00 | ||
+ | #SBATCH -p long-28core | ||
+ | |||
+ | dock6 -i min.in -o min.out | ||
+ | |||
+ | Once this is complete, we will rescore the molecules based on their minimized structures. | ||
+ | |||
== reScore == | == reScore == | ||
+ | For reScoring the virtual screen to find molecule rankings, we can create an input file below: | ||
+ | |||
+ | vi rescore.in | ||
+ | |||
+ | conformer_search_type rigid | ||
+ | use_internal_energy yes | ||
+ | internal_energy_rep_exp 12 | ||
+ | internal_energy_cutoff 100.0 | ||
+ | ligand_atom_file ../009.virtual_screening/virtual.out.mol2 | ||
+ | limit_max_ligands no | ||
+ | skip_molecule no | ||
+ | read_mol_solvation no | ||
+ | calculate_rmsd no | ||
+ | use_database_filter no | ||
+ | orient_ligand no | ||
+ | bump_filter no | ||
+ | score_molecules yes | ||
+ | contact_score_primary no | ||
+ | grid_score_primary no | ||
+ | multigrid_score_primary no | ||
+ | dock3.5_score_primary no | ||
+ | continuous_score_primary no | ||
+ | footprint_similarity_score_primary no | ||
+ | pharmacophore_score_primary no | ||
+ | hbond_score_primary no | ||
+ | internal_energy_score_primary no | ||
+ | descriptor_score_primary yes | ||
+ | descriptor_use_grid_score no | ||
+ | descriptor_use_multigrid_score no | ||
+ | descriptor_use_continuous_score no | ||
+ | descriptor_use_footprint_similarity yes | ||
+ | descriptor_use_pharmacophore_score yes | ||
+ | descriptor_use_tanimoto no | ||
+ | descriptor_use_hungarian yes | ||
+ | descriptor_use_volume_overlap yes | ||
+ | descriptor_fps_score_use_footprint_reference_mol2 yes | ||
+ | descriptor_fps_score_footprint_reference_mol2_filename ../004.energy_min/1ndv.lig.min_scored.mol2 | ||
+ | descriptor_fps_score_foot_compare_type Euclidean | ||
+ | descriptor_fps_score_normalize_foot no | ||
+ | descriptor_fps_score_foot_comp_all_residue yes | ||
+ | descriptor_fps_score_receptor_filename ../001.structure/1ndv_protein_only_H_charge.mol2 | ||
+ | descriptor_fps_score_vdw_att_exp 6 | ||
+ | descriptor_fps_score_vdw_rep_exp 12 | ||
+ | descriptor_fps_score_vdw_rep_rad_scale 1 | ||
+ | descriptor_fps_score_use_distance_dependent_dielectric yes | ||
+ | descriptor_fps_score_dielectric 4.0 | ||
+ | descriptor_fps_score_vdw_fp_scale 1 | ||
+ | descriptor_fps_score_es_fp_scale 1 | ||
+ | descriptor_fps_score_hb_fp_scale 0 | ||
+ | descriptor_fms_score_use_ref_mol2 yes | ||
+ | descriptor_fms_score_ref_mol2_filename ../004.energy_min/1ndv.lig.min_scored.mol2 | ||
+ | descriptor_fms_score_write_reference_pharmacophore_mol2 no | ||
+ | descriptor_fms_score_write_reference_pharmacophore_txt no | ||
+ | descriptor_fms_score_write_candidate_pharmacophore no | ||
+ | descriptor_fms_score_write_matched_pharmacophore no | ||
+ | descriptor_fms_score_compare_type overlap | ||
+ | descriptor_fms_score_full_match yes | ||
+ | descriptor_fms_score_match_rate_weight 5.0 | ||
+ | descriptor_fms_score_match_dist_cutoff 1.0 | ||
+ | descriptor_fms_score_match_proj_cutoff 0.7071 | ||
+ | descriptor_fms_score_max_score 20 | ||
+ | descriptor_hms_score_ref_filename ../004.energy_min/1ndv.lig.min_scored.mol2 | ||
+ | descriptor_hms_score_matching_coeff -5 | ||
+ | descriptor_hms_score_rmsd_coeff 1 | ||
+ | descriptor_volume_score_reference_mol2_filename ../004.energy_min/1ndv.lig.min_scored.mol2 | ||
+ | descriptor_volume_score_overlap_compute_method analytical | ||
+ | descriptor_weight_fps_score 1 | ||
+ | descriptor_weight_pharmacophore_score 1 | ||
+ | descriptor_weight_hms_score 1 | ||
+ | descriptor_weight_volume_overlap_score -1 | ||
+ | minimize_ligand no | ||
+ | atom_model all | ||
+ | vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/vdw_AMBER_parm99.defn | ||
+ | flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex.defn | ||
+ | flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex_drive.tbl | ||
+ | chem_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/chem.defn | ||
+ | pharmacophore_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/ph4.defn | ||
+ | ligand_outfile_prefix 1ndv.output | ||
+ | write_footprints yes | ||
+ | write_hbonds yes | ||
+ | write_orientations no | ||
+ | num_scored_conformers 1 | ||
+ | rank_ligands no | ||
+ | |||
+ | This file will need to be run using slurm. | ||
+ | |||
+ | vi 1ndv_rescore.slurm | ||
+ | |||
+ | #!/bin/bash | ||
+ | # | ||
+ | #SBATCH --job-name=1ndv_rescoring | ||
+ | #SBATCH --output=rescoring.txt | ||
+ | #SBATCH --ntasks-per-node=24 | ||
+ | #SBATCH --nodes=6 | ||
+ | #SBATCH --time=1:00:00 | ||
+ | #SBATCH -p debug-28core | ||
+ | |||
+ | module load intel/mpi/64/2018/18.0.3 | ||
+ | |||
+ | mpirun -np 144 dock6.mpi -i rescore.in -o rescore.out | ||
+ | |||
+ | and can be run using the command | ||
+ | |||
+ | sbatch 1ndv_rescore.slurm | ||
+ | |||
+ | Once the job is completed 1ndv.output_scored.mol2 will appear in your directory. Open this using the ViewDock analysis to compare molecules using various descriptors. |
Latest revision as of 13:43, 6 May 2024
Contents
Introduction
DOCK is a powerful computational tool used in drug design. "Docking" refers to the identification of low energy binding conformations of a small molecule, or ligand, in an active site. There are three methods in which DOCK performs docking: rigid, fixed anchor, and flexible. DOCK also has the ability to perform a virtual screen, which is analogous to a high throughput screen often performed in wet labs. This method allows for thousands of molecules to be docked and ranked so that the highest scoring molecules can move to the next stage of testing. A virtual screen can save both time and money when compared to a traditional high throughput screen. This tutorial will guide you through the process of performing each method of docking, a virtual screen, and associated analysis using PDB 1NDV on a HPC cluster (Seawulf). Replace 1NDV with your target protein in this tutorial.
Required Software and Skills
In order to complete this tutorial UCSF Chimera is required. Chimera is a powerful visualization tool that will aid in the preparation and analysis of the virtual screen. Please note that the newer version, Chimera X, can not be used for this tutorial as it lacks certain integrations with DOCK. While this tutorial will walk you through all steps necessary to use Chimera, it may be helpful to view this tutorial for a more in depth guide on the program.
It is also recommended that you have experience with Unix and vi. A Unix tutorial can be found on the site here and a vi tutorial can be found here.
Set-up
Before we begin, we need to set up our directories and download our file from the Protein Data Bank (PDB).
Creating Directories
Create a parent directory which will contain all subdirectors for each docking step. For this tutorial we will name the parent directory 1NDV_DOCK_VS. Now create the subdirectories using the following scheme.
Downloading PDB File
Visit the RCSB Protein Data Bank (PDB) website and use the search bar to search for your target protein. As stated before, we will be using 1NDV for this tutorial.
The page for the protein will look like this. Click the "Download Files" dropdown menu, and download the "PDB Format".
With the PDB file saved, we are now ready to begin generating files for DOCK!
Preparing Structure Files
To prepare structure files, the protein and ligand structures need to be prepped. It is important prepare your structure files using the following steps to ensure your files run properly and are considered correct chemical models of your system.
We will prep the protein and ligand structures using Chimera.
Protein
Required file: *Structure file from the RCSB Protein Data Bank (PDB). *For this tutorial, is it 1ndv.pdb.
Analyzing the Protein Structure:
Open 1ndv.pdb in Chimera.
Assess the protein structure:
1. Are there any missing loops?
2. Is there a metal coordination atom forming key interactions with the protein?
The .pdb for 1ndv from the RCSB Protein Data Bank is shown below.
Assessment:
1. No missing loops.
2. Zinc coordination with the protein and a water molecule. This zinc atom must be kept in the protein prep.
Protein Prep
1. Select an atom on the protein
2. Select the entire protein by pressing the "up" arrow key. You will want to keep the Zn in the complex so hold down the "shift" key and click the Zn atom to select it as well.
3. Go to the menu bar Select → Invert(all models). This will invert the selection and select waters, the ligand, and other compounds in the crystal structure that wasn't previously selected.
4. Go to the menu bar Actions → Atoms/Bonds → Delete.
5. Then save this structure with a a file name (i.e. 1ndv_protein_only.pdb). Go to the menu bar File → Save PBD. Your pdb file will now look similar to this:
6. We also need to generate a .mol2 file. So go to File → Save Mol2 and give it a name 1ndv_protein_only.mol2. The previous file as a .pdb extension and will not be overwritten as a mol2 file is separate from mol2 format.
Adding Hydrogens and Charges
We need to prepare the protein with hydrogens and charges. We must add hydrogens before adding charges.
7. Adding hydrogens. To add hydrogen atoms click on: Tools → Structure Editing → AddH. This command will cause the following dialogue box to appear:
a. Click 'OK'. When the program is finished, the bottom left-hand side will say "Hydrogens added". You may not see any changes in the structure. Here we can see hydrogens on residues surround the zinc atom. If so you can make sure by selecting a couple residues and clicking Actions → Atoms/Bonds → Show. You can unselect these residues by clicking on the background.
b. Now once this check is done save your protein file with hydrogens as a .pdb (i.e 1ndv_protein_H.pdb) and .mol2 (i.e 1ndv_protein_H.mol2). It is useful to save changes in Chimera in steps as each change is not able to be undone.
8. Now we can add charges by Tools → Structure Editing → Add Charge and the following dialogue box will show up:
You can keep the defaults selected and click "OK". If you have a metal coordination atom in your structure Chimera will ask you to specify the atom's charge.
Once the program is finished in the bottom left hand corner will display what the total charge of the structure is. This number should be an integer.
The protein structure is now completely prepared and ready for docking.
The final step is to save two files, a .pdb and a .mol2 file of this structure (i.e. 1ndv_protein_H_charges.pdb and 1ndv_protein_H_charges.mol2)
Ligand
We must first create a .pdb file with only the ligand. To do this we must select the ligand in Chimera. There are two ways to do this:
1.Select an atom on the ligand, and use the up arrow until the entire ligand is selected
-or-
2. Use the toolbar and click "Select" → "Structure" → "Ligand"
Once the ligand is selected use "Select" → "Invert" to select all other atoms present in the session.
Then use "Actions" → "Atoms/Bonds" → "Delete" to delete everything except the ligand.
It should look like this.
You can now save only the ligand in a separate .pbd file. We will name this file 1ndv_ligand_only.pdb.
We will also save the ligand in this state as a .mol2 file. We will name this file 1ndv_no_hydrogens_no_charges.mol2
We now need to add hydrogens and charges like we did previously for the protein. It is important to be careful and to cross reference the structure of the ligand in the PDB to ensure that hydrogens are being added in the right places, and that the overall charge of the ligand remains the same. After adding hydrogens, your structure should look similar to this:
We must now look at the structure in the PDB to ensure Chimera is adding the hydrogens in the proper place. To do this, scroll down on the PDB page of the protein until you see the table labeled "Small Molecules".
Now click on the "2D Diagram" of the ligand to bring up a larger picture.
It appears that Chimera has added an extra Hydrogen to N26. We need to remove that by simply selecting and deleting that hydrogen. The ligand should now look like this:
We can now add charges to the ligand. When the dialogue box appears, be sure to set the overall charge to 0 for this ligand. With this step complete we can now save a .pdb file and .mol2 with the name 1ndv_ligand_H_charge.
Upload all four of the .mol2 files created during the protein and ligand preparation process to Seawulf for use with DOCK.
Surface Spheres
This section will walk you through the steps necessary to identify the binding site of the protein using a function within DOCK to place surface spheres along the protein.
DMS file
In Chimera, open the 1ndv_protein_only.pdb file. Go to Actions → Surface → show. This will display the van der Waals surface of your protein. Your image should be similar to:
There may be a warning that could pop up about letting you know that a failure for disconnected additional components, such as the zinc atom, occurred. Since we have our desired surface we can continue and close the warning box.
Next we need to write the .dms file. To do this select Tools → Structure Editing → Write DMS. You will need to give the file a name (i.e. 1ndv_surface.dms). To check if the .dms file represents the surface of the protein well we can overlay the .dms file with the protein_only file.
To do this, close your current session, open the 4s0v_protein_only.pdb and then open the 4s0v_surface.dms. If everything worked as it should you should see an image similar to:
The small dots (which is the .dms file) is perfectly aligned over the protein structure. Now that we have verified that everything looks good, we can continue.
The 1ndv_surface.dms can be deposited into the folder 002.surface_spheres.
sphgen
Open your to your 002_surface_spheres directory and create a file to run sphgen:
vi INSP
Then hit the "Insert" key to start inserting text and type:
1ndv_surface.dms R X 0.0 4.0 1.4 1ndv.sph
Note: If you are running different filename will need to change the name of your files on the first and last line above. Using a text editor like Sublime text or Notepad can make is easier to make these changes to your code and then you can paste into the vi editor when you are ready.
To save your file hit the colon/semi-colon key and type:
:wq!
to save your work.
Then using command line type the command below to run sphgen on your protein.
sphgen -i INSPH -o OUTSPH
Once this is completed you will see the 1ndv_binding_spheres.sph in your directory. To check the binding_sphere.sph file was generated correctly open it in Chimera with the original .pdb file. The two structures should be aligned to each other. The image should be similar to:
The binding_spheres.sph overlay well with the original protein pdb and are good to proceed to the next step.
sphere selector
This step runs a DOCK program called sphere_selector which is again run on the command line. Return to your 002.surface_spheres directory. Then type:
sphere_selector 1ndv_binding_spheres.sph ../001.structure/1ndv_ligand_H_charge.mol2 10.0
When this program has successfully completed you'll see a new file in your directory called selected_spheres.sph. This file generated spheres which should line up with the binding site of the protein. To verify this open both the selected_spheres.sph file and your original protein/ligand complex file (1ndv.pdb) in Chimera. You should see the spheres located within the binding site of the protein similar to this picture below:
To check the selected spheres are near the ligand you can either: a. Ctrl-click on a sphere and hit the "Up" arrow until all the spheres are selected. Then go to Actions → Atoms/Bonds → Hide. or b. On the Model Panel click on the Cluster 1 - selected_spheres.sph and selecte hide.
We can now see our ligand (green) and can confirm the selected_spheres are near our ligand.
Gridbox
While it is possible to calculate interactions between the ligand and the entire protein surface, this is computationally expensive. In this tutorial we will define a box of interest in the active site of the protein, and DOCK will generate a grid to calculate interaction energies within this box.
Generating the Box
To generate the box we use a DOCK program called showbox. This program is accessed through the command line. First navigate to your 003.gridbox directory and create a new file called showbox.in. Place the following commands into the new file:
Y 8.0 ../002.surface_spheres/selected_spheres.sph 1 1ndv.box.pdb
Remember to change "1ndv" to your protein name. This file may need to be modified for your system if you need a larger box. This file currently draws a box of 8.0 angstroms around the selected spheres. To change this, simply change the "8.0" in the input file.
To run this file type:
showbox < showbox.in
If showbox ran successfully you will now have a file named 1ndv.box.pdb in the directory.
Generating the Grid
Now that the box has been defined, we can generate the grid. We do this using a Dock program called grid. Make a new file called grid.in. Place the following commands into the new file:
allow_non_integral_charges no compute_grids yes grid_spacing 0.4 output_molecule no contact_score no energy_score yes energy_cutoff_distance 9999 atom_model a attractive_exponent 6 repulsive_exponent 9 distance_dielectric yes dielectric_factor 4. bump_filter yes bump_overlap 0.75 receptor_file ../001.structure/1ndv_protein_only_H_charge.mol2 box_file 1ndv.box.pdb vdw_definition_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/vdw_AMBER_parm99.defn score_grid_prefix grid
Be sure you are using the receptor_file and box_file for your protein.
Run the file with the following command:
grid -i grid.in -o 1ndvGridInfo.out
This program will take a few minutes to run. If it ran succesfully you will see three new files in your direcotry:
- grid.bmp
- grid.nrg
- 1ndvGridInfo.out
After successfully generating the box and grid, we can begin energy minimization.
Ligand Energy Minimization
DOCK calculates interaction energies between the protein and the ligand. It is important that we first minimize the ligand to a low energy conformer before we dock it into the binding site of the protein to ensure accurate results.
First navigate to the 004.minization directory and create a new file called min.in. Place the follow lines in the new file.
conformer_search_type rigid use_internal_energy yes internal_energy_rep_exp 12 internal_energy_cutoff 100.0 ligand_atom_file /gpfs/projects/AMS536/2024/students/group_2_1NDV/dock_screen/001.structure/1ndv_ligand_only_H_charge.mol2 limit_max_ligands no skip_molecule no read_mol_solvation no calculate_rmsd yes use_rmsd_reference_mol yes rmsd_reference_filename /gpfs/projects/AMS536/2024/students/group_2_1NDV/dock_screen/001.structure/1ndv_ligand_only_H_charge.mol2 use_database_filter no orient_ligand no bump_filter no score_molecules yes contact_score_primary no grid_score_primary yes grid_score_rep_rad_scale 1 grid_score_vdw_scale 1 grid_score_es_scale 1 grid_score_grid_prefix /gpfs/projects/AMS536/2024/students/group_2_1NDV/dock_screen/003.gridbox/grid minimize_ligand yes simplex_max_iterations 1000 simplex_tors_premin_iterations 0 simplex_max_cycles 1 simplex_score_converge 0.1 simplex_cycle_converge 1.0 simplex_trans_step 1.0 simplex_rot_step 0.1 simplex_tors_step 10.0 simplex_random_seed 0 simplex_restraint_min yes simplex_coefficient_restraint 10.0 atom_model all vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/vdw_AMBER_parm99.defn flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex.defn flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex_drive.tbl ligand_outfile_prefix 1ndv.lig.min write_orientations no num_scored_conformers 1 rank_ligands no
Now run the file with the following command:
dock6 -i min.in -o min.out
If the program ran successfully you will see a 1ndv.lig.min.mol2 file.
Save this file to your local computer and open it in Chimera with the original ligand file. You will see that DOCK has made very slight changes to the strucutre.
Footprint Generation
Footprint analysis is a method where the intermolecular hydrogen bonds and energy (van der Waals and electrostatics) interactions can be computed between the ligand and each residue of the receptor. It is used to compare how similar two poses or molecules are to each other.
Open your 005.footprint directory and create a footprint input file by the command below:
vi footprint.in
Next enter the following information for your protein. *File names will need to be changed if you are running different files or from different parameter locations.
conformer_search_type rigid use_internal_energy no ligand_atom_file .../004.energy_min/1ndv.lig.min_scored.mol2 limit_max_ligands no skip_molecule no read_mol_solvation no calculate_rmsd no use_database_filter no orient_ligand no bump_filter no score_molecules yes contact_score_primary no grid_score_primary no multigrid_score_primary no dock3.5_score_primary no continuous_score_primary no footprint_similarity_score_primary yes fps_score_use_footprint_reference_mol2 yes fps_score_footprint_reference_mol2_filename .../001.structure/1ndv_ligand_only_H_charge.mol2 fps_score_foot_compare_type Euclidean fps_score_normalize_foot no fps_score_foot_comp_all_residue yes fps_score_receptor_filename .../001.structure/1ndv_protein_only_H_charge.mol2 fps_score_vdw_att_exp 6 fps_score_vdw_rep_exp 9 fps_score_vdw_rep_rad_scale 1 fps_score_use_distance_dependent_dielectric yes fps_score_dielectric 4.0 fps_score_vdw_fp_scale 1 fps_score_es_fp_scale 1 fps_score_hb_fp_scale 0 minimize_ligand no atom_model all vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl ligand_outfile_prefix 1ndv_footprint.out write_footprints yes write_hbonds yes write_orientations no num_scored_conformers 1 rank_ligands no
Next use the command below to generate your footprint analysis:
dock6 -i footprint.in
If the program ran successfully the following three files will be now in your directory:
- 1ndv_footprint.out_footprint_scored.txt
- 1ndvv_footprint.out_hbond_scored.txt
- 1ndv_footprint.out_scored.mol2
There is a python script from the Rizzo group that can be used to generate a graph of the footprint analysis "plot_footprint_single_magnitude.py". This needs to copied into your 005.footprint directory and can be run using the following command:
module load python
python plot_footprint_single_magnitude.py 1ndv_footprint.out_footprint_scored.txt 50
This will generate a .pdf file of your ligand's footprint in your directory. The 50 variable at the end of the code is adjustable to show the top specified number of residues.
Rigid Docking
The first docking method we will be using is rigid docking. In this method, the ligand is treated as a rigid structure. DOCK attempts to fit the ligand into the binding site using the structure provided, and different ligand conformations are not sampled. For this step we will be working in the 0006.rigid_docking directory and will be using the minimized ligand structure.
Create a new file called rigid.in and place the following lines into it:
conformer_search_type rigid use_internal_energy yes internal_energy_rep_exp 12 internal_energy_cutoff 100.0 ligand_atom_file ../004.energy_min/1ndv.lig.min_scored.mol2 limit_max_ligands no skip_molecule no read_mol_solvation no calculate_rmsd yes use_rmsd_reference_mol yes rmsd_reference_filename ../004.energy_min/1ndv.lig.min_scored.mol2 use_database_filter no orient_ligand yes automated_matching yes receptor_site_file ../002.surface_spheres/selected_spheres.sph max_orientations 1000 critical_points no chemical_matching no use_ligand_spheres no bump_filter no score_molecules yes contact_score_primary no grid_score_primary yes grid_score_rep_rad_scale 1 grid_score_vdw_scale 1 grid_score_es_scale 1 grid_score_grid_prefix ../003.gridbox/grid minimize_ligand yes simplex_max_iterations 1000 simplex_tors_premin_iterations 0 simplex_max_cycles 1 simplex_score_converge 0.1 simplex_cycle_converge 1.0 simplex_trans_step 1.0 simplex_rot_step 0.1 simplex_tors_step 10.0 simplex_random_seed 0 simplex_restraint_min no atom_model all vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/vdw_AMBER_parm99.defn flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex.defn flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex_drive.tbl ligand_outfile_prefix rigid.out write_orientations no num_scored_conformers 10 write_conformations yes cluster_conformations yes cluster_rmsd_threshold 2.0 rank_ligands no
Remember to change paths so that they point to the files on your system. Now run the program using the following command:
dock6 -i rigid.in -o rigid.out
This step may take a few minutes. You will know it ran successfully when you see a file named rigid.out_scored.mol2. This file will contain your docked ligand in 10 orientations. Let's open this file with the minimized ligand file.
First, open the minimized ligand file in Chimera normally. We will use the View Dock tool in Chimera to view the different orientations of the rigid docking results. To do this click "Tools"→ "Surface/Binding Analysis" → "ViewDock". then select "Dock 4,5 or 6" as the file type. Now select "Column" → "Show" → "Grid Score" so we can see how DOCK scored these orientations. You can now view each orientation individually or a selection of them all at once. Here is our top scoring orientation and our minimized ligand.
Fixed Anchor Docking
Fixed anchor docking allows some heavy atoms to move via rotatable bonds while keeping the majority of the ligand's position fixed during docking.
To model fixed anchor docking, open your 007.fixed_anchor_docking directory and create an input file by the command below:
vi fixed.in
Then type the following lines: - File names will need to be changed if you are running different files or from different parameter locations.
conformer_search_type flex write_fragment_libraries no user_specified_anchor no limit_max_anchors no min_anchor_size 5 pruning_use_clustering yes pruning_max_orients 1000 pruning_clustering_cutoff 100 pruning_conformer_score_cutoff 100.0 pruning_conformer_score_scaling_factor 1.0 use_clash_overlap no write_growth_tree no use_internal_energy yes internal_energy_rep_exp 12 internal_energy_cutoff 100.0 ligand_atom_file ../004.energy_min/1ndv.lig.min_scored.mol2 limit_max_ligands no skip_molecule no read_mol_solvation no calculate_rmsd yes use_rmsd_reference_mol yes rmsd_reference_filename ../004.energy_min/1ndv.lig.min_scored.mol2 use_database_filter no orient_ligand no bump_filter no score_molecules yes contact_score_primary no grid_score_primary yes grid_score_rep_rad_scale 1 grid_score_vdw_scale 1 grid_score_es_scale 1 grid_score_grid_prefix ../003.gridbox/grid minimize_ligand yes minimize_anchor yes minimize_flexible_growth yes use_advanced_simplex_parameters no minimize_flexible_growth_ramp no simplex_max_cycles 1 simplex_score_converge 0.1 simplex_cycle_converge 1 simplex_trans_step 1 simplex_rot_step 0.1 simplex_tors_step 10.0 simplex_anchor_max_iterations 500 simplex_grow_max_iterations 500 simplex_grow_tors_premin_iterations 0 simplex_random_seed 0 simplex_restraint_min no atom_model all vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/vdw_AMBER_parm99.defn flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex.defn flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.9_release/parameters/flex_drive.tbl ligand_outfile_prefix 1ndv_fixed_output write_orientations no num_scored_conformers 10 write_conformations yes cluster_conformations yes cluster_rmsd_threshold 2.0 rank_ligands no
Also note the option below allows for 10 conformers to be generated:
num_scored_conformers 10
To your input file type:
dock6 -i fixed.in -o fixed.out
Once the job is complete, you will see three new files in your directory:
-fixed.out
-1ndv_fixed_output_scored.mol2
-1ndv_fixed_output_conformers.mol2
Next return to your Chimera session and load your original pdb (1ndv.pdb).
To open multiple conformers go to Surface/Binding Analysis → ViewDock and select Dock 6 option to open your multiple conformations in your file 1ndv_fixed_output_scored.mol2. You will see a window like one shown below and can select through the various docking poses. To see Grid_Scores, go to Column → Show → Grid_Score.
Our top pose is skewed due to hydrogens on the ligand looking for form coordination with the Zn atom, whereas our second best grid score pose recapitulates the crystal structure well.
Flex Docking
The last docking method will we try is called flexible docking. This method is the most computationally expensive, but also samples the most conformational space. In this method DOCK will sample conformations of all rotatable bonds and will attempt to minimize the energy of the ligand. Navigate to the 008.flex_docking directory and create a new file with the name flex.in. Add the following lines to the new file:
conformer_search_type flex write_fragment_libraries no user_specified_anchor no limit_max_anchors no min_anchor_size 5 pruning_use_clustering yes pruning_max_orients 1000 pruning_clustering_cutoff 100 pruning_conformer_score_cutoff 100.0 pruning_conformer_score_scaling_factor 1.0 use_clash_overlap no write_growth_tree no use_internal_energy yes internal_energy_rep_exp 12 internal_energy_cutoff 100.0 ligand_atom_file ../004.energy_min/1ndv.lig.min_scored.mol2 limit_max_ligands no skip_molecule no read_mol_solvation no calculate_rmsd yes use_rmsd_reference_mol yes rmsd_reference_filename
../004.energy_min/1ndv.lig.min_scored.mol2
use_database_filter no orient_ligand yes automated_matching yes receptor_site_file
../002.surface_spheres/selected_spheres.sph
max_orientations 1000 critical_points no chemical_matching no use_ligand_spheres no bump_filter no score_molecules yes contact_score_primary no grid_score_primary yes grid_score_rep_rad_scale 1 grid_score_vdw_scale 1 grid_score_es_scale 1 grid_score_grid_prefix ../003.gridbox/grid minimize_ligand yes minimize_anchor yes minimize_flexible_growth yes use_advanced_simplex_parameters no minimize_flexible_growth_ramp no simplex_max_cycles 1 simplex_score_converge 0.1 simplex_cycle_converge 1.0 simplex_trans_step 1.0 simplex_rot_step 0.1 simplex_tors_step 10.0 simplex_anchor_max_iterations 500 simplex_grow_max_iterations 500 simplex_grow_tors_premin_iterations 0 simplex_random_seed 0 simplex_restraint_min no atom_model all vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/vdw_AMBER_parm99.defn flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex.defn flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex_drive.tbl ligand_outfile_prefix flex.out write_orientations no num_scored_conformers 1 rank_ligands no
This file will only save 1 conformation. This can be changed by adjusting the "write_orientations" parameter. Use the following command to run the docking:
dock6 -i flex.in -o flex.out
You will know the program ran successfully when you se a flex.out_scored.mol2 file. Save this file on your local system and open it in Chimera along with the minimized ligand file. We can open both of these files normally as we only saved 1 conformation. This is what our docking result looks like in Chimera:
Virtual Screening
Virtual Screening is a method that can dock several hundred to thousands of compounds into a receptor.
To set up your virtual screening script, open your 009.virtual_screening directory and create an input file by the following command below:
vi virtual.in
Then type the following lines: - File names will need to be changed if you are running different files or from different parameter locations.
conformer_search_type flex write_fragment_libraries no user_specified_anchor no limit_max_anchors no min_anchor_size 5 pruning_use_clustering yes pruning_max_orients 1000 pruning_clustering_cutoff 100 pruning_conformer_score_cutoff 100.0 pruning_conformer_score_scaling_factor 1.0 use_clash_overlap no write_growth_tree no use_internal_energy yes internal_energy_rep_exp 12 internal_energy_cutoff 100.0 ligand_atom_file /gpfs/projects/AMS536/zzz.programs/VS_libraries/VS_library_100.mol2 limit_max_ligands no skip_molecule no read_mol_solvation no calculate_rmsd no use_database_filter no orient_ligand yes automated_matching yes receptor_site_file ../002.surface_spheres/selected_spheres.sph max_orientations 1000 critical_points no chemical_matching no use_ligand_spheres no bump_filter no score_molecules yes contact_score_primary no grid_score_primary yes grid_score_rep_rad_scale 1 grid_score_vdw_scale 1 grid_score_es_scale 1 grid_score_grid_prefix ../003.gridbox/grid minimize_ligand yes minimize_anchor yes minimize_flexible_growth yes use_advanced_simplex_parameters no minimize_flexible_growth_ramp yes simplex_max_cycles 1 simplex_score_converge 0.1 simplex_initial_score_coverge 5 simplex_cycle_converge 1.0 simplex_trans_step 1.0 simplex_rot_step 0.1 simplex_tors_step 10.0 simplex_anchor_max_iterations 500 simplex_grow_max_iterations 250 simplex_grow_tors_premin_iterations 0 simplex_random_seed 0 simplex_restraint_min no atom_model all vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/vdw_AMBER_parm99.defn flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex.defn flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex_drive.tbl ligand_outfile_prefix virtual.out write_orientations no num_scored_conformers 1 rank_ligands no
Slurm
Slurm is a scheduler that allows jobs to be managed on a cluster without gumming up the flow or resources of the computing cluster. It is important to use slurm scripts on jobs that will either take longer than a few minutes or need debugging.
To create our slurm script, you can vi your file name:
vi 1ndv_virtual.slurm
and then type the following below: - File names and queue details will need to be changed if you are running different files or from different parameter locations.
#!/bin/bash # #SBATCH --job-name=1ndv_vs_tutorial #SBATCH --output=vs_output.txt #SBATCH --ntasks=24 #SBATCH --nodes=6 #SBATCH --time=4:00:00 #SBATCH -p short-28core module load intel/mpi/64/2018/18.0.3 mpirun -np 144 dock6.mpi -i virtual.in -o virtual.out
To get this script to run, on the command line, type:
module load slurm sbatch 1ndv_virtual.slurm
If for any reason you need to cancel your job, you can use:
scancel (followed by your jobid)
Cartesian Minimization
In this step we will minimize all the molecules that were docked during the virtual screen. Navigate to the 010.cartesian_minimization directory and create a new file called min.in. Add the following lines to the new file:
conformer_search_type rigid use_internal_energy yes internal_energy_rep_exp 12 internal_energy_cutoff 100.0 ligand_atom_file ../009.virtual_screening/virtual.out.mol2 limit_max_ligands no skip_molecule no read_mol_solvation no calculate_rmsd no use_database_filter no orient_ligand no bump_filter no score_molecules yes contact_score_primary no grid_score_primary no multigrid_score_primary no dock3.5_score_primary no continuous_score_primary yes cont_score_rec_filename ../001.structure/1ndv_protein_only_H_charge.mol2 cont_score_att_exp 6 cont_score_rep_exp 12 cont_score_rep_rad_scale 1 cont_score_use_dist_dep_dielectric yes cont_score_dielectric 4.0 cont_score_vdw_scale 1 cont_score_es_scale 1 minimize_ligand yes simplex_max_iterations 1000 simplex_tors_premin_iterations 0 simplex_max_cycles 1 simplex_score_converge 0.1 simplex_cycle_converge 1.0 simplex_trans_step 1.0 simplex_rot_step 0.1 simplex_tors_step 10.0 simplex_random_seed 0 simplex_restraint_min no atom_model all vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/vdw_AMBER_parm99.defn flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex.defn flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex_drive.tbl ligand_outfile_prefix 1ndv.virtual_screen.min write_orientations no num_scored_conformers 1 rank_ligands no
Do NOT run this on the login node. Create a slurm script similar to this one which we tiltled run.min:
#!/bin/bash # #SBATCH --job-name=1ndv_minimization_tutorial #SBATCH --output=vs_output.txt #SBATCH --ntasks=28 #SBATCH --nodes=6 #SBATCH --time=48:00:00 #SBATCH -p long-28core
dock6 -i min.in -o min.out
Once this is complete, we will rescore the molecules based on their minimized structures.
reScore
For reScoring the virtual screen to find molecule rankings, we can create an input file below:
vi rescore.in
conformer_search_type rigid use_internal_energy yes internal_energy_rep_exp 12 internal_energy_cutoff 100.0 ligand_atom_file ../009.virtual_screening/virtual.out.mol2 limit_max_ligands no skip_molecule no read_mol_solvation no calculate_rmsd no use_database_filter no orient_ligand no bump_filter no score_molecules yes contact_score_primary no grid_score_primary no multigrid_score_primary no dock3.5_score_primary no continuous_score_primary no footprint_similarity_score_primary no pharmacophore_score_primary no hbond_score_primary no internal_energy_score_primary no descriptor_score_primary yes descriptor_use_grid_score no descriptor_use_multigrid_score no descriptor_use_continuous_score no descriptor_use_footprint_similarity yes descriptor_use_pharmacophore_score yes descriptor_use_tanimoto no descriptor_use_hungarian yes descriptor_use_volume_overlap yes descriptor_fps_score_use_footprint_reference_mol2 yes descriptor_fps_score_footprint_reference_mol2_filename ../004.energy_min/1ndv.lig.min_scored.mol2 descriptor_fps_score_foot_compare_type Euclidean descriptor_fps_score_normalize_foot no descriptor_fps_score_foot_comp_all_residue yes descriptor_fps_score_receptor_filename ../001.structure/1ndv_protein_only_H_charge.mol2 descriptor_fps_score_vdw_att_exp 6 descriptor_fps_score_vdw_rep_exp 12 descriptor_fps_score_vdw_rep_rad_scale 1 descriptor_fps_score_use_distance_dependent_dielectric yes descriptor_fps_score_dielectric 4.0 descriptor_fps_score_vdw_fp_scale 1 descriptor_fps_score_es_fp_scale 1 descriptor_fps_score_hb_fp_scale 0 descriptor_fms_score_use_ref_mol2 yes descriptor_fms_score_ref_mol2_filename ../004.energy_min/1ndv.lig.min_scored.mol2 descriptor_fms_score_write_reference_pharmacophore_mol2 no descriptor_fms_score_write_reference_pharmacophore_txt no descriptor_fms_score_write_candidate_pharmacophore no descriptor_fms_score_write_matched_pharmacophore no descriptor_fms_score_compare_type overlap descriptor_fms_score_full_match yes descriptor_fms_score_match_rate_weight 5.0 descriptor_fms_score_match_dist_cutoff 1.0 descriptor_fms_score_match_proj_cutoff 0.7071 descriptor_fms_score_max_score 20 descriptor_hms_score_ref_filename ../004.energy_min/1ndv.lig.min_scored.mol2 descriptor_hms_score_matching_coeff -5 descriptor_hms_score_rmsd_coeff 1 descriptor_volume_score_reference_mol2_filename ../004.energy_min/1ndv.lig.min_scored.mol2 descriptor_volume_score_overlap_compute_method analytical descriptor_weight_fps_score 1 descriptor_weight_pharmacophore_score 1 descriptor_weight_hms_score 1 descriptor_weight_volume_overlap_score -1 minimize_ligand no atom_model all vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/vdw_AMBER_parm99.defn flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex.defn flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex_drive.tbl chem_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/chem.defn pharmacophore_defn_file /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/ph4.defn ligand_outfile_prefix 1ndv.output write_footprints yes write_hbonds yes write_orientations no num_scored_conformers 1 rank_ligands no
This file will need to be run using slurm.
vi 1ndv_rescore.slurm
#!/bin/bash # #SBATCH --job-name=1ndv_rescoring #SBATCH --output=rescoring.txt #SBATCH --ntasks-per-node=24 #SBATCH --nodes=6 #SBATCH --time=1:00:00 #SBATCH -p debug-28core module load intel/mpi/64/2018/18.0.3 mpirun -np 144 dock6.mpi -i rescore.in -o rescore.out
and can be run using the command
sbatch 1ndv_rescore.slurm
Once the job is completed 1ndv.output_scored.mol2 will appear in your directory. Open this using the ViewDock analysis to compare molecules using various descriptors.