2023 Denovo tutorial 2 with PDBID 3WZE

From Rizzo_Lab
Revision as of 11:13, 7 May 2023 by Stonybrook (talk | contribs) (Introduction)

Introduction

This tutorial is a continuation of the virtual screening tutorial. In this tutorial, we'll continue to work with the receptor and ligand in PDB 3WZE, and we'll attempt to generate new ligands for the receptor using three kinds of de novo design: de novo refinement, focused de novo design, and generic de novo design.

De novo can be directly translated as "of new", but a more deft translation might be "from the beginning" or "from scratch". In ligand design, it means generating a ligand procedurally with algorithms in programs like DOCK: starting from an initial anchor, the molecule is built outwards one moiety at a time, typically to produce entirely new ligands for a protein.

Generic de novo design best matches the prior description: a pre-selected or random anchor is positioned within the active site of the receptor, and then built outwards in a number of layers occupied by various sampled moieties. Focused de novo design is much like generic de novo design, except that the pool of sampled moieties is restricted to suit the needs of the researcher. Finally, de novo refinement begins with an already discovered ligand: part of the molecule is deleted and replaced with a dummy atom, so that the remainder of the ligand serves as the anchor for the de novo design algorithms to modify.

Directories

Make a new directory for de novo refinement:

 mkdir 009_denovo

De Novo Refinement

Ligand Preparation

1. Open the final, energy-minimized ligand mol2 file which was used for the virtual screen tutorial, and also open the final receptor mol2 file that was used in that screen. Either Chimera or ChimeraX can be used to open the files. As long as no translations or rotations have occurred during the virtual screen process, the ligand should still be in its native orientation within the receptor's active site, as depicted by the original 3WZE PDB file.

2. Examine the binding pocket of the receptor, and choose a part of the 3WZE ligand that faces towards the interior of the binding pocket. The innermost parts of the ligand are the best candidates for deletion because they tend to have the most potential interactions with the protein, which gives the groups tested during de novo design a better chance of interacting with a group on the protein. Deleting a part of the ligand that faces the cytosol, or the channel leading to the cytosol, is less likely to yield new ligands that bind tightly to the interior of the receptor. To help recognize good sites for deletion, show sidechains and hbonds so that you can see which parts of the ligand are interacting with the protein.

Dougdenovo1.png

In this image, one can see the ligand sorafenib and the two hbonds that it forms with the nearby glutamic acid residue 71. It also forms an hbond with the backbone of the receptor using its amide oxygen. Based on this, we'll truncate those two amides and the entire aromatic ring closest to the camera. The camera is positioned to look from the side of the receptor where the binding pocket is deepest, so deleting everything closer than those amides will delete the parts of the ligand which are innermost.

3. Select and hide the receptor.

4. Orient the ligand so that the area you wish to delete is easy to see. Hold control down on your keyboard, then click and drag to cover the area. This should select the area.

Dougdenovo23.png

5. Now deselect the first atom in the highlighted area. We're going to keep this atom so that it can be changed into a dummy atom. This style of de novo design requires a dummy atom to tell DOCK where to try putting new moieties, and it's easier to keep this nitrogen and change it into a dummy than it is to delete the whole selected area then manually attach a dummy.

Dougdenovo3.png

6. Delete the selected area using Actions->Atoms/Bonds->Delete. Alternatively, if you're using ChimeraX, simply type "delete sel" into the command line.

Dougdenovo4.png

You should end up with a molecule that looks like this. Hover your mouse over that nitrogen we spared from deletion, and note its number. In this case the nitrogen is N14.

7. Save this truncated molecule as a mol2 file.

8. Open the mol2 file in a text editor on your desktop, or with a command like "nano" from the command line.

9. Find the line for N14 in the @<TRIPOS>ATOM section. Change its atom name (the second column) from "N14" to "Du1", and change its atom type (the sixth column) to "Du".

Dougdenovo5.png

10. To test whether the mol2 modification worked, open the mol2 with Chimera or ChimeraX. The dummy atom should appear purple or grey, respectively.

Dougdenovo6.png

(Note that this image was taken with ChimeraX)

11. Now that our ligand is prepared, we can move it to Seawulf where we can perform the actual de novo refinement.
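
The hand edit in step 9 can also be scripted. The following is a minimal sketch, not part of DOCK: the function name make_dummy_atom is hypothetical, and it assumes that the edit means setting the atom name (second column) to Du1 and the atom type (sixth column) to Du, which is how dummy atoms appear in DOCK's fragment libraries. It also assumes nine-column atom lines, as in the mol2 files Chimera typically writes.

```shell
# Hypothetical helper: turn one named atom in a mol2 file into a dummy atom.
# Usage: make_dummy_atom INPUT.mol2 ATOM_NAME > OUTPUT.mol2
make_dummy_atom() {
    awk -v target="$2" '
        # Track whether we are inside the @<TRIPOS>ATOM section.
        /^@<TRIPOS>/ { in_atoms = ($1 == "@<TRIPOS>ATOM"); print; next }
        in_atoms && $2 == target {
            # Keep id, coordinates, residue fields, and charge as written;
            # only the atom name (column 2) and atom type (column 6) change.
            printf "%7s %-8s %9s %9s %9s %-5s %5s %-8s %9s\n", \
                $1, "Du1", $3, $4, $5, "Du", $7, $8, $9
            next
        }
        { print }
    ' "$1"
}

# Example (file names are placeholders):
# make_dummy_atom truncated_ligand.mol2 N14 > Chopped_ligand_for_denovo.mol2
```

Writing to a new file leaves the saved truncated ligand untouched, so the result can still be checked in Chimera or ChimeraX as in step 10 before it is moved to Seawulf.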

Running the Refinement

As with the virtual screen, DOCK can be run with an input file, the text of which will be shown below. However, it's a good idea to make your own input file rather than copying what is written here. That way, you can get a sense of what parameters can be adjusted before a de novo refinement run.

1. On the Seawulf command line, type

 touch de_novo_refine.in

The "touch" command creates a blank file directly, without opening an editor as "nano" or "vi" would.

2. To go through the process of answering DOCK's many questions about your run, and to subsequently generate an input file, type

 dock6 -i de_novo_refine.in

3. Answer the questions. Our input file is as follows:

 conformer_search_type                                        denovo
 dn_fraglib_scaffold_file                                     /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/fraglib_scaffold.mol2
 dn_fraglib_linker_file                                       /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/fraglib_linker.mol2
 dn_fraglib_sidechain_file                                    /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/fraglib_sidechain.mol2
 dn_user_specified_anchor                                     yes
 dn_fraglib_anchor_file                                       Chopped_ligand_for_denovo.mol2
 dn_torenv_table                                              /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/fraglib_torenv.dat
 dn_name_identifier                                           3WZE_refine
 dn_sampling_method                                           graph
 dn_graph_max_picks                                           30
 dn_graph_breadth                                             3
 dn_graph_depth                                               2
 dn_graph_temperature                                         100.0
 dn_pruning_conformer_score_cutoff                            100.0
 dn_pruning_conformer_score_scaling_factor                    2.0
 dn_pruning_clustering_cutoff                                 100.0
 dn_mol_wt_cutoff_type                                        soft
 dn_upper_constraint_mol_wt                                   1000
 dn_lower_constraint_mol_wt                                   0.0
 dn_mol_wt_std_dev                                            35.0
 dn_constraint_rot_bon                                        15
 dn_constraint_formal_charge                                  5
 dn_heur_unmatched_num                                        1
 dn_heur_matched_rmsd                                         2.0
 dn_unique_anchors                                            1
 dn_max_grow_layers                                           1
 dn_max_root_size                                             25
 dn_max_layer_size                                            25
 dn_max_current_aps                                           5
 dn_max_scaffolds_per_layer                                   1
 dn_write_checkpoints                                         yes
 dn_write_prune_dump                                          no
 dn_write_orients                                             no
 dn_write_growth_trees                                        no
 dn_output_prefix                                             3WZE_refine
 use_internal_energy                                          yes
 internal_energy_rep_exp                                      12
 internal_energy_cutoff                                       100.0
 use_database_filter                                          no
 orient_ligand                                                no
 bump_filter                                                  no
 score_molecules                                              yes
 contact_score_primary                                        no
 grid_score_primary                                           yes
 grid_score_rep_rad_scale                                     1
 grid_score_vdw_scale                                         1
 grid_score_es_scale                                          1
 grid_score_grid_prefix                                       grid
 minimize_ligand                                              yes
 minimize_anchor                                              no
 minimize_flexible_growth                                     yes
 use_advanced_simplex_parameters                              no
 simplex_max_cycles                                           1
 simplex_score_converge                                       0.1
 simplex_cycle_converge                                       1.0
 simplex_trans_step                                           1.0
 simplex_rot_step                                             0.1
 simplex_tors_step                                            10.0
 simplex_grow_max_iterations                                  250
 simplex_grow_tors_premin_iterations                          0
 simplex_random_seed                                          0
 simplex_restraint_min                                        yes
 simplex_coefficient_restraint                                10.0
 atom_model                                                   all
 vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/vdw_de_novo.defn
 flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex.defn
 flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6.10/parameters/flex_drive.tbl

4. Run the de novo refinement on Seawulf using the following command:

 dock6 -i de_novo_refine.in -o de_novo_refine.out

This should take a few minutes to complete, and when it does, there should be three new files:

 3WZE_refine.anchor_1.root_layer_1.mol2
 3WZE_refine.denovo_build.mol2
 de_novo_refine.out
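
A quick way to confirm that the run produced what is listed above is to check that each file exists and is non-empty, and to count the molecules in the build file. This is only a sketch: check_denovo_outputs is a hypothetical helper name, and the count relies on each molecule in a multi-molecule mol2 file starting with one @<TRIPOS>MOLECULE record.

```shell
# Hypothetical helper: confirm the expected outputs exist and count molecules.
# Usage: check_denovo_outputs OUTPUT_PREFIX LOG_FILE
check_denovo_outputs() {
    prefix="$1"
    log="$2"
    missing=0
    for f in "${prefix}.anchor_1.root_layer_1.mol2" \
             "${prefix}.denovo_build.mol2" \
             "$log"; do
        # -s is true only if the file exists and is not empty.
        if [ -s "$f" ]; then
            echo "found   $f"
        else
            echo "missing $f"
            missing=1
        fi
    done
    if [ -s "${prefix}.denovo_build.mol2" ]; then
        # One @<TRIPOS>MOLECULE record per molecule in the file.
        echo "molecules: $(grep -c '^@<TRIPOS>MOLECULE' "${prefix}.denovo_build.mol2")"
    fi
    return "$missing"
}

# Example, run from the directory that holds the results:
# check_denovo_outputs 3WZE_refine de_novo_refine.out
```

If any file is reported missing, de_novo_refine.out is the first place to look for an error message.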

Checking the Results

1. Bring the two mol2 files to your local machine.

2. If you're using Chimera, open Chimera and use Tools->Surface/Binding Analysis->ViewDock to open 3WZE_refine.denovo_build.mol2. If you're using ChimeraX, open that file, then type "viewdockx" into the command line.

Dougrefine1.png

Note that this image was taken from ChimeraX. Also, please disregard that I named my ligand "blooble".

3. Look through the results. Because we set "dn_max_grow_layers" to 1, the dummy atom should be replaced with only a single fragment, with no further fragments appended to it.

For our results, we got 10 new molecules. If we had wanted more molecules, we could have increased the value of "dn_graph_max_picks" from 30 so that DOCK would sample more fragments. (We got only 10 molecules from 30 picks, presumably because the other 20 picked fragments were somehow incompatible with the anchor.)
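
The scores can also be listed from the command line without opening a viewer. The sketch below assumes that each molecule in the output mol2 is preceded by header lines of the form "########## Name: ..." and "########## Grid_Score: ...", which is how DOCK 6 usually annotates its output; if your file uses different labels, adjust the field names. The function name list_grid_scores is hypothetical.

```shell
# Hypothetical helper: print "score name" pairs, most favorable score first.
# Usage: list_grid_scores RESULTS.mol2
list_grid_scores() {
    awk '
        # Assumed header layout: a run of # characters, a label, then a value.
        $1 ~ /^#+$/ && $2 == "Name:"       { name = $3 }
        $1 ~ /^#+$/ && $2 == "Grid_Score:" { print $3, name }
    ' "$1" | sort -n
}

# Example:
# list_grid_scores 3WZE_refine.denovo_build.mol2
```

Grid scores are energies, so the most negative values sort to the top and correspond to the most favorable predicted interactions.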

If you want every possible new molecule based on the fragment library you're using, you could set "dn_sampling_method" to "exhaustive" instead of "graph".
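
To try a different setting without overwriting the original input file, one option is to write a modified copy and run DOCK on that. This is a sketch only: set_dock_param is a hypothetical helper, and it assumes the input file holds one "parameter value" pair per line, as in the listing above.

```shell
# Hypothetical helper: print a copy of a DOCK input file with one value changed.
# Usage: set_dock_param INPUT.in PARAMETER NEW_VALUE > NEW_INPUT.in
set_dock_param() {
    awk -v key="$2" -v val="$3" '
        # Rewrite only the line whose first field is the requested parameter.
        $1 == key { printf "%-60s %s\n", key, val; found = 1; next }
        { print }
        # Fail if the parameter was not present, so typos do not pass silently.
        END { if (!found) exit 1 }
    ' "$1"
}

# Example: sample 60 fragments instead of 30 (output names are placeholders).
# set_dock_param de_novo_refine.in dn_graph_max_picks 60 > de_novo_refine_60.in
# dock6 -i de_novo_refine_60.in -o de_novo_refine_60.out
```

Consider changing "dn_output_prefix" in the copy as well, so that the new run does not overwrite the 3WZE_refine files from the first one.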