2018 Denovo design tutorial 1 with PDB 2NNQ

From Rizzo_Lab
Revision as of 01:14, 5 March 2018 by AMS536 2018 01 (talk | contribs) (Specifying Primary Residues)

2018 Denovo design with PDB 2NNQ (Focused)

Files Needed

Fragment Libraries

This tutorial uses a focused fragment library, that is, a library generated from the crystal ligand itself, in an attempt to rebuild that same ligand.

Create a new directory for the fragment library.

mkdir fraglib

Inside the fraglib directory create a new input file for fragment generation.

touch fraglib.in

Generate the fragments by running the input file through DOCK6.

dock6 -i fraglib.in

DOCK6 prompts for each parameter interactively. Answer the prompts using the following values.

conformer_search_type                                        flex
write_fragment_libraries                                     yes
fragment_library_prefix                                      fraglib
fragment_library_freq_cutoff                                 1
fragment_library_sort_method                                 freq
fragment_library_trans_origin                                no
use_internal_energy                                          yes
internal_energy_rep_exp                                      12
internal_energy_cutoff                                       100.0
ligand_atom_file                                             /Path_to_file/2nnq_lig_withH.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               no
use_database_filter                                          no
orient_ligand                                                yes
automated_matching                                           yes
receptor_site_file                                           /Path_to_file/selected_spheres.sph
max_orientations                                             1000
critical_points                                              no
chemical_matching                                            no
use_ligand_spheres                                           no
bump_filter                                                  no
score_molecules                                              no
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex_drive.tbl
ligand_outfile_prefix                                        output
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no
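Placeholder paths such as /Path_to_file/... have to be replaced with real locations before DOCK6 can read the files. As a rough sketch, a small helper along the following lines could list the file paths named in a filled-in input file that do not exist. The function name check_input_paths and the list of file extensions are assumptions made for illustration; they are not part of DOCK6. Relative paths are resolved from the directory in which the helper is run.

```shell
#!/bin/bash
# Sketch only: check_input_paths is a hypothetical helper, not a DOCK6 tool.
# It reports parameters whose value looks like an input file that is missing.
check_input_paths() {
    local infile="$1"
    # Field 1 is the parameter name, field 2 its value. Only values ending
    # in one of the assumed extensions are treated as file paths.
    awk '$2 ~ /\.(mol2|sph|defn|tbl|pdb|txt)$/ {print $1, $2}' "$infile" |
    while read -r param path; do
        if [ ! -e "$path" ]; then
            echo "missing: $param -> $path"
        fi
    done
}

# Example usage:
#   check_input_paths fraglib.in
```

If the helper prints nothing, every path it recognised exists. Output prefixes such as ligand_outfile_prefix are skipped because their values carry no file extension.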

Once fragment generation is complete, the following files will have been generated: fraglib_linker.mol2, fraglib_rigid.mol2, fraglib_scaffold.mol2, fraglib_sidechain.mol2, and fraglib_torenv.dat. Open the mol2 files in Chimera and check that the fragments match the ligand used for fragment generation.
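Before opening the files in a viewer, a quick count of the fragments in each library can be a useful sanity check. The sketch below relies on the fact that every molecule in a mol2 file starts with an @<TRIPOS>MOLECULE record. The function name count_mol2_molecules is a hypothetical helper introduced here for illustration.

```shell
#!/bin/bash
# Sketch only: count the molecule records in a mol2 file.
count_mol2_molecules() {
    # grep -c prints the number of matching lines (0 if there are none).
    grep -c '^@<TRIPOS>MOLECULE' "$1"
}

# Example usage, run inside the fraglib directory:
#   for kind in linker rigid scaffold sidechain; do
#       echo "$kind: $(count_mol2_molecules "fraglib_${kind}.mol2")"
#   done
```

For a library built from a single ligand, the counts should be small enough to compare against the ligand structure by eye.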

Focused Denovo Growth

The generated fragments are now used for de novo growth. DOCK joins fragments together subject to limits on properties such as the maximum number of layers, formal charge, and molecular weight. Before making each connection, DOCK checks the torsion environment file created during fragment generation (fraglib_torenv.dat) to see whether that type of connection has been observed before. If it has not, the connection is not made.
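The torsion environment check behaves like an allow-list lookup: a connection is accepted only if its environment already appears in the table. The toy sketch below illustrates that idea and nothing more. The one-entry-per-line table format, the entry names, and the function connection_allowed are all hypothetical; the real layout of fraglib_torenv.dat and the way DOCK reads it may differ.

```shell
#!/bin/bash
# Toy illustration of an allow-list lookup. The table format (one
# environment string per line) is an assumption, NOT the real layout
# of fraglib_torenv.dat.
connection_allowed() {
    local env="$1" table="$2"
    # -F: match a fixed string, -x: the whole line must match,
    # -q: print nothing, report the result through the exit status.
    grep -Fxq -- "$env" "$table"
}

# Example usage with a toy table:
#   printf 'envA-envB\nenvC-envD\n' > toy_torenv.txt
#   connection_allowed "envA-envB" toy_torenv.txt && echo "allowed"
#   connection_allowed "envA-envD" toy_torenv.txt || echo "rejected"
```

With a focused library, the table contains only the connections present in the crystal ligand, which is why growth in this section is restricted to reassembling that ligand.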

Create a new directory for denovo dock.

mkdir denovo

Create a new input file for denovo dock.

touch denovo.in

Run the input file through dock.

dock6 -i denovo.in

Answer the prompted questions interactively using the following lines.

conformer_search_type                                        flex
write_fragment_libraries                                     yes
fragment_library_prefix                                      fraglib
fragment_library_freq_cutoff                                 1
fragment_library_sort_method                                 freq
fragment_library_trans_origin                                no
use_internal_energy                                          no
ligand_atom_file                                             ./candidate_mol_gen_frag.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               no
use_database_filter                                          no
orient_ligand                                                no
bump_filter                                                  no
score_molecules                                              no
atom_model                                                   all
vdw_defn_file                                                ../../../dock/box/vdw_AMBER_parm99.defn
flex_defn_file                                               ../../../dock/dock/flex.defn
flex_drive_file                                              ../../../dock/dock/flex_drive.tbl
ligand_outfile_prefix                                        output
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no

Focused Denovo Rescore

Generic de novo growth with PDB 2NNQ

In this section, de novo growth is done using a fragment library generated from many random ligands. This allows DOCK to generate novel ligands that are not restricted to the fragments and torsion environments of the crystal ligand.

In the previous section, a single grid was used as the scoring function to generate the ligands. In this section, the scoring function considers the interactions of the ligand with the most significant residues. This is done with the multigrid scoring method, in which each significant residue is described by its own grid and a common grid covers the rest of the residues.

Specifying Primary Residues

First, the residues most significant for ligand binding have to be specified. Create a separate directory for the generic de novo growth and, inside it, an input file (rescore.in) containing the following lines.

conformer_search_type                                        rigid
use_internal_energy                                          no
ligand_atom_file                                             ../2nnq_lig_withH.mol2 (use the ligand mol2 file)
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               no
use_database_filter                                          no
orient_ligand                                                no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           no
grid_score_secondary                                         no
multigrid_score_primary                                      no
multigrid_score_secondary                                    no
dock3.5_score_primary                                        no
dock3.5_score_secondary                                      no
continuous_score_primary                                     no
continuous_score_secondary                                   no
footprint_similarity_score_primary                           yes
footprint_similarity_score_secondary                         no
fps_score_use_footprint_reference_mol2                       yes
fps_score_footprint_reference_mol2_filename                  ../2nnq_lig_withH.mol2 (use the ligand mol2 file)
fps_score_foot_compare_type                                  Euclidean
fps_score_normalize_foot                                     no
fps_score_foot_comp_all_residue                              no
fps_score_choose_foot_range_type                             threshold
fps_score_vdw_threshold                                      1
fps_score_es_threshold                                       0.5
fps_score_hb_threshold                                       0.5
fps_score_use_remainder                                      yes
fps_score_receptor_filename                                  ../2nnq_rec.mol2 (use the mol2 file for the receptor)
fps_score_vdw_att_exp                                        6
fps_score_vdw_rep_exp                                        12
fps_score_vdw_rep_rad_scale                                  1
fps_score_use_distance_dependent_dielectric                  yes
fps_score_dielectric                                         4.0
fps_score_vdw_fp_scale                                       1
fps_score_es_fp_scale                                        1
fps_score_hb_fp_scale                                        0
pharmacophore_score_secondary                                no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_score_secondary                                         no
amber_score_secondary                                        no
minimize_ligand                                              no
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex_drive.tbl
ligand_outfile_prefix                                        rescore.out
write_footprints                                             yes
write_hbonds                                                 no
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no

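Run this input file through DOCK6 with the output directed to 2nnq.footprint_rescore.out (dock6 -i rescore.in -o 2nnq.footprint_rescore.out). This will generate three output files.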

2nnq.footprint_rescore.out 
rescore.out_footprint_scored.txt
rescore.out_scored.mol2
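Because the next step reads these files, it can help to confirm that they exist and are not empty before continuing. The following is a minimal sketch; report_missing is a hypothetical helper name, not something provided by DOCK6.

```shell
#!/bin/bash
# Sketch only: report files that are missing or empty.
# Returns 0 if every file is present and non-empty, 1 otherwise.
report_missing() {
    local f missing=0
    for f in "$@"; do
        if [ ! -s "$f" ]; then      # -s: file exists and has size > 0
            echo "missing or empty: $f"
            missing=1
        fi
    done
    return "$missing"
}

# Example usage:
#   report_missing 2nnq.footprint_rescore.out \
#                  rescore.out_footprint_scored.txt \
#                  rescore.out_scored.mol2
```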

Use these files to specify the primary residues. Create a new file, specify.sh, using the following lines, which are adapted from previous tutorials.

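#!/bin/bash
  # Start clean so that rerunning the script does not append duplicate rows.
  rm -f reference.txt
  # Extract the residue numbers listed after "range_union" in the DOCK output file.
  grep -A 1 "range_union" 2nnq.footprint_rescore.out |
  grep -v "range_union" |
  grep -v "\-" |
  sed -e '{s/,/\n/g}' |
  sed -e '{s/ //g}' |
  sed '/^$/d' |
  sort -n |
  uniq > temp.dat
  # Write the residue numbers zero-padded to three digits (e.g. 14 becomes 014).
  for i in `cat temp.dat`; do printf "%0*d\n" 3 $i; done > 2nnq.primary_residues.dat
  # For each primary residue, copy columns 1, 3 and 4 of its footprint row.
  for RES in `cat temp.dat`
  do
          grep " ${RES} " rescore.out_footprint_scored.txt  |
          awk -v temp=${RES} '{if ($2 == temp) print $0;}' |
          awk '{print $1 "  " $3 "  " $4}' >> reference.txt
  done
  # Append the values from the "remainder" lines (all other residues combined).
  grep "remainder" rescore.out_footprint_scored.txt |
  sed -e '{s/,/  /g}' |
  tr -d '\n' |
  awk '{print $2 "  " $3 "  " $6}' >> reference.txt
  mv reference.txt 2nnq.reference.txt
  rm temp.dat
export AMBERHOME="/gpfs/projects/AMS536/zzz.programs/amber16"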

Use either of the following commands to specify the primary residues using the above file.

chmod 770 specify.sh
./specify.sh

or

bash specify.sh

This will generate two more files, which specify the primary residues of the binding site of your receptor.

2nnq.primary_residues.dat
2nnq.reference.txt

If the script runs successfully, 25 residues should be specified as primary residues. This number applies to this system only; the number of primary residues changes from system to system.
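The residue count can be checked from the command line. The sketch below counts the entries in the primary residue file and compares them with the reference file, assuming the reference file holds one row per primary residue plus one remainder row. The helper names count_lines and check_reference are hypothetical.

```shell
#!/bin/bash
# Sketch only: compare the primary residue list with the reference footprint.
count_lines() {
    # Count lines that contain at least one non-blank character.
    grep -c '[^[:space:]]' "$1" || true
}

check_reference() {
    local n_res n_ref
    n_res=$(count_lines "$1")
    n_ref=$(count_lines "$2")
    echo "primary residues: $n_res, reference rows: $n_ref"
    # Assumption: one row per primary residue plus one remainder row.
    [ "$n_ref" -eq $((n_res + 1)) ]
}

# Example usage:
#   check_reference 2nnq.primary_residues.dat 2nnq.reference.txt
```

For this system the expected report is 25 primary residues and 26 reference rows.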

In the next section, grids will be generated for the primary residues and a common grid for the rest of the residues.

Generation of grids

Create a new directory (multigrid). The grids are generated there by a script file and two input files. The structure of each file is given below.

First input file (2nnq.multigrid.in)

compute_grids                  yes
grid_spacing                   0.4
output_molecule                yes
contact_score                  no
chemical_score                 no
energy_score                   yes
energy_cutoff_distance         9999
atom_model                     a
attractive_exponent            6
repulsive_exponent             9
distance_dielectric            yes
dielectric_factor              4
bump_filter                    yes
bump_overlap                   0.75
receptor_file                  temp.mol2
box_file                       ../../../dock/3.boxgrid/2nnq.box.pdb (use the virtual box generated in dock previously)
vdw_definition_file            /gpfs/projects/AMS536/zzz.programs/dock6/parameters/vdw_AMBER_parm99.defn
chemical_definition_file       /gpfs/projects/AMS536/zzz.programs/dock6/parameters/chem.defn
score_grid_prefix              temp.rec
receptor_out_file              temp.rec.grid.mol2

Second input file (2nnq.reference_multigridmin.in)

conformer_search_type                                        rigid
use_internal_energy                                          yes
internal_energy_rep_exp                                      12
internal_energy_cutoff                                       100.0
ligand_atom_file                                             ../../2nnq_lig_withH.mol2
limit_max_ligands                                            no
skip_molecule                                                no
read_mol_solvation                                           no
calculate_rmsd                                               no
use_database_filter                                          no
orient_ligand                                                no
bump_filter                                                  no
score_molecules                                              yes
contact_score_primary                                        no
contact_score_secondary                                      no
grid_score_primary                                           no
grid_score_secondary                                         no
multigrid_score_primary                                      yes
multigrid_score_secondary                                    no
multigrid_score_rep_rad_scale                                1.0
multigrid_score_vdw_scale                                    1.0
multigrid_score_es_scale                                     1.0
multigrid_score_number_of_grids                              26                  (Should be the total number of primary residues + 1)
multigrid_score_grid_prefix0                                 2nnq.resid_014
multigrid_score_grid_prefix1                                 2nnq.resid_016
multigrid_score_grid_prefix2                                 2nnq.resid_019
multigrid_score_grid_prefix3                                 2nnq.resid_020
multigrid_score_grid_prefix4                                 2nnq.resid_036
multigrid_score_grid_prefix5                                 2nnq.resid_037
multigrid_score_grid_prefix6                                 2nnq.resid_038
multigrid_score_grid_prefix7                                 2nnq.resid_040
multigrid_score_grid_prefix8                                 2nnq.resid_051
multigrid_score_grid_prefix9                                 2nnq.resid_057
multigrid_score_grid_prefix10                                2nnq.resid_058
multigrid_score_grid_prefix11                                2nnq.resid_060
multigrid_score_grid_prefix12                                2nnq.resid_072
multigrid_score_grid_prefix13                                2nnq.resid_074
multigrid_score_grid_prefix14                                2nnq.resid_075
multigrid_score_grid_prefix15                                2nnq.resid_076
multigrid_score_grid_prefix16                                2nnq.resid_078
multigrid_score_grid_prefix17                                2nnq.resid_093
multigrid_score_grid_prefix18                                2nnq.resid_104
multigrid_score_grid_prefix19                                2nnq.resid_106
multigrid_score_grid_prefix20                                2nnq.resid_115
multigrid_score_grid_prefix21                                2nnq.resid_116
multigrid_score_grid_prefix22                                2nnq.resid_117
multigrid_score_grid_prefix23                                2nnq.resid_126
multigrid_score_grid_prefix24                                2nnq.resid_128
multigrid_score_grid_prefix25                                2nnq.resid_remaining  (rest of the residues)
multigrid_score_fp_ref_mol                                   no
multigrid_score_fp_ref_text                                  yes
multigrid_score_footprint_text                               ../2nnq.reference.txt (generated earlier when specifying the primary residues)
multigrid_score_foot_compare_type                            Euclidean
multigrid_score_normalize_foot                               no
multigrid_score_vdw_euc_scale                                1.0
multigrid_score_es_euc_scale                                 1.0
dock3.5_score_secondary                                      no
continuous_score_secondary                                   no
footprint_similarity_score_secondary                         no
pharmacophore_score_secondary                                no
descriptor_score_secondary                                   no
gbsa_zou_score_secondary                                     no
gbsa_hawkins_score_secondary                                 no
SASA_score_secondary                                         no
amber_score_secondary                                        no
minimize_ligand                                              yes
simplex_max_iterations                                       1000
simplex_tors_premin_iterations                               0
simplex_max_cycles                                           1
simplex_score_converge                                       0.1
simplex_cycle_converge                                       1.0
simplex_trans_step                                           1.0
simplex_rot_step                                             0.1
simplex_tors_step                                            10.0
simplex_random_seed                                          0
simplex_restraint_min                                        no
atom_model                                                   all
vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6/parameters/vdw_AMBER_parm99.defn
flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex.defn
flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex_drive.tbl
ligand_outfile_prefix                                        output
write_footprints                                             no
write_orientations                                           no
num_scored_conformers                                        1
rank_ligands                                                 no 
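Typing the multigrid_score_grid_prefix lines by hand is error-prone, since the grid count and the numbering must both agree with the residue list. As a sketch, the lines could be generated from 2nnq.primary_residues.dat as shown below. The function name make_grid_prefix_lines is a hypothetical helper; the generated lines follow the pattern of the input file above and should still be checked before being pasted into it.

```shell
#!/bin/bash
# Sketch only: print the multigrid prefix lines for a list of residues.
# $1: file with one zero-padded residue number per line
# $2: grid prefix, e.g. 2nnq.resid
make_grid_prefix_lines() {
    local residues_file="$1" prefix="$2"
    awk -v prefix="$prefix" '
        BEGIN { n = 0 }
        NF    { res[n++] = $1 }     # collect non-empty lines
        END {
            # One grid per primary residue plus one for the remaining residues.
            printf "%-60s %d\n", "multigrid_score_number_of_grids", n + 1
            for (i = 0; i < n; i++)
                printf "%-60s %s_%s\n", ("multigrid_score_grid_prefix" i), prefix, res[i]
            printf "%-60s %s_remaining\n", ("multigrid_score_grid_prefix" n), prefix
        }' "$residues_file"
}

# Example usage:
#   make_grid_prefix_lines ../2nnq.primary_residues.dat 2nnq.resid
```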

Use the following script (multigrid.sh) to generate the grids.

# Join the residue numbers onto one line so they can be passed as separate arguments.
export PRIMARY_RES=`tr '\n' ' ' < ../2nnq.primary_residues.dat`
export DOCKHOME="/gpfs/projects/AMS536/zzz.programs/dock6_new"
python ${DOCKHOME}/bin/multigrid_fp_gen.py ../../2nnq_rec.mol2 2nnq.resid 2nnq.multigrid.in ${PRIMARY_RES}
rm temp.mol2
rm 2nnq.resid_*.rec.grid.mol2
${DOCKHOME}/bin/dock6 -i 2nnq.reference_multigridmin.in -o 2nnq.reference_multigridmin.out
mv output_scored.mol2 2nnq.lig.multigridmin.mol2
cp 2nnq.lig.multigridmin.mol2 ../multi-grid
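After the script finishes, it is worth confirming that a grid was written for every primary residue and for the remaining residues. The sketch below assumes that each grid prefix produces an energy grid file with the .nrg extension (for example 2nnq.resid_014.nrg); that naming and the helper missing_grids are assumptions and should be checked against the files actually present in the directory.

```shell
#!/bin/bash
# Sketch only: list expected grid files that are missing or empty.
# $1: file with one zero-padded residue number per line
# $2: grid prefix, e.g. 2nnq.resid
# Assumption: each grid is stored as <prefix>_<residue>.nrg
missing_grids() {
    local residues_file="$1" prefix="$2" res
    while read -r res; do
        [ -n "$res" ] || continue                       # skip blank lines
        [ -s "${prefix}_${res}.nrg" ] || echo "${prefix}_${res}.nrg"
    done < "$residues_file"
    # The common grid for the rest of the receptor.
    [ -s "${prefix}_remaining.nrg" ] || echo "${prefix}_remaining.nrg"
}

# Example usage, run inside the multigrid directory:
#   missing_grids ../2nnq.primary_residues.dat 2nnq.resid
```

No output means every expected grid file was found.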