2019 DOCK GA tutorial 1 with 2NNQ

From Rizzo_Lab

This experiment requires the file 2NNQ.lig.mol2. Make sure that you use a charged ligand with hydrogens added.

File:2nnq lig wH charged.png
Charged 2nnq ligand with hydrogens

I. Fragment Library Generation for 2NNQ

Fragment Libraries

This experiment requires a general fragment library to account for all the possible ligands the GA can produce. A focused fragment library is too small a data set to cover all the possibilities.

The first step is to create a directory to perform the work in:

     mkdir 2NNQ_GA

Go into this new directory and then create another directory that will store all the fragment molecules:

     cd 2NNQ_GA
     mkdir fraglib

Enter the fraglib directory:

     cd fraglib
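
The directory setup above can also be done in a single step with mkdir -p, which creates parent directories as needed. This is a minimal sketch that assumes a throwaway base directory from mktemp (the variable name base is hypothetical; on the cluster you would use your own project directory instead):

```shell
# Hypothetical scratch location so the sketch does not touch real data.
base="$(mktemp -d)"

# Create 2NNQ_GA and its fraglib subdirectory in one call.
mkdir -p "$base/2NNQ_GA/fraglib"

# Move into the fragment library directory and confirm the location.
cd "$base/2NNQ_GA/fraglib"
pwd
```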

Create an empty input file called 2NNQ.fraglib and run dock6 on it. Because the file is empty, dock6 will prompt for each parameter:

     touch 2NNQ.fraglib
     dock6 -i 2NNQ.fraglib

Answer the prompts with the following responses:

    conformer_search_type                                        flex
    write_fragment_libraries                                     yes
    fragment_library_prefix                                      2NNQ.fraglib
    fragment_library_freq_cutoff                                 1
    fragment_library_sort_method                                 freq
    fragment_library_trans_origin                                no
    use_internal_energy                                          no
    ligand_atom_file                                             ../../2NNQ_Tutorial/1.dockprep/2nnq_lig_withH.mol2
    limit_max_ligands                                            no
    skip_molecule                                                no
    read_mol_solvation                                           no
    calculate_rmsd                                               no
    use_database_filter                                          no
    orient_ligand                                                no
    bump_filter                                                  no
    score_molecules                                              no
    atom_model                                                   all
    vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6/parameters/vdw_AMBER_parm99.defn 
    flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex.defn
    flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex_drive.tbl
    ligand_outfile_prefix                                        2NNQ_frag_output
    write_orientations                                           no
    num_scored_conformers                                        1
    rank_ligands                                                 no
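
Several of the responses above are file paths, and a mistyped path is an easy mistake to make. The sketch below scans a DOCK input file for values that look like paths and reports those that do not exist. The function name check_dock_paths is hypothetical and not part of DOCK. It assumes each line holds a parameter name followed by a single value, and relative paths are resolved against the current directory.

```shell
# check_dock_paths FILE: hypothetical helper, not part of DOCK.
# Prints every parameter whose value looks like a path (starts with
# "/", "./" or "../") but does not exist on disk.
check_dock_paths() {
    awk 'NF >= 2 && $2 ~ /^(\/|\.\.?\/)/ { print $1, $2 }' "$1" |
    while read -r key path; do
        [ -e "$path" ] || echo "missing: $key -> $path"
    done
}

# Example usage, run from the directory that holds the input file:
#   check_dock_paths 2NNQ.fraglib
```

No output means every path-like value was found.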

The only relevant part of the generated fragment library is the torsion environment file, which should be titled 2NNQ.fraglib_torenv.dat. Next, the torsion environments of this molecule are combined with the file titled full_sorted_fraglib.dat using the Python script combine_torenv.py:

    python /gpfs/projects/rizzo/zzz.programs/torsion_env_combination/combine_torenv.py 2NNQ.fraglib_torenv.dat /gpfs/projects/rizzo/zzz.programs/dock6.10_2019.06.18/parameters/fraglib_torenv.dat

This should generate a new list of torsion environments titled unique_full_sorted_fraglib.dat
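
Before moving on, it can help to confirm that both torsion environment files exist and are not empty. The sketch below prints a line count for each file. The helper name report_lines is hypothetical, and a line count is only a rough sanity check, since nothing is assumed here about the internal format of the .dat files.

```shell
# report_lines FILE...: hypothetical helper that prints a line count per
# file, or flags files that are missing or empty.
report_lines() {
    for f in "$@"; do
        if [ -s "$f" ]; then
            echo "$f: $(awk 'END { print NR }' "$f") lines"
        else
            echo "$f: missing or empty"
        fi
    done
}

# Run from the fraglib directory after combine_torenv.py has finished.
report_lines 2NNQ.fraglib_torenv.dat unique_full_sorted_fraglib.dat
```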

II. Performing a GA using 2NNQ

For this part, create a new directory called 2NNQ_GA_results inside 2NNQ_GA (one level up from fraglib). cd into that directory and create a new file titled 2NNQ_GA.in.

    cd ..
    mkdir 2NNQ_GA_results
    cd 2NNQ_GA_results
    touch 2NNQ_GA.in

Then run dock6 with this input file:

    dock6 -i 2NNQ_GA.in -o 2NNQ_GA.out

Enter the following responses to get the GA working properly. When prompted for ga_mutations, answer yes. DOCK will then prompt for addition, deletion, substitution, and replacement mutations; answer yes to all of them unless you have a specific reason to exclude a mutation type.

    conformer_search_type                                        genetic
    ga_molecule_file                                             /Path/2nnq_lig_withH.mol2
    ga_fraglib_scaffold_file                                     /gpfs/projects/rizzo/zzz.programs/dock6.10_2019.06.14/parameters/fraglib_ga_scaffold.mol2
    ga_fraglib_linker_file                                       /gpfs/projects/rizzo/zzz.programs/dock6.10_2019.06.14/parameters/fraglib_linker.mol2
    ga_fraglib_sidechain_file                                    /gpfs/projects/rizzo/zzz.programs/dock6.10_2019.06.14/parameters/fraglib_sidechain.mol2
    ga_torenv_table                                              ../fraglib/unique_full_sorted_fraglib.dat
    ga_max_generations                                           500
    ga_xover_sampling_method_rand                                yes
    ga_xover_max                                                 150
    ga_bond_tolerance                                            0.5
    ga_angle_cutoff                                              0.14
    ga_check_overlap                                             no
    ga_mutations                                                 yes
    ga_mutate_addition                                           yes
    ga_mutate_deletion                                           yes
    ga_mutate_substitution                                       yes  
    ga_mutate_replacement                                        yes
    ga_mutate_parents                                            yes
    ga_pmut_rate                                                 0.3
    ga_omut_rate                                                 0.7
    ga_max_mut_cycles                                            5
    ga_mut_sampling_method                                       rand
    ga_num_random_picks                                          10
    ga_max_root_size                                             5
    ga_energy_cutoff                                             100
    ga_heur_unmatched_num                                        2
    ga_heur_matched_rmsd                                         2
    ga_constraint_mol_wt                                         550
    ga_constraint_rot_bon                                        10
    ga_constraint_H_accept                                       10
    ga_constraint_H_don                                          5
    ga_constraint_formal_charge                                  4
    ga_ensemble_size                                             200
    ga_selection_method                                          elitism
    ga_elitism_combined                                          no
    ga_elitism_option                                            max
    ga_niching                                                   no
    ga_selection_extinction                                      no
    ga_max_num_gen_with_no_crossover                             1000
    ga_output_prefix                                             2NNQ_GA_output
    use_internal_energy                                          yes
    internal_energy_rep_exp                                      12
    internal_energy_cutoff                                       100
    use_database_filter                                          no
    orient_ligand                                                no
    bump_filter                                                  no
    score_molecules                                              yes
    contact_score_primary                                        no
    grid_score_primary                                           no
    gist_score_primary                                           no
    multigrid_score_primary                                      no
    dock3.5_score_primary                                        no
    continuous_score_primary                                     no
    footprint_similarity_score_primary                           no
    pharmacophore_score_primary                                  no
    hbond_score_primary                                          no
    descriptor_score_primary                                     yes
    descriptor_use_grid_score                                    no
    descriptor_use_multigrid_score                               no
    descriptor_use_continuous_score                              no
    descriptor_use_footprint_similarity                          no
    descriptor_use_pharmacophore_score                           no
    descriptor_use_tanimoto                                      no
    descriptor_use_hungarian                                     no
    descriptor_use_volume_overlap                                no
    descriptor_use_gist                                          no
    minimize_ligand                                              yes
    minimize_anchor                                              yes
    minimize_flexible_growth                                     yes
    use_advanced_simplex_parameters                              no
    simplex_max_cycles                                           1
    simplex_score_converge                                       0.1
    simplex_cycle_converge                                       1
    simplex_trans_step                                           1
    simplex_rot_step                                             0.1
    simplex_tors_step                                            10
    simplex_anchor_max_iterations                                500   
    simplex_grow_max_iterations                                  500
    simplex_grow_tors_premin_iterations                          0
    simplex_random_seed                                          0
    simplex_restraint_min                                        yes
    simplex_coefficient_restraint                                10
    atom_model                                                   all
    vdw_defn_file                                                /gpfs/projects/AMS536/zzz.programs/dock6/parameters/vdw_AMBER_parm99.defn
    flex_defn_file                                               /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex.defn
    flex_drive_file                                              /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex_drive.tbl

For the last step, submit the job to the Slurm queue to generate the results. The run covers 500 generations and is too computationally intensive for the head node.

First create a new script file to submit to Slurm:

     vim 2NNQ_GA.sh

Then enter the following into the 2NNQ_GA.sh script:

     #!/bin/bash
     #SBATCH --time=48:00:00
     #SBATCH --nodes=1
     #SBATCH --ntasks=40
     #SBATCH --job-name=2NNQ_GA_input
     #SBATCH --output=2NNQ_GA_output
     #SBATCH -p long-40core
     cd $SLURM_SUBMIT_DIR
     dock6 -i 2NNQ_GA.in -o 2NNQ_GA.out
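
Because the run can occupy a node for many hours, a quick check before submitting can save a wasted queue slot. The sketch below confirms that the job script and the DOCK input file are present and prints the #SBATCH directives for review. The function name preflight is hypothetical, and the check only looks at the files, not at the Slurm configuration.

```shell
# preflight SCRIPT INPUT: hypothetical helper, not part of Slurm or DOCK.
# Fails if the job script is missing or the input file is missing or
# empty; otherwise prints the #SBATCH directives found in the script.
preflight() {
    script="$1"
    input="$2"
    [ -f "$script" ] || { echo "no script: $script"; return 1; }
    [ -s "$input" ] || { echo "no input: $input"; return 1; }
    grep '^#SBATCH' "$script"
}

# Example usage, run from 2NNQ_GA_results:
#   preflight 2NNQ_GA.sh 2NNQ_GA.in
```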

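Then submit the script. Because it uses #SBATCH directives, submit it with the Slurm command sbatch:

     sbatch 2NNQ_GA.sh

To check on its status, use the following, replacing username with your own user name:

     squeue -u username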