2019 DOCK GA tutorial 1 with 2NNQ
The required files for this experiment include 2NNQ.lig.mol2.
I. Fragment Library Generation for 2NNQ
Fragment Libraries
This experiment needs a general fragment library to account for all the possible ligands the GA can produce. A focused fragment library is too small a data set to cover all of those possibilities.
The first step is to create a directory to perform the work in:
mkdir 2NNQ_GA
Go into this new directory and then create another directory that will store all the fragment molecules
mkdir fraglib
Enter the file directory fraglib
cd fraglib
Create a new file called 2NNQ.fraglib
touch 2NNQ.fraglib
dock6 -i 2NNQ.fraglib
Answer the following prompts using these responses
conformer_search_type flex
write_fragment_libraries yes
fragment_library_prefix 2NNQ.fraglib
fragment_library_freq_cutoff 1
fragment_library_sort_method freq
fragment_library_trans_origin no
use_internal_energy no
ligand_atom_file ../../2NNQ_Tutorial/1.dockprep/2nnq_lig_withH.mol2
limit_max_ligands no
skip_molecule no
read_mol_solvation no
calculate_rmsd no
use_database_filter no
orient_ligand no
bump_filter no
score_molecules no
atom_model all
vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6/parameters/vdw_AMBER_parm99.defn
flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex.defn
flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex_drive.tbl
ligand_outfile_prefix 2NNQ_frag_output
write_orientations no
num_scored_conformers 1
rank_ligands no
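Since most of these answers are file paths, a mistyped path is an easy way for the run to fail. The sketch below is a hypothetical helper, not part of DOCK: it reads an input file, assumes one "parameter value" pair per line, assumes that file-valued parameter names end in _file or _table, and reports any whose path does not exist. Relative paths are checked against the current directory, so it should be run from the directory where dock6 will be started.

```shell
#!/bin/bash
# Hypothetical helper (not part of DOCK): report file-valued parameters
# in a DOCK input file whose paths do not exist.
# Assumptions: one "parameter value" pair per line; file-valued
# parameter names end in _file or _table.
check_dock_paths() {
    local infile="$1" missing=0 key value
    # The "|| [ -n "$key" ]" part handles a last line without a newline.
    while read -r key value || [ -n "$key" ]; do
        case "$key" in
            *_file|*_table)
                if [ ! -e "$value" ]; then
                    echo "missing: $key -> $value"
                    missing=$((missing + 1))
                fi
                ;;
        esac
    done < "$infile"
    # Succeed only if every listed path exists.
    [ "$missing" -eq 0 ]
}

# Example use, from the directory containing the input file:
# check_dock_paths 2NNQ.fraglib
```

The function prints nothing and returns success when every path is found, so it can be chained in front of the dock6 command with &&.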
The only relevant part of the generated fragment library is the torsion environment file, which should be named 2NNQ.fraglib_torenv.dat. Next, the torsion environments of this molecule are combined with the file full_sorted_fraglib.dat using the Python script combine_torenv.py:
python /gpfs/projects/rizzo/zzz.programs/torsion_env_combination/combine_torenv.py 2NNQ.fraglib_torenv.dat /gpfs/projects/rizzo/zzz.programs/dock6.10_2019.06.18/parameters/fraglib_torenv.dat
This should generate a new list of torsion environments titled unique_full_sorted_fraglib.dat
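Before moving on, it may be worth confirming that the combined file was actually written. The following is a minimal sketch of such a check; check_output is a hypothetical helper name, and it tests only that the file exists and is non-empty, without validating the torsion environment format.

```shell
#!/bin/bash
# Hypothetical helper: report whether a file exists and is non-empty,
# and print its line count. It does not inspect the file contents.
check_output() {
    local f="$1"
    if [ -s "$f" ]; then
        # tr strips the padding that some wc implementations add.
        echo "$f: $(wc -l < "$f" | tr -d ' ') lines"
    else
        echo "$f: missing or empty" >&2
        return 1
    fi
}

# Example use, from inside the fraglib directory:
# check_output unique_full_sorted_fraglib.dat
```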
II. Performing a GA using 2NNQ
For this part, create a new directory inside 2NNQ_GA called 2NNQ_GA_results. cd into that directory and create a new file titled 2NNQ_GA.in.
mkdir 2NNQ_GA_results
cd 2NNQ_GA_results
touch 2NNQ_GA.in
Next, run dock6 with this input file:
dock6 -i 2NNQ_GA.in -o 2NNQ_GA.out
Enter the following responses to get the GA working properly. After ga_mutations, dock6 prompts for the addition, deletion, substitution, and replacement mutation types; answer yes to all of them unless you have a specific reason to exclude one.
conformer_search_type genetic
ga_molecule_file /Path/2nnq_lig_withH.mol2
ga_fraglib_scaffold_file /gpfs/projects/rizzo/zzz.programs/dock6.10_2019.06.14/parameters/fraglib_ga_scaffold.mol2
ga_fraglib_linker_file /gpfs/projects/rizzo/zzz.programs/dock6.10_2019.06.14/parameters/fraglib_linker.mol2
ga_fraglib_sidechain_file /gpfs/projects/rizzo/zzz.programs/dock6.10_2019.06.14/parameters/fraglib_sidechain.mol2
ga_torenv_table ../fraglib/unique_full_sorted_fraglib.dat
ga_max_generations 500
ga_xover_sampling_method_rand yes
ga_xover_max 150
ga_bond_tolerance 0.5
ga_angle_cutoff 0.14
ga_check_overlap no
ga_mutations yes
ga_mutate_addition yes
ga_mutate_deletion yes
ga_mutate_substitution yes
ga_mutate_replacement yes
ga_mutate_parents yes
ga_pmut_rate 0.3
ga_omut_rate 0.7
ga_max_mut_cycles 5
ga_mut_sampling_method rand
ga_num_random_picks 10
ga_max_root_size 5
ga_energy_cutoff 100
ga_heur_unmatched_num 2
ga_heur_matched_rmsd 2
ga_constraint_mol_wt 550
ga_constraint_rot_bon 10
ga_constraint_H_accept 10
ga_constraint_H_don 5
ga_constraint_formal_charge 4
ga_ensemble_size 200
ga_selection_method elitism
ga_elitism_combined no
ga_elitism_option max
ga_niching no
ga_selection_extinction no
ga_max_num_gen_with_no_crossover 1000
ga_output_prefix 2NNQ_GA_output
use_internal_energy yes
internal_energy_rep_exp 12
internal_energy_cutoff 100
use_database_filter no
orient_ligand no
bump_filter no
score_molecules yes
contact_score_primary no
grid_score_primary no
gist_score_primary no
multigrid_score_primary no
dock3.5_score_primary no
continuous_score_primary no
footprint_similarity_score_primary no
pharmacophore_score_primary no
hbond_score_primary no
descriptor_score_primary yes
descriptor_use_grid_score no
descriptor_use_multigrid_score no
descriptor_use_continuous_score no
descriptor_use_footprint_similarity no
descriptor_use_pharmacophore_score no
descriptor_use_tanimoto no
descriptor_use_hungarian no
descriptor_use_volume_overlap no
descriptor_use_gist no
minimize_ligand yes
minimize_anchor yes
minimize_flexible_growth yes
use_advanced_simplex_parameters no
simplex_max_cycles 1
simplex_score_converge 0.1
simplex_cycle_converge 1
simplex_trans_step 1
simplex_rot_step 0.1
simplex_tors_step 10
simplex_anchor_max_iterations 500
simplex_grow_max_iterations 500
simplex_grow_tors_premin_iterations 00
simplex_random_seed 0
simplex_restraint_min yes
simplex_coefficient_restraint 10
atom_model all
vdw_defn_file /gpfs/projects/AMS536/zzz.programs/dock6/parameters/vdw_AMBER_parm99.defn
flex_defn_file /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex.defn
flex_drive_file /gpfs/projects/AMS536/zzz.programs/dock6/parameters/flex_drive.tbl
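With ga_max_generations set to 500 the full run is long, so one option is to first try a copy of the input with a much smaller value to catch setup problems early. The sketch below is one way to make such a copy; set_param is a hypothetical helper, not a DOCK tool, and the test file name and the value 5 are illustrative only. It assumes the input file has one "parameter value" pair per line.

```shell
#!/bin/bash
# Hypothetical helper: print a copy of a DOCK input file with the value
# of one parameter replaced. Exits non-zero if the parameter is absent.
# Assumes one "parameter value" pair per line.
set_param() {
    local infile="$1" key="$2" value="$3"
    awk -v k="$key" -v v="$value" '
        $1 == k { print k, v; found = 1; next }   # replace the matching line
        { print }                                 # pass other lines through
        END { exit !found }                       # fail if key never matched
    ' "$infile"
}

# Example use (file name and value are illustrative):
# set_param 2NNQ_GA.in ga_max_generations 5 > 2NNQ_GA_test.in
```

If a short test is run this way, changing ga_output_prefix in the test copy as well would presumably keep its output separate from the full run.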
For the last step, submit the job to the SLURM queue to generate the results. The run lasts 500 generations and is too computationally intensive for the head node.
First, create a new script to submit to the SLURM system:
vim 2NNQ_GA.sh
Then enter the following into the 2NNQ_GA.sh script:
#!/bin/bash
#SBATCH --time=48:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=40
#SBATCH --job-name=2NNQ_GA_input
#SBATCH --output=2NNQ_GA_output
#SBATCH -p long-40core
cd $SLURM_SUBMIT_DIR
dock6 -i 2NNQ_GA.in -o 2NNQ_GA.out
Then submit the script. Because it uses #SBATCH directives, submit it with the SLURM command sbatch:
sbatch 2NNQ_GA.sh
To check on its status, use the following:
squeue -u username