Difference between revisions of "Development Goals Archive"
From Rizzo_Lab
Revision as of 13:19, 24 October 2022
DOCK_VS - Virtual Screening/Traditional Docking
Tasks | src | Owner | Complete? |
---|---|---|---|
verbose == 2 option in dock6 beta | utils.cpp | LEP | yes |
Add total conformers sampled | | | |
Check amide bond rotation during sampling - it's not a bug; it was fixed back in 2014 | | LEP | yes |
Write out # of HBond Donors and Acceptors | conf_gen_dn, library_file | LEP | yes |
put in compiler directives to compile with or without timespec | dock.cpp | LEP | yes |
Fix bug that prints out 2/3 sigfigs instead of 6 for MW and FC | library_file, filter, amber_typer | LEP | Yes |
Fix nano/micro/millisecond timer | dock.cpp | GDRM | Yes |
ga flag and verbose == 2 for premin_mol in simplex | simplex.cpp | LEP | Yes |
Merge Hackathon changes to beta for cleaner, faster code | pow/memcpy/mpi pointers everywhere | LEP | Yes |
Add Tip3p atom type to dock | vdw.defn fingerprint | LEP | Yes |
Hide secondary scoring function permanently | lots | LEP | Yes |
Merge GIST into latest dock | grid, master_score, score_descriptor, score_gist | LEP | Yes |
Add second layer of verbosity | utils, conf_gen_dn so far | LEP | Yes |
RDKit integration with DOCK | | GDM | Yes |
Modify Grid to show error on nonintegrality | | BTB | Yes |
DOCK_DN - De Novo Design
Task | Owner | Notes |
---|---|---|
Possible torenv check for dump molecules after capping before printing. | | |
Overhaul the simple-build function | Trent | 6.10 release |
When minimizing with descriptor score, make sure fingerprint is turned off | xxx | |
Speed up fingerprint calculations by saving reference ligand as a permanent object | WJA | |
Add pre-min conformations to growth trees | WJA | |
Add verbose flag options | WJA | |
Put molecular properties (RB, MW, etc) in mol2 header | WJA | |
Put ensemble properties (RB, MW, etc) output stream at the end of each layer | WJA | |
Check formal charge prune | BCF | |
Combination of horizontal pruning metrics (let's consider dropping tanimoto prune and just using hungarian prune) | WJA | |
Finish implementing growth trees | WJA | |
Revisit orienting to make sure it is working as intended | WJA | |
Fixed a bug where we were marking scaffold_this_layer as true for any fragment | WJA | |
Update random sampling function to use last layer changes in graph function | WJA | |
Do that same thing for the exhaustive function | WJA | |
I don't think we ever clear the scaf_link_sid vector; we definitely should do that somewhere | WJA | |
Update exhaustive to combine all frags into one library, just like graph / random. | WJA | |
hbond accept/donor descriptor implementation | Lauren | |
increase orienting verbose statistics for dn | Chris | |
acceptance based on freq of torsenv | John | |
secondary torenv check of prune dump molecules and testing | Lauren & John | |
SMILES and ZINC script (for dn and ga) | Lauren & John | |
add dn name with date and counter function | Lauren | |
Check MGS+(-50)TAN before and after fingerprinting fix for 663 systems | Lauren | |
determine if random seed is reset for each aps | Lauren | |
Create testset for each dn function | Lauren | |
Test simple build function with merged de novo | Lauren | |
clean make_unique script for release | Lauren & Stephen | |
merge GA into dock/dn | Lauren | |
MPI wrapper for 192 processors (8 nodes) for testsets on rizzo cluster | Dwight & Lauren | |
Create short testsets for denovo frag gen, focused fragment generic for DOCK6.9 release | Lauren | |
merge parameter files of de novo with DOCK | Dwight & Lauren | |
add dn_defn file for separate defn with Hydrogens | Lauren | |
Implement csingleton fix for orienting fragments with less than 3 heavy atoms | Lauren | |
Test bfochtman fix for rotatable bonds within a user-defined anchor | Lauren | |
Test csingleton fix for orienting fragments with Du | Lauren | |
Test MGS focused fragment library results with dn paper | Lauren | |
editing script to calculate SMILES strings of de novo molecules in OpenBabel | Stephen | |
smooth function cutoff for mw | Stephen | |
Rework VS protocol to integrate de novo protocol more smoothly | Lauren & John | |
Fix torsion problem for prune_dump molecules | Lauren & John | |
This is the Rizzo lab wiki page for coordinating bugs and progress on the de novo project.
Valgrind-clean version of the code on the cluster that the Rizzo lab should be using:
Lauren:
/gpfs/projects/rizzo/zzz.programs/dock6.9_release This version includes all changes of the merge.
Path to Generic Fragment Library:
/gpfs/projects/rizzo/leprentis/gen-frags-12
Path to Frequency Anchors:
/gpfs/projects/rizzo/leprentis/zinc1_ancs_freq
List of SB2012 systems that we will use for tests:
For now, let's use 5-15 rotatable bonds inclusive; total = 709 systems ("drug-like" size molecules). The de novo paper used only 663 systems, after removing 46 systems where the cognate ligand did not fall within a +/-2 formal charge. (5through15 = 709, 5through15_ch2 = 663)
{5RB = 107; 6RB = 96; 7RB = 103; 8RB = 75; 9RB = 66; 10RB = 75; 11RB = 57; 12RB = 41; 13RB = 38; 14RB = 26; 15RB = 25}
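The per-bin counts can be checked against the two stated totals with a short sketch. Only the counts come from the list above; the constant and function names are hypothetical and are not taken from the DOCK source.

```cpp
#include <array>
#include <numeric>

// Systems per rotatable-bond count, 5RB through 15RB, as listed above.
// All names here are hypothetical and exist only for this check.
constexpr std::array<int, 11> kSystemsPerRb = {107, 96, 103, 75, 66, 75,
                                               57, 41, 38, 26, 25};

// Total number of systems with 5-15 rotatable bonds (the 5through15 set).
inline int total_systems() {
    return std::accumulate(kSystemsPerRb.begin(), kSystemsPerRb.end(), 0);
}

// Systems left after removing those whose cognate ligand falls outside
// a +/-2 formal charge (the 5through15_ch2 set).
inline int charge_filtered_systems(int removed) {
    return total_systems() - removed;
}
```

Calling `total_systems()` and `charge_filtered_systems(46)` reproduces the 5through15 and 5through15_ch2 set sizes quoted above.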
DOCK_GA - Genetic Algorithm
Tasks | src | Owner | Complete? |
---|---|---|---|
Horizontal pruning issue not catching chemically identical molecules | | LEP | no |
Fix bug that collapsed atom coordinates | everywhere? nowhere? somewhere. | LEP | YES MUAHAHAHAHA! |
Added Delimiter header | conf_gen_ga | LEP | Yes |
Fix xover only feature | conf_gen_ga | LEP | Yes |
Put in error messages for mut_rate > 1 | conf_gen_ga | LEP | Yes |
Manual user-defined mutation type | conf_gen_ga | LEP | Yes |
Remove check only option | conf_gen_ga | LEP | Yes |
Add single molecule evolution in testcase in install dir. | install/test/genetic | LEP | yes |
Add leading 0's to xover output filenames | conf_gen_ga.cpp | JDB | no |
DNM replacement unable to build list? | conf_gen_ga.cpp | JDB | no |
Multi-layer replacement for Amides | conf_gen_ga.cpp | JDB | no |
Compute delta slope of fitness score | conf_gen_ga.cpp | BTB | no |
slow down molecular evolution so there are less drastic changes between each successive generation | conf_gen_ga.cpp | BTB | no |
bring in new parents (e.g. from a pool of molecules) based on convergence | conf_gen_ga.cpp | BTB | no |
user defined point vs on-the-fly convergence | conf_gen_ga.cpp | BTB | no |
metropolis selection for tournament/roulette | conf_gen_ga.cpp | BTB | no |
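For the "leading 0's" task in the table above, one possible approach is to pad the index with `std::setw` and `std::setfill` so that output files sort in numeric order. This is a minimal sketch, not the DOCK implementation: the helper name `xover_filename`, the `<prefix>_<index>.mol2` naming scheme, and the default width of 4 are all assumptions.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: builds "<prefix>_<zero-padded index>.mol2".
// The naming scheme and default width are assumptions, not DOCK behavior.
// Intended for non-negative indices; indices wider than `width` are
// written in full without truncation.
std::string xover_filename(const std::string& prefix, int index, int width = 4) {
    std::ostringstream name;
    // setw applies only to the next inserted value (the index).
    name << prefix << '_' << std::setw(width) << std::setfill('0') << index
         << ".mol2";
    return name.str();
}
```

With a fixed width, a plain lexicographic listing of the output directory matches generation order, which is the usual motivation for padding.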
To Do List
- tanimoto coefficient percent change - might be inaccurate due to tan coef behavior
- Rotatable bond changes (???)
- Limit number of aromatic rings.
- -xover (guided based on score) - Good v Good; Bad v Good; Bad v Bad THIS
- Nonexhaustive xover (pick subset of xover based on probability)
- Nonexhaustive xover for each pair of parents - have a set number of bonds for xover rather than trying all of them.
- Crossover on multiple points simultaneously (2-3 crossovers on a given pair to make 3+ children rather than 2+).
- Adaptive maintenance of ensemble based on convergence (extinction, delta max, etc.).
- Stop convergence.
- Mutations-
  - Adaptive mutation rate - change rate of mutation based on some internal criteria.
  - Pick location of mutation based on some internal criteria.
  - Pick mutation type based on ensemble behavior.
  - If molecules are too large, boost deletion (useful for elitism).
  - If molecules are too small, boost additive mutations.
  - If molecules are too similar, boost replacements and substitutions.
  - mutation type selection based on probability vs ensemble
  - Complete x # y mutation so far so less prevalent etc
  - Note: 3 layer substitutions probably aren't going to work.
  - 2-layer replacements.
- fitness-
  - turn on and off niching adaptive/extinction
  - reduce boost of fragments and all poor mols with niching
  - pareto/multiobjective ga
- selection-
  - Adaptively keep differing #'s of parents and children based on some internal criteria (similarity?)
- extinction-
- which molecules are best-
  - best first pruning - now uses descriptor score even if niching; need to delta to fitness/niching when used
  - geometric diversity using Hungarian and Tan pruning
- Determine whether the dummy vdw file is necessary for all DN (including GA) or just DN.
- Choose crossover pairs based on measure of similarity (boost crossovers between dissimilar molecules).
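Several items in this list (adaptive ensemble maintenance, adaptive mutation rate, bringing in new parents) depend on some measure of convergence. One candidate, sketched here under stated assumptions, is the least-squares slope of the best fitness score over the most recent generations. The function names, the window size, and the tolerance are hypothetical; nothing here is taken from conf_gen_ga.cpp.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Least-squares slope of fitness versus generation index over the last
// `window` generations. Hypothetical helper, not DOCK code.
// Returns 0.0 when fewer than two points are available.
double fitness_slope(const std::vector<double>& best_fitness, std::size_t window) {
    const std::size_t n =
        best_fitness.size() < window ? best_fitness.size() : window;
    if (n < 2) return 0.0;
    const std::size_t start = best_fitness.size() - n;
    double sum_x = 0.0, sum_y = 0.0, sum_xy = 0.0, sum_xx = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double x = static_cast<double>(i);
        const double y = best_fitness[start + i];
        sum_x += x;
        sum_y += y;
        sum_xy += x * y;
        sum_xx += x * x;
    }
    const double count = static_cast<double>(n);
    // The denominator is nonzero for n >= 2 because the x values are distinct.
    return (count * sum_xy - sum_x * sum_y) / (count * sum_xx - sum_x * sum_x);
}

// Treats the run as converged once a full window of generations exists
// and the magnitude of the slope is below `tol`.
bool has_converged(const std::vector<double>& best_fitness,
                   std::size_t window, double tol) {
    if (best_fitness.size() < window) return false;
    return std::fabs(fitness_slope(best_fitness, window)) < tol;
}
```

A check like `has_converged` could gate the adaptive behaviors listed above, for example raising the mutation rate or importing new parents once the slope flattens. Whether slope is a better trigger than a delta-max or extinction criterion is left open here.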
Known Bugs
- Molecules Processed bug (dock.cpp)
- verbose mol stats (amber typer)
- molecule being renamed when going into repl even though it's the same molecule