DOCK GA Development Goals
From Rizzo_Lab
								
												
Latest revision as of 11:59, 1 August 2024
| Tasks | src | Owner | Complete? | 
|---|---|---|---|
| Horizontal pruning issue not catching chemically identical molecules | | LEP | no |
| Fix bug that collapsed atom coordinates | everywhere? nowhere? somewhere. | LEP | YES MUAHAHAHAHA! | 
| Added Delimiter header | conf_gen_ga | LEP | Yes | 
| Move ga_utilities to db_utilities, GA utilities requires running evolution, strips the name, etc. | conf_gen_ga | ? | no | 
| Fix xover only feature | conf_gen_ga | LEP | Yes | 
| Put in error messages for mut_rate > 1 | conf_gen_ga | LEP | Yes | 
| Manual user-defined mutation type | conf_gen_ga | LEP | Yes | 
| Remove check only option | conf_gen_ga | LEP | Yes | 
| Add single molecule evolution in testcase in install dir. | install/test/genetic | LEP | yes | 
| Add leading 0's to xover output filenames | conf_gen_ga.cpp | JDB | no | 
| DNM replacement unable to build list? | conf_gen_ga.cpp | JDB | no | 
| Multi-layer replacement for Amides | conf_gen_ga.cpp | JDB | no | 
| Compute delta slope of fitness score | conf_gen_ga.cpp | BTB | no | 
| Slow down molecular evolution so that changes between successive generations are less drastic | conf_gen_ga.cpp | BTB | no | 
| Bring in new parents (e.g. from a pool of molecules) based on convergence | conf_gen_ga.cpp | BTB | no | 
| User-defined point vs. on-the-fly convergence | conf_gen_ga.cpp | BTB | no | 
| Metropolis selection for tournament/roulette | conf_gen_ga.cpp | BTB | no | 
| Selection of fragments for mutation based on Fragment Score | conf_gen_ga.cpp | BTB | no | 
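
The "Compute delta slope of fitness score" and on-the-fly convergence tasks in the table above could be prototyped roughly as follows. This is only a sketch under stated assumptions: `fitness_slope`, `delta_slope`, and `has_converged` are hypothetical helpers, not existing functions in conf_gen_ga.cpp, and the history vector is assumed to hold one best fitness value per generation. The window size and tolerance are illustrative parameters that would need tuning.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not a DOCK function): least-squares slope of up to
// `window` best-fitness values ending just before index `end`.
// Returns 0.0 when fewer than two points are available.
double fitness_slope(const std::vector<double>& history, std::size_t end,
                     std::size_t window) {
    if (end > history.size()) end = history.size();
    const std::size_t n = end < window ? end : window;
    if (n < 2) return 0.0;
    const std::size_t start = end - n;
    double sum_x = 0.0, sum_y = 0.0, sum_xy = 0.0, sum_xx = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double x = static_cast<double>(i);  // generation offset in window
        const double y = history[start + i];      // best fitness at that offset
        sum_x += x;
        sum_y += y;
        sum_xy += x * y;
        sum_xx += x * x;
    }
    const double count = static_cast<double>(n);
    return (count * sum_xy - sum_x * sum_y) / (count * sum_xx - sum_x * sum_x);
}

// Delta slope: slope of the current window minus the slope of the window
// that ended one generation earlier.
double delta_slope(const std::vector<double>& history, std::size_t window) {
    if (history.empty()) return 0.0;
    return fitness_slope(history, history.size(), window) -
           fitness_slope(history, history.size() - 1, window);
}

// On-the-fly convergence test: the run is treated as converged once a full
// window of generations shows an absolute slope below `tolerance`.
bool has_converged(const std::vector<double>& history, std::size_t window,
                   double tolerance) {
    if (window < 2 || history.size() < window) return false;
    return std::fabs(fitness_slope(history, history.size(), window)) < tolerance;
}
```

A user-defined convergence point would replace the `has_converged` call with a fixed generation number, while the on-the-fly variant would call it once per generation and, for example, trigger the "bring in new parents" step when it returns true.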
To Do List
- Tanimoto coefficient percent change - might be inaccurate due to Tanimoto coefficient behavior
- Rotatable bond changes (???)
- Limit number of aromatic rings.
- xover (guided based on score) - Good v Good; Bad v Good; Bad v Bad THIS
- Nonexhaustive xover (pick subset of xover based on probability)
- Nonexhaustive xover for each pair of parents - have a set number of bonds for xover rather than trying all of them.
- Crossover on multiple points simultaneously (2-3 crossovers on a given pair to make 3+ children rather than 2+).
- Adaptive maintenance of ensemble based on convergence (extinction, delta max, etc.).
- Stop convergence.
- Mutations:
  - Adaptive mutation rate - change rate of mutation based on some internal criteria.
  - Pick location of mutation based on some internal criteria.
  - Pick mutation type based on ensemble behavior.
  - If molecules are too large, boost deletion (useful for elitism).
  - If molecules are too small, boost additive mutations.
  - If molecules are too similar, boost replacements and substitutions.
  - Mutation type selection based on probability vs ensemble.
  - Track how many mutations of each type have been completed so far, so that over-used types become less prevalent, etc.
  - Note: 3-layer substitutions probably aren't going to work.
  - 2-layer replacements.
- Fitness:
  - Turn on and off niching adaptive/extinction.
  - Reduce boost of fragments and all poor mols with niching.
  - Pareto/multiobjective GA.
- Selection:
  - Adaptively keep differing #'s of parents and children based on some internal criteria (similarity?)
- Extinction:
- Which molecules are best:
  - Best-first pruning - currently uses descriptor score even if niching is on; need to change to fitness/niching when used.
  - Geometric diversity using Hungarian and Tanimoto pruning.
 
- Determine whether the dummy vdw file is necessary for all DN (including GA) or just DN.
- Choose crossover pairs based on measure of similarity (boost crossovers between dissimilar molecules).
Known Bugs
- Molecules Processed bug (dock.cpp)
- verbose mol stats (amber typer)
- Molecule is renamed when going into repl even though it's the same molecule
