DOCK GA Development Goals

From Rizzo_Lab

Latest revision as of 10:59, 1 August 2024

{|
! Tasks !! src !! Owner !! Complete?
|-
| Horizontal pruning issue not catching chemically identical molecules || || LEP || no
|-
| Fix bug that collapsed atom coordinates everywhere? nowhere? somewhere. || || LEP || YES MUAHAHAHAHA!
|-
| Added Delimiter header || conf_gen_ga || LEP || Yes
|-
| Move ga_utilities to db_utilities, GA utilities requires running evolution, strips the name, etc. || conf_gen_ga || ? || no
|-
| Fix xover only feature || conf_gen_ga || LEP || Yes
|-
| Put in error messages for mut_rate > 1 || conf_gen_ga || LEP || Yes
|-
| Manual user-defined mutation type || conf_gen_ga || LEP || Yes
|-
| Remove check only option || conf_gen_ga || LEP || Yes
|-
| Add single molecule evolution in testcase in install dir. || install/test/genetic || LEP || yes
|-
| Add leading 0's to xover output filenames || conf_gen_ga.cpp || JDB || no
|-
| DNM replacement unable to build list? || conf_gen_ga.cpp || JDB || no
|-
| Multi-layer replacement for Amides || conf_gen_ga.cpp || JDB || no
|-
| Compute delta slope of fitness score || conf_gen_ga.cpp || BTB || no
|-
| Slow down molecular evolution so there are less drastic changes between each successive generation || conf_gen_ga.cpp || BTB || no
|-
| Bring in new parents (e.g. from a pool of molecules) based on convergence || conf_gen_ga.cpp || BTB || no
|-
| User-defined point vs on-the-fly convergence || conf_gen_ga.cpp || BTB || no
|-
| Metropolis selection for tournament/roulette || conf_gen_ga.cpp || BTB || no
|-
| Selection of fragments for mutation based on Fragment Score || conf_gen_ga.cpp || BTB || no
|}
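
For the "Compute delta slope of fitness score" task, one possible reading is: fit a least-squares slope to the best fitness score over the most recent generations, fit another to the window just before it, and take the difference. The sketch below follows that reading only. The names fitness_slope, delta_slope, best_fitness and window are hypothetical and do not come from conf_gen_ga.cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Least-squares slope of y[first .. first+count-1] against the generation
// index 0 .. count-1. Hypothetical helper, not part of DOCK.
double fitness_slope(const std::vector<double>& y, std::size_t first, std::size_t count) {
    assert(count >= 2 && first + count <= y.size());
    double mean_x = 0.0;
    double mean_y = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        mean_x += static_cast<double>(i);
        mean_y += y[first + i];
    }
    mean_x /= static_cast<double>(count);
    mean_y /= static_cast<double>(count);

    double sxy = 0.0;
    double sxx = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        const double dx = static_cast<double>(i) - mean_x;
        sxy += dx * (y[first + i] - mean_y);
        sxx += dx * dx;
    }
    return sxy / sxx;  // sxx > 0 because count >= 2
}

// Slope over the last `window` generations minus the slope over the
// `window` generations before that. Values near zero mean the trend is
// unchanged; a slope that drops toward zero suggests a plateau.
double delta_slope(const std::vector<double>& best_fitness, std::size_t window) {
    assert(window >= 2 && best_fitness.size() >= 2 * window);
    const std::size_t n = best_fitness.size();
    const double recent = fitness_slope(best_fitness, n - window, window);
    const double previous = fitness_slope(best_fitness, n - 2 * window, window);
    return recent - previous;
}
```

A value like this could feed the "user-defined point vs on-the-fly convergence" task, for example by comparing it with a user-supplied threshold. The choice of window length and threshold is left open here.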



To Do List

  1. Tanimoto coefficient percent change - might be inaccurate due to Tanimoto coefficient behavior
  2. Rotatable bond changes (???)
  3. Limit number of aromatic rings.
  4. -xover (guided based on score) - Good v Good ; Bad v Good ; Bad v Bad THIS
  5. Nonexhaustive xover (pick subset of xover based on probability)
  6. Nonexhaustive xover for each pair of parents - have a set number of bonds for xover rather than trying all of them.
  7. Crossover on multiple points simultaneously (2-3 crossovers on a given pair to make 3+ children rather than 2+).
  8. Adaptive maintenance of ensemble based on convergence (extinction, delta max, etc.).
  9. Stop convergence.
  10. Mutations-
    1. Adaptive mutation rate - change rate of mutation based on some internal criteria.
    2. Pick location of mutation based on some internal criteria.
    3. Pick mutation type based on ensemble behavior.
    4. If molecules are too large, boost deletion (useful for elitism).
    5. If molecules are too small, boost additive mutations.
    6. If molecules are too similar, boost replacements and substitutions.
    7. mutation type selection based on probability vs ensemble
    8. Complete x # y mutation so far so less prevalent etc
    9. Note: 3 layer substitutions probably aren't going to work.
    10. 2-layer replacements.
  11. fitness-
    1. turn on and off niching adaptive/extinction
    2. reduce boost of fragments and all poor mols with niching
    3. Pareto/multiobjective GA
  12. selection-
    1. Adaptively keep differing #'s of parents and children based on some internal criteria (similarity?)
  13. extinction-
  14. which molecules are best-
    1. Best-first pruning - currently uses the descriptor score even if niching is on; need to change to the fitness/niching score when niching is used
    2. Geometric diversity using Hungarian and Tanimoto pruning
  15. Determine whether the dummy vdw file is necessary for all DN (including GA) or just DN.
  16. Choose crossover pairs based on measure of similarity (boost crossovers between dissimilar molecules).
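
Several items above rely on a similarity measure (the Tanimoto percent change in item 1, and boosting crossovers between dissimilar molecules in item 16). As a minimal sketch, assuming molecules are represented by fixed-length bit fingerprints, the Tanimoto coefficient is the number of shared bits divided by the number of bits set in either fingerprint. The names Fingerprint, kBits, tanimoto and crossover_weight are hypothetical, and the fingerprint length is arbitrary. This is not how DOCK itself computes the score.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kBits = 64;        // hypothetical fingerprint length
using Fingerprint = std::bitset<kBits>;  // hypothetical molecule descriptor

// Tanimoto coefficient: |A and B| / |A or B|, in the range [0, 1].
double tanimoto(const Fingerprint& a, const Fingerprint& b) {
    const std::size_t shared = (a & b).count();
    const std::size_t total = (a | b).count();
    if (total == 0) {
        return 1.0;  // assumption: two empty fingerprints count as identical
    }
    return static_cast<double>(shared) / static_cast<double>(total);
}

// One possible pairing weight for crossover: dissimilar parents get a
// weight near 1, near-identical parents a weight near 0.
double crossover_weight(const Fingerprint& a, const Fingerprint& b) {
    const double weight = 1.0 - tanimoto(a, b);
    assert(weight >= 0.0 && weight <= 1.0);
    return weight;
}
```

Because the coefficient is a ratio of bit counts, small fingerprints change it in coarse steps, which may be one reason a percent-change measure built on it behaves erratically.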

Known Bugs

  1. Molecules Processed bug (dock.cpp)
  2. verbose mol stats (amber typer)
  3. Molecule is renamed when going into repl even though it is the same molecule