Difference between revisions of "DOCK GA Development Goals"

From Rizzo_Lab

Latest revision as of 11:09, 17 October 2022

{|
! Tasks !! src !! Owner !! Complete?
|-
| Horizontal pruning issue not catching chemically identical molecules || || LEP || no
|-
| Fix bug that collapsed atom coordinates everywhere? nowhere? somewhere. || || LEP || YES MUAHAHAHAHA!
|-
| Added delimiter header || conf_gen_ga || LEP || Yes
|-
| Fix xover only feature || conf_gen_ga || LEP || Yes
|-
| Put in error messages for mut_rate > 1 || conf_gen_ga || LEP || Yes
|-
| Manual user-defined mutation type || conf_gen_ga || LEP || Yes
|-
| Remove check only option || conf_gen_ga || LEP || Yes
|-
| Add single molecule evolution in testcase in install dir. || install/test/genetic || LEP || yes
|-
| Add leading 0's to xover output filenames || conf_gen_ga.cpp || JDB || no
|-
| DNM replacement unable to build list? || conf_gen_ga.cpp || JDB || no
|-
| Multi-layer replacement for Amides || conf_gen_ga.cpp || JDB || no
|-
| Compute delta slope of fitness score || conf_gen_ga.cpp || BTB || no
|-
| Slow down molecular evolution so there are less drastic changes between each successive generation || conf_gen_ga.cpp || BTB || no
|-
| Bring in new parents (e.g. from a pool of molecules) based on convergence || conf_gen_ga.cpp || BTB || no
|-
| User defined point vs on-the-fly convergence || conf_gen_ga.cpp || BTB || no
|-
| Metropolis selection for tournament/roulette || conf_gen_ga.cpp || BTB || no
|}
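
Two of the open tasks above, the delta slope of the fitness score and Metropolis selection for tournaments, might be sketched roughly as follows. This is an illustrative Python sketch, not code from conf_gen_ga.cpp. Every function name, the window size, the tolerance, and the temperature parameter are hypothetical, and it assumes that lower (more negative) scores are better.

```python
import math
import random


def fitness_slope(best_scores, window=5):
    """Least-squares slope of the best score over the last `window` generations."""
    ys = best_scores[-window:]
    n = len(ys)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2.0
    y_mean = sum(ys) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(ys))
    den = sum((x - x_mean) ** 2 for x in range(n))
    return num / den


def has_converged(best_scores, window=5, tol=0.01):
    """On-the-fly convergence test: the recent score trend is (nearly) flat."""
    if len(best_scores) < window:
        return False
    return abs(fitness_slope(best_scores, window)) < tol


def metropolis_accept(challenger, incumbent, temperature, rng=random):
    """Metropolis criterion, assuming lower scores are better."""
    delta = challenger - incumbent
    if delta <= 0:
        return True  # equal or better score: always accept
    if temperature <= 0:
        return False  # zero temperature: never accept a worse score
    return rng.random() < math.exp(-delta / temperature)


def metropolis_tournament(scores, size, temperature, rng=random):
    """Return the index of one parent chosen by a Metropolis-style tournament."""
    contenders = rng.sample(range(len(scores)), size)
    winner = contenders[0]
    for idx in contenders[1:]:
        # A worse challenger can still displace the winner with some probability.
        if metropolis_accept(scores[idx], scores[winner], temperature, rng):
            winner = idx
    return winner
```

With a temperature of zero the tournament reduces to ordinary best-of-N selection. Raising the temperature lets weaker molecules win more often, which is one possible way to slow convergence.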



To Do List

  1. Tanimoto coefficient percent change - might be inaccurate due to Tanimoto coefficient behavior
  2. Rotatable bond changes (???)
  3. Limit number of aromatic rings.
  4. Guided xover (based on score) - Good v Good; Bad v Good; Bad v Bad THIS
  5. Nonexhaustive xover (pick subset of xover based on probability)
  6. Nonexhaustive xover for each pair of parents - have a set number of bonds for xover rather than trying all of them.
  7. Crossover on multiple points simultaneously (2-3 crossovers on a given pair to make 3+ children rather than 2+).
  8. Adaptive maintenance of ensemble based on convergence (extinction, delta max, etc.).
  9. Stop convergence.
  10. Mutations-
    1. Adaptive mutation rate - change rate of mutation based on some internal criteria.
    2. Pick location of mutation based on some internal criteria.
    3. Pick mutation type based on ensemble behavior.
    4. If molecules are too large, boost deletion (useful for elitism).
    5. If molecules are too small, boost additive mutations.
    6. If molecules are too similar, boost replacements and substitutions.
    7. mutation type selection based on probability vs ensemble
    8. Track how many (x) of each mutation type (y) have been completed so far, so that over-used types become less prevalent, etc.
    9. Note: 3 layer substitutions probably aren't going to work.
    10. 2-layer replacements.
  11. fitness-
    1. turn on and off niching adaptive/extinction
    2. reduce boost of fragments and all poor mols with niching
    3. Pareto/multiobjective GA
  12. selection-
    1. Adaptively keep differing #'s of parents and children based on some internal criteria (similarity?)
  13. extinction-
  14. which molecules are best-
    1. Best-first pruning - currently uses the descriptor score even when niching is on; needs to change to the fitness/niching score when niching is used
    2. Geometric diversity using Hungarian and Tanimoto pruning
  15. Determine whether the dummy vdw file is necessary for all DN (including GA) or just DN.
  16. Choose crossover pairs based on measure of similarity (boost crossovers between dissimilar molecules).
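
Several items in the list above rely on a similarity measure: boosting crossovers between dissimilar molecules, and adapting the mutation rate to ensemble behavior. The following is a minimal Python sketch of those two ideas, not DOCK code. It assumes each molecule is represented by a fingerprint given as a set of on-bit indices. All function names and the `low`/`high` rate bounds are hypothetical.

```python
import itertools
import random


def tanimoto(a, b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    return len(a & b) / union if union else 1.0


def mean_pairwise_similarity(fingerprints):
    """Average Tanimoto coefficient over all pairs in the ensemble."""
    pairs = list(itertools.combinations(fingerprints, 2))
    if not pairs:
        return 1.0
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)


def pick_dissimilar_pair(fingerprints, rng=random):
    """Sample a parent pair with probability proportional to 1 - Tanimoto."""
    if len(fingerprints) < 2:
        raise ValueError("need at least two molecules to form a crossover pair")
    pairs = list(itertools.combinations(range(len(fingerprints)), 2))
    weights = [1.0 - tanimoto(fingerprints[i], fingerprints[j]) for i, j in pairs]
    if sum(weights) == 0:
        # Every pair is identical, so fall back to a uniform choice.
        return rng.choice(pairs)
    return rng.choices(pairs, weights=weights, k=1)[0]


def adaptive_mutation_rate(fingerprints, low=0.05, high=0.5):
    """Raise the mutation rate linearly as the ensemble becomes more similar."""
    return low + (high - low) * mean_pairwise_similarity(fingerprints)
```

Keeping `high` at or below 1 keeps the adapted rate within the range the existing mut_rate check allows. The linear mapping is only one possible choice, and any monotonic function of ensemble similarity would serve the same purpose.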

Known Bugs

  1. Molecules Processed bug (dock.cpp)
  2. verbose mol stats (amber typer)
  3. molecule being renamed when going into repl even though it's the same molecule