DOCK GA Development Goals

From Rizzo_Lab
Revision as of 10:59, 1 August 2024 by BrockBoysan
Task | Src | Owner | Complete?
Horizontal pruning does not catch chemically identical molecules |  | LEP | no
Fix bug that collapsed atom coordinates everywhere? nowhere? somewhere. |  | LEP | YES MUAHAHAHAHA!
Added delimiter header | conf_gen_ga | LEP | yes
Move ga_utilities to db_utilities; GA utilities requires running evolution, strips the name, etc. | conf_gen_ga | ? | no
Fix xover-only feature | conf_gen_ga | LEP | yes
Put in error messages for mut_rate > 1 | conf_gen_ga | LEP | yes
Manual user-defined mutation type | conf_gen_ga | LEP | yes
Remove check-only option | conf_gen_ga | LEP | yes
Add single-molecule evolution test case in install dir | install/test/genetic | LEP | yes
Add leading zeros to xover output filenames | conf_gen_ga.cpp | JDB | no
DNM replacement unable to build list? | conf_gen_ga.cpp | JDB | no
Multi-layer replacement for amides | conf_gen_ga.cpp | JDB | no
Compute delta slope of fitness score | conf_gen_ga.cpp | BTB | no
Slow down molecular evolution so there are less drastic changes between successive generations | conf_gen_ga.cpp | BTB | no
Bring in new parents (e.g. from a pool of molecules) based on convergence | conf_gen_ga.cpp | BTB | no
User-defined point vs on-the-fly convergence | conf_gen_ga.cpp | BTB | no
Metropolis selection for tournament/roulette | conf_gen_ga.cpp | BTB | no
Selection of fragments for mutation based on fragment score | conf_gen_ga.cpp | BTB | no
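
Several of the open tasks above (delta slope of fitness score, bringing in new parents based on convergence, user-defined vs on-the-fly convergence) hinge on measuring how fast the best score is still changing. One possible way to do that is a least-squares slope over the last few generations. The sketch below is only an illustration under assumptions: fitness_slope, has_converged, window, and slope_tol are hypothetical names, not existing functions or parameters in conf_gen_ga.cpp, and it assumes one best score is recorded per generation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not DOCK code): least-squares slope of the last
// `window` entries of `best_scores`, with x = 0, 1, 2, ... as the
// generation index inside the window.
double fitness_slope(const std::vector<double>& best_scores, std::size_t window) {
    assert(window >= 2 && window <= best_scores.size());
    const std::size_t start = best_scores.size() - window;
    const double n = static_cast<double>(window);
    double sum_x = 0.0, sum_y = 0.0, sum_xy = 0.0, sum_xx = 0.0;
    for (std::size_t i = 0; i < window; ++i) {
        const double x = static_cast<double>(i);
        const double y = best_scores[start + i];
        sum_x += x;
        sum_y += y;
        sum_xy += x * y;
        sum_xx += x * x;
    }
    // The denominator is strictly positive whenever window >= 2.
    return (n * sum_xy - sum_x * sum_y) / (n * sum_xx - sum_x * sum_x);
}

// Hypothetical on-the-fly convergence test: true once the score trend
// over the last `window` generations is flatter than `slope_tol`.
bool has_converged(const std::vector<double>& best_scores,
                   std::size_t window, double slope_tol) {
    if (best_scores.size() < window) return false;  // not enough history yet
    return std::fabs(fitness_slope(best_scores, window)) < slope_tol;
}
```

A user-defined convergence point would simply skip has_converged and stop (or inject new parents) at a fixed generation, while the on-the-fly variant would call it once per generation. The window length and tolerance would need tuning against real runs.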



To Do List

  1. Tanimoto coefficient percent change - might be inaccurate due to Tanimoto coefficient behavior
  2. Rotatable bond changes (???)
  3. Limit number of aromatic rings.
  4. Xover (guided based on score) - Good v Good; Bad v Good; Bad v Bad THIS
  5. Nonexhaustive xover (pick subset of xover based on probability)
  6. Nonexhaustive xover for each pair of parents - have a set number of bonds for xover rather than trying all of them.
  7. Crossover on multiple points simultaneously (2-3 crossovers on a given pair to make 3+ children rather than 2+).
  8. Adaptive maintenance of ensemble based on convergence (extinction, delta max, etc.).
  9. Stop convergence.
  10. Mutations-
    1. Adaptive mutation rate - change rate of mutation based on some internal criteria.
    2. Pick location of mutation based on some internal criteria.
    3. Pick mutation type based on ensemble behavior.
    4. If molecules are too large, boost deletion (useful for elitism).
    5. If molecules are too small, boost additive mutations.
    6. If molecules are too similar, boost replacements and substitutions.
    7. Mutation type selection based on probability vs ensemble.
    8. Completed x number of y mutations so far, so make y less prevalent, etc.
    9. Note: 3 layer substitutions probably aren't going to work.
    10. 2-layer replacements.
  11. Fitness-
    1. Turn niching adaptive/extinction on and off.
    2. Reduce boost of fragments and all poor mols with niching.
    3. Pareto/multiobjective GA.
  12. Selection-
    1. Adaptively keep differing #'s of parents and children based on some internal criteria (similarity?)
  13. Extinction-
  14. Which molecules are best-
    1. Best-first pruning - now uses descriptor score even if niching is on; need to change to fitness/niching score when niching is used.
    2. Geometric diversity using Hungarian and Tanimoto pruning.
  15. Determine whether the dummy vdw file is necessary for all DN (including GA) or just DN.
  16. Choose crossover pairs based on measure of similarity (boost crossovers between dissimilar molecules).
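
The size- and similarity-based mutation ideas in the Mutations item (boost deletion when molecules are too large, boost additive mutations when too small, boost replacements and substitutions when too similar) could be expressed as per-type weights that feed a weighted random pick. The following is a rough sketch under assumptions, not the existing GA code: the enum values, the EnsembleStats and AdaptiveLimits structs, and every threshold are hypothetical, and the ensemble averages are assumed to be computed elsewhere.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <random>

// Hypothetical mutation types; the names are assumptions for illustration.
enum MutationType { ADDITION = 0, DELETION, REPLACEMENT, SUBSTITUTION,
                    NUM_MUTATION_TYPES };

// Hypothetical summary of the current ensemble.
struct EnsembleStats {
    double mean_heavy_atoms;  // average molecule size
    double mean_tanimoto;     // average pairwise Tanimoto similarity, 0..1
};

// Hypothetical user-set thresholds and boost factor.
struct AdaptiveLimits {
    double max_heavy_atoms;   // above this, molecules are "too large"
    double min_heavy_atoms;   // below this, molecules are "too small"
    double max_tanimoto;      // above this, molecules are "too similar"
    double boost;             // multiplicative weight factor, >= 1
};

// Start from equal weights and boost the types the ensemble seems to need.
std::array<double, NUM_MUTATION_TYPES>
mutation_weights(const EnsembleStats& s, const AdaptiveLimits& lim) {
    assert(lim.boost >= 1.0);
    std::array<double, NUM_MUTATION_TYPES> w;
    w.fill(1.0);
    if (s.mean_heavy_atoms > lim.max_heavy_atoms) w[DELETION] *= lim.boost;
    if (s.mean_heavy_atoms < lim.min_heavy_atoms) w[ADDITION] *= lim.boost;
    if (s.mean_tanimoto > lim.max_tanimoto) {
        w[REPLACEMENT] *= lim.boost;
        w[SUBSTITUTION] *= lim.boost;
    }
    return w;
}

// Draw one mutation type with probability proportional to its weight.
MutationType pick_mutation(const std::array<double, NUM_MUTATION_TYPES>& w,
                           std::mt19937& rng) {
    std::discrete_distribution<int> dist(w.begin(), w.end());
    return static_cast<MutationType>(dist(rng));
}
```

Keeping the weights separate from the draw means the same weight array could also absorb other criteria from the list, for example lowering the weight of a mutation type that has already been applied many times.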

Known Bugs

  1. Molecules Processed bug (dock.cpp)
  2. Verbose mol stats (amber typer)
  3. Molecule is renamed when going into repl even though it's the same molecule