De novo Developer Progress
From Rizzo_Lab
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− | |||
− |
Latest revision as of 11:53, 4 February 2019
This is the Rizzo lab wiki page for coordinating bugs and progress on the de novo project.
Valgrind-clean version of the code on the cluster that the Rizzo lab should be using:
Lauren:
/gpfs/projects/rizzo/zzz.programs/dock6.9_release
This version includes all changes from the merge.
Path to Generic Fragment Library:
/gpfs/projects/rizzo/leprentis/gen-frags-12
Path to Frequency Anchors:
/gpfs/projects/rizzo/leprentis/zinc1_ancs_freq
Current Coding Progress:
Working on these currently:
- Lauren: Check MGS+(-50)TAN before and after fingerprinting fix for 663 systems
- Lauren: Implement Roulette fragment picking into graph and random as an option
- Lauren: Implement Adjacency Matrix into fraglib/dn (initialize matrix and utilize matrix for graph and random fragment picking)
- Lauren: Testing with sb2012 default values for molecules passed on to the next layer and root (looking for timing and efficiency).
- Lauren&John: Rework VS protocol to integrate de novo protocol more smoothly
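As a point of reference for the roulette item above, here is a minimal sketch of roulette-wheel (weight-proportional) selection in Python. It is not the DOCK implementation: the fragment names are hypothetical, and using library frequencies as the weights is an assumption made only for illustration.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def roulette_pick(fragments, weights, rng):
    """Pick one fragment with probability proportional to its weight.

    Assumption: weights are non-negative numbers (e.g. fragment
    frequencies in the library); this is illustrative, not DOCK code.
    """
    if not fragments or len(fragments) != len(weights):
        raise ValueError("fragments and weights must be non-empty and equal length")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    cumulative = list(accumulate(weights))
    total = cumulative[-1]
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    # Spin the wheel: a point in [0, total) lands in exactly one slice.
    spin = rng.random() * total
    # Clamp guards against floating-point rounding at the upper edge.
    index = min(bisect_right(cumulative, spin), len(fragments) - 1)
    return fragments[index]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are reproducible
    frags = ["frag_A", "frag_B", "frag_C"]  # hypothetical names
    freqs = [10, 30, 60]                    # hypothetical frequencies
    print([roulette_pick(frags, freqs, rng) for _ in range(5)])
```

A fragment with weight zero is never selected, while every fragment with a positive weight keeps a nonzero chance of being picked, unlike a hard top-N cutoff.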
Need to be fixed:
- score_molecules and internal_energy problem (for simple_build)
- HMS needs to be fixed when no heavy atoms match
- Get rid of some scoring functions and only use descriptor score
Not working on these right now:
- Lauren: Adjacency matrix vs tors env
- Lauren: Addition of "3mer" combination fragment check (post tors check)
- Lauren: Min and Max for charge to replace absolute value of charge. (Broke everything)
- Lauren: Capping groups for post growth process (halogens and methyls)
- Lauren: Fix Frag_String output into Chimera for refinement situations (temporary fix for now: remove the spaces in the mol2 file)
- Stephen: Change scaling factor to a function of decay (currently a straight line to lowest score cutoff)
- Lauren&John: SMILES and ZINC script (for dn and ga)
- Lauren: Incorporate tan pruning as final step (post growth) as user option (replace make_unique script)
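To make the post-growth tan pruning item above concrete, here is a minimal Python sketch of greedy Tanimoto pruning. It rests on stated assumptions rather than on the DOCK code: fingerprints are stored as sets of on-bit indices, a lower (more negative) score is better, and the cutoff value and molecule names are hypothetical.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient of two fingerprints given as sets of on-bits."""
    union = len(fp_a | fp_b)
    # Two empty fingerprints are treated as identical (an assumption).
    return len(fp_a & fp_b) / union if union else 1.0


def prune_by_tanimoto(molecules, cutoff):
    """Greedy redundancy pruning of a final ensemble.

    molecules: list of (name, score, fingerprint_set) tuples.
    Molecules are visited best score first (lowest score, by assumption);
    one is kept only if its similarity to every molecule kept so far
    does not exceed `cutoff`.
    """
    kept = []
    for mol in sorted(molecules, key=lambda m: m[1]):
        if all(tanimoto(mol[2], other[2]) <= cutoff for other in kept):
            kept.append(mol)
    return kept


if __name__ == "__main__":
    ensemble = [
        ("mol_1", -50.0, {1, 2, 3}),   # hypothetical molecules
        ("mol_2", -45.0, {1, 2, 3}),   # duplicate fingerprint of mol_1
        ("mol_3", -40.0, {7, 8}),
    ]
    for name, score, _ in prune_by_tanimoto(ensemble, cutoff=0.95):
        print(name, score)
```

Because the best-scoring member of each group of near-duplicates is visited first, it is the one that survives, which is the behavior one would want from a replacement for a separate uniqueness script.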
Completed:
- Lauren: determine if random seed is reset for each aps
- Lauren: Create testset for each dn function
- Lauren: Test simple build function with merged de novo
- Lauren&Stephen: clean make_unique script for release
- Lauren: merge GA into dock/dn
- Dwight & Lauren: MPI wrapper for 192 processors (8 nodes) for testsets on rizzo cluster
- Lauren: Create short testsets for denovo frag gen, focused fragment generic for DOCK6.9 release
- Dwight+Lauren: merge parameter files of de novo with DOCK
- Lauren: add dn_defn file for separate defn with Hydrogens
- Lauren: Implement csingleton fix for orienting fragments with fewer than 3 heavy atoms
- Lauren: Test bfochtman fix for rotatable bonds within a user-defined anchor
- Lauren: Test csingleton fix for orienting fragments with Du
- Lauren: Test MGS focused fragment library results with dn paper
- Stephen: editing script to calculate SMILES string of de novo molecules in Open Babel
List of features that we definitely want for the 6.9 release:
Task | Owner | Complete? |
---|---|---|
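<strike>Smooth pruning scaling function</strike> | LEP | |
<strike>Roulette function to Random and Graph as an option</strike> | LEP | |
<strike>Overhaul the simple build function</strike> | LEP | |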
When minimizing with descriptor score, make sure fingerprint is turned off | xxx | |
Speed up fingerprint calculations by saving reference ligand as a permanent object | WJA | yep |
Add pre-min conformations to growth trees | WJA | yep |
Add verbose flag options | WJA | yep |
Put molecular properties (RB, MW, etc) in mol2 header | WJA | yep |
Put ensemble properties (RB, MW, etc) output stream at the end of each layer | WJA | yep |
Check formal charge prune | BCF | yep |
Combination of horizontal pruning metrics (let's consider dropping tanimoto prune and just using hungarian prune) | WJA | yep |
Finish implementing growth trees | WJA | yep |
Revisit orienting to make sure it is working as intended | WJA | yep |
Fixed a bug where we were marking scaffold_this_layer as true for any fragment | WJA | yep |
Update random sampling function to use last layer changes in graph function | WJA | yep |
Do that same thing for the exhaustive function | WJA | yep |
I don't think we ever clear the scaf_link_sid vector; we definitely should do that somewhere | WJA | yep |
Update exhaustive to combine all frags into one library, just like graph / random. | WJA | yep |
List of features/ideas for future releases:
- Using different references for different layers of dn growth
- Stereo centers / volume overlap pruning
- Capping group functions (H, CH3, Halogen)
- Incorporate GA at the end of each layer
- Overhaul the simple-build function
- Monte Carlo algorithm that checks bond frequency
- Scaling max root / layer size with layer
- Select torenv before selecting fragment. This will require overhauling fraggraph, but will keep us from assembling mols that will not pass torenv.
- Add fragname string to restart and dump files, already done for final and fraglib files.
- Add ZINC name to torenv table
- Unusual behavior during library generation when frequency cutoff == 0
- Print out how many molecules cannot be capped. (Difference between ensemble size and dump.)
- building from anchor 0 -> building from scf.98
- Possible torenv check for dump molecules after capping before printing.
- Keep tables of which fragments (and torsion types) are already included in a growing molecule (i.e., the name string has this info) and only accept a new fragment (or torsion type) within certain ranges and probabilities. In other words, use knowledge of chemical makeup probabilities to keep from over-including or under-including certain fragment and bond types (essentially, use data mining to help us build only molecules within certain boundaries).
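The last idea in the list above (tracking which fragments a growing molecule already contains) could look roughly like the following Python sketch. Everything specific here is hypothetical: the underscore-separated name-string format, the fragment names, and the per-fragment maximum counts are assumptions for illustration, and the probabilistic part of the idea is left out.

```python
from collections import Counter


def fragment_counts(name_string, sep="_"):
    """Count how often each fragment name occurs in a molecule's name string.

    Assumption: the name string is a `sep`-joined list of fragment names.
    """
    return Counter(part for part in name_string.split(sep) if part)


def accept_fragment(name_string, new_fragment, max_counts, sep="_"):
    """Return True if adding `new_fragment` keeps its count within bounds.

    max_counts maps a fragment name to the largest number of copies
    allowed in one molecule; fragments without an entry are unrestricted.
    """
    limit = max_counts.get(new_fragment)
    if limit is None:
        return True
    return fragment_counts(name_string, sep)[new_fragment] + 1 <= limit


if __name__ == "__main__":
    limits = {"benzene": 2}                    # hypothetical bound
    growing = "benzene_amide_benzene"          # hypothetical name string
    print(accept_fragment(growing, "benzene", limits))  # third benzene
    print(accept_fragment(growing, "amide", limits))    # no limit set
```

The same table lookup could be extended to torsion types, or the hard limit replaced by an acceptance probability derived from observed fragment frequencies.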
List of SB2012 systems that we will use for tests:
For now, let's use 5-15 rotatable bonds inclusive; total = 709 systems ("drug-like" size molecules). The de novo paper used only 663 systems, after removing the 46 systems where the cognate ligand did not fall within a +/-2 formal charge. (5through15 = 709, 5through15_ch2 = 663)
{5RB = 107; 6RB = 96; 7RB = 103; 8RB = 75; 9RB = 66; 10RB = 75; 11RB = 57; 12RB = 41; 13RB = 38; 14RB = 26; 15RB = 25}
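As a quick sanity check on the numbers above, the per-rotatable-bond counts can be summed and compared with the stated totals. This short Python snippet only restates the figures already given on this page.

```python
# Number of SB2012 systems per rotatable-bond (RB) count, as listed above.
rb_counts = {
    5: 107, 6: 96, 7: 103, 8: 75, 9: 66, 10: 75,
    11: 57, 12: 41, 13: 38, 14: 26, 15: 25,
}

total = sum(rb_counts.values())   # systems with 5-15 rotatable bonds
removed = 46                      # cognate ligand outside +/-2 formal charge
remaining = total - removed       # systems used in the de novo paper

print(total, remaining)  # 709 663
```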