GADD
MUG'01 -- 6-9 March 2001 -- Santa Fe, NM
Jeremy Yang
The GADD Algorithm
Create FRAGs and CFRAGs files, possibly by running cpfrags and ringsmi on a suitable "seed" database.
Create a Thor database. Load with initial population of FRAGs and CFRAGs. Call this generation zero. More common fragments have lower ID numbers.
SMILES in the database may be:
FRAGs: fragments (odd id #)
CFRAGs: carbon fragments (even id #)
UMOLs: unfinished molecules (1+ FRAG, 1+ CFRAG, 1+ attachments)
FMOLs: finished molecules (1+ FRAG, 1+ CFRAG, 0 attachments)
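The four entry types above can be told apart mechanically. The following Python sketch is illustrative only: the `Entry` record, its field names, and the example SMILES (which assume attachment points are written as `*` wildcard atoms) are hypothetical and are not the Thor database layout.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    FRAG = "FRAG"
    CFRAG = "CFRAG"
    UMOL = "UMOL"
    FMOL = "FMOL"


@dataclass(frozen=True)
class Entry:
    """Hypothetical record; the real database fields are not described in the source."""
    smiles: str
    n_frags: int        # FRAGs joined into this entry
    n_cfrags: int       # CFRAGs joined into this entry
    n_attachments: int  # open attachment points remaining
    frag_id: int = 0    # only meaningful for a bare FRAG or CFRAG


def classify(entry: Entry) -> Kind:
    # An entry built from at least one FRAG and one CFRAG is a molecule:
    # unfinished while attachment points remain, finished once none do.
    if entry.n_frags >= 1 and entry.n_cfrags >= 1:
        return Kind.UMOL if entry.n_attachments >= 1 else Kind.FMOL
    # Otherwise it is a bare fragment: odd id means FRAG, even id means CFRAG.
    return Kind.FRAG if entry.frag_id % 2 == 1 else Kind.CFRAG


# Illustrative entries (SMILES and ids invented for the example).
amine = Entry("*N", 1, 0, 1, frag_id=1)
phenyl = Entry("*c1ccccc1", 0, 1, 1, frag_id=2)
growing = Entry("*c1ccc(N)cc1", 1, 1, 1)
aniline = Entry("Nc1ccccc1", 1, 1, 0)
```

Under these assumptions, `classify(growing)` returns `Kind.UMOL` and `classify(aniline)` returns `Kind.FMOL`.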
LOOP until a QUIT CONDITION:
For each SMILES in the database, attempt one mating with another SMILES:
FRAGs mate with CFRAGs.
CFRAGs mate with FRAGs.
UMOLs mate with FRAGs or CFRAGs.
FMOLs don't mate.
Select fragments with probability distribution reflecting the composition of the seed database.
Mating is accomplished with one of three SMIRKS, one each for single, double, and triple attachment points.
Zap any leftover disconnected wildcard atoms.
Write to database each new, fit SMILES (normally UMOL at this stage).
Test each UMOL for "loose" fitness.
Convert UMOLs to FMOLs by H-terminating attachment points.
Test each FMOL for final "tight" fitness.
Write to database each new, fit FMOL SMILES.
Optionally, sample fit SMILES.
Note: N[I] = number of FMOLs.
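The mating rules and the frequency-weighted fragment selection in the loop can be sketched as follows. This is a minimal Python illustration, not the GADD code: `PARTNERS` restates the rules listed above, while `pick_fragment` and the `seed_counts` mapping (fragment SMILES to number of occurrences in the seed database) are hypothetical names. The SMIRKS-based joining step itself is not shown.

```python
import random

# Which entry kinds each kind may mate with, as stated in the loop rules.
PARTNERS = {
    "FRAG": ("CFRAG",),
    "CFRAG": ("FRAG",),
    "UMOL": ("FRAG", "CFRAG"),
    "FMOL": (),  # finished molecules do not mate
}


def pick_fragment(seed_counts, rng):
    """Pick one fragment with probability proportional to its seed-database count.

    seed_counts: dict mapping fragment SMILES -> occurrence count (assumed input).
    rng: a random.Random instance, passed in so runs can be reproduced.
    """
    fragments = list(seed_counts)
    weights = [seed_counts[f] for f in fragments]
    return rng.choices(fragments, weights=weights, k=1)[0]


# Illustrative counts only: "*N" is nine times as common as "*O" here.
rng = random.Random(0)
draws = [pick_fragment({"*N": 9, "*O": 1}, rng) for _ in range(2000)]
share_n = draws.count("*N") / len(draws)
```

With these counts, roughly nine draws in ten should be `*N`, so common seed fragments dominate the generated population in the same proportion as they do the seed database.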
QUIT CONDITIONS:
population limit reached (N[I]==Nmax)
generation limit reached (I==Imax)
population converges (N[I]==N[I-1])
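The three quit conditions translate directly into a check run at the end of each generation. In this Python sketch, `should_quit` and its argument names are assumptions; `n_fmols[i]` plays the role of N[I]. The population test uses `>=` rather than `==` as a defensive choice, in case a generation overshoots Nmax.

```python
def should_quit(n_fmols, i, n_max, i_max):
    """Return the reason to stop after generation i, or None to keep looping.

    n_fmols: list where n_fmols[k] is the FMOL count after generation k (N[k]).
    """
    if n_fmols[i] >= n_max:
        return "population limit reached"
    if i >= i_max:
        return "generation limit reached"
    # Convergence needs a previous generation to compare against.
    if i > 0 and n_fmols[i] == n_fmols[i - 1]:
        return "population converged"
    return None


# Illustrative FMOL counts per generation (invented numbers).
history = [0, 40, 95, 95]
```

Here `should_quit(history, 3, n_max=1000, i_max=50)` reports convergence, because generation 3 added no new FMOLs.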
ENDGAME:
Delete all UMOLs.
All final SMILES in the database should be FRAGs, CFRAGs, or FMOLs.
FITNESS CONDITIONS:
Mol Weight [default 200-500]
HB donors [default 0-5]
HB acceptors [default 0-10]
Rotatable bonds [default 0-8]
Heavy Atoms [default 15-50]
ClogP [default -10 to +5]
Charge [default -2 to +2]
Reject matches with -badsmarts file.
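The default property windows above amount to a set of range checks. The Python sketch below assumes the property values have already been computed by some cheminformatics toolkit and that both bounds are inclusive; the key names, `failed_conditions`, and `is_fit` are hypothetical. The -badsmarts rejection and the distinction between "loose" and "tight" limits are not modelled, although a different `limits` mapping could be passed for either stage.

```python
# Default windows from the fitness conditions listed above: (low, high).
DEFAULT_LIMITS = {
    "mol_weight": (200.0, 500.0),
    "hb_donors": (0, 5),
    "hb_acceptors": (0, 10),
    "rotatable_bonds": (0, 8),
    "heavy_atoms": (15, 50),
    "clogp": (-10.0, 5.0),
    "charge": (-2, 2),
}


def failed_conditions(props, limits=DEFAULT_LIMITS):
    """Return the names of all properties that fall outside their window."""
    return [
        name
        for name, (low, high) in limits.items()
        if not low <= props[name] <= high
    ]


def is_fit(props, limits=DEFAULT_LIMITS):
    """A candidate is fit only if every property is inside its window."""
    return not failed_conditions(props, limits)


# Invented property values, for illustration only.
candidate = {
    "mol_weight": 312.4, "hb_donors": 2, "hb_acceptors": 5,
    "rotatable_bonds": 4, "heavy_atoms": 22, "clogp": 2.1, "charge": 0,
}
too_small = dict(candidate, mol_weight=150.0, heavy_atoms=10)
```

Returning the list of failed conditions, rather than only a boolean, makes it easy to see which window rejects most candidates when tuning the limits.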
Daylight Chemical Information Systems Inc.
info@daylight.com