Thor/Merlin 4.5: Fingerprint-tuples


Fingerprint-tuples

  • The fingerprint program will produce component-tuples of fingerprints (FPP datatype) when given the new -m option.

  • A database may contain both FP and FPP data; each is used as available and needed. For databases of large dot-disconnected mixtures, FPP data can significantly increase screening speed, at the cost of a larger database/pool.
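    One plausible reading of why component-tuples help with mixtures can be sketched as follows. This is a minimal illustration, not Daylight's implementation: fingerprints are modeled as plain Python sets of "on" bit positions, the screen is assumed to be the usual subset test, and every function name here is hypothetical. A connected query can only match inside a single component, so screening each component separately rejects mixtures whose combined bits merely happen to cover the query.

```python
# Hypothetical sketch: fingerprints are sets of "on" bit positions.
# None of these names come from the Daylight toolkit.

def passes_screen(query_fp, candidate_fp):
    """Subset screen: every bit set in the query must be set in the candidate."""
    return query_fp <= candidate_fp


def screen_whole_mixture(query_fp, component_fps):
    """Screen against a single fingerprint formed by OR-ing all components."""
    mixture_fp = set().union(*component_fps)
    return passes_screen(query_fp, mixture_fp)


def screen_component_tuple(query_fp, component_fps):
    """Screen each component on its own; pass if any one component passes."""
    return any(passes_screen(query_fp, fp) for fp in component_fps)


# A query whose bits are spread across three different components.
query = {1, 4, 7}
mixture = [{1, 2, 3}, {4, 5}, {6, 7}]

# The OR-ed fingerprint covers the query, so the mixture would go on to
# the expensive atom-level comparison even though no component can match.
whole_result = screen_whole_mixture(query, mixture)

# Per-component screening rejects the mixture immediately.
tuple_result = screen_component_tuple(query, mixture)

if __name__ == "__main__":
    print("whole-mixture screen passes:", whole_result)
    print("component-tuple screen passes:", tuple_result)
```

    In this toy case the whole-mixture screen lets the entry through while the per-component screen rejects it. The trade-off is the one noted above: one fingerprint per component takes more space than one per entry.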

    Example timings using a "real-life" database

    The following timings are for atom-level searches over a subset of the Chiron database of combinatorial libraries, in which each library is represented as a large (up to 20K) dot-disconnected SMILES. The same searches were run two ways: locally by merlinserver (NPROCS=0, without FPP optimization), and using merlinsmartstalk via the parallel processing mechanism (NPROCS=1, i.e., only one CPU, with FPP optimization enabled). During the non-FPP searches the merlin process grew very large, swapping out other processes (but not paging against itself). Timings were recorded by merlinserver itself (via +MERLIN_DO_ACCOUNTING).

    Elapsed time for merlin searches using different methods on a single-CPU 50 MHz Sun SPARCstation-20 over database JEFF (average of 2 runs).
    NPROCS  Search target  Search task                   Search process  Elapsed time (s)  Speed factor
    ------  -------------  ----------------------------  --------------  ----------------  ------------
    0       Fc1ccccc1O     superstructure search         merlinserver    1066.36           5528 X
            Fc1ccccc1O     similarity search             merlinserver    0.519
                           similarity distribution sort  merlinserver    0.002
                           Total elapsed time                            1066.881
    1       Fc1ccccc1O     superstructure search         program object  0.190
            Fc1ccccc1O     similarity search             merlinserver    0.001
                           similarity distribution sort  merlinserver    0.002
                           Total elapsed time                            0.193
    0       n1nscc1        superstructure search         merlinserver    137.725           650 X
            n1nscc1        similarity search             merlinserver    0.002
                           similarity distribution sort  merlinserver    0.002
                           Total elapsed time                            137.729
    1       n1nscc1        superstructure search         program object  0.210
            n1nscc1        similarity search             merlinserver    0.001
                           similarity distribution sort  merlinserver    0.001
                           Total elapsed time                            0.212
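    The speed factors appear to be the ratio of total elapsed time with NPROCS=0 to total elapsed time with NPROCS=1 for the same target. The short check below reproduces them from the per-step timings. The variable and function names are illustrative only, and the NPROCS=1 superstructure step for n1nscc1 is taken as 0.210 s, the value implied by that run's 0.212 s total.

```python
# Per-step elapsed times in seconds, copied from the timing table.
# Step order: superstructure search, similarity search, distribution sort.
timings = {
    "Fc1ccccc1O": {
        "nprocs0": [1066.36, 0.519, 0.002],
        "nprocs1": [0.190, 0.001, 0.002],
    },
    "n1nscc1": {
        "nprocs0": [137.725, 0.002, 0.002],
        # 0.210 is assumed from the reported total of 0.212 s.
        "nprocs1": [0.210, 0.001, 0.001],
    },
}


def speed_factor(slow_steps, fast_steps):
    """Ratio of total elapsed time without FPP to total elapsed time with FPP."""
    return sum(slow_steps) / sum(fast_steps)


factors = {
    target: speed_factor(runs["nprocs0"], runs["nprocs1"])
    for target, runs in timings.items()
}

if __name__ == "__main__":
    for target, factor in factors.items():
        print(f"{target}: about {round(factor)} X")
```

    Rounded to the nearest integer, the ratios agree with the 5528 X and 650 X figures reported in the table.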

    Daylight Chemical Information Systems, Inc.
    info@daylight.com