Chemistry Cartridge: SIMILAR/SCORE

Daylight 4.61, dayblob 46107, Oracle Server 8.1.3

The Chemistry Cartridge adds the SIMILAR operator to SQL. When applied to a column containing SMILES, it selects the rows whose structures are similar to a given structure. Similarity is measured with the binary Tanimoto coefficient: a value of 1.0 means "identical" and 0.0 means "completely dissimilar". The required level of similarity is user-adjustable. At high levels (~0.8 or more), this is a fast, robust search type. At low levels (~0.65 or less), the answer sets are large and the results are not meaningful.
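
As a rough illustration of the measure described above, the binary Tanimoto coefficient of two bit-string fingerprints is the number of bits set in both divided by the number of bits set in either. The Python sketch below is not the cartridge's implementation: the names tanimoto and select_similar, the toy fingerprints, and the treatment of two empty fingerprints as identical are all assumptions made for this example. Real Daylight fingerprints are generated from the structures themselves.

```python
from typing import Dict, FrozenSet, List, Tuple

# A fingerprint is modelled here as the set of bit positions that are "on".
# This representation is an assumption of the sketch, not Daylight's format.
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Binary Tanimoto coefficient: bits in common / bits set in either."""
    in_either = len(a | b)
    if in_either == 0:
        # Assumed convention: two empty fingerprints count as identical.
        return 1.0
    return len(a & b) / in_either


def select_similar(
    query: Fingerprint, table: Dict[str, Fingerprint], threshold: float
) -> List[Tuple[str, float]]:
    """Return (key, score) for every row scoring at least `threshold`."""
    hits = []
    for key, fingerprint in table.items():
        score = tanimoto(query, fingerprint)
        if score >= threshold:
            hits.append((key, score))
    return hits


# Hypothetical toy data: the keys and bit positions are made up.
query = frozenset({1, 2, 3, 4, 5})
table = {
    "same": frozenset({1, 2, 3, 4, 5}),      # 5 common / 5 total
    "close": frozenset({1, 2, 3, 4, 5, 6}),  # 5 common / 6 total
    "far": frozenset({1, 2, 9, 10}),         # 2 common / 7 total
    "none": frozenset({20, 21}),             # 0 common / 7 total
}

hits = select_similar(query, table, threshold=0.8)
for key, score in hits:
    print(f"{key:6s} {score:.3f}")
```

With a threshold of 0.8 only the first two toy rows are selected. Lowering the threshold admits rows that share only a few bits with the query, which is why low similarity levels produce large answer sets.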

This example finds structures at least 0.80 similar to a given structure using svrmgr (rather than sqlplus) to show the auxiliary operator SCORE in action. Other examples are also available.


Oracle table: COMPOUND_118616 (nci95 structures)
Show similarity of structures at least 0.80 similar to this structure:


NCCc1ccc(O)c(O)c1

SVRMGR session follows (show .sql file):


Oracle Server Manager Release 3.1.3.0.0 - Beta

(c) Copyright 1997, Oracle Corporation.  All Rights Reserved.

Oracle8 Enterprise Edition Release 8.1.3.0.0 - Beta
With the Partitioning and Objects options
PL/SQL Release 8.1.3.0.0 - Beta

SVRMGR> Connected.
Timing                          ON
COMPOUND_I  SCORE(1)   SMILES                       
---------- ----------- ----------------------------
      5469 0.808510661 COc1ccc(CCN)c(C)c1OC
      8779 1.000000000 NCCc1ccc(O)c(O)c1
     16732 0.894117653 COc1ccc(CCN)cc1O
     29783 0.915662646 CNCCc1ccc(O)c(O)c1
     33028 0.974358976 NCCc1cc(O)c(O)c(O)c1
     33812 0.904761910 C[N+](C)(C)CCc1ccc(O)c(O)c1
     35638 0.845238090 CC(N)Cc1ccc(O)cc1
     50810 0.843373477 C[N+](C)(C)CCc1ccc(O)cc1
     55075 0.844444454 Cc1c(O)c(O)c(C)c(CCN)c1O
     57923 0.835164845 NCC(O)c1ccc(O)c(O)c1
     74983 0.870588243 COc1cccc(CCN)c1
     77534 0.817204297 NCC(=O)c1ccc(O)c(O)c1
     83896 0.817204297 COc1cc(CCN)cc(OC)c1OC
     85586 0.853932559 COc1ccc(CCN)cc1OC
     91200 0.800000012 COc1cc(CC(C)N)ccc1O
     93815 0.843373477 CN(C)CCc1ccc(O)cc1
     96782 0.822222233 NCC(O)c1cccc(O)c1
    102401 0.921052635 NCCc1ccc(O)cc1
    110045 0.915662646 NCCc1cc(O)c(O)cc1O
    111003 0.853658557 CNCCc1ccc(O)cc1
    114207 0.833333313 COc1ccc(CCN)cc1
    114600 0.883720934 COc1cc(CCN)ccc1O
22 rows selected.
Parse             0.00 (Elapsed)     0.00 (CPU)
Execute/Fetch     1.09 (Elapsed)     0.01 (CPU)
Total             1.09               0.01
Timing                          OFF
SVRMGR> 
Server Manager complete.
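
The session above lists its rows in COMPOUND_ID order rather than by similarity. As a small post-processing sketch (not part of the cartridge), the Python below parses a few of the rows copied from that output and ranks them by score. The function name parse_rows and the choice of rows are assumptions made for this example, and it expects data rows only, without the header or timing lines.

```python
from typing import List, Tuple

# A handful of data rows copied from the session output above.
rows_text = """\
      8779 1.000000000 NCCc1ccc(O)c(O)c1
     16732 0.894117653 COc1ccc(CCN)cc1O
     33028 0.974358976 NCCc1cc(O)c(O)c(O)c1
     91200 0.800000012 COc1cc(CC(C)N)ccc1O
    102401 0.921052635 NCCc1ccc(O)cc1
"""


def parse_rows(text: str) -> List[Tuple[int, float, str]]:
    """Split each data row into (compound id, score, SMILES)."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or otherwise malformed lines
        compound_id, score, smiles = parts
        rows.append((int(compound_id), float(score), smiles))
    return rows


# Highest similarity first.
ranked = sorted(parse_rows(rows_text), key=lambda row: row[1], reverse=True)
for compound_id, score, smiles in ranked:
    print(f"{compound_id:>6d} {score:.3f} {smiles}")
```

The query structure itself (compound 8779, score 1.0) comes out on top, followed by the closest analogues.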


Daylight Chemical Information Systems, Inc.
info@daylight.com