Jack Delany
TABLE OF CONTENTS:
Quick Start:
This section gives a quick overview of the installation procedures for the
cartridge. Please consult the detailed installation section for more
information on each of these items.
Installation:
This section gives detailed installation instructions.
Alternately, rather than granting DBA privilege to c$dcischem, you can grant
the minimal required privileges:
It is recommended by Oracle that the system tablespace not be used for
user code. Decide which user tablespace to use to store the
cartridge definitions and create or enlarge that tablespace. The cartridge
code only requires about 2MB of space, so size isn't a huge issue. For a
developmental or test system you can skip these two and just use the system
tablespace for the cartridge definitions.
and substitute the absolute path to your cartridge installation directory.
or
E-mail the hostid value to Daylight, and we'll e-mail back the required
license key. NOTE: The license key is not required in order to run
the installation scripts. The license key can be added after installation is
complete. The cartridge will not successfully run without the license key.
The output from svrmgrl will consist of quite a few notes and messages.
One should not see any error messages from svrmgrl. If one gets a message
about the COMPATIBLE parameter being set incorrectly, see the troubleshooting
section below before proceeding further. A sample of the output follows:
or
Note: The Oracle Installer will start and configure the listener if you do a
full default Oracle installation, however the listener will not be configured
to restart automatically after a reboot. It is best to verify the
configuration and state of the listener at this point, however don't be
suprised if the listener configuration is already correct.
There are two ways to modify the configuration for the listener. One can
either edit the configuration files directly or use the program 'netasst',
which provides a GUI front end to the configuration.
In the netasst window, one must configure the listener with a new address.
Under 'listeners', select the current listener and select 'ADD ADDRESS'.
The Protocol should be 'IPC', the Key should be 'EXTPROC0'. The Protocol
Stack Support should be 'Net8 Clients'.
Also in the netasst window, one must add a new 'Net Service Name'. Select the
'+' button to add a service name. You'll be prompted for items in the
following order:
Then, save the configuration and exit. If the listener isn't running on the
machine, it can be started with:
Alternately, one can edit the listener configuration files directly and add
the required entries. A typical, correct network configuration will include
the following entries in the files:
$ORACLE_HOME/network/admin/listener.ora
and
$ORACLE_HOME/network/admin/tnsnames.ora
Verify that these sections are present in the two configuration files, then
start (or restart) the listener.
If the above three queries don't give the expected results, see the next
section for troubleshooting.
Installation Troubleshooting:
There are a number of details which must be correct for the cartridge
to work properly. For a new Oracle installation, these are typically already
set correctly, however the list of issues is included here for diagnostic
purposes:
the cause is the specification of the c$dcischemlib library. Recheck the
lines in the file create.sql and verify that absolute path to the file
'ddlib.so' matches the specification in the statement:
It will be necessary to completely remove and reinstall the cartridge in order
to fix this problem. See the section on 'Cartridge De-installation'.
In this case, you must contact Daylight and get a valid license key for the
cartridge. Provide the hostid to Daylight (see installation above) and we'll
provide to you the insert statement which must be run. If you have inserted
the license key, then perhaps the c$dcischem.license table is not visible to
the current user. Try:
If this table isn't visible, then perhaps the role 'daycart' hasn't been
granted to the current user. Remember that each user must have the 'daycart'
role granted to it in order to use the cartridge. The 'daycart' role grants
select on c$dcischem.license and c$dcischem.progob, execute on the cartridge
functions, and create table privilege.
when attempting to access any Daylight Cartridge functions. In this case,
verify that the listener is running:
Note that the listener is running, and that there is a service listed for
PLSExtProc. Also, verify that the files:
$ORACLE_HOME/network/admin/listener.ora and
$ORACLE_HOME/network/admin/tnsnames.ora have the appropriate entries. If
necessary, restart the listener to make sure that any configuration file
changes have taken effect.
libclntsh.so.8.0
This file is found in $ORACLE_HOME/lib/. It is preferred that the UNIX
Oracle userid have the LD_LIBRARY_PATH set in the .profile or .cshrc to
include $ORACLE_HOME. If this is not the case, define LD_LIBRARY_PATH and
then restart the Oracle instance and listener.
Verification
The subdirectory 'dcischem/TEST' includes a number of stand-alone SQL
scripts which test the cartridge installation. The shell script 'run_tests'
will execute all the SQL scripts in turn and compare the output to reference
output files. Any discrepencies will be reported. The script 'run_tests'
does attempt to log in as 'c$dcischem/secret', so if you've changed the
password and privileges on user 'c$dcischem' you'll need to edit run_tests to
reflect this change.
The scripts do use the 'EXPLAIN PLAN' facility, which requires that the plan
table be present for the test user. The script
$ORACLE_HOME/rdbms/admin/utlxplan.sql creates the plan table; this should be
executed before running the 'run_tests' script.
Also, note that the SQL scripts give useful examples of using the cartridge
functionality in each category.
De-Installation
In order to cleanly de-install the cartridge, all dependent indexes must be
dropped from the system first. That is, any indexes created using one of the
Daylight indextypes (ddexact, ddrole, ddgraph, ddblob) should be dropped. If
one does not drop all the indexes first, then they will be marked as
invalid. If this occurs, the indexes and their associated index tables
(<indexname> || '_DDT') must be dropped manually.
After any dependent indexes have been dropped, one can run the script
'clean.sql'. This script will remove all cartridge index definitions,
functions, operators, packages, roles, and synonyms. The script 'clean.sql'
will remove everything which is installed by the script 'create.sql' with the
exception of the license and program object tables. These tables are
preserved since it is convenient to not need to recreate and populate these
tables. The user c$dcischem will require DBA privilege to be granted in order
to run the script. Example output follows:
It is safe to repeatedly run the scripts create.sql and clean.sql. The
create.sql script creates a bunch of objects, types, etc. The clean.sql
script removes them. Neither script will modify any other objects, types, or
tables owned by c$dcischem or any other user on the system. The only
components created by the create.sql script which are not removed by the
clean.sql script are the license and progob tables. These two tables are
preserved by the clean.sql script for convenience.
When one runs create.sql after clean.sql, one will see warnings that the
tables license and progob already exist, however these warnings don't impact
the outcome of the cartridge re-installation.
After running clean.sql, the database will be left in its original
pre-cartridge state. One can then delete the license and progob tables, if
desired, to eliminate all remnants of the cartridge.
Note that the user c$dcischem will continue to exist after running clean.sql
and any non-cartridge tables, objects, etc. owned by c$dcischem will be
unaffected by either clean.sql or create.sql.
The other way to remove the cartridge is to drop the user c$dcischem. Note
that this does not remove public synonyms or roles, so these should be dropped
manually (or via clean.sql).
This will remove the user c$dcischem and all dependent objects. It will be
necessary to repeat the entire installation procedure if this method is
chosen.
PL/SQL Functions, SQL Operators:
There are quite a few user-accessible PL/SQL functions and SQL operators
implemented for the cartridge. All are stateless. All of the PL/SQL
functions are contained in a PL/SQL package called 'ddpackage'. This
package is owned by 'c$dcischem'. Privileges required to access these
functions are granted by the 'daycart' role.
The default installation (create.sql) also creates public synonyms for the
package 'ddpackage' and all of the operators described herein. Hence, from
any Oracle user, one can access the functions and operators by the signature
documented below (eg: select tanimoto(...) from table rather than
select c$dcischem.tanimoto(...) from table).
The one place where synonyms aren't allowed is in indextype specifications;
hence you must refer to indextypes by their full names when creating indexes
as a user other than c$dcischem (eg. create index <name> on small(smi)
indextype is c$dcischem.ddblob).
General Purpose Functions:
The general purpose functions are typically used for debugging only.
Operators and public synonyms are not created for these functions during the
installation. Hence, the functions fsetdebug(), ftestlicense(), fhostid(),
and fversion() must be referenced by their fully-qualified names
(eg. c$dcischem.ddpackage.ftestlicense()).
This value is set on a per-session basis. Other values are reserved for
future use.
The license is contained in a special table which must be created and
populated at cartridge install time.
Comparison Functions:
Program Object Functions:
There are two lower-level functions which invoke program objects. It is
expected that they will almost always be called from a PL/SQL wrapper layer,
which would be responsible for packaging the communication between Oracle and
the program object in a meaningful way. Hence, the default cartridge
installation does not create operators for these two functions.
Valid delimiters for the lines in the 'message' parameter include the normal
line-termination characters for UNIX, Mac, and PC: '\n', '\r', and '\r\n'.
The function properly handles all three termination cases. The returned
VARCHAR2 string is delimited by UNIX line-termination '\n'.
Valid delimiters for the lines in the 'message' parameter include the normal
line-termination characters for UNIX, Mac, and PC: '\n', '\r', and '\r\n'.
The function properly handles all three termination cases. The returned
VARCHAR2 string is delimited by UNIX line-termination '\n'.
The function returns success or failure.
Extensible Indexes:
The default installation (create.sql) creates public synonyms for the
package 'ddpackage' and all of the operators. The one place where synonyms
aren't allowed is in indextype specifications; indextypes must be refered to
by their full names when creating indexes as a user other than c$dcischem
(eg. create index <namegt; on small(smi) indextype is c$dcischem.ddblob).
All four of the extensible indexes create Oracle tables to maintain their
required internal data. The ddexact, ddgraph, and ddrole indextypes use
regular Oracle tables to store the data; the ddblob indextype uses an Oracle
table with a single row containing BLOB data. These internal tables are
maintained automatically by the index code; there is normally no need for a
user or application developer to know about these auxillary tables.
There are two situations where the developer or DBA must pay attention to
these auxillary tables. First, on index creation, the auxillary data tables
are created in the users default tablespace. Second, if an index gets
corrupted, the developer or DBA may need to delete the auxillary data table
manually.
Exact lookup indextype (ddexact):
This indextype supports the exact() operator. This is the least interesting
index, as it implements a simple string comparison between the indexed column
and the query. In that sense, the built in BTree index within Oracle has
the similar functionality. The BTree index even supports the "greater
than" and "less than" operators, which this index doesn't.
The main advantage of the ddexact indextype is that it can index strings up
to 4000 bytes (the maximum varchar2 length), while the internal BTree can
only index strings up to approximately 40% of a block; for a database with
a 2048 byte blocksize, this means that the largest indexable varchar2 string
using the BTree is approximately 700 bytes. Similarly, in a future version
of the cartridge this index will support the comparison of LOB datatypes;
the BTree index does not.
This indextype uses an auxillary data table within Oracle to maintain its
index data. This table is kept in sync with the base table in real time and
strictly obeys the Oracle transaction model for all DML operations. The
name of the auxillary data table is: <index name> || '_DDT'.
The index can be created on a VARCHAR2 column with the following general
statement:
The valid form for the operator which uses the index is:
This index makes no assumptions about the content of the indexed column. The
applications are responsible for normalizing both the data in the indexed
column and and queries to the exact() operator. This allows the ddexact
indextype to properly handle both Unique and Absolute SMILES without
complication. In the following example, the USMILES column is normalized
using smi2cansmi(smiles, 0) and the ASMILES column is normalized using
smi2cansmi(smiles, 1). Note also that when performing the exact() query
on a column, the query is normalized using the same method as the column
data.
NOTE: Because this indextype accesses the base table during a index scan,
operations which modify the base table based on the results of the index scan
aren't allowed. This is similar to the case where a trigger on a column
attempts to modify the column data. The message which one receives is the
following:
In order to perform this function, one must use two steps. One option would
be:
This indextype supports the graph() and tautomer() operators. It creates a
tabular index of the graph and tautomeric information (hydrogen count and
net charge) for rapid retrieval. The indexed search only requires an index
lookup based on the graph of the query, followed by comparison of the net
charge and hydrogen count for a tautomer search.
This indextype uses a special data table within Oracle to maintain its
index data. This table is kept in sync with the base table in real time and
strictly obeys the Oracle transaction model for DML operations. The
name of the auxillary data table is: <index name> || '_DDT'.
The index can be created on a VARCHAR2 column with the following general
statement:
The valid forms for the operators to use the index are:
Unlike the ddexact index, the ddgraph indextype always interprets the column
as SMILES. Invalid SMILES are rejected. There is no concern with the ddgraph
indextype as to whether the SMILES are Unique or Absolute, as stereochemical
and isomeric information are discarded for the graph() and tautomer() lookups.
NOTE: As with the ddexact indextype, this indextype accesses the base table
during a index scan. See the previous section on the limitations to
performing modifications of the base table based on the results of an index
scan.
Reaction role index (ddrole):
This indextype supports component level searches for exact molecules within
reactions by role. It creates a tabular index of each component in every
reaction, by role. Component level searches are implemented simply by a
index lookup of the component, followed by validation of the role.
This indextype uses a special data table within Oracle to maintain its
index data. This table is kept in sync with the base table in real time and
strictly obeys the Oracle transaction model for all DML operations. The
name of the auxillary data table is: <index name> || '_DDT'.
The index can be created on a VARCHAR2 column with the following general
statement:
The valid forms for the operators to use the index are:
If a row value contains a molecule SMILES, that row is ignored for the ddrole
index. Like the ddexact index, the ddrole index makes no assumptions about
the canonicalization of the SMILES in the index column. Hence, the column
must be normalized by the application to the appropriate SMILES and the query
on the column must match that normalization (see the explanation re. ddexact
above).
NOTE: As with the ddexact indextype, this indextype accesses the base table
during a index scan. See the previous section on the limitations to
performing modifications of the base table based on the results of an index
scan.
SMARTS and Similarity search index (ddblob):
This index caches the SMILES and fingerprints in a single binary large
object for rapid searching. Substructure and similarity searches are
implemented through this blob-based image of the structural data. The blob
is persistant; it is automatically stored in the Oracle database and
retrieved for processing. The blob is kept synchronized with the base
table in real time and strictly obeys the Oracle transaction model for all
DML operations. The name of the auxillary data table which contains the
BLOBs is: <index name> || '_DDT'.
The index can be created on either a SMILES or fingerprint column, provided
that the fingerprints are all the same size. If the index is created on a
SMILES column, all six searches are implemented on that column:
contains(), matches(), tanimoto(), tversky(), euclid(), and fingertest(). If
the index is created on a fingerprint column, the four
fingerprint-specific searches (tanimoto(), tversky(), euclid(), and
fingertest()) are supported for that column.
The index creation command takes an optional parameter 'fpsize'. This
value is the number of bits that the blob index will use for fingerprint
creation.
When indexing a SMILES column, the blob will use the 'fpsize'
parameter for the creation size for fingerprints within the internal blob
index or a default value of 512 bits if the 'fpsize' is not specified.
When indexing a fingerprint column, all fingerprints within that
column must match the 'fpsize' parameter. If the 'fpsize' parameter is
not specified, the size of the first fingerprint encountered by the index
code is used. Any fingerprints which do not match this size are rejected
by the index.
The query for tanimoto(), tversky(), euclid(), and fingertest() can be
either a fingerprint or a SMILES. If a SMILES, it will be used to create
a normal fingerprint of the appropriate size for the index which will be
used for comparison. If the query is a fingerprint, the size of the
fingerprint must match the size of the fingerprint in the column (or in the
blob for a SMILES index). The indexes and operators never perform automatic
folding of fingerprints during searches.
The index can be created on a VARCHAR2 column with the following general
statements:
The valid forms for the operators to use the index are:
For a simple table, one need not create a fingerprint column or use the
fingerprint indexes. One only needs to create an index on a SMILES column to
enable all structural searching functions.
If one wants structural searches on multiple SMILES columns within a single
table, then an index must be created on each SMILES column in the table.
Furthermore, we can create and index fingerprint columns also. The index
automatically will adjust fingerprint operations based on the size of the data
in the fingerprint column. So, for example, if a column of fingerprints is
added to our test table and populated:
There are some additional sublties explained using the following example
queries:
Using Program Objects:
This section describes the configuration and use of program objects from
within Oracle.
Program objects are stand-alone executable programs which communicate via
standard input and standard output using the "Pipetalk" protocol (See the
Daylight Theory Manual for more details). The cartridge supports the ability
to execute external program objects on the same machine as the Oracle server,
and to communicate with the program objects in a robust, general way.
The table c$dcischem.progob is the dictionary of program objects available to
the Oracle server. In order to use a program object, it must be defined in
this dictionary table.
For example, consider the clogp program object. The code for the clogp
example is included in the CONTRIB subdirectory. It can be defined to the
system with:
This defines a program object with the symbolic name 'clogp'. It corresponds
to the absolute path: '/oracle/o815/progob/clogptalk.sh', which must be an
executable program object. In this case, the arguments to the program object
(whitespace delimited) are NULL.
The file '/oracle/o815/progob/clogptalk.sh' looks like the following:
When Oracle executes the program object, the environment is empty. Hence, all
required environment variables must be defined in a shell wrapper before
calling the actual program object.
Since program objects are initially executed under the Oracle userid, a
strict security policy must be obeyed to prevent access to unauthorized Oracle
privileges. The policy, checked and enforced by the cartridge code
before executing any external program object, is the following:
Running program objects are persistant on the server, and are executed on a
per-user-session basis. The first time a program object is accessed within a
user session it is started, and the program object will continue to run until
the user session is terminated. If the program object crashes during the user
session, the cartridge will restart it and retry the transaction. Because of
the ability to restart program objects, program objects ideally should be
stateless. All data required for a transaction should be passed to the
program object within that transaction.
Program object communication through the ddpackage.fprogob() function is via
VARCHAR2 or CLOB datatypes. The string types can contain one or more
delimited lines of data. Because of this line-oriented data communication,
most program objects will require a PL/SQL wrapper function to convert from
Oracle datatypes to line-oriented data and to parse returned results.
Continuing the clogp example, consider a simple function which will calculate
clogp:
Executing the fprogob() function directly using the clogptalk program object
results in a full line of data being returned from the program object for each
SMILES:
The wrapper function fclogp() simply parses out the second field, converts it
to a number, and returns it. The wrapper function hides the details from the
user and allows the function to return the desired numeric value.
More complex examples (eg. MOLFILE -> SMILES conversion) are included in the
CONTRIB directory.
Optimizer Support.
The cartridge will provide estimates of the costs of extensible index queries
to the Oracle optimizer. The Oracle optimizer uses these cost estimates for
its decision making with respect to the execution plan for a given SQL query.
The execution plan chosen by Oracle can have a dramatic effect on the time and
resources required to perform a particular query. The cartridge has the
ability to provide selectivity and resource estimates of varying precision to
Oracle, depending upon the configuration of the database index.
The default cartridge installation includes the code for performing cost
estimates, but does not enable the cartridge optimizer code. All costs for
index queries default to zero. Any Oracle query will preferrentially execute
the extensible index query first.
In order to activate the optimizer code for the Daylight cartridge, one must
execute the following commands:
Once the optimizer code is enabled, one can view the cost estimates provided
to Oracle and the execution plan chosen using the "explain plan"
command in SQL.
The command "analyze index <indexname> compute statistics"
will collect and store information about the size of the index table for the
ddgraph, ddrole, and ddexact indextypes. This size information will then be
used to refine the cost estimates provided to the Oracle optimizer. The ddblob
indextype does not require a separate "analyze index" command; the
blob index always knows its exact size.
Note that the cost and selectivity estimates are not going to be reasonable in
all cases; developers may find that forcing a particular execution order will
result in better query performance. Oracle SQL supports a number of hints
which allow the developer to control the execution order. See the section on
"Optimizer Hints" in the Oracle SQL Reference manual for more
details.
It can also be useful to specify the functional form of a query in order to
eliminate consideration of the index path for the execution plan. Consider
the following query:
By default, (assuming that both required indexes exist) this query will use
indexes for both the graph and contains search, and then will merge the
results. In most cases, it would be much more efficient to perform the graph
index search, then perform the functional version of contains() on the handful
of hits returned from the graph search. The following query will force the
contains() operation to occur functionally:
In order to disable the cartridge optimizer code, any computed statistics must
be dropped with "analyze index <indexname> delete statistics".
Then, The cartridge optimizer code can be disabled with the following commands:
Once the cartridge optimizer code is disabled, cost estimates revert to their
default values of zero cost.
SQL> create user c$dcischem identified by secret;
SQL> grant dba to c$dcischem;
SQL> grant create operator to c$dcischem;
SQL> grant create indextype to c$dcischem;
SQL> grant create table to c$dcischem;
SQL> grant create library to c$dcischem;
SQL> grant create procedure to c$dcischem;
SQL> grant create role to c$dcischem;
SQL> grant create type to c$dcischem;
SQL> grant create public synonym to c$dcischem;
SQL> create tablespace user_ts <...> # see the SQL reference manual and
# DBA guide for full syntax.
SQL> alter user c$dcischem default tablespace user_ts;
create or replace library c$dcischemlib
as '/oracle/o815/dcischem/ddlib.so';
$ /usr/bin/hostid
80ad7574
$ sqlplus c$dcischem/secret
SQL> select ddpackage.fhostid() from dual;
DDPACKAGE.FHOSTID()
---------------------------------------------------------------
80ad7574
$ svrmgrl < create.sql
$ svrmgrl < create.sql
Oracle Server Manager Release 3.1.5.0.0 - Production
(c) Copyright 1997, Oracle Corporation. All Rights Reserved.
Oracle8i Enterprise Edition Release 8.1.5.0.0 - Production
With the Partitioning and Java options
PL/SQL Release 8.1.5.0.0 - Production
SVRMGR> Connected.
SVRMGR> Echo OFF
SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR>
SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR>
SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR>
SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR>
SVRMGR> Connected.
SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR> Statement processed.
<...>
Statement processed.
No errors for PACKAGE DDPACKAGE
Statement processed.
No errors for PACKAGE BODY DDPACKAGE
Statement processed.
<...>
No errors for TYPE DDEXACT_IM
Statement processed.
No errors for TYPE BODY DDEXACT_IM
<...>
No errors for TYPE DDGRAPH_IM
Statement processed.
No errors for TYPE BODY DDGRAPH_IM
<...>
No errors for TYPE DDROLE_IM
Statement processed.
No errors for TYPE BODY DDROLE_IM
<...>
No errors for TYPE DDBLOB_IM
Statement processed.
No errors for TYPE BODY DDBLOB_IM
<...>
SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR> SVRMGR>
Server Manager complete.
$
SQL> revoke dba from c$dcischem;
SQL> revoke create public synonym to c$dcischem;
SQL> revoke create type to c$dcischem;
SQL> revoke create role to c$dcischem;
SQL> revoke create operator from c$dcischem;
SQL> revoke create indextype from c$dcischem;
SQL> revoke create table from c$dcischem;
SQL> revoke create library from c$dcischem;
SQL> revoke create procedure from c$dcischem;
$ $ORACLE_HOME/bin/lsnrctl
LSNRCTL for Solaris: Version 8.1.5.0.0 - Production on 14-MAR-00 12:00:03
(c) Copyright 1998 Oracle Corporation. All rights reserved.
Welcome to LSNRCTL, type "help" for information.
LSNRCTL> start
...
LSNRCTL> quit
$
LISTENER =
(DESCRIPTION_LIST =
(DESCRIPTION =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
(PROTOCOL_STACK =
(PRESENTATION = TTC)
(SESSION = NS)
)
)
...
)
SID_LIST_LISTENER =
(SID_LIST =
(SID_DESC =
(SID_NAME = PLSExtProc)
(ORACLE_HOME = /oracle/o815)
(PROGRAM = extproc)
)
...
)
EXTPROC_CONNECTION_DATA =
(DESCRIPTION =
(ADDRESS_LIST =
(ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC0))
)
(CONNECT_DATA =
(SERVICE_NAME = PLSExtProc)
)
)
...
SQL> connect sys/<pw>
SQL> create user mug identified by coffee;
SQL> grant daycart to mug;
SQL> connect mug/coffee
SQL> select ddpackage.fhostid() from dual;
DDPACKAGE.FHOSTID()
---------------------------------------------------------------
80ad7574
SQL> select ddpackage.ftestlicense('daycart') from dual;
DDPACKAGE.FTESTLICENSE('DAYCART')
---------------------------------
1
SQL> select smi2cansmi('NCC', 0) from dual;
SMI2CANSMI('NCC',0)
---------------------------------------------------------------
CCN
SQL> create or replace operator smi2cansmi binding (varchar2, number)
*
ORA-00406: COMPATIBLE parameter needs to be 8.1.0.0.0 or greater
SQL> select ddpackage.fhostid() from dual;
*
ERROR at line 1:
ORA-06520: PL/SQL: Error loading external library
ORA-06522: ld.so.1: extprocPLSExtProc: fatal: /oracle/o85/dcischem/ddlib.so:
open failed: No such file or directory
create or replace library c$dcischemlib
as '/oracle/o815/dcischem/ddlib.so';
SQL> select ddpackage.ftestlicense('daycart') from dual;
ERROR at line 1:
ORA-29400: data cartridge error
C$DCISCHEM-003: FAILED: License not available
SQL> select * from c$dcischem.license;
PRODUCT KEY EXPIRATION
--------- -------------------------------- ----------
daycart 8befd3d06296e9b89abfb158b431d278 01-JAN-01
SQL> select smi2cansmi('NCC', 0) from dual
*
ERROR at line 1:
ORA-28575: unable to open RPC connection to external procedure agent
$ lsnrctl
LSNRCTL> status
Connecting to (DESCRIPTION ...
STATUS of the LISTENER
------------------------
Alias LISTENER
Version TNSLSNR for Solaris: Version 8.1.5.0.0 - Production
Start Date 14-MAR-00 12:00:37
Uptime 0 days 0 hr. 17 min. 15 sec
Trace Level off
Security OFF
SNMP OFF
Listener Parameter File /oracle/o815/network/admin/listener.ora
Listener Log File /oracle/o815/network/log/listener.log
Services Summary...
PLSExtProc has 1 service handler(s)
dev has 3 service handler(s)
The command completed successfully
$ run_tests
Testing dd_basic_test ...
Comparing output to reference file ...
dd_basic_test test OK.
Testing dd_exact_test ...
Comparing output to reference file ...
dd_exact_test test OK.
< ... >
All tests passed.
$ svrmgrl < clean.sql
Oracle Server Manager Release 3.1.6.0.0 - Production
Copyright (c) 1997, 1999, Oracle Corporation. All Rights Reserved.
Oracle8i Enterprise Edition Release 8.1.6.0.0 - Production
With the Partitioning option
JServer Release 8.1.6.0.0 - Production
SVRMGR> Connected.
SVRMGR> Echo OFF
SVRMGR> SVRMGR> Statement processed.
SVRMGR> SVRMGR> Connected.
Echo OFF
Statement processed.
...
Server Manager complete.
$
SQL> drop user c$dcischem cascade;
function ddpackage.fsetdebug ( value IN NUMBER ) => NUMBER
0 - No logging at all.
1 - Error messages only (the default value).
5 - Warnings and errors.
9 - Notes, warnings, and errors.
function ddpackage.ftestlicense ( product IN VARCHAR2 ) => NUMBER
function ddpackage.fhostid ( ) => VARCHAR2
function ddpackage.fversion ( which in VARCHAR2 ) => VARCHAR2
'toolkit' - Returns the current Daylight Toolkit version.
'daycart' - Returns the cartridge executable version.
function ddpackage.fsmi2cansmi ( smiles IN VARCHAR2, type IN NUMBER )
=> VARCHAR2
operator smi2cansmi ( smiles IN VARCHAR2, type IN NUMBER ) => VARCHAR2
function ddpackage.fsmi2netch ( smiles IN VARCHAR2 ) => NUMBER
operator smi2netch ( smiles IN VARCHAR2 ) => NUMBER
function ddpackage.fsmi2hcount ( smiles IN VARCHAR2 ) => NUMBER
operator smi2hcount ( smiles IN VARCHAR2 ) => NUMBER
function ddpackage.fsmi2mf ( smiles IN VARCHAR2 ) => VARCHAR2
operator smi2mf ( smiles IN VARCHAR2 ) => VARCHAR2
function ddpackage.fsmi2amw ( smiles IN VARCHAR2 ) => NUMBER
operator smi2amw ( smiles IN VARCHAR2 ) => NUMBER
function ddpackage.fsmi2graph ( smiles IN VARCHAR2 ) => VARCHAR2
operator smi2graph ( smiles IN VARCHAR2 ) => VARCHAR2
function ddpackage.fsmi2fp ( smiles IN VARCHAR2, min IN NUMBER, max IN NUMBER,
nbits IN NUMBER ) => VARCHAR2
operator smi2fp ( smiles IN VARCHAR2, min IN NUMBER, max IN NUMBER,
nbits IN NUMBER ) => VARCHAR2
function ddpackage.fsmi2xfp ( smiles IN VARCHAR2, min IN NUMBER,
max IN NUMBER, nbits IN NUMBER ) => VARCHAR2
operator smi2xfp ( smiles IN VARCHAR2, min IN NUMBER,
max IN NUMBER, nbits IN NUMBER ) => VARCHAR2
function ddpackage.ffoldfp ( fpstr IN VARCHAR2, nbits IN NUMBER,
dens IN NUMBER ) => VARCHAR2
operator foldfp ( fpstr IN VARCHAR2, nbits IN NUMBER,
dens IN NUMBER ) => VARCHAR2
function ddpackage.fbitcount ( fpstr IN VARCHAR2 ) => NUMBER
operator bitcount ( fpstr IN VARCHAR2 ) => NUMBER
function ddpackage.fnbits ( fpstr IN VARCHAR2 ) => NUMBER
operator nbits ( fpstr IN VARCHAR2 ) => NUMBER
function ddpackage.fisfp ( fpstr IN VARCHAR2 ) => NUMBER
operator isfp ( fpstr IN VARCHAR2 ) => NUMBER
function ddpackage.fatomnorm ( smiles IN VARCHAR2, list IN VARCHAR2,
ntuple IN NUMBER, isotype IN NUMBER) => VARCHAR2
operator atomnorm ( smiles IN VARCHAR2, list IN VARCHAR2,
ntuple IN NUMBER, isotype IN NUMBER) => VARCHAR2
function ddpackage.fbondnorm ( smiles IN VARCHAR2, list IN VARCHAR2,
ntuple IN NUMBER, isotype IN NUMBER) => VARCHAR2
operator bondnorm ( smiles IN VARCHAR2, list IN VARCHAR2,
ntuple IN NUMBER, isotype IN NUMBER) => VARCHAR2
function ddpackage.fexact ( a IN VARCHAR2, b IN VARCHAR2) => NUMBER
operator exact ( a IN VARCHAR2, b IN VARCHAR2) => NUMBER
function ddpackage.fgraph ( smiles1 IN VARCHAR2, smiles2 IN VARCHAR2 ) =>
NUMBER
operator graph ( smiles1 IN VARCHAR2, smiles2 IN VARCHAR2 ) => NUMBER
function ddpackage.ftautomer ( smiles1 IN VARCHAR2, smiles2 IN VARCHAR2 ) =>
NUMBER
operator tautomer ( smiles1 IN VARCHAR2, smiles2 IN VARCHAR2 ) => NUMBER
function ddpackage.freactant ( smiles1 IN VARCHAR2, smiles2 IN VARCHAR2 ) =>
NUMBER
operator reactant ( smiles1 IN VARCHAR2, smiles2 IN VARCHAR2 ) => NUMBER
function ddpackage.fagent ( smiles1 IN VARCHAR2, smiles2 IN VARCHAR2 ) =>
NUMBER
operator agent ( smiles1 IN VARCHAR2, smiles2 IN VARCHAR2 ) => NUMBER
function ddpackage.fproduct ( smiles1 IN VARCHAR2, smiles2 IN VARCHAR2 ) =>
NUMBER
operator product ( smiles1 IN VARCHAR2, smiles2 IN VARCHAR2 ) => NUMBER
function ddpackage.fcontains ( smiles1 IN VARCHAR2, smiles2 IN VARCHAR2) =>
NUMBER
operator contains ( smiles1 IN VARCHAR2, smiles2 IN VARCHAR2) => NUMBER
function ddpackage.fmatches ( smiles IN VARCHAR2, smarts IN VARCHAR2) => NUMBER
operator matches ( smiles IN VARCHAR2, smarts IN VARCHAR2) => NUMBER
function ddpackage.feuclid ( fp_or_smi1 IN VARCHAR2, fp_or_smi2 IN VARCHAR2 )
=> NUMBER
operator euclid ( fp_or_smi1 IN VARCHAR2, fp_or_smi2 IN VARCHAR2 ) => NUMBER
function ddpackage.ftanimoto ( fp_or_smi1 IN VARCHAR2,
fp_or_smi2 IN VARCHAR2 ) => NUMBER
operator tanimoto ( fp_or_smi1 IN VARCHAR2, fp_or_smi2 IN VARCHAR2 ) => NUMBER
function ddpackage.ftversky ( fp_or_smi1 IN VARCHAR2, fp_or_smi2 IN VARCHAR2,
alpha IN NUMBER, beta IN NUMBER ) => NUMBER
operator tversky ( fp_or_smi1 IN VARCHAR2, fp_or_smi2 IN VARCHAR2,
alpha IN NUMBER, beta IN NUMBER ) => NUMBER
function ddpackage.ffingertest ( fp_or_smi1 IN VARCHAR2,
fp_or_smi2 IN VARCHAR2 ) => NUMBER
operator fingertest ( fp_or_smi1 IN VARCHAR2, fp_or_smi2 IN VARCHAR2 )
=> NUMBER
function ddpackage.fprogob ( name IN VARCHAR2, message IN VARCHAR2) => VARCHAR2
function ddpackage.fprogob ( name IN VARCHAR2, message IN OUT CLOB,
answer IN OUT CLOB) => NUMBER
SQL> create index <index name> on <table> ( <column> )
indextype is c$dcischem.ddexact;
SQL> select * from <table> where exact(<column>, <SMILES>) = 1;
SQL> create table test (id number, usmiles varchar2(4000),
asmiles varchar2(4000));
SQL> desc test;
Name Null? Type
--------------------------------- -------- ----------------------------
ID NUMBER
USMILES VARCHAR2(4000)
ASMILES VARCHAR2(4000)
SQL> insert into test values (1234, smi2cansmi('Cl/C=C/Cl', 0),
smi2cansmi('Cl/C=C/Cl', 1));
SQL> select * from test;
ID USMILES ASMILES
-------- ------- -------
1234 ClC=CCl Cl/C=C/Cl
SQL> create index usmi_index on test(usmiles) indextype
is c$dcischem.ddexact;
SQL> create index asmi_index on test(asmiles) indextype
is c$dcischem.ddexact;
SQL> select * from test
where exact(usmiles, smi2cansmi('Cl/C=C/Cl', 0)) = 1;
ID USMILES ASMILES
-------- ------- -------
1234 ClC=CCl Cl/C=C/Cl
SQL> select * from test
where exact(asmiles, smi2cansmi('Cl/C=C/Cl', 1)) = 1;
ID USMILES ASMILES
-------- ------- -------
1234 ClC=CCl Cl/C=C/Cl
SQL> delete from <table> where exact(smiles, 'CCN') = 1;
ERROR at line 1:
ORA-29903: error in executing ODCIIndexFetch() routine
ORA-29400: data cartridge error
ERROR ORA-04091: table C$DCISCHEM.T1 is mutating, trigger/function may
not see it
C$DCISCHEM-101: CARTRIDGE ERROR
ORA-06512: at "C$DCISCHEM.DDEXACT_IM", line 159
ORA-06512: at line 1
SQL> create table <temptable> (zapcol) as
select rowid from <table> where exact(smiles, 'CCN') = 1;
SQL> delete from <table> where rowid in
(select zapcol from <temptable>)
Graph and Tautomer indextype (ddgraph):
SQL> create index <index name> on <table> ( <SMILES column> )
indextype is c$dcischem.ddgraph;
SQL> select * from <table> where graph(<SMILES column>, <SMILES>) = 1;
SQL> select * from <table> where tautomer(<SMILES column>, <SMILES>) = 1;
SQL> create index <index name> on <table> ( <SMILES column> )
indextype is c$dcischem.ddrole;
SQL> select * from <table> where reactant(<SMILES column>, <SMILES>) = 1;
SQL> select * from <table> where agent(<SMILES column>, <SMILES>) = 1;
SQL> select * from <table> where product(<SMILES column>, <SMILES>) = 1;
-- Default internal fingerprint size of 512 bits
SQL> create index <index name> on
<table> ( <SMILES column> ) indextype is c$dcischem.ddblob;
-- Specify internal fingerprint size of 64 bits
SQL> create index <index name> on
<table> ( <SMILES column> ) indextype is c$dcischem.ddblob
parameters ('fpsize=64');
-- Fpsize taken from size of data in FP column (fpsize parameter, if it were
specified, must match the size of data in the column)
SQL> create index <index name> on
<table> ( <FP column> ) indextype is c$dcischem.ddblob;
SQL> select * from <table>
where contains(<SMILES column>, <SMILES query>) = 1;
SQL> select * from <table>
where matches(<SMILES column>, <SMARTS query>) = 1;
SQL> select * from <table>
where tanimoto(<SMILES or FP column>, <SMILES or FP>) > 0.8;
SQL> select * from <table>
where tversky(<SMILES or FP column>, <SMILES or FP>, 0.5, 0.5) > 0.8;
SQL> select * from <table>
where euclid(<SMILES or FP column>, <SMILES or FP>) < 0.2;
SQL> select * from <table>
where fingertest(<SMILES or FP column>, <SMILES or FP>) = 1;
-- We specified a preferred size for the internally generated fingerprints
-- as 1024 bits, overriding the default of 512 bits. These fingerprints
-- are internal to the blob and aren't visible.
SQL> create index asmi_blob_index on test(asmiles) indextype is
c$dcischem.ddblob parameters ('fpsize=1024');
SQL> select * from test where matches(asmiles, '[N;H1,H2]ccC=O') = 1;
SQL> select * from test where tanimoto(asmiles, 'c1ccccc1') > 0.8;
-- A fingerprint can be used directly as a query for tanimoto(),
-- euclid(), fingertest() and tversky() instead of a SMILES. The
-- fingerprint size must match the size of the fingerprints used in
-- index creation. In this case, 1024 bits.
SQL> select * from test where
tanimoto(asmiles, smi2fp('OC(=O)CS', 0, 7, 1024)) > 0.8;
SQL> alter table test add (fp varchar (15));
SQL> update test set fp = smi2fp(usmiles, 0, 7, 64);
-- This index, on the FP column, uses the size of the fingerprints in
-- the column (64 bits, in this case) in index creation.
SQL> create index fp_blob_index on test(fp) indextype is
c$dcischem.ddblob;
-- As with an index on a SMILES column, a SMILES query or a fingerprint
-- can be used as a query for tanimoto(), euclid(), fingertest() and
-- tversky().
SQL> select smi2fp('OC(=O)CS', 0, 7, 64) from dual;
SMI2FP('OC(=O)CS',0,7,64)
--------------------------
IM9vA4w127g.2
-- The following two queries give identical results.
SQL> select * from test where tanimoto(fp, 'IM9vA4w127g.2') > 0.8;
SQL> select * from test where tanimoto(fp, 'OC(=O)CS') > 0.8;
-- This screens and searches using the index; internally it uses 1024 bit
-- fingerprints. The output is generated using the functional form, at
-- its default of 512 bits. Remember that the functional forms will use a
-- fingerprint size of 512 bits unless one or both of the parameters is a
-- fingerprint. The tanimoto values output by the functional form may not
-- match our search criteria based on the folding-related differences
-- between 1024-bit and 512-bit fingerprints.
SQL> select smiles, tanimoto(asmiles, 'NCCc1ccccc1') from test
where tanimoto(asmiles, 'NCCc1ccccc1') > 0.8;
-- This screens and searches using the index; internally using 1024 bit
-- fingerprints. The output is generated using the functional form; this
-- time calculated at 1024 bits. By passing a fingerprint of 1024 bits
-- to the functional form of tanimoto(), we force its use of 1024 bits.
-- In this case, the tanimoto values output by the functional form will
-- exactly match those used by the index.
SQL> select smiles, tanimoto(asmiles, smi2fp('NCCc1ccccc1', 0, 7, 1024))
from test where tanimoto(asmiles, 'NCCc1ccccc1') > 0.8;
-- This searches using the index f1, using the 128 bit fingerprints from
-- the actual column for comparison. Again, the tanimoto values computed
-- by the functional form are calculated at 512 bits and will likely not
-- match those used by the index for the query.
SQL> select smiles, tanimoto(fp, 'NCCc1ccccc1') from test
where tanimoto(fp, 'NCCc1ccccc1') > 0.8;
SQL> desc progob
Name Null? Type
--------------------------------------- -------- -------------------------
NAME VARCHAR2(60)
PATH VARCHAR2(4000)
ARGS VARCHAR2(4000)
SQL> insert into progob values
('clogp', '/oracle/o815/progob/clogptalk.sh', NULL);
#!/bin/sh
DY_ROOT=/usr3/thor/day462/v462
export DY_ROOT
DY_LICENSEDATA=/usr/local/daylight/dy_license.dat
export DY_LICENSEDATA
LD_LIBRARY_PATH=/opt/SUNWspro/SC2.0:/usr/openwin/lib:/usr/lib:$DY_ROOT/lib
export LD_LIBRARY_PATH
$DY_ROOT/bin/clogptalk
create function fclogp (sosdata in varchar2) return number
as
v1 varchar2(4000);
rc number;
off1 number;
off2 number;
begin
v1 := c$dcischem.ddpackage.fprogob('clogp', sosdata);
off1 := instr(v1, ' ', 1, 1);
off2 := instr(v1, ' ', 1, 2);
rc := to_number(substr(v1, off1, off2 - off1));
return rc;
end;
SQL> select ddpackage.fprogob('clogp', 'c1ccccc1') from dual;
DDPACKAGE.FPROGOB('CLOGP','C1CCCCC1')
-----------------------------------------------------------------------------
c1ccccc1 2.142 0 LogPstar: 2.13
SQL> select fclogp('c1ccccc1') from dual;
FCLOGP('C1CCCCC1')
------------------
2.142
SQL> connect c$dcischem/secret
SQL> associate statistics with packages ddpackage using ddoptimizer;
SQL> associate statistics with indextypes ddexact using ddoptimizer;
SQL> associate statistics with indextypes ddgraph using ddoptimizer;
SQL> associate statistics with indextypes ddrole using ddoptimizer;
SQL> associate statistics with indextypes ddblob using ddoptimizer;
SQL> select * from <table> where
graph(smiles, 'c1ccccc1') = 1 and
contains(smiles, 'C') = 1;
SQL> select * from <table> where
graph(smiles, 'c1ccccc1') = 1 and
ddpackage.fcontains(smiles, 'C') = 1;
SQL> connect c$dcischem/secret
SQL> disassociate statistics with indextypes ddblob using ddoptimizer;
SQL> disassociate statistics with indextypes ddgraph using ddoptimizer;
SQL> disassociate statistics with indextypes ddrole using ddoptimizer;
SQL> disassociate statistics with indextypes ddexact using ddoptimizer;
SQL> disassociate statistics with packages ddpackage using ddoptimizer;