Daylight Summer School

Lab: DayCart extensions to SQL

For HINTS

This set of labs guides the user through the use of Daycart functionality in SQL and PL/SQL.


LOAD DEMO DATA:

The first operation will be to load the demo tables into your local Oracle instance. The data is available from the ftp server. The filenames are: tcm01.dmp, demo.sql, acd_demo.sql, wdi_demo.sql, and wombat_demo.sql.

From Import files:

The file "tcm01.dmp" is an Oracle dump file. It was generated using the Oracle "export" program, a utility which allows one to dump data from an Oracle database into a more portable binary format. The "import" utility will load this binary data into the database.

$ imp mug/coffee
Import: Release 8.1.5.0.0 - Production on Mon May 24 15:52:51 2004

(c) Copyright 1999 Oracle Corporation.  All rights reserved.


Connected to: Oracle8i Enterprise Edition Release 8.1.5.0.0 - Production
With the Partitioning and Java options
PL/SQL Release 8.1.5.0.0 - Production

Import file: expdat.dmp > tcm01.dmp

Enter insert buffer size (minimum is 8192) 30720> <CR>

Export file created by EXPORT:V08.01.05 via conventional path

Warning: the objects were exported by TCM, not by you

import done in US7ASCII character set and US7ASCII NCHAR character set
List contents of import file only (yes/no): no > <CR>

Ignore create error due to object existence (yes/no): no > <CR>

Import grants (yes/no): yes > <CR>

Import table data (yes/no): yes > <CR>

Import entire export file (yes/no): no > yes

. importing TCM's objects into MUG
. . importing table                    "TCM01_MOL"       6780 rows imported
. . importing table                  "TCM01_PLANT"       1541 rows imported
. . importing table              "TCM01_PLANT_MOL"      12102 rows imported
. . importing table                    "TCM01_REF"        656 rows imported
. . importing table                "TCM01_MOL_REF"       7917 rows imported
Import terminated successfully without warnings.

From SQL Scripts:

Regular SQL scripts can be executed directly from within the SQLPlus environment:

$ sqlplus mug/coffee
SQL> @demo

...  (lots of output)

SQL> @acd_demo

...  (lots of output)

SQL> @wdi_demo

...  (lots of output)

SQL> @wombat_demo

...  (lots of output)

The main table of interest is: demo. It has chemical structures and ID numbers. The other tables are subsets of the ACD, WDI, and Wombat databases.

You can find out the names of the tables which are owned by the current user with a SQL select statement (from the data dictionary):

SQL>  select table_name from user_tables;

You can get information about the columns in the tables with the 'describe' command. For example:

SQL>  desc demo


Simple SQL Operators

  1. Find Dopamine (NCCc1ccc(O)c(O)c1) in the "demo" table.

  2. Find structures similar to Dopamine in the "demo" table. Impose a tanimoto cutoff of 0.7. Alternately, find the ten most similar structures to Dopamine in the "demo" table.

  3. Count the number of compounds in the demo table which have thiophenol (Sc1ccccc1) as a substructure

  4. Perform a SMARTS query on the demo table.


PL/SQL Functions

  1. Write a simple PL/SQL routine to compute the number of atoms and rings in naphthalene (c1cccc2c1cccc2).

  2. Create a PL/SQL function which implements a new (trivial) molecular property, the ring ratio, here defined as the fraction of atoms in a molecule which are in a ring.


Oracle Data Manipulation

  1. Create a demo table as a subset of tcm01_mol. Only include the first 100 records from tcm01_mol.

  2. Add a column for average molecular weight. Populate the new table with calculated values.

  3. Validate the calculated molecular weight (new_mw) against the stored molecular weight (mw). Look for a discrepancy greater than 1.0.

  4. Add columns for rotatable bonds, hydrogen donors, and ClogP and populate the table.


Multi-Table Queries

  1. Find out how many structures are common between the demo table and the tcm01_mol table.

  2. Refine the query and find all structures in both demo and tcm01_mol that contain the phenyl ethylamine (NCCc1ccccc1) substructure.

Daylight Chemical Information Systems Inc.
support@daylight.com