Daylight Summer School 2002, June 5-7, Santa Fe, NM
DayCart TCM Interface and Worksheet --
WITH HINTS
The Daycart TCM interface is a simple CGI interface which uses
a combination of normal CGI and Oracle Call Interface code to
provide a web interface to the Traditional Chinese Medicines
data. The interface can be accessed
HERE.
The web interface shows, in the source for each HTML page, the
SQL query or queries which were used to retrieve the data.
You will query the TCM data from the Web interface and then will
perform the same queries from SQLPlus and get the results.
Starting SQLPLUS:
telnet day0, login mug/coffee
sqlplus mug/coffee
The main table of interest is: tcm01_mol. It has chemical structures.
The other four tables (tcm01_plant, tcm01_plant_mol,
tcm01_ref, tcm01_mol_ref) contain data about plants and references which
are associated with the molecules.
You can get information about the columns in the tables with the 'describe'
command. For example:
describe tcm01_mol;
TCM Interface Queries
- Perform a structural query for Dopamine (NCCc1cc(O)c(O)cc1). Note
that the Web interface returns three records. Now, perform the same
query using SQLPlus. Again, you'll find the actual SQL query embedded
in the source to the html ("View...Page Source" from Netscape).
Note that the SQLPlus query returns the TCMM field, but this isn't
displayed in the web page. Where is it used?
- Perform a structural query for the following SMILES: NCCc1cc(S)ccc1,
chosen because it is not in the database. Note that the CGI attempts to
find exact matches, then substructures, and failing both of those, will
return similar structures. Perform the similarity search using SQLPlus.
Again, you'll find the actual SQL query embedded in the source to the html.
Modify the SQL query to display the molecular weight of each hit in the
results.
Modify the SQL query to only display the hits with a molecular weight of
greater than 150.
Oracle Data Manipulation
-
Create a demo table as a subset of tcm01_mol. Use your initials as part
of the name (eg. tcm_jjd) so it doesn't collide with a classmates. Only
include the first 100 records from tcm01_mol.
-
Add a column for average molecular weight. Populate the new table with
calculated values.
-
Validate the calculated molecular weight (new_mw) against the stored
molecular weight (mw). Look for a discrepancy greater than 1.0.
Multi-Table Queries
-
Find out how many structures are common between the nci table and the
tcm01_mol table.
-
Refine the query and find all structures in both nci and
tcm01_mol that contain the phenyl ethylamine substructure.
-
Both the nci and tcm01_mol tables have a CAS Registry number
column. Use the matching SMILES to validate the CAS registry numbers. That
is, for all the structures in common between the two tables, how many have
CAS registry number discrepancies?
Daylight Chemical Information Systems Inc.
support@daylight.com