Cabinet and CARD: Federated Access to Chemical and Biological Data
Scott Dixon, Metaphorics, LLC
ABSTRACT
Cabinet (Chemical And Biological INformatics NETwork) comprises a set of tools for building federated chemical and biological informatics servers, together with a federation of servers built upon these tools. The Cabinet tool set includes tools for handling chemical structures (built upon the Daylight CIS toolkits) as well as protein sequences, Enzyme Commission numbers, protein and ligand structures, images, enzyme networks and other biological and genomics data types. The servers built with Cabinet tools speak HTTP and so can interact with web browsers. They also communicate with each other to exchange queries.
Each Cabinet server provides access to a certain type of information. For example, the QSAR server provides access to Quantitative Structure-Activity Relationships, while the Empath server provides access to information about metabolic pathways. Each server has a data model appropriate to the data that it serves, and the data models can be quite different from one another. In a federated system there is no need for a unified overall data model. Instead, Cabinet servers share a set of common languages which they use to exchange queries with one another. For example, the servers use the SMILES language to define chemical structures. Thus, one Cabinet server might send a SMILES to the other servers to find out what they know about that molecule or similar molecules. Each Cabinet server is free to answer that query based on its own data model. The answers to that query are collected and presented to the user as new hyperlinks which can be followed. Each page of information presented by a Cabinet server typically has one or more hyperlinks which cause such queries to be sent to the other servers to find related information. The rule in Cabinet is to click on a hyperlink if you want more information about an item or related items. Thus, there is typically no complicated query system to learn and no need to understand complex data schemas before one can use Cabinet.
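The query exchange described above can be illustrated with a minimal in-memory sketch. It is not Cabinet's actual protocol or API: the class names, URL paths and sample data are hypothetical, the two servers are only stand-ins loosely modeled on the QSAR and Empath servers, and structures are matched by exact SMILES string rather than by canonicalization or similarity search as a real chemistry toolkit would do.

```python
from dataclasses import dataclass
from typing import List, Protocol
from urllib.parse import quote


@dataclass(frozen=True)
class Link:
    """A hyperlink returned by a server in answer to a federated query."""
    server: str
    label: str
    url: str


class FederatedServer(Protocol):
    """Anything that can answer a SMILES query with a list of links."""
    name: str

    def answer(self, smiles: str) -> List[Link]: ...


class QsarServer:
    """Hypothetical server whose data model maps a structure to QSAR models."""
    name = "qsar"

    def __init__(self) -> None:
        # Assumed sample data: SMILES -> model labels.
        self.models = {"CCO": ["toxicity model A", "solubility model B"]}

    def answer(self, smiles: str) -> List[Link]:
        return [
            Link(self.name, label, f"/qsar/model/{i}?smiles={quote(smiles, safe='')}")
            for i, label in enumerate(self.models.get(smiles, []))
        ]


class PathwayServer:
    """Hypothetical server whose data model is a list of (pathway, metabolite)."""
    name = "empath"

    def __init__(self) -> None:
        # Assumed sample data, stored very differently from QsarServer.
        self.steps = [("ethanol degradation", "CCO"), ("glycolysis", "OCC(O)C=O")]

    def answer(self, smiles: str) -> List[Link]:
        return [
            Link(self.name, pathway, f"/empath/pathway?name={quote(pathway)}")
            for pathway, metabolite in self.steps
            if metabolite == smiles  # exact match only; a simplification
        ]


def federated_query(smiles: str, servers: List[FederatedServer]) -> List[Link]:
    """Send one SMILES to every server and collect the links they return."""
    links: List[Link] = []
    for server in servers:
        links.extend(server.answer(smiles))
    return links
```

The only thing the two servers share is the query language (a SMILES string) and the shape of the answer (a list of links); their internal data models stay independent, which is the point of the federated design.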
This presentation will include a short overview of Cabinet servers and then introduce one of the most recent additions to the Cabinet server suite: CARD (Cabinet Access to Relational Databases). CARD allows users to generate new Cabinet servers using simple HTML templates to control the page appearance, and SQL and simple scripting to retrieve the data from a back-end RDBMS. This allows users to integrate their own data, typically stored in Oracle or other relational systems, into the Cabinet federation, with the ability to exchange information and queries with all the other Cabinet servers. CARD provides a powerful new way to explore internal databases and to visualize the connections between those data and the wide variety of data held by other Cabinet servers.
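The general idea of pairing an HTML template with an SQL query can be sketched as follows. This is only an illustration of the technique, not CARD's template or scripting syntax, which the abstract does not show: the table, column names, template placeholders and the `/federation/query` URL are all assumptions, and an in-memory SQLite database stands in for Oracle or another back-end RDBMS.

```python
import sqlite3
from html import escape
from string import Template
from urllib.parse import quote

# Hypothetical page and row templates controlling the page appearance.
PAGE = Template("<html><body><h1>$title</h1><table>$rows</table></body></html>")
ROW = Template('<tr><td>$name</td><td><a href="$link">$smiles</a></td></tr>')

# Hypothetical query that retrieves the data to be shown on the page.
SQL = "SELECT name, smiles FROM compounds WHERE name LIKE ? ORDER BY name"


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with assumed sample data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE compounds (name TEXT, smiles TEXT)")
    conn.executemany(
        "INSERT INTO compounds VALUES (?, ?)",
        [("ethanol", "CCO"), ("acetic acid", "CC(=O)O")],
    )
    return conn


def render_page(conn: sqlite3.Connection, pattern: str, title: str = "Compounds") -> str:
    """Run the SQL query and fill the HTML templates with its rows."""
    rows = []
    for name, smiles in conn.execute(SQL, (pattern,)):
        # Each structure becomes a hyperlink that would send the SMILES
        # to the rest of the federation (the URL scheme is an assumption).
        link = "/federation/query?smiles=" + quote(smiles, safe="")
        rows.append(
            ROW.substitute(name=escape(name), smiles=escape(smiles), link=link)
        )
    return PAGE.substitute(title=escape(title), rows="".join(rows))
```

Calling `render_page(make_demo_db(), "%")` produces one table row per compound, each carrying a link built from its SMILES, so data held in a relational table becomes a starting point for queries to other servers.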