1. Introduction

Daylight's Goal
To provide the best known computer algorithms for chemical information processing to those who need them; to provide chemical information systems capable of handling all of the chemical information in the world.

Computers are used to solve many problems in chemistry, including predicting the properties of a molecule, maintaining and searching databases of chemical properties, and deducing structure from chemical properties. These tasks are challenging to the computer scientist and to the chemist.

Unfortunately, many workers in this field waste a great deal of their time creating and recreating what might be termed the chemical information infrastructure -- programs to read connection tables, maintain databases, depict molecules, perform sub- and superstructure searches, similarity searches, and so forth. Although all of this infrastructure is well understood and widely duplicated, it has never been made available in any reusable form. The result is that on any particular chemistry project that uses computers, the majority of time is spent recreating the infrastructure.

The Daylight Toolkit provides this chemical-information infrastructure. Using a simple set of functions based loosely on an "Object Oriented Programming" model, it lets programmers go straight to their own chemical problems instead of reinventing the infrastructure. Projects can often be finished in a fraction of the time that a reinvent-the-wheel approach would have required.

The Daylight Toolkit supports several popular languages, including C and FORTRAN, and is available on a variety of platforms, including several UNIX machines, PCs and Macintoshes.

1.1 Daylight Toolkit Modules

The Toolkit is divided into several modules, available separately or as packages. Some modules require other modules to be present; for example, the ToolBase and SMILES modules are required by all other modules. Below is a brief outline of each Toolkit module's features:

ToolBase This module is the foundation for all Daylight Toolkit modules. Includes Handles, streams and sequences, error functions, polymorphism, string objects, and other basic functionality.

SMILES (Simplified Molecule Input Line Entry System) SMILES input and output; unique-SMILES generation; molecular connection tables; isomerism; chirality; addition, deletion, and modification of atoms and bonds; substructure objects (but not substructure searching -- see SMARTS). (See Chapter 2 for a complete description of SMILES).

SMARTS (SMILES Arbitrary Target Specification) A substructure search system utilizing SMARTS, an extension of SMILES which allows chemically meaningful expressions to be constructed.

Fingerprints Generating molecular "fingerprints", characteristic arrays of bits that allow high-speed screening for substructure-search systems, and similarity metrics for molecules.

Depict 2-D schematic representations (depictions) of molecules; 3-D conformations; depiction modification; atom, bond, and whole-depiction labels; rotations.

Thor (THesaurus Oriented Retrieval) Chemical Databases using a thesaurus-like approach; allows high-speed storage and retrieval of chemical information using ambiguous or inexact identifiers, synonyms, trade names, and so forth.

Merlin High-speed in-memory searching of chemical structure and chemical information.

1.2 Audience and Background

This document is intended for programmers who will be incorporating the Daylight Toolkit into their own programs (applications). Because of the diversity of backgrounds expected in such an audience (from Computer Scientists to Chemists to System Administrators), we try to err on the side of being verbose.

Experience with computer programming and chemistry is expected; in particular, you must be familiar with your application language (FORTRAN, C, Pascal, etc.), and with basic chemical nomenclature. Familiarity with data structures such as hash tables is helpful but not necessary, as is an acquaintance with the concept of "Object Oriented Programming".

1.3 Other References

The Daylight Chemical Information Systems: Theory of Operation manual is background for this manual. A thorough understanding of its contents is necessary before this material will make sense.

This manual is intended to serve as a tutorial introduction to toolkit programming. The on-line manual pages serve as the authoritative reference for toolkit functionality and behavior.

This document is a companion to the Daylight Toolkit Reference Card, which contains exact specifications for the functions available in the Daylight Toolkit, and comes in language-specific versions (C, FORTRAN, etc.).

The Daylight Toolkit also comes with a number of ready-to-compile example programs in the "contrib" directory. These can be extremely useful as a starting point in working with the Toolkit. We suggest that you glance through the examples before serious perusal of this manual to get an idea of what's there. The examples may clarify many of the explanations given here.

1.4 Conventions

Because the Daylight Toolkit is designed to work with a variety of languages, we use a generic or "function prototype" technique to describe each function. For example, consider the following function prototype:

     dt_type(object ob) => integer

It translates to the following:

     C Prototype:        int dt_type(dt_Handle ob)

     FORTRAN Prototype:  integer function dt_type(ob)
                         integer ob

NOTE: The actual function calls for a given language may be significantly different from the "prototypes" shown in this manual. Consult the online man-pages for exact function specifications.

In particular, strings are represented differently in most languages. For example, the function prototype:

     dt_stringvalue(Handle ob) => string s

would translate to the following C function:

     char *dt_stringvalue(int *len, dt_Handle ob)

Notice that the actual function doesn't even have the same number of parameters as the function prototype! In general, you should read the manual to get the function's description, then refer to the Daylight Toolkit Reference Card for the function's exact syntax.

Descriptions of functions that return strings often refer to the invalid string, which is often returned when a function that returns a string detects an error. The specific definition of the invalid string is language dependent.

1.5 Compiling and Linking

As a Daylight Toolkit programmer, you will compile source code to object code and link object code to an executable binary using the Daylight Toolkit and operating system libraries. The following information shows the syntax for building a Daylight Toolkit program.

1.5.1 Compiling

Let's say you have the following source code in a file named smiles.c:

    #include <stdio.h>
    #include <string.h>
    #include "dt_smiles.h"

    int main(int argc, char **argv) {
        dt_Handle mol;

        if (argc < 2) {
            /* No SMILES given on the command line. */
            printf("Usage: %s <SMILES>\n", argv[0]);
            return 1;
        }

        /* dt_smilin() returns NULL_OB if the SMILES cannot be parsed. */
        mol = dt_smilin(strlen(argv[1]), argv[1]);
        if (NULL_OB == mol) {
            printf("SMILES is not valid.\n");
        } else {
            printf("SMILES is valid.\n");
            dt_dealloc(mol);    /* release the molecule object */
        }
        return 0;
    }

The syntax for compiling source code is:
    compiler [ options ] file

The compiler is typically the operating system's standard C (cc) or FORTRAN (f77) compiler, or the GNU project's C and C++ (gcc) or FORTRAN (g77) compiler.

The option to compile source code to object code is -c and the option to specify the directory location of Daylight "#include" files is -I$DY_ROOT/include.

The file is smiles.c.

The following compiles the smiles.c source code, producing an object code file named smiles.o.

    cc -c -I$DY_ROOT/include smiles.c

1.5.2 Linking

Now, let's link the object code to an executable binary. The syntax for linking object code is:
    compiler [ options ] file toolkits [ libraries ]

The compiler is the same as before.

The option to link object code to an executable binary is -o <filename>, e.g., -o smiles, and the option to specify the directory location of the Daylight Toolkit libraries is -L$DY_ROOT/lib.

The file is smiles.o.

The toolkits define Daylight functions. In this case, the SMILES Toolkit is required by smiles.c (which calls dt_smilin(3) and dt_dealloc(3)), and is specified by -ldt_smiles.

No additional libraries are listed in this example: smiles.c calls only standard C library functions (printf and strlen), which the compiler links by default.

The following links the smiles.o object code to the SMILES Toolkit, producing an executable binary file named smiles.

    cc -o smiles -L$DY_ROOT/lib smiles.o -ldt_smiles

Alternatively, you may combine the compile and link commands, e.g.,

    cc -o smiles -I$DY_ROOT/include -L$DY_ROOT/lib smiles.c -ldt_smiles

1.5.3 Toolkit Libraries

1.5.4 Advanced Programming

The option to specify the directory location of the X libraries is -L$(XVIEW_LIB) -L$(X_LIB). The definitions of XVIEW_LIB and X_LIB are operating system dependent and are shown in the table below:

    Platform           XVIEW_LIB                 X_LIB
    Red Hat Linux      /usr/openwin/lib          /usr/X11/lib
    SGI Irix 32-bit    /usr/local/openwin/lib    /usr/lib32
    SGI Irix 64-bit    /usr/local/openwin/lib    /usr/lib64
    SUN Solaris        /usr/openwin/lib          /usr/X/lib

The files are the object code, which may be several files. In this case, there's one file, smiles.o.

The toolkits required depend on which parts of the Daylight Toolkit your program uses. You can determine which toolkits are required from the "Library Linkage" section of the Daylight Toolkit Functions manual pages, or by visiting the Daylight website at http://www.daylight.com/dayhtml/doc/man/man3/index.html.

The syntax for linking to a library is -l<library>. On IRIX and Solaris (but not Linux), multiple libraries must be linked in a specific order. Below is a list of Daylight Toolkits with their library syntax and dependencies, in the order of required linkage.

    Toolkit                            Syntax (DY_LIB)             Dependencies
    SMILES                             -ldt_smiles                 none
    SMARTS                             -ldt_smarts                 -ldt_smiles
    Fingerprint                        -ldt_finger                 -ldt_smiles
    Reaction                           -ldt_smiles                 none
    Reaction w/ Transforms             -ldt_smarts                 -ldt_smiles
    Thor                               -ldt_thor                   -ldt_ipcx -ldt_smiles
    Merlin                             -ldt_merlin                 -ldt_ipcx -ldt_smarts -ldt_smiles
    Rubicon                            -ldc_rube                   -ldt_depict -ldt_smarts -ldt_smiles
    Program Object                     -ldt_progob                 -ldt_smiles
    Grins Widget                       -ldw_xvgrins                none
    TDT Widget                         -ldw_xvtdt                  none
    Basic Widget                       -ldw_xview                  none
    Depict                             -ldt_depict                 none
    "contrib" Depict                   -ldl_xview or -ldl_stubs    none
    "contrib"                          -ldu                        none
    none (for database definitions)    -ldt_datatype               none
    none (for XView applications)      -ldt_apputils               none

Finally, the libraries required depend on your use of X graphics and operating system routines. Some Daylight Toolkits (for example, the Depict Toolkit) require X libraries, and all require operating system libraries. As with Toolkit libraries, the syntax for linking to a library is -l<library>, and the libraries must be linked in a specific order. Below are the X graphics and operating system libraries, in the order of required linkage.

    X Graphics         Syntax (GFX_LIB)
    XView              -lxview -lolgx -lX11 -lXext

    Operating System   Syntax (OS_LIB)
    Red Hat Linux      -lnsl -ldl -lm
    SGI Irix 32-bit    -lsocket -lnsl -lw -ldl -lm -lmalloc
    SGI Irix 64-bit    -lsocket -lw -lm -lmalloc
    SUN Solaris        -lsocket -lnsl -lw -ldl -lintl -lm -lmalloc

Note: XView graphics is not available on SGI Irix 64-bit systems.

Now we have all the information we need to link together a Daylight Toolkit program. If your program used every Daylight Toolkit, the syntax for linking a HelloWorld C program on 32-bit SGI Irix would be:

    cc -o HelloWorld HelloWorld.c -L$(DY_ROOT)/lib -L$(XVIEW_LIB) -L$(X_LIB) $(DY_LIB) $(GFX_LIB) $(OS_LIB) ${DY_ROOT}/contrib/lib/libdl_stubs