Daylight Summer School 2000, June 7-9, Santa Fe, NM

Daylight Toolkit - Class Notes

These are intended to be brief notes supplementing and outlining the course material presented in the course Introduction to Daylight. The Daylight manuals should be considered the text for the course and the authoritative documentation, and should be used in conjunction with these notes for best results!

In particular, the Daylight Toolkit Programmer's Guide and the Daylight Toolkit Reference Manual are the relevant manuals for the toolkit unit.

Introduction:

Toolkit definition:

A set of high performance chemical information algorithms with a robust and stable interface.

Design Goals:

A. Overall Capabilities

The Daylight Toolkit is a set of object libraries providing function calls for C or Fortran toolkit programs. Perl is also supported through DayPerl, a package which must be compiled by the user and which acts as a Perl wrapper to the C functions. The Toolkit is intended for use by both expert and novice programmers. In addition to the libraries themselves, the Daylight release contains several example toolkit programs, some of which may be useful in themselves, and some of which may be useful as building blocks for other programs. These examples are located in the "contrib" directory, $DY_ROOT/contrib/src/. There are C, Fortran, and Perl example programs; the C set is the most extensive.

The Daylight Toolkit has a strictly defined functional interface. It is defined by the dt_* functions and constants and their prescribed syntax as documented in the man pages. Daylight is committed to supporting this stable interface, thereby allowing forward compatibility of toolkit programs. Thus, features and functions may be added, but not taken away.

  1. Dependencies of libdt libraries

    Toolkit                          Object library  License entry  Prerequisites
    SMILES                           -ldt_smiles     smiles         (none)
    SMARTS                           -ldt_smarts     smarts         SMILES
    Depict                           -ldt_depict     depict         SMILES
    Fingerprint                      -ldt_finger     fingerprint    SMILES
    Program Object                   -ldt_progob     programobject  SMILES
    Thor                             -ldt_thor       thor           SMILES
    Merlin                           -ldt_merlin     merlin         SMILES and SMARTS
    Rubicon                          -ldc_rube       rubicon        SMILES, SMARTS and Depict
    Monomer                          -ldt_monomer    monomer        SMILES, Thor
    Reaction                         -ldt_smiles     reaction       SMILES
    Reaction w/ Reaction Transform   -ldt_smiles     reaction       SMILES and SMARTS
      Capability

    Linking order (lower level libs last):

    Shared object        Static object
    (none)               libdu.a
    libdw_xvgrins.so     libdw_xvgrins.a
    libdw_xvtdt.so       libdw_xvtdt.a
    libdw_xview.so       libdw_xview.a
    libdl_xview.so       libdl_xview.a
    libdt_apputils.so    libdt_apputils.a   (new in 4.62, only needed by XView apps)
    libdt_depict.so      libdt_depict.a
    libdl_stubs.so       libdl_stubs.a
    libdt_merlin.so      libdt_merlin.a
    libdt_thor.so        libdt_thor.a
    libdt_ipcx.so        libdt_ipcx.a
    libdt_monomer.so     libdt_monomer.a
    libdc_rube.so        libdc_rube.a
    libdt_progob.so      libdt_progob.a
    libdt_finger.so      libdt_finger.a
    libdt_smarts.so      libdt_smarts.a
    libdt_datatype.so    libdt_datatype.a
    libdt_smiles.so      libdt_smiles.a

  2. XView Widgets Toolkit

    Toolkit          Object library  License entry         Prerequisites
    Widgets (basic)  -ldw_xview      widgets depictwidget  SMILES and Depict
    GrinsWidget      -ldw_xvgrins    grinswidget           SMILES and Depict
    TDTWidget        -ldw_xvtdt      tdtwidget             SMILES, Depict and Thor
    3DWidget         -ldw_xview      3dwidget              SMILES and Depict

  3. Rubicon Toolkit

    The Rubicon Toolkit is a set of function calls not fully conforming to the object paradigm, thus prefixed "dc_", since it is a C interface. It still uses normal toolkit molecule and conformation objects, so it is very toolkit-like.

  4. DayCGI Tools

    The Daylight system has integrated web capabilities which allow access to Daylight tools via a web browser. The "DayCGI toolkit" is not a rigorously defined API as are the dt_ function libraries, the true Daylight Toolkit. Rather, DayCGI programming means combining an assortment of Daylight and non-Daylight tools to deliver Daylight database and computational services via the web.

  5. Java Tools

    Version 4.62 introduces JavaGRINS, a full-function molecular editor for use in web applications. This tool will be incorporated in Daylight web applications and provided for use in custom user applications.

  6. Reaction Toolkit

    The Reaction Toolkit doesn't have a separate object library, because its capabilities are integrated into the SMILES and SMARTS Toolkits. There are some examples available:

  7. Remote Toolkit

    The Remote Toolkit is a system for providing toolkit services to personal computers by means of the DayToolserver, which runs on a unix machine. Remote Toolkit programs are identical to normal toolkit programs except for a few functions which must be called to make the client/server connection. In this way, native Mac and Windows applications may be written using Daylight toolkit calls.

  8. The "Daylight User" (du_) Function Library

    The contrib directory contains several functions not part of the toolkit but rather built upon toolkit functions. They have been compiled into one library libdu.a for convenience.

  9. The "Drawing Library" (dl_) Functions

    The toolkit doesn't know anything about the existence of output devices for drawing structures (rendering). Hence, there is an intermediate interface which allows the toolkit to draw structures. This drawing library is called by the toolkit when it needs to perform drawing operations. The functions in the drawing library are simple drawing primitives such as "draw line", "move to", "draw circle", etc.

    A drawing library must be supplied at link-time to any application which uses the depict toolkit. Contrib contains several examples:

B. The Toolkit view

  1. objects and handles

    The API (Application Programming Interface) comprised by the toolkit possesses object oriented features but is not a complete object-oriented language such as Smalltalk or Java. The concept of an object is essential to the toolkit interface, however, and atoms, bonds, molecules, datatrees, reactions, and many other things are toolkit objects. They may be created, manipulated, and destroyed easily using the toolkit interface. The Daylight Toolkit manages the objects for you - you need not be concerned with details of how the Toolkit represents the molecule or depiction.

    An object is identified by its "handle" which is a simple and lightweight integer datatype. The handles themselves contain no information - they are not pointers to complex structures. Handles are opaque in that they cannot be dereferenced -- no access to the internal data structures is allowed. Handles are unique. Only one handle refers to one object. Handles are valid when they refer to an existing object, or invalid if they have been deallocated by the program, revoked by the toolkit or never assigned.

    Because objects are managed by the Daylight Toolkit, the interface to various programming languages is straightforward: the Daylight Toolkit works equally well with C, FORTRAN, Pascal, or LISP.

    Objects are self-describing: each object "knows" what it is. Many toolkit functions are polymorphic: they will take a variety of different object types. The function "asks" the object what type it is and performs the appropriate action.
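
    The handle idea can be illustrated with a small toy model in plain C. This is only a sketch of the concept, not the Daylight implementation: every toy_ name below is hypothetical and none of it is part of the dt_ interface. Handles are plain integers that index a private table, each slot records its own type, and one function behaves differently depending on the type of object it is handed.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: all toy_ names are hypothetical, not Daylight API. */
typedef int toy_Handle;
#define TOY_NULL_OB     0
#define TOY_MAX_OBJECTS 64

enum toy_type { TOY_TYP_NONE = 0, TOY_TYP_ATOM, TOY_TYP_MOLECULE };

struct toy_slot {
  enum toy_type type;    /* TOY_TYP_NONE marks a free slot */
  int           natoms;  /* payload, used by molecules only */
};

/* Private object table; slot 0 is reserved for the null handle. */
static struct toy_slot toy_table[TOY_MAX_OBJECTS + 1];

/* Create an object and return its handle, or TOY_NULL_OB if the table is full. */
toy_Handle toy_alloc(enum toy_type type, int natoms)
{
  toy_Handle h;

  assert(type != TOY_TYP_NONE);
  for (h = 1; h <= TOY_MAX_OBJECTS; h++) {
    if (toy_table[h].type == TOY_TYP_NONE) {
      toy_table[h].type   = type;
      toy_table[h].natoms = natoms;
      return h;
    }
  }
  return TOY_NULL_OB;
}

/* Nonzero if the handle does not refer to an existing object. */
int toy_invalid(toy_Handle h)
{
  return h < 1 || h > TOY_MAX_OBJECTS || toy_table[h].type == TOY_TYP_NONE;
}

/* The object "knows" what it is: the type is stored with the object. */
enum toy_type toy_type_of(toy_Handle h)
{
  return toy_invalid(h) ? TOY_TYP_NONE : toy_table[h].type;
}

/* Polymorphic: accepts an atom or a molecule; returns -1 for anything else. */
int toy_count_atoms(toy_Handle h)
{
  switch (toy_type_of(h)) {
    case TOY_TYP_ATOM:     return 1;
    case TOY_TYP_MOLECULE: return toy_table[h].natoms;
    default:               return -1;
  }
}

/* Destroy the object; the handle becomes invalid. */
void toy_dealloc(toy_Handle h)
{
  if (!toy_invalid(h))
    toy_table[h].type = TOY_TYP_NONE;
}
```

    The calling program never sees struct toy_slot; it only passes integers back and forth, which is why an interface of this kind is easy to bind to other languages.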

  2. Object Types

    Object        Description
    atom          atom in a molecule
    bond          bond in a molecule
    column        column of data in a pool
    conformation  3-D conformation
    cycle         ring in a molecule
    database      database object
    datafield     datum in a dataitem
    dataitem      dataitem in a TDT
    datatree      a THOR datatree (TDT)
    datatype      datatype definition
    depiction     2-D coordinates of a molecule
    fieldtype     one datafield's type
    fingerprint   fingerprint object
    hitlist       search/sort results in a Merlin database
    integer       integer object
    merserver     Merlin server connection
    molecule      molecule object
    monomer       monomer object
    monomerset    set of monomers
    monomertable  table of monomer definitions
    monopattern   search pattern for a monomer
    multimer      multimer object (CHUCKLES)
    path          results of a structural search
    pathset       set of path objects
    pattern       structural pattern (SMARTS)
    pool          Merlin database ("pool")
    program       external program object
    reaction      reaction object
    real          real number object
    sequence      ordered list of objects
    server        THOR server connection
    stream        enumerated constituents of another object
    string        string object
    substruct     substructure object
    transform     generic reaction
    varimer       varimer object (CHORTLES)
    varipattern   varimer pattern (CHARTS)
    vbind         object providing faster evaluation of a match

  3. Streams

    The toolkit function dt_stream is able to derive streams of constituent objects from parent objects. Streams of atoms or bonds can be derived from a molecule, streams of datatrees can be derived from a database, streams of molecules can be derived from a reaction. Streams are easily obtained and used, but may be revoked if the parent object is modified. Deallocating a stream does not affect its member objects.
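
    Why a stream may be revoked when its parent changes can be seen in a toy model. The sketch below is plain C and is not how the toolkit is implemented; the toy_ names and the version-counter mechanism are assumptions made purely for illustration. A stream remembers the state of its parent at creation time and stops yielding objects once the parent has been modified.

```c
#include <assert.h>

/* Toy model only: all toy_ names are hypothetical, not Daylight API. */
#define TOY_NULL_OB   0
#define TOY_MAX_ATOMS 16

typedef struct {
  int atoms[TOY_MAX_ATOMS];  /* nonzero atom identifiers */
  int natoms;
  int version;               /* bumped on every modification */
} toy_mol;

typedef struct {
  const toy_mol *parent;
  int            version;    /* parent version when the stream was made */
  int            pos;        /* index of the next atom to return */
} toy_stream;

/* Modify the parent: append an atom. Returns 1 on success, 0 if full. */
int toy_add_atom(toy_mol *m, int atom)
{
  assert(atom != TOY_NULL_OB);
  if (m->natoms >= TOY_MAX_ATOMS)
    return 0;
  m->atoms[m->natoms++] = atom;
  m->version++;
  return 1;
}

/* Derive a stream of atoms from a molecule. */
toy_stream toy_stream_atoms(const toy_mol *m)
{
  toy_stream s;

  s.parent  = m;
  s.version = m->version;
  s.pos     = 0;
  return s;
}

/* Next atom, or TOY_NULL_OB at the end or if the stream was revoked. */
int toy_next(toy_stream *s)
{
  if (s->version != s->parent->version)
    return TOY_NULL_OB;      /* parent changed: stream is revoked */
  if (s->pos >= s->parent->natoms)
    return TOY_NULL_OB;      /* end of stream */
  return s->parent->atoms[s->pos++];
}
```

    In this model the remedy after a modification is simply to derive a fresh stream from the parent.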

  4. Sequences

    Sequences behave like streams but are not deallocated if a member object is modified. They are simply ordered lists of objects. Objects may be added to or deleted from a sequence. Deallocating a sequence does not affect its member objects.

  5. Properties

    Named properties were introduced with version 4.51, superseding the previous dt_adjunct/dt_setadjunct mechanism. Any number of properties of any type may be associated with any toolkit object.

  6. Toolkit Object Relationships -- Parent objects and base objects

    A parent/child example (molecule/atoms):

    void do_stuff(dt_Handle ob)
    {
      dt_Handle atoms, atom, myparent;
    
      atoms = dt_stream(ob, TYP_ATOM);
      while (NULL_OB != (atom = dt_next(atoms)))
        {
           /*** perform operations on each atom ***/
           ...
           myparent = dt_parent(atom);
        }
      dt_dealloc(atoms);
      return;
    }
    

    A base/derivative example (molecule/depiction):

    void do_other_stuff(dt_Handle ob)
    {
      dt_Handle depiction, mybase;
    
      depiction = dt_alloc_depiction(ob);
      ...
      dt_calcxy(depiction);
      dt_depict(depiction);
      ...
      mybase = dt_base(depiction);
      return;
    }
    

  7. dt_dealloc and memory leaks

    Regardless of the programming language used to access the toolkit, the toolkit will use dynamic memory allocation in creating objects. The implication of this is that the process size of an executable program will grow as objects are allocated and care must be taken that it doesn't grow uncontrollably. If a "memory leak" is present in a loop, the process size will increase until the program crashes or the computer is paralyzed. So the issue becomes how and when to reliably deallocate objects to avoid memory leaks.

    In general, the essential task is to deallocate objects which are allocated inside loops. For example, if a new molecule is allocated for each iteration of a loop, either by dt_alloc_mol or dt_smilin, it must be deallocated inside the loop. When a molecule is deallocated, all atoms and bonds belonging to the parent molecule, and any streams over the molecule, are deallocated. Likewise, if each iteration of a loop allocates a TDT object, it must be deallocated. Objects which are allocated only once in a program need not be deallocated, but it is considered good form to do so.
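
    The allocate-inside-the-loop, deallocate-inside-the-loop discipline can be sketched without the toolkit. The example below is plain C with hypothetical toy_ names standing in for the molecule allocation and deallocation calls; it is not Daylight code. A counter of live objects makes the leak visible: if the call to toy_mol_dealloc were removed, the count returned would equal the number of input strings rather than zero.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: all toy_ names are hypothetical, not Daylight API. */
typedef struct {
  char *smiles;              /* private copy of the input string */
} toy_mol;

static long toy_live_objects = 0;   /* objects allocated but not yet freed */

/* Stand-in for "parse a string into a new object". NULL on failure. */
toy_mol *toy_mol_from_smiles(const char *smiles)
{
  toy_mol *m;

  assert(smiles != NULL);
  m = malloc(sizeof *m);
  if (m == NULL)
    return NULL;
  m->smiles = malloc(strlen(smiles) + 1);
  if (m->smiles == NULL) {
    free(m);
    return NULL;
  }
  strcpy(m->smiles, smiles);
  toy_live_objects++;
  return m;
}

/* Stand-in for deallocating an object and everything it owns. */
void toy_mol_dealloc(toy_mol *m)
{
  if (m == NULL)
    return;
  assert(toy_live_objects > 0);
  free(m->smiles);
  free(m);
  toy_live_objects--;
}

/* Process n strings, one object per iteration.  Returns the number of
   objects still alive afterwards; 0 means the loop does not leak. */
long toy_process_all(const char *const *smiles, size_t n, size_t *total_len)
{
  size_t i;

  *total_len = 0;
  for (i = 0; i < n; i++) {
    toy_mol *m = toy_mol_from_smiles(smiles[i]);
    if (m == NULL)
      continue;                       /* skip input that could not be stored */
    *total_len += strlen(m->smiles);  /* ... work on the object ... */
    toy_mol_dealloc(m);               /* deallocate before the next iteration */
  }
  return toy_live_objects;
}
```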

  8. SMILES and the Toolkit

    SMILES is a language for representing a molecule as an ASCII string. The toolkit can parse a SMILES into a molecule and express a molecule as a SMILES. But it should be noted that the molecule object is not itself a SMILES. This correspondence between a toolkit object and a linguistic representation is common in the toolkit.

    Object        Language
    datatree      TDT
    fingerprint   FP datum
    molecule      SMILES
    monomer       Monomer SMILES
    multimer      CHUCKLES
    pattern       SMARTS
    reaction      Reaction SMILES
    transform     SMIRKS
    varimer       CHORTLES
    varipattern   CHARTS

  9. Persistence of Objects and Returned Values

    Objects persist until they are deallocated by the program, revoked by the toolkit, or until the program terminates. In contrast, in C, a returned string is a (char *) pointer to a location in memory containing the string, managed by the toolkit, and not guaranteed to persist after the next toolkit function call.
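
    The practical consequence is that a returned string should be copied if it is needed after the next toolkit call. The sketch below shows the hazard and the remedy in plain C. Both functions are hypothetical stand-ins: toy_name imitates a call that returns a string in storage it owns and reuses, and toy_keep is an illustrative helper, not a toolkit function.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Toy model only: all toy_ names are hypothetical, not Daylight API. */

/* Returns a string held in storage owned by this function; the same
   storage is overwritten on every call.  *len receives the length. */
const char *toy_name(int id, int *len)
{
  static char buffer[32];

  *len = snprintf(buffer, sizeof buffer, "object-%d", id);
  return buffer;
}

/* Copy len bytes into freshly allocated, NUL-terminated memory that the
   caller owns and must free().  Returns NULL if allocation fails. */
char *toy_keep(const char *s, int len)
{
  char *copy;

  assert(s != NULL && len >= 0);
  copy = malloc((size_t)len + 1);
  if (copy == NULL)
    return NULL;
  memcpy(copy, s, (size_t)len);
  copy[len] = '\0';
  return copy;
}
```

    A typical use is to call toy_keep immediately on the returned pointer and length, before making any further call that might reuse the storage, and to free() the copy when it is no longer needed.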

C. Programming languages

  1. C

    C is the native language of the toolkit and so the overhead required is least when programming in C. Thus, the speed of C programs may be superior to Fortran. C compilers are readily available on all machines, including Free Software Foundation GNU C (gcc). Most of the contrib code available from Daylight is in C, providing a head start for many programming tasks. But many programmers, especially novice programmers, find C to be hazardous and cryptic.

  2. F77

    Fortran compilers are readily available thanks to the historical importance and proliferation of Fortran code. Limitations in Fortran-77 were corrected in Fortran-90, but F90 is not as widely used and not supported by Daylight. Much less contrib code is available in F77.

  3. Perl

    Perl is an interpreted scripting language, so Perl programs will generally be slower than C or Fortran equivalents. However, for many users ease of use and the absence of a compilation step are advantages that far outweigh the performance loss. The amount of contrib code is small but growing, and C programs are generally easily translated to Perl.

  4. others

    It is possible to construct wrappers for various other compiled and scripting languages, including Pascal, LISP, C++, Java, tcl, python, and guile. These are not currently supported by Daylight, however.

D. Programming tools and environments

  1. Debuggers

    Debugging is essential to any project. In addition to the usual debugging methods, the Daylight Toolkit provides error handling functions which can be used for debugging, as well as "vigilance". The vigilant toolkit monitors handles and can report whether a handle is valid or not.

    The following functions comprise this facility:

    dt_invalid
    dt_vh_stop_here
    dt_vh_count (unsupported)

  2. Purify

    Purify is a useful tool for discovering memory leaks in applications. We use Purify extensively for our internal development. There are other tools which perform similar functions (Insure); however, we can provide more hints to Purify users than to users of other packages.

    A well-behaved toolkit program shows a common set of behaviors on completion:

E. The functions

See the Daylight Toolkit Reference Manual for the complete functional definitions of all 373 functions in version 4.62.

F. Contrib examples

G. Web development

There is no Daylight toolkit for web development in any strict sense, with an integrated and consistent developer environment. However, there are several tools in the Daylight suite which are useful for web development.

H. Case studies

  1. WinMerlin (Bernd Rohde, Novartis)
    [Screenshots: WMSpread.gif, Scatter.gif, FieldList.gif]

    Tools used:

  2. Stigmata (Norah MacCuish, Daylight)

    Tools used:

  3. mol2smi (Jeremy Yang, Daylight)

    Tools used:

  4. UC_SELCT (Geoff Skillman, UCSF)

    Tools used:

  5. daybase.cgi (Jeremy Yang, Daylight)

    Tools used:

  6. Diversity Map (Bernd Rohde, Novartis)

    Tools used:

  7. SPURT (Synthesis Planning Using Reaction Types) (Bernd Rohde, Novartis)

    Tools used:


Daylight Chemical Information Systems Inc.
support@daylight.com