13. Program Object Toolkit
13.1 Introduction
Program objects are used to provide two-way communication with an external process, e.g., the clogp program for computing hydrophobicity for a structure represented in SMILES. Using program objects, a calling program can start an external program, send it input, receive its output, and perform other tasks while the external program remains running and ready for more input.
A number of programs supporting program objects are supplied with the release of Daylight Software. Most of these are supplied as contributed code, in the directory:
    $DY_ROOT/contrib/src/progob
The programs clogptalk and cmrtalk operate as program objects (in $DY_ROOT/bin).
13.2 Using Program Objects
Program objects are normal UNIX programs, scripts, etc. which communicate through standard input and standard output using ASCII messages with a specifically defined protocol (the "PIPETALK" protocol). Any executable within the UNIX environment which adheres to this protocol can be used as a program object. Note that program object programs need not be Daylight Toolkit programs. There are example program objects within the "contrib/src" directory in the standard distribution.
This approach allows a program to be used like a function, but without the need to link to the object libraries underlying the program. For instance, linking a program to functions written in C (e.g. X-windows) and in FORTRAN (e.g. the MedChem library) is extremely difficult in some versions of UNIX.
This approach also avoids the high overhead associated with running external programs from files whenever their functions are needed. For instance, some users have implemented the following approach to clogp computation from a SMILES:
Aside from all that file manipulation, this is an extremely slow method because the clogp program must initialize itself each time a computation is run (although clogp's computations are fast, its initialization is slow because it has to read in the fragment database, read in customizations, etc.). This poor performance is due to the one-way nature of pipe communication via the shell. Using clogp as a program object eliminates such problems.
Program objects are created by the function dt_alloc_program() from an executable file name. Messages consist of zero or more ASCII strings and are represented in the Daylight Toolkit by a sequence of string objects. Once the calling program has created a program object, it can converse with it using messages, via dt_converse(). Program objects are deallocated with dt_dealloc().
13.2.1 Welcome and Farewell Messages
The primary type of communication with a program object is a request and a reply: the calling program sends a message and the program object responds with a message. There are two other situations where program objects can send messages.
A program object sends an unsolicited message when it is first invoked; this is called the "welcome message" and is obtained with dt_welcome(). All program objects must send a welcome message (although it may be empty), and all programs which allocate program objects should call dt_welcome() after a successful return from dt_alloc_program().
A program object also sends a message when it is terminated; this is called the "farewell message". Calling dt_converse() with a NULL_OB message terminates the program and returns the farewell message. Sending NULL_OB is like sending a program an end-of-file. Any further calls to dt_converse() will produce empty messages. The program object should still be deallocated via dt_dealloc(). It is acceptable to deallocate a program with dt_dealloc() at any time (however, the farewell message will be lost).
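The toolkit functions hide the plumbing behind this conversation. The sketch below illustrates the underlying idea with plain POSIX calls (pipe, fork, execvp) rather than the Daylight Toolkit: it is not the toolkit's implementation, and the names pt_conn, pt_start, pt_read_reply, pt_converse and pt_stop are hypothetical. It assumes a child that follows the protocol described in section 13.3, and for simplicity it sends single-line requests only.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define DX_PT_EOM "Qwerty: Over."

/* Hypothetical handle for a running child; not a Daylight type. */
typedef struct {
    pid_t pid;
    FILE *to_child;    /* connected to the child's standard input  */
    FILE *from_child;  /* connected to the child's standard output */
} pt_conn;

/* Start argv[0] with stdin/stdout attached to two pipes. Returns 0 or -1. */
static int pt_start(pt_conn *c, char *const argv[])
{
    int down[2], up[2];   /* down: parent to child, up: child to parent */

    assert(c != NULL && argv != NULL && argv[0] != NULL);
    if (pipe(down) != 0)
        return -1;
    if (pipe(up) != 0) { close(down[0]); close(down[1]); return -1; }
    c->pid = fork();
    if (c->pid < 0) {
        close(down[0]); close(down[1]); close(up[0]); close(up[1]);
        return -1;
    }
    if (c->pid == 0) {                       /* child process */
        dup2(down[0], STDIN_FILENO);
        dup2(up[1], STDOUT_FILENO);
        close(down[0]); close(down[1]); close(up[0]); close(up[1]);
        execvp(argv[0], argv);
        _exit(127);                          /* exec failed */
    }
    close(down[0]);
    close(up[1]);
    c->to_child = fdopen(down[1], "w");
    c->from_child = fdopen(up[0], "r");
    return (c->to_child != NULL && c->from_child != NULL) ? 0 : -1;
}

/* Read one message into buf (content lines kept with their newlines).
 * Returns 0 at EOM, -1 on EOF or if buf is too small (the stream is then
 * left mid-message, which a real caller would have to handle). */
static int pt_read_reply(pt_conn *c, char *buf, size_t size)
{
    char *line = NULL;
    size_t cap = 0, used = 0;
    ssize_t len;
    int rc = -1;

    assert(size > 0);
    buf[0] = '\0';
    while ((len = getline(&line, &cap, c->from_child)) != -1) {
        if (strcmp(line, DX_PT_EOM "\n") == 0) { rc = 0; break; }
        if (used + (size_t)len >= size)
            break;
        memcpy(buf + used, line, (size_t)len + 1);
        used += (size_t)len;
    }
    free(line);
    return rc;
}

/* Send a one-line request, flush, and wait for the reply. */
static int pt_converse(pt_conn *c, const char *request, char *reply, size_t size)
{
    fprintf(c->to_child, "%s\n%s\n", request, DX_PT_EOM);
    fflush(c->to_child);
    return pt_read_reply(c, reply, size);
}

/* Close both pipes (the child sees EOF on stdin) and reap the child. */
static void pt_stop(pt_conn *c)
{
    fclose(c->to_child);
    fclose(c->from_child);
    waitpid(c->pid, NULL, 0);
}
```

A caller would use pt_start() once, read the welcome message with pt_read_reply(), and then call pt_converse() as often as needed against the same running child, which is what removes the repeated initialization cost described in section 13.2.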
13.2.2 Other Special Messages
There are several properties which are useful to know for all programs. A number of special messages are defined which all program objects will respond to:
These definitions aren't all that special; they are simply string constants which are sent to program objects, e.g. DX_PT_HELP is defined as "Qwerty: Say HELP.". You may define (and document!) such messages as needed. For instance, the program clogptalk recognizes the DX_TABLE message as a request for tabulated output (DX_TABLE is defined in medchemtalk.h as "Set TABLEOUTPUT.").
It is probable that other special messages will be defined in the
future. You may register messages with Daylight Support - they will
be included as comments in
13.2.3 Program Object Toolkit Functions
13.3 PIPETALK Protocol
The "pipetalk protocol" is the communication protocol which programs must follow if they are to be successfully used as program objects. Note that the contributed examples implement this protocol, so they can be modified rather than developed from scratch.
13.3.1 Definitions
End-of-message (EOM), end-of-transmission (EOT), and send-message-list (MSGLIST) strings are defined as follows:
    #define DX_PT_EOM     "Qwerty: Over."
    #define DX_PT_EOT     "Qwerty: Over and out."
    #define DX_PT_MSGLIST "Qwerty: Say MSGLIST."
These definitions should not be changed. Programs should not write them on a single line for other purposes (intended to be unlikely, given the "Qwerty: " prefix).
A message is defined as zero or more strings followed by the EOM string.
13.3.2 Receiving Messages
Messages are received by reading standard input until the receipt of a line containing only the EOM string. Note that input lines to a program object can be arbitrarily long. Programmers should be careful not to use fixed-length buffers to receive input. The Daylight contributed code directory contains examples showing the correct way for a program object to read from standard input (see $DY_ROOT/contrib/src/c/progob).
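As an illustration of reading without fixed-length buffers, here is a minimal sketch in standard C. It is not the contributed Daylight code; the helper names pt_read_line and pt_read_message are hypothetical, and the EOM string is repeated locally so the fragment stands alone.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DX_PT_EOM "Qwerty: Over."

/* Read one line of any length into a heap buffer (newline stripped).
 * Returns NULL at end-of-file with nothing read, or if allocation fails. */
static char *pt_read_line(FILE *fp)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    int c;

    assert(fp != NULL);
    if (buf == NULL)
        return NULL;
    while ((c = fgetc(fp)) != EOF && c != '\n') {
        if (len + 1 >= cap) {                /* keep room for the terminator */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) { free(buf); return NULL; }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)c;
    }
    if (c == EOF && len == 0) { free(buf); return NULL; }
    buf[len] = '\0';
    return buf;
}

/* Read content lines until a line holding only the EOM string.
 * On success stores the lines in *lines (the caller frees each line and
 * the array) and returns their count, which may be 0.
 * Returns -1 if EOF or an allocation failure occurs before EOM. */
static long pt_read_message(FILE *fp, char ***lines)
{
    size_t n = 0, cap = 8;
    char **v = malloc(cap * sizeof *v);
    char *line;

    if (v == NULL)
        return -1;
    while ((line = pt_read_line(fp)) != NULL) {
        if (strcmp(line, DX_PT_EOM) == 0) {  /* end of this message */
            free(line);
            *lines = v;
            return (long)n;
        }
        if (n == cap) {                      /* grow the line array */
            char **tmp = realloc(v, cap * 2 * sizeof *v);
            if (tmp == NULL) { free(line); break; }
            v = tmp;
            cap *= 2;
        }
        v[n++] = line;
    }
    while (n > 0)                            /* incomplete message: clean up */
        free(v[--n]);
    free(v);
    return -1;
}
```

Both the line buffer and the array of lines grow by doubling, so neither the length of a line nor the number of lines in a message is limited by a compile-time constant.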
13.3.3 Sending Messages
All messages must be written to standard output in this manner:
    1. message contents as strings, each followed by a newline
    2. EOM message string followed by a newline
    3. flush standard output
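A minimal sketch of this sequence in standard C follows, with the message terminated by the EOM string as in the definition of a message in section 13.3.1. The function name pt_send_message is hypothetical, and refusing a content line that equals the EOM string is a design choice of this sketch, not a requirement stated by the protocol.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DX_PT_EOM "Qwerty: Over."

/* Write n content lines, then the EOM line, then flush the stream.
 * Returns 0 on success, or -1 if a write or the flush fails, or if a
 * content line equals the EOM string (it would cut the message short). */
static int pt_send_message(FILE *out, const char *const *lines, size_t n)
{
    size_t i;

    assert(out != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(lines[i], DX_PT_EOM) == 0)
            return -1;
        if (fputs(lines[i], out) == EOF || fputc('\n', out) == EOF)
            return -1;
    }
    if (fputs(DX_PT_EOM "\n", out) == EOF)
        return -1;
    return fflush(out) == 0 ? 0 : -1;        /* never leave a reply buffered */
}
```

Calling pt_send_message(stdout, NULL, 0) sends an empty message, which is just the EOM line followed by a flush.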
13.3.4 Initial Response to Execution
Programs must send an initial "welcome" message upon execution. The sent message may be empty (i.e., the program must send at least the EOM line).
13.3.5 Program Operation
Programs operate by receiving a message then sending a message. Each time a message is received, exactly one message must be sent. The sent message may be empty (i.e., the program sends only EOM). It is not OK to start sending while still reading: to prevent "deadlock", it is critical that programs never send unsolicited messages, and that they never begin their replies until the entire input message is received (i.e. the EOM string is encountered).
In addition, the program must ensure that its output buffer (standard output) is "flushed" after each message, as otherwise the parent program will sit waiting forever for a message that is stuck in the child program's internal buffers.
13.3.6 Response to Special Messages
The following strings are defined:
    #define DX_PT_HELP    "Qwerty: Say HELP."
    #define DX_PT_PROGRAM "Qwerty: Say PROGRAM."
    #define DX_PT_VERSION "Qwerty: Say VERSION."
    #define DX_PT_NOTICE  "Qwerty: Say NOTICE."
    #define DX_PT_MSGLIST "Qwerty: Say MSGLIST."
On receipt of one of the first four strings, programs should respond with an appropriate message (containing help on program operation, program name, program version, and copyright notice, respectively). On receipt of a DX_PT_MSGLIST message, programs should send a message containing all other recognized control strings. This response can be empty.
Each of the supplied messages should be responded to in a sensible manner, but it is left entirely up to the program to do so.
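One simple way to organize these responses is a lookup table from request string to reply text. The sketch below assumes a hypothetical program called "echotalk" that defines no control strings of its own; the function name pt_special_reply and all reply texts are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DX_PT_HELP    "Qwerty: Say HELP."
#define DX_PT_PROGRAM "Qwerty: Say PROGRAM."
#define DX_PT_VERSION "Qwerty: Say VERSION."
#define DX_PT_NOTICE  "Qwerty: Say NOTICE."
#define DX_PT_MSGLIST "Qwerty: Say MSGLIST."

/* Return the reply text for a special message, or NULL if the line is an
 * ordinary one. An empty string means "reply with an empty message". */
static const char *pt_special_reply(const char *line)
{
    static const struct {
        const char *request;
        const char *reply;
    } table[] = {
        { DX_PT_HELP,    "Send one SMILES per message; it is echoed back." },
        { DX_PT_PROGRAM, "echotalk" },                /* hypothetical name */
        { DX_PT_VERSION, "0.1" },
        { DX_PT_NOTICE,  "Example program; no copyright claimed." },
        { DX_PT_MSGLIST, "" },           /* no other control strings known */
    };
    size_t i;

    assert(line != NULL);
    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (strcmp(line, table[i].request) == 0)
            return table[i].reply;
    }
    return NULL;
}
```

A program that does define extra control strings (as clogptalk does with DX_TABLE) would list them in its DX_PT_MSGLIST reply instead of returning an empty message.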
13.3.7 Program Termination
On receipt of an EOT message, programs must send their final "farewell" message (which may be empty) and go into a quiescent state awaiting EOF on standard input, at which time the program must exit. While in the quiescent state (after EOT but before EOF), the program should respond to all messages with an empty message (just EOM).
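The whole life cycle (welcome, one reply per message, farewell, quiescent state, exit at EOF) can be sketched as a single loop. This is an illustration, not the contributed Daylight code: the name pt_serve and the reply texts are hypothetical, getline() is a POSIX function rather than standard C, and the sketch assumes that the EOT string arrives on a line by itself and ends the incoming message just as EOM does, which the text above does not spell out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

#define DX_PT_EOM "Qwerty: Over."
#define DX_PT_EOT "Qwerty: Over and out."

/* End the current outgoing message: EOM line plus the mandatory flush. */
static void pt_end_message(FILE *out)
{
    fputs(DX_PT_EOM "\n", out);
    fflush(out);
}

/* Serve messages from in to out until EOF on in.
 * Returns the number of ordinary requests answered before EOT. */
static int pt_serve(FILE *in, FILE *out)
{
    char *line = NULL;
    size_t cap = 0, nlines = 0;
    ssize_t len;
    int quiescent = 0, answered = 0;

    assert(in != NULL && out != NULL);
    fputs("echotalk ready\n", out);          /* welcome message */
    pt_end_message(out);

    while ((len = getline(&line, &cap, in)) != -1) {
        int is_eom, is_eot;

        if (len > 0 && line[len - 1] == '\n')
            line[len - 1] = '\0';
        is_eom = strcmp(line, DX_PT_EOM) == 0;
        is_eot = strcmp(line, DX_PT_EOT) == 0;
        if (!is_eom && !is_eot) {
            nlines++;                        /* content line: keep reading */
            continue;
        }
        if (quiescent) {
            /* after EOT: every message gets an empty reply */
        } else if (is_eot) {
            fputs("goodbye\n", out);         /* farewell message */
            quiescent = 1;
        } else {
            fprintf(out, "received %lu line(s)\n", (unsigned long)nlines);
            answered++;
        }
        pt_end_message(out);                 /* exactly one reply per message */
        nlines = 0;
    }
    free(line);                              /* EOF on input: time to exit */
    return answered;
}
```

A program object built this way would simply call pt_serve(stdin, stdout) from main() and exit when it returns. Nothing is written until a complete incoming message has been read, and every reply ends with a flush, which covers the two deadlock conditions described in section 13.3.5.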
13.3.8 Naming Convention
By convention, programs which communicate via the pipetalk protocol have names that end in "talk", e.g. clogptalk.