Daylight Programmer's Guide: Error Handling

4. Error handling

4.1 Introduction

Using an object-oriented approach to programming interfaces can make error-handling much simpler. For example, the NULL_OB is used extensively as a returned object under conditions where a function does not return a 'valid' object. Initially, one might reasonably think of the return of the null object as an indication of failure of the function, or a flag that an invalid operation was attempted. Using a traditional procedural programming approach, this is a perfectly normal way to think about the NULL_OB.

The departure from traditional error handling comes when one examines the NULL_OB itself. The NULL_OB is a perfectly valid object in the toolkit (as valid as any other object). The NULL_OB is defined to have all of the properties of 'normal' objects, and can be passed legally to any toolkit function as an object parameter. This is completely different from the error-handling techniques used for procedural programming. In procedural programming, errors must be trapped immediately after they occur, because functions downstream of the error can exhibit undefined or invalid behaviors.

Herein lies one of the strengths of the object-oriented programming approach. Error trapping need not be performed after every function, but only when errors that significantly affect the operation of the program may occur. That is, several toolkit functions can be considered a functional block from an error-trapping perspective. The advantage is that normal processing incurs less overhead for error detection. The overhead is only incurred when necessary and most of the overhead is outside of the normal stream of processing.

The following trivial example illustrates the point. It only checks for abnormal conditions after executing several dependent toolkit functions:

     #include <stdio.h>
     #include "dt_smiles.h"

     dt_Handle make_integer(dt_Integer value, dt_Handle itsadjunct) {
	 dt_Handle intob, rcob;
	 dt_Boolean rc;

	 /*  Do the work */
	 intob = dt_alloc_integer();
	 rc = dt_setintegervalue(intob, value);
	 rcob = dt_setadjunct(intob, itsadjunct);

	 /*  If everything worked, return. */
	 if (rc && (rcob != NULL_OB))
	   return (rcob);   /* rcob is the same object as intob */

	 dt_dealloc(intob);

	 /* What type of error was found? */
	 if (intob == NULL_OB) {
	   fprintf(stderr, "Couldn't allocate object.\n");
	   return (NULL_OB);
	 }
	 if (rc == FALSE) {
	   fprintf(stderr, "Couldn't set integer value.\n");
	   return (NULL_OB);
	 }
	 if (rcob == NULL_OB) {
	   fprintf(stderr, "Couldn't set adjunct.\n");
	   return (NULL_OB);
	 }
	 return (NULL_OB);   /* not reached, but guarantees a return value on every path */
     }

The important features of this example are:

This example is overkill. In most cases, it would be preferable to simply free the integer object and return NULL_OB if any failure occurred. The individual errors can safely be ignored.
dt_dealloc() is invoked to try to free the integer object. Note that we don't bother to check or keep track of whether or not any object was successfully allocated. If there is no object to deallocate, dt_dealloc() does nothing.
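The simpler form described in the first point can be sketched as follows. So that the sketch compiles without the toolkit, it uses small stand-in functions with an sk_ prefix; these are hypothetical and are not part of the Daylight API. They only imitate the behavior described in this section: the null object is accepted everywhere, setters on it fail quietly, and deallocating it does nothing. In real code the dt_ functions from the example above would be used instead.

```c
/* Hypothetical stand-ins for the toolkit (NOT the Daylight API). */
typedef int sk_Handle;
#define SK_NULL_OB 0            /* plays the role of NULL_OB */
#define SK_MAX_OBS 8

static int sk_in_use[SK_MAX_OBS + 1];        /* slot 0 is never used */
static int sk_value[SK_MAX_OBS + 1];
static sk_Handle sk_adjunct[SK_MAX_OBS + 1];

static int sk_is_valid(sk_Handle ob) {
    return ob > 0 && ob <= SK_MAX_OBS && sk_in_use[ob];
}

/* Returns a new handle, or SK_NULL_OB when no slot is free. */
static sk_Handle sk_alloc_integer(void) {
    for (int h = 1; h <= SK_MAX_OBS; h++) {
        if (!sk_in_use[h]) {
            sk_in_use[h] = 1;
            sk_value[h] = 0;
            sk_adjunct[h] = SK_NULL_OB;
            return h;
        }
    }
    return SK_NULL_OB;
}

/* Setters accept any handle; on the null object they fail quietly. */
static int sk_setintegervalue(sk_Handle ob, int value) {
    if (!sk_is_valid(ob)) return 0;
    sk_value[ob] = value;
    return 1;
}

static sk_Handle sk_setadjunct(sk_Handle ob, sk_Handle adjunct) {
    if (!sk_is_valid(ob) || !sk_is_valid(adjunct)) return SK_NULL_OB;
    sk_adjunct[ob] = adjunct;
    return ob;
}

/* Deallocating the null object is a no-op. */
static void sk_dealloc(sk_Handle ob) {
    if (sk_is_valid(ob)) sk_in_use[ob] = 0;
}

/* Simplified make_integer: one check covers the whole block of calls. */
sk_Handle make_integer_simple(int value, sk_Handle itsadjunct) {
    sk_Handle intob = sk_alloc_integer();
    int rc = sk_setintegervalue(intob, value);
    sk_Handle rcob = sk_setadjunct(intob, itsadjunct);

    if (rc && rcob != SK_NULL_OB)
        return rcob;

    sk_dealloc(intob);          /* harmless if intob is the null object */
    return SK_NULL_OB;
}
```

The caller sees a single outcome, a usable object or the null object, and no intermediate result has to be tested before the next call is made.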

4.2 General approach

There are two fundamental principles which dictate the Daylight Toolkit approach to function operation. They are that:

all toolkit functions are valid for all object types,
all object properties are defined for all object types.

The implication of these principles is that by definition, most Daylight Toolkit functions always succeed. This does not mean that all combinations of functions and objects make sense, but there is rarely a danger of causing an error or corrupting the Daylight Toolkit by calling a function with any arbitrary object. Furthermore, the results of all functions are rigorously defined for all objects. In most cases, the most dire consequence of an inappropriate object/function combination is that the function is ignored.
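A minimal sketch of the two principles, assuming a toy object model rather than the toolkit's own: every property function accepts every object type (including a null pointer standing in for NULL_OB) and returns a defined value for each. The sk_ names, the type tags, and the hcount property are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical object model (NOT the Daylight API). */
typedef enum { SK_TYP_NONE, SK_TYP_ATOM, SK_TYP_BOND, SK_TYP_SERVER } sk_Type;

typedef struct {
    sk_Type type;
    int aromatic;
    int hcount;
} sk_Object;

/* A null pointer stands in for the null object and has type SK_TYP_NONE. */
sk_Type sk_type(const sk_Object *ob) {
    return ob ? ob->type : SK_TYP_NONE;
}

/* Boolean property: defined as false for every type that lacks it. */
int sk_aromatic(const sk_Object *ob) {
    switch (sk_type(ob)) {
    case SK_TYP_ATOM:
    case SK_TYP_BOND:
        return ob->aromatic != 0;
    default:
        return 0;
    }
}

/* Integer property: defined as -1 for every type that lacks it. */
int sk_hcount(const sk_Object *ob) {
    return sk_type(ob) == SK_TYP_ATOM ? ob->hcount : -1;
}
```

Because the default branch covers every remaining type, a call with an inappropriate object is answered with a defined value instead of causing a fault.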

4.3 Function types

The Daylight Toolkit functions can be divided into classes based on the type of operation that they are performing, and the returned type. There are three types of operations performed by Daylight Toolkit functions:

Functions which create objects
Functions which get the properties of objects
Functions which modify the properties of objects.

4.3.1 Functions which create objects

These functions always succeed and return an object. The object will be the NULL_OB if the attempted creation of an object is not appropriate for the given combination of arguments.

Examples: dt_alloc_mol(), dt_open(), dt_addatom().

4.3.2 Functions which get the properties of objects

These functions always succeed (by definition), and will return the defined property for the object. Since all properties are defined for all objects, the programmer is responsible for the object types passed to functions. For example, dt_aromatic(server) is defined as FALSE, and the programmer is responsible for recognizing this 'nonsensical' case and avoiding it in applications. The main way to do this is to define the valid object types for user-written polymorphic functions and check that the parameter types are valid. This level of rigor is typically only necessary for debugging and for some special applications where it is critical to avoid ambiguities. For example:

     #define NOT_A_RING -1
     #define IS_AROMATIC_RING 1
     #define NOT_AROMATIC_RING 0

     dt_Integer is_ring_aromatic(dt_Handle object)
     {
       if (dt_type(object) != TYP_CYCLE)
	 return NOT_A_RING;
       if (dt_aromatic(object))
	 return IS_AROMATIC_RING;
       else
	 return NOT_AROMATIC_RING;
     }

Without the prior object-type checking, the function is_ring_aromatic() would return NOT_AROMATIC_RING if given a server, database, non-aromatic bond, etc. The function does not fail if the object-type checking is not performed, but the results may not be as intended.

Note also that functions which return streams or sequences are considered functions which return object properties. Although they typically create a new stream or sequence object, the stream or sequence contains a set of properties of the object given as a parameter. None of the functions in this class will return an empty stream or sequence; they will either return a stream or sequence with one or more members, or they will return the NULL_OB.

Examples: dt_aromatic(), dt_fp_tanimoto(), dt_symbol(), dt_invalid(), dt_mer_sortapplies(), dt_xatom(), dt_getdatabases(), dt_charge().

4.3.3 Functions which modify the properties of objects

These functions are defined to be valid for all object types but may or may not succeed. In all but a few cases, this class of function returns a boolean: TRUE if the operation succeeded and the property was set, FALSE if the property was not set. As with functions which get the properties of objects, checking the type of the object parameters is the sole required safeguard.

Examples: dt_calcxy(), dt_rotate(), dt_fp_fold(), dt_setaromatic(), dt_add().
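The boolean convention for property setters can be sketched as below, again with a hypothetical toy object model (the sk_ names are not toolkit functions). The setter is legal for every object, reports whether the property was actually set, and a user-written wrapper simply counts the successes instead of stopping at the first inapplicable object.

```c
#include <stddef.h>

/* Hypothetical object model (NOT the Daylight API). */
typedef enum { SK_TYP_ATOM, SK_TYP_BOND, SK_TYP_SERVER } sk_Type;

typedef struct {
    sk_Type type;
    int aromatic;
} sk_Object;

/* Returns 1 if the property was set, 0 if it was not.
   A null pointer stands in for the null object. */
int sk_setaromatic(sk_Object *ob, int value) {
    if (ob == NULL) return 0;
    if (ob->type != SK_TYP_ATOM && ob->type != SK_TYP_BOND) return 0;
    ob->aromatic = (value != 0);
    return 1;
}

/* Applies the setter to every object and returns how many were changed. */
int set_aromatic_all(sk_Object *obs[], int n, int value) {
    int changed = 0;
    for (int i = 0; i < n; i++)
        changed += sk_setaromatic(obs[i], value);
    return changed;
}
```

A caller that needs stricter behavior can compare the returned count with n; one that does not can ignore it.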

4.3.4 Exceptions

Several functions both modify and return properties (e.g. dt_fp_setminsize()). These typically take a property value and return the property value after the new value is applied. If the new value is not appropriate and the modification of the property fails, then the property value returned is the value that had been set prior to the function.
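The set-and-return convention can be illustrated with a small stand-alone sketch. The function sk_setminsize(), its starting value of 64, and the rule that only positive sizes are acceptable are all assumptions made for the example; they do not describe the real fingerprint functions.

```c
/* Hypothetical global option; the starting value is arbitrary. */
static int sk_min_size = 64;

/* Applies new_size if it is acceptable (assumed here: positive) and
   returns the value in effect afterwards. On rejection the previous
   value is returned unchanged. */
int sk_setminsize(int new_size) {
    if (new_size > 0)
        sk_min_size = new_size;
    return sk_min_size;
}
```

With this convention the caller detects a rejected value by comparing the result with the value requested: sk_setminsize(128) yields 128, and a following sk_setminsize(-5) still yields 128.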

Merlin hitlist operations can both modify and return properties of hitlist objects. These functions typically perform an operation, and then return the size of the hitlist (which may have been altered by the function).

Functions operating on sets modify the contents, but return the object that was modified, as opposed to a boolean for success or failure (dt_add(), dt_append(), dt_insert(), dt_setadjunct()). In some cases, a duplicate object is created and added to the set, and the returned object is the newly created member of the set. This is the case for paths in pathsets. Otherwise, the functions return the handle of the object to which the new object was added.

Functions which return streams or sequences appear ambiguous. Some cases are clearly returning properties (dt_getdatabases()), others less so (dt_match()). In an abstract sense, one can think of these as cases of 'lazy' evaluation. In the case of dt_match(), we could argue that the resulting pathset is a property of the molecule and pattern, even though the pathset is not evaluated until the dt_match() function is called. (A stretch, perhaps).

4.4 Function return types

In addition to considering the operation which each function performs, we can consider the specific returned type from each function. There are five types of returned values from toolkit functions:

dt_Boolean
dt_Handle
dt_Integer
dt_Real
dt_String

Each returned type has a specific value which indicates an abnormal condition.

4.4.1 Functions which return dt_Boolean

These functions are one of two cases: functions returning boolean properties, and functions returning success/failure when setting a property. There are quite a few boolean properties of objects (such as aromatic, atstart, mod_is_on). For each of these functions, the man page enumerates all of the object types which may have a TRUE value for each property. All other objects for a given property have been defined such that the property is always FALSE.

All of the functions which are used to set object properties return TRUE or FALSE depending on the success or failure of the operation. The man page enumerates the objects whose properties may be modified for each function of this type.

4.4.2 Functions which return dt_Integer

These functions all return integer properties of an object or are hybrid functions that modify and then return a property of an object. The man page enumerates all of the object types which will return useful integer properties. The integer properties of all other object types are defined as -1.

Exceptions include: Merlin hitlist operations, which modify a hitlist and return its length; dt_ping(), which performs an 'external' operation (it does not operate directly on objects); the (obsolete) fingerprint functions that set global options (e.g. dt_fp_setminsize()), which set a value and return the new value; and dt_thor_tdtput(), which reports the success or failure of the operation but uses an integer because it must distinguish success, failure, and timestamp-out-of-date.

4.4.3 Functions which return dt_Real

These functions all return a real property for an object. For each of these functions, the man page enumerates the object types which have modifiable real properties. The real properties for all other object types are defined to be -1.0.

The one exception is dt_fp_setmindensity(), which modifies a property and then returns the new value of that property.

4.4.4 Functions which return dt_String

All of these functions return the string property of an object. For each of these functions, the man page enumerates the object types which have modifiable string properties. The string properties for all other object types are defined as the invalid string.

4.4.5 Functions which return dt_Handle

These functions either create new objects or return the object properties of objects. For each of the functions which get the object properties of objects, the man page enumerates the object types which have modifiable object properties. The object properties for all other objects are defined as the NULL_OB.

The exceptions are functions which modify sets (dt_add(), dt_insert(), dt_append()). They modify the set, and return either the given set or a copy of the object which was added to the set, depending on the context.
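The abnormal values listed in sections 4.4.2 through 4.4.5 can be gathered into a few helper predicates, sketched below. The sk_ names are hypothetical. The handle value 0 for the null object and the use of a null pointer for the invalid string are assumptions made so that the sketch is self-contained; real code would compare against NULL_OB and whatever the toolkit defines as the invalid string. No predicate is given for booleans, since FALSE is also a legitimate property value.

```c
#include <stddef.h>

/* Hypothetical stand-ins (NOT the Daylight API). */
typedef int sk_Handle;
#define SK_NULL_OB 0            /* assumed representation of the null object */

/* Handle results: the null object marks "no such object". */
int sk_handle_ok(sk_Handle h) { return h != SK_NULL_OB; }

/* Integer properties: -1 marks "not defined for this object type". */
int sk_integer_ok(int v) { return v != -1; }

/* Real properties: -1.0 marks "not defined for this object type". */
int sk_real_ok(double v) { return v != -1.0; }

/* String properties: a null pointer stands in for the invalid string. */
int sk_string_ok(const char *s) { return s != NULL; }
```

These tests apply only to plain property functions; the hybrid functions noted as exceptions above give their integer or real results a different meaning.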

4.5 Error message facilities

Various Daylight Toolkit operations can result in errors. The errors typically encountered are a direct result of external interactions of the Toolkit of three general types: failures of input of external data as part of the creation of objects, failures of communications with external resources (servers, toolkits, databases), and exhaustion of resources available to the Toolkit (out of memory).

Failures in parsing external data (e.g. parsing SMILES in dt_smilin()) typically result in diagnostic messages which allow the external data to be debugged. Messages for failures of external communication and exhaustion of Toolkit resources are provided primarily to allow graceful termination of the application. The Daylight Toolkit provides several functions to access and clear the Toolkit-provided diagnostic messages, and to add your own diagnostic messages to the Toolkit's queue.

The error-handling functions maintain a queue of approximately 200 error messages. If this error queue overflows, the last message in the queue is lost and the newest message replaces it.

There are several levels of error message:

Error Level	Explanation
DX_ERR_NONE	No error.
DX_ERR_NOTE	Something that might or might not be of interest, but not an error.
DX_ERR_WARNING	Something abnormal that may require attention.
DX_ERR_ERROR	The requested operation could not be carried out.
DX_ERR_FATAL	A serious error was detected; the Toolkit cannot continue.

A "fatal" error does not actually cause the application program to quit; you have time to clean up, close files and warn the user that something serious has occurred. However, once a fatal error has occurred, the Toolkit's ability to continue correctly is doubtful.

Note also that a fatal error may not be reported correctly since many fatal errors involve a memory allocation failure. The error-reporting functions may fail to allocate memory needed to record the error messages, resulting in lost error messages.

dt_errorclear() => void

Discard all error messages and clear the error queue.

dt_errorworst() => integer

Returns the error level of the worst error recorded by dt_errorsave() since the last call to dt_errorclear() or since the application program started. A return value of DX_ERR_NONE indicates that no errors have been recorded.

dt_errorsave(string func_name, integer level, string message) => integer

Store the error message, with severity level, in the error queue. The parameter func_name is also stored; typically it is the name of the function in which the error was detected. func_name is printed in parentheses after the error message by dt_errors().

dt_errors(integer level) => Handle sequence

Return all errors of severity level or worse (higher); if level is zero, all notes, errors and warnings are returned. The returned sequence is a sequence of string objects, in generation order.

The sequence and string objects are newly-allocated copies of the error queue; it is the responsibility of the calling function to eventually deallocate the sequence object and all of the string objects it contains.

dt_smilinerrors() => Handle sequence

Like dt_errors(), above, but returns errors related to SMILES parsing. Prior to version 4.91 these were handled separately from the regular error queue. Starting with version 4.91 this function is identical to dt_errors(DX_ERR_NOTE).
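The behavior of the error queue described in this section can be modelled with a short stand-alone sketch. Everything here is hypothetical: the sk_ functions only imitate dt_errorclear(), dt_errorworst(), dt_errorsave() and dt_errors(); the numeric level values are assumed (the real DX_ERR_ values are not given here); results are returned through a caller-supplied array rather than a sequence object; and the overflow rule is read as "the newest message overwrites the final slot", which is one interpretation of the description above.

```c
#include <stdio.h>

/* Assumed level values, increasing with severity (NOT the real DX_ERR_ values). */
enum { SK_ERR_NONE = 0, SK_ERR_NOTE, SK_ERR_WARNING, SK_ERR_ERROR, SK_ERR_FATAL };

#define SK_QUEUE_MAX 200
#define SK_MSG_LEN 128

typedef struct {
    int level;
    char text[SK_MSG_LEN];
} sk_Message;

static sk_Message sk_queue[SK_QUEUE_MAX];
static int sk_count = 0;
static int sk_worst = SK_ERR_NONE;

/* Discards all messages and resets the worst level seen. */
void sk_errorclear(void) {
    sk_count = 0;
    sk_worst = SK_ERR_NONE;
}

/* Worst level recorded since the last clear. */
int sk_errorworst(void) {
    return sk_worst;
}

/* Stores "message (func_name)". When the queue is full, the newest
   message overwrites the final slot. */
void sk_errorsave(const char *func_name, int level, const char *message) {
    int slot = (sk_count < SK_QUEUE_MAX) ? sk_count++ : SK_QUEUE_MAX - 1;
    sk_queue[slot].level = level;
    snprintf(sk_queue[slot].text, SK_MSG_LEN, "%s (%s)", message, func_name);
    if (level > sk_worst)
        sk_worst = level;
}

/* Collects messages of severity level or worse, in generation order.
   Returns the number of entries written to out. */
int sk_errors(int level, const char **out, int max_out) {
    int n = 0;
    for (int i = 0; i < sk_count && n < max_out; i++) {
        if (sk_queue[i].level >= level)
            out[n++] = sk_queue[i].text;
    }
    return n;
}
```

A typical pattern is to run a block of operations, test sk_errorworst() once, and only then retrieve and print the messages, which keeps error reporting out of the normal stream of processing.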
