Experiences Porting a Corporate Database and Managing Large TDT Files Made Easier with Simple Helper Functions

Evan Bolton, American Cyanamid Co.


Five basic "home grown" TDT operators, and their use in building filters for the creation and management of large TDT files, will be discussed. Some of these filters will be demonstrated. They perform primary operations on TDT files, such as separation of TDTs containing specific TDT fields, true concatenation of TDTs based upon identifiers, transformation of activity screening data into LC50s or LC90s, formation of tab-delimited files from only particular TDT fields, and automatic generation of QSAR spreadsheets. Descriptions of the five basic TDT operators are given below.

char *field_rm(char *tdt, char *field, int *length)

This function parses a TDT (tdt) for a TDT field (field) and removes the first occurrence of it, returning the resulting TDT and its length (length).
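
As an illustration only, a minimal sketch of how field_rm might be written is shown here. It is not the original implementation. It assumes fields are written as TAG<data>, that a field starts at the beginning of the TDT or right after a '>' or a newline, and that the result is a newly allocated copy that the caller frees. The helper find_field is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: find the first "field<" that starts a field,
 * i.e. sits at the start of the TDT or right after '>' or a newline. */
static const char *find_field(const char *tdt, const char *field)
{
    size_t n = strlen(field);
    const char *p = tdt;

    if (n == 0)
        return NULL;
    while ((p = strstr(p, field)) != NULL) {
        int at_start = (p == tdt) || p[-1] == '>' || p[-1] == '\n';
        if (at_start && p[n] == '<')
            return p;
        p++;
    }
    return NULL;
}

/* Sketch: return a newly allocated copy of tdt without the first
 * occurrence of field. If the field is absent, the copy is unchanged. */
char *field_rm(char *tdt, char *field, int *length)
{
    size_t total, head, tail;
    const char *start, *end;
    char *out;

    assert(tdt != NULL && field != NULL && length != NULL);
    total = strlen(tdt);
    start = find_field(tdt, field);
    end = start ? strchr(start, '>') : NULL;
    if (end != NULL) {
        end++;                        /* step past the closing '>' */
        if (*end == '\n')
            end++;                    /* assumption: drop the line break too */
    } else {
        start = end = tdt + total;    /* nothing to remove */
    }
    head = (size_t)(start - tdt);
    tail = total - (size_t)(end - tdt);
    out = malloc(head + tail + 1);
    if (out == NULL) {
        *length = 0;
        return NULL;
    }
    memcpy(out, tdt, head);           /* text before the field */
    memcpy(out + head, end, tail);    /* text after the field */
    out[head + tail] = '\0';
    *length = (int)(head + tail);
    return out;
}
```

Because only the first occurrence is removed, a filter that needs to strip every copy of a field would call the function repeatedly until the returned length stops changing.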

char **nparse_tdt(char *tdt, char *field, int num_rec)

This function parses a TDT (tdt) for multiple TDT fields of the same type (field) and returns an array of the contents of a predetermined number (num_rec) of these TDT fields.
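
A possible sketch of nparse_tdt, under stated assumptions, follows. The original code is not shown in this abstract, so the details are guesses: fields are assumed to look like TAG<data>, the returned array is assumed to hold at most num_rec newly allocated strings, and it is NULL-terminated here so the caller can tell how many were found.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch: collect the contents of up to num_rec fields named `field`.
 * Assumption: the result has num_rec + 1 slots and unused slots are NULL. */
char **nparse_tdt(char *tdt, char *field, int num_rec)
{
    size_t flen;
    char **out;
    const char *p;
    int found = 0;

    assert(tdt != NULL && field != NULL);
    if (num_rec < 0)
        return NULL;
    flen = strlen(field);
    out = calloc((size_t)num_rec + 1, sizeof *out);
    if (out == NULL || flen == 0)
        return out;
    p = tdt;
    while (found < num_rec && (p = strstr(p, field)) != NULL) {
        /* A field name must start the TDT or follow '>' or a newline. */
        int at_start = (p == tdt) || p[-1] == '>' || p[-1] == '\n';
        const char *val, *end;
        size_t n;

        if (!at_start || p[flen] != '<') {
            p++;                      /* matched inside other text; move on */
            continue;
        }
        val = p + flen + 1;           /* first character after '<' */
        end = strchr(val, '>');
        if (end == NULL)
            break;                    /* unterminated field: stop scanning */
        n = (size_t)(end - val);
        out[found] = malloc(n + 1);
        if (out[found] == NULL)
            break;
        memcpy(out[found], val, n);
        out[found][n] = '\0';
        found++;
        p = end + 1;                  /* continue after this field */
    }
    return out;
}
```

With this layout a caller can walk the array until the first NULL entry, which is convenient when a TDT holds fewer matching fields than num_rec.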

char *parse_tdt(char *tdt, char *field, int *length)

This function parses a TDT (tdt) for a TDT field (field) and returns the contents of the field and its length (length).
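
For illustration, parse_tdt could be sketched along the following lines. This is an assumption-laden sketch rather than the code described in the talk: it assumes the TAG<data> field layout, returns a newly allocated string that the caller frees, and returns NULL with a length of 0 when the field is missing. The helper find_field is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: find the first "field<" that starts a field,
 * i.e. sits at the start of the TDT or right after '>' or a newline. */
static const char *find_field(const char *tdt, const char *field)
{
    size_t n = strlen(field);
    const char *p = tdt;

    if (n == 0)
        return NULL;
    while ((p = strstr(p, field)) != NULL) {
        int at_start = (p == tdt) || p[-1] == '>' || p[-1] == '\n';
        if (at_start && p[n] == '<')
            return p;
        p++;
    }
    return NULL;
}

/* Sketch: return a copy of the text between '<' and '>' for the first
 * occurrence of field, and store its length in *length. */
char *parse_tdt(char *tdt, char *field, int *length)
{
    const char *p, *end;
    char *out;
    size_t n;

    assert(tdt != NULL && field != NULL && length != NULL);
    *length = 0;
    p = find_field(tdt, field);
    if (p == NULL)
        return NULL;                  /* field not present */
    p += strlen(field) + 1;           /* skip "field<" */
    end = strchr(p, '>');
    if (end == NULL)
        return NULL;                  /* field is never closed */
    n = (size_t)(end - p);
    out = malloc(n + 1);
    if (out == NULL)
        return NULL;
    memcpy(out, p, n);
    out[n] = '\0';
    *length = (int)n;
    return out;
}
```

In this sketch the field name must be given exactly as it appears in the TDT, so an identifier field would be requested as "$SMI" rather than "SMI".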

char *read_tdt(FILE *fp, int *length)

This function reads a "valid" TDT from a file (fp) and returns the TDT and its length (length). (The TDT is checked for validity by making sure that each '<' is followed by a '>' before another '<', and that the TDT ends with a '|'.)
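
The validity check described above might be sketched as follows. This is only an illustration built on assumptions: whitespace between TDTs is skipped, a '|' inside <...> is treated as data rather than as the terminator, and an invalid TDT yields NULL without any attempt to resynchronize the stream.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Sketch: read one TDT, up to and including its closing '|'.
 * Returns a newly allocated string, or NULL if the TDT is invalid. */
char *read_tdt(FILE *fp, int *length)
{
    size_t cap = 256, n = 0;
    char *buf;
    int c, open = 0, done = 0;

    assert(fp != NULL && length != NULL);
    *length = 0;
    buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    while (!done && (c = fgetc(fp)) != EOF) {
        if (n == 0 && (c == '\n' || c == '\r' || c == ' '))
            continue;                 /* skip whitespace between TDTs */
        if (c == '<') {
            if (open)
                break;                /* second '<' before a '>': invalid */
            open = 1;
        } else if (c == '>') {
            open = 0;
        } else if (c == '|' && !open) {
            done = 1;                 /* terminator outside any field */
        }
        if (n + 1 >= cap) {           /* keep room for the final '\0' */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[n++] = (char)c;
    }
    if (!done) {                      /* EOF or bad nesting before '|' */
        free(buf);
        return NULL;
    }
    buf[n] = '\0';
    *length = (int)n;
    return buf;
}
```

A filter built on this sketch would call read_tdt in a loop and stop at the first NULL, which covers both end of file and a malformed TDT.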

char **tdt_fields(char *tdt, int *num_fields)

This function parses a TDT (tdt) and returns an array containing the TDT field names, minus any '$' identifiers, together with the array size (num_fields).
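
One way tdt_fields might look is sketched here, again as an assumption rather than the original code. "Minus any '$' identifiers" is read here as stripping a leading '$' from each name. The other reading, omitting identifier fields entirely, would need only a small change. Names are returned in the order the fields appear, as newly allocated strings.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch: list the field names of a TDT in order of appearance.
 * Assumption: a leading '$' is stripped from each name. */
char **tdt_fields(char *tdt, int *num_fields)
{
    char **names = NULL;
    int count = 0;
    const char *p;

    assert(tdt != NULL && num_fields != NULL);
    *num_fields = 0;
    p = tdt;
    while (*p != '\0' && *p != '|') {
        const char *lt = strchr(p, '<');
        const char *gt = lt ? strchr(lt, '>') : NULL;
        char **tmp;
        size_t n;

        if (gt == NULL)
            break;                    /* no further complete field */
        /* Skip line breaks, spaces and the '$' marker before the name. */
        while (p < lt && (*p == '\n' || *p == '\r' || *p == ' ' || *p == '$'))
            p++;
        n = (size_t)(lt - p);
        tmp = realloc(names, ((size_t)count + 1) * sizeof *names);
        if (tmp == NULL)
            break;
        names = tmp;
        names[count] = malloc(n + 1);
        if (names[count] == NULL)
            break;
        memcpy(names[count], p, n);
        names[count][n] = '\0';
        count++;
        p = gt + 1;                   /* continue after this field's '>' */
    }
    *num_fields = count;
    return names;
}
```

Repeated fields appear once per occurrence in this sketch, so a caller that wants a unique list of names would need to remove duplicates afterwards.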