Monomer databases and monomer tables
Thor databases containing monomer-level data
(e.g., combinatorial data in CHORTLES) have an associated
database containing the monomer definitions.
Thor clients accessing such databases download the entire contents of
the monomer database as a monomertable object
the first time it is needed (e.g., for normalization).
This step can be time-consuming when there are many thousands of
monomer definitions, which makes client program startup slow.
Monomer table caching
The option of caching monomer tables locally is introduced in
selected 4.51 Thor client applications
(thorlookup and daytoolserver).
Monomer table caching is invoked by setting the client program option
-THOR_MONOCACHE_DIR
to the name of a local directory (e.g., /tmp).
This directory will be used to hold one monomertable cache file for
each monomer database which is accessed.
(Such files have long names which identify the remote thor service
and the remote server's database path.)
Monomer tables will be (re)cached in that directory only if they don't
already exist or the underlying monomer database has changed since
the table was last cached.
This is how the monomer table caching scheme works:
If this process fails at any step (e.g., the cache file is not accessible
or the disk is out of space), monomer definitions are downloaded in the
normal (slow) manner.
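The rules stated above (reuse the cache only if it exists and the monomer database is unchanged, otherwise re-cache, and fall back to a normal download on any failure) can be sketched as a small decision function. This is an illustration only, not the Thor client's code: the function name, the enum, and the idea that the cache file's first line holds a change stamp are all assumptions made for the sketch, since the actual cache file format is not described here.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical outcomes for this sketch; not part of the Daylight toolkit. */
typedef enum { USE_CACHE, RECACHE, DOWNLOAD_ONLY } cache_action;

/*
 * Decide how to obtain the monomer table for one monomer database.
 *   cache_path   : local cache file for this database (name chosen by caller)
 *   server_stamp : change stamp the server currently reports
 *   dir_writable : nonzero if a new cache file can be written
 * Assumption: the first line of the cache file is the stamp that was
 * current when the table was cached.
 */
cache_action choose_action(const char *cache_path,
                           const char *server_stamp,
                           int dir_writable)
{
    char stored[128];
    FILE *fp = fopen(cache_path, "r");

    if (fp != NULL) {
        int got_line = (fgets(stored, sizeof stored, fp) != NULL);
        fclose(fp);
        if (got_line) {
            stored[strcspn(stored, "\n")] = '\0';   /* drop trailing newline */
            /* Same-or-different test only; no notion of older or newer. */
            if (strcmp(stored, server_stamp) == 0)
                return USE_CACHE;
        }
    }
    /* Cache missing, unreadable, or stale: re-cache if we can write,
       otherwise just download in the normal (slow) manner. */
    return dir_writable ? RECACHE : DOWNLOAD_ONLY;
}
```

Note that every failure path in the sketch ends in a download, so a bad or unreachable cache costs time but never produces a stale monomer table.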
For advanced users ...
Users are advised to do some testing before invoking monomer table caching
in a production environment.
The (otherwise undocumented) -DEBUG option to thorlookup allows
you to write scripts which invoke and time monomer table caching under
various conditions.
The -DEBUG tags "cachedir=" and "timecheck" are used, e.g.,
$ thorlookup -DEBUG "cachedir=/tmp timecheck" mixbase@bob ...
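For timing runs under different conditions (for example, an empty cache directory versus one that already holds a cache file), a small wall-clock harness may be convenient. The helper below is a generic sketch and knows nothing about thorlookup; `time_command` is a hypothetical name, and the command string is supplied entirely by the caller.

```c
#include <stdlib.h>
#include <sys/time.h>

/*
 * Run a shell command and return the elapsed wall-clock time in seconds.
 * Returns -1.0 if the clock could not be read or the command reported
 * a nonzero status.
 */
double time_command(const char *cmd)
{
    struct timeval t0, t1;
    int status;

    if (gettimeofday(&t0, NULL) != 0)
        return -1.0;
    status = system(cmd);              /* runs cmd through the shell */
    if (gettimeofday(&t1, NULL) != 0)
        return -1.0;
    if (status != 0)
        return -1.0;

    return (double)(t1.tv_sec - t0.tv_sec)
         + (double)(t1.tv_usec - t0.tv_usec) / 1e6;
}
```

A caller would pass the complete thorlookup command line, once with caching enabled and once without, and compare the two results.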
The new database property "monomtime" is defined in support of this
caching scheme, i.e., the C statement:
str = dt_info(&lens, database, "monomtime");
will return the date and time that the database's monomertable was
last updated, as a dt_String. (The caching scheme described above
only tests whether this value is the same as, or different from, the
one stored with the cache; it does not check which is newer.)
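Because only equality matters, the stamp can be treated as an opaque string and never parsed as a date. A minimal sketch of such a comparison follows, assuming the stamp is handled as a length-counted string (the statement above returns a length through `lens`); `stamps_equal` is a hypothetical helper, not a toolkit function.

```c
#include <string.h>

/*
 * Return 1 if two length-counted strings hold identical bytes, else 0.
 * Neither string needs to be null-terminated.
 */
int stamps_equal(const char *a, int alen, const char *b, int blen)
{
    if (a == NULL || b == NULL || alen < 0 || blen < 0)
        return 0;                      /* a missing stamp never matches */
    if (alen != blen)
        return 0;
    return memcmp(a, b, (size_t)alen) == 0;
}
```

Treating a missing or malformed stamp as "different" errs on the side of refreshing the cache rather than trusting it.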
Summary
A mechanism for client-side monomertable caching is introduced which
can significantly reduce the time it takes a Thor client to start
accessing combinatorial databases.
It is most suitable for databases which are accessed many times
between changes to the monomer table: for example, any static database,
thorlookup used repeatedly in scripts, or a remote toolkit server
serving many clients which access such databases.
The benefit of caching is increased when communication bandwidth is
limited, e.g., when working over a slow or busy network, or with
databases on slow media such as CD-ROMs.
We introduce this capability with some trepidation because this scheme
violates one of the principles of the Daylight toolkit, which is,
"Never do any visible I/O." (Of course there's another one which is,
"Never say never.") Initially, this feature is likely to be used in
a few high-volume environments -- we'll report the results as we
hear them.
Daylight Chemical Information Systems, Inc.
info@daylight.com