Daylight Software Version 4.51 Release Notes
============================ CONTENTS ============================
1. INTRODUCTION AND HIGHLIGHTS OF RELEASE 4.51
2. DETAILS OF CHANGES MADE IN RELEASE 4.51
3. BUGS FIXED FOR VERSION 4.51
4. KNOWN BUGS IN VERSION 4.51
5. DAYLIGHT SOFTWARE-RELEASE HISTORY
========= 1. INTRODUCTION AND HIGHLIGHTS OF RELEASE 4.51 =========
DAYLIGHT CIS SOFTWARE, VERSION 4.51
Rev: May 1, 1997
This document describes changes and features specific to version 4.51
of Daylight Chemical Information Systems software. If there is a
machine-specific file for your machine in this directory (e.g.
"readme_v451_sgi"), please read it too.
- REACTIONS!
- Tversky Similarity
- Read-Only CDROM-able Databases
- Fingerprint-tuples
- Component-tuples
- Properties of objects
- Non-identifier Cross-referencing
- Automatic Indirect Data
- Data quoting is uniformly enforced
- Datatree merging moved to server
- Monomer table caching
- Merlin Component-tuple fingerprint optimization
- THORDBFIX451
- Thordbinfo
- Admin and Theory Manuals Restructured
====================== 2. DETAILS OF CHANGES =====================
REACTIONS!
The Reaction Toolkit and the Reaction capability in Thor and Merlin are
introduced. Reactions are stored as SMILES where the
reactant(s), agent(s), and product(s) are separated by the
">" character. Reaction atom-maps may be complete
or incomplete and are stored as Absolute SMILES. Reactions are
searchable with all structural searches, super-, sub-, and
similarity, with reaction-role specificity. A "difference
fingerprint" is introduced which characterizes the topological
changes involved in the reaction, rather than the total features in
all reaction-roles. Reactions are depicted in the usual way,
and GRINS can handle Reaction SMILES. In the Reaction Toolkit,
Reaction Transformations are introduced, which apply reactions
programmatically, offering many applications, including combinatorial
library development and genetic algorithms. Finally, several
commercial databases of reactions are being released by Daylight.
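As a rough illustration of the ">" separator described above, the
following Python sketch splits a Reaction SMILES into its three roles.
It is plain string handling, not the Reaction Toolkit; the function name
split_reaction_smiles is hypothetical, and no chemical validation or
atom-map handling is attempted.

```python
def split_reaction_smiles(rxn):
    """Split 'reactants>agents>products' into three lists of SMILES.

    Illustrative only: assumes the string has exactly two '>' separators
    and that molecules within a role are dot-separated.
    """
    parts = rxn.split(">")
    if len(parts) != 3:
        raise ValueError("expected exactly two '>' separators: %r" % rxn)
    # An empty role (e.g. no agents in 'C>>CC') yields an empty list.
    return tuple([mol for mol in part.split(".") if mol] for part in parts)


# Example reaction string chosen for illustration (esterification).
reactants, agents, products = split_reaction_smiles(
    "CC(=O)O.OCC>[H+]>CC(=O)OCC.O"
)
print(reactants, agents, products)
```

A reaction with no agents keeps both separators, so "C>>CC" yields an
empty agent list.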
Tversky Similarity
A new Merlin capability supports a flexible user-configurable
similarity coefficient. By varying parameters alpha and beta,
the weight placed on shared and unique structural features may
be adjusted over a continuous range. Some settings include the
Tanimoto coefficient itself, 'similarity as substructure', and
'similarity as superstructure'. These features are supported
by the Merlin Toolkit, Merlinserver, xvmerlin and MCL.
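For orientation, the sketch below computes the standard Tversky index,
c / (alpha*a + beta*b + c), over two sets of "on" fingerprint bits, where
c counts shared bits and a and b count bits unique to each set. This is
a generic Python illustration, not the Merlin implementation; which
argument Merlin weights with alpha and which with beta is an assumption
here.

```python
def tversky(bits_a, bits_b, alpha, beta):
    """Tversky index over two collections of set-bit positions.

    alpha weights features unique to bits_a, beta those unique to bits_b.
    (The argument order is an assumption, not taken from Merlin.)
    """
    a, b = set(bits_a), set(bits_b)
    common = len(a & b)
    only_a = len(a - b)
    only_b = len(b - a)
    denom = alpha * only_a + beta * only_b + common
    return common / denom if denom else 0.0


# Hypothetical bit sets: every feature of the first is in the second.
small = {1, 2, 3}
large = {1, 2, 3, 4, 5}

print(tversky(small, large, 1.0, 1.0))  # alpha = beta = 1 is Tanimoto
print(tversky(small, large, 1.0, 0.0))  # ignores features unique to 'large'
print(tversky(small, large, 0.0, 1.0))  # ignores features unique to 'small'
```

With alpha = 1 and beta = 0 the score reaches 1.0 whenever every feature
of the first set is present in the second, which is the behavior that
makes an asymmetric setting act like a substructure-style comparison.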
Read-Only CDROM-able Databases
Read-only databases are introduced. No lock file is created, thus
the database can be read from CDROM. Variable db file suffixes
are allowed, specified by the option DATABASE_SUFFIX_LIST, thereby
supporting ISO 9660 CDROM (8.3 names) format, and allowing lower
case file names. Also, relative pathnames are allowed in the
Thor header, facilitating CDROM dbs and simplifying installations.
Fingerprint-tuples
The fingerprint program will produce component-tuples of
fingerprints (FPP datatype) when given the new -m option.
A database may contain both FP and FPP data which will be used as
available and needed. For databases of large dot-disconnected
mixtures, this can produce a significant increase in screening
speed (with a penalty in database/pool size).
Component-tuples
Datafields which have the PART_NTUPLE normalization are treated as
vectors of data which have a correspondence to the disconnected
components of a structure. The order-correspondence of the vector
is maintained on canonicalization. Like other ntuples, any kind of
data can be stored in component-tuples (string, integer, real,
binary, etc.)
Such vectors are very useful for representing things like
mole-fractions of mixtures and stoichiometry of reactions.
Component fingerprint-tuples may be used to enhance the speed of
atom-level searching in large mixtures.
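The order-correspondence idea can be sketched as follows: when the
components of a dot-disconnected structure are reordered, the data
vector is permuted the same way. This Python sketch uses plain string
sorting as a stand-in for canonicalization (an assumption; it is not
Daylight's canonical ordering), and the function name is hypothetical.

```python
def reorder_with_components(smiles, values):
    """Reorder dot-separated components and keep values aligned.

    Lexical sorting stands in for canonicalization here; only the
    'permute the data with the components' behavior is illustrated.
    """
    components = smiles.split(".")
    if len(components) != len(values):
        raise ValueError("need one value per component")
    order = sorted(range(len(components)), key=lambda i: components[i])
    new_smiles = ".".join(components[i] for i in order)
    new_values = [values[i] for i in order]
    return new_smiles, new_values


# Hypothetical mixture: water and ethanol with mole fractions.
mixture, fractions = reorder_with_components("O.CCO", [0.7, 0.3])
print(mixture, fractions)
```

After reordering, each value still refers to the same component it was
entered with.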
Properties of objects
Toolkit functions for assigning named properties to objects
are added: dt_setboolean(), dt_sethandle(), dt_setinteger(),
dt_setreal(), dt_setstring(), dt_proptype(), and dt_propnames().
In this way flexible data structures may be built onto any
toolkit object.
Non-identifier Cross-referencing
Non-identifier dataitems in a Thor datatree which are preceded by a
slash ('/') are automatically cross-referenced to the tree root
(which may or may not be a SMILES).
Automatic Indirect Data
Added a facility to automatically generate indirect references from
datatrees in thorload. Previously, a user was required to
generate and load indirect data manually. Typically this was
accomplished by assigning arbitrary indirect keys to the data,
followed by the manual registration of the separate main and
indirect data.
The relevant options to thorload are:
  -GENERATE_INDIRECT TRUE|FALSE
      Controls automatic generation of indirect references. Default:
      FALSE.
  -EXCLUDE_INDIRECT ALL or [tag ...]
  -INCLUDE_INDIRECT ALL or [tag ...]
      These two options select which indirect datatypes are to be
      automatically processed. Each takes a list of datatype tags or the
      keyword "ALL".
  -INDIRECT_DATABASE dbname
      Specifies the indirect database name to which the generated
      indirect references are registered. This is required when
      -GENERATE_INDIRECT TRUE is specified. Default: none.
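The dependency between these options can be expressed as a small check.
The Python sketch below only assembles an option list and enforces the
rule that -INDIRECT_DATABASE must accompany -GENERATE_INDIRECT TRUE; it
does not run thorload. The helper name, the database name
"demo_indirect", and the choice to pass tags as separate words are all
assumptions made for illustration.

```python
def indirect_options(generate, indirect_db=None, include=None, exclude=None):
    """Build the indirect-data option words described in these notes.

    Hypothetical helper: how thorload expects tag lists to be written
    on the command line is assumed, not taken from the manual.
    """
    if generate and not indirect_db:
        raise ValueError(
            "-INDIRECT_DATABASE is required when -GENERATE_INDIRECT is TRUE"
        )
    args = ["-GENERATE_INDIRECT", "TRUE" if generate else "FALSE"]
    if indirect_db:
        args += ["-INDIRECT_DATABASE", indirect_db]
    if include:
        args += ["-INCLUDE_INDIRECT"] + list(include)
    if exclude:
        args += ["-EXCLUDE_INDIRECT"] + list(exclude)
    return args


# 'demo_indirect' is a made-up database name.
print(indirect_options(True, "demo_indirect", include=["ALL"]))
```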
Data quoting is uniformly enforced
All data containing the characters used for datatree syntax ($ < ; > |)
must be quoted in lexical datatrees. This is the same quoting
convention as before, but it is now enforced uniformly, in particular
in datatype definitions, e.g. $D<"$SMI"> or _P<"*;*">.
All databases must be updated (rebuilt). thordbfix451 is supplied
for this purpose.
No changes are visible when working at the object level
(i.e., with toolkits).
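A program that writes lexical datatrees directly could apply the rule
with a helper along these lines. This is a minimal Python sketch, not
Daylight code; the function names are hypothetical, and doubling an
embedded quote character is an assumption about the quoting convention
that should be checked against the Thor documentation.

```python
# Characters named in these notes as datatree syntax.
SYNTAX_CHARS = set("$<;>|")


def quote_if_needed(data):
    """Wrap data in double quotes when it contains syntax characters."""
    if '"' in data or any(ch in SYNTAX_CHARS for ch in data):
        # Assumption: an embedded quote is written as two quotes.
        return '"' + data.replace('"', '""') + '"'
    return data


def dataitem(tag, data):
    """Format one dataitem as tag<data>, quoting the data if required."""
    return "%s<%s>" % (tag, quote_if_needed(data))


print(dataitem("$D", "$SMI"))
print(dataitem("_P", "*;*"))
print(dataitem("$SMI", "CCO"))
```

The first two calls reproduce the datatype-definition examples given
above; the third shows that data without syntax characters is left
unquoted.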
Datatree merging moved to server
Loading Thor databases was pretty slow even if "raw" data loading
was used (i.e., database reloading). Moving the merge operation to
the server makes datatree merging much faster (5x) for raw data
loads. (Previous behavior: the client-side got the extant datatree
from the server, merged it with another, and returned the merged
datatree to the server.)
The server now checks to see if the new tree fits in the current
record's location; if so, it uses it rather than creating a new one
and dinking the hash lists. When deleting records, the server now
coalesces adjacent empty blocks when able.
Extra reserved space is inserted following commonly-used
cross-references in an adaptive manner. Such space is not subject
to garbage collection (except when the database is explicitly
crunched).
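The two storage ideas above (reusing a record's location when the new
tree fits, and coalescing adjacent empty blocks) can be pictured with a
toy free-list. The Python sketch below is a generic illustration under
assumed data structures, with blocks modeled as (offset, length) pairs;
it says nothing about the actual Thor file layout.

```python
def fits_in_place(old_length, new_length):
    """True if a rewritten record can reuse its existing location."""
    return new_length <= old_length


def coalesce(free_blocks):
    """Merge free (offset, length) blocks that touch end to start."""
    merged = []
    for offset, length in sorted(free_blocks):
        if merged and merged[-1][0] + merged[-1][1] == offset:
            # This block starts exactly where the previous one ends.
            prev_offset, prev_length = merged[-1]
            merged[-1] = (prev_offset, prev_length + length)
        else:
            merged.append((offset, length))
    return merged


# Hypothetical free list: three touching blocks and one isolated block.
free_list = coalesce([(100, 50), (0, 40), (40, 60), (200, 10)])
print(free_list)
```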
Monomer table caching
Thor clients connecting to a database with a defined monomer table
download the whole monomer table the first time it is used (e.g.,
for normalization). This can get time-consuming if there are many
thousands of monomer definitions.
The capability of caching monomer tables in a local directory
(e.g., /tmp) was added. This is a new, experimental feature. It is
not part of the Thor toolkit interface but may be made part of the
formal Thor toolkit interface in a future release. For version 4.51,
it is implemented only in thorlookup and daytoolserver as the
"THOR_MONOCACHE_DIR" option. When set, the programs will use the
given directory to cache local copies of the monomertable.
The new dt_info() property "monomtime" returns the date and time
that a database's monomer table was last modified.
These functions are combined to produce the following behavior: if
a local directory is specified, the monomer table is written to that
directory only if no cached copy exists there or the table has been
changed since the copy was cached.
This is most suitable for remote toolkit applications and for
applications which access combinatorial databases over slow lines.
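The refresh rule amounts to a timestamp comparison. The Python sketch
below shows the decision only; the function name is hypothetical, and it
assumes the "monomtime" value has already been obtained and converted to
seconds since the epoch, which is not something these notes specify.

```python
import os


def cache_is_stale(cache_path, table_mtime):
    """Decide whether a local monomer-table copy must be (re)written.

    table_mtime: last-modified time of the database's monomer table,
    assumed here to be in seconds since the epoch.
    """
    if not os.path.exists(cache_path):
        return True  # nothing cached yet
    # Refresh when the table changed after the cached copy was written.
    return os.path.getmtime(cache_path) < table_mtime
```

A caller would download and rewrite the cached copy only when this
returns True, and otherwise read the local file.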
Merlin Component-tuple fingerprint optimization
Mixtures stored as dot-disconnected SMILES are searched faster by
using the FPP component-ntuple fingerprint to avoid interpreting
the entire SMILES if possible.
THORDBFIX451
Added "thordbfix451", which Daylight and users can use to rebuild Thor
databases for compatibility with the 4.51 release:
thordbfix451
thordbfix451_dtypes
thordbfix451_rebuild
This is a suite of programs that works with thorfilters to dump,
modify and rebuild pre-4.51 Thor databases.
Thordbinfo
New "thorfilter" program: thordbinfo(1). Prints information about
a database that isn't available via any other thorfilter program.
This was motivated by the need for thordbfix451 (see above), but is a
generally useful new program.
Admin and Theory Manuals Restructured
The Administration and Theory Manuals were restructured so that
the Theory Manual is comprised only of Daylight computational
chemistry theory, and all Thor and Merlin administration topics are
covered in the Administration Manual, retitled the "Daylight
Installation and Administration Guide." The Administration Guide
includes a well-defined installation section and user guides for
the administrator programs including sthorman and the Thorfilters
programs. Among other benefits, Daylight administrators should now
need to refer to only one manual!
========================== 3. BUGS FIXED ==========================
> SMILES Toolkit bugs involving aromaticity detection have been
fixed. Previously, there existed "a few" non-stable dt_cansmi()
SMILES which oscillated between two values, or were not
interpretable by dt_smilin().
> Fixed showclusters/listclusters bugs. No longer require that
cluster numbers be ascending. No longer require cluster sizes as
input. Will warn but will proceed correctly in both of the above
cases.
> SMILES bonds no longer limited to 10 connections per atom.
> Fixed a bug in Prado. If one used the -print_smarts option, it
would fail because the SMARTS toolkit wasn't licensed. This was a
problem in the order in which the licenses were checked (a toolkit
function was called before the dy_lm_check_program() function).
> Fixed a bug in Rubicon. +RUBE_WRITE_BOUNDS works correctly (as
documented) now.
> Fixed a bug in SMILES toolkit. If one were to set the atomic
number of an atom above the highest legal value (DX_ATN_MAX), the
toolkit core-dumped when one generated a cansmiles.
> Fixed a bug where dt_setlabel[12]ga() simply didn't work. They
always used DL_GA_TEXTLABEL. Now they use the value set by the
user. The default is now DL_GA_DEFAULT, which is different from
the previous value. This will cause some colors to change for
applications which don't set the labelgas.
> Fixed a bug in tablet. The correct version of the program wasn't
being written out.
> Fixed a bug in dt_stream(bond, TYP_CYCLE). Returned an empty
stream if the bond wasn't in any cycles. Now returns NULL_OB.
> Fixed part searching in SMARTS. Also, dt_smarts_opt() now works properly.
> dt_canstream(), dt_origstream(), and dt_arbstream() replace
dt_canatom_stream(), dt_canbond_stream(), and dt_origsmi_stream().
> jpscan, jarpat - fixed -NNID option to work as in previous
versions, takes the first NN<> dataitem if the option isn't
specified.
> listclusters, showclusters - fixed memory deallocation bug in
dy_jp_taniss().
> Fixed bug in progob toolkit for remote toolkit. Program objects in
the remote toolkit didn't work because the program object toolkit
got the interrupt from the license management of the toolserver.
Fixed in v442p1.
> Changed SUN5 compilation to use -K PIC. This allows one to build
Shareable object libraries for Perl and TCL. Looked at speed
difference in Merlinserver. Fixed in v442p1.
> Changed SMILES toolkit to remove several limits. First, removed a
20K character limit on the length of a SMILES. Also, removed the
limit on the number of bonds allowed to a single atom (used to be
10).
> Lone hydrogens are removed during GRAPH normalization.
========================== 4. KNOWN BUGS =========================
TBA
======================= 5. RELEASE HISTORY =======================
Daylight releases are numbered using the following scheme:
The "system number" (e.g. 3.xx, 4.xx) indicates completely different
systems. Each system is a complete new design and coding.
Major releases (e.g. 4.1x, 4.2x, 4.3x) contain new features and
enhancements. Often, programs and databases from one major release
aren't compatible with those from another.
Minor releases, or "updates" (e.g. 4.32, 4.33) are for bug fixes and
minor additional features. They are also occasionally for adding
new platforms (computers and/or operating-system version).
The first two releases of system 4 were called "4.1" and "4.2"; under the
above-described scheme they would have been called "4.11" and "4.21".
There were two additional releases of the "demo" tape, which would have
been "4.22" and "4.23". Release 4.24 (October '92) was the first to use
this new version-numbering scheme.
4.1     20 Dec 1991  First 4.x release: SunOS only
4.21    20 Mar 1992  Second release: bug fixes, added SGI platform
4.22    ??           Demo Tape update (applics and toolkits not affected).
                     Updated the demo database: added clustering data to
                     illustrate Daylight's clustering product.
4.23    22 Sep 1992  Demo Tape update (applics and toolkits not affected)
                     Same as 4.22, but added workaround for a bug in the
                     SGI X window system.
4.24    02 Oct 1992  Update: many bug fixes, some added features.
4.25    13 Nov 1992  SGI Toolkit Tape only; corrects incompatibility between
                     versions of SGI IRIX Operating system. All other 4.24
                     SGI and Sun tapes are unaffected.
4.31    01 May 1993  Added Print Package, Merlin Toolkit. Many bug fixes
                     and enhancements. Added support for VAX/VMS (Toolkits,
                     servers, and some non-X-Windows programs). Added Thor
                     and Merlin management utilities.
4.32    01 Jul 1993  Added Rubicon program and Rubicon Toolkit. Added
                     support for HPUX on HP9000/7xx series, and for
                     Solaris on Sun machines. Improved "man" pages and
                     help-widget text files. A number of minor bug fixes.
4.33    28 Jan 1994  Improved merlinserver, thorserver, and Merlin and
                     Thor Toolkits. Improved printing. A number of minor
                     bug fixes. Restructured & added to "contrib" programs.
4.34    25 Feb 1994  Revamped clustering programs. Partial molecular
                     fingerprint generation. Fixed bugs introduced in 4.33.
4.40    Nov 23 1994  Preliminary "beta-test" versions of 4.41.
4.40b   Feb 12 1995  Preliminary "beta-test" versions of 4.41.
4.41    Mar 17 1995  Databases of mixtures, "monomer" toolkit (CHUCKLES,
                     CHORTLES, & CHARTS), Program-Object Toolkit,
                     parallelized (multi-CPU) version of clustering, MCL.
4.42    Feb 02 1996  HTML Documentation, CGI application programs, record
                     locking, thordestroy(1), Thor/Merlin messages,
                     Thor/Merlin eviction, faster TDT merging, Merlin
                     parallel SMARTS searches, better Merlin performance
                     under heavy load, Merlin program objects.
4.42p1  Apr 02 1996  Merlinserver, Merlinsmartstalk, and
                     Daytoolserver bugs fixed in this patch.
4.51    Mar ** 1997  Reaction Toolkit, reaction databases in Thor/Merlin,
                     formal object properties, read-only (CDROM) databases,
                     cross-referencing non-identifiers in THOR, Merlin
                     "similarity as sub/superstructure", and more.