Supporting substructure recognition and the SMARTS® language
The SMARTS® language, an extension of SMILES™, is a powerful, flexible, and compact representation of structure and reaction queries or patterns. Structure and reaction queries represent subsets of chemical structure or reaction spaces, respectively. Given a structure or reaction and a query in the form of a substructure pattern, one can determine if the molecule or reaction belongs to the subset of chemistry or reaction space represented by the query. This is the widely used substructure search or pattern matching operation.
Alternatively, given a set of structures or reactions, one can determine the query or pattern that most narrowly defines a subset of chemistry or reaction space containing every member of the given set. This is an abstraction operation that is useful for representing sets of structures or reactions based on a shared scaffold or transformation.
The SMARTS® Toolkit is a programming library that provides the functions needed to search molecules and reactions for substructural patterns and to generate substructural patterns from sets of molecules. Patterns can be simple connections of atoms or sophisticated relationships based on complex atomic environments. Functions are provided with the toolkit to find any match quickly, to enumerate all possible isomorphic matches, and to enumerate only those isomorphs that represent unique sets of atoms. A function is also provided to generate a SMARTS® abstraction from a set of molecules.
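The difference between the three search modes described above (any match, all isomorphic matches, and matches on unique sets of atoms) can be illustrated with a small sketch. This is plain Python on a toy graph, not the Toolkit's API: the names Graph, iter_matches, first_match, all_matches, and unique_matches are hypothetical, atoms are compared by element symbol only, and bond orders, charges, and the rest of the SMARTS® grammar are ignored.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple


class Graph:
    """Toy molecular graph: atom symbols by index plus undirected bonds."""

    def __init__(self, atoms: List[str], bonds: List[Tuple[int, int]]):
        self.atoms = atoms
        self.adj: Dict[int, Set[int]] = {i: set() for i in range(len(atoms))}
        for a, b in bonds:
            self.adj[a].add(b)
            self.adj[b].add(a)


def iter_matches(pattern: Graph, target: Graph) -> Iterator[Tuple[int, ...]]:
    """Yield every mapping of pattern atoms onto target atoms (backtracking)."""
    n = len(pattern.atoms)

    def extend(mapping: List[int]) -> Iterator[Tuple[int, ...]]:
        p = len(mapping)  # index of the next pattern atom to place
        if p == n:
            yield tuple(mapping)
            return
        for t in range(len(target.atoms)):
            if t in mapping:
                continue  # each target atom is used at most once
            if pattern.atoms[p] != "*" and pattern.atoms[p] != target.atoms[t]:
                continue  # "*" is treated as a wildcard atom in this sketch
            # Every pattern bond from p to an already placed atom must
            # also be present between the corresponding target atoms.
            if all(mapping[q] in target.adj[t] for q in pattern.adj[p] if q < p):
                yield from extend(mapping + [t])

    yield from extend([])


def first_match(pattern: Graph, target: Graph) -> Optional[Tuple[int, ...]]:
    """Stop at the first mapping found; None means the target is not in the subset."""
    return next(iter_matches(pattern, target), None)


def all_matches(pattern: Graph, target: Graph) -> List[Tuple[int, ...]]:
    """Enumerate every mapping, including symmetric duplicates."""
    return list(iter_matches(pattern, target))


def unique_matches(pattern: Graph, target: Graph) -> List[Tuple[int, ...]]:
    """Keep one mapping per distinct set of matched target atoms."""
    seen: Set[frozenset] = set()
    result: List[Tuple[int, ...]] = []
    for m in iter_matches(pattern, target):
        key = frozenset(m)
        if key not in seen:
            seen.add(key)
            result.append(m)
    return result


# 1-propanol heavy atoms C-C-C-O, and a two-carbon pattern C-C.
propanol = Graph(["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)])
cc = Graph(["C", "C"], [(0, 1)])

print(first_match(cc, propanol))
print(len(all_matches(cc, propanol)), len(unique_matches(cc, propanol)))
```

In this toy case the C-C pattern maps onto each carbon-carbon bond in both directions, so the full enumeration returns twice as many mappings as the unique-atom-set enumeration. A first-match search is enough when only subset membership matters.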
Objects supported by this Toolkit include: