CHUCKLES, CHORTLES and CHARTS
Daylight's first languages for combinatorial chemical mixtures.
These languages have been incorporated into the Database
package and the Monomer Toolkit.
Combinatorial chemistry was the inspiration for the Daylight "Monomer"
representation of chemical structures. These combinatorial library
languages address the problem of storing and searching chemical
structures not as individual molecules but as parts of a combinatorial
library synthesized as a mixture.
Combinatorial libraries have a synthetic design. The "Monomer" representation,
by contrast, may have no synthetic connection; it is rather a chemical
information systems representation meant to facilitate storage and retrieval.
In the Daylight system a combinatorial library can be represented in terms of
monomers. A Monomer Table is constructed of Monomers and Monomer Sets, which
are used to describe the chemical mixtures stored in a database.
Monomer- a "molecular chunk": a piece of a molecule that is typically
more than one atom but less than a whole molecule. It has three
properties: a symbol, a SMILES and a description.
- Left-to-right orientation is important
- Monomer definitions need not represent valid molecules
- Monomer definition SMILES is interpreted in the larger SMILES context
  (e.g. atom order)
- Unmatched ring closures are acceptable
EXAMPLES
Monomer Table
Symbol | Definition | Description
Ala | NC(C)C(=O) | alanine
Cys | NC(CS1)C(=O) | cysteine link
Oh | O&1 | hydroxy
Mphen | c1c&2cc&4cc1 | 1,3-substituted benzene
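As an illustration only, the Monomer Table above could be held in memory as a mapping from symbol to a three-field record. This is a minimal sketch, not the Daylight Monomer Toolkit API; the names `Monomer` and `MONOMER_TABLE` are hypothetical, and the rows are copied verbatim from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Monomer:
    """One row of a Monomer Table (hypothetical record type)."""
    symbol: str       # name used in CHUCKLES strings
    definition: str   # SMILES fragment; need not be a valid molecule
    description: str  # free-text description


# Rows copied from the example table; keyed by symbol for lookup.
MONOMER_TABLE = {
    m.symbol: m
    for m in (
        Monomer("Ala", "NC(C)C(=O)", "alanine"),
        Monomer("Cys", "NC(CS1)C(=O)", "cysteine link"),
        Monomer("Oh", "O&1", "hydroxy"),
        Monomer("Mphen", "c1c&2cc&4cc1", "1,3-substituted benzene"),
    )
}
```

A lookup such as `MONOMER_TABLE["Cys"].definition` then returns the stored SMILES fragment unchanged.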
CHUCKLES- monomer-level representation of a molecule
- Monomer symbols are used in CHUCKLES in the same way as atomic symbols
  are used in SMILES
- Ring closures are interpreted on a monomer-definition basis
- The SMILES of a molecule is interpreted as if monomer symbols were replaced
  by their respective definitions
- If no bond is specified, monomers are joined left-to-right; bond symbols
  are the same as for SMILES
- "." indicates that the adjacent monomers are not bonded
EXAMPLES
AlaCys2AlaCys2
Phen23.Oh2.Oh3
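To make the "monomer symbols act like atomic symbols" idea concrete, here is a small sketch that splits a CHUCKLES string into monomer-level tokens. It rests on assumptions not stated in the text: a monomer symbol is taken to be one uppercase letter followed by lowercase letters, trailing digits are taken to be ring-closure labels, and explicit bond symbols are not handled. The function name `tokenize_chuckles` is hypothetical.

```python
import re

# Assumption: symbol = uppercase letter + lowercase letters, then optional
# ring-closure digits; "." is a separate token. Bond symbols are not handled.
TOKEN = re.compile(r"([A-Z][a-z]*)(\d*)|(\.)")


def tokenize_chuckles(text):
    """Return a list of (symbol, ring_closure_digits) pairs; "." is ('.', '')."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}: {text[pos]!r}")
        if match.group(3) is not None:
            tokens.append((".", ""))
        else:
            tokens.append((match.group(1), match.group(2)))
        pos = match.end()
    return tokens


print(tokenize_chuckles("AlaCys2AlaCys2"))
# [('Ala', ''), ('Cys', '2'), ('Ala', ''), ('Cys', '2')]
print(tokenize_chuckles("Phen23.Oh2.Oh3"))
# [('Phen', '23'), ('.', ''), ('Oh', '2'), ('.', ''), ('Oh', '3')]
```

Expanding each token into its SMILES definition would be the next step, but that depends on ring-closure rules this sketch does not attempt to reproduce.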
Monomer Sets-
Multiple monomer choices in a given position are specified via a MonomerSet,
made of semicolon-separated Monomers in brackets. The Monomer Set
definition consists of its symbol, its set of Monomers and its description.
Example
Symbol | Definition | Description
Basepep | Lys;Arg;His | basic peptide residue
Thiopep | Cys;Met | sulfur-containing peptide residue
CHORTLES- monomer-level representation of mixtures
An extension to the CHUCKLES language that represents regular mixtures.
Multiple monomer choices in a given position are specified via a MonomerSet,
made of semicolon-separated Monomers in brackets.
Examples
CHORTLES | Components
Ala[Cys;Ala]Oh | 2 dimers
Ala[Basepep][Thiopep][Ala;Cys]Oh | 12 pentamers
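The component counts in these examples follow from multiplying the number of choices at each position. The sketch below enumerates the components of a CHORTLES string with `itertools.product`. It makes some assumptions: monomer symbols are an uppercase letter followed by lowercase letters, a bracketed name found in `MONOMER_SETS` expands to its members, and bond symbols, ring closures and dots are ignored. The names `MONOMER_SETS`, `positions` and `enumerate_components` are hypothetical.

```python
import itertools
import re

# Monomer Sets copied from the example table above.
MONOMER_SETS = {
    "Basepep": ["Lys", "Arg", "His"],
    "Thiopep": ["Cys", "Met"],
}

# Either a bracketed choice list or a single bare monomer symbol.
POSITION = re.compile(r"\[([^\]]+)\]|([A-Z][a-z]*)")


def positions(chortles):
    """Split a CHORTLES string into a list of per-position choice lists."""
    result = []
    pos = 0
    while pos < len(chortles):
        match = POSITION.match(chortles, pos)
        if match is None:
            raise ValueError(f"cannot parse at position {pos}: {chortles[pos]!r}")
        if match.group(1) is not None:
            choices = []
            for name in match.group(1).split(";"):
                # Expand a Monomer Set symbol; otherwise keep the monomer itself.
                choices.extend(MONOMER_SETS.get(name, [name]))
        else:
            choices = [match.group(2)]
        result.append(choices)
        pos = match.end()
    return result


def enumerate_components(chortles):
    """Return every CHUCKLES component of the mixture, one choice per position."""
    return ["".join(combo) for combo in itertools.product(*positions(chortles))]


print(enumerate_components("Ala[Cys;Ala]Oh"))
# ['AlaCysOh', 'AlaAlaOh']
print(len(enumerate_components("Ala[Basepep][Thiopep][Ala;Cys]Oh")))
# 12
```

The second count is 1 x 3 x 2 x 2 x 1 = 12, matching the table.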
CHARTS- monomer-level patterns
CHARTS provides a language for monomer-level patterns specified in CHUCKLES
and CHORTLES, much as the SMARTS language does for molecular patterns
specified in SMILES.
- AND syntax is the same as in CHORTLES/SMARTS and is used to specify
  submixtures, i.e. [A;B] means "A AND B"
- OR syntax is the same as in SMARTS (comma means "or") and is used
  to specify components, i.e. [A,B] means "A OR B"
- Pseudo monomers: Any, Begin, End
- A repeat count is provided at the monomer level: :N, :N-M, :?, :+, :*
- Dot disconnections: a "." in a CHARTS indicates that the CHARTS
  subexpressions on the left and right of the dot are to be matched
  independently
Examples
CHARTS | Match found in CHORTLES
Ala[Pro;Tyr] | [Ala;Gly;Lys][Pro;Ser;Tyr][Cys;His]O
Ala[Pro,Tyr] | AlaProHisOh AlaTyrHisOh TyrAlaProHisGlyOh
Begin-Any-Any-Any-End | All Pentamers
Phe[Ala;Gly:2-]Phe | 2 Phe's with 2 or more intervening Ala's/Gly's, e.g. PheGlyAlaPhe
Ala[Pro;Tyr]His.AlaProHis | [Ala;Gly][Pro;Ser;Tyr][Cys;His][Ala;Gly]ProHisAlaOh
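The OR row of the table (`Ala[Pro,Tyr]`) can be illustrated with a simplified matcher. This sketch covers only the OR case against single CHUCKLES components: the pattern is given as a list of sets of allowed monomers, the target as a list of monomer symbols, and a match is any contiguous window that satisfies every position. AND semantics, pseudo monomers, repeat counts and dot disconnections are not implemented, and `matches_or_pattern` is a hypothetical name rather than a Daylight function.

```python
def matches_or_pattern(pattern, monomers):
    """Return True if some contiguous run of monomers satisfies the pattern.

    pattern  -- list of sets; position i accepts any monomer in pattern[i]
    monomers -- list of monomer symbols making up one CHUCKLES component
    """
    width = len(pattern)
    for start in range(len(monomers) - width + 1):
        window = monomers[start:start + width]
        if all(m in allowed for m, allowed in zip(window, pattern)):
            return True
    return False


# Ala[Pro,Tyr]: an Ala followed by either Pro or Tyr.
pattern = [{"Ala"}, {"Pro", "Tyr"}]

print(matches_or_pattern(pattern, ["Ala", "Pro", "His", "Oh"]))                # True
print(matches_or_pattern(pattern, ["Ala", "Tyr", "His", "Oh"]))                # True
print(matches_or_pattern(pattern, ["Tyr", "Ala", "Pro", "His", "Gly", "Oh"]))  # True
print(matches_or_pattern(pattern, ["Ala", "Gly", "His", "Oh"]))                # False
```

The first three targets are the matches listed in the table; the fourth is an added counterexample.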
Monomer Table- Table of all Monomers and Monomer Sets for a
given database.
Summary:
CHARTS Search- Takes place at the Monomer level rather than atom-by-atom
SMARTS Search of combinatorial mixture- Atomic-level searching of CHUCKLES
and CHORTLES; CHUCKLES are enumerated, which is slow!
Submixture Search- "Find mixtures contained by this mixture"; the search
target is a CHORTLES
Supermixture Search - "Find mixtures containing this mixture"
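One way to picture the two mixture searches is as set containment between enumerated components. The sketch below assumes that "contained by" means every component of one mixture is also a component of the other, and it represents a mixture as a list of per-position choice lists. Both are illustrative assumptions: the text does not describe how Daylight performs these searches, and brute-force enumeration like this would be slow for large libraries. The function names are hypothetical.

```python
from itertools import product


def components(mixture):
    """Enumerate a mixture (list of per-position choice lists) as a set of strings."""
    return {"".join(combo) for combo in product(*mixture)}


def is_submixture(candidate, query):
    """True if every component of candidate is also a component of query."""
    return components(candidate) <= components(query)


def is_supermixture(candidate, query):
    """True if candidate contains every component of query."""
    return components(candidate) >= components(query)


query = [["Ala"], ["Cys", "Ala"], ["Oh"]]   # Ala[Cys;Ala]Oh
small = [["Ala"], ["Cys"], ["Oh"]]          # AlaCysOh
other = [["Ala"], ["Cys", "Gly"], ["Oh"]]   # Ala[Cys;Gly]Oh

print(is_submixture(small, query))    # True
print(is_supermixture(query, small))  # True
print(is_submixture(other, query))    # False
```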
Daylight Chemical Information Systems Inc.