Reaction Languages: Reaction SMIRKS


General

Reaction SMIRKS is the language used for describing generic reactions (transformations). It can be viewed as reaction SMILES in which a restricted set of SMARTS expressions is also allowed. Every valid reaction SMIRKS is therefore also a valid reaction SMARTS pattern; the converse does not hold, because SMIRKS imposes the additional rules listed below.

Rules

The SMIRKS rules are as follows:

  1. Atom maps must be pairwise and complete: every mapped reactant atom must correspond to exactly one mapped product atom, and vice versa. Ambiguous atom maps are not legal in SMIRKS. In effect, since SMIRKS describes a generic reaction mechanism, that mechanism must be fully specified.

  2. Stoichiometry must be well-defined and complete for SMIRKS. Any atoms which are not atom mapped are assumed to be added or deleted during the transformation.

  3. Explicit hydrogens that are used on one side of a SMIRKS must be explicit on the other side and must be mapped (unless they are added or deleted during the transform).

  4. Bond expressions must be SMILES (no bond queries allowed).

  5. Atom expressions may be any valid atomic SMARTS expressions for atoms which do not undergo bond changes during the transform. All other atom expressions should be either SMILES or "limited" SMARTS (see below).
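
Rules 1 and 2 lend themselves to a mechanical check. The following is a minimal sketch in Python, not part of any Daylight toolkit: the function name check_atom_maps and the regular expression are assumptions made for illustration, and the check only looks at map numbers, not at the chemistry.

```python
import re
from collections import Counter

# Assumption: an atom map is the trailing ":n" of a bracket atom, e.g. [O-:4].
MAP_RE = re.compile(r":(\d+)\]")

def check_atom_maps(smirks):
    """Report map numbers that are unpaired, or repeated within one side."""
    # A reaction has the form reactants>agents>products; ">>" means no agents.
    reactants, _agents, products = smirks.replace(" ", "").split(">")
    left = Counter(int(n) for n in MAP_RE.findall(reactants))
    right = Counter(int(n) for n in MAP_RE.findall(products))
    return {
        "reactant_only": sorted(set(left) - set(right)),
        "product_only": sorted(set(right) - set(left)),
        "repeated": sorted({n for c in (left, right) for n, k in c.items() if k > 1}),
    }

if __name__ == "__main__":
    print(check_atom_maps("[*:1][N:2](=[O:3])=[O:4]>>[*:1][N+:2](=[O:3])[O-:4]"))
```

A SMIRKS that satisfies rule 1 yields three empty lists. Unmapped atoms are deliberately ignored here, since rule 2 treats them as added or deleted.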

Simple Reaction SMIRKS
[*:1][N:2](=[O:3])=[O:4]>>[*:1][N+:2](=[O:3])[O-:4]

Here's a simple transform. It converts the pentavalent representation of a nitro group into the charge-separated one. SMIRKS need not represent real chemistry; they are also useful tools for manipulating molecules.

[H:1][O:5][CH2:8][CH3:3].[H:2][O:6][C:9](=[O:7])[CH3:4]>>[O:6]([H:1])[H:2].[CH3:3][CH2:8][O:5][C:9](=[O:7])[CH3:4]

Our favorite esterification, the complete version.

[H][O:5][CH2:8][CH3:3].O[C:9](=[O:7])[CH3:4]>>[CH3:3][CH2:8][O:5][C:9](=[O:7])[CH3:4]

Our favorite esterification again. This uses the new 4.61 syntax for adding and deleting atoms: unmapped atoms are assumed to come from outside of the environment.
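
To see which atoms such a transform adds or deletes, one can list the unmapped atoms on each side. The sketch below is a rough illustration in Python under stated assumptions: the helper names top_level_atoms and unmapped_atoms are hypothetical, the scanner recognizes only bracket atoms, organic-subset symbols and the wildcard, and it does no chemical validation.

```python
import re

# Unbracketed atoms: organic-subset symbols, aromatic forms, and the wildcard.
BARE_ATOM = re.compile(r"Br|Cl|[BCNOPSFI]|[bcnops]|\*")

def top_level_atoms(side):
    """Return the atom tokens of one side, keeping each bracket atom whole."""
    tokens, i = [], 0
    while i < len(side):
        if side[i] == "[":
            # Walk to the matching "]", allowing nested brackets
            # such as those inside recursive SMARTS.
            depth, j = 0, i
            while j < len(side):
                depth += {"[": 1, "]": -1}.get(side[j], 0)
                if depth == 0:
                    break
                j += 1
            tokens.append(side[i:j + 1])
            i = j + 1
        else:
            m = BARE_ATOM.match(side, i)
            if m:
                tokens.append(m.group())
            # Bonds, ring digits, dots and parentheses are skipped.
            i = m.end() if m else i + 1
    return tokens

def unmapped_atoms(smirks):
    """Return (deleted, added): unmapped reactant atoms and unmapped product atoms."""
    reactants, _agents, products = smirks.replace(" ", "").split(">")
    def is_unmapped(tok):
        return not re.search(r":\d+\]$", tok)
    deleted = [t for t in top_level_atoms(reactants) if is_unmapped(t)]
    added = [t for t in top_level_atoms(products) if is_unmapped(t)]
    return deleted, added

if __name__ == "__main__":
    print(unmapped_atoms(
        "[H][O:5][CH2:8][CH3:3].O[C:9](=[O:7])[CH3:4]"
        ">>[CH3:3][CH2:8][O:5][C:9](=[O:7])[CH3:4]"
    ))
```

For the abbreviated esterification, the unmapped reactant atoms are the alcohol hydrogen and the acid hydroxyl oxygen, and no atoms are added on the product side.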

[$([O,S;+0]C),$([N+0](C)C),$(C=C[O,N,S]):1][c:10]1[cH:8][cH:6][cH:5][cH:7][cH:9]1.[O-:2][N+:4]=[O:3]>>[$([O,S;+0]C),$([N+0](C)C),$(C=C[O,N,S]):1][c:10]1[cH:8][cH:6][c:5]([cH:7][cH:9]1)[N+:4](=[O:3])[O-:2]

Atomic SMARTS expressions can be used to describe "environmental" factors which affect the generic reaction. In this case, electron donation to the ring favors the reaction.

[C:1]=O>>[C:1]1OCCO1

A ketal formation (protection reaction). The carbonyl oxygen goes away; it is not retained as part of the cyclic ketal.

([*+0;n,N,S,O:1]).([C:2][*;Br,I:3])>>[*+1:1][C:2].[*-1:3]

An alkylation reaction. This uses SMARTS for atom expressions.

SMIRKS Atomic Expressions

In version 4.51, atoms which were involved directly in a transformation (the bonding to that atom changed in the transform) were required to be expressed as SMILES. All other atom expressions could be SMARTS. In version 4.61, this restriction was relaxed to allow the following general syntax for atom expressions which are involved in the reaction:

[<SMILES PART>;<SMARTS PART>:<MAP>]

Using the low-precedence AND operator (";"), one can state explicitly which properties of the atom change during the transform while keeping the ability to generalize the transform with SMARTS, including recursive SMARTS. For example, "[*+1;n,N:1][H:2]>>[*+0;n,N:1].[H+1:2]" represents a general ammonium ion deprotonation. Prior to version 4.61, expressing this concept would have required two separate SMIRKS, one for the aliphatic case and one for the aromatic case.
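
The three-part layout of such an atom expression can be pulled apart with a few lines of Python. This is only an illustrative sketch: split_atom_expression is a hypothetical helper, it splits at the first ";" that is not nested inside parentheses or brackets, and it does not verify that the first part really is valid SMILES.

```python
def split_atom_expression(atom):
    """Split '[<SMILES PART>;<SMARTS PART>:<MAP>]' into its three parts."""
    body = atom.strip()[1:-1]  # drop the outer brackets
    # The map, if present, is the trailing ":<digits>".
    head, sep, tail = body.rpartition(":")
    if sep and tail.isdigit():
        body, atom_map = head, int(tail)
    else:
        atom_map = None
    # Find the first top-level ";" (ignore any inside recursive SMARTS).
    depth = 0
    for i, ch in enumerate(body):
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        elif ch == ";" and depth == 0:
            return body[:i], body[i + 1:], atom_map
    return body, "", atom_map

if __name__ == "__main__":
    print(split_atom_expression("[*+1;n,N:1]"))
    print(split_atom_expression("[CH3:3]"))
```

For "[*+1;n,N:1]" the parts are "*+1" (the property that changes in the transform), "n,N" (the generalizing SMARTS), and map 1.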

Chirality is handled locally in SMIRKS, with SMARTS semantics. For example, the following two transformations both invert carbon tetrahedral stereochemistry:

[*:1][C@:2]([*:3])([*:4])[*:5]>>[*:1][C@@:2]([*:3])([*:4])[*:5]
[*:1][C@:2]([*:3])([*:4])[*:5]>>[*:1][C@:2]([*:4])([*:3])[*:5]
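
Both forms invert the center because, with SMILES-style chirality, swapping two neighbors in the written order has the same effect as exchanging "@" and "@@". The following Python sketch makes that parity argument explicit. The function names are hypothetical, and the neighbor lists are the map numbers of the four neighbors in the order they are written around the central atom.

```python
def permutation_is_odd(before, after):
    """True if reordering `before` into `after` takes an odd number of swaps."""
    position = {m: i for i, m in enumerate(before)}
    seq = [position[m] for m in after]
    # Count inversions; their parity equals the parity of the permutation.
    inversions = sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )
    return inversions % 2 == 1

def inverts_center(before, mark_before, after, mark_after):
    """True if the transform inverts the tetrahedral center.

    A changed mark ("@" vs "@@") inverts, and an odd reordering of the
    neighbors inverts; doing both cancels out.
    """
    mark_flipped = mark_before != mark_after
    return mark_flipped != permutation_is_odd(before, after)

if __name__ == "__main__":
    print(inverts_center([1, 3, 4, 5], "@", [1, 3, 4, 5], "@@"))
    print(inverts_center([1, 3, 4, 5], "@", [1, 4, 3, 5], "@"))
```

Both calls correspond to the two transforms above. Changing the mark and swapping two neighbors at the same time would leave the stereochemistry unchanged.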


Daylight Chemical Information Systems, Inc.
jjdelany@daylight.com