Ring closure bonds are specified in SMILES by appending matching digits to the specifications of the joined atoms, with the bond symbol preceding the digit (if needed). In fact, this is another general way of specifying a bond in SMILES.
A useful way of thinking about SMILES ring specification is as follows. There's a graph theorem that says, "There is always a way of breaking one bond per ring in a connected molecule which leaves you with a still-connected but acyclic molecule." (Actually, graph theoreticians talk about "graphs" instead of "molecules" and "edges" instead of "bonds", but if they thought about chemistry, that's how they might say it.) Pick one bond in each ring in this way, numbering them in any order. Break the numbered bonds, appending the bond number to the atoms on the ends of the bonds so broken. Note that this leaves an acyclic structure which can always be specified using the three rules for specifying atoms, bonds, and branching.
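The procedure above can be sketched in Python. This is only an illustration under simplifying assumptions, not a full SMILES writer: every atom is taken to be an aliphatic carbon, every bond is single, and the function names `split_ring_bonds` and `write_smiles` are hypothetical. A depth-first search picks the bonds to break (those that lead back to an atom already visited); the remaining bonds form the acyclic structure, which is then written using atoms and branches only.

```python
def split_ring_bonds(bonds, start=0):
    """Split the bonds of a connected molecule into (tree_bonds, ring_closure_bonds)."""
    adj = {}
    for a, b in bonds:
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    visited = {start}
    used = set()              # bonds already walked, in either direction
    tree, closures = [], []

    def visit(atom):
        for nbr in adj.get(atom, []):
            edge = frozenset((atom, nbr))
            if edge in used:
                continue
            used.add(edge)
            if nbr in visited:
                closures.append((atom, nbr))   # this bond closes a ring: break it
            else:
                visited.add(nbr)
                tree.append((atom, nbr))
                visit(nbr)

    visit(start)
    return tree, closures


def write_smiles(bonds, start=0):
    """Write an all-carbon, all-single-bond SMILES (assumption of this sketch)."""
    tree, closures = split_ring_bonds(bonds, start)
    labels = {}
    for number, (a, b) in enumerate(closures, start=1):
        labels.setdefault(a, []).append(number)
        labels.setdefault(b, []).append(number)
    children = {}
    for parent, child in tree:
        children.setdefault(parent, []).append(child)

    def write(atom):
        # Labels above 9 use the %nn form; this sketch supports at most 99.
        text = "C" + "".join(str(n) if n < 10 else f"%{n}"
                             for n in labels.get(atom, []))
        kids = children.get(atom, [])
        for kid in kids[:-1]:
            text += "(" + write(kid) + ")"     # all but the last child are branches
        if kids:
            text += write(kids[-1])
        return text

    return write(start)
```

For a six-membered carbon ring given as the bonds (0,1), (1,2), (2,3), (3,4), (4,5), (5,0), the sketch breaks the bond between atoms 5 and 0 and writes `C1CCCCC1`. Each broken bond gets its own number here; digits are never reused.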
Cyclohexane is a simple example:
There are usually many different but equally valid descriptions of the same structure, e.g., the following SMILES notations for 1-methyl-3-bromo-cyclohexene-1:
A single atom may have more than one ring closure. This is illustrated by the structure of cubane in which two atoms have more than one ring closure:
The ability to re-use ring closure digits makes it possible to specify
structures with more than 10 rings.
Structures that require more than 10 ring closures to be open at once are
exceedingly rare (not even buckminsterfullerene needs that many).
If needed or desired, higher-numbered ring closures may be specified by
prefacing a two-digit number with a percent sign (%).
For example, C2%13%24 is a carbon atom with ring closures 2, 13, and 24.
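As a rough illustration of how ring closure labels might be read, the following Python sketch scans a SMILES string, pairs each label with its partner, and tracks how many closures are open at once. It is a simplified reading aid, not a validating parser: the name `scan_ring_closures` and the token pattern are assumptions of this example, and only bracket atoms and the usual organic-subset symbols are recognized as atoms.

```python
import re

# Bracket atoms are matched first so that digits inside them
# (isotopes, charges, hydrogen counts) are not read as ring closures.
TOKEN = re.compile(
    r"(?P<atom>\[[^\]]*\]|Br|Cl|[BCNOPSFI]|[bcnops])"
    r"|(?P<ring>%\d\d|\d)"
    r"|(?P<other>.)"
)


def scan_ring_closures(smiles):
    """Return (pairs, still_open, max_open).

    pairs      -- list of (opening_atom, closing_atom, label)
    still_open -- dict of label -> atom index for labels never closed
    max_open   -- largest number of closures open at the same time
    """
    open_at = {}
    pairs = []
    max_open = 0
    atom = -1                      # index of the most recent atom
    for match in TOKEN.finditer(smiles):
        if match.group("atom"):
            atom += 1
        elif match.group("ring"):
            if atom < 0:
                raise ValueError("ring closure label before any atom")
            label = int(match.group("ring").lstrip("%"))
            if label in open_at:
                # Second occurrence closes the ring and frees the label for reuse.
                pairs.append((open_at.pop(label), atom, label))
            else:
                open_at[label] = atom
                max_open = max(max_open, len(open_at))
    return pairs, open_at, max_open
```

Applied to the fragment `C2%13%24`, the sketch reports labels 2, 13, and 24 as open on atom 0. Applied to `c1ccccc1c1ccccc1`, it pairs label 1 twice, (0, 5) and then (6, 11), with never more than one closure open.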
SMILES | Name | Remark |
---|---|---|
C1CCCCC1 | cyclohexane | If unspecified, the default bond order for the ring closure is the same as with any other bond. |
C1=CCCCC1, C=1CCCCC1, C1CCCCC=1, C=1CCCCC=1 | cyclohexene | The bond order of a ring closure may be specified at either end or at both, as long as the two ends don't conflict; e.g., C=1CCCCC-1 is not OK. |
c12c(cccc1)cccc2, same as c1cc2ccccc2cc1 | naphthalene | Atoms can have more than one ring closure. |
c1ccccc1c2ccccc2, same as c1ccccc1c1ccccc1 | biphenyl | A ring closure digit may be reused if desired. |
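The rule that the two ends of a ring closure must not carry conflicting bond symbols can be checked mechanically. The Python sketch below is one possible way to do it; the name `ring_bond_symbols` and the token pattern are assumptions of this example, and only the bond symbols `-`, `=`, `#`, and `:` are considered.

```python
import re

# A bracket atom, or an optional bond symbol directly followed by a ring
# closure label, or any other single character (ignored by this sketch).
TOKEN = re.compile(r"\[[^\]]*\]|(?P<bond>[-=#:]?)(?P<ring>%\d\d|\d)|.")


def ring_bond_symbols(smiles):
    """Return the bond symbol of each ring closure, in closing order.

    An empty string means neither end specified a bond, so the default applies.
    Raises ValueError if the two ends specify different symbols.
    """
    open_bonds = {}                # label -> bond symbol given at the opening end
    result = []
    for match in TOKEN.finditer(smiles):
        ring = match.group("ring")
        if ring is None:
            continue               # not a ring closure token
        label = int(ring.lstrip("%"))
        bond = match.group("bond")
        if label not in open_bonds:
            open_bonds[label] = bond
            continue
        first = open_bonds.pop(label)
        if first and bond and first != bond:
            raise ValueError(f"conflicting bonds on ring closure {label}")
        result.append(first or bond)
    return result
```

With this sketch, `C=1CCCCC1`, `C1CCCCC=1`, and `C=1CCCCC=1` all report a double-bond ring closure, `C1=CCCCC1` reports a default ring closure (its double bond is an ordinary chain bond), and `C=1CCCCC-1` raises an error.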