Single, double, triple, and aromatic bonds are represented by the symbols `-', `=', `#', and `:', respectively. Adjacent atoms without an intervening bond symbol are connected by a valence-dictated bond (typically a single or aromatic bond). The `-' (single) and `:' (aromatic) bond symbols may always be omitted on input.
The syntax for the bond sublanguage is:
bond : <empty> | '-' | '=' | '#' | ':' ;
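As a minimal sketch of how a reader might implement this production, the Python function below consumes at most one bond symbol at a given position. The names `BOND_SYMBOLS` and `read_bond` are hypothetical, chosen for illustration; they do not come from any SMILES toolkit.

```python
from typing import Optional, Tuple

# The four explicit alternatives of the `bond` production.
BOND_SYMBOLS = {"-": "single", "=": "double", "#": "triple", ":": "aromatic"}


def read_bond(smiles: str, pos: int) -> Tuple[Optional[str], int]:
    """Read the `bond` production starting at smiles[pos].

    Returns (label, new_pos). The label is None for the <empty>
    alternative, in which case the caller must deduce the bond type
    from the neighbouring atoms.
    """
    if pos < len(smiles) and smiles[pos] in BOND_SYMBOLS:
        return BOND_SYMBOLS[smiles[pos]], pos + 1
    return None, pos
```

For example, `read_bond("C=O", 1)` returns `("double", 2)`, while `read_bond("CC", 1)` returns `(None, 1)` because no bond symbol is present.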
There is no "preferred" or "correct" ordering in SMILES, e.g., CCO and OCC are equally valid SMILES for ethanol.
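The ordering point can be illustrated with a deliberately restricted sketch: for unbranched, ring-free chains of one-letter aliphatic atoms, two SMILES describe the same chain exactly when one reads as the other forwards or backwards. The helper names `chain_tokens` and `same_chain` are hypothetical, and the restriction to the atom set `BCNOPSFI` is an assumption made to keep the example short; this is not a general SMILES comparison.

```python
ATOMS = set("BCNOPSFI")   # one-letter aliphatic atoms only (assumption)
BONDS = set("-=#:")


def chain_tokens(smiles):
    """Split an unbranched SMILES into alternating atom and bond tokens.

    An omitted bond between aliphatic atoms is recorded as '-'.
    """
    tokens = []
    for ch in smiles:
        if ch in ATOMS:
            if len(tokens) % 2 == 1:      # previous token was an atom
                tokens.append("-")        # fill in the implicit single bond
            tokens.append(ch)
        elif ch in BONDS:
            if len(tokens) % 2 == 0:      # no atom precedes this bond
                raise ValueError("bond symbol without a preceding atom")
            tokens.append(ch)
        else:
            raise ValueError("unsupported character: %r" % ch)
    if tokens and len(tokens) % 2 == 0:
        raise ValueError("dangling bond symbol at end of input")
    return tokens


def same_chain(a, b):
    """True if a and b spell the same unbranched chain in either direction."""
    ta, tb = chain_tokens(a), chain_tokens(b)
    return ta == tb or ta == tb[::-1]
```

Under these assumptions, `same_chain("CCO", "OCC")` is true, whereas `same_chain("CCOCC", "COCCC")` is false because the oxygen sits at a different position in the chain.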
The SMILES language specifies no predefined length limit on a SMILES string. In practice, most implementations define such a limit, typically between 20,000 and 80,000 characters. Ring closure is an alternative way to specify bonds, as explained in the section on ring specification.

| Depiction | SMILES | Name | Remark |
|---|---|---|---|
| | CC or C-C or [CH3]-[CH3] | ethane | Adjacent aliphatic atoms are assumed to be bonded by a single bond: the single bond symbol `-' is not needed on input. |
| | C=O or O=C | formaldehyde | Double bonds are represented by an equals sign. Note that the order of input doesn't matter (SMILES may start with any atom). |
| | C#N or N#C | hydrogen cyanide | Triple bonds are represented by a hash (or "pound") sign. (There is no handy triple-bond-like symbol in standard ASCII.) |
| | C=C or cc | ethene | Ethene is normally written C=C, but (surprise!) the default bond between non-aromatic sp2 atoms may be a double bond ... |
| | C=CC=C or cccc | butadiene | ... but not always. (Butadiene is normally written C=CC=C.) |
| ? | ccc | ? | "There ain't no sech animal." |
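Taking at face value the rule that `-' and `:' may always be omitted on input, the equivalence of spellings such as CC and C-C can be sketched as a small normalization step. The function name `drop_omittable_bonds` is hypothetical. The one subtlety it handles is that inside a bracket atom `-' denotes a charge and `:' introduces an atom class, so those characters must be left alone there.

```python
OMITTABLE = {"-", ":"}   # bond symbols that may be left out on input


def drop_omittable_bonds(smiles):
    """Remove explicit single and aromatic bond symbols outside brackets."""
    out = []
    depth = 0                 # > 0 while inside a [...] bracket atom
    for ch in smiles:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if depth == 0 and ch in OMITTABLE:
            continue          # skip an omittable bond symbol
        out.append(ch)
    return "".join(out)
```

With this sketch, `drop_omittable_bonds("C-C")` gives `"CC"` and `drop_omittable_bonds("[CH3]-[CH3]")` gives `"[CH3][CH3]"`, while double and triple bond symbols pass through unchanged.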