The simplified molecular input line entry specification (SMILES) is a specification for unambiguously describing the structure of chemical molecules using short ASCII strings. With a little practice, these strings can be written, read, and understood directly; several molecular software packages can read or generate SMILES strings.
Atoms are represented by the standard symbols of the chemical elements, written in square brackets, such as [Au] for gold. Charges and attached hydrogens are written inside the brackets; the hydroxide anion is [OH-]. For the common elements of organic chemistry (B, C, N, O, P, S, F, Cl, Br, and I) the brackets may be omitted, in which case the proper number of implicit hydrogen atoms is assumed; for instance, the SMILES for water is simply O and that for ethanol is CCO. Single bonds between adjacent atoms are implied; double bonds are written with = and triple bonds with #, so carbon dioxide is O=C=O and hydrogen cyanide is C#N. Cyclohexane is represented as C1CCCCC1: the two matching digits mark the two atoms that are bonded to each other to close the ring, here forming a ring of six carbons. Branches are described with parentheses, as in CCC(=O)O for propionic acid and FC(F)F, or equivalently C(F)(F)F, for fluoroform.
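The rules above can be illustrated with a minimal Python sketch that reads a restricted subset of SMILES and fills in the implicit hydrogens to give a molecular formula. It is an illustration under stated assumptions, not a complete SMILES parser: the function names parse_simple_smiles and molecular_formula and the VALENCE table are hypothetical, each element is given a single simplified valence, and bracket atoms, charges, aromatic (lower-case) atoms, and two-digit ring closures are not handled.

```python
from collections import Counter

# Simplified valences for bracket-free atoms. Assumption of this sketch:
# elements with several allowed valences (N, P, S) use only the lowest one.
VALENCE = {"B": 3, "C": 4, "N": 3, "O": 2, "P": 3, "S": 2,
           "F": 1, "Cl": 1, "Br": 1, "I": 1}
BOND_ORDER = {"-": 1, "=": 2, "#": 3}


def parse_simple_smiles(smiles):
    """Return (atoms, bonds) for a restricted SMILES subset.

    atoms is a list of element symbols; bonds is a list of
    (atom_index, atom_index, bond_order) tuples.
    """
    atoms, bonds = [], []
    branch_stack = []   # atoms to return to when a branch closes
    ring_open = {}      # ring digit -> (atom index, bond order at opening)
    prev = None         # index of the atom the next atom attaches to
    order = 1           # order of the next bond (single unless stated)
    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch in BOND_ORDER:
            order = BOND_ORDER[ch]
            i += 1
        elif ch == "(":
            branch_stack.append(prev)
            i += 1
        elif ch == ")":
            prev = branch_stack.pop()
            i += 1
        elif ch.isdigit():
            if ch in ring_open:
                # Second occurrence of the digit: bond the two marked atoms.
                other, open_order = ring_open.pop(ch)
                bonds.append((other, prev, max(open_order, order)))
            else:
                ring_open[ch] = (prev, order)
            order = 1
            i += 1
        else:
            # Prefer a two-letter symbol such as Cl or Br when one matches.
            symbol = smiles[i:i + 2] if smiles[i:i + 2] in VALENCE else ch
            if symbol not in VALENCE:
                raise ValueError(f"unsupported token {ch!r} in {smiles!r}")
            atoms.append(symbol)
            index = len(atoms) - 1
            if prev is not None:
                bonds.append((prev, index, order))
            prev = index
            order = 1
            i += len(symbol)
    return atoms, bonds


def molecular_formula(smiles):
    """Count atoms, adding the hydrogens implied by the unused valences."""
    atoms, bonds = parse_simple_smiles(smiles)
    used = [0] * len(atoms)
    for a, b, bond_order in bonds:
        used[a] += bond_order
        used[b] += bond_order
    hydrogens = 0
    for symbol, valence_used in zip(atoms, used):
        if valence_used > VALENCE[symbol]:
            raise ValueError(f"valence exceeded for {symbol} in {smiles!r}")
        hydrogens += VALENCE[symbol] - valence_used
    formula = dict(Counter(atoms))
    if hydrogens:
        formula["H"] = hydrogens
    return formula
```

With these assumptions, molecular_formula("CCO") returns {"C": 2, "O": 1, "H": 6} for ethanol, and the two spellings of fluoroform, FC(F)F and C(F)(F)F, give the same formula, since they describe the same atoms and bonds written in a different order.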
The SMILES specification was developed by David Weininger in the late 1980s.
External links:
- SMILES tutorial, http://www.daylight.com/dayhtml/smiles/smiles-intro.html
- Web-based application that renders SMILES strings as 2D figures, http://www.daylight.com/daycgi/depict
- Molecule editor applet that can create SMILES, http://www.molinspiration.com/jme/index.html