International Chemical Identifier

The IUPAC International Chemical Identifier (InChI), developed by IUPAC and NIST, is a digital equivalent of the IUPAC name for any particular covalent compound. Chemical structures are expressed in terms of five layers of information: connectivity, tautomeric, isotopic, stereochemical, and electronic.

The InChI algorithm converts input structural information into the InChI identifier in a three-step process: normalization (to remove redundant information), canonicalization (to generate a unique set of atom labels), and serialization (to give a string of characters).
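The three-step shape of that pipeline can be illustrated with a toy sketch. This is not the InChI algorithm: the graph representation, the function names, the brute-force canonical labelling, and the output format are all assumptions made for illustration, and the approach only works for a handful of atoms.

```python
from itertools import permutations

def normalize(atoms, bonds):
    """Remove redundant input: bond direction, duplicate bonds, self-loops."""
    unique = {tuple(sorted(b)) for b in bonds if b[0] != b[1]}
    return list(atoms), sorted(unique)

def canonicalize(atoms, bonds):
    """Choose the atom numbering that gives the smallest description.

    Tries every permutation, so it is only usable for tiny graphs.
    """
    n = len(atoms)
    best = None
    for perm in permutations(range(n)):
        # perm[old_index] is the new index of that atom
        new_atoms = [None] * n
        for old, new in enumerate(perm):
            new_atoms[new] = atoms[old]
        new_bonds = sorted(tuple(sorted((perm[a], perm[b]))) for a, b in bonds)
        candidate = (new_atoms, new_bonds)
        if best is None or candidate < best:
            best = candidate
    return best

def serialize(atoms, bonds):
    """Write the canonical labelling as a string (toy format, 1-based atoms)."""
    atom_part = "".join(atoms)
    bond_part = ",".join(f"{a + 1}-{b + 1}" for a, b in bonds)
    return f"{atom_part}/{bond_part}"

def toy_identifier(atoms, bonds):
    return serialize(*canonicalize(*normalize(atoms, bonds)))

# Two different descriptions of the same C-C-O skeleton
first = toy_identifier(["C", "C", "O"], [(0, 1), (1, 2)])
second = toy_identifier(["O", "C", "C"], [(0, 1), (1, 2), (2, 1)])
# A different skeleton, C-O-C
ether = toy_identifier(["C", "O", "C"], [(0, 1), (1, 2)])
```

Here `first` and `second` come out identical even though the inputs number the atoms differently and one repeats a bond, while `ether` differs. That is the property a canonical identifier is meant to provide.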





Layer types

There are six InChI layer types:

  1. Main layer
  2. Charge layer
  3. Stereochemical layer
  4. Isotopic layer
  5. Fixed-H layer
  6. Reconnected layer



Each layer can be split into sub-layers. For example, the main layer can be split up into three sub-layers:

  1. Chemical formula (no prefix)
  2. Atom connections (prefix: "c")
  3. Hydrogen atoms (prefix: "h")



Layers and sub-layers are both separated by the "/" delimiter. All layers and sub-layers (except for the chemical formula sub-layer of the main layer) start with a lower-case letter indicating the type of information held in that layer.
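The delimiter and prefix rules above are enough to split a string into its segments, as in the following minimal sketch. It assumes the string begins with "InChI=" followed by a version segment, as in the example strings quoted in this article. The function name `split_inchi` is hypothetical, and the sketch does not interpret or validate the contents of any segment.

```python
def split_inchi(inchi):
    """Split an InChI string into (version, formula, prefixed segments).

    Segments are returned as a list of (prefix_letter, body) pairs rather
    than a dict, because the same prefix letter may appear more than once.
    """
    prefix = "InChI="
    if not inchi.startswith(prefix):
        raise ValueError("string does not start with 'InChI='")
    parts = inchi[len(prefix):].split("/")
    version = parts[0]
    formula = ""
    segments = []
    for segment in parts[1:]:
        if not segment:
            continue  # ignore empty segments such as a trailing "/"
        if segment[0].islower():
            # A lower-case first letter names the kind of information held
            segments.append((segment[0], segment[1:]))
        else:
            # The chemical formula is the only segment without a prefix
            formula = segment
    return version, formula, segments

# Ethanol, written in the same "InChI=1/..." form used in this article
ethanol = split_inchi("InChI=1/C2H6O/c1-2-3/h3H,2H2,1H3")
```

For the ethanol string, the result holds version "1", formula "C2H6O", and the "c" and "h" sub-layers of the main layer.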


The only documentation for how to generate InChI strings is the InChI implementation itself, available from the SourceForge site. The details of each sub-layer format have not been documented well enough for others to implement independent parsers. The closest is BKChem, which claims a 98.5% success rate reading the InChI strings for the NCI data set. Although those format details can mostly be reverse-engineered, that is not enough for other software to generate InChI-like strings and have the InChI software do the canonicalization.

The IUPAC InChI software is designed to convert MDL molfiles into InChI strings; parsing InChI strings is of lesser importance. For example, some deliberately constructed InChI-like strings, when passed through the InChI algorithm, produce incorrect (non-canonical) results. (Details were reported on the InChI discussion list hosted at SourceForge, in several postings around early July 2007.)

Because a string is either the canonical InChI for a structure or not an InChI at all (there is no such thing as a "non-canonical" but InChI-like string), it is not possible, except for the most trivial molecules, to generate or validate an InChI string manually.

The InChI software was not designed to be robust against hostile use. Several denial-of-service and buffer overflow attacks exist against the InChI parser of the code base (for example, InChI=1/65536C65536 and InChI=1/C/q2987987*-1). Because there is only one widely used implementation, essentially all software accepting InChI strings from untrusted sources is potentially open to attack. At present these attacks are theoretical.
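One hedged way to reduce exposure is a cheap pre-check that rejects suspicious strings before they reach the parser. The sketch below is a heuristic filter only, not a fix for the underlying bugs. The function name `passes_precheck` and both limits are assumptions chosen for illustration; the limits may reject legitimate strings for very large molecules.

```python
import re

MAX_LENGTH = 2000  # assumed upper bound on the whole string
MAX_DIGITS = 4     # assumed upper bound on the length of any run of digits

def passes_precheck(inchi):
    """Return True if the string looks safe enough to hand to a real parser.

    Passing this check does NOT mean the string is a valid InChI.
    """
    if not inchi.startswith("InChI=") or len(inchi) > MAX_LENGTH:
        return False
    if not inchi.isascii() or not inchi.isprintable():
        return False
    # Oversized counts and charges are what the example attack strings rely on
    return all(len(run) <= MAX_DIGITS for run in re.findall(r"\d+", inchi))

ok = passes_precheck("InChI=1/C2H6O/c1-2-3/h3H,2H2,1H3")  # ethanol
bad_count = passes_precheck("InChI=1/65536C65536")
bad_charge = passes_precheck("InChI=1/C/q2987987*-1")
```

Both example attack strings are rejected because they contain runs of more than four digits, while an ordinary string such as the one for ethanol passes.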

The implementation can be used as a library, but it was not designed that way. (For example, function names have no special prefix, so namespace collisions are more likely.) The public release only supports the Microsoft and gcc compilers. IUPAC continues to develop and refine the code, but releases take a long time, so third-party patches to fix security holes and improve portability are not quickly folded back into the public code base.

See also

  • Molecule editor

External links

  • IUPAC InChI site
  • InChI.info - an unofficial InChI website featuring on-line converter from InChI to molecular drawings
  • Unofficial InChI FAQ
  • Generate InChI (service at the University of Cambridge, available interactively or via WSDL)
  • Search Google for molecules (generates InChI from interactive chemical and searches Google for any pages with embedded InChIs). Requires Javascript enabled on browser
  • Free ChemSketch Drawing Package Chemical Structure drawing package including output to InChI file format and conversion of InChI to structure
  • PubChem online molecule editor that supports SMILES/SMARTS and InChI
  • ChemSpider Services that allows generation of InChI and conversion of InChI to structure (also SMILES and generation of other properties)
  • MarvinSketch implementation to draw structures (or open other file formats) and output to InChI file format
  • Googling for InChIs: a presentation to the W3C
  • Presentation on InChIs from the Googleplex
  • InChIMatic Draw your molecule and Google will search for it
  • BKChem implements its own InChI parser and uses the IUPAC implementation to generate InChI strings

The Wikipedia article included on this page is licensed under the GFDL.