Guide to ChemFOnt


ChemFOnt (the Chemical Functional Ontology) is a hierarchical, OWL-compatible ontology describing the functions and actions of more than 319,000 biologically important chemicals. It is intended to bring the same rigor, standardization and formal structure to the terminology used in biochemistry, food chemistry and environmental chemistry as the Gene Ontology (GO) has brought to molecular biology. ChemFOnt is available as both a freely accessible, web-enabled database and a downloadable OWL file. Users may download and deploy ChemFOnt within their own chemical databases or integrate it into their own analytical software to generate machine-readable relationships. These relationships can be used to draw new inferences, enrich metabolomic data sets (metabolite set enrichment) or uncover new, non-obvious connections.
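As an illustration of the metabolite set enrichment use case mentioned above, the following minimal sketch applies a one-sided hypergeometric (over-representation) test to a set of chemicals. It is not ChemFOnt's own implementation. The function names, the chemical identifiers and the term-to-chemical mapping are hypothetical placeholders; in practice the mapping would be derived from the chemical-functional relationships in the downloaded ontology.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, hits):
    """Return P(X >= hits) for a hypergeometric draw.

    population: number of chemicals in the background set
    annotated:  background chemicals that carry the functional term
    sample:     number of chemicals in the query set
    hits:       query chemicals that carry the functional term
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


def enrich(term_to_chemicals, background, query):
    """Compute an over-representation p-value for every term.

    The query set is assumed to be a subset of the background set.
    """
    results = {}
    for term, members in term_to_chemicals.items():
        annotated = len(members & background)
        hits = len(members & query)
        results[term] = hypergeom_enrichment_p(
            len(background), annotated, len(query), hits
        )
    return results


# Hypothetical toy data: identifiers and term assignments are made up.
background = {f"CHEM{i:03d}" for i in range(20)}
term_to_chemicals = {
    "hypothetical term A": {"CHEM000", "CHEM001", "CHEM002"},
    "hypothetical term B": {"CHEM010", "CHEM011", "CHEM012", "CHEM013"},
}
query = {"CHEM000", "CHEM001", "CHEM002", "CHEM010"}

for term, p in sorted(enrich(term_to_chemicals, background, query).items()):
    print(f"{term}: p = {p:.4f}")
```

A real analysis would also correct the resulting p-values for multiple testing, since many functional terms are tested at once.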

ChemFOnt contains data on 341,627 chemicals, including 515,332 terms or definitions. The functional hierarchy for ChemFOnt consists of 4 functional “aspects”, 12 functional super-categories and a total of 173,705 functional terms. In addition, each chemical is classified into one of 4825 structure-based chemical classes. ChemFOnt currently contains 3.9 million protein-chemical relationships and ~10.3 million chemical-functional relationships. The long-term goal is for ChemFOnt to be adopted by databases and software tools used by the general chemistry community as well as the metabolomics, exposomics, metagenomics, genomics and proteomics communities.

These 12 major functional categories are subdivided into another 399 functional subcategories, which are further divided into thousands of other branches or leaf nodes, to a maximum depth of seven layers. In particular, Physiological Effect has 3637 defined categories; Disposition has 4186 defined categories; Process has 161,098 defined categories; and Role has 1037 defined categories. In total, ChemFOnt has 173,305 fully defined and fully connected functional categories, all of which are placed into a logically consistent hierarchy.
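The layered structure described above can be explored programmatically once the ontology is in hand. The sketch below assumes the hierarchy is expressed through rdfs:subClassOf links in RDF/XML, which is a common OWL serialization; the actual layout of the ChemFOnt OWL file may differ. The embedded snippet, its example.org IRIs and the two child class names are hypothetical stand-ins; only "Role" is taken from the aspect names listed above.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
BASE = "http://example.org/demo#"  # hypothetical namespace, not ChemFOnt's

# Hypothetical three-layer fragment standing in for the real OWL file.
SNIPPET = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:about="{BASE}Role"/>
  <owl:Class rdf:about="{BASE}ExampleSubcategory">
    <rdfs:subClassOf rdf:resource="{BASE}Role"/>
  </owl:Class>
  <owl:Class rdf:about="{BASE}ExampleLeafTerm">
    <rdfs:subClassOf rdf:resource="{BASE}ExampleSubcategory"/>
  </owl:Class>
</rdf:RDF>"""


def parent_map(owl_xml):
    """Map each named class IRI to the IRIs of its direct superclasses."""
    root = ET.fromstring(owl_xml)
    parents = {}
    for cls in root.iter(f"{{{OWL}}}Class"):
        iri = cls.get(f"{{{RDF}}}about")
        if iri is None:
            continue  # skip anonymous classes
        links = parents.setdefault(iri, [])
        for sub in cls.findall(f"{{{RDFS}}}subClassOf"):
            parent = sub.get(f"{{{RDF}}}resource")
            if parent is not None:
                links.append(parent)
    return parents


def depth(iri, parents, _seen=frozenset()):
    """Number of layers from a root class down to `iri` (a root has depth 1)."""
    if iri in _seen:
        raise ValueError(f"cycle detected at {iri}")
    ups = parents.get(iri, [])
    if not ups:
        return 1
    return 1 + max(depth(p, parents, _seen | {iri}) for p in ups)


parents = parent_map(SNIPPET)
for iri in parents:
    print(iri.removeprefix(BASE), depth(iri, parents))
```

For a file on disk, the snippet string would be replaced by the file's contents; a dedicated RDF or OWL library is likely a better choice for a full-size ontology.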

Each of these terminal or leaf nodes also contains a fully defined term (for a total of 173,305 definitions). Similarly, every chemical in ChemFOnt is defined (for a total of 341,627 definitions), and every chemical structure or structure class is defined via ClassyFire [5] (for a total of 4,825 definitions). The entire ontological hierarchy for ChemFOnt currently comprises a total of 515,332 fully defined terms, which are linked to ~10.3 million chemical-functional relationships.

All terminal leaf nodes in the ChemFOnt hierarchy contain a fact supported by a citable reference. ChemFOnt was constructed from many pre-existing ontologies and definitions (GO, FOBI, FoodOnt, ChemOnt, Disease Ontology), handwritten definitions (where no prior ontology or definitions existed) and structured facts and references acquired from hand-curated databases maintained in the Wishart lab (FooDB, HMDB, MiMeDB, DrugBank, MarkerDB, PathBank). To ensure uniformity, consistency and compliance, additions, corrections and improvements to ChemFOnt are made through a moderated process that follows strict standard operating protocols (SOPs) maintained by designated ChemFOnt editors.

Requests to join the ChemFOnt editorial team and suggestions from external users can be emailed to the ChemFOnt editors and will be handled on a first-come, first-served basis. This is the first release of ChemFOnt, and it is expected that annual or bi-annual updates will continue over many years. The use of text mining tools such as PolySearch2 is expected to facilitate the continued expansion and updating of ChemFOnt’s contents. The long-term goal is to fully annotate the nearly 1 million detectable compounds known to exist in the human and natural environment using the ChemFOnt structure.