An ontology is a way to formally express knowledge. A domain ontology describes what is known in a single area of expertise, like gene expression, protein folding, or French cooking. It can be very idiosyncratic (my view of the world of chocolate, for example), or it can fit in with other ontologies by linking to an upper ontology to form a worldview. My goal is not to go into details of implementation. Generally, an ontology takes the form of many statements, where each statement is a triple of the form:
subject verb object
For example, a simple chocolate ontology would look like:
dark chocolate contains cacao liquor
dark chocolate contains cacao butter
dark chocolate may contain sugar
dark chocolate may contain vanilla
chocolate may contain dark chocolate
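To make the triple form concrete, here is a minimal sketch of the chocolate statements held as plain Python data. The names `Triple`, `ASSERTIONS`, and `match` are made up for this illustration and do not come from any ontology library; real ontologies are usually written in dedicated languages and stored in specialized databases.

```python
from collections import namedtuple

# Hypothetical container for one statement: subject, verb, object.
Triple = namedtuple("Triple", ["subject", "verb", "object"])

# The chocolate statements from the text, one triple each.
ASSERTIONS = [
    Triple("dark chocolate", "contains", "cacao liquor"),
    Triple("dark chocolate", "contains", "cacao butter"),
    Triple("dark chocolate", "may contain", "sugar"),
    Triple("dark chocolate", "may contain", "vanilla"),
    Triple("chocolate", "may contain", "dark chocolate"),
]


def match(assertions, subject=None, verb=None, obj=None):
    """Return the triples that agree with every field that is not None."""
    return [
        t for t in assertions
        if (subject is None or t.subject == subject)
        and (verb is None or t.verb == verb)
        and (obj is None or t.object == obj)
    ]


# Ask a simple question: what must dark chocolate contain?
required = [
    t.object
    for t in match(ASSERTIONS, subject="dark chocolate", verb="contains")
]
print(required)
```

Leaving a field as `None` acts as a wildcard, so `match(ASSERTIONS, obj="dark chocolate")` finds everything that mentions dark chocolate as an ingredient.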
Each statement, or triple, is called an assertion. Putting enough assertions together creates a knowledge model. You can see that it’s possible to make false assertions and to create circular reasoning, which is why it’s difficult to make ontologies that are truly useful. Most ontologies today are still research projects. But new tools and techniques (plus a lot of hard work) are making ontologies that will soon be very useful.
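Circular reasoning is one of the problems a tool can check for mechanically. The sketch below is only an illustration: it assumes every verb expresses some kind of containment, treats the assertions as a directed graph, and looks for a loop with a depth-first search. The function name `find_cycle` and the deliberately false third assertion are invented for this example.

```python
def find_cycle(assertions):
    """Return a list of terms that form a loop, or None if there is none.

    Assumes each (subject, verb, object) triple means "subject contains
    object" in some sense, so a loop means a thing ends up containing itself.
    """
    graph = {}
    for subject, _verb, obj in assertions:
        graph.setdefault(subject, []).append(obj)

    visiting, done = set(), set()

    def visit(node, path):
        if node in visiting:
            # We came back to a term that is still on the current path.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        visiting.add(node)
        for nxt in graph.get(node, []):
            cycle = visit(nxt, path + [node])
            if cycle:
                return cycle
        visiting.discard(node)
        done.add(node)
        return None

    for start in list(graph):
        cycle = visit(start, [])
        if cycle:
            return cycle
    return None


sound = [
    ("dark chocolate", "contains", "cacao liquor"),
    ("chocolate", "may contain", "dark chocolate"),
]
# A deliberately false assertion, added only to create a loop.
flawed = sound + [("cacao liquor", "may contain", "chocolate")]

print(find_cycle(sound))
print(find_cycle(flawed))
```

A check like this only catches structural loops. It cannot tell whether an individual assertion is true, which is the harder part of building a useful ontology.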
Ontologies tend to get big, because they try to encapsulate as much knowledge about a domain as possible. Many ontologies now have more than 1 billion assertions (triples). Upper ontologies like Cyc are used to model most of human knowledge to some degree and can therefore serve as a “backbone” to a knowledge repository like Wikipedia.
Many companies and research institutes are building in-house ontologies to help solve a particular problem. Unfortunately, if these ontologies don’t gain wide acceptance, we’ll have the same problem with ontologies that we have with data formats: too many of them, and they don’t connect properly. We’re in a chicken-and-egg phase, where we want people to develop ontologies, but it makes more sense to share ontologies than to reinvent the wheel. It’s hard to build a good ontology, so it is better to cooperate and build a good one than to go it alone and get only 80% of the way there.
Here are some published ontologies:
- Cyc, a large Foundation Ontology for formal representation of the universe of discourse.
- Foundational Model of Anatomy, an ontology for human anatomy
- Gene Ontology for genomics
- Gellish English dictionary, an ontology that combines a dictionary and a taxonomy, with an upper ontology and a lower ontology that focuses on industrial and business applications in engineering, technology and procurement. See also the Gellish open source project on SourceForge.
- NIFSTD Ontologies from the Neuroscience Information Framework: a modular set of ontologies for the neuroscience domain. See http://neuinfo.org
- OBO Foundry, a suite of interoperable reference ontologies in biomedicine
- PRO, the Protein Ontology of the Protein Information Resource, Georgetown University.
- Protein Ontology for proteomics
- WordNet, a lexical reference system