
Ontolog Forum

Ontology Summit 2013: (Track A) "Intrinsic Aspects of Ontology Evaluation" Synthesis

Track Co-champions: Leo Obrst & Steve Ray

Mission Statement

Ontologies are built to solve problems, and ultimately an ontology's worth can be measured by how effectively it helps solve a particular problem. Nevertheless, as a designed artifact, an ontology has a number of intrinsic characteristics that can be measured and that indicate how "well-designed" it is. Examples include the proper use of the various relations found within an ontology, proper separation of concepts and facts (sometimes referred to as the class vs. instance distinction), proper handling of data type declarations, detection of semantics embedded in naming (sometimes called "optimistic naming"), detection of inconsistent range or domain constraints, sound class/subclass determination, the use of principles of ontological analysis, and many more. This Track aims to enumerate, characterize, and disseminate information on approaches, methodologies, and tools designed to identify such intrinsic characteristics, with the aim of raising the quality of ontologies in the future.

Scope

Dimensions of evaluation, methods, criteria, and properties to measure


Version 1 Synthesis

It is useful to partition the ontology evaluation space into three regions:

  1. Evaluation that does not depend at all on knowledge of the domain being modeled, but does draw upon mathematical and logical properties such as graph-theoretic connectivity, logical consistency, model-theoretic interpretation issues, inter-modularity mappings and preservations, etc. Structural properties such as branching factor, density, counts of ontology constructs, averages, and the like are intrinsic. Some meta-properties such as transitivity, symmetry, reflexivity, and equivalence may also figure in intrinsic notions.
  2. Evaluation where some understanding of the domain is needed in order to, for example, determine that a particular modeling construct is in alignment with the reality it is supposed to model. It may be that some meta-properties such as rigidity, identity, unity, etc., suggested by metaphysics, philosophical ontology, and philosophy of language are used to gauge the quality of the subclass/is-a taxonomic backbone of an ontology and other structural aspects of the ontology.
  3. Situations where the structure and design of the ontology are opaque to the tester, and the evaluation is determined by the correctness of answers to various interrogations of the model.

We have chosen to call Region 1 Intrinsic Evaluation and Region 3 Extrinsic Evaluation. The reason this partitioning is helpful is that purely intrinsic evaluation is highly amenable to automation (which is not to say that the other partitions are not eventually automatable, with more effort) and thus to scaling to many ontologies of any size. Examples of such tools include the Oops! evaluation web site at http://oeg-lia3.dia.fi.upm.es/oops/index-content.jsp, described by Maria Poveda Villalon [ see slides ], and the use of OntoQA to develop metrics for any ontology based on structural properties and instance populations, described by Samir Tartir [ see slides ]. By its very nature, the Oops! web tool cannot depend upon any domain knowledge; instead, it reports only on suspected improper uses of various OWL DL modeling practices.
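
As a rough illustration of the kind of domain-independent measurement these tools automate, the following sketch computes a few structural counts and ratios directly from an ontology's RDF serialization. This is not the Oops! or OntoQA code; it assumes the Python rdflib library, a placeholder file name, and a crude "relationship richness" ratio only loosely in the spirit of OntoQA.

    from rdflib import Graph
    from rdflib.namespace import RDF, RDFS, OWL

    g = Graph()
    g.parse("example.owl", format="xml")   # placeholder file; assumes an RDF/XML serialization

    classes = set(g.subjects(RDF.type, OWL.Class))
    object_properties = set(g.subjects(RDF.type, OWL.ObjectProperty))
    subclass_edges = list(g.subject_objects(RDFS.subClassOf))

    # Average branching factor: mean number of direct subclasses per superclass.
    children = {}
    for child, parent in subclass_edges:
        children.setdefault(parent, []).append(child)
    branching = sum(len(c) for c in children.values()) / len(children) if children else 0.0

    # Crude relationship richness: non-taxonomic relations over all relations.
    total = len(object_properties) + len(subclass_edges)
    richness = len(object_properties) / total if total else 0.0

    print(f"classes: {len(classes)}, object properties: {len(object_properties)}")
    print(f"subClassOf edges: {len(subclass_edges)}, avg branching: {branching:.2f}, richness: {richness:.2f}")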

Similarly, Region 3, purely extrinsic evaluation, implies no ability whatsoever to peer inside a model, and depends entirely on model behavior through interactions. In some cases, it may be appropriate that extrinsic evaluation criteria be considered as intrinsic criteria with additional relational arguments, e.g., precision with respect to a specific domain and specific requirements.

For the purposes of developing reasonable expectations of different evaluation approaches, the challenge mainly lies in clarifying the preponderance of work that falls within Region 2, where some domain knowledge is employed and combined with the ability to explore the ontology being evaluated. For example, the OQuaRE framework described by Astrid Duque Ramos [ see slides ] falls in this middle region, as it combines both context-dependent and context-independent metrics. Indeed, the OQuaRE team has stated their desire to better distinguish between these two categories of metrics. Another example is the OntoClean methodology (not reported on in Ontology Summit 2013, but generally well-known [1, 2]), which draws upon meta-domain knowledge, i.e., standard evaluative criteria originating from the practices of ontological analysis.
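
To make the meta-property idea concrete, here is a minimal, purely illustrative sketch of one OntoClean-style constraint check: a rigid class should not be subsumed by an anti-rigid class. The toy class names and rigidity tags are hypothetical, and the check is far simpler than the full OntoClean methodology.

    # One OntoClean-style constraint: a rigid class (+R) must not be subsumed
    # by an anti-rigid class (~R); e.g., Person should not sit under Student.
    RIGID, ANTI_RIGID = "+R", "~R"

    # Hand-tagged toy taxonomy (child -> parent); the tags are hypothetical.
    rigidity = {"Agent": RIGID, "Person": RIGID, "Student": ANTI_RIGID}
    subclass_of = {"Person": "Agent", "Student": "Person"}

    def rigidity_violations(subclass_of, rigidity):
        """Return (child, parent) pairs where a rigid class sits under an anti-rigid one."""
        return [(child, parent) for child, parent in subclass_of.items()
                if rigidity.get(child) == RIGID and rigidity.get(parent) == ANTI_RIGID]

    print(rigidity_violations(subclass_of, rigidity))            # [] -- no violations
    print(rigidity_violations({"Person": "Student"}, rigidity))  # [('Person', 'Student')]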

Of course, structural integrity and consistency are only two kinds of evaluation to be performed, even in a domain-context-free setting. Entailments, model theories and subtheories, interpretability and reducibility are just a few of the other properties that should be examined. It is the goal of this summit to define a framework in which these examinations can take place, as part of a larger goal of defining the discipline of ontological engineering.

[1] Guarino, Nicola and Chris Welty. 2002. Evaluating Ontological Decisions with OntoClean. Communications of the ACM 45(2): 61-65. New York: ACM Press. http://portal.acm.org/citation.cfm?doid=503124.503150.

[2] Guarino, Nicola and Chris Welty. 2004. An Overview of OntoClean. In Steffen Staab and Rudi Studer, eds., The Handbook on Ontologies. Pp. 151-172. Berlin: Springer-Verlag. http://www.loa-cnr.it/Papers/GuarinoWeltyOntoCleanv3.pdf.


Version 2 Synthesis

The scope of this document is the dimensions of ontology evaluation: the methods, criteria, and properties to measure in order to ensure better-quality ontologies.

Intrinsic Aspects of Ontology Evaluation

Ontologies are built to solve problems, and ultimately an ontology's worth can be measured by how effectively it helps solve a particular problem. Nevertheless, as a designed artifact, an ontology has a number of intrinsic characteristics that can be measured and that indicate how "well-designed" it is. Examples include the proper use of the various relations found within an ontology, proper separation of concepts and facts (sometimes referred to as the class vs. instance distinction), proper handling of data type declarations, detection of semantics embedded in naming (sometimes called "optimistic naming"), detection of inconsistent range or domain constraints, sound class/subclass determination, the use of principles of ontological analysis, and many more.

In the communiqué we focus on the evaluation of ontologies with respect to the following intrinsic aspects:

    • Is the ontology free of obvious inconsistencies and errors in modeling? (A minimal reasoner-based check is sketched just after this list.)
    • Is the ontology structurally sound? How do we gauge that?
    • Is the ontology appropriately modular?
    • Is the ontology designed and implemented according to sound principles of logical, semantic, and ontological analysis?
    • Which intrinsic aspects of ontology evaluation are of greater value to downstream extrinsic ontology evaluation?
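
As noted against the first question above, a minimal sketch of an automated consistency check is shown here. It assumes the Python owlready2 package (which bundles the HermiT reasoner and requires Java) and a placeholder file path, and it only surfaces unsatisfiable classes rather than performing a full evaluation.

    from owlready2 import get_ontology, sync_reasoner, default_world

    # Placeholder path; replace with the ontology under evaluation.
    onto = get_ontology("file:///path/to/example.owl").load()
    with onto:
        sync_reasoner()   # runs the bundled HermiT reasoner

    # Classes inferred to be unsatisfiable (equivalent to owl:Nothing);
    # a globally inconsistent ontology makes sync_reasoner() raise instead.
    print(list(default_world.inconsistent_classes()))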

Section C (2): What are the desirable characteristics of ontologies? Do they depend on the intended use of the ontology? And how are they measured over the life cycle of the ontology?

1) Partitioning the Ontology Evaluation Space:

a. Intrinsic Evaluation Aspects: Intrinsic ontology evaluation, from our perspective, consists of two parts: Structural Intrinsic Evaluation and Domain Intrinsic Evaluation.

Structural Intrinsic Evaluation: Ontology evaluation that does not depend at all on knowledge of the domain being modeled, but does draw upon mathematical and logical properties such as graph-theoretic connectivity, logical consistency, model-theoretic interpretation issues, inter-modularity mappings and preservations, etc. Structural metrics such as branching factor, density, counts of ontology constructs, averages, and the like are intrinsic. Some meta-properties such as adherence to implications of transitivity, symmetry, reflexivity, and equivalence assertions may also figure in intrinsic notions.

In general, structural intrinsic criteria are focused only on domain-independent notions, mostly structural, and those based on the knowledge representation language.

Some examples of tools and methodologies that address structural intrinsic ontology evaluation:

I. The Oops! evaluation web site at http://oeg-lia3.dia.fi.upm.es/oops/index-content.jsp, described by Maria Poveda Villalon

II. OntoQA, which develops metrics for any ontology based on structural properties and instance populations, described by Samir Tartir

III. Patrick Lambrix's debugging of is-a taxonomic structures, especially with mappings between ontologies
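
As a small illustration of the kind of graph-theoretic check described above (it does not reproduce any of the tools in items I-III), the following sketch flags classes that lie on rdfs:subClassOf cycles, which usually indicate an unintended collapse of classes into equivalence. The rdflib library and the file name are assumptions.

    from rdflib import Graph
    from rdflib.namespace import RDFS

    g = Graph()
    g.parse("example.owl", format="xml")   # placeholder file; assumes an RDF/XML serialization

    # Map each class to its directly asserted superclasses.
    parents = {}
    for child, parent in g.subject_objects(RDFS.subClassOf):
        parents.setdefault(child, set()).add(parent)

    def on_cycle(start):
        """Walk upward through subClassOf links and report whether start is reachable again."""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            for parent in parents.get(node, ()):
                if parent == start:
                    return True
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return False

    print("classes on subClassOf cycles:", [c for c in parents if on_cycle(c)])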

Domain Intrinsic Evaluation: Evaluation where some understanding of the domain is needed in order to, for example, determine that a particular modeling construct is in alignment with the reality it is supposed to model. It may be that some meta-properties such as rigidity, identity, unity, etc., suggested by metaphysics, philosophical ontology, semantics, and philosophy of language are used to gauge the quality of the axioms of the ontology, including, e.g., the subclass/is-a taxonomic backbone and other structural aspects of the ontology.

Most of the aspects in this category focus on ontological content methods, such as better ontological and semantic analysis, including meta-property analysis (such as that provided by methodologies like OntoClean).

Domain knowledge and better ways to represent that knowledge do come into play here, though divorced as much as possible from the application-specific domain requirements that belong more explicitly to extrinsic evaluation. At the extrinsic edge of domain intrinsic evaluation, the context-independent measures of structural intrinsic evaluation begin to blend into the highly context-dependent, application-oriented issues of extrinsic evaluation.

Some examples of tools and methodologies that address domain intrinsic ontology evaluation:

I. The OQuaRE framework, described by Astrid Duque Ramos

II. OntoClean (Guarino and Welty)

III. Maria Copeland: Ontology Evolution and Regression Testing

IV. Melissa Haendel: Ontology Utility from a biological viewpoint

V. Ed Barkmeyer: Recommended practices for mapping vocabularies (especially code lists) to ontologies.

b. Extrinsic Evaluation Aspects: Ontology evaluation where the structure and design of the ontology are opaque to the tester, and the evaluation is determined by the correctness of answers to various interrogations of the model. In general, the application requirements and domain requirements specifically needed by particular applications are the focus of extrinsic evaluation.

2) Evaluation Across the Ontology Lifecycle

Every criterion should be evaluated at each point in the ontology lifecycle, but some criteria are more important (necessary or sufficient) at some points than at others. In other words, a better ontology evaluation methodology might define necessary and sufficient criteria (and their measures), derived from both intrinsic and extrinsic aspects, that apply to different points in the ontology lifecycle.
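
Purely as a hypothetical illustration (the phase names, criteria, and gating rule below are invented for this sketch and are not prescribed by the track), such a methodology could capture the mapping of criteria to lifecycle points as a simple matrix with per-phase gates.

    # Hypothetical criteria-by-phase matrix; names and groupings are illustrative only.
    CRITERIA_BY_PHASE = {
        "requirements":   {"necessary": {"competency questions drafted"}},
        "design":         {"necessary": {"no subClassOf cycles", "consistent naming"},
                           "also measured": {"OntoClean meta-property tagging"}},
        "implementation": {"necessary": {"logical consistency"},
                           "also measured": {"structural metrics in agreed ranges"}},
        "deployment":     {"necessary": {"competency questions answered correctly"}},
    }

    def phase_gate(phase, criteria_met):
        """A phase passes its gate only if every criterion marked necessary holds."""
        return CRITERIA_BY_PHASE[phase]["necessary"] <= set(criteria_met)

    print(phase_gate("design", {"no subClassOf cycles", "consistent naming"}))  # True
    print(phase_gate("implementation", set()))                                  # False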

In addition, the determination of these necessary or sufficient criteria may be subject to constraints. For example, though logical consistency may be imposed as a necessary intrinsic criterion at the beginning of ontology development, it might subsequently be relaxed when it is determined that a different semantics will apply to how the ontology is interpreted within a given application (e.g., if the application-specific reasoning will not observe the full description logic Open World Assumption, but will instead interpret the ontology under a locally Closed World Assumption).
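
A minimal sketch of that distinction, using rdflib and a made-up single-triple example: a SPARQL FILTER NOT EXISTS clause reads the absence of an assertion as falsity (a locally closed world), whereas an OWL reasoner operating under the Open World Assumption would leave the same question open unless the negation is entailed.

    from rdflib import Graph, Namespace
    from rdflib.namespace import RDF

    EX = Namespace("http://example.org/")
    g = Graph()
    g.add((EX.alice, RDF.type, EX.Employee))
    # Nothing in the graph says whether alice is (or is not) a Manager.

    ask = """
    PREFIX ex: <http://example.org/>
    ASK { ex:alice a ex:Employee .
          FILTER NOT EXISTS { ex:alice a ex:Manager } }
    """
    # FILTER NOT EXISTS treats the missing Manager assertion as false
    # (a locally closed world), so the query answers True here.
    print(g.query(ask).askAnswer)

    # Under the Open World Assumption, an OWL reasoner would not conclude
    # that alice is not a Manager; the ontology simply does not say.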

--

maintained by the Track A champions: Steve Ray & Leo Obrst ... please do not edit