Is subsumption inference possible in traditional database systems (RDBMS)?

My question is about subsumption in databases (versus ontologies). My understanding is that if I have instances that belong to class B, then class A, which is the superclass of class B, will also have these instances.
Ontologies provide built-in subsumption inference through various reasoners (e.g. RDFS++, Pellet, etc.). I would like to know if it is possible to achieve a similar task in database systems. If so, how flexible or easy is it to implement? Are there any advantages of a database implementation over the ontology-based approach?

To clarify, an ontology doesn't perform reasoning. An ontology is the set of logical axioms that a reasoner uses to answer queries with the inferred (and explicit) information in your knowledge base.
There are a number of existing open-source and commercial systems that perform reasoning and could be considered databases, as opposed to something that is purely a reasoner, like Pellet/FaCT++/HermiT. Examples include AllegroGraph, GraphDB, and Stardog. So obviously, yes, it's possible. There are a couple of different ways to approach the implementation, so you have some flexibility in how to design the system based on your preferred use case.
It's not hard to create a toy reasoner that will parse an ontology and do some basic reasoning like subsumption. But if you want to support (correctly) one of the OWL fragments, and you want to do it at scale, it's not easy.
Go look at how Jena and its reasoners are implemented; that will be enough to get you going.
Sesame also has an RDFS reasoner, so that would be another source for you to review.
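To make the database side of the question concrete: if the subclass hierarchy is stored as an explicit table, subsumption over it can be computed in a plain RDBMS with a recursive query. A minimal sketch using SQLite from Python; the table layout and class names are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE subclass_of (sub TEXT, sup TEXT);   -- direct subclass axioms
    CREATE TABLE instance_of (ind TEXT, cls TEXT);   -- asserted class memberships
    INSERT INTO subclass_of VALUES ('Dog', 'Mammal'), ('Mammal', 'Animal');
    INSERT INTO instance_of VALUES ('rex', 'Dog');
""")

# All classes 'rex' belongs to, including inferred superclasses, via a
# recursive common table expression (transitive closure of subclass_of).
rows = conn.execute("""
    WITH RECURSIVE types(cls) AS (
        SELECT cls FROM instance_of WHERE ind = 'rex'
        UNION
        SELECT s.sup FROM subclass_of s JOIN types t ON s.sub = t.cls
    )
    SELECT cls FROM types
""").fetchall()

print(sorted(c for (c,) in rows))  # ['Animal', 'Dog', 'Mammal']
```

The recursive CTE performs the same subsumption inference an RDFS reasoner does for rdfs:subClassOf; the trade-off is that the query, not a generic reasoner, now encodes the semantics, and each new inference rule needs its own query or view.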

Related

Reusing multiple ontologies (which are partially sourced from a more specific domain)

I need to develop an ontology in computational biochemistry and molecular dynamics. For this, I have collected the terms that are going to be used and attempted to reuse ontologies by searching for the terms on an ontology search service such as EBI-OLS. Some terms are very relevant to import/reuse; however, the ontology itself is intended for a more specific domain, such as the National Cancer Institute Thesaurus (which has 171,081 classes). Besides that, there are about 10 other source ontologies that I could potentially reuse. Some of them are also huge, such as the EDAM ontology.
1. Is it okay to reuse an ontology that is seemingly intended for a more specific domain, such as cancer? We will use the ontology for more generic use in the life sciences, not only the cancer-related domain.
2. Is there any general rule of thumb on which of those 10-ish ontologies are suitable for reuse? (E.g., the paper describing the ontology should be cited by at least n papers, or it should be compatible with the Open Biological and Biomedical Ontology (OBO) Foundry principles, or it should be backed by a well-known institution and still be maintained.)
3. How do you decide the sweet spot for the number of ontology sources one can base the ontology on? While we can reuse as many available terms as we can (from many ontology sources, especially in the life science domain), there is a concern that this would make the resulting knowledge graph representation much more complex.
Thank you for your answers.
Answers to your questions:
I would say yes, assuming the terms that you intend to use are indeed a match for your use case. I.e., if there is a term that you are interested in using, but, say, its definition or its synonyms do not match your needs, then I would probably consider not using that term.
Yes, there are. I really recommend reading the paper Ten Simple Rules for Selecting a Bio-ontology and the OBO Tutorial.
Try to keep the number of ontologies you use as small as is sensible (that is, the smallest set of ontologies that is well aligned with the needs of your use case). The reason is that you will want to engage with the designers of the ontologies you use to extend and amend those ontologies for your use case. The more ontologies you use, the higher the chance that you will need to communicate with a larger community to effect change for your use case. This may increase development times. However, using an ontology that is not well aligned with your use case will also increase communication and timelines. Hence the advice to keep the number of ontologies as small as is sensible.
As for your concern regarding importing large ontologies into your ontology: the way this is dealt with is to extract only the terms you are interested in using ROBOT, and then to import the extracted ontology into your own ontology.
In general, I would really strongly recommend reaching out to the OBO Foundry. They have been developing life-science-related ontologies for a number of years. Working with them, you are likely to avoid many of the typical problems people run into when they start designing ontologies.
I have also written up some general guidelines from my perspective with regard to choosing biological ontologies here.

Selecting an ontology methodology for developing an ontology

My question is: is it mandatory to follow an ontology methodology while developing an ontology?
As per my understanding:
You can develop an ontology without following any specific methodology.
You can strictly follow an ontology methodology according to the need/context of your ontology/project.
Instead of strictly following one, you can partially or loosely follow an ontology methodology according to the need/context of your ontology/project.
You can even merge steps from multiple methodologies according to the need/context of your ontology/project.
One cannot say that one methodology (e.g. the NeOn Methodology) is better than another; you can select any methodology according to your need.
Ontology development guidelines and methodologies are the same thing.
Please comment/guide me point by point. Thanks.
Here are some examples of scientific papers about ontology construction. These are not "mandatory", but they are good guidelines indeed.
Kolas, D., Dean, M., & Hebeler, J. (2006). Geospatial Semantic Web: Architecture of Ontologies (pp. 1‑10). IEEE. https://doi.org/10.1109/AERO.2006.1656068
Denaux, R., Dolbear, C., Hart, G., Dimitrova, V., & Cohn, A. G. (2011). Supporting domain experts to construct conceptual ontologies: A holistic approach. Web Semantics: Science, Services and Agents on the World Wide Web, 9, 113‑127. https://doi.org/10.1016/j.websem.2011.02.001
Tan, H., Adlemo, A., Tarasov, V., & Johansson, M. E. (2017). Evaluation of an Application Ontology. In CEUR Workshop Proceedings (Vol. 2050). Bolzano, Italy.
There is nothing mandatory about how to develop an ontology. However, people have found pitfalls and repeating patterns, hence some methodologies have been developed.
Which one is best for your objectives is very dependent on your objectives. There can be no absolute, general rule.
It depends on the user which ontology methodology they want to implement. As long as there is no duplication and data quality is maintained, it is fine to produce and implement.
As the general motivation for using ontologies is to eliminate differences in the meanings of terminology among different stakeholders, it is good practice to follow proven ontology development methodologies while developing the ontology, in order to eliminate ontological inadequacies. One way to validate the ontological adequacy of taxonomic relationships is to apply the OntoClean methodology, which was one of the first attempts to formalize notions of ontological analysis for information systems. It is based on general ontological notions drawn from philosophy, like essence, identity, unity, rigidity, and dependence. These notions are used to characterize relevant aspects of the intended meaning of the properties, classes, and relations that make up an ontology, and to impose constraints on the taxonomic structure of the ontology.
A common mistake while developing ontologies is the misuse of the IS_A relation, commonly known as the IS_A overloading problem, which can be detected and prevented by applying the OntoClean method.
So yes, it's good practice to follow ontology-validating methodologies such as OntoClean for validating the ontological adequacy of taxonomic relationships.

What can an ontology do that a relational database cannot?

I am new to ontologies. After some study, I still do not know what the advantage of an ontology is in an application.
I already know that an ontology can provide a more meaningful querying interface than a database, and that an ontology can use a reasoner to find hidden information and get better results.
But by building a boolean table in the database to represent a new concept for each instance, or by using a simple if-else rule engine, we can get the same result as with an ontology, with better performance.
So, what exactly is the most important reason for using an ontology in an application?
Please refer to Databases vs Ontologies by Ian Horrocks
In short:
Databases make the closed-world assumption; ontologies make the open-world assumption.
In databases, each individual has a single unique name, but in ontologies individuals might have more than one name.
You can infer implicit information from ontologies; in databases you can't.
The schema of an ontology can be large and complex, whereas databases have simpler and smaller schemas. In other words, the focus on formal semantics is much stronger in ontologies than in databases, because the aim of ontologies is to represent meaning rather than data. Please refer to Ontologies and DB Schema: What's the Difference by Mike Uschold.
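To make the point about names concrete: under OWL semantics two names can denote the same individual (owl:sameAs), and a reasoner must merge what is asserted about each name, something a unique-name database will not do on its own. A minimal pure-Python sketch of that merging, with all names and facts invented for the example:

```python
# Facts as (subject, predicate, object) triples; "sameAs" links two
# names for the same individual. All names here are invented.
facts = [
    ("superman", "livesIn", "metropolis"),
    ("clark_kent", "worksAt", "daily_planet"),
    ("superman", "sameAs", "clark_kent"),
]

# Union-find over names connected by sameAs.
parent = {}

def find(x):
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x

def union(a, b):
    parent[find(a)] = find(b)

for s, p, o in facts:
    if p == "sameAs":
        union(s, o)

# Rewrite every non-sameAs fact onto a canonical representative per individual.
merged = {(find(s), p, find(o)) for s, p, o in facts if p != "sameAs"}

canon = find("superman")
# Everything known about "superman" now includes what was stated about "clark_kent".
print(sorted(p for s, p, o in merged if s == canon))  # ['livesIn', 'worksAt']
```

A SQL query for "everything about superman" over a plain facts table would return only the livesIn row; the sameAs merge is what reasoners add on top.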

How do you normalize an ontology the way you would normalize a relational database?

I know how to normalize a relational database. There are methodologies for getting to fifth normal form, and I understand the reasons why you may want to back off to fourth normal form or otherwise.
What is the equivalent method for an ontology which describes a graph?
I am not aware of any mechanism for ontologies that is directly comparable to database normalization. The closest match I can think of are ontology design patterns. However, they are much less strict. You can roughly compare them to software design patterns. You can check
http://ontologydesignpatterns.org/wiki/Main_Page
or have a look at some papers, e.g., about the M3O (http://dl.acm.org/citation.cfm?id=1772775), the Event Model F, or works by Aldo Gangemi, among many others. Ontology design patterns also give you certain properties, but these mainly depend on the patterns you use, and which ones are appropriate depends on the modeling task you are trying to achieve.
Both design patterns and database normalization try to achieve certain properties. I guess the difference is that design patterns are less strict. The achieved properties often depend on the patterns you use, the domain, the purpose, etc. So they are not really as generic as the normal forms.

Migrating from direct DB access to an ontology: does it make sense?

I have seen people working on an ontology. A few benefits of ontologies I know of are reuse and the integration of multiple systems, i.e., if we have several DBs and we want to integrate them with each other, we will run into many problems because of their different formats. This problem can be solved by storing them in an ontology.
We have many systems that are based on an RDBMS. So, to make a system intelligent (adding semantics), an ontology-driven system is used, i.e., migrating from the DB to an ontology.
But for this purpose, we have to redesign the system (making an ontology according to the system specifications). Instead of reinventing the wheel, can't I make minor changes to the DB that may make the system intelligent?
Let's say I have three systems A, B and C that are based on an RDBMS. I can't integrate them except by using an ontology. For that I have to make an ontology for each. Instead of designing ontologies, why shouldn't I go for designing the DBs (keeping the three systems in mind)?
I suggest you think about D2R, which lets you publish the content of a relational database as RDF and query it as a graph without migrating the data.
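For a rough idea of what a D2R-style mapping does (this is only an illustration of the principle, not D2RQ's actual mapping language or API): each row becomes a resource and each column a property, so the relational data can be viewed as triples in place. The table, column, and namespace names below are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT);
    INSERT INTO employee VALUES (1, 'Alice', 'R&D'), (2, 'Bob', 'Sales');
""")

BASE = "http://example.org/"  # invented namespace for the example

def table_to_triples(conn, table, key):
    """Map each row to triples: (row URI, column URI, cell value)."""
    cols = [c[1] for c in conn.execute(f"PRAGMA table_info({table})")]
    triples = []
    for row in conn.execute(f"SELECT * FROM {table} ORDER BY {key}"):
        subject = f"{BASE}{table}/{row[cols.index(key)]}"  # primary key -> URI
        for col, val in zip(cols, row):
            if col != key:
                triples.append((subject, f"{BASE}{table}#{col}", val))
    return triples

triples = table_to_triples(conn, "employee", "id")
print(triples[0])
# ('http://example.org/employee/1', 'http://example.org/employee#name', 'Alice')
```

The real D2R/D2RQ systems do this with a declarative mapping and answer SPARQL queries against the live database, so the existing RDBMS stays the single source of truth.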

Resources