Generate individuals from OWL class definition - owl

I'm fairly new to ontologies and have the following situation:
Given a class definition, I want to automatically generate individuals based on all possible combinations of a given restriction.
For example:
Let's say a "Pizza" class has the property "hasTopping" which is supposed to be linked to an individual of class "Topping". I want to generate an individual of the class Pizza for each individual existing for a Topping. If there are two Topping individuals, Tomato and Cheese, I want to create one Pizza individual with "hasTopping Tomato" and one with "hasTopping Cheese".
Is there any general way to generate individuals in ontologies like this? (As an alternative to implement it myself.)
Is this "violating" the intent/purpose of ontologies in general? Would this usually be handled in a different way? (I'm not completely familiar with ontologies yet.)

There's no standard method to do this, so I think you'll have to implement it yourself. The Lehigh University Benchmark (LUBM) does something similar, so it might provide you with some ideas: http://swat.cse.lehigh.edu/projects/lubm/
I don't think this violates the idea behind ontologies at all - seems quite straightforward. There is no best practice for it, so however you choose to implement it will probably be adequate.
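For illustration, here is a minimal sketch of the "implement it yourself" route in plain Python. It emits one Turtle statement per Topping individual. The base IRI and the `Pizza_<topping>` naming scheme are assumptions made up for this example, not part of any standard or of LUBM.

```python
from typing import Iterable, List

# Hypothetical namespace, chosen only for this example.
BASE_IRI = "http://example.org/pizza#"


def generate_pizzas(toppings: Iterable[str], base_iri: str = BASE_IRI) -> List[str]:
    """Return one Turtle statement per topping, each declaring a new Pizza individual."""
    statements = []
    for topping in toppings:
        pizza = f"<{base_iri}Pizza_{topping}>"
        statements.append(
            f"{pizza} a <{base_iri}Pizza> ; "
            f"<{base_iri}hasTopping> <{base_iri}{topping}> ."
        )
    return statements


pizzas = generate_pizzas(["Tomato", "Cheese"])
for line in pizzas:
    print(line)
```

In a real project the list of toppings would come from querying the ontology for the individuals of the Topping class, and the output would be written back through an RDF or OWL library instead of string formatting.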

Related

Ontologies only built with classes and not class instances

I am wondering why public biomedical ontologies are often organized in such a way that there are no class instances and only classes? I understand it in a way that all instances are classes, but I do not understand what is the advantage or purpose of such modelling? Those classes have also only annotation properties. For example NCIT ontology: https://bioportal.bioontology.org/ontologies/NCIT/?p=summary.
I would appreciate it if someone could explain the purpose of such a model, and whether there is an advantage to a model where classes have class instances. I am definitely not an expert in the field, and I have only worked on modelling 'standard' ontologies with classes and their instances.
TLDR
The reason for preferring classes over individuals (or instances) is that classes allow for sophisticated reasoning which is used to infer classification hierarchies.
The longer answer
The semantics of OWL allows you to make the following types of statements:
(1) ClassExpression1 is a subclass of ClassExpression2
(2) PropertyExpression1 is a subproperty of PropertyExpression2
(3) Individual c1 is an instance of Class1
(4) Individual x is related to individual y via property1
Of these 4 options, (1) allows for by far the most sophistication. Intuitively, it comes down to how much each of these lets you express, and how much a reasoner can infer from those statements. To get an intuitive feel for this, using the OWL Direct Semantics, we can see what ClassExpression1 and ClassExpression2 can be substituted with:
There is no way that this expressivity can be achieved using individuals.
Individuals vs Classes
In your question you say that all instances (individuals) are classes. This is not exactly true. Rather, classes consist of instances, or put the other way round, instances belong to classes. From a mathematical perspective, classes are sets and individuals are members of a set.
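The set view can be sketched in a few lines of Python. This is only an analogy under closed, finite sets (OWL itself makes no such assumption), and the class and individual names are made up for the example.

```python
# Classes modelled as sets, individuals as their members (illustrative names).
topping = {"Tomato", "Cheese", "Ham"}
vegetarian_topping = {"Tomato", "Cheese"}


def is_instance(individual: str, cls: set) -> bool:
    """An individual is an instance of a class when it is a member of the set."""
    return individual in cls


def is_subclass(sub: set, sup: set) -> bool:
    """A class is a subclass of another when every member of it is also in the other."""
    return sub <= sup


print(is_instance("Tomato", topping))            # membership: individual vs. class
print(is_subclass(vegetarian_topping, topping))  # subset: class vs. class
```

Note that "Tomato" is a member of `topping`, not a subset of it, which is the distinction between an individual and a class.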
Annotations in biomedical ontologies
A substantial share (80%-90%) of these ontologies consists of annotations. However, they do also have lots of logical axioms. You can see this, for example, when you look at http://purl.obolibrary.org/obo/NCIT_C12392: on the right-hand side, if you scroll down to the bottom, you will see the axioms listed:

Is it possible to filter the class display in Protégé?

I am designing a new ontology, which references the (very large!) Ontology of Units of Measure (om-2): http://www.ontology-of-units-of-measure.org/page/om-2
This makes it hard to work with Protégé because it seems that I am left with one of two bad alternatives:
The small number of classes in my ontology are swamped in the display by om-2 classes (displaying with prefix helps, but only a little).
I don't include om-2 in the Protégé project and just refer to classes from there. As far as I can tell, this hampers Protégé's ability to work with a DL reasoner.
Is there some way to tell Protégé to filter the display and hide the om-2 classes? Being able to toggle this would be a huge help.

Can I use OWL API to enforce specific subject-predicate-object relationships?

I am working on a project using RDF data and I am thinking about implementing a data cleanup method which will run against an RDF triples dataset and flag triples which do not match a certain pattern, based on a custom ontology.
For example, I would like to enforce that class http://myontology/A must denote http://myontology/B using the predicate http://myontology/denotes. Any instance of Class A which does not denote an instance of Class B should be flagged.
I am wondering if a tool such as the OWLReasoner from OWL-API would have the capability to accomplish something like this, if I designed a custom axiom for the Reasoner. I have reviewed the documentation here: http://owlcs.github.io/owlapi/apidocs_4/org/semanticweb/owlapi/reasoner/OWLReasoner.html
It seems to me that the methods available with the Reasoner might not be up for the purpose which I would like to use them for, but I'm wondering if anyone has experience using OWL-API for this purpose, or knows another tool which could do the trick.
Generally speaking, OWL reasoning is not well suited to finding information that's missing from the input and flagging it. For example, if you create a class asserting that an instance of A has exactly one denotes relation to an instance of B, and you have an instance of A that does not, then under the Open World Assumption the reasoner will just assume that the missing statement is not available, not that you're in violation.
It would be possible to detect incorrect denote uses - if, instead of relating to an instance of B, the relation was to an instance of a class disjoint with B. But this seems a different use case than the one you're after.
You can implement code with the OWL API to do this check, but it likely wouldn't benefit from the ability to reason, and given that you're working at the RDF level I'd think an API like Apache Jena might actually work better for you (you won't need to worry if your input file is not OWL compliant, for example).
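As a rough illustration of such a check done directly at the triple level, here is a sketch in plain Python that treats the data as closed-world. The triples are plain tuples and the `ex:` identifiers are invented for the example; with Jena or another RDF library the same logic would run over the parsed graph.

```python
RDF_TYPE = "rdf:type"
A = "http://myontology/A"
B = "http://myontology/B"
DENOTES = "http://myontology/denotes"


def flag_violations(triples):
    """Return instances of A that do not denote at least one instance of B."""
    triples = set(triples)

    def instances_of(cls):
        return {s for s, p, o in triples if p == RDF_TYPE and o == cls}

    b_instances = instances_of(B)
    flagged = []
    for a in sorted(instances_of(A)):
        targets = {o for s, p, o in triples if s == a and p == DENOTES}
        # Closed-world check: a missing statement counts as a violation.
        if not targets & b_instances:
            flagged.append(a)
    return flagged


# Made-up sample data: a1 is fine, a2 denotes nothing, a3 denotes a non-B.
triples = [
    ("ex:a1", RDF_TYPE, A),
    ("ex:b1", RDF_TYPE, B),
    ("ex:a1", DENOTES, "ex:b1"),
    ("ex:a2", RDF_TYPE, A),
    ("ex:a3", RDF_TYPE, A),
    ("ex:c1", RDF_TYPE, "http://myontology/C"),
    ("ex:a3", DENOTES, "ex:c1"),
]
flagged = flag_violations(triples)
print(flagged)
```

This only looks at asserted types. If class membership has to be inferred first, you would run a reasoner beforehand and check the materialised triples.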

Describing "inclusion" in ontologies using Protege

I am using Protege 4.3.0 to describe remediation activities in oil-damaged areas.
I am a complete newbie at ontologies and followed Matthew Horridge's tutorial.
He expresses the fact that every Pizza has some Toppings through the property hasTopping, that it has one base through hasBase, etc.
I was wondering what would have been the drawbacks of creating a general property "has" and expressing the fact with
Pizza has some Topping
Pizza has max 1 Base
and so on ...
Any consideration?
Adriano
The general rule in creating ontologies is to be as specific as possible. Based on the Pizza ontology example and the two main object properties:
hasTopping
hasBase
If you only define "has" instead of the two, it means that you can say:
Pizza has max 1 PizzaBase
Pizza has min 3 PizzaTopping
Imagine that you have FrenchPizza that is equivalent to:
has some (TomatoTopping and ThinBase)
This will result in an inconsistency, since PizzaBase and PizzaTopping are disjoint and it cannot distinguish between the property relating them. However, if you had the original two properties, this would not have occurred.
Hope this helps.
Using has would be fine in many situations. As opposed to what Conquering Scientist said, I see no reason to be as specific as possible. In fact, if such were the case, the Pizza ontology would not be specific enough. However, using simply the verb has as the name of the property would probably be prone to mistakes. But you could have a property hasIngredient that is more general than hasTopping and hasBase.
One advantage of defining hasTopping is that you can set its domain and range independently from hasBase, so that:
<p> <hasTopping> <t> .
entails:
<t> a <Topping> .
while:
<p> <has> <t> .
does not say anything about <t>.
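The range entailment above can be mimicked with a tiny sketch in plain Python. It only applies the RDFS range rule (if a property has range C, every object of that property is a C), and the property-to-range table is an assumption written out for the example.

```python
def infer_types(triples, ranges):
    """Apply the RDFS range rule: (s, p, o) with range(p) = C entails (o, 'a', C)."""
    inferred = set()
    for s, p, o in triples:
        if p in ranges:
            inferred.add((o, "a", ranges[p]))
    return inferred


# Only hasTopping declares a range; the generic 'has' property declares none.
ranges = {"hasTopping": "Topping"}

with_specific = infer_types({("p", "hasTopping", "t")}, ranges)
with_generic = infer_types({("p", "has", "t")}, ranges)

print(with_specific)
print(with_generic)
```

With the specific property, the type of `t` comes for free. With the generic one, nothing can be concluded about it.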
In any case, you must be aware that the Pizza tutorial is not a tutorial for good ontology modelling. It merely presents the features of Protégé 4. If I were selling pizzas and wanted to organise the information with SemWeb technologies, I would never use such an ontology.

Is it a bad practice to have multiple classes in the same file?

I used to have one class for one file. For example car.cs has the class car. But as I program more classes, I would like to add them to the same file. For example car.cs has the class car and the door class, etc.
My question is good for Java, C#, PHP or any other programming language. Should I try not having multiple classes in the same file or is it ok?
I think you should try to keep your code to 1 class per file.
I suggest this because it will be easier to find your class later. Also, it will work better with your source control system (if a file changes, then you know that a particular class has changed).
The only time I think it's correct to use more than one class per file is when you are using inner classes... but inner classes are inside another class, and thus can be left inside the same file. The inner classes' roles are strongly related to the outer classes, so placing them in the same file is fine.
In Java, one public class per file is the way the language works. A group of Java files can be collected into a package.
In Python, however, files are "modules", and typically have a number of closely related classes. A Python package is a directory, just like a Java package.
This gives Python an extra level of grouping between class and package.
There is no one right answer that is language-agnostic. It varies with the language.
One class per file is a good rule, but it's appropriate to make some exceptions. For instance, if I'm working in a project where most classes have associated collection types, often I'll keep the class and its collection in the same file, e.g.:
public class Customer { /* whatever */ }
public class CustomerCollection : List<Customer> { /* whatever */ }
The best rule of thumb is to keep one class per file except when that starts to make things harder rather than easier. Since Visual Studio's Find in Files is so effective, you probably won't have to spend much time looking through the file structure anyway.
No, I don't think it's an entirely bad practice. What I mean by that is that in general it's best to have a separate file per class, but there are definitely good exception cases where it's better to have a bunch of classes in one file. A good example of this is a group of Exception classes: if you have a few dozen of these for a given group, does it really make sense to have a separate file for each two-liner class? I would argue not. In this case having a group of exceptions in one file is much less cumbersome and simpler, IMHO.
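To make the exception-group case concrete, here is a sketch of what such a single file could look like in Python. The `StorageError` hierarchy and the `load` helper are invented purely for illustration.

```python
"""errors module sketch: a family of tiny, closely related exceptions in one file."""


class StorageError(Exception):
    """Base class for every error raised by this hypothetical storage layer."""


class NotFoundError(StorageError):
    """Raised when the requested key does not exist."""


class PermissionDeniedError(StorageError):
    """Raised when the caller may not read the requested key."""


def load(key, store, readable=None):
    """Return store[key], raising one of the grouped exceptions on failure."""
    if key not in store:
        raise NotFoundError(key)
    if readable is not None and key not in readable:
        raise PermissionDeniedError(key)
    return store[key]


try:
    load("missing", {"present": 1})
except StorageError as exc:
    print(type(exc).__name__)
```

Each class is a one- or two-liner, and a reader sees the whole hierarchy at a glance, which is the argument being made for keeping them together.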
I've found that whenever I try to combine multiple types into a single file, I always end up going back and separating them, simply because it makes them easier to find. Whenever I combine, there is always ultimately a moment where I'm trying to figure out where I defined type x.
So now, my personal rule is that each individual type (except maybe for child classes, by which I mean a class inside a class, not an inherited class) gets its own file.
Since your IDE Provides you with a "Navigate to" functionality and you have some control over namespacing within your classes then the below benefits of having multiple classes within the same file are quite worth it for me.
Parent - Child Classes
In many cases I find it quite helpful to have inherited classes within their base class file.
It's quite easy then to see which properties and methods your child class inherits and the file provides a faster overview of the overall functionality.
Public: Small - Helper - DTO Classes
When you need several plain and small classes for a specific functionality, I find it quite redundant to have a file with all the references and includes for just a 4-8 line class.
Code navigation is also easier when you just scroll through one file instead of switching between 10 files. It's also easier to refactor when you have to edit just one reference instead of 10.
Overall breaking the Iron rule of 1 class per file provides some extra freedom to organize your code.
What happens then really depends on your IDE, language, team communication and organizing skills.
But if you want that freedom why sacrifice it for an iron rule?
The rule I always go by is to have one main class in a file with the same name. I may or may not include helper classes in that file depending on how tightly they're coupled with the file's main class. Are the support classes standalone, or are they useful on their own? For example, if a method in a class needs a special comparison for sorting some objects, it doesn't bother me a bit to bundle the comparison functor class into the same file as the method that uses it. I wouldn't expect to use it elsewhere and it doesn't make sense for it to be on its own.
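A sketch of that comparison-helper case in Python might look like the following. `Task`, `TaskQueue` and the private `_ByPriorityThenName` helper are hypothetical names for the example; the point is only that the helper lives next to the one class that uses it.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Task:
    name: str
    priority: int


class _ByPriorityThenName:
    """Sort key used only by TaskQueue, so it stays in the same file."""

    def __call__(self, task: Task) -> Tuple[int, str]:
        # Higher priority first, ties broken alphabetically by name.
        return (-task.priority, task.name)


class TaskQueue:
    """The file's main class; the helper above has no use outside of it."""

    def __init__(self, tasks: List[Task]):
        self._tasks = list(tasks)

    def ordered(self) -> List[Task]:
        return sorted(self._tasks, key=_ByPriorityThenName())


queue = TaskQueue([Task("write", 1), Task("deploy", 3), Task("build", 3)])
names = [task.name for task in queue.ordered()]
print(names)
```

The leading underscore signals that the helper is an implementation detail of this file, not something other modules are expected to import.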
If you are working on a team, keeping classes in separate files makes it easier to control the source and reduces the chance of conflicts (multiple developers changing the same file at the same time). I think it makes it easier to find the code you are looking for as well.
It can be bad from the perspective of future development and maintainability. It is much easier to remember where the Car class is if you have a Car.cs file. Where would you look for the Widget class if Widget.cs does not exist? Is it a car widget? Is it an engine widget? Oh, maybe it's a bagel widget.
The only time I consider file locations is when I have to create new classes. Otherwise I never navigate by file structure. I use "go to class" or "go to definition".
I know this is somewhat of a training issue; freeing yourself from the physical file structure of projects requires practice. It's very rewarding though ;)
If it feels good to put them in the same file, be my guest. Can't do that with public classes in Java though ;)
You should refrain from doing so, unless you have a good reason.
One file with several small related classes can be more readable than several files.
For example, when using 'case classes', to simulate union types, there is a strong relationship between each class.
Using the same file for multiple classes has the advantage of grouping them together visually for the reader.
In your case, a car and a door do not seem related at all, and finding the door class in the car.cs file would be unexpected, so don't.
As a rule of thumb, one class/one file is the way to go. I often keep several interface definitions in one file, though. Several classes in one file? Only if they are very closely related somehow, and very small (< 5 methods and members)
As is true so much of the time in programming, it depends greatly on the situation.
For instance, what is the cohesiveness of the classes in question? Are they tightly coupled? Are they completely orthogonal? Are they related in functionality?
It would not be out of line for a web framework to supply a general purpose widgets.whatever file containing BaseWidget, TextWidget, CharWidget, etc.
A user of the framework would not be out of line in defining a more_widgets file to contain the additional widgets they derive from the framework widgets for their specific domain space.
When the classes are orthogonal, and have nothing to do with each other, the grouping into a single file would indeed be artificial. Assume an application to manage a robotic factory that builds cars. A file called parts containing CarParts and RobotParts would be senseless... there is not likely to be much of a relation between the ordering of spare parts for maintenance and the parts that the factory manufactures. Such a joining would add no information or knowledge about the system you are designing.
Perhaps the best rule of thumb is don't constrain your choices by a rule of thumb. Rules of thumb are created for a first cut analysis, or to constrain the choices of those who are not capable of making good choices. I think most programmers would like to believe they are capable of making good decisions.
The Smalltalk answer is: you should not have files (for programming). They make versioning and navigation painful.
One class per file is simpler to maintain and much more clear for anyone else looking at your code. It is also mandatory, or very restricted in some languages.
In Java, for instance, you cannot have multiple public top-level classes per file; each public class has to be in its own file, where the class name and file name are the same.
(C#) Another exception (to one file per class) I'm thinking of is having List<MyClass> in the same file as MyClass. Where I envisage using this is in reporting. Having an extra file just for the List<MyClass> seems a bit excessive.

Resources