How to use ontology alignments to transfer instances from one ontology to another (ABox to TBox) - owl

I have an IFC file converted to ifcOWL (let's call it Ontology A, a .owl file), containing classes and instances of interest. I also have a modular ontology (Ontology B, a .owl file), and an alignment between the two (Ontology C, also a .owl file).
I am using Protege as my ontology editor and knowledge management system.
My question is: how can/should I use the alignment (Ontology C) to transfer the instances from Ontology A to Ontology B using Protege?
Things that I have tried so far are the following:
1. I have tried to create the alignments manually by adding equivalence (=) or subClassOf relations.
2. I have also tried LogMap (http://krrwebtools.cs.ox.ac.uk/logmap/) to produce the mapping automatically, but it cannot capture all classes; it only matches classes whose names are identical up to capitalization.
3. I converted the IFC file to ifcOWL using https://github.com/pipauwel/IFCtoRDF
4. I have created a modular ontology with different naming conventions than the initial ifcOWL, because Ontology B uses only a subset of the IFC classes. Ontology B will combine subsets of two or three data sources, each available as an ontology, from which I will need to pull instances and query them.
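For concreteness, this is the kind of transfer I have in mind, sketched outside Protege with Python's rdflib (the file names are placeholders, and I am assuming Ontology C states its mappings as owl:equivalentClass or rdfs:subClassOf axioms between A's and B's classes):

from rdflib import Graph
from rdflib.namespace import OWL, RDF, RDFS

# Load the three files (names are placeholders).
a = Graph().parse("ontologyA.owl")   # ifcOWL with the instances of interest
b = Graph().parse("ontologyB.owl")   # the modular target ontology
c = Graph().parse("ontologyC.owl")   # the alignment between A and B

# Collect class mappings from the alignment.
mapping = {}
for s, o in c.subject_objects(OWL.equivalentClass):
    mapping[s] = o
for s, o in c.subject_objects(RDFS.subClassOf):
    mapping.setdefault(s, o)

# Re-type every instance of a mapped class from A as the mapped class in B.
for a_class, b_class in mapping.items():
    for inst in a.subjects(RDF.type, a_class):
        b.add((inst, RDF.type, b_class))

b.serialize("ontologyB_with_instances.owl", format="xml")

This handles classes only; property mappings would need the same treatment, and if the alignment is stated in the other direction (B's classes on the left-hand side) the mapping would have to be inverted.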

Related

Database Schema for Storing C++ Classes

I'm trying to come up with a way of storing the underlying structure of C++ classes in a database, and trying to determine what the best type of database would be/how to lay out information in that database.
C++ classes as code would be parsed into an AST, which suggests a tree as a possible data structure, and a tree fits well into a graph database. However, I don't think the data could be stored purely as a tree once you consider that pointers can create a loop. The thought then would be a graph. Being someone primarily familiar with relational databases, I'm not sure how plausible that is. The primary pieces of data that would be needed for a class are:
Class name
Child nodes, including their type and offset within the struct
Child nodes that aren't a primitive type would have a relationship to the corresponding class. This by itself seems like it would fit well inside a graph database. The main thing I'm having a hard time envisioning is how things like pointers or arrays would be stored. Would those just be different attributes on the edge between the two classes? Is there some other way of storing this data that I'm missing which would work better?
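For what it's worth, here is a minimal relational sketch of that idea in Python with SQLite (all table and column names are invented). Each member row is the edge from a class to a type, and pointer/array information sits as plain attributes on that edge; loops are unproblematic because a member may reference its own class:

import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE class (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE member (
    id         INTEGER PRIMARY KEY,
    owner_id   INTEGER NOT NULL REFERENCES class(id), -- containing class
    name       TEXT NOT NULL,
    offset     INTEGER NOT NULL,                      -- offset within the struct
    type_id    INTEGER REFERENCES class(id),          -- NULL for primitive types
    primitive  TEXT,                                  -- e.g. 'int', used when type_id is NULL
    is_pointer INTEGER NOT NULL DEFAULT 0,            -- edge attribute: pointer?
    array_len  INTEGER                                -- edge attribute: NULL if not an array
);
""")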

How to store large individual set (ABox) out of ontology preserving reasoning results?

I have a quite simple ontology written in OWL 2, describing my particular world (TBox).
Since I have to store a lot of individuals, I would like to avoid the scalability problems of triplestores by storing the individuals outside the ontology (e.g., in a database table). At the same time, I would like to exploit reasoning capabilities.
For example, if I stored my individual assertions directly in my ontology:
:Mary a :Female
:Mary :motherOf :John
...
with appropriate modelling, a reasoner could entail that:
:Mary a :Parent
:Mary a :Mother
...
...but what if I stored my individual assertions outside of my ontology? Is there a framework/best practice to manage this scenario?
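One hedged sketch of such a workflow in Python, using rdflib plus the owlrl package to materialize the OWL 2 RL closure over a graph assembled from the TBox file and externally stored individuals (the namespace and file name are hypothetical, and OWL 2 RL is only a profile, so this trades some completeness for scalability):

from rdflib import Graph, Namespace
from rdflib.namespace import RDF
from owlrl import DeductiveClosure, OWLRL_Semantics

EX = Namespace("http://example.org/family#")  # hypothetical namespace

g = Graph()
g.parse("tbox.owl")  # the small TBox, kept as a file

# Individuals fetched from an external store, e.g. (name, class) rows.
for name, cls in [("Mary", "Female"), ("John", "Person")]:
    g.add((EX[name], RDF.type, EX[cls]))
g.add((EX["Mary"], EX["motherOf"], EX["John"]))

# Materialize the OWL 2 RL closure in memory, then query the entailments.
DeductiveClosure(OWLRL_Semantics).expand(g)
print((EX["Mary"], RDF.type, EX["Parent"]) in g)  # True, given suitable TBox axioms

The individuals stay in the database; only the batch currently needed is loaded into the graph, reasoned over, and queried.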

How do we make RDF schema compatible with OWL?

I have been trying to research making RDF Schema compatible with the Web Ontology Language (OWL), but I am still new and getting mixed up. Any help with this is highly appreciated.
I need to know if there is anything I should remove or omit from RDFS to make it compatible with OWL.
To the best of my knowledge, almost every RDFS class expression and property hierarchy is valid in OWL.
Exceptions are containers and uses of rdfs:Resource and rdf:Property.
Edit:
From the OWL 2 specs:
2.3 Semantics
The OWL 2 Structural Specification document defines the abstract structure of OWL 2 ontologies, but it does not define their meaning. The Direct Semantics [OWL 2 Direct Semantics] and the RDF-Based Semantics [OWL 2 RDF-Based Semantics] provide two alternative ways of assigning meaning to OWL 2 ontologies, with a correspondence theorem providing a link between the two. These two semantics are used by reasoners and other tools, e.g., to answer class consistency, subsumption and instance retrieval queries.
So you first need to be aware of which semantics is appropriate for your application. The RDF semantics is fully included in OWL 2 Full, so if you need all RDF constructs, you'll have to deal with OWL 2 Full, which means any reasoner you can use will be incomplete.
The most common situation, however, is to need only OWL 2 DL or a simpler profile; this poses restrictions on the RDF constructs used.
As mentioned before, subclass axioms in RDFS are compatible with OWL, as are subproperty axioms. The restrictions are: all classes and properties need to be declared, and properties can be declared as object, data, or annotation properties, but cannot have two of these types.
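As an illustration of the declaration requirement, a rough rdflib sketch that retrofits OWL declarations onto an RDFS vocabulary (the file name is hypothetical, and blindly choosing owl:ObjectProperty is a simplification; in practice you must inspect each property to decide between object, datatype, or annotation property):

from rdflib import Graph
from rdflib.namespace import OWL, RDF, RDFS

g = Graph().parse("vocab.ttl")  # hypothetical RDFS vocabulary

# Declare every term that takes part in the class hierarchy as an owl:Class.
for c in set(g.subjects(RDFS.subClassOf)) | set(g.objects(None, RDFS.subClassOf)):
    g.add((c, RDF.type, OWL.Class))

# Give each rdf:Property exactly one OWL property type (object chosen here).
for p in set(g.subjects(RDF.type, RDF.Property)):
    g.remove((p, RDF.type, RDF.Property))
    g.add((p, RDF.type, OWL.ObjectProperty))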
For an RDF centric view, see this blog post: http://www.epimorphics.com/web/wiki/owl-2-rdf-vocabularies

How to model many to many with an "include all" option

Let's say I have a number of custom report types. One report type can be associated with a number of file extension options (pdf, excel, etc.), and each file extension for that specific custom report type has a number of legal actions that can be used (say, for report type A with the pdf extension I may save, print, and modify). However, the legal actions are mostly identical between extensions for a report; only on very few occasions do they differ. How would you model this relationship?
If I also wanted the option that a custom report type should have all extension types, with the same actions for each extension type, would you introduce some "magic value", like extension type * indicating that this report type should include all available extension types along with a base set of legal actions? Or would you simply populate the relationship manually with all extension types, legal actions, etc., and remember to update them when new extension types are introduced? This is not that common, though.
Hope the question is somewhat clear:)
I did struggle to follow the question, but I think you're saying:
a report type has one or more file extensions.
a file extension has one or more permissible actions.
many file extensions share the same set of permissible actions.
If that's what you're saying, I would introduce the concept of a "permissionSet", with a many-to-many relationship to individual permissions. A file extension then has a many-to-many relationship with permission sets.
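A rough sketch of that shape in Python with SQLite (names invented), attaching the permission set at the (report type, extension) pairing so the few diverging cases can still get their own set:

import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE report_type    (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE file_extension (id INTEGER PRIMARY KEY, name TEXT);  -- pdf, excel, ...
CREATE TABLE action         (id INTEGER PRIMARY KEY, name TEXT);  -- save, print, modify
CREATE TABLE permission_set (id INTEGER PRIMARY KEY, name TEXT);

-- a permission set groups actions; extensions reuse sets instead of repeating actions
CREATE TABLE permission_set_action (
    set_id    INTEGER REFERENCES permission_set(id),
    action_id INTEGER REFERENCES action(id)
);

-- which extensions a report type offers, and which set applies to that pairing
CREATE TABLE report_extension (
    report_id    INTEGER REFERENCES report_type(id),
    extension_id INTEGER REFERENCES file_extension(id),
    set_id       INTEGER REFERENCES permission_set(id)
);
""")

Under this layout, "include all extensions" can be handled by inserting one report_extension row per existing extension, rather than by a magic * value.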

How to save R list object to a database?

Suppose I have a list of R objects which are themselves lists. Each list has a defined structure: data, a model which fits the data, and some attributes identifying the data. One example would be time series of certain economic indicators in particular countries. So my list object has the following elements:
data - the historical time series for the economic indicator
country - the name of the country, USA for example
name - the indicator name, GDP for example
model - ARIMA orders found by auto.arima, in a suitable format; this again may be a list.
This is just an example. As I said, suppose I have a number of such objects combined into a list. I would like to save it in some suitable format. The obvious solution is simply to use save, but this does not scale very well for a large number of objects. For example, if I only wanted to inspect a subset of the objects, I would need to load all of them into memory.
If my data is a data.frame I could save it to database. If I wanted to work with particular subset of data I would use SELECT and rely on database to deliver the required subset. SQLite served me well in this regard. Is it possible to replicate this for my described list object with some fancy database like MongoDB? Or should I simply think about how to convert my list to several related tables?
My motivation for this is to be able to easily generate various reports on the fitted models. I can write a bunch of functions which produce a report on a given object and then just use lapply on my list of objects. Ideally I would like to parallelise this process, but that is another problem.
I think I explained the basics of this somewhere once before. The gist of it is that:
R has complete serialization and deserialization support built in, so you can in fact take any existing R object and turn it into either a binary or textual serialization. My digest package uses that to turn the serialization into a hash, using different hash functions.
R has all the db connectivity you need.
Now, what a suitable format and db schema is ... will depend on your specifics. But there is (as usual) nothing in R stopping you :)
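That pattern (serialize each object, keep the identifying attributes as ordinary queryable columns) is easy to sketch. Here it is in Python with pickle and sqlite3, purely as an illustration; in R the analogous pieces would be serialize()/unserialize() together with a driver such as RSQLite:

import pickle
import sqlite3

con = sqlite3.connect("models.db")
con.execute("""CREATE TABLE IF NOT EXISTS fits
               (country TEXT, indicator TEXT, object BLOB)""")

# One element of the list: data, model, and identifying attributes.
obj = {"data": [1.2, 3.4], "country": "USA", "name": "GDP", "model": (1, 0, 1)}
con.execute("INSERT INTO fits VALUES (?, ?, ?)",
            (obj["country"], obj["name"], pickle.dumps(obj)))
con.commit()

# Load only the subset you need, via ordinary SQL.
for (blob,) in con.execute("SELECT object FROM fits WHERE country = 'USA'"):
    fit = pickle.loads(blob)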
This question has been inactive for a long time, but since I had a similar concern recently, I want to add what I have found out. I recognise three demands in the question:
to have the data stored in a suitable structure
scalability in terms of size and access time
the possibility to efficiently read only subsets of the data
Besides the option to use a relational database, one can also use the HDF5 file format, which is designed to store a large number of possibly large objects. The choice depends on the type of data and the intended way of accessing it.
Relational databases should be favoured if:
the atomic data items are small-sized
the different data items possess the same structure
it is not known in advance which subsets of the data will be read out
convenient transfer of the data from one computer to another is not an issue, or the computers where the data is needed have access to the database.
The HDF5 format should be preferred if:
the atomic data items are themselves large objects (e.g. matrices)
the data items are heterogeneous and cannot be combined into a table-like representation
most of the time the data is read out in groups which are known in advance
moving the data from one computer to another should not require much effort
Furthermore, one can distinguish between relational and hierarchical relationships, where the latter is contained in the former. Within an HDF5 file, the information chunks can be arranged in a hierarchical way, e.g.:
/Germany/GDP/model/...
/Germany/GNP/data
/Austria/GNP/model/...
/Austria/GDP/data
The rhdf5 package for handling HDF5 files is available on Bioconductor. General information on the HDF5 format is available here.
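To make the layout concrete, a small sketch with Python's h5py (rhdf5 offers analogous functions in R; the file, group, and attribute names are invented):

import h5py
import numpy as np

# Write the hierarchical layout shown above.
with h5py.File("indicators.h5", "w") as f:
    f.create_dataset("Germany/GDP/data", data=np.array([3.1, 3.3, 3.6]))
    f["Germany/GDP"].attrs["model"] = "ARIMA(1,0,1)"  # small metadata as attributes

# Read back only the group you need, without touching the rest of the file.
with h5py.File("indicators.h5", "r") as f:
    gdp = f["Germany/GDP/data"][:]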
I'm not sure if it is the same, but I had some good experience with time series objects using:
str()
Maybe you can look into that.
