SPARQL DBPedia query for seating capacity, optimize and remove duplicates

I want to get all objects with seating capacity information on DBPedia. Optionally, I want to get their label, address, lat and lon information.
My issue is that I get a lot of duplicates even after filtering by language. How can I get distinct entries based on, say, 'address', or any other attribute?
Also, can you tell which part of this query can be improved so that my query doesn't time out when I use the public DBpedia endpoint? Thanks!
PREFIX dbpediaO: <http://dbpedia.org/ontology/>
SELECT ?place ?label ?capacity ?address ?lat ?lon WHERE {
?place dbpedia2:seatingCapacity ?capacity .
OPTIONAL{
?place dbpediaO:address ?address .
?place rdfs:label ?label .
?plage geo:lat ?lat .
?place geo:long ?lon .
}
filter (lang(?label) = "en" || lang(?label) = "eng")
filter (lang(?address) = "en" || lang(?address) = "eng")
}

Your places have multiple values for some properties, for example address. The only unique thing is the URI itself. Moreover, you should put each property in a separate OPTIONAL, or at least use a separate OPTIONAL clause for lat/long. For the label you do not need an OPTIONAL clause at all in DBpedia. The only way to get unique places is to group by the place and sample or group_concat all other properties. Something like this:
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?place (sample(?_label) as ?label)
(group_concat(?capacity; separator=";") as ?capacities)
(group_concat(?address; separator=";") as ?addresses) ?lat ?lon
WHERE {
?place dbo:seatingCapacity ?capacity ;
rdfs:label ?_label .
filter (langmatches(lang(?_label),"en"))
OPTIONAL {
?place dbo:address ?address .
filter (langmatches(lang(?address), "en"))
} OPTIONAL {
?place geo:lat ?lat ; geo:long ?lon .
}
}
group by ?place ?lat ?lon
order by desc(?place)
limit 100
As you can see, there are also multiple capacity values for places.
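If a single capacity per place is enough, one option is to aggregate the values instead of concatenating them. The following is only a sketch: taking the maximum is an assumption about which value is wanted, ?maxCapacity is a name introduced here for illustration, and MAX is only meaningful if the capacity values are comparable numeric literals.

```sparql
PREFIX dbo:  <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# One row per place: pick any English label and the largest capacity value.
SELECT ?place (SAMPLE(?_label) AS ?label) (MAX(?capacity) AS ?maxCapacity)
WHERE {
  ?place dbo:seatingCapacity ?capacity ;
         rdfs:label ?_label .
  FILTER (langMatches(lang(?_label), "en"))
}
GROUP BY ?place
LIMIT 100
```

Grouping only by ?place guarantees one row per URI. Address and coordinates can be added back with SAMPLE or GROUP_CONCAT in the same way as in the query above.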

Related

SPARQL how to use concatenated strings for a subject of a subquery?

I created the SPARQL query below. I get ?year and ?month from a SERVICE query against Wikidata, and would like to use them as part of the subject of a subquery (i.e. the ?uri ?p ?o part). I managed to concatenate ?year and ?month and generate a URI, but somehow the query does not return a result.
I tested both the subquery part (e.g. using only <https://example.com/date/10-1> ?p ?o) and the SERVICE query individually. Both return results properly (13 and 5 results respectively, so size is not the issue). My guess is that the concatenated variable is a string, not a URI, and so cannot be a subject, but I am not sure. I also tried similar queries, but they time out, I think because of the subquery. Can you spot the problem and let me know how to fix it? Many thanks in advance!
SELECT DISTINCT ?event ?eventLabel ?d1 ?d2 ?d3 ?date ?year ?month ?uri ?p ?o
WHERE {
SERVICE <https://query.wikidata.org/sparql> {
select DISTINCT ?event ?eventLabel ?d1 ?d2 ?d3 ?date ?year ?month
where{
?event wdt:P31/wdt:P279* wd:Q13418847 .
?event wdt:P276 wd:Q1741 .
OPTIONAL {?event wdt:P580 ?d1}
OPTIONAL {?event wdt:P585 ?d2}
OPTIONAL {?event wdt:P582 ?d3}
BIND(IF(!BOUND(?d1),(IF(!BOUND(?d2),?d3,?d2)),?d1) as ?date)
BIND(year(?date) AS ?year)
BIND(month(?date) AS ?month)
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE], en". }
}
ORDER BY ?date
LIMIT 100
}
BIND(CONCAT("<https://example.com/date/", str(?year), "-", str(?month), ">") AS ?uri) .
{
SELECT ?uri ?p ?o
WHERE {?uri ?p ?o .}
LIMIT 10
}
}

You can use the IRI function to turn the string (without the surrounding < and >) into a URI:
BIND(IRI(CONCAT("https://example.com/date/", str(?year), "-", str(?month))) AS ?uri) .
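One further point may matter here. A SPARQL subquery is evaluated on its own before its results are joined with the outer pattern, so a ?uri bound by an outer BIND is not visible inside SELECT ?uri ?p ?o ... LIMIT 10. A minimal sketch of the tail of the query, assuming a plain triple pattern is acceptable in place of the subquery:

```sparql
# ... SERVICE <https://query.wikidata.org/sparql> { ... } block as in the question ...

# Build the IRI from the year and month returned by the SERVICE block.
BIND(IRI(CONCAT("https://example.com/date/", STR(?year), "-", STR(?month))) AS ?uri)

# Match directly on the generated IRI instead of joining with a LIMITed subquery.
?uri ?p ?o .
```

This is a fragment, not a complete query: it replaces the BIND and the braced subquery at the end of the outer WHERE clause.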

graphdb owl-max cardinality restriction not working

Can I restrict the insertion of data in GraphDB based on cardinality rules defined in my ontology?
I loaded the following ontology into a GraphDB repository with the "owl-max" ruleset, based on the discussion here. I am trying to enforce that a person can have only one "age" property.
@prefix : <http://stackoverflow.com/q/24188632/1281433/people-have-exactly-one-age#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
:Person a owl:Class ;
rdfs:subClassOf [ a owl:Restriction ;
owl:cardinality "1"^^xsd:nonNegativeInteger ;
owl:onProperty :hasAge
] .
<http://stackoverflow.com/q/24188632/1281433/people-have-exactly-one-age>
a owl:Ontology .
:hasAge a owl:DatatypeProperty .
I now insert a person record as below
prefix : <http://stackoverflow.com/q/24188632/1281433/people-have-exactly-one-age#>
prefix data: <http://data.example.com/>
Insert DATA {
data:dow a :Person ;
:hasAge 26 .
}
The next time I insert an updated age for that person,
Insert DATA {
data:dow :hasAge 27 .
}
I expected that the age 27 triple would override the age 26 triple, or that I would get an insert error. However, both ages are stored for the person.
data:dow a <http://stackoverflow.com/q/24188632/1281433/people-have-exactly-one-age#Person> ;
<http://stackoverflow.com/q/24188632/1281433/people-have-exactly-one-age#hasAge> "26"^^xsd:integer , "27"^^xsd:integer .

Disclaimer: this is not meant to be a proper or correct answer; I posted it as an answer only because formatting in comments is awkward.
I'm not sure whether the owl-max ruleset supports the inconsistency rule you'd need here. As a workaround, you could at least try to add a custom rule:
PREFIX sys: <http://www.ontotext.com/owlim/system#>
INSERT DATA {
<_:custom> sys:addRuleset
'''Prefices {
x : http://stackoverflow.com/q/24188632/1281433/people-have-exactly-one-age#
}
Axioms {}
Rules
{
Consistency: max_one_age_value
a <x:hasAge> b
a <x:hasAge> c [Constraint b != c]
-----------------------
}'''
}

Thanks to @aksw for the excellent answers. According to http://graphdb.ontotext.com/documentation/standard/reasoning.html#predefined-rulesets, owl-max and owl-rl should do what was asked.
I want to add a clarification: @trace-log, there is no implicit "overwrite" operation in SPARQL, regardless of which ruleset you use. You must use a DELETE...INSERT... statement to update triples in the way you describe.
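A minimal sketch of such an update, reusing the prefixes and the data:dow resource from the question. The OPTIONAL in the WHERE clause is a choice made here so that the new age is still inserted when no previous age exists:

```sparql
PREFIX : <http://stackoverflow.com/q/24188632/1281433/people-have-exactly-one-age#>
PREFIX data: <http://data.example.com/>

# Remove whatever age is currently stored and add the new one in a single update.
DELETE { data:dow :hasAge ?oldAge }
INSERT { data:dow :hasAge 27 }
WHERE {
  # OPTIONAL keeps one solution even when no age exists yet,
  # so the INSERT template is still applied in that case.
  OPTIONAL { data:dow :hasAge ?oldAge }
}
```

When ?oldAge is unbound, the DELETE template produces no triple and is skipped, while the INSERT template contains no variables and is applied once per solution.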

SPARQL union not merging the results

I have a SPARQL query which is the union of three individual queries, with a filter applied to the union. Although the individual queries return rows, the union returns none. I have checked that the output of the individual queries contains rows which satisfy the outer filter criteria.
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX yago: <http://dbpedia.org/class/yago/>
SELECT *
where{
{
SELECT (?LandingURI) AS ?URI ?category
WHERE {
{
?LandingURI rdfs:label ?term .
?LandingURI (dcterms:subject|rdf:type) ?category .
}
FILTER((?term ="Roger"@en))
}
}
UNION
{
SELECT (?redirects) AS ?URI ?category
WHERE {
{
?LandingURI rdfs:label ?term .
?LandingURI <http://dbpedia.org/ontology/wikiPageRedirects> ?redirects .
?redirects (dcterms:subject|rdf:type) ?category .
}
FILTER((?term ="Roger"@en))
}
}
UNION
{
SELECT (?disambiguates) AS ?URI ?category
WHERE {
{
?LandingURI rdfs:label ?term .
?LandingURI <http://dbpedia.org/ontology/wikiPageDisambiguates> ?disambiguates .
?disambiguates (dcterms:subject|rdf:type) ?category .
}
FILTER((?term ="Roger"@en))
}
}
FILTER (?category = dbpedia-owl:TennisPlayer ||
?category = yago:TennisOrganisations ||
?category = <http://dbpedia.org/resource/Category:The_Championships,_Wimbledon>
)
}

I don't know what software you're using to process your query, but it shouldn't accept that query at all; it's not well-formed. You can check your query at sparql.org's query validator.
The bracketing in your query makes it hard to tell exactly where the union is placed, but in any case union should appear between two group graph patterns, e.g.: { ... } UNION { ... }.
All that said, there's actually no need to use union (or filter) here at all. You can do this all with property paths and values. If I understand it correctly, your query can be rewritten as:
select ?uri ?term ?category where {
#-- acceptable values of ?category
values ?category {
dbpedia-owl:TennisPlayer
yago:TennisOrganisations
<http://dbpedia.org/resource/Category:The_Championships,_Wimbledon>
}
#-- acceptable values of ?term
values ?term {
"Roger"@en
}
#-- get the rdfs:label of the ?landingUri, then
#-- follow an optional redirect or disambiguation
#-- link to the actual uri, and then get any
#-- dcterms:subject or rdf:type from the actual uri.
?landingUri rdfs:label ?term .
?landingUri (dbpedia-owl:wikiPageRedirects|dbpedia-owl:wikiPageDisambiguates)? ?uri .
?uri (dcterms:subject|rdf:type) ?category .
}

SPARQL: union on dbpedia changes birthdate

I'm seeing strange behavior when using "union" on the endpoint http://de.dbpedia.org/sparql.
By using
SELECT distinct *
WHERE {
{
?name dcterms:subject category-de:Haus_Liechtenstein.
?name rdf:type foaf:Person.
?name <http://dbpedia.org/ontology/birthDate> ?birthdate.
Optional {?name dbpedia-owl:deathDate ?deathDate.}
Optional {?name <http://de.dbpedia.org/property/gnd> ?gnd.}
}
filter (!bound(?deathDate))
}
Order BY ASC (?birthdate)
the birthdate of "Marie Kinsky", for example, is "1940-04-14Z", which is correct (1st row). When I now add a second source with union:
SELECT distinct *
WHERE {
{
?name dcterms:subject category-de:Haus_Liechtenstein.
?name rdf:type foaf:Person.
?name <http://dbpedia.org/ontology/birthDate> ?birthdate.
Optional {?name dbpedia-owl:deathDate ?deathDate.}
Optional {?name <http://de.dbpedia.org/property/gnd> ?gnd.}
}
union{
SERVICE silent <http://dbpedia.org/sparql>{
?name dcterms:subject category-en:Princely_Family_of_Liechtenstein.
?name rdf:type foaf:Person.
?name dbpprop:father ?father.
?name dbpprop:mother ?mother.
?name dbpprop:birthDate ?birthdate.
Optional{?name dbpedia-owl:spouse ?spouse.}
Optional{?name dbpprop:shortDescription ?title.}
Optional{?name dbpedia-owl:individualisedGnd ?gnd.}
Optional {?name dbpedia-owl:deathDate ?deathDate.}
}}
filter (!bound(?deathDate))
}
Order BY ASC (?birthdate)
then I get the birthdate of "Marie" as "1940-04-13+02:00", which is wrong (first row). Checking Marie's date manually, the birthdate is "1940-04-14".
Can someone explain this behavior to me?
Thank you in advance and best regards
Fobi

Try the following query (a very cut down version of your original) on http://de.dbpedia.org/sparql:
PREFIX category-en: <http://dbpedia.org/resource/Category:>
SELECT distinct *
WHERE {
SERVICE silent <http://dbpedia.org/sparql> {
?name dcterms:subject category-en:Princely_Family_of_Liechtenstein.
?name dbpprop:birthDate ?birthdate.
}
}
Note that each of the dates has this suspicious timezone shift, and Marie has 1940-04-13+02:00.
Now try the following on http://dbpedia.org/sparql:
PREFIX category-en: <http://dbpedia.org/resource/Category:>
SELECT distinct *
WHERE {
?name dcterms:subject category-en:Princely_Family_of_Liechtenstein.
?name dbpprop:birthDate ?birthdate.
}
Now I see Marie has birth date 1940-04-14+02:00!
I wonder whether the dbpedia endpoint is trying to make time zone corrections based on the locale of the client? But it really isn't getting it right.
(It's not just Liechtenstein royalty, most birth dates have this feature)
Update:
From the dbpedia mailing list:
We recently recognized that there are inconsistencies with dates in the DBpedia
Sparql endpoint:
dates in the SPARQL endpoint are timezoned +02:00 while on the DBpedia pages
and in the dumps they are not.
[...]
That is most probably an issue of Virtuoso and there already was an issue
raised in 2011 on the virtuoso-users mailing list

How do I limit the number of results for a specific variable in a SPARQL query?

Let's say I have a SPARQL query like this, looking for resources that have some shared property with a focal resource, and also getting some other statements about the focal resource :
CONSTRUCT {
?focal pred:icate ?shared .
?other pred:icate ?shared .
}
WHERE {
?focal pred:icate ?shared ;
more:info ?etc ;
a "foobar" .
?other pred:icate ?shared .
}
LIMIT 500
If there are more than 500 other resources, that LIMIT might exclude that more:info statement and object. So, is there a way to say "I only want at most 500 of ?other", or do I have to break this query into multiple pieces?

You can use LIMIT in subqueries, i.e. something like the following:
CONSTRUCT {
?focal pred:icate ?shared .
?other pred:icate ?shared .
}
WHERE {
?focal pred:icate ?shared ;
more:info ?etc ;
a "foobar" .
{
SELECT ?other ?shared {
?other pred:icate ?shared .
}
LIMIT 500
}
}

http://www.w3.org/TR/2012/WD-sparql11-query-20120105/#modResultLimit
The LIMIT clause puts an upper bound on the number of solutions
returned. If the number of actual solutions, after OFFSET is applied,
is greater than the limit, then at most the limit number of solutions
will be returned.
You can only limit the number of solutions to your query, not a specific subset of it. You can use a subquery with a LIMIT clause though: http://www.w3.org/TR/sparql-features/#Subqueries.
