How to write a UNION inside an OPTIONAL clause (or OPTIONALs with a UNION clause)?

To illustrate the issue with an example: I need to query DBpedia in SPARQL for some companies and get their homepages.
Some of these companies have a website while others don't, so we have to query inside a SPARQL OPTIONAL clause, right?
So to get the addresses, I wrote this inside an OPTIONAL clause:
...
optional {?company foaf:homepage ?website}
However, not all triples store the website address using the foaf:homepage URI; some use dbp:homepage, in which case the clause would be:
...
optional {?company dbp:homepage ?website}
The problem is: how do I use UNION to combine the complementary data from these two clauses?
The same problem appears when querying for people in DBpedia, for example for birth dates:
not all persons have a birth date literal in DBpedia.
Some persons have the birth date in dbo:birthDate,
some persons have it in dbp:birthDate.
Again we need to query in an OPTIONAL clause and need to combine/union the dates.
To solve this I tried a query like:
optional { {?person dbp:birthDate ?birthdate } union {?person dbo:birthDate ?birthdate} }
or like this:
where { {optional { ?person dbp:birthDate ?birthdate }} union {optional {?person dbo:birthDate ?birthdate}}}
I think I'm close, but I haven't succeeded so far ;)
So how can I solve this, please?

In a case like this:
optional {?company foaf:homepage ?website}
optional {?company dbp:homepage ?website}
You can use an alternation property path:
optional {?company foaf:homepage|dbp:homepage ?website}
Have a look at Why is this SPARQL query not returning any results? for some discussion of one of the difficulties that can arise when multiple optional blocks bind the same variable, as well as a similar solution.
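For reference, a minimal complete query against DBpedia might look like the sketch below. The dbo:Company class and the LIMIT are assumptions for illustration; adjust the selection criteria to your actual query.
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dbo:  <http://dbpedia.org/ontology/>
PREFIX dbp:  <http://dbpedia.org/property/>
SELECT ?company ?website WHERE {
  ?company a dbo:Company .                                   # assumption: select companies by class
  OPTIONAL { ?company foaf:homepage|dbp:homepage ?website }  # take whichever predicate is present
}
LIMIT 100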

Related

How to get regular expressions working in the filter clause in Azure Cognitive Search?

I can't seem to get the filter clause to retrieve documents from my index using a regex clause. The schema for my index is straightforward: I only have a single field, which is both searchable and filterable and is of type Edm.String, called someId (which normally contains a hash value of something) and has sample values like:
someId
k6l7k2oj
k6l55iq8
k6l61ff8
...
I need to be able to extract all values from this field that start with k6 and end with 8. So, based on the documentation, I am using this in my POST request body:
{
"filter": "search.ismatch('/^k6[?]*d$/','someId','simple','all')",
"select":"someId",
"count":"true"
}
and it comes up with nothing.
On the other hand, if I simplify and say I only need data where someId starts with k6, I seem to get some success if I just use a wildcard,
like this:
{
"filter": "search.ismatch('k6l*','someId','simple','all')",
"select":"someId",
"count":"true"
}
I do get what I am looking for. The question is: why does the regex not work with search.ismatch(), and what am I missing?
...
Regex is part of the full Lucene syntax; it is not available in the simple syntax. Try changing the third parameter of search.ismatch to 'full'.
Also, did you mean to use search.ismatch or search.ismatchscoring? The latter is functionally equivalent to using the top-level search, searchFields, queryType, and searchMode parameters. The former does not count matches towards relevance scoring.
Your regex does not do what you intend either, it seems. I tested it against your sample data and it does not match. Try this regex instead:
^k6.{5}8$
It matches a lowercase k6 from the start of the string, followed by 5 characters of anything and finally an 8.
Complete example
{ "filter": "search.ismatch('^k6.{5}8$','someId','full','all')", "select":"someId", "count":"true" }
Thanks to Dan and Bruce.
This exact expression worked for me
{
"filter": "search.ismatch('/k6.{5}8/','someId','full','all')",
"select":"someId",
"count":"true"
}

Groupby and count() with alias and 'normal' dataframe: python pandas versus mssql

Coming from a SQL environment, I am learning some things in Python Pandas. I have a question regarding grouping and aggregates.
Say I group a dataset by Age Category and count the different categories. In MSSQL I would write this:
SELECT AgeCategory, COUNT(*) AS Cnt
FROM TableA
GROUP BY AgeCategory
ORDER BY 1
The result set is a 'normal' table with two columns; the second column I named Cnt.
When I want to do the equivalent in Pandas, the groupby object is different in format. So now I have to reset the index and rename the column in a following line. My code would look like this:
grouped = df.groupby('AgeCategory')['ColA'].count().reset_index()
grouped.columns = ['AgeCategory', 'Count']
grouped
My question is whether this can be accomplished in one go. It seems like I am overdoing it, but I lack experience.
Thanks for any advice.
Regards, M.
Use the name parameter in Series.reset_index:
grouped = df.groupby('AgeCategory')['ColA'].count().reset_index(name='Count')
Or:
grouped = df.groupby('AgeCategory').size().reset_index(name='Count')
The difference is that GroupBy.count excludes missing values, while GroupBy.size does not.
More information about aggregation in pandas.
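A quick sketch of that difference, using a made-up frame (the AgeCategory and ColA names follow the question; the data and the NaN are only for illustration):
import pandas as pd

df = pd.DataFrame({'AgeCategory': ['18-25', '18-25', '26-35'],
                   'ColA': [1.0, None, 3.0]})

# count() skips the NaN in ColA
df.groupby('AgeCategory')['ColA'].count().reset_index(name='Count')
#   AgeCategory  Count
# 0       18-25      1
# 1       26-35      1

# size() counts every row, NaN or not
df.groupby('AgeCategory').size().reset_index(name='Count')
#   AgeCategory  Count
# 0       18-25      2
# 1       26-35      1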

LIKE query on elements of flat jsonb array

I have a Postgres table posts with a column of type jsonb which is basically a flat array of tags.
What I need to do is somehow run a LIKE query against the elements of that tags column, so that I can find posts that have a tag beginning with some partial string.
Is such a thing possible in Postgres? I keep finding super complex examples and nobody ever describes such a basic and simple scenario.
My current code works fine for checking if there are posts having specific tags:
select * from posts where tags @> '"TAG"'
and I'm looking for a way of running something along the lines of
select * from posts where tags @> '"%TAG%"'
SELECT *
FROM   posts p
WHERE  EXISTS (
   SELECT FROM jsonb_array_elements_text(p.tags) tag
   WHERE  tag LIKE '%TAG%'
   );
Related, with explanation:
Search a JSON array for an object containing a value matching a pattern
Or simpler with the @? operator since Postgres 12 implemented SQL/JSON:
SELECT *
-- optional to show the matching item:
-- , jsonb_path_query_first(tags, '$[*] ? (@ like_regex "^tag" flag "i")')
FROM posts
WHERE tags @? '$[*] ? (@ like_regex "TAG")';
The operator @? is just a wrapper around the function jsonb_path_exists(). So this is equivalent:
...
WHERE jsonb_path_exists(tags, '$[*] ? (@ like_regex "TAG")');
Neither has index support. (It may be added for the @? operator later, but it's not there in pg 13 yet.) So those queries are slow for big tables. A normalized design, like Laurenz already suggested, would be superior - with a trigram index:
PostgreSQL LIKE query performance variations
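A rough sketch of what that normalized alternative could look like (the post_tags table, the index name, and the posts(id) primary key are assumptions for illustration):
-- one row per (post, tag) instead of a jsonb array
CREATE TABLE post_tags (
  post_id bigint NOT NULL REFERENCES posts (id),
  tag     text   NOT NULL
);

CREATE EXTENSION IF NOT EXISTS pg_trgm;  -- trigram support
CREATE INDEX post_tags_tag_trgm_idx ON post_tags USING gin (tag gin_trgm_ops);

-- LIKE, even with a leading wildcard, can now use the trigram index
SELECT DISTINCT p.*
FROM   posts p
JOIN   post_tags t ON t.post_id = p.id
WHERE  t.tag LIKE '%TAG%';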
For just prefix matching (LIKE 'TAG%', no leading wildcard), you could make it work with a full text index:
CREATE INDEX posts_tags_fts_gin_idx ON posts USING GIN (to_tsvector('simple', tags));
And a matching query:
SELECT *
FROM posts p
WHERE to_tsvector('simple', tags) @@ 'TAG:*'::tsquery
Or use the english dictionary instead of simple (or whatever fits your case) if you want stemming for natural English language.
to_tsvector(json(b)) requires Postgres 10 or later.
Related:
Get partial match from GIN indexed TSVECTOR column
Pattern matching with LIKE, SIMILAR TO or regular expressions in PostgreSQL

Using expressions for a value in Parameters

I have a report that returns various products depending on which product group you select. Most of these products have similar product codes, which allows me to use the LIKE operator to get the required results. However, for one particular product group, I have the following problem:
VSAMPLES
VSAMPLES2016
VSAMPLES2016DD
VSAMPLESADD
VSAMPLESET
VSAMPLESLARGE
VSAMPLESLARGEADD
VSAMPLESNEW
I only need the top two products to be listed, but using 'VSAMPLES%' as a parameter value will return all of these products.
Can I write an expression for the parameter value that will use 'VSAMPLES%' and 'VSAMPLES2016%' to return only these two products?
EDIT
The query is:
SELECT STRC_CODE, STRC_DESC FROM DeFactoUser.F_ST_Products
WHERE STRC_CODE LIKE @ProductCode
I am using LIKE so I don't have to specify dozens of products for each group.
For one parameter value I am using 'PA.A%'. This works perfectly because every product starting with PA.A is needed. In the case of VSAMPLES this isn't the case.
Parameter Values are as follows:
So, can I not add a value to the Aspire tab that will return only those two products?
OK, what I was asking might not have been possible. I fixed the issue by altering my query.
SELECT STRC_CODE, STRC_STATUS, STRC_DESC FROM DeFactoUser.F_ST_Products
WHERE STRC_CODE LIKE @ProductCode AND STRC_CODE NOT IN ('VSAMPLES2016DD',
'VSAMPLESADD', 'VSAMPLESET', 'VSAMPLESLARGE', 'VSAMPLESLARGEADD',
'VSAMPLESNEW')
This results in only the two products I needed being returned when I use 'VSAMPLES%' as a value.
Much simpler than I thought.
Thanks for the input into the question I asked.

Rails 3, ActiveRecord, PostgreSQL - ".uniq" command doesn't work?

I have the following query:
Article.joins(:themes => [:users]).where(["articles.user_id != ?", current_user.id]).order("Random()").limit(15).uniq
and it gives me the error:
PG::Error: ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
LINE 1: ...s"."user_id" WHERE (articles.user_id != 1) ORDER BY Random() L...
When I update the original query to
Article.joins(:themes => [:users]).where(["articles.user_id != ?", current_user.id]).order("Random()").limit(15)#.uniq
then the error is gone... In MySQL .uniq works, but in PostgreSQL it doesn't. Is there an alternative?
As the error states, for SELECT DISTINCT, ORDER BY expressions must appear in the select list.
Therefore, you must explicitly select the expression you are ordering by.
Here is an example; it is similar to your case but generalized a bit.
Article.select('articles.*, RANDOM()')
.joins(:users)
.where(:column => 'whatever')
.order('Random()')
.uniq
.limit(15)
So, explicitly include your ORDER BY clause (in this case RANDOM()) using .select(). As shown above, in order for your query to return the Article attributes, you must explicitly select them also.
I hope this helps; good luck
Just to enrich the thread with more examples: in case you have nested relations in the query, you can try the following statement.
Person.find(params[:id]).cars.select('cars.*, lower(cars.name)').order("lower(cars.name) ASC")
In the given example, you're asking for all the cars of a given person, ordered by name (Audi, Ferrari, Porsche).
I don't think this is a better way, but it may help to address this kind of situation by thinking in objects and collections instead of in a relational (database) way.
Thanks!
I assume that the .uniq method is translated to a DISTINCT clause in the SQL. PostgreSQL is picky (pickier than MySQL): with DISTINCT, all expressions in the ORDER BY (and GROUP BY) clauses must be present in the select list.
It's a little unclear what you are attempting to do (a random ordering?). In addition to posting the full SQL sent, if you could explain your objective, that might be helpful in finding an alternative.
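To make the rule concrete, here is a minimal SQL-level sketch, independent of Rails (table and column names are illustrative only):
-- rejected by PostgreSQL: the ORDER BY expression is not in the select list
SELECT DISTINCT articles.id, articles.title
FROM   articles
ORDER  BY random();

-- accepted: the ORDER BY expression is also selected
SELECT DISTINCT articles.id, articles.title, random() AS rnd
FROM   articles
ORDER  BY rnd;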
I just upgraded my 100% working and tested application from 3.1.1 to 3.2.7 and now have this same PG::Error.
I am using Cancan...
#users = User.accessible_by(current_ability).order('lname asc').uniq
Removing the .uniq solves the problem and it was not necessary anyway for this simple query.
Still looking through the change notes between 3.1.1 and 3.2.7 to see what caused this to break.
