JCR SQL2 Query comparing name and path with like - jackrabbit

I was trying to execute the following JCR-SQL2 query:
String expression = "SELECT * FROM [nt:base] AS p " +
"WHERE NAME(p) like 'opony.txt'";
But I got
javax.jcr.UnsupportedRepositoryOperationException.
Is there any other way to search for nodes whose names are like '%example%'?
I was also trying to search for nodes with a specified path:
String expression = "SELECT * FROM [nt:base] " +
"WHERE PATH([nt:base]) like '/a/b'";
But I got
javax.jcr.query.InvalidQueryException: Query:
SELECT * FROM [nt:base] WHERE PATH([(*)nt:base]) like '/a/b'; expected: LENGTH, NAME, LOCALNAME, SCORE, LOWER, UPPER, or CAST
How can I search for nodes whose paths are like '%example%'?
I am using JCR_SQL2:
javax.jcr.query.Query query = queryManager.createQuery(expression, Query.JCR_SQL2);

Edit
A couple of points:
If you don't include a wildcard like '%' in your pattern, there's not much difference between LIKE and the ordinary = operator.
PATH() and NAME() are not supported in all implementations of SQL2. They appear to be OK under JBoss.
To search for nodes within a subtree, use the ISDESCENDANTNODE() function.
There is a [jcr:title] attribute that you can test; it's not the same as the node name, but it might serve your purposes.
So, for your first query, try something like
SELECT * FROM [nt:base] AS p WHERE [jcr:title] like '%opony%'
and ISDESCENDANTNODE('/content/a/b')
The JBoss documentation on querying is fairly good, but be aware that it may not line up exactly with vanilla Jackrabbit. I've also used this cheat sheet for the Magnolia implementation.

Related

Django query filter using large array of ids in Postgres DB

I want to pass a query in Django to my PostgreSQL database. When I filter my query using a large array of ids, the query is very slow and goes up to 70s.
After looking for an answer I saw this post, which gives a solution to my problem: simply replace the ARRAY[ids] in the IN statement with VALUES (id1), (id2), ....
I tested the solution with a raw query in pgadmin, the query goes from 70s to 300ms...
How can I do the same command (i.e. not using an array of ids but a query with VALUES) in Django?
I found a solution building on @erwin-brandstetter's answer, using a custom lookup:
from django.db.models import Lookup
from django.db.models.fields import Field

@Field.register_lookup
class EfficientInLookup(Lookup):
    lookup_name = "ineff"

    def as_sql(self, compiler, connection):
        lhs, lhs_params = self.process_lhs(compiler, connection)
        rhs, rhs_params = self.process_rhs(compiler, connection)
        params = lhs_params + rhs_params
        # Unnest the array parameter into a set so Postgres can treat it as a subquery
        return "%s IN (SELECT unnest(%s))" % (lhs, rhs), params
This allows filtering like this:
MyModel.objects.filter(id__ineff=<list-of-values>)
The trick is to transform the array to a set somehow.
Instead of (this form is only good for a short array):
SELECT *
FROM tbl t
WHERE t.tbl_id = ANY($1);
-- WHERE t.tbl_id IN($1); -- equivalent
$1 being the array parameter.
You can still pass an array like you had it, but unnest and join. Like:
SELECT *
FROM tbl t
JOIN unnest($1) arr(id) ON arr.id = t.tbl_id;
Or you can keep your query, too, but replace the array with a subquery unnesting it:
SELECT * FROM tbl t
WHERE t.tbl_id = ANY (SELECT unnest($1));
Or:
SELECT * FROM tbl t
WHERE t.tbl_id IN (SELECT unnest($1));
Same effect for performance as passing a set with a VALUES expression. But passing the array is typically much simpler.
Detailed explanation:
IN vs ANY operator in PostgreSQL
How to use ANY instead of IN in a WHERE clause with Rails?
Optimizing a Postgres query with a large IN
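For the VALUES form the question asks about, here is a minimal sketch of how the SQL fragment and its parameters could be assembled in plain Python. The helper name values_in_clause is hypothetical, not part of Django or psycopg2, and the column name is interpolated directly, so it must come from trusted code and never from user input.

```python
from typing import Iterable, List, Tuple


def values_in_clause(column: str, ids: Iterable[int]) -> Tuple[str, List[int]]:
    """Build a 'column IN (VALUES (%s), (%s), ...)' fragment plus its parameters.

    Hypothetical helper: `column` must be a trusted identifier.
    """
    params = list(ids)
    if not params:
        # IN over an empty set matches nothing, so emit a constant false condition.
        return "FALSE", []
    placeholders = ", ".join(["(%s)"] * len(params))
    return f"{column} IN (VALUES {placeholders})", params


where_sql, where_params = values_in_clause("t.tbl_id", [1, 2, 3])
print(where_sql)
print(where_params)
```

One way to use the result, assuming the fragment suits your table, would be QuerySet.extra(where=[where_sql], params=where_params) or a raw query. Whether it beats the unnest form should be checked with EXPLAIN ANALYZE on your own data.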
Is this an example of the first thing you're asking?
relation_list = list(ModelA.objects.filter(id__gt=100))
obj_query = ModelB.objects.filter(a_relation__in=relation_list)
That would be an "IN" command because you're first evaluating relation_list by casting it to a list, and then using it in your second query.
If instead you do the exact same thing without casting to a list (that is, pass the queryset itself), Django will only make one query, and do SQL optimization for you. So it should be more efficient that way.
You can always see the SQL command you'll be executing with obj_query.query if you're curious what's happening under the hood.
Hope that answers the question, sorry if it doesn't.
I had a lot of trouble making the custom lookup 'ineff' work.
I may have solved it, but would love some validation from Django and Postgres experts.
1) Using it 'directly' on a ForeignKey field (ModelB)
ModelA.objects.filter(ModelB__ineff=queryset_ModelB)
Throws the following exception:
"Related Field got invalid lookup: ineff"
ForeignKey fields cannot be used with custom lookups.
A similar issue is reported here:
Custom lookup is not being registered in Django
2) Using it 'indirectly' on the pk field of related model (ModelB.id)
ModelA.objects.filter(ModelB__id__ineff=queryset_ModelB.values_list('id', flat=True))
Throws the following exception:
"can only concatenate list (not "tuple") to list"
Looking at Django Traceback, I noticed that rhs_params is a tuple.
Yet we try to add it to lhs_params (a list) in our custom lookup.
Hence I changed:
params = lhs_params + rhs_params
into:
params = lhs_params + list(rhs_params)
3) I then got a Postgres error (at least I had gotten past the Django ORM)
"function unnest(uuid) does not exist"
"HINT: No function matches the given name and argument types. You might need to add explicit type casts."
I apparently solved it by changing the sql:
from:
return "%s IN (SELECT unnest(%s))" % (lhs, rhs), params
to:
return "%s IN (SELECT unnest(ARRAY(%s)))" % (lhs, rhs), params
Hence my final as_sql method looks like this:
def as_sql(self, compiler, connection):
    lhs, lhs_params = self.process_lhs(compiler, connection)
    rhs, rhs_params = self.process_rhs(compiler, connection)
    params = lhs_params + list(rhs_params)
    return "%s IN (SELECT unnest(ARRAY(%s)))" % (lhs, rhs), params
It seems to work, and is indeed faster than __in (tested with EXPLAIN ANALYZE in Postgres).
But I would love to have some validation from experts, perhaps Erwin Brandstetter?
Thanks for your input.
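The list/tuple error from step 2 can be reproduced without Django. This is only a sketch: the parameter values are made up, and the merge_params helper is a hypothetical name showing one way to stay independent of which sequence type each side returns.

```python
lhs_params = []                 # a list, as seen in the traceback described above
rhs_params = ("id-1", "id-2")   # a tuple (made-up values for illustration)

try:
    params = lhs_params + rhs_params
except TypeError as exc:
    error_message = str(exc)    # the same message Django surfaced


def merge_params(lhs, rhs):
    """Hypothetical helper: concatenate two parameter sequences of any type."""
    return [*lhs, *rhs]


params = merge_params(lhs_params, rhs_params)
print(error_message)
print(params)
```

Unpacking into a new list behaves the same as lhs_params + list(rhs_params) here, and also covers the case where the left side is a tuple.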

Hibernate + MSSQL + Fulltext Search via Contains SQLFunctionTemplate

I've set up a fulltext index on my customer table to be able to quickly search for customers via their info.
I had to create a custom Hibernate dialect to simplify the mapping, and do some funky stuff to get Hibernate to work:
Hibernate + MSSQL + Fulltext Search via Contains
My custom dialect has a function that looks like this
registerFunction("contains", new SQLFunctionTemplate(StandardBasicTypes.BOOLEAN, "CONTAINS(?1, ?2) AND 1"));
This makes it possible to do queries as follows
from Customer c where contains(c.name, :term) = true
My problem now is that to be able to return partial matches I need to quote the term and add a *
So the raw query would be
select * from customer c where CONTAINS(c.name, '"mycust*"');
I've tried wrapping the SQLFunctionTemplate with quotes and a star, and tried quoting at the call site, but neither works.
Any suggestions on how to create a sql function that does the raw query above?
Just use Hibernate's Criteria API and sqlRestriction; then you can pass any string you want to the query, e.g. searchValue=myCust* etc.
Restrictions.sqlRestriction(" CONTAINS(name, ?)", searchValue, StandardBasicTypes.STRING)

Lucene Query syntax using Boolean Clauses

I have two fields in Lucene
type (can contain values like X, Y, Z)
date (contains values like 2015-18-10 etc)
I want to write the following query: (type = X and date = today's date) OR (type = anything except X).
How can I write this query using SHOULD, MUST, MUST_NOT? It looks like there is no clause for this type of query.
You can express the latter part using *:* -type:X, as this creates the set of all documents, and then subtracts the set of documents that has type:X. The *:* query is represented as MatchAllDocsQuery in code.
If I understood your problem correctly, the solution is just a combination of BooleanQuery clauses. The following Scala code addresses the issue.
According to the documentation (in BooleanClause.java), MUST_NOT should be used with caution:
Use this operator for clauses that must not appear in the matching documents. Note that it is not possible to search for queries that only consist of a MUST_NOT clause.
object LuceneTest extends App {
  val query = new BooleanQuery
  val subQuery1 = new BooleanQuery
  subQuery1.add(new TermQuery(new Term("type", "xx")), BooleanClause.Occur.MUST)
  subQuery1.add(new TermQuery(new Term("date", "yy")), BooleanClause.Occur.MUST)
  val subQuery2 = new BooleanQuery
  // As mentioned above, MatchAllDocsQuery is added here to avoid a query containing only MUST_NOT
  subQuery2.add(new MatchAllDocsQuery, BooleanClause.Occur.MUST)
  subQuery2.add(new TermQuery(new Term("type", "xx")), BooleanClause.Occur.MUST_NOT)
  // subQuery1 and subQuery2 build the two sub-queries respectively;
  // then OR (i.e. SHOULD in Lucene) combines them
  query.add(subQuery1, BooleanClause.Occur.SHOULD)
  query.add(subQuery2, BooleanClause.Occur.SHOULD)
  query
}
Anyway, hope it helps.
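To see why the "all documents minus type:X" trick gives the intended result, the same logic can be modelled with plain sets. This is only an illustration of the set arithmetic, not Lucene code; the four documents and the value of today are made up.

```python
# Made-up documents keyed by id; stand-ins for indexed Lucene documents.
docs = {
    1: {"type": "X", "date": "2015-10-18"},
    2: {"type": "X", "date": "2015-10-17"},
    3: {"type": "Y", "date": "2015-10-18"},
    4: {"type": "Z", "date": "2015-10-01"},
}
today = "2015-10-18"  # assumed value for illustration

all_docs = set(docs)                                          # plays the role of *:*
type_x = {i for i, d in docs.items() if d["type"] == "X"}
date_today = {i for i, d in docs.items() if d["date"] == today}

sub_query_1 = type_x & date_today   # MUST type:X and MUST date:today
sub_query_2 = all_docs - type_x     # MUST *:* and MUST_NOT type:X
result = sub_query_1 | sub_query_2  # the two SHOULD clauses combined

print(sorted(result))
```

Document 2 is the only one excluded: it has type X but not today's date.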

SOQL: Performing Query with both LIKE & IN

I have a list of accounts that I would like to perform a LIKE query on. Ideally, my query would be something like:
SELECT id, owner.name FROM Account WHERE name LIKE IN :entityList
Is there any way I can do this?
The issue is that my entity names come from a third-party source, which means that small variations in name may be present, e.g.
"Bay Ridge apt." VS "Bay Ridge Apartments"
It's hard to predict where the difference in spelling might be, and I was hoping that the LIKE filter might be some magical filter that can figure this out for me and match on a substring (i.e. "Bay Ridge").
How can I perform this query?
Thank you!
EDIT: The Salesforce guide doesn't list this option, so it might not be possible to combine a LIKE and an IN. Maybe there's a way around it?
SOQL Comparison Operators
EDIT: Maybe there's a way to perform a SOSL query on a list? Something like :
Find {entityList} In Account
I can't seem to find this anywhere...
You can achieve this by looping over your list and building a dynamic query from it.
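String[] entityList = new String[]{'CEO','CFO','CMO','CTO','CIO','COO','VP','DIRECTOR','VIP'};
Boolean first = true;
String query = 'Select Id FROM CONTACT';
for (String el : entityList) {
    // The first condition starts the WHERE clause; later ones are OR'ed onto it
    if (!first) {
        query += ' OR';
    } else {
        query += ' WHERE';
    }
    // Escape single quotes, since the names come from an external source
    query = query + ' name LIKE \'%' + String.escapeSingleQuotes(el) + '%\'';
    first = false;
}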
Turns out, you don't need to have both LIKE & IN. By setting the return value as a list, and referencing an iterator in the query, IN is implied. This query does both:
SELECT id, owner.name FROM Account WHERE name LIKE :entityList
See here:
SOQL: Performing Query with both LIKE & IN
Also, there doesn't seem to be a good way to bulkify the SOSL query.

JCR SQL2 query with dynamic date comparison

I need to query the jcr repository to find nodes where a date property (e.g. jcr:created) is younger than a specific date.
Using SQL2, I do the check "jcr:created > date" like this (which works fine):
SELECT * FROM [nt:base] AS s WHERE s.[jcr:created] > CAST('2012-01-05T00:00:00.000Z' AS DATE)
Now the tricky part:
There's an additional property which declares a number of days which have to be added to the jcr:created date dynamically.
Let's say the property contains 5 (days) then the query should not check "jcr:created > date" but rather "(jcr:created + 5) > date". The next node containing the property value 10 should be checked by "(jcr:created + 10) > date".
Is there any intelligent / dynamic operand which could do that? As the property is node-specific I cannot add it statically to the query; it has to be read from each node.
Jackrabbit doesn't currently support such dynamic constraints.
I believe the best solution for now is to run the query with a fixed date constraint and then explicitly filter the results by yourself.
An alternative solution would be to precompute the "jcr:created + extratime" value and store it in an additional property. Such computation could either be located in the code that creates/updates the nodes in the first place, or you could place it in an observation listener so it'll get triggered regardless of how the node is being modified.
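The per-node check behind both suggestions can be sketched in a language-neutral way. The dictionaries below stand in for nodes returned by a query, and the property name extra_days and the helper is_match are hypothetical. The same comparison would be applied either while filtering results in client code or when precomputing the stored value.

```python
from datetime import datetime, timedelta, timezone


def is_match(created: datetime, extra_days: int, cutoff: datetime) -> bool:
    """Return True when (created + extra_days) is later than the cutoff date."""
    return created + timedelta(days=extra_days) > cutoff


cutoff = datetime(2012, 1, 5, tzinfo=timezone.utc)

# Made-up stand-ins for query results; 'extra_days' is a hypothetical property name.
nodes = [
    {"path": "/a", "created": datetime(2012, 1, 1, tzinfo=timezone.utc), "extra_days": 5},
    {"path": "/b", "created": datetime(2012, 1, 1, tzinfo=timezone.utc), "extra_days": 2},
    {"path": "/c", "created": datetime(2011, 12, 20, tzinfo=timezone.utc), "extra_days": 10},
    {"path": "/d", "created": datetime(2011, 12, 28, tzinfo=timezone.utc), "extra_days": 10},
]

matches = [n["path"] for n in nodes if is_match(n["created"], n["extra_days"], cutoff)]
print(matches)
```

If the results are filtered after the query, the fixed date constraint in the query itself has to be loose enough (for example the cutoff minus the largest possible number of extra days) so that no matching node is excluded before the filter runs.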
I needed to find documents created in the last 12 hours.
I had a hard time getting a valid date into the CAST function, so I'm pasting this for others who may need it.
SimpleDateFormat dateFormat = new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'");
// 'Z' is a literal in the pattern, so format in UTC to keep the timestamp correct
dateFormat.setTimeZone(TimeZone.getTimeZone("UTC"));
Calendar cal = Calendar.getInstance();
cal.add(Calendar.HOUR, -12); // 12 hours ago
String queryString = "SELECT * FROM [nt:base] AS s WHERE "
+ "ISDESCENDANTNODE([/content/en/documents/]) "
+ "and s.[jcr:created] >= CAST('" + dateFormat.format(cal.getTime()) + "' AS DATE)";
I found the recipe there:
test.sql2.txt
It contains a list of tests. My query looks like:
SELECT * FROM [nt:base] where [jcr:created] > cast('+2012-01-01T00:00:00.000Z' as date)
Everything inside the cast string is required: +yyyy-MM-ddT00:00:00.000Z

Resources