I have a document embedded in a list, and I need to query for documents matching the value of a series of Strings.
My document:
As you can see, I have a regular document. Inside that document is "categories" which is of an unknown length per document. Inside categories I have "confidence" and "title". I need to query to find documents which have a titles matching a list of title I have in an ArrayList. The query I thought would work is:
FOR document IN documents FILTER document.categories.title IN #categories RETURN article
#categories is an ArrayList with a list of titles. If any of the titles in the ArrayList are in the document, I would like it to be returned.
This query seems to be returning null. I don't think it is getting down to the level of comparing the ArrayList to the "title" field in my document. I know I can access the "categories" list using [#] but I don't know how to search for the "title"s in "categories".
The query
FOR document IN documents
FILTER document.categories.title IN #categories
RETURN article
would work if document.categories is a scalar, but it will not work if document.categories is an array. The reason is that the array on the left-hand side of the IN operator will not be auto-expanded.
To find the documents the query could be rewritten as follows:
FOR document IN documents
FILTER document.categories[*].title IN #categories
RETURN document
or
FOR document IN documents
LET titles = (
FOR category IN document.categories
RETURN category.title
)
FILTER titles IN #categories
RETURN document
Apart from that, article will be unknown unless there is a collection named article. It should probably read document or document.article.
Related
Our service's default search web page uses the * full Lucene query to match all documents. This is before the user has provided any search terms. There is some data (test data, in our case) that we want to exclude from the search result.
Is it possible to match all documents but exclude a subset of all documents?
For example, suppose we have an "owners" field and we want to exclude documents with the "testA" and "testB" owner. The following query does not seem to work with the match all approach:
Query: search=* -owners:testA -owners:testB&queryType=full&$orderby=created desc
Error: "Failed to parse query string. See https://aka.ms/azure-search-full-query for supported syntax."
When searching for anything but *, this approach works fine. For example:
Query: search=foo -owners:testA -owners:testB&queryType=full&$orderby=created desc
Result: (many documents matched)
I have considered a $filter for this and using $filter=filterableOwners/all(p: p ne 'testa' and p ne 'testb') but this has the following drawbacks:
the index must be rebuild with a filterable field
analyzers can't be used so case-insensitivity must be implemented by lowercasing the values and filter expression
Ideally this could be done using only the search query parameter with a Lucene query text.
I found a workaround for the issue. If you have a field in your documents that always has a value, you can use a .* regex to match all values in the field and therefore match all documents.
For example, suppose the packageId field has a value for all documents.
Incorrect (as posted in the original question):
Query: search=* -owners:testA -owners:testB&queryType=full&$orderby=created desc
Correct:
Query: search=packageId:/.*/ -owners:testA -owners:testB&queryType=full&$orderby=created desc
I would like to create an output based on the field-names of my Solr index objects.
What I have are objects like this e.g.:
{
"Id":"ID12345678",
"GroupKey":"Beta",
"PricePackage":5796.0,
"PriceCoupon":5316.0,
"PriceMin":5316.0
}
Whereby the Price* fields may vary from object to object, some might have more than three of those, some less, however they would be always prefixed with Price.
How can I query Solr to get a list with all field-names prefixed by Price?
I've looked into filters, facets but could not find any clue on how to do this, as all examples - e.g. regex facet - are in regard to the field-value, not the field-name itself. Or at least I could not adapt it to that.
You can get a comma separated list of all existing field names if you query for 0 documents and use the csv response writer (wt parameter) to generate the field name list.
For example if you request /solr/collection/select?q=*:*&wt=csv you get a list of all fields. If you only want fields prefixed with Price you could also add the field list parameter (fl) to limit the fields.
So the request to /solr/collection/select?q=*:*&wt=csv&fl=Price*should return the following response:
PricePackage,PriceCoupon,PriceMin
With this solution you get all fields existing including dynamic fields.
I have a field in solr of type list of texts.
field1:{"key1:val1,key2:val2,key3:val3", "key1:val1,key2:val2"}
I want to form a query such that when I search for key1:val1 and key3:val3 I get the result who has both the strings i.e key1:val1 and key3:val3.
How shall I form the query?
If these are values in a multivalued field, you can't - directly. You'll have to use something like highlighting to tell you where Solr matched it.
There is no way to tell Solr "I only want the value that matched inside this set of values".
If this is a necessary way to query your index, index the values as separate documents instead in a separate collection. In that case you'd have to documents instead, one with field1:"key1:val1,key2:val2,key3:val3" and one with key1:val1,key2:val2.
You can use AND with fq.
Like:
fq=key1:val1 AND key3:val3
With this filter query you will get only records where key1 = val1 AND key3 = val3.
I have been searching through the MongoDB query syntax with various combinations of terms to see if I can find the right syntax for the type of query I want to create.
We have a collection containing documents with an array field. This array field contains ids of items associated with the document.
I want to be able to check if an item has been associated more than once. If it has then more than one document will have the id element present in its array field.
I don't know in advance the id(s) to check for as I don't know which items are associated more than once. I am trying to detect this. It would be comparatively straightforward to query for all documents with a specific value in their array field.
What I need is some query that can return all the documents where one of the elements of its array field is also present in the array field of a different document.
I don't know how to do this. In SQL it might have been possible with subqueries. In Mongo Query Language I don't know how to do this or even if it can be done.
You can use $lookup to self join the rows and output the document when there is a match and $project with exclusion to drop the joined field in 3.6 mongo version.
$push with [] array non equality match to output document where there is matching document.
db.col.aggregate([
{"$unwind":"$array"},
{"$lookup":{
"from":col,
"localField":"array",
"foreignField":"array",
"as":"jarray"
}},
{"$group":{
"_id":"$_id",
"fieldOne":{"$first":"$fieldOne"},
... other fields
"jarray":{"$push":"$jarray"}
}},
{"$match":{"jarray":{"$ne":[]}}},
{"$project":{"jarray":0}}
])
I need to implement further functionality picture of it is attached below. I've already built an application based on Solr search.
In a few words about this functionality: drop down will contain similar search phrases within concrete category and number of items found.
In what way to make Solr collect such data and somehow receive it?
Yes, you can do that in Solr using Facets, which allow grouping results. The default behaviour of facets is to return the group name and the number of items found. You do that by adding these 2 items you your query string facet=true, facet.field=category.
An example query in your case will be
http://localhost:8983/solr/NAME_OF_YOUR_INDEX/select/?wt=json&indent=on&q=ipo&fl=category,name&facet=true&facet.field=category
Take a look at the tutorial for more details.
This is roughly equivalent to doing this in SQL:
SELECT category, COUNT(*) FROM items WHERE text LIKE "%ipo%" GROUP BY category;