Retrieve faceted results with non-default count value (10) from Azure Search

I am querying an Azure Search index. My goal is to retrieve unique records based on a unique parameter, say System_ID, so I started using the facets feature. However, I cannot retrieve more than 10 unique facet values, even though I set the facet count to 20 in the query.
Summary:
I can retrieve only 10 unique records even though the index contains more than 10.
When I change the facet's count property to 20, I still get only 10 records.
How can I modify the query so that it returns more than 10 records?
Any help would be appreciated.
Default query:
$filter=(systemID ne null) and (ownerSalesforceRecordID eq 'a0h5B000000gJKfQAM')&facet=machineTagSystemID,sort:value&queryType=full
Default Results:
{"machineTagSystemID": [
{
"count": 9,
"value": "ABCS test machines-111-test - change|*1XA78RUGV23PVPN"
},
{
"count": 6,
"value": "Ajit Machine testing1jjcdxxxxxxxxxxxxxx|*1L693D439H5ZNG9"
},
{
"count": 19,
"value": "Anvesh test111dsaa|*13SSNP5AJ3L96C5"
},
{
"count": 3,
"value": "Dead End cross 2|*1NK7KNNLFVTM4QC"
},
{
"count": 3,
"value": "hehehe|*1NDC32TDNXT5RAH"
},
{
"count": 14,
"value": "high2 Machine12345678ppjk fvrf|*1T2F3VQEJ58ZLQL"
},
{
"count": 31,
"value": "prashant dev machine 213|*12L343TZTFGH3M6"
},
{
"count": 1,
"value": "ryansjcilaptop465986543|*1E2PG9V3BMEYDM7"
},
{
"count": 12,
"value": "snehali DEV June|*1QXEDL8E2V8MGBY"
},
{
"count": 27,
"value": "tarun Machine-dev|*1YRPHS3J7NGUVA8"
}
]}
Facet with count:
$filter=(systemID ne null) and (ownerSalesforceRecordID eq 'a0h5B000000gJKfQAM')&facet=machineTagSystemID,sort:value,count:20&queryType=full
But the results are the same:
{"machineTagSystemID": [
{
"count": 9,
"value": "ABCS test machines-111-test - change|*1XA78RUGV23PVPN"
},
{
"count": 6,
"value": "Ajit Machine testing1jjcdxxxxxxxxxxxxxx|*1L693D439H5ZNG9"
},
{
"count": 19,
"value": "Anvesh test111dsaa|*13SSNP5AJ3L96C5"
},
{
"count": 3,
"value": "Dead End cross 2|*1NK7KNNLFVTM4QC"
},
{
"count": 3,
"value": "hehehe|*1NDC32TDNXT5RAH"
},
{
"count": 14,
"value": "high2 Machine12345678ppjk fvrf|*1T2F3VQEJ58ZLQL"
},
{
"count": 30,
"value": "prashant dev machine 213|*12L343TZTFGH3M6"
},
{
"count": 1,
"value": "ryansjcilaptop465986543|*1E2PG9V3BMEYDM7"
},
{
"count": 12,
"value": "snehali DEV June|*1QXEDL8E2V8MGBY"
},
{
"count": 27,
"value": "tarun Machine-dev|*1YRPHS3J7NGUVA8"
}
]}
This is based on the documentation link: https://learn.microsoft.com/en-us/azure/search/search-faceted-navigation

Facets honor the filter specified in a query. It's possible this is why you are only seeing 10 unique facet values for this field. Generally speaking, your query looks fine. If there were more than 10 unique values in this field for the query specified, I would expect them to show up.
How many total results are returned by this query? I see 125 total values in the facets you provided and I'm wondering if the count aligns with your results.
Mike

This is an old question, but I hit the same issue. By default, a facet returns at most 10 values; you can return more by adding a count to that facet. For example:
facet=Month,count:12&search=something
The same works in the C# API by appending the count to the facet name:
var options = new SearchOptions();
options.Facets.Add("Month,count:12");
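
The same facet expression can be assembled in other languages as a plain string. The Python sketch below is only an illustration: the helper name `facet_expression` is hypothetical and not part of any Azure library, and the field name and values are taken from the question. The resulting string is meant for places that accept a facet expression, such as the `facet` query parameter shown above.

```python
from urllib.parse import parse_qs, urlencode


def facet_expression(field, count=None, sort=None):
    """Build a facet expression such as 'Month,count:12' (hypothetical helper)."""
    parts = [field]
    if sort is not None:
        parts.append(f"sort:{sort}")
    if count is not None:
        parts.append(f"count:{count}")
    return ",".join(parts)


# Field name and values are taken from the question above.
facet = facet_expression("machineTagSystemID", count=20, sort="value")

# URL-encode the expression for use in a query string.
query_string = urlencode({"search": "*", "facet": facet})

print(facet)
print(query_string)
```

Keeping the count inside the facet expression, rather than in a separate option, mirrors how both the REST query and the C# call above pass it.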

Related

Solr's Complex Query for Nested Documents is not Working

The dataset I am working with in Solr:
[
{
"id": "doc_1",
"name": "Harpreet Chaggar",
"_childDocuments_": [
{ "id": "child_doc_a", "number": 22, "created_at": "2020-03-20T00:00:00Z" },
{ "id": "child_doc_b", "number": 10, "created_at": "2021-05-28T00:00:00Z" }
]
},
{
"id": "doc_2",
"name": "Hardik Deshmukh",
"_childDocuments_": [
{ "id": "child_doc_1", "number": 67, "created_at": "2022-03-20T00:00:00Z" },
{ "id": "child_doc_2", "number": 78, "created_at": "2022-05-28T00:00:00Z" }
]
}
]
My objective is to write an exclusion query on a nested date field, combined with some conditions on the parent, and to return the parent document in all cases.
I am trying to fetch "id": "doc_2", "name": "Hardik Deshmukh" with the following query. Note: I need the parent document in the result.
q = {!parent which='(name:("Hardik" OR "Harpreet") AND id:"doc_1")'}-created_at:[2020-01-17T00:00:00Z TO 2021-12-17T00:00:00Z]
But I am not getting any results.
To check that the date query itself works, I executed the query below.
q = -created_at:[2020-01-17T00:00:00Z TO 2021-12-17T00:00:00Z]
And it worked.
"response":{"numFound":4,"start":0,"numFoundExact":true,"docs":[
{
"id":"doc_1",
"name":["Harpreet Chaggar"],
"_version_":1746310602768252928},
{
"id":"child_doc_1",
"number":["67"],
"created_at":"2022-03-20T00:00:00Z",
"_version_":1746310602791321600},
{
"id":"child_doc_2",
"number":["78"],
"created_at":"2022-05-28T00:00:00Z",
"_version_":1746310602791321600},
{
"id":"doc_2",
"name":["Hardik Deshmukh"],
"_version_":1746310602791321600}]
}}
Field types:
For created_at
Field-Type:org.apache.solr.schema.DatePointField
For name
Field-Type:org.apache.solr.schema.TextField
And if I want to fetch "id": "doc_1", I am able to get it by executing the following query.
{!parent which='(name:("Hardik" OR "Harpreet") AND id:"doc_1")'} ( created_at:[2020-01-17T00:00:00Z TO 2021-12-17T00:00:00Z] )
It fetches the desired results.
"response":{"numFound":1,"start":0,"numFoundExact":true,"docs":[
{
"id":"doc_1",
"name":["Harpreet Chaggar"],
"_version_":1746310602768252928}]
}}

How to sort on a key located inside an array of objects in MongoDB

I obtained the following document structure after running an aggregation query on my MongoDB collection:
"lots": [
{
"lot_no": "Lot number",
"location": "ABC123",
"shipping_line_id": "ONE1568295272",
"shipping_line_name": "One Line",
"containers": {
"CNTR_20STD": {
"qty": 10,
"expiry_date": 87126348762
},
"CNTR_40STD": {
"qty": 10,
"expiry_date": 87126348762
},
"CNTR_40HC": {
"qty": 20,
"expiry_date": 87126348762
}
},
"status": true,
"lot_items": [
{
"_id": "String to be generated as ID",
"origin_port": {
"port_name": "Nhava Sheva",
"port_code": "INNSA"
},
"destination_port": {
"port_name": "Jebel Ali",
"port_code": "AEJEA"
},
"container_type": "CNTR_20STD/CNTR_40STD/CNTR_40HC",
"allowable_weight": 30,
"special_terms": "Some special terms",
"buy": 1000,
"sell": 1200
}
]
}
]
I want to be able to sort the results by origin port name, destination port name, buy, sell, allowable weight, etc.
I am trying the following, but it is not working:
{"$sort": {"lot_items.buy"}}
Can someone explain how to sort by the above-mentioned keys, which are inside an array of objects in the MongoDB document?

Querying CosmosDB based on timestamps

I am working with a CosmosDB instance set up by one of my colleagues and connecting to it using a connection string. The database contains several JSON documents with the following schema:
{
"period": "Raw",
"source": "Traffic",
"batchId": "ee737270-0b72-49b7-a2f1-201f642e9c81",
"periodName": "Raw",
"sourceName": "Traffic",
"groupKey": "gc4151_a",
"partitionKey": "traffic-gc4151_a-raw-raw",
"time": "2021-08-05T23:55:10",
"minute": 55,
"hour": 23,
"day": 05,
"month": 08,
"quarter": 3,
"year": 2021,
"minEventTime": "2021-08-05T23:55:09",
"maxEventTime": "2021-08-05T23:55:11",
"meta": {
"siteId": "GC4151_A",
"from": {
"lat": "55.860894822588506",
"long": "-4.284365958508686"
},
"to": {
"lat": "55.86038667864348",
"long": "-4.2826901232101795"
}
},
"measurements": {
"flow": [
{
"calculation": "Raw",
"name": "flow",
"calculationName": "Raw",
"value": 0
}
],
"concentration": [
{
"calculation": "Raw",
"name": "concentration",
"calculationName": "Raw",
"value": 0
}
]
},
"added": "2021-08-05T12:21:32.000819Z",
"updated": "2021-08-05T12:21:32.000819Z",
"id": "d4346f50-543e-4c4d-82cf-835b480914c2",
"_rid": "4RRTAIYVA1AIAAAAAAAAAA==",
"_self": "dbs/4RRTAA==/colls/4RRTAIYVA1A=/docs/4RRTAIYVA1AIAAAAAAAAAA==/",
"_etag": "\"1c0015a1-0000-1100-0000-5f3fbc4c0000\"",
"_attachments": "attachments/",
"_ts": 1598012492
}
I am trying to write a SQL query to select all the records that fall between the current date-time and one week earlier, so I can use these to perform future calculations.
I have attempted to use both of the following:
SELECT *
FROM c
WHERE c.time > date_sub(now(), interval 1 week);
and
SELECT *
FROM c
WHERE c.time >= DATE_ADD(CURDATE(), INTERVAL -7 DAY);
However, both of these return the following error:
Gateway Failed to Retrieve Query Plan: Message: {"errors":[{"severity":"Error","location":{"start":124,"end":125},"code":"SC1001","message":"Syntax error, incorrect syntax near '1'."}]}
ActivityId: 51c3b6f7-e760-4062-bd80-8cc9f8de5352, Microsoft.Azure.Documents.Common/2.14.0, Microsoft.Azure.Documents.Common/2.14.0
My question is what is the issue with my code, and how can I fix it?

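Cosmos DB's SQL dialect does not support MySQL-style functions such as date_sub(), now(), or CURDATE(), which is why both queries fail with a syntax error. You can use DateTimeAdd and GetCurrentDateTime() to achieve this. For example:
SELECT *
FROM c
WHERE c.time > DateTimeAdd("day", -7, GetCurrentDateTime())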
Let me know if this works for you.
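
If the cutoff is easier to compute in application code, a similar filter can be written with a query parameter. The sketch below is an illustration under assumptions, not part of the original answer: the helper names `week_ago_cutoff` and `build_query` and the parameter name `@cutoff` are hypothetical, and it assumes `c.time` is stored as an ISO 8601 string without a time-zone suffix, as in the sample document, so that string comparison orders timestamps chronologically.

```python
from datetime import datetime, timedelta


def week_ago_cutoff(now):
    """Return the timestamp seven days before `now`, formatted like c.time."""
    return (now - timedelta(days=7)).strftime("%Y-%m-%dT%H:%M:%S")


# Parameterized query text; @cutoff is a hypothetical parameter name.
QUERY = "SELECT * FROM c WHERE c.time > @cutoff"


def build_query(now):
    """Pair the query text with its parameter list (name/value dictionaries)."""
    return QUERY, [{"name": "@cutoff", "value": week_ago_cutoff(now)}]


# A fixed "now" keeps the example deterministic; real code would pass the current time.
query, parameters = build_query(datetime(2021, 8, 12, 23, 55, 10))
print(query)
print(parameters)
```

To my knowledge, a list of name/value dictionaries like this is the shape the azure-cosmos Python SDK expects for the `parameters` argument of `query_items`, but treat that as an assumption to verify against the SDK version in use.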

How to set unique constraint for field in document nested in array?

I have a collection of documents in MongoDB that looks like:
{"_id": 1, "array": [{"id": 1, "content": "..."}, {"id": 2, "content": "..."}]}
{"_id": 2, "array": [{"id": 1, "content": "..."}, {"id": 2, "content": "..."}, {"a_id": 3, "content": "..."}]}
and I want to ensure that there is no duplicate array.id within each document. So the provided example is OK, but the following is not:
{"_id": 1, "array": [{"id": 1, "content": "..."}, {"id": 1, "content": "..."}]}
My question is how to do this (preferably in PyMongo).
EDIT
I tried the following code, which I thought would create a unique key on (_id, array.id), but when you run it the constraint is not enforced:
from pymongo import MongoClient, ASCENDING
client = MongoClient(host="localhost", port=27017)
database = client["test_db"]
collection = database["test_collection"]
collection.drop()
collection.create_index(keys=[("_id", ASCENDING),
                              ("array.id", ASCENDING)],
                        unique=True,
                        name="new_key")
document = {"array": [{"id": 1}, {"id": 2}]}
collection.insert_one(document)
collection.find_one_and_update({"_id": document["_id"]},
                               {"$push": {"array": {"id": 1}}})
updated_document = collection.find_one({"_id": document["_id"]})
print(updated_document)
which outputs the following (note that there are two objects with id = 1 in the array). I expected an exception instead.
{'_id': ObjectId('5eb51270d6d70fbba739e3b2'), 'array': [{'id': 1}, {'id': 2}, {'id': 1}]}

So if I understand it correctly, there is no way to set an index (or some condition) that would enforce uniqueness within the document, right? (Other than checking this explicitly when creating the document or when inserting into it.)
Yes. Please see the following two scenarios, which use a unique index on an array field with embedded documents.
Unique Multikey Index (index on an embedded document field within an array):
For unique indexes, the unique constraint applies across separate documents in the collection rather than within a single document.
Because the unique constraint applies to separate documents, for a unique multikey index, a document may have array elements that result in repeating index key values as long as the index key values for that document do not duplicate those of another document.
First Scenario:
db.arrays.createIndex( { _id: 1, "array.id": 1}, { unique: true } )
db.arrays.insertOne( { "_id": 1, "array": [ { "id": 1, "content": "11"}, { "id": 2, "content": "22"} ] } )
db.arrays.insertOne( { "_id": 2, "array": [ { "id": 1, "content": "1100"}, { "id": 5, "content": "55"} ] } )
db.arrays.insertOne( {"_id": 3, "array": [ {"id": 3, "content": "33"}, {"id": 3, "content": "3300"} ] } )
All three documents are inserted without any errors.
As per the note on unique multikey indexes above, the document with _id: 3 has two embedded documents in the array with the same "array.id" value: 3.
Also, uniqueness is enforced on the two keys of the compound index { _id: 1, "array.id": 1 } together, so duplicate "array.id" values across documents (here, in the documents with _id values 1 and 2) are allowed as well.
Second Scenario:
db.arrays2.createIndex( { "array.id": 1 }, { unique: true } )
db.arrays2.insertOne( { "_id": 3, "array": [ { "id": 3, "content": "33" }, { "id": 3, "content": "330"} ] } )
db.arrays2.insertOne( { "_id": 4, "array": [ { "id": 3, "content": "331" }, { "id": 30, "content": "3300" } ] } )
The first document, with _id: 3, is inserted successfully. The second one fails with the error: "errmsg" : "E11000 duplicate key error collection: test.arrays2 index: array.id_1 dup key: { array.id: 3.0 } ". This is expected, as per the note on unique multikey indexes.
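
Since a unique index does not enforce uniqueness inside one document, the explicit check mentioned in the question has to live in application code. A minimal sketch of such a check follows; the function name `has_duplicate_ids` is hypothetical, and it assumes that only array elements carrying an `id` key matter.

```python
def has_duplicate_ids(elements, key="id"):
    """Return True if two elements of the array share the same value for `key`."""
    seen = set()
    for element in elements:
        if key not in element:
            continue  # elements without the key are ignored in this sketch
        value = element[key]
        if value in seen:
            return True
        seen.add(value)
    return False


# Sample documents taken from the question.
valid = {"_id": 1, "array": [{"id": 1, "content": "..."}, {"id": 2, "content": "..."}]}
invalid = {"_id": 1, "array": [{"id": 1, "content": "..."}, {"id": 1, "content": "..."}]}

print(has_duplicate_ids(valid["array"]))
print(has_duplicate_ids(invalid["array"]))
```

Running this check before inserting a document would reject the second sample while accepting the first.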

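You can do this check on update:
const doc = await Model.findOneAndUpdate(
  // Match the document only if no array element already has this id
  { _id, 'array.id': { $ne: newID } },
  {
    $push: {
      // Push an object so the new element has the same shape as the others
      array: { id: newID }
    }
  },
  { new: true }
);
// doc is null when the id was already present (or the document does not exist)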
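
Because the question asked for PyMongo, the same guarded update can be sketched in Python. The helper `build_guarded_push` is hypothetical; it only builds the filter and update documents, and it pushes an object with an `id` field so that the `array.id` condition applies to the new element.

```python
def build_guarded_push(doc_id, new_id):
    """Build (filter, update) so the push only applies when new_id is absent."""
    # The filter matches the document only if no array element has this id.
    query_filter = {"_id": doc_id, "array.id": {"$ne": new_id}}
    # The pushed element is an object, matching the shape of existing elements.
    update = {"$push": {"array": {"id": new_id}}}
    return query_filter, update


query_filter, update = build_guarded_push(doc_id=1, new_id=3)
print(query_filter)
print(update)
```

With a live collection, these would be passed as `result = collection.update_one(query_filter, update)`. A `result.modified_count` of 0 would then mean that the id was already present or that the document was not found.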

How to write a SQL query in CosmosDB for a JSON document which has nested/multiple array

I need to write a SQL query in the CosmosDB query editor that fetches results from JSON documents stored in a collection, according to the requirement shown below.
Example JSON:
{
"id": "abcdabcd-1234-1234-1234-abcdabcdabcd",
"source": "Example",
"data": [
{
"Laptop": {
"New": "yes",
"Used": "no",
"backlight": "yes",
"warranty": "yes"
}
},
{
"Mobile": [
{
"order": 1,
"quantity": 2,
"price": 350,
"color": "Black",
"date": "07202019"
},
{
"order": 2,
"quantity": 1,
"price": 600,
"color": "White",
"date": "07202019"
}
]
},
{
"Accessories": [
{
"covers": "yes",
"cables": "few"
}
]
}
]
}
Requirement:
SELECT 'warranty' (Laptop), 'quantity' (Mobile), 'color' (Mobile), 'cables' (Accessories) for a specific 'date' (for eg: 07202019)
I've tried the following query
SELECT
c.data[0].Laptop.warranty,
c.data[1].Mobile[0].quantity,
c.data[1].Mobile[0].color,
c.data[2].Accessories[0].cables
FROM c
WHERE ARRAY_CONTAINS(c.data[1].Mobile, {date : '07202019'}, true)
Actual output from the above query:
[
{
"warranty": "yes",
"quantity": 2,
"color": "Black",
"cables": "few"
}
]
But how can I get the following expected output, which includes the details of every order in the 'Mobile' array?
[
{
"warranty": "yes",
"quantity": 2,
"color": "Black",
"cables": "few"
},
{
"warranty": "yes",
"quantity": 1,
"color": "White",
"cables": "few"
}
]
Because I wrote c.data[1].Mobile[0].quantity, with 'Mobile[0]' hard-coded, only the first entry is returned in the output, but I want every entry in the array to be listed.

Consider using the JOIN operator in your SQL:
SELECT DISTINCT
c.data[0].Laptop.warranty,
mobile.quantity,
mobile.color,
c.data[2].Accessories[0].cables
FROM c
JOIN data in c.data
JOIN mobile in data.Mobile
WHERE ARRAY_CONTAINS(data.Mobile, {date : '07202019'}, true)
Updated answer:
Your SQL:
SELECT DISTINCT c.data[0].Laptop.warranty, mobile.quantity, mobile.color, accessories.cables FROM c
JOIN data in c.data JOIN mobile in data.Mobile
JOIN accessories in data.Accessories
WHERE ARRAY_CONTAINS(data.Mobile, {date : '07202019'}, true)
My advice:
The Cosmos DB JOIN operation is limited to the scope of a single document. You can join a parent object with its child objects within the same document; cross-document joins are not supported. However, your SQL tries to perform multiple parallel joins. In other words, Accessories and Mobile are sibling arrays at the same level, not nested one inside the other.
I suggest using a stored procedure to execute two SQL queries and then combining the results. Alternatively, you could implement this process in application code.
Please see this case: CosmosDB Join (SQL API)
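
The option of doing the combination in application code can be sketched as follows. This is a minimal illustration under assumptions: the function name `rows_for_date` is hypothetical, and the document is assumed to have the shape of the example JSON in the question (one `Laptop` object, one `Mobile` array, and one `Accessories` array inside `data`).

```python
def rows_for_date(document, date):
    """Combine Laptop, Mobile and Accessories fields for orders on `date`."""
    # Merge the single-key objects in `data` into one lookup dictionary.
    sections = {}
    for entry in document["data"]:
        sections.update(entry)

    warranty = sections["Laptop"]["warranty"]
    cables = sections["Accessories"][0]["cables"]

    # Emit one row per Mobile order that matches the requested date.
    return [
        {
            "warranty": warranty,
            "quantity": order["quantity"],
            "color": order["color"],
            "cables": cables,
        }
        for order in sections["Mobile"]
        if order["date"] == date
    ]


# Sample document copied from the question.
document = {
    "id": "abcdabcd-1234-1234-1234-abcdabcdabcd",
    "source": "Example",
    "data": [
        {"Laptop": {"New": "yes", "Used": "no", "backlight": "yes", "warranty": "yes"}},
        {"Mobile": [
            {"order": 1, "quantity": 2, "price": 350, "color": "Black", "date": "07202019"},
            {"order": 2, "quantity": 1, "price": 600, "color": "White", "date": "07202019"},
        ]},
        {"Accessories": [{"covers": "yes", "cables": "few"}]},
    ],
}

rows = rows_for_date(document, "07202019")
for row in rows:
    print(row)
```

For the sample document this yields one row per Mobile order, which is the shape of the expected output in the question.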
