I am using a Solr query in my code to fetch a collection of documents from Solr.
There is a field called 'rank'. Previously the fetch query's condition included "AND (rank:1 OR rank:0)", but now I want to select by the lowest rank number instead.
For example, if my collection has no rank 0 or 1 and the lowest rank is 3, I want to get the list of records having that lowest rank number.
This is my existing query (from logs):
q=category:nfroots AND (rank:1 OR rank:0)
fq=-isLegacyProduct:true
group=true
In that query, category is nfroots, rank is 0 or 1, legacy products are excluded, and grouping is enabled.
Instead of rank 0 or 1, the query should match the lowest rank number present.
here is my collection:
{
"id":"1",
"rank":"1",
"name":"A"
},
{
"id":"2",
"rank":"2",
"name":"A"
},
{
"id":"3",
"rank":"2",
"name":"B"
},
{
"id":"4",
"rank":"3",
"name":"B"
},
{
"id":"5",
"rank":"4",
"name":"B"
}
My result should look like this:
{
"id":"1",
"rank":"1",
"name":"A"
},
{
"id":"3",
"rank":"2",
"name":"B"
}
As the result above shows, I am trying to get the document with the minimum rank for each name ('A' and 'B').
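One option, sketched against the logged query and not verified against your schema, is to group by name (assumed from the sample data) and sort by rank ascending, keeping a single document per group so each group returns its lowest-ranked record:

```
q=category:nfroots
fq=-isLegacyProduct:true
group=true
group.field=name
group.limit=1
sort=rank asc
```

Note that rank is a string in the sample documents; for the sort to be numeric rather than lexicographic, the field should use a numeric field type (e.g. pint).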
I have documents that contain an object array. Within that array are pulses in a dataset. For example:
samples: [{"time":1224960,"flow":0,"temp":null},{"time":1224970,"flow":0,"temp":null},
{"time":1224980,"flow":23,"temp":null},{"time":1224990,"flow":44,"temp":null},
{"time":1225000,"flow":66,"temp":null},{"time":1225010,"flow":0,"temp":null},
{"time":1225020,"flow":650,"temp":null},{"time":1225030,"flow":40,"temp":null},
{"time":1225040,"flow":60,"temp":null},{"time":1225050,"flow":0,"temp":null},
{"time":1225060,"flow":0,"temp":null},{"time":1225070,"flow":0,"temp":null},
{"time":1225080,"flow":0,"temp":null},{"time":1225090,"flow":0,"temp":null},
{"time":1225100,"flow":0,"temp":null},{"time":1225110,"flow":67,"temp":null},
{"time":1225120,"flow":23,"temp":null},{"time":1225130,"flow":0,"temp":null},
{"time":1225140,"flow":0,"temp":null},{"time":1225150,"flow":0,"temp":null}]
I would like to construct an aggregate pipeline to act on each collection of consecutive 'samples.flow' values above zero. As in, the sample pulses are delimited by one or more zero flow values. I can use an $unwind stage to flatten the data but I'm at a loss as to how to subsequently group each pulse. I have no objections to this being a multistep process. But I'd rather not have to loop through it in code on the client side. The data will comprise fields from a number of documents and could total in the hundreds of thousands of entries.
From the example above I'd like to be able to extract:
[{"time":1224980,"total_flow":123,"temp":null},
{"time":1225020,"total_flow":750,"temp":null},
{"time":1225110,"total_flow":90,"temp":null}]
or variations thereof.
If you are not looking for specific values in the time field, you can use this pipeline with $bucketAuto:
[
  {
    "$bucketAuto": {
      "groupBy": "$time",
      "buckets": 3,
      "output": {
        "total_flow": { "$sum": "$flow" },
        "temp": { "$first": "$temp" },
        "time": { "$min": "$time" }
      }
    }
  },
  {
    "$project": {
      "_id": 0
    }
  }
]
If you are looking for specific values for time, you will need to use $bucket and provide its boundaries argument with precalculated lower bounds. I think this approach should do the job.
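If computing those bounds once on the client is acceptable, one way to precalculate them is to take the start time of each pulse, i.e. every sample whose flow is above zero while the previous sample's flow was zero. A plain-JavaScript sketch (illustrative; field names mirror the sample data):

```javascript
// Derive the precalculated lower bounds for $bucket: the start time of
// each pulse, i.e. every sample whose flow is above zero while the
// previous sample's flow was zero.
function pulseBoundaries(samples) {
  const bounds = [];
  let prevFlow = 0;
  for (const sample of samples) {
    if (sample.flow > 0 && prevFlow === 0) {
      bounds.push(sample.time);
    }
    prevFlow = sample.flow;
  }
  return bounds;
}
```

Since $bucket needs n+1 boundaries for n buckets, you would append one extra value beyond the last sample's time before passing the array as the boundaries argument.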
I would like to retrieve some summary statistics from the text documents I have indexed in Solr. In particular, the word count per document.
For example, I have the following three documents indexed:
{
"id":"1",
"text":["This is the text in document 1"]},
{
"id":"2",
"text":["some text in document 2"]},
{
"id":"3",
"text":["and document 3"]}
I would like to get the total number of words for each individual document:
"1",7,
"2",5,
"3",3,
What query can I use to get such a result?
I am new to Solr and I am aware that I can use facets to get the count of the individual words over all documents using something like:
http://localhost:8983/solr/corename/select?q=*&facet=true&facet.field=text&facet.mincount=1
But how to get the total word count per document is not clear to me.
I appreciate your help!
If you do a faceted search over id with an inner facet over text, the inner facet count will give the number of words in the document with that id. But the text field type must be text_general or something equivalent (i.e. tokenized).
If you only want to count "distinct" words per document id, it is actually much easier:
{
"query": "*:*",
"facet": {
"document": {
"type": "terms",
"field": "id",
"facet": {
"wordCount": "unique(message)"
}
}
}
}
This gives the distinct word count per document. The following gives all words and their counts per document, but it's up to you to sum them to get the total (it's also an expensive call):
{
"query": "*:*",
"facet": {
"document": {
"type": "terms",
"field": "id",
"facet": {
"wordCount": {
"type": "terms",
"field": "message",
"limit": -1
}
}
}
}
}
@MatsLindth's comment is something to consider too: Solr and you might not agree on what a "word" is. The tokenizer is configurable to a point, but depending on your needs it might not be easy to match.
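If you do end up summing the per-term counts yourself, a small client-side sketch might look like this. The response shape (facets → document → buckets, each bucket carrying a nested wordCount facet of val/count pairs) is an assumption based on the JSON Facet API request above:

```javascript
// Sum the per-term counts in each document bucket of a JSON Facet
// response to get a total word count per document id. The response
// shape is assumed from the facet request shown above.
function totalWordCounts(facetResponse) {
  const totals = {};
  for (const doc of facetResponse.facets.document.buckets) {
    totals[doc.val] = doc.wordCount.buckets.reduce(
      (sum, term) => sum + term.count, 0);
  }
  return totals;
}
```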
I'm building a system to filter on the user's demands, and for one part I must filter by all of the selected attributes. Here is my query builder:
return EncyclopedieModel::with('stats')
->join('equipement_stats', 'equipement_stats.id_equipement', '=', 'equipement.id_equipement')
->join('stats', 'stats.id_stats','=','equipement_stats.id_stats')
->whereIn('stats.id_stats', $filterStats)
->whereIn('id_typeequipement', $filterEquipement)
->whereIn('id_rarete', $filterRarity)
->skip(0 + $toskip)
->take(10)
->get()
->toJson();
I must filter on multiple stats at once.
I first take all equipements that fit the user's demand, then I get the statistics of each item with eager loading.
My problem is that I must get items that have both stats at the same time. For example, if the user selects statistics "3" and "7", I must get only the items that have both of these statistics.
Right now, I'm getting all equipements with statistic "3" and all equipements with statistic "7"...
I don't know how I should implement this.
EDIT: I tried to simplify this with cars and caracteristics:
[
{
"id_car":1,
"nom_car":"Car A",
"rarity": 1,
"caracteristic":[
{
"id_caracterstic":3,
"nom_stats":"Caracteristic A"
},
{
"id_caracterstic":8,
"nom_stats":"Caracteristic W"
},
{
"id_caracterstic":4,
"nom_stats":"Caracteristic Z"
}
]
},
{
"id_car":2,
"nom_car":"Car B",
"rarity": 2,
"caracteristic":[
{
"id_caracterstic":5,
"nom_stats":"Caracteristic P"
},
{
"id_caracterstic":8,
"nom_stats":"Caracteristic W"
},
{
"id_caracterstic":12,
"nom_stats":"Caracteristic ZA"
}
]
},
{
"id_car":1,
"nom_car":"Car C",
"rarity": 2,
"caracteristic":[
{
"id_caracterstic":12,
"nom_stats":"Caracteristic P"
},
{
"id_caracterstic":8,
"nom_stats":"Caracteristic W"
},
{
"id_caracterstic":14,
"nom_stats":"Caracteristic ZDD"
}
]
}
]
For example, I must find cars in my database whose rarity is "2" and whose caracteristics include both 8 and 12.
The way I'm doing it now, I get Car A, Car B and Car C, because my query looks through all cars with caracteristic 8 and all cars with caracteristic 12.
What I want is to get only Car B and Car C when I'm looking for a car with caracteristics "8" and "12".
From your question, my understanding is that you want to filter your models to include only those that have all of the requested stats.
With that in mind, you simply need to modify your existing query to use GROUP BY and HAVING.
return EncyclopedieModel::with('stats')
->join('equipement_stats', 'equipement_stats.id_equipement', '=', 'equipement.id_equipement')
->join('stats', 'stats.id_stats','=','equipement_stats.id_stats')
->whereIn('stats.id_stats', $filterStats)
->whereIn('id_typeequipement', $filterEquipement)
->whereIn('id_rarete', $filterRarity)
->groupBy('equipement.id_equipement')
->havingRaw('COUNT(DISTINCT stats.id_stats) = ?', [count($filterStats)])
->skip(0 + $toskip)
->take(10)
->get()
->toJson();
The GROUP BY is required to collect the matched stats per distinct model for comparison later. The HAVING is essentially a WHERE clause applied to an aggregate function, in this case COUNT.
So we keep only the models whose count of matched stats exactly equals the number of stats specified.
Edit - HAVING definition
A HAVING clause in SQL specifies that an SQL SELECT statement should only return rows where aggregate values meet the specified conditions. It was added to the SQL language because the WHERE keyword could not be used with aggregate functions.
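To make the GROUP BY / HAVING idea concrete, here is a plain-JavaScript sketch of the same logic over the simplified car data (illustrative only; the real filtering happens in SQL, and field names are taken from the example above):

```javascript
// Keep only cars with the requested rarity whose caracteristics
// include every requested id -- the JS equivalent of grouping per car
// and requiring HAVING COUNT(DISTINCT matched id) = wantedIds.length.
function filterCars(cars, rarity, wantedIds) {
  return cars.filter(car =>
    car.rarity === rarity &&
    wantedIds.every(id =>
      car.caracteristic.some(c => c.id_caracterstic === id)
    )
  );
}
```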
We have a database with a lot of documents, which gets bigger as time goes on. Right now, query time isn't a problem since the data is only ~1 year old or so. But the bigger this gets, the longer queries will take if we query everything.
Our idea was to take every nth document: the more documents there are, the more you leave out, but you still get a good picture of the data over time. However, this is hard to do in Mongo and doesn't seem to work at all, since it still traverses all documents.
Is there a way to set a fixed query time, no matter how many documents there are, or at least to reduce it? It doesn't matter if we lose data overall, as long as we get documents from every time range.
I don't know exactly what your data looks like, but here is an example of what I mean. Let's assume this is the data stored in your database:
/* 1 */
{
"_id" : ObjectId("59e272e74d8a2fe38b86187d"),
"name" : "data1",
"date" : ISODate("2017-11-07T00:00:00.000Z"),
"number" : 15
}
/* 2 */
{
"_id" : ObjectId("59e272e74d8a2fe38b86187f"),
"name" : "data2",
"date" : ISODate("2017-11-06T00:00:00.000Z"),
"number" : 19
}
/* 3 */
{
"_id" : ObjectId("59e272e74d8a2fe38b861881"),
"name" : "data3",
"date" : ISODate("2017-10-06T00:00:00.000Z"),
"number" : 20
}
/* 4 */
{
"_id" : ObjectId("59e272e74d8a2fe38b861883"),
"name" : "data4",
"date" : ISODate("2017-10-05T00:00:00.000Z"),
"number" : 65
}
I understand you want to compare some values across months or even years, so you could do the following:
db.getCollection('test').aggregate([
{
$match: {
// query on the fields with index
date: {$gte: ISODate("2017-10-05T00:00:00.000Z"),
$lte: ISODate("2017-11-07T00:00:00.000Z")}
}
},
{
// retrieve the month from each document
$project: {
_id: 1,
name: 1,
date: 1,
number: 1,
month: {$month: "$date"}
}
},
{
// group them by month and perform some accumulator operation
$group: {
_id: "$month",
name: {$addToSet: "$name"},
dateFrom: {$min: "$date"},
dateTo: {$max: "$date"},
number: {$sum: "$number"}
}
}
])
I would suggest you store pre-aggregated data. That way, instead of searching through, say, 30 documents per month, you'd only need to search one per month. You'd only have to aggregate the complete data once; with the pre-aggregated results stored, you then only need to run the aggregation for the new data coming in.
Is that maybe something you are looking for?
Also, if the fields you query on have indexes, that helps as well; otherwise MongoDB has to scan every document in the collection.
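For illustration, the $group stage above boils down to this kind of computation (a plain-JavaScript sketch of summing number per month; field names mirror the sample documents, and the pipeline does this server-side):

```javascript
// Sum "number" per calendar month, like the $group stage with
// _id: "$month" and number: {$sum: "$number"} shown above.
function sumByMonth(docs) {
  const totals = {};
  for (const doc of docs) {
    // getUTCMonth() is 0-based; +1 matches MongoDB's $month (1-12)
    const month = doc.date.getUTCMonth() + 1;
    totals[month] = (totals[month] || 0) + doc.number;
  }
  return totals;
}
```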
I have an array of objects, and I want to query in a MongoDB collection for documents that have elements that match any objects in my array of objects.
For example:
var objects = ["52d58496e0dca1c710d9bfdd", "52d58da5e0dca1c710d9bfde", "52d91cd69188818e3964917b"];
db.scook.recipes.find({products: { $in: objects }})
However, I want to know if I can sort the results by the number of matches in MongoDB.
For example, at the top would be the "recipe" that matches all three elements: ["52d58496e0dca1c710d9bfdd", "52d58da5e0dca1c710d9bfde", "52d91cd69188818e3964917b"].
The second would match two of them, e.g. ["52d58496e0dca1c710d9bfdd", "52d58da5e0dca1c710d9bfde"], and the third only one, e.g. ["52d58496e0dca1c710d9bfdd"].
It would also be great to get the number of matches for each recipe.
By using the aggregation framework, I think you should be able to get what you need with the following MongoDB query. However, if you're using Mongoose, you'll have to convert this to a Mongoose query. I'm not certain this will work exactly as is, so you may need to play with it a little to make it right. The per-product match flag below uses a $cond on the $in aggregation operator (available since MongoDB 3.4); on older versions you'll need map-reduce or a client-side solution.
db.recipes.aggregate([
    // look for recipes containing at least one of the products
    { $match : { products : { $in : objects }}},
    // break the documents apart by the products subdocuments
    { $unwind : "$products" },
    // flag each unwound product: 1 if it is in the list, 0 otherwise
    { $project : {
        desiredField1 : 1,
        desiredField2 : 1,
        products : 1,
        productMatch : { $cond : [ { $in : [ "$products", objects ] }, 1, 0 ] }
    }},
    // group the unwound documents back together by _id
    { $group : {
        _id : "$_id",
        products : { $push : "$products" },
        // count the matched products
        numMatches : { $sum : "$productMatch" },
        // increment by 1 for each product
        numProducts : { $sum : 1 }
    }},
    // sort in descending order by numMatches
    { $sort : { numMatches : -1 }}
])
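As a sanity check of what the pipeline should produce, here is a plain-JavaScript sketch of the same computation (field names mirror the question; this is for illustration only, not a replacement for the server-side query):

```javascript
// For each recipe, count how many of its products appear in `objects`,
// then sort by that count in descending order -- mirroring the
// $project/$group/$sort stages of the pipeline above.
function rankByMatches(recipes, objects) {
  const wanted = new Set(objects);
  return recipes
    .map(recipe => ({
      _id: recipe._id,
      numMatches: recipe.products.filter(p => wanted.has(p)).length,
      numProducts: recipe.products.length,
    }))
    .sort((a, b) => b.numMatches - a.numMatches);
}
```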