Solr sort by key value - solr

I want to sort the results from Solr by key value.
Here is my scenario:
We have n products and m product groups.
One product can belong to m product groups, and one product group can contain n products.
So we have an n:m relation.
Standard: a product has a sort index of 0. Per product group, there can be 3 products with a sort index of 1, 2 or 3. The product with 3 should be the first result, 2 the second, 1 the third, and then the rest.
Warning: one product can have a sort index > 0 in multiple product groups.
Question: How should I index this information? And how should I do the sort in the query to Solr?
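To make the intended ordering concrete, here is a minimal sketch in plain Python (it does not call Solr). The product ids, group ids and the `sort_index` mapping are made up for illustration; a product's index is assumed to default to 0 in any group where it has no explicit value.

```python
# Hypothetical data: each product lists its groups and, optionally,
# a per-group sort index (a missing entry means 0).
products = [
    {"id": "p1", "groups": ["g1", "g2"], "sort_index": {"g1": 2}},
    {"id": "p2", "groups": ["g1"], "sort_index": {}},
    {"id": "p3", "groups": ["g1", "g2"], "sort_index": {"g1": 3, "g2": 1}},
    {"id": "p4", "groups": ["g1"], "sort_index": {"g1": 1}},
    {"id": "p5", "groups": ["g2"], "sort_index": {}},
]


def products_in_group(group_id):
    """Return ids of the products in a group, highest sort index first."""
    members = [p for p in products if group_id in p["groups"]]
    # sorted() is stable, so products with index 0 keep their input order.
    ordered = sorted(
        members,
        key=lambda p: p["sort_index"].get(group_id, 0),
        reverse=True,
    )
    return [p["id"] for p in ordered]


print(products_in_group("g1"))
print(products_in_group("g2"))
```

The same product (p3 here) ranks differently depending on the group being queried, which is why a single sort field per product would not be enough. One way this might map onto Solr is a separate numeric sort field per group, sorted descending at query time; that mapping is an assumption, not something the question confirms.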

Related

Change formula array for every X number of rows in Excel

I have a large number of rows of data in Excel where I need to change the row range of the formula every 3 rows, but I can't figure out how to adjust the formula without an error.
How do I add a formula like this to the formula below?
=INT(((ROW(a1)-1)/11))*1+1
This is the formula I have been using, but I need to change it for every 3 rows.
=IF(COUNTIF($N$4:$N$6, ""), "",MAX($N$4:$N$6))
=IF(COUNTIF($N$7:$N$9, ""), "",MAX($N$7:$N$9))
And so on
Example
I have 3 approvers per product. If an approver has approved the product, the Dates column (column 3) holds that approval date; if no approval has been made, the cell is blank. Outcome (column 4) is what I want to compute: once all 3 approvers have approved, it should be the newest date of the 3 rows. If any approver has not approved, I'd like column 4 to be blank.
Product  Approvers  Dates       Outcome
A        1          04.01.2016  04.01.2016
A        2          17.12.2015  04.01.2016
A        3          21.12.2015  04.01.2016
B        1          11.04.2017  11.04.2017
B        2          30.01.2017  11.04.2017
B        3          04.04.2017  11.04.2017
C        1
C        2          13.10.2016
C        3          14.02.2017
D        1          01.03.2022  01.03.2022
D        2          02.12.2019  01.03.2022
D        3          30.01.2020  01.03.2022
Two options:
To answer the original question: to make your formula change every n rows, use -MOD(ROW()+c, n) as the row offset, where c is a constant that lines the blocks up (if your data starts on row 2, then c would be 1).
Your formula for row 2 would be:
=IF(COUNTIF(OFFSET($N2,-MOD(ROW($N2)+1,3),0,3),""),"",MAX(OFFSET($N2,-MOD(ROW($N2)+1,3),0,3)))
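The offset arithmetic in that formula can be checked outside Excel. The sketch below is plain Python, with `block_start` as a made-up helper name; it reproduces -MOD(ROW()+c, n) and shows which 3-row range each row would end up looking at.

```python
def block_start(row, n=3, c=1):
    """First row of the n-row block that contains `row`.

    Mirrors OFFSET(cell, -MOD(ROW(cell) + c, n), 0, n): the MOD result
    is how many rows to step back to reach the start of the block.
    """
    return row - (row + c) % n


for row in range(2, 11):
    start = block_start(row)
    print(f"row {row}: N{start}:N{start + 2}")
```

Rows 2 to 4 map to N2:N4, rows 5 to 7 to N5:N7, and so on. For data starting on row 4, as in the N4:N6 ranges of the question, c = 2 lines the blocks up the same way.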
Another option, not as direct an answer to the question but potentially useful if the number of rows per product changes in future from 3 to something else, would be (assuming products are in column L and dates in column N):
=IF(COUNTIFS(L:L, L2, N:N, ""), "", MAX(IF(L:L=L2, N:N)))
Press Ctrl+Shift+Enter after typing that in, because it's an array formula.
The advantage of this approach is that it looks at all rows where the product column is the same (I'm assuming unique products), so no need to limit it to 3 rows per product or have those 3 rows next to each other.
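The per-product logic can be sketched in Python to show what the formula is meant to compute. The rows are taken from the example table (products A and C only), and `outcome` is a made-up helper name.

```python
from datetime import datetime

# (product, approval date or None) rows from the example table.
rows = [
    ("A", "04.01.2016"), ("A", "17.12.2015"), ("A", "21.12.2015"),
    ("C", None), ("C", "13.10.2016"), ("C", "14.02.2017"),
]


def outcome(product):
    """Newest approval date for a product, or "" if any approval is missing."""
    dates = [d for p, d in rows if p == product]
    if any(d is None for d in dates):
        return ""
    newest = max(datetime.strptime(d, "%d.%m.%Y") for d in dates)
    return newest.strftime("%d.%m.%Y")


print(outcome("A"))  # all three approvers have approved
print(outcome("C"))  # one approval is missing
```

Because the rows are matched by product rather than by position, the result does not depend on how many rows a product has or where they sit in the sheet.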

How to model arbitrarily ordering items in database?

I've taken on a new feature: letting users re-order some items with a drag-and-drop UI and saving each user's preference to the database. What's the best way to do so?
After reading some questions on StackOverflow, I found this solution.
Solution 1: Use decimal numbers to indicate order
For example,
id item order
1 a 1
2 b 2
3 c 3
4 d 4
If I insert item 4 between item 1 and 2, the order becomes,
id item order
1 a 1
4 d 1.5
2 b 2
3 c 3
In this way, every new order = (order[i-1] + order[i+1]) / 2
If I need to save the preference for every user, then I need another relationship table like this,
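A minimal Python sketch of the midpoint rule, using the item names from the table above (`order_between` is a made-up helper name):

```python
def order_between(prev_order, next_order):
    """Midpoint of the two neighbouring order values."""
    return (prev_order + next_order) / 2


orders = {"a": 1, "b": 2, "c": 3, "d": 4}

# Move d between a and b.
orders["d"] = order_between(orders["a"], orders["b"])

ranked = sorted(orders, key=orders.get)
print(orders["d"], ranked)
```

Repeatedly inserting into the same gap halves it each time, so with floating-point values the gap eventually becomes too small to split and the list has to be renumbered.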
user_id item_id order
1 1 1
1 2 2
1 3 3
1 4 1.5
I need num_of_users * num_of_items records to save this preference.
However, there's another solution I can think of.
Solution 2: Save the order preference in a column in the User table
This is straightforward: add a column to the User table to record the order. Each value would be parsed as an array of item_ids, ranked by their index in the array.
user_id item_order
1 [1,4,2,3]
2 [1,2,3,4]
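With this layout, a drag-and-drop move is a list operation on one user's value. A minimal Python sketch, where `move_item` and the in-memory dict standing in for the User table are assumptions for illustration:

```python
def move_item(item_order, item_id, new_index):
    """Return a new list with item_id moved to new_index."""
    reordered = [i for i in item_order if i != item_id]
    reordered.insert(new_index, item_id)
    return reordered


# Stand-in for the item_order column, keyed by user_id.
user_item_order = {1: [1, 2, 3, 4], 2: [1, 2, 3, 4]}

# User 1 drags item 4 to the second position.
user_item_order[1] = move_item(user_item_order[1], item_id=4, new_index=1)
print(user_item_order)
```

Each reorder rewrites the whole value for that user, and the database cannot easily enforce that every id in the list refers to an existing item.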
Are there any limitations to this solution? Or are there any other ways to solve this problem?
Usually, an explicit ordering deals with the presentation or some specific processing of data. Hence, it's a good idea to separate entities from their presentation/processing. For example:
users
-----
user_id (PK)
user_login
...
user_lists
----------
list_id, user_id (PK)
item_index
item_index can be a simple integer value:
ordered continuously (1, 2, ... N): DELETE/INSERT of the whole list is normally required to change the order
ordered discretely with some seed (10, 20, ... N): you can insert new items without reordering the whole list
Another reason to separate entity data and lists: reordering a list should be done in a transaction, which may lead to row/table locks. With separate tables, only data in the list table is affected.
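The seeded variant (10, 20, ...) can be sketched in Python as follows. The gap size of 10 and the helper names are assumptions; the point is that an insert only touches one row until a gap is used up.

```python
GAP = 10  # assumed spacing between neighbouring positions


def index_between(prev_index, next_index):
    """Return a free integer between two positions, or None if there is no room."""
    if next_index - prev_index < 2:
        return None  # gap used up: the list has to be renumbered
    return (prev_index + next_index) // 2


def renumber(item_ids):
    """Assign fresh positions GAP, 2*GAP, ... in the current order."""
    return {item_id: (pos + 1) * GAP for pos, item_id in enumerate(item_ids)}


positions = renumber(["a", "b", "c"])  # a=10, b=20, c=30
positions["d"] = index_between(positions["a"], positions["b"])
print(sorted(positions, key=positions.get))
```

When `index_between` returns None, calling `renumber` on the current order restores the gaps; that is the only case where every row of the list has to be rewritten.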

MongoDB Compound Indexes - Does the sort order matter?

I've dived recently into mongodb for a project of mine.
I've been reading up on indexes. For a small collection I know it wouldn't matter much, but as it grows there are going to be performance issues without the right indexes and queries.
Let's say I have a collection like so:
{user_id:1,slug:'one-slug'}
{user_id:1,slug:'another-slug'}
{user_id:2,slug:'one-slug'}
{user_id:3,slug:'just-a-slug'}
And I have to search my collection where
user id == 1 and slug == 'one-slug'
In this collection, slugs will be unique to user ids.
That is, user id 1 can have only one slug of the value 'one-slug'.
I understand that user_id should be given priority due to its high cardinality, but what about slug, since it's unique as well most of the time? I also can't wrap my head around ascending and descending indexes, how they are going to affect performance in this case, or the right order I should be using in this collection.
I've read a bit but i can't wrap my head around it, particularly for my scenario. Would be awesome to hear from others.
You can think of MongoDB single-field index as an array, with pointers to document locations. For example, if you have a collection with (note that the sequence is deliberately out-of-order):
[collection]
1: {a:3, b:2}
2: {a:1, b:2}
3: {a:2, b:1}
4: {a:1, b:1}
5: {a:2, b:2}
Single-field index
Now if you do:
db.collection.createIndex({a:1})
The index approximately looks like:
[index a:1]
1: {a:1} --> 2, 4
2: {a:2} --> 3, 5
3: {a:3} --> 1
Note three important things:
It's sorted by a ascending
Each entry points to the location where the relevant documents reside
The index only records the values of the a field. The b field does not exist in the index at all
So if you do a query like:
db.collection.find().sort({a:1})
All it has to do is to walk the index from top to bottom, fetching and outputting the document pointed to by the entries. Notice that you can also walk the index from the bottom, e.g.:
db.collection.find().sort({a:-1})
and the only difference is you walk the index in reverse.
Because b is not in the index at all, you cannot use the index when querying anything about b.
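The single-field index above can be mimicked with a small Python structure. This is a toy model of the idea, not how MongoDB stores indexes internally; `build_index` is a made-up helper name.

```python
from collections import defaultdict

# The example collection, keyed by document location.
collection = {
    1: {"a": 3, "b": 2},
    2: {"a": 1, "b": 2},
    3: {"a": 2, "b": 1},
    4: {"a": 1, "b": 1},
    5: {"a": 2, "b": 2},
}


def build_index(field):
    """Map each value of `field` to the locations holding it, sorted by value."""
    entries = defaultdict(list)
    for location, doc in collection.items():
        entries[doc[field]].append(location)
    return sorted(entries.items())


index_a = build_index("a")
print(index_a)

# sort({a: 1}): walk the index from top to bottom.
ascending = [loc for _, locs in index_a for loc in locs]
# sort({a: -1}): walk the same index from bottom to top.
descending = [loc for _, locs in reversed(index_a) for loc in locs]
print(ascending, descending)
```

Nothing in `index_a` mentions b, which is the reason this index cannot help a query or sort on b.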
Compound index
In a compound index e.g.:
db.collection.createIndex({a:1, b:1})
It means that you want to sort by a first, then sort by b. The index would look like:
[index a:1, b:1]
1: {a:1, b:1} --> 4
2: {a:1, b:2} --> 2
3: {a:2, b:1} --> 3
4: {a:2, b:2} --> 5
5: {a:3, b:2} --> 1
Note that:
The index is sorted by a first
Within each a you have a sorted b
You have 5 index entries vs. only three in the previous single-field example
Using this index, you can do a query like:
db.collection.find({a:2}).sort({b:1})
It can easily find where a:2 then walk the index forward. Given that index, you cannot do:
db.collection.find().sort({b:1})
db.collection.find({b:1})
In both queries you can't easily find b since it's spread all over the index (i.e. not in contiguous entries). However you can do:
db.collection.find({a:2}).sort({b:-1})
since you can essentially find where the a:2 are, and walk the b entries backward.
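The same toy model extends to the compound index. In the Python sketch below the index is a sorted list of (a, b, location) tuples, and a binary search finds the contiguous a:2 block; the helper name and the assumption of integer a values are mine.

```python
from bisect import bisect_left

# Entries of the {a: 1, b: 1} index as (a, b, location), kept sorted.
index_ab = sorted([(3, 2, 1), (1, 2, 2), (2, 1, 3), (1, 1, 4), (2, 2, 5)])


def find_a_sorted_by_b(a_value, descending=False):
    """Locations with a == a_value, ordered by b, using only the index."""
    # Integer a values are assumed, so (a_value + 1,) marks the block's end.
    lo = bisect_left(index_ab, (a_value,))
    hi = bisect_left(index_ab, (a_value + 1,))
    block = index_ab[lo:hi]
    if descending:
        block = block[::-1]  # walk the same block backward
    return [location for _, _, location in block]


print(find_a_sorted_by_b(2))                   # find({a:2}).sort({b:1})
print(find_a_sorted_by_b(2, descending=True))  # find({a:2}).sort({b:-1})
```

A lookup on b alone has no such contiguous block to search for, so in this model it would have to scan every entry.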
Edit: clarification of #marcospgp's question in the comment:
The possibility of using the index {a:1, b:1} to satisfy find({a:2}).sort({b:-1}) actually makes sense if you see it from a sorted table point of view. For example, the index {a:1, b:1} can be thought of as:
a | b
--|--
1 | 1
1 | 2
2 | 1
2 | 2
2 | 3
3 | 1
3 | 2
find({a:2}).sort({b:1})
The index {a:1, b:1} means sort by a, then within each a, sort the b values. If you then do a find({a:2}).sort({b:1}), the index knows where all the a=2 are. Within this block of a=2, the b would be sorted in ascending order (according to the index spec), so that query find({a:2}).sort({b:1}) can be satisfied by:
a | b
--|--
1 | 1
1 | 2
2 | 1 <-- walk this block forward to satisfy
2 | 2 <-- find({a:2}).sort({b:1})
2 | 3 <--
3 | 1
3 | 2
find({a:2}).sort({b:-1})
Since the index can be walked forward or backwards, a similar procedure was followed, with a small twist at the end:
a | b
--|--
1 | 1
1 | 2
2 | 1 <-- walk this block backward to satisfy
2 | 2 <-- find({a:2}).sort({b:-1})
2 | 3 <--
3 | 1
3 | 2
The fact that the index can be walked forward or backward is the key point that enables the query find({a:2}).sort({b:-1}) to be able to use the index {a:1, b:1}.
Query planner explain
You can see what the query planner plans by using db.collection.explain().find(....). Basically if you see a stage of COLLSCAN, no index was used or can be used for the query. See explain results for details on the command's output.
[Cannot comment due to a lack of reputation]
"Index direction only matters when you're sorting."
Not entirely accurate: some queries can be faster with an index in a particular direction, even if no ordering is required in the query itself (sorting only applies to the results). For example, with queries on date criteria, searching for users who subscribed yesterday will be faster with a descending index than with an ascending index or no index.
"difference between {user_id:1,slug:1} and {slug:1,user_id:1}"
Mongo filters on the first field in the index, then on the second field among the entries matching the first (and so on). The most restrictive fields should come first to really improve the query.

SQL Server : how to apply sorting by a field that consists of multiple values divided in two logical groups

The problem is the following:
I have to retrieve and sort records which are logically separated into two groups: Active (2,1) and Inactive (0).
The Active group consists of two values and the Inactive group of one value, and sorting has to be applied to the logical groups, not to the field values.
E.g.
Product       Product_Description             Status  Priority
"Soap"        "Nice soap"                     2       A
"Sponge"      "Hard sponge"                   1       B
"Water"       "It comes there too"            0       A
"Wind"        "I don't know how it got here"  0       B
"Toothbrush"  "It's more logical"             2       B
So the query should order records by Status and Priority. But the Status column consists of 3 values separated logically in two groups (2,1) and (0).
Query should return:
"Soap"        2  A
"Toothbrush"  2  B
"Sponge"      1  B
"Water"       0  A
"Wind"        0  B
I cannot change table structure
The only idea which came to mind is by using Union all and dividing the query into two parts. But maybe there is a nicer way.
Thank you.
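If you want to group 2 and 1 together, separate from 0, you can sort on the sign of the status. Use desc so that the active group (sign 1) comes before the inactive group (sign 0), as in your expected output:
order by sign(status) desc, priority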
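The grouping idea can be checked in plain Python on the sample rows, with the active group placed first as in the expected output. `sign` here is a small helper standing in for SQL's SIGN(), and the description column is left out.

```python
def sign(n):
    """Return -1, 0 or 1, like SQL's SIGN()."""
    return (n > 0) - (n < 0)


rows = [
    ("Soap", 2, "A"),
    ("Sponge", 1, "B"),
    ("Water", 0, "A"),
    ("Wind", 0, "B"),
    ("Toothbrush", 2, "B"),
]

# Active group (sign 1) first, then priority within each group.
ordered = sorted(rows, key=lambda r: (-sign(r[1]), r[2]))
for product, status, priority in ordered:
    print(product, status, priority)
```

Sponge and Toothbrush tie on both keys (active, priority B), so their relative order is not determined by this sort alone; adding the status itself as a third, descending key would put Toothbrush first as in the question's expected result.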

Solr: Fetching results with a minimum from each category

I am using solr 4.4.0. The search is performed on products, each of which has a category field. I want to retrieve top n products. But, if some category has less than m products among the top n, then I want to retrieve more products only for those categories.
E.g. I have 4 categories a, b, c, d, with n=20 and m=5. Now let's say the top 20 (=n) have the following category distribution: (a:6, b:4, c:6, d:4). Categories b and d have fewer than m (=5) products, so I would like to fetch one more product (with the next highest score) for both of these categories.
Is there a way I can do this using Solr?
Did you try to solve this with FieldCollapsing?
You use group.field=category, and group.limit lets you set the size of each group. Then you need to be a bit careful about how the groups are sorted; I think it was by the first doc in each group...
But I guess you can achieve what you are looking for fairly easily.
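As a rough illustration of what per-group limiting does, here is a small Python simulation. It does not call Solr; the documents, scores and the `collapse` helper are made up, and only the parameter names group.field and group.limit come from the answer above.

```python
from collections import defaultdict


def collapse(docs, group_field, group_limit):
    """Group docs by a field and keep the top `group_limit` ids per group by score."""
    groups = defaultdict(list)
    for doc in sorted(docs, key=lambda d: d["score"], reverse=True):
        if len(groups[doc[group_field]]) < group_limit:
            groups[doc[group_field]].append(doc["id"])
    return dict(groups)


# Made-up documents and scores.
docs = [
    {"id": 1, "category": "a", "score": 9.0},
    {"id": 2, "category": "a", "score": 8.0},
    {"id": 3, "category": "b", "score": 7.5},
    {"id": 4, "category": "a", "score": 7.0},
    {"id": 5, "category": "b", "score": 1.0},
]
print(collapse(docs, "category", group_limit=2))
```

Category b still gets two documents even though its second one scores far below everything in category a, which is the "minimum per category" behaviour the question is after; combining that with an overall top n would need an extra step on the client side.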
