Please see my sample Solr documents below.
{
"title": "Apple"
},
{
"title": "Banana",
"popularity": 2
},
{
"title": "Mango",
"popularity": 3
},
{
"title": "Lemon",
"popularity": 1
}
By default the query is "title":* so all of these Solr documents are returned, sorted by title in ascending order. The result looks like this:
Apple
Banana
Lemon
Mango
Now I want to add another sort, which is a bit tricky, at least for me, to implement :(. I want to sort by popularity in descending order, but only for documents whose popularity is 3 or 2, and then by title in ascending order. The result should look like this:
Mango
Banana
Apple
Lemon
The question is what would be the query?
Thanks
You can sort it as follows:
sort=map(popularity,2,3, popularity,0) desc, title asc
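To sanity-check what that sort does, here is a minimal Python sketch that mimics it outside Solr. It assumes a missing popularity counts as 0, and the effective_popularity helper is a made-up name for illustration, not part of Solr.

```python
# Sample documents from the question; Apple has no popularity field.
docs = [
    {"title": "Apple"},
    {"title": "Banana", "popularity": 2},
    {"title": "Mango", "popularity": 3},
    {"title": "Lemon", "popularity": 1},
]


def effective_popularity(doc, low=2, high=3, default=0):
    """Mimic map(popularity,2,3,popularity,0): keep values in [low, high], else default."""
    value = doc.get("popularity", 0)  # assumption: a missing field counts as 0
    return value if low <= value <= high else default


# Primary key: mapped popularity, descending. Secondary key: title, ascending.
ordered = sorted(docs, key=lambda d: (-effective_popularity(d), d["title"]))
titles = [d["title"] for d in ordered]
print(titles)  # ['Mango', 'Banana', 'Apple', 'Lemon']
```

Lemon's popularity of 1 falls outside the 2 to 3 range, so it is treated like Apple and the two are ordered by title.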
Our data structure is similar to the HotelId 1 example at https://learn.microsoft.com/en-us/azure/search/search-howto-complex-data-types
Our requirement is as follows:
Input: City = New York, StateProvince = NY, BaseRate = $100
Select fields: HotelId, HotelName, Description, Tags, Address, Rooms
Filter: Only rooms where BaseRate is less than or equal to the input rate, and where the Address City and StateProvince match the input values. In this example, it should select only the first room from Rooms, not all Rooms.
Desired output:
{
"HotelId": "1",
"HotelName": "Secret Point Motel",
"Description": "Ideally located on the main commercial artery of the city in the heart of New York.",
"Tags": ["Free wifi", "on-site parking", "indoor pool", "continental breakfast"],
"Address": {
"StreetAddress": "677 5th Ave",
"City": "New York",
"StateProvince": "NY"
},
"Rooms": [
{
"Description": "Budget Room, 1 Queen Bed (Cityside)",
"RoomNumber": 1105,
"BaseRate": 96.99
}
]
}
Any help or direction on how to write a query for this or any pointers would be welcome.
The records in the hotels sample index are hotels, not rooms. Think of it as an index with Documents and Paragraphs. You may search for a Document (hotel) which has something within a Paragraph (room), but the result you get is always a list of Documents. As far as I know, there is no way to remove individual complex-type entries from a record in a response.
By the way, the query that does what you ask (except for filtering out rooms) is below. It uses le rather than lt because you asked for rates less than or equal to the input rate:
search=Address/City:"New York" AND Address/StateProvince:"NY"&$select=HotelId,HotelName,Description,Tags,Address,Rooms&$count=true&searchMode=all&queryType=full&$filter=Rooms/any(room: room/BaseRate le 100.0)
Possible workarounds:
Design your index with rooms as records
Filter out rooms above the selected base rate in your frontend application.
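The second workaround can be sketched in a few lines of Python. This is only an illustration of client-side filtering: the keep_affordable_rooms helper and the second, pricier room are made up for the example, and a real frontend would apply the same idea to each hotel in the search response.

```python
def keep_affordable_rooms(hotel, max_rate):
    """Return a copy of the hotel that keeps only rooms with BaseRate <= max_rate."""
    rooms = [room for room in hotel.get("Rooms", []) if room["BaseRate"] <= max_rate]
    # Build a new dict so the original search result is left untouched.
    return {**hotel, "Rooms": rooms}


hotel = {
    "HotelId": "1",
    "HotelName": "Secret Point Motel",
    "Rooms": [
        {"Description": "Budget Room, 1 Queen Bed (Cityside)", "BaseRate": 96.99},
        # Hypothetical second room, added only to show the filter at work.
        {"Description": "Pricier room", "BaseRate": 129.99},
    ],
}

result = keep_affordable_rooms(hotel, 100.0)
print(result["Rooms"])
```

The $filter in the query still matters: it makes sure only hotels with at least one qualifying room are returned, so the client-side step never ends up with an empty Rooms list.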
Here is a sample document -
{
"id": "AIRPORT-LAS",
"RANK": 80.0,
"TYPE": "AIRPORT",
"COUNTRY_NAME": "United States",
"COUNTRY_CODE": "US",
"ISO_COUNTRY_CODE": "US",
"LATITUDE": "36.08047103880001",
"LONGITUDE": "-115.14331054699983",
"LATLON": [
"36.08047103880001,-115.14331054699983"
],
"CITY_CODE": "LAS",
"CITY_NAME": "Las Vegas",
"PROVINCE_CODE": "NV",
"PROVINCE_NAME": "NEVADA",
"AIRPORT_NAME": "McCarran Intl Airport",
"AIRPORT_CODE": "LAS"
}
Now, depending on the geographic location the customer is searching from, I'll have several RANK values for each of the above documents, one per State and Country combination.
For example -
For AIRPORT-LAS, I'll have the following -
USA - CA - 100
USA - NJ - 80
USA - NY - 75
.... rest of combinations
I am trying to understand the following -
What is the best way to index this new set of ranks to the existing documents? As a separate collection? Or as a nested data set?
How can I boost my results using the new set of ranks at search time? [so basically, if the user is searching from USA - CA, I should be using RANK=100, to boost my search results. I would know the State and Country at search time.]
Thank You!
If you want to integrate numeric document values directly into the score, use a boost function at query time. You can combine multiple document values here, but take care to choose an appropriate boost factor.
bf=mul(RANK, 2)
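One possible way to tie this to the per-region ranks from the question is sketched below in Python. It assumes, purely for illustration, that each regional rank is indexed as its own numeric field named like RANK_US_CA (for example via a dynamic field); that schema is an assumption, not something the answer prescribes. The sketch only builds request parameters, and it uses Solr's if() and exists() functions to fall back to the global RANK when a document has no regional value. Note that bf is honoured by the dismax and edismax query parsers.

```python
from urllib.parse import urlencode


def build_params(user_query, country, state):
    """Build Solr request parameters that boost by a region-specific rank field."""
    if not (country.isalpha() and state.isalpha()):
        raise ValueError("country and state must be alphabetic codes")
    # Hypothetical field naming scheme: RANK_<COUNTRY>_<STATE>
    rank_field = f"RANK_{country.upper()}_{state.upper()}"
    # Use the regional rank if the document has one, else the global RANK.
    boost = f"if(exists({rank_field}),{rank_field},RANK)"
    return {"q": user_query, "defType": "edismax", "bf": boost}


params = build_params("las vegas", "US", "CA")
query_string = urlencode(params)
print(query_string)
```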
So, I'm designing the model for the documents that I'll insert in my database and I have a question about the design. Is it better to just insert more documents in my collections or fewer nested documents?
Example:
sale:{
store_id : "2",
vendor_id: "2",
points : 100
}
sale:{
store_id : "2",
vendor_id: "2",
points : 100
}
sale:{
store_id : "2",
vendor_id: "2",
points : 100
}
sale:{
store_id : "4",
vendor_id: "3",
points : 100
}
sale:{
store_id : "4",
vendor_id: "1",
points : 100
}
So, in this non-nested example, if I have N sales, I'll have N sale documents in my collection. But if I nest them, my example becomes:
stores: [
  {
    store_id: "2",
    vendors: [
      {
        vendor_id: "2",
        sales: [
          { points: 100 },
          { points: 100 },
          { points: 100 }
        ]
      }
    ]
  },
  {
    store_id: "4",
    vendors: [
      { vendor_id: "3", sales: [ { points: 100 } ] },
      { vendor_id: "1", sales: [ { points: 100 } ] }
    ]
  }
]
In this example, I nest all my sales.
So, my question is: to create reports and analyze data, which one is faster? If I want to see which store sold more for example, will it be faster to analyze nested documents or one line documents?
Thank you in advance.
The answer is pretty simple. If you know there are going to be a lot of sales and their number is not bounded, you have to go for a separate collection for sales. MongoDB is designed to stay fast even with millions of documents in a collection, whereas nesting everything will cause you a lot of problems.
Also, there is a 16 MB document size limit in MongoDB, so after a while a single store document would reach that limit, and things would get pretty ugly.
So it's quite clear that you should go for a separate collection.
You can also read this blog post, which should clear things up for you:
https://www.mongodb.com/blog/post/6-rules-of-thumb-for-mongodb-schema-design-part-1
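To illustrate why the flat layout is convenient for reports, here is a minimal Python sketch that answers the "which store sold more" question over the flat sale documents from the question. It runs in memory only to show the shape of the computation; in MongoDB the same grouping would normally be done server-side, for example with a $group stage in an aggregation pipeline.

```python
from collections import Counter

# Flat sale documents, as in the non-nested example from the question.
sales = [
    {"store_id": "2", "vendor_id": "2", "points": 100},
    {"store_id": "2", "vendor_id": "2", "points": 100},
    {"store_id": "2", "vendor_id": "2", "points": 100},
    {"store_id": "4", "vendor_id": "3", "points": 100},
    {"store_id": "4", "vendor_id": "1", "points": 100},
]

# Sum points per store in a single pass over the sales.
points_by_store = Counter()
for sale in sales:
    points_by_store[sale["store_id"]] += sale["points"]

top_store, top_points = points_by_store.most_common(1)[0]
print(top_store, top_points)  # 2 300
```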
I have a document with a collection of strings representing the number of times that document appears in a region (tags). For example:
[{
"id": "A",
// other properties
"regions": ["3", "3", "3", "2"] // Appears 3 times in region "3" and once in region "2"
},
{
"id": "B",
// other properties
"regions": ["3", "3", "1"] // Appears twice in region "3" and once in region "1"
}]
I tried using a custom scoring profile of type Tag, but I don't see how to give documents with more regions a better score. In other words, I want document A that appears 3 times in region 3 to show before document B that only appears twice in region 3.
FYI, the reason we chose to represent regions this way is because there are way too many regions and not all documents appear in all regions. More details here
Is this doable? This way or another way?
The tag scoring profile only checks for the existence of a tag. If the tag appears multiple times, that has no effect on the score.
I've read your other post here. One solution you could consider (which is not exactly what you want) is to bucket the regions based on count. For example, you'd have one collection for regions where the document shows up fewer than 10 times, one for between 10 and 50 times, and one for between 50 and 100 times (pick the ranges in a way that makes sense for the distribution of region occurrences in your scenario). Your documents would look like this:
{
"id": "A",
"regions10": ["3", "2"], // Appears in region 3 and 2 less than 10 times
"regions50": ["1"] // Appears in region 1 between 10 and 50 times
}
Then, you could use a Weights scoring profile to boost documents that matched in the higher count regions:
"scoringProfiles": [
{
"name": "boostRegions",
"text": {
"weights": {
"regions10": 1,
"regions50": 2,
"regions100": 3
}
}
}
]
This is not a good solution if you need strict ordering based on the region count, if you can't precompute the region counts, or if the entire range of values is large (say 0 to 2^31) while the individual buckets need to be small (you'd end up with too many fields).
The problem you have is a data modeling problem. You're trying to retrieve documents based on a property of the document, namely whether it contains a region in a set of regions, but score/boost the document based on a property of the region, not of the document. You'd need a document in the index for each document-region pair, with a property holding the number of times the given document appeared in that region.
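As a rough illustration of that remodeling, the Python sketch below expands the two sample documents from the question into one record per document-region pair. The field names documentId, region and count, and the composite id format, are assumptions chosen for the example rather than anything the search service requires.

```python
from collections import Counter

documents = [
    {"id": "A", "regions": ["3", "3", "3", "2"]},
    {"id": "B", "regions": ["3", "3", "1"]},
]


def to_pairs(doc):
    """Expand one document into one record per distinct region, with a count."""
    counts = Counter(doc["regions"])
    return [
        {
            "id": f"{doc['id']}-{region}",  # hypothetical composite key
            "documentId": doc["id"],
            "region": region,
            "count": occurrences,
        }
        for region, occurrences in counts.items()
    ]


pairs = [pair for doc in documents for pair in to_pairs(doc)]

# With a numeric count per pair, ordering within a region is a plain sort.
region3 = sorted(
    (p for p in pairs if p["region"] == "3"),
    key=lambda p: p["count"],
    reverse=True,
)
print([p["documentId"] for p in region3])  # ['A', 'B']
```

Once the count is an ordinary numeric field on each pair, it can be used for sorting or as the input to a numeric scoring function instead of relying on tag repetition.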
I'm using Google App Engine, so I'm using a non-relational (NoSQL) database. My question is:
What is the best way to model a ranking of players based on their scores?
For example, my players are:
Player { String name, int score}
I want to know a player's rank (position) and also get the top 10 players, but I'm not sure which approach is best.
Thanks.
If your scores are indexed, it's easy to do a datastore query and get players in sorted order.
So if you want the top 10 players, that's pretty trivial.
Getting the ranking for an arbitrary player is really hard. Hard enough that I'd say, avoid it if you can, and if you can't, find a hack way around it.
For example, if you have 50,000 players, and PlayerX is ranked 12,345, the only way to know that is to query all the players in score order and step through them, keeping count, until you find PlayerX.
One hack might be to store the player ranking in the player entity, and update it with a cron job that runs once every few hours.
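A minimal Python sketch of what such a periodic job might compute is shown below. It works on plain dictionaries rather than real datastore entities, the assign_ranks name is made up, and ties simply receive consecutive ranks in their existing order, which is an assumption you may want to change.

```python
def assign_ranks(players):
    """Sort players by score (highest first) and store a 1-based rank on each."""
    ordered = sorted(players, key=lambda p: p["score"], reverse=True)
    for position, player in enumerate(ordered, start=1):
        player["rank"] = position  # in a real job, written back to the entity
    return ordered


players = [
    {"name": "John", "score": 15},
    {"name": "Swadq", "score": 7},
    {"name": "Jane", "score": 22},
]

ranked = assign_ranks(players)
top_10 = ranked[:10]
print([(p["name"], p["rank"]) for p in top_10])
```

Between runs the stored rank can be stale, which is the trade-off for turning a full scan per lookup into a single read of the player entity.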
There is a built-in solution in Redis:
First add a few members with a score:
redis> ZADD myzset 1 "one"
(integer) 1
redis> ZADD myzset 2 "two"
(integer) 1
redis> ZADD myzset 3 "three"
(integer) 1
Get the rank of "one" (highest score first):
redis> ZREVRANK myzset "one"
(integer) 2
(Index starts at 0)
And if you want the current order:
redis> ZREVRANGE myzset 0 -1
1) "three"
2) "two"
3) "one"
See ZREVRANGE and ZREVRANK in the Redis documentation.
A suitable representation of this in JSON would be:
"players" : [
{
"name" : "John",
"score" : 15
},
{
"name" : "Swadq",
"score" : 7
},
{
"name" : "Jane",
"score" : 22
}
]
For examples of how to sort this:
PHP: How to sort an array of associative arrays by value of a given key in PHP?
JavaScript: How to sort an array of associative arrays by value of a given key in PHP?
JavaScript general sorting: http://www.breakingpar.com/bkp/home.nsf/0/87256B280015193F87256C8D00514FA4
You could set up your index.yaml like so:
- kind: Player
properties:
- name: score
direction: ascending
To get a player's rank you just need to make a pass over the players in score order (while keeping a count) and cache the result to speed up further lookups for that player.