Azure Maps POI search returns at most 100 total results - azure-maps

For example, a search with query = hospital and countrySet = IN returns at most 100 total results, and if you send the same request again the totalResults value changes. This started happening recently; it was working fine a couple of weeks ago.
The result below is definitely incorrect, as totalResults should be in the thousands.
Example query:
https://atlas.microsoft.com/search/poi/json?subscription-key={Your key here}&api-version=1.0&limit=100&ofs=0&countrySet=IN&query=hospital&maxfuzzylevel=3&minfuzzylevel=1
Summary:
"summary": {
    "query": "hospital",
    "queryType": "NON_NEAR",
    "queryTime": 374,
    "numResults": 95,
    "offset": 0,
    "totalResults": 95,
    "fuzzyLevel": 1
},
Expected totalResults to be in the thousands across India.
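For reference, here is a minimal Python sketch (using the requests library) of how this endpoint can be paged with the same limit/ofs parameters; the subscription key is a placeholder, and the early-exit condition simply relies on the summary fields shown above:

import requests

URL = "https://atlas.microsoft.com/search/poi/json"
SUBSCRIPTION_KEY = "{Your key here}"  # placeholder, substitute your own key

def fetch_pois(query, country, page_size=100, max_pages=10):
    """Page through POI results using the limit/ofs parameters from the query above."""
    results = []
    for page in range(max_pages):
        params = {
            "subscription-key": SUBSCRIPTION_KEY,
            "api-version": "1.0",
            "query": query,
            "countrySet": country,
            "limit": page_size,
            "ofs": page * page_size,
        }
        body = requests.get(URL, params=params).json()
        summary = body["summary"]
        results.extend(body.get("results", []))
        # Stop once the reported totalResults has been exhausted.
        if summary["offset"] + summary["numResults"] >= summary["totalResults"]:
            break
    return results

pois = fetch_pois("hospital", "IN")
print(len(pois))  # with the summary above this stops after 95 results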

Related

Azure Search Working with Complex Collections

Our data structure is similar to the HotelId 1 example at https://learn.microsoft.com/en-us/azure/search/search-howto-complex-data-types
Our requirement is as follows:
Input: City = New York, StateProvince = NY, BaseRate = $100
Select fields: HotelId, HotelName, Description, Tags, Address, Rooms
Filter: only rooms where BaseRate is less than or equal to the input rate, and Address City and StateProvince match the input values. In this example, only the first room from Rooms should be selected, not all Rooms.
Desired output:
{
    "HotelId": "1",
    "HotelName": "Secret Point Motel",
    "Description": "Ideally located on the main commercial artery of the city in the heart of New York.",
    "Tags": ["Free wifi", "on-site parking", "indoor pool", "continental breakfast"],
    "Address": {
        "StreetAddress": "677 5th Ave",
        "City": "New York",
        "StateProvince": "NY"
    },
    "Rooms": [
        {
            "Description": "Budget Room, 1 Queen Bed (Cityside)",
            "RoomNumber": 1105,
            "BaseRate": 96.99
        }
    ]
}
Any help or direction on how to write a query for this or any pointers would be welcome.
The records in the hotels sample index are hotels, not rooms. Think of it as an index of Documents and Paragraphs: you may search for a Document (hotel) that has something within a Paragraph (room), but the result you get will always be a list of Documents. As far as I know, there is no way to remove individual complex-type entries from a record in the response.
By the way, the query to do what you ask (except filtering out the rooms) is:
search=Address/City:"New York" AND Address/StateProvince:"NY"&$select=HotelId,HotelName,Description,Tags,Address,Rooms&$count=true&searchMode=all&queryType=full&$filter=Rooms/any(room: room/BaseRate le 100.0)
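If it helps, here is roughly the same query issued through the Search REST API from Python; the service endpoint, index name, API key and API version are placeholders for illustration, not values from the question:

import requests

SERVICE = "https://<your-service>.search.windows.net"  # placeholder
INDEX = "hotels-sample-index"                           # placeholder index name
API_KEY = "<your-query-key>"                            # placeholder

payload = {
    "search": 'Address/City:"New York" AND Address/StateProvince:"NY"',
    "queryType": "full",
    "searchMode": "all",
    "filter": "Rooms/any(room: room/BaseRate le 100.0)",
    "select": "HotelId,HotelName,Description,Tags,Address,Rooms",
    "count": True,
}

resp = requests.post(
    f"{SERVICE}/indexes/{INDEX}/docs/search?api-version=2023-11-01",
    headers={"api-key": API_KEY},
    json=payload,
)
hotels = resp.json()["value"]  # full hotel documents, Rooms collection untrimmed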
Possible workarounds:
Design your index with rooms as records
Filter out rooms above the selected base rate in your frontend application (see the sketch below).
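For the second workaround, a minimal sketch of trimming the Rooms collection client-side after the search results come back (field names follow the hotel sample above; max_rate is just an illustrative parameter):

def trim_rooms(hotels, max_rate):
    """Keep only the rooms at or below max_rate in each returned hotel document."""
    for hotel in hotels:
        hotel["Rooms"] = [
            room for room in hotel.get("Rooms", [])
            if room.get("BaseRate", float("inf")) <= max_rate
        ]
    return hotels

# trim_rooms(hotels, 100.0) keeps only the $96.99 Budget Room in the example above.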

Solr - How to index and boost results using several popularity fields?

Here is a sample document:
{
    "id": "AIRPORT-LAS",
    "RANK": 80.0,
    "TYPE": "AIRPORT",
    "COUNTRY_NAME": "United States",
    "COUNTRY_CODE": "US",
    "ISO_COUNTRY_CODE": "US",
    "LATITUDE": "36.08047103880001",
    "LONGITUDE": "-115.14331054699983",
    "LATLON": [
        "36.08047103880001,-115.14331054699983"
    ],
    "CITY_CODE": "LAS",
    "CITY_NAME": "Las Vegas",
    "PROVINCE_CODE": "NV",
    "PROVINCE_NAME": "NEVADA",
    "AIRPORT_NAME": "McCarran Intl Airport",
    "AIRPORT_CODE": "LAS"
}
Now, based on where (geographically) the customer is searching from, I'll have several RANK values for each of the above documents, one per State and Country combination.
For example -
For AIRPORT-LAS, I'll have the following -
USA - CA - 100
USA - NJ - 80
USA - NY - 75
.... rest of combinations
I am trying to understand the following -
What is the best way to index this new set of ranks to the existing documents? As a separate collection? Or as a nested data set?
How can I boost my results using the new set of ranks at search time? [so basically, if the user is searching from USA - CA, I should be using RANK=100, to boost my search results. I would know the State and Country at search time.]
Thank You!
If you want to integrate numeric document values directly into the score, use a boost function at query time. You can also use multiple document values here, but be careful to choose an adequate boost factor.
bf=mul(RANK, 2)
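One possible way to combine this with per-region ranks (an assumption about schema design, not something stated in the answer) is to index each country/state rank as its own numeric dynamic field, e.g. RANK_US_CA_f, and build the boost function from the searcher's location at query time, falling back to the global RANK where no regional value exists. A rough Python sketch against a hypothetical 'airports' core:

import requests

SOLR = "http://localhost:8983/solr/airports"  # hypothetical core name

# One dynamic float field (*_f in the default schema) per country/state rank.
doc = {
    "id": "AIRPORT-LAS",
    "RANK": 80.0,
    "AIRPORT_NAME": "McCarran Intl Airport",
    "RANK_US_CA_f": 100.0,
    "RANK_US_NJ_f": 80.0,
    "RANK_US_NY_f": 75.0,
}
requests.post(f"{SOLR}/update?commit=true", json=[doc])

# The searcher's country/state is known at query time, so pick the matching field;
# def() falls back to the global RANK when the regional field is missing.
country, state = "US", "CA"
params = {
    "q": "las vegas",
    "defType": "edismax",
    "qf": "AIRPORT_NAME CITY_NAME",
    "bf": f"mul(def(RANK_{country}_{state}_f, RANK), 2)",
}
results = requests.get(f"{SOLR}/select", params=params).json()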

couchdb showing different record count with pagination?

I wanted to find out how many records CouchDB has in a database, so I hit the API below, which returns JSON that includes the record count.
GET http://localhost:20984/mydb/
JSON
{
    "db_name": "mydb",
    "update_seq": "25577-g1AAAAFLeJ4mDJtXMoSQMrqYcpacSjLYwGSDA1ACqhyPljpJ7xKF0CU7gcrXYlX6QGI0vtgpWJ4lT6AKIW4tTYLACqwZ0c",
    "sizes": {
        "file": 20881199,
        "external": 11977342,
        "active": 16542736
    },
    "purge_seq": 0,
    "other": {
        "data_size": 11977342
    },
    "doc_del_count": 0,
    "doc_count": 25569,
    "disk_size": 20881199,
    "disk_format_version": 6,
    "data_size": 16542736,
    "compact_running": false,
    "cluster": {
        "q": 8,
        "n": 1,
        "w": 1,
        "r": 1
    },
    "instance_start_time": "0"
}
Here doc_count is 25569, so I assume the total number of records is 25569.
But when I set documents per page to 100 and start paging through the records, it shows 100 records after the first page, 200 after two pages, and so on; if I keep going this way it ends up showing me more than 35,000 records.
My question is: if the total is 25569 records, how can CouchDB show me more than 35,000 records through pagination?
You're right about doc_count: it reports the number of documents in the specified database.
Presuming you're using Fauxton and switched "Documents per page" to 100 there, I was able to reproduce the described behavior: when I repeatedly pressed the next arrow in a short interval, the displayed data and the numbers on the paginator went out of sync. So this seems to be a bug in Fauxton.
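If you want counting and paging that doesn't depend on Fauxton's paginator, a minimal sketch querying _all_docs directly (database URL taken from the question, page size illustrative):

import requests

BASE = "http://localhost:20984/mydb"

# total_rows from _all_docs should agree with doc_count from the database info above.
total = requests.get(f"{BASE}/_all_docs", params={"limit": 0}).json()["total_rows"]
print(total)

def pages(page_size=100):
    """Key-based paging: fetch one extra row and use its key to start the next page."""
    startkey = None
    while True:
        params = {"limit": page_size + 1, "include_docs": "true"}
        if startkey is not None:
            params["startkey"] = f'"{startkey}"'  # keys are JSON-encoded strings
        rows = requests.get(f"{BASE}/_all_docs", params=params).json()["rows"]
        yield rows[:page_size]
        if len(rows) <= page_size:
            return
        startkey = rows[page_size]["key"]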

Solr schema design: fitting time-series data

I am trying to fit the following data into Solr to support flexible queries and would like your input. I have data about users, say:
contentID (assume uuid),
platform (eg. website, mobile etc),
softwareVersion (eg. sw1.1, sw2.5, ..etc),
regionId (eg. us144, uk123, etc..)
....
and a few more such fields. This data is partially pre-aggregated (via Hadoop jobs), so let's assume that for "contentID = uuid123 and platform = mobile and softwareVersion = sw1.2 and regionId = ANY" I have data in this format:
timestamp    pre-aggregated data [uniques, total]
Jan 15       [12, 4]
Jan 14       [4, 3]
Jan 13       [8, 7]
...          ...
And then I also have less granular data, say "contentID = uuid123 and platform = mobile and softwareVersion = ANY and regionId = ANY" (these values will be larger than in the table above since the granularity is reduced):
timestamp    pre-aggregated data [uniques, total]
Jan 15       [100, 40]
Jan 14       [45, 30]
...          ...
I'll get queries like: for "contentID = uuid123 and platform = mobile", give the sum of 'uniques' for Jan 15 - Jan 13; or for "contentID = uuid123 and platform = mobile and softwareVersion = sw1.2", give the sum of 'total' for Jan 15 - Jan 01.
I was thinking of a simple schema where documents would look like this (first example above):
{
    "contentID": "uuid12349789",
    "platform": "mobile",
    "softwareVersion": "sw1.2",
    "regionId": "ANY",
    "ts": "2017-01-15T01:01:21Z",
    "unique": 12,
    "total": 4
}
Second example from above:
{
    "contentID": "uuid12349789",
    "platform": "mobile",
    "softwareVersion": "ANY",
    "regionId": "ANY",
    "ts": "2017-01-15T01:01:21Z",
    "unique": 100,
    "total": 40
}
Possible optimization:
{
    "contentID": "uuid12349789",
    "platform.mobile.softwareVersion.sw1.2.region.us12": {
        "unique": 12,
        "total": 4
    },
    "platform.mobile.softwareVersion.sw1.2.region.ANY": {
        "unique": 100,
        "total": 40
    },
    "ts": "2017-01-15T01:01:21Z"
}
Challenges: the number of such rows is very large, and it grows exponentially with every new field. For instance, if I go with the schema suggested above, I'll end up storing a new document for each combination of contentID, platform, softwareVersion and regionId; if we throw another field into the document, the number of combinations increases exponentially. I already have more than a billion such combination rows.
I am hoping experts can advise on whether:
multiple such fields can fit in the same document for different 'ts' values so that range queries are still possible on it;
the time range (ts) can be stored in the same document as a list, to reduce the number of rows. I know multivalued fields don't support complex data types, but perhaps something else can be done with the data/schema to reduce query time and the number of rows.
The number of these rows is very large, certainly more than 1 billion if we go with the schema I was suggesting. What schema would you suggest that fits these query requirements?
FYI: all queries will be exact matches on fields (no partial or tokenized matching), so no analysis on the fields is necessary, and almost all queries are range queries.
You are trying to store query-time results for every possible combination of attribute values. That's just too much duplicate data. Instead, store each observation with its attributes as a single data point, just once; then if you have 'n' observations and add an additional attribute, the data grows additively, not exponentially. When you need data for a certain combination of attributes, filter/aggregate at query time.
{
    "contentID": "uuid12349789",
    "ts": "2017-01-15T01:01:21Z",
    "observation": 10001,
    "attr-platform": "mobile",
    "attr-softwareVersion": "sw1.2",
    "attr-regionId": "US"
}
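With that per-observation layout, the sums the question asks for can be computed at query time with Solr's JSON Facet API. A rough sketch (the collection name 'metrics', the use of the {!term} parser for the hyphenated field name, and the reading of 'uniques' as distinct observation values are all assumptions):

import requests

# "contentID = uuid12349789 and platform = mobile, Jan 13 - Jan 15" from the question.
query = {
    "query": "contentID:uuid12349789",
    "filter": [
        "{!term f=attr-platform}mobile",
        "ts:[2017-01-13T00:00:00Z TO 2017-01-15T23:59:59Z]",
    ],
    "limit": 0,
    "facet": {
        "uniques": "unique(observation)",  # distinct observation values in the range
    },
}
resp = requests.post("http://localhost:8983/solr/metrics/query", json=query).json()
print("data points:", resp["facets"]["count"])
print("uniques:", resp["facets"]["uniques"])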

SOLR and complex pricing

We have a property booking site (similar to Airbnb) with a LAMP setup and a relatively simple SOLR index. A user can search by:
Location
Guests
Date IN/OUT
They can also filter results for a specific price range, as well as sort by price.
The issue is that the price per night is never fixed; it depends on several things such as custom prices (for one day or a range of days), price per guest, discounts (e.g. a weekend discount), etc. For example, price_per_night might be $50, but for the 10th and 11th of August the real price per night could be $110, plus a $20 extra-guest fee, minus a $30 discount for that particular weekend. On top of this, commissions are applied to the total amount, so we'd like this to be as accurate as possible.
While we can calculate this on the server side, this of course affects the results returned for a price range as well as the sorting (highest price first vs. lowest).
Could anyone suggest any possible solution?
Below is an example of a select:
"docs": [
{
"id": 1,
"property_type": 1,
"room_type": 1,
"minimum_nights": 2,
"maximum_nights": 120,
"location": "41.3902359,2.1685901",
"price_per_night": 210,
........
"unavailable_days": [
"2016-09-15T00:00:00Z",
"2016-09-16T00:00:00Z",
......
]
}
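Whichever way the index ends up being modelled, the nightly price itself can be precomputed per date before it ever reaches Solr. A small sketch of that calculation using the numbers from the example above (the data structures, the 2016 dates and the per-night reading of the weekend discount are illustrative assumptions):

from datetime import date, timedelta

def nightly_prices(check_in, check_out, base_rate, overrides, extra_guests,
                   per_guest_fee, discounts):
    """Per-night price: custom override or base rate, plus guest fees, minus discounts."""
    prices = {}
    night = check_in
    while night < check_out:
        price = overrides.get(night, base_rate)
        price += extra_guests * per_guest_fee
        price -= discounts.get(night, 0)
        prices[night] = price
        night += timedelta(days=1)
    return prices

# Example from the question: base $50, $110 on Aug 10-11, +$20 extra guest, -$30 weekend discount.
prices = nightly_prices(
    check_in=date(2016, 8, 10),
    check_out=date(2016, 8, 12),
    base_rate=50,
    overrides={date(2016, 8, 10): 110, date(2016, 8, 11): 110},
    extra_guests=1,
    per_guest_fee=20,
    discounts={date(2016, 8, 10): 30, date(2016, 8, 11): 30},
)
total = sum(prices.values())  # 100 + 100 = 200 before commission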
