Solr travel site: date queries

I am looking to implement Solr for a hotel booking site. Search based on location, hotel name, and facilities works very well, and so does faceting. What I have not been able to figure out is how to search for a hotel given check-in and check-out dates.
For example, a user searches for "Hotels in New York" and selects check-in date 10 Feb 2012 and check-out date 12 Feb 2012 from the date selection box.
This is how I have the data:
Hotel_Name  10thFeb2012  11thFEB2012  ........  31DEC2012
Hotel1      2room        3room                  10rooms
Hotel2      1room        4room        ........  12rooms
Now, if the query is for 3 rooms at Hotel2 from 10 Feb 2012 to 11 Feb 2012, it should not match, because only one room is available on 10 Feb.
If the query is for 1 room at Hotel2 from 10 Feb 2012 to 11 Feb 2012, then it should be part of the search results.

Use the ISO 8601 format for your date-times.
Complete date plus hours, minutes, seconds and a decimal fraction of a second
YYYY-MM-DDThh:mm:ss.sTZD (eg 1997-07-16T19:20:30.45Z)
Both your database and Solr can parse date-times from strings that conform to that format.
So:
Store the data in the DB and in Solr with compatible date-time formats. (From memory, Solr requires a Z appended to the date-time; otherwise it is invalid.)
Your search interface must format all dates in that format when querying Solr.
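As a minimal sketch of the formatting step, the helper below turns a Python datetime into the UTC string with a trailing Z. The function name to_solr_datetime is hypothetical, and treating naive datetimes as UTC is an assumption you may need to change.

```python
from datetime import datetime, timedelta, timezone


def to_solr_datetime(dt: datetime) -> str:
    """Return dt as a UTC ISO 8601 string ending in Z."""
    if dt.tzinfo is None:
        # Assumption: a naive datetime is already expressed in UTC.
        dt = dt.replace(tzinfo=timezone.utc)
    # Convert to UTC first so the literal Z suffix is truthful.
    return dt.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


# Check-in picked in the search form, assumed to be midnight UTC.
checkin = to_solr_datetime(datetime(2012, 2, 10))

# A timezone-aware value is shifted to UTC before formatting.
ist = timezone(timedelta(hours=5, minutes=30))
shifted = to_solr_datetime(datetime(2012, 2, 10, 5, 30, tzinfo=ist))
```

Converting to UTC before formatting keeps the database value and the Solr value comparable, whichever timezone the form was filled in.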
Solr can do conditional expressions, facets, range bucket faceting etc with dates.
I would go with the following schema:
hotel_name : string (for faceting)
hotel_name_searchable : text (for searching; this is a copy field, look it up)
room_id : string
start_date : date (when the room becomes available)
end_date : date (if not booked, set it to a far-future date, say 2040)
For each room you are ever tracking, store the date-times between which it is free.
You can search for rooms between the start_date and end_date.
Do faceting on hotel_name so your search for rooms "checkin Date 10thFeb2012 to 11thFeb2012" gets you:
Hotel1:[r1,r2,r3]
Hotel2:[r8]
Hotel3:[r2,r3,r4]
Faceting on hotel_name filters to one hotel, facet.mincount on room_id can return hotels having the required number of rooms.
A little warning: I may be a bit rusty on faceting, as I used to do a lot of processing on Solr results itself.
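One way to read the approach above, as a rough sketch: each Solr document is one availability window for one room, the filters keep only windows that cover the whole stay, and the hotel_name facet counts the matching rooms per hotel. The function name availability_params is hypothetical, and applying facet.mincount to the hotel_name facet is an assumption about how the answer is meant to be used. The parameter names themselves (fq, facet, facet.field, facet.mincount, rows) are standard Solr request parameters.

```python
def availability_params(user_query, checkin, checkout, rooms_needed):
    """Build Solr request parameters for an availability search.

    checkin and checkout are Solr date strings such as 2012-02-10T00:00:00Z.
    """
    return {
        "q": user_query,
        # A window covers the stay if it starts on or before check-in
        # and ends on or after check-out.
        "fq": [
            f"start_date:[* TO {checkin}]",
            f"end_date:[{checkout} TO *]",
        ],
        "rows": 0,  # only the facet counts are needed here
        "facet": "true",
        "facet.field": "hotel_name",
        # Hide hotels with fewer matching room documents than requested.
        "facet.mincount": rooms_needed,
    }


params = availability_params(
    "hotels in New York",
    "2012-02-10T00:00:00Z",
    "2012-02-12T00:00:00Z",
    3,
)
```

This only counts correctly if a room never has two overlapping availability windows, since each matching document adds one to its hotel's count.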

Related

Solr - facet by day of week

On Solr, I would like to get facets of a field date but according to a day of the week,
for example, if I have 3 records with the following values on the date field:
16/11
22/11
23/11
I would like to get the following facet:
Sunday: 2,
Monday: 1
Is it possible?
Solr does not have anything built in that derives the day of the week from a date.
You need to index the day of the week in a separate field,
holding values like SUNDAY, MONDAY, etc.
Once you have this field and have indexed the data in Solr,
you can facet on the day of the week.
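A minimal sketch of deriving that extra field at index time follows. The field name day_of_week and the helper name are hypothetical. The names come from a fixed tuple rather than from strftime, so the indexed values do not depend on the machine's locale.

```python
from datetime import date

# date.weekday() returns 0 for Monday through 6 for Sunday.
DAY_NAMES = (
    "MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
    "FRIDAY", "SATURDAY", "SUNDAY",
)


def day_of_week(d: date) -> str:
    """Return the upper-case English day name for d."""
    return DAY_NAMES[d.weekday()]


# Document to send to Solr; "day_of_week" is a hypothetical string field.
d = date(2012, 2, 12)
doc = {
    "id": "1",
    "date": d.strftime("%Y-%m-%dT00:00:00Z"),
    "day_of_week": day_of_week(d),
}
```

With the field indexed, a plain field facet on day_of_week returns the per-day counts.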

Sunspot solr index search and index on range data

I am storing availability timings for my users: for each day of the week, they enter the times at which they are available.
For example, Mr X would be available on
Sunday for 2-5, 8-12, 15-18
Monday for 1-3, 5-8, 10-12
and so on for the entire week.
What would be the best way of indexing and searching this data in Solr?
A database query for searching such a dataset would look like:
select * from schedule inner join days on schedule.day_id = days.id
where days.name = 'Sunday' and schedule.start>=5 and schedule.end>=8
Use the DateRangeField, which became available in Solr 5. It lets you query for documents whose indexed ranges match your query time.
fq={!field f=dateRange op=Contains}[2013 TO 2018]
Before Solr 5 there's a neat hack that uses the spatial support in Solr to query for overlapping durations (if this point is contained within the expected time area, etc.).
Depending on the needed resolution, you could also index seven different fields (monday - sunday) and then index an integer for each hour that the person is available. You can then query the field with a regular query, such as available_sunday:15 to find matching persons.
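The per-day hour fields can be sketched as below. The helper name availability_doc is hypothetical, and treating the end of each range as exclusive (2-5 covers hours 2, 3 and 4) is an assumption; adjust the range call if the end hour should count as available.

```python
def availability_doc(person_id, schedule):
    """Expand per-day (start, end) hour ranges into multi-valued hour fields.

    schedule maps a day name to a list of (start_hour, end_hour) tuples.
    """
    doc = {"id": person_id}
    for day, ranges in schedule.items():
        # Assumption: end hour is exclusive, so (2, 5) yields 2, 3, 4.
        hours = sorted({h for start, end in ranges for h in range(start, end)})
        doc[f"available_{day.lower()}"] = hours
    return doc


doc = availability_doc(
    "mr_x",
    {
        "Sunday": [(2, 5), (8, 12), (15, 18)],
        "Monday": [(1, 3), (5, 8), (10, 12)],
    },
)
```

A query such as available_sunday:15 then matches this document, because 15 is one of the indexed values.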

Solr match any date in given month

In Solr, is it possible to search for all records in a given month regardless of the year or day ? For example, the snippet below would match everything on 01.01.2013 - what I want to do is find everything that appeared on 01.01 for any year.
date:2013-01-01T00:00:00Z
No, not with a date field. Solr can only deal with ranges of dates, just like it only deals with ranges of numbers or ranges of strings. Asking Solr to only query a date field based on the first day of the month is like asking it to query on a numeric field and only give you odd numbers, or querying a string but only those starting with vowels.
What you'll need to do is break up the date into month and day components and then query on those. If your base field is sale_date, you'll also need sale_month and sale_day. Then you can query on sale_month:3 to get everything that happened in any March, sale_day:1 to get everything for the first day of any month, or sale_month:3 AND sale_day:1 to get everything that happened on any March 1st.
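A minimal sketch of producing those component fields at index time, using the sale_date, sale_month and sale_day names from the answer. The helper name date_parts is hypothetical, and storing the base date as midnight UTC is an assumption.

```python
from datetime import date


def date_parts(sale_date: date) -> dict:
    """Return the Solr fields derived from one sale date."""
    return {
        # Assumption: the base date field is stored as midnight UTC.
        "sale_date": sale_date.strftime("%Y-%m-%dT00:00:00Z"),
        "sale_month": sale_date.month,  # 1 to 12
        "sale_day": sale_date.day,      # 1 to 31
    }


fields = date_parts(date(2013, 3, 1))
```

Indexing the components as integers also allows range queries such as sale_month:[6 TO 8] for any summer.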

Filtering on two different dates on the same filter in Solr

In Solr, if you have an indexed piece of data, and within that data you had a set of date values, how can you query against the index and ask for events between X and Y date?
For example, if I have a list of Event Venues, each with dozens of events (single, all day, or multi-day), how would you construct the filter to return venues whose events are between the start and end date specified in a search?
Right now, if I search in a form and submit it through to Solr, the query string looks like this:
&fq=dm_event_start_date\:value:["2013-01-04T05:00:00Z" TO *]
&fq=dm_event_end_date\:value:[* TO "2013-01-08T05:00:00Z"]
&fq=bm_tickets_left\:value:"TRUE"
What I really want to ask for are events that occur or start on January 4th, don't last beyond January 8th, AND still have tickets left.
I feel like what I am getting in return is any event that either falls between the two dates or has tickets available, not necessarily matching the dates.
The date values in the range query probably need to be unquoted, e.g.:
&fq=dm_event_start_date:value:[2013-01-04T05:00:00Z TO *]
&fq=dm_event_end_date:value:[* TO 2013-01-08T05:00:00Z]
&fq=bm_tickets_left:value:true
Also, the bm_tickets_left value should be the bare string, without quotes.
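A small sketch of generating such filters with unquoted dates and an open end on either side. The helper name date_range_fq and the field names event_start, event_end and tickets_left are hypothetical stand-ins, not the fields from the question. Each fq is applied as an additional restriction, so a document has to satisfy all of them.

```python
def date_range_fq(field, start=None, end=None):
    """Return a Solr range filter; a missing bound becomes the * wildcard."""
    lower = start if start is not None else "*"
    upper = end if end is not None else "*"
    # Dates go inside the brackets without surrounding quotes.
    return f"{field}:[{lower} TO {upper}]"


filters = [
    date_range_fq("event_start", start="2013-01-04T05:00:00Z"),
    date_range_fq("event_end", end="2013-01-08T05:00:00Z"),
    "tickets_left:true",
]
```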

Solr: org.apache.solr.common.SolrException: Invalid Date String:

I am new to Solr and this is my first attempt at indexing data. I am getting the following exception while indexing:
org.apache.solr.common.SolrException: Invalid Date String:'2011-01-07'
at org.apache.solr.schema.DateField.parseMath(DateField.java:165)
at org.apache.solr.schema.TrieDateField.createField(TrieDateField.java:169)
at org.apache.solr.schema.SchemaField.createField(SchemaField.java:98)
at org.apache.solr.update.DocumentBuilder.addField(DocumentBuilder.java:204)
at org.apache.solr.update.DocumentBuilder.toDocument(DocumentBuilder.java:277)
I understand from reading some articles that Solr stores time only in UTC. This is the query I am trying to index:
Select id,text,'language',links,tweetType,source,location, bio,url,utcOffset,timeZone,frenCnt,createdAt,createdOnGMT,createdOnServerTime,follCnt,favCnt,totStatusCnt,usrCrtDate,humanSentiment,replied,replyMsg,classified,locationDetail, geonameid,country,continent,placeLongitude,placeLatitude,listedCnt,hashtag,mentions,senderInfScr, createdOnGMTDate,DATE_FORMAT(CONVERT_TZ(createdOnGMTDate,'+00:00','+05:30'),'%Y-%m-%d') as IST,DATE_FORMAT(CONVERT_TZ(createdOnGMTDate,'+00:00','+01:00'),'%Y-%m-%d') as ECT,DATE_FORMAT(CONVERT_TZ(createdOnGMTDate,'+00:00','+02:00'),'%Y-%m-%d') as EET,DATE_FORMAT(CONVERT_TZ(createdOnGMTDate,'+00:00','+03:30'),'%Y-%m-%d') as MET,sign(classified) as sentiment from
I am doing this timezone conversion because I need to group results by the user's timezone. How can I achieve this?
Regards,
Rohit
Solr dates must be in the form 1995-12-31T23:59:59Z. You're only giving the date part, but not the time.
See the DateField javadocs for more details.
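As a minimal sketch, a date-only value such as the one in the exception can be expanded to the full form before indexing. The helper name date_only_to_solr is hypothetical, and mapping the date to midnight UTC is an assumption; pick whatever time of day fits your data.

```python
from datetime import datetime


def date_only_to_solr(value: str) -> str:
    """Turn 'YYYY-MM-DD' into 'YYYY-MM-DDT00:00:00Z'.

    Raises ValueError if value is not a valid date in that format.
    """
    parsed = datetime.strptime(value, "%Y-%m-%d")
    # Assumption: a date-only value means midnight UTC on that day.
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")


fixed = date_only_to_solr("2011-01-07")
```

Parsing the value first, rather than appending a suffix to the string, rejects malformed dates before they reach Solr.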
Date faceting is entirely driven by query params, so if we index your events using the "true" time that they happened at (formatted as a string in UTC), you can then select your date ranges using whatever timezone offset your user specifies at query time, expressed as a UTC offset.
facet.range = dateField
facet.range.start = 2011-01-01T00:00:00Z+${useroffset}MINUTES
facet.range.gap = +1DAY
This returns results bucketed in the user's timezone, so there is no need to convert timezones in the query and index a separate column for each timezone.
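The parameters can be assembled per request along these lines. The helper name facet_range_params is hypothetical. The facet.range.end parameter is not in the snippet above and is added here on the understanding that Solr range faceting needs an end as well as a start. The sign convention of offset_minutes is left to the caller and is an assumption.

```python
def facet_range_params(field, start_utc, end_utc, offset_minutes, gap="+1DAY"):
    """Build range-facet parameters shifted by a per-user offset in minutes."""
    # Solr date math: append e.g. +330MINUTES or -300MINUTES to the date.
    shift = f"{offset_minutes:+d}MINUTES"
    return {
        "facet": "true",
        "facet.range": field,
        "facet.range.start": start_utc + shift,
        "facet.range.end": end_utc + shift,
        "facet.range.gap": gap,
    }


params = facet_range_params(
    "dateField",
    "2011-01-01T00:00:00Z",
    "2011-02-01T00:00:00Z",
    330,
)
```

When these values go into a URL, the plus signs must be percent-encoded (%2B), or they will be decoded as spaces; most HTTP client libraries handle that when given a parameter dict.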
Regards,
Rohit
Credit for answer: Chris Hostetter (Solr User Group)
