How to change apache Nutch timestamp value - solr

I am using Apache Nutch 2.3. Documents are indexed by Nutch into Solr without problems, but I need to know when a document was indexed in Solr, with both the date and the time. I get the following format in the timestamp of a document:
"tstamp": "2015-04-06T10:11:16.619Z"
If the first part is the date, then the month should be the third, not the fourth. How can I fix this?
Any suggestions?

This was a bug. The next fetch time was being mistakenly assigned as the current fetch time.
There is a patch for it:
https://issues.apache.org/jira/browse/NUTCH-2045

Related

Solr: how to index date and time

I have Solr 7.2.1, and in my managed-schema.xml file I have a field of type "pDate" which represents a date object.
Now I also need to index the time of day, but it seems I can't search by time with the "pDate" field type. If I query Solr with my_date_field:[2018-03-12T00:00:00.000Z TO *] it works; if I search with [2018-03-12T12:00:00.000Z TO *] instead, I get no results.
So, basically, which type is better to use to achieve that? Is the field type the origin of the problem?
Solr Ref Guide says: "Solr’s date fields (DatePointField, DateRangeField and the deprecated TrieDateField) represent "dates" as a point in time with millisecond precision." So the field can not be the origin of your problem.
Check the format "12/03/2018T12:00:00.000Z". Is it really correct? I only know dates formatted like "2018-03-12T12:00:00.000Z". See here: Date Formatting
Also find the document in Solr Admin UI and inspect the JSON response. What is the value of my_date_field?
I found the solution.
I had to set the $SOLR_TIMEZONE variable in the "solr.in" config file, and the correct value for me was not "CEST" but "Europe/Rome".

Solr average of function result

I have a Solr index with documents that have the fields started and stopped, which both hold a datetime. I would like Solr to output the average difference between them.
To get the difference between started and stopped I have used diff:ms(started, stopped) within fl.
I know you can get stats about a field with stats=true and stats.field=fieldname, but if I use either diff or ms(started, stopped) as the field name, it fails with an undefined field error.
So is what I want possible? If so, how do I go about it?
You should be able to use the function support in the JSON Faceting API:
q=*:*&
json.facet={
"avg_time": "avg(ms(started, stopped))"
}
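As a minimal sketch of how the request above could be assembled from a client, the following Python builds the query parameters. The helper name build_avg_facet_params and the rows=0 setting are assumptions added for illustration; only q and json.facet come from the answer. Note that Solr documents ms(a,b) as a minus b, so ms(started, stopped) is negative whenever stopped is later than started; swapping the arguments would give a positive duration.

```python
import json
from urllib.parse import urlencode


def build_avg_facet_params(start_field="started", stop_field="stopped"):
    """Build Solr query parameters asking for the average of ms(start, stop).

    Hypothetical helper: the field names default to those in the question.
    """
    facet = {"avg_time": f"avg(ms({start_field},{stop_field}))"}
    return {
        "q": "*:*",
        "rows": 0,  # assumption: only the aggregate is wanted, not the documents
        "json.facet": json.dumps(facet),
    }


params = build_avg_facet_params()
query_string = urlencode(params)  # append this to .../select? on your Solr core
print(query_string)
```

The resulting string can be appended to the select URL of the core; the average should then appear under the facets key of the JSON response.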

MSSQL datetime to SOLR date using ColdFusion

My query has a field CONTENTDISPLAYDATE and cfdump displays it as "2014-10-16 00:00:00.0". I add it to the SOLR collection using contentdisplaydate_dt="ContentDisplayDate" in my cfindex statement.
When I cfdump the resulting data from the cfsearch result, the field appears as "Thu Oct 16 00:00:00 EDT 2014" and sorting on it doesn't work. Using query of queries on the result set and ordering by it also doesn't work. So it looks like assigning it to a SOLR date field isn't working. Can anyone shed light on what I'm doing wrong? We're using the default version of SOLR that ships with CF 10.
The first thing to do would be to check to make sure that the field contentdisplaydate_dt is defined as a date field in Solr. You can do this by looking at the file schema.xml under this particular collection (often C:\ColdFusion9\collections\mycollection\conf\).
You can also confirm the content of contentdisplaydate_dt by querying the Solr index directly in your browser (using the Solr web service):
http://mysolrserver:8983/solr/myindex/select?q=searchterm&fl=contentdisplaydate_dt
(The above URL will return XML data by default; if you prefer JSON then add &wt=json to the URL.)
My guess is that what is happening is that ColdFusion is trying to convert the Solr dates (which are always of the format yyyy-mm-ddThh:mm:ssZ) and the results aren't pretty. You have to do some manipulation in order to convert Solr dates to a date format recognized by CF.
Last, I would encourage you to use Solr's web service both to index and to search your data rather than using <cfindex> and <cfsearch>. Searching is especially easy with the Solr web service; just use <cfhttp> to call the web service and deserializeJSON() to process the data returned (assuming you're returning JSON instead of XML).
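To illustrate the kind of date manipulation mentioned above, here is a rough sketch in Python (not ColdFusion) that parses a Solr-style UTC date string and re-emits it in a plain sortable form. The function names parse_solr_date and to_sortable_string are hypothetical, and the output format is just one possible choice; the same idea would have to be ported to CF's own date functions.

```python
from datetime import datetime, timezone


def parse_solr_date(value):
    """Parse a Solr date such as 2014-10-16T00:00:00Z into an aware datetime.

    Handles values with or without fractional seconds.
    """
    fmt = "%Y-%m-%dT%H:%M:%S.%fZ" if "." in value else "%Y-%m-%dT%H:%M:%SZ"
    # The trailing Z means UTC, so attach the UTC timezone explicitly.
    return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)


def to_sortable_string(value):
    """Convert a Solr date string to 'YYYY-MM-DD HH:MM:SS' (still in UTC)."""
    return parse_solr_date(value).strftime("%Y-%m-%d %H:%M:%S")


print(to_sortable_string("2014-10-16T04:00:00Z"))
```

Strings in this year-first layout sort correctly even when compared as plain text, which avoids the problem seen with values like "Thu Oct 16 00:00:00 EDT 2014".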

SolrNet query to get records between two field values

Hi guys, I have started working on a project which needs a Solr implementation for searching.
I am using the SolrNet library and my question is:
I have two fields in the Solr index, Maxsal and Minsal, and I have a Currentsal parameter which contains a salary amount. What I want is to get all records which satisfy this condition:
currentsal < Maxsal && currentsal > Minsal
Take a look at Solr range queries. They should allow you to create a query like this:
minsal:[* TO PARAM] AND maxsal:[PARAM TO *]
For more information look here - http://www.solrtutorial.com/solr-query-syntax.html
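A small sketch of how the range query above could be built from a parameter, written in Python for illustration. The helper name salary_range_query and the inclusive flag are assumptions; the field names default to those in the answer. Square brackets give inclusive bounds as in the answer, while curly braces are Solr's syntax for exclusive bounds, which is closer to the strict comparison in the question (mixing the two bracket styles in one range assumes a reasonably recent Solr version).

```python
def salary_range_query(current_salary, min_field="minsal", max_field="maxsal",
                       inclusive=True):
    """Build a Solr query matching documents where min <= salary <= max.

    With inclusive=False the salary itself is excluded from both ranges.
    """
    close_upper = "]" if inclusive else "}"
    open_lower = "[" if inclusive else "{"
    return (
        f"{min_field}:[* TO {current_salary}{close_upper} AND "
        f"{max_field}:{open_lower}{current_salary} TO *]"
    )


print(salary_range_query(50000))
print(salary_range_query(50000, inclusive=False))
```

The returned string can be passed as the q parameter of a Solr request, or to any client method that accepts a raw query string.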
I never noticed that Query() takes a string parameter too.
So:
Solr.Query("MinSal:[* TO " + parameter + "] AND MaxSal:[" + parameter + " TO *]")

Get Latest N Documents from Solr

I have a standard Solr 3.6 index and am looking to get the latest N documents back (sorted by the date they were indexed, ascending).
This site was helpful but not exactly what I'm looking for.
I am looking to do something like this:
localhost:8080/solr/select/?q=greekbailout&wt=json; date asc
Basically, I want to run any query with JSON output and get back the latest N documents submitted to the index. Has anybody run into this before?
Use &sort=date asc for pure sorting and this for boosting newer documents.
A Solr query using the date field, with N documents returned in the results:
localhost:8080/solr/select/?q=greekbailout&wt=json&sort=date asc&rows=N
The default Solr schema has a field called timestamp, which stores the time at which a particular document was created or modified. If your date field doesn't store this and that is what you need, you can use timestamp instead; just replace date with timestamp.
In your Solr URL, just append &sort=<field>+<asc/desc>. Also, your field should be indexed and not multivalued.
You can also sort on multiple fields.
&sort=<field name>+<direction>[,<field name>+<direction>]...
http://wiki.apache.org/solr/CommonQueryParameters#sort
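The sort and rows parameters described in the answers above can be combined as in the following Python sketch. The helper name latest_docs_url is hypothetical; the host, path, query term and field name are taken from the question. Sorting with asc returns the oldest documents first, so passing direction="desc" is what would put the most recently dated documents at the top.

```python
from urllib.parse import urlencode


def latest_docs_url(base_url, query, n, sort_field="date", direction="asc"):
    """Build a Solr select URL that sorts on one field and returns n rows."""
    if direction not in ("asc", "desc"):
        raise ValueError("direction must be 'asc' or 'desc'")
    params = {
        "q": query,
        "wt": "json",
        "sort": f"{sort_field} {direction}",  # the space is encoded as '+'
        "rows": n,
    }
    return f"{base_url}?{urlencode(params)}"


url = latest_docs_url("http://localhost:8080/solr/select", "greekbailout", 10)
print(url)
```

To sort on the timestamp field instead, pass sort_field="timestamp".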
