Solr: how to index date and time - solr

I have Solr 7.2.1, and in my managed-schema.xml file I have a date field of type "pDate".
Now I also need to index the time of day, but it seems I can't search by time with the "pDate" field type. If I query Solr for my_date_field:[2018-03-12T00:00:00.000Z TO *] it works; if I instead search for [2018-03-12T12:00:00.000Z TO *] I get no results.
So, basically, which field type should I use to achieve this? Is the field type the source of the problem?

The Solr Ref Guide says: "Solr’s date fields (DatePointField, DateRangeField and the deprecated TrieDateField) represent "dates" as a point in time with millisecond precision." So the field type cannot be the source of your problem.
Check the format "12/03/2018T12:00:00.000Z". Is it really correct? I only know dates formatted like "2018-03-12T12:00:00.000Z". See "Date Formatting".
Also find the document in Solr Admin UI and inspect the JSON response. What is the value of my_date_field?

I found the solution:
I had to set the $SOLR_TIMEZONE variable in the "solr.in" config file, and the correct value for me was "Europe/Rome", not "CEST".
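
Since Solr date values are expressed in UTC (the trailing "Z"), a timezone mix-up can shift the boundary of a range query by an hour or more. The following is only a rough sketch of converting a local wall-clock time to the UTC string used in the query above. It assumes Europe/Rome was on CET (UTC+1) on 2018-03-12 and uses a fixed offset instead of a timezone database; the helper name to_solr_utc is made up for illustration.

```python
from datetime import datetime, timedelta, timezone


def to_solr_utc(local_dt: datetime) -> str:
    """Format an offset-aware datetime as a UTC string with millisecond precision."""
    if local_dt.tzinfo is None:
        raise ValueError("datetime must carry a UTC offset")
    utc = local_dt.astimezone(timezone.utc)
    millis = utc.microsecond // 1000
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{millis:03d}Z"


# Assumption: fixed UTC+1 offset (CET) stands in for Europe/Rome on this date.
cet = timezone(timedelta(hours=1))
noon_rome = datetime(2018, 3, 12, 12, 0, 0, tzinfo=cet)

# Noon local time corresponds to 11:00 UTC, so the range starts an hour "earlier".
range_query = f"my_date_field:[{to_solr_utc(noon_rome)} TO *]"
print(range_query)
```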

Related

How expiration field values for documents from a "time to live" (TTL) is calculated in Solr?

I went through the Javadoc for DocExpirationUpdateProcessorFactory. It says:
The DocExpirationUpdateProcessorFactory provides two features related
to the "expiration" of documents which can be used individually, or in
combination:
Computing expiration field values for documents from a "time to live" (TTL)
Periodically delete documents from the index based on an expiration field
But it doesn't specify how expirationField values are calculated from the ttl field.
Can anybody help me understand how it's calculated?
ttlFieldName - Name of a field this process should look for in each document processed, defaulting to _ttl_. If the specified field name exists in a document, the document field value will be parsed as a Date Math Expression relative to NOW and the result will be added to the document using the expirationFieldName.
This means that you can use a value like +2HOURS in the _ttl_ field to make the document expire two hours after it was indexed. The resulting date value is then stored in the field named by expirationFieldName.
From Cloudera's documentation about the introduction of the feature:
Current Time is: 2016-10-26 20:14:00
_ttl_ is defined as: +2HOURS
This will result in an expiration value of 2016-10-26 22:14:00
There are also more examples in Lucidworks' description of the feature:
{ "id" : "live_2_minutes_b",
"time_to_live_s" : "+120SECONDS"
},
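
To make the arithmetic concrete, here is a small sketch that mimics how a TTL such as +2HOURS is added to "now". It is not Solr's Date Math parser: it only understands signed terms like +2HOURS or +120SECONDS (no rounding with "/", no months or years), and the function name expiration_from_ttl is made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Map of supported Date Math units to timedelta keyword arguments (a subset only).
_UNITS = {"SECOND": "seconds", "MINUTE": "minutes", "HOUR": "hours", "DAY": "days"}
_TERM = re.compile(r"([+-])(\d+)(SECOND|MINUTE|HOUR|DAY)S?")


def expiration_from_ttl(now: datetime, ttl: str) -> datetime:
    """Apply each signed term of the TTL expression to `now`, left to right."""
    if not ttl:
        raise ValueError("empty TTL expression")
    result = now
    pos = 0
    while pos < len(ttl):
        match = _TERM.match(ttl, pos)
        if match is None:
            raise ValueError(f"unsupported TTL expression: {ttl!r}")
        sign, amount, unit = match.groups()
        delta = timedelta(**{_UNITS[unit]: int(amount)})
        result = result + delta if sign == "+" else result - delta
        pos = match.end()
    return result


indexed_at = datetime(2016, 10, 26, 20, 14, 0)
print(expiration_from_ttl(indexed_at, "+2HOURS"))      # same case as the Cloudera example
print(expiration_from_ttl(indexed_at, "+120SECONDS"))  # TTL value from the Lucidworks example
```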

Solr average of function result

I have a Solr index with documents that have the fields started and stopped, both of which hold a datetime. I would like Solr to output the average difference between them.
To get the difference between started and stopped I have used diff:ms(started, stopped) within fl.
I know you can get stats about a field with stats=true and stats.field=fieldname, but if I use either diff or ms(started, stopped) as the field name it fails with an undefined field error.
So is what I want possible? If so, how do I go about it?
You should be able to use the function support in the JSON Faceting API:
q=*:*&
json.facet={
"avg_time": "avg(ms(started, stopped))"
}
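
For anyone sending this from a script, the request parameters could be assembled roughly as below. This is a sketch only: rows=0 is my own addition to suppress the document list, and the helper name avg_facet_params is invented. Also note that, as far as I know, ms(a,b) evaluates to a minus b, so the argument order decides the sign of the average.

```python
import json
from urllib.parse import urlencode


def avg_facet_params(func: str) -> dict:
    """Build Solr query parameters that ask the JSON Facet API for avg(func)."""
    facet = {"avg_time": f"avg({func})"}
    return {
        "q": "*:*",
        "rows": 0,  # assumption: only the aggregate is wanted, not the documents
        "json.facet": json.dumps(facet),
    }


params = avg_facet_params("ms(started,stopped)")
query_string = urlencode(params)  # append this to the collection's /select URL
print(query_string)
```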

Change Solr Field type

One string field in Lucene/Solr is stored like this: 'yyyyMMdd'.
I need to convert the field to tdate type.
How can I achieve this and do a re-index?
If your data arrives in an incomplete date format and you want to parse it, you need to use an UpdateRequestProcessor (URP) chain. The specific URP is ParseDateFieldUpdateProcessorFactory. It is used in Solr's schemaless example, so you can check its usage in the solrconfig.xml there.
Most likely, you need to re-index from the source collection. Solr has no option to rewrite individual fields in place.
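
If you re-index from the source anyway, an alternative to the URP is to convert the value on the client before sending it. A minimal sketch, assuming the 'yyyyMMdd' strings should be read as midnight UTC (the function name yyyymmdd_to_solr is hypothetical):

```python
from datetime import datetime, timezone


def yyyymmdd_to_solr(value: str) -> str:
    """Convert a 'yyyyMMdd' string to an ISO-8601 UTC timestamp for a Solr date field."""
    # Assumption: the stored day has no time part, so midnight UTC is used.
    parsed = datetime.strptime(value, "%Y%m%d").replace(tzinfo=timezone.utc)
    return parsed.strftime("%Y-%m-%dT%H:%M:%SZ")


print(yyyymmdd_to_solr("20180312"))
```

strptime raises ValueError for strings that are not valid dates, which is a convenient place to catch bad records before they reach Solr.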

How to change apache Nutch timestamp value

I am using Apache Nutch 2.3. Documents are indexed by Nutch into Solr correctly, but I need to know when a document was indexed in Solr, with both date and time. I get the following format in the timestamp of a document:
"tstamp": "2015-04-06T10:11:16.619Z"
If I assume the first part is the date, it shows the fourth month, but the current month is the third. How can I fix this?
Any suggestions?
This was a bug. The next fetch time was being mistakenly assigned as the current fetch time.
There is a patch for it:
https://issues.apache.org/jira/browse/NUTCH-2045

Get Latest N Documents from Solr

I have a standard Solr 3.6 index and am looking to get the latest N documents back (date ascending from indexing them).
This site was helpful but not exactly what I'm looking for.
I am looking to do something like this:
localhost:8080/solr/select/?q=greekbailout&wt=json; date asc
Basically, query whatever with json output and the latest N submitted documents to the index. Anybody run into this before?
Use &sort=date asc for pure sorting and this for boosting newer documents.
solr query using date field with N documents returned in results
localhost:8080/solr/select/?q=greekbailout&wt=json&sort=date asc&rows=N
The default Solr schema has a field called timestamp, which stores the time at which a document was created or modified. If your date field doesn't store this and that is what you need, you can use timestamp instead: just replace date with timestamp.
In your Solr URL, just append &sort=<field>+<asc/desc>. The field must be indexed and not multivalued.
You can also sort on multiple fields.
&sort=<field name>+<direction>[,<field name>+<direction>]...
http://wiki.apache.org/solr/CommonQueryParameters#sort
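
Putting the sort and rows parameters together, the URL can be built along these lines. This is just a sketch: the host and path are taken from the question, the helper name latest_docs_url is invented, and I default to desc because descending order puts the newest documents first (pass newest_first=False to get the asc order used above).

```python
from urllib.parse import urlencode


def latest_docs_url(base: str, query: str, sort_field: str = "date",
                    rows: int = 10, newest_first: bool = True) -> str:
    """Build a select URL that returns `rows` documents sorted by `sort_field`."""
    direction = "desc" if newest_first else "asc"
    params = {
        "q": query,
        "wt": "json",
        "sort": f"{sort_field} {direction}",  # urlencode turns the space into '+'
        "rows": rows,
    }
    return f"{base}?{urlencode(params)}"


url = latest_docs_url("http://localhost:8080/solr/select/", "greekbailout", rows=5)
print(url)
```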
