Solr 4.1: Can't delete documents older than 30 days

I am running Solr 4.1.
I do have a _version_ field, but I do not understand how to delete by query based on time. As far as I can see, no field in my schema holds a timestamp.
What I am trying to do is run a query that deletes all documents older than say 30 days.
I have tried everything I can find on the net.
curl http://localhost:8983/solr/update?commit=true -H "Content-Type: text/xml" --data-binary '<delete><query>_version_:[* TO NOW-60DAYS] </query></delete>'
curl http://localhost:8983/solr/update?commit=true -H "Content-Type: text/xml" --data-binary '<delete><query>timestamp:[* TO NOW-60DAYS] </query></delete>'
curl http://localhost:8983/solr/update?commit=true -H "Content-Type: text/xml" --data-binary '<delete><query>createTime:[* TO NOW-60DAYS] </query></delete>'
Other deletes work fine, e.g.:
curl http://localhost:8983/solr/update?commit=true -H "Content-Type: text/xml" --data-binary '<delete><query>field:value</query></delete>'

You can enable the timestamp field that is included in schema.xml but commented out. That field is automatically populated with the current date and time each time a document is added to the index. Look for the following in your schema.xml:
<!-- Uncommenting the following will create a "timestamp" field using
a default value of "NOW" to indicate when each document was indexed.
-->
<!--
<field name="timestamp" type="date" indexed="true" stored="true"
default="NOW" multiValued="false"/>
-->
You will need to re-index your documents for them to have this value set.
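Once the timestamp field is enabled and the documents are re-indexed, the delete body can be generated rather than typed by hand. The following is a minimal Python sketch, not the original poster's method. The function name build_delete_older_than is made up for illustration, and it assumes the field is a date field, since date math such as NOW/DAY-30DAYS only makes sense on date fields and not on the numeric _version_ field.

```python
from xml.sax.saxutils import escape


def build_delete_older_than(field: str, days: int) -> str:
    """Return a delete-by-query XML body for documents older than `days` days.

    Hypothetical helper: `field` is assumed to be a Solr date field.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    # NOW/DAY rounds down to the start of the current day; -NDAYS moves the cutoff back.
    query = f"{field}:[* TO NOW/DAY-{days}DAYS]"
    # escape() protects the XML body if the query ever contains &, < or >.
    return f"<delete><query>{escape(query)}</query></delete>"


print(build_delete_older_than("timestamp", 30))
```

The printed string can be passed to curl with --data-binary, in the same way as the commands in the question.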

Related

Is there a way to overcome the limitation of Solr

How can I add an additional column (field) to a collection that I have already created and that already holds crores of documents?
To add a new field to an existing schema, you can use the Solr Schema API:
curl -X POST -H 'Content-type:application/json' --data-binary '{
"add-field":{
"name":"sell_by",
"type":"pdate",
"stored":true }
}' http://localhost:8983/solr/gettingstarted/schema
The type parameter corresponds to the field type you want the new field to have.
If you're using the old schema.xml format, you can add the field definition to the XML there:
<field name="sell_by" type="pdate" indexed="true" stored="true"/>
You'll have to reload the configuration for the collection after changing it. If you're using ZooKeeper (i.e. you're manually uploading your configuration to ZooKeeper), you can use zkCli.sh with downconfig and upconfig to download and upload the configuration set.
After adding the field, you'll have to reindex the documents that should contain the field by submitting them to Solr again, so that the content is added to the field as expected.
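For anyone scripting this instead of using curl, here is a rough Python sketch of the same Schema API call. It only builds the request; sending it is left to the caller. The helper name build_add_field_request and the base URL are assumptions for illustration, while the payload mirrors the curl example above.

```python
import json
import urllib.request


def build_add_field_request(base_url, collection, name, field_type, stored=True):
    """Build (but do not send) a Schema API add-field request. Hypothetical helper."""
    payload = {"add-field": {"name": name, "type": field_type, "stored": stored}}
    # json.dumps guarantees valid JSON, including lowercase true/false.
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/{collection}/schema",
        data=body,
        headers={"Content-type": "application/json"},
        method="POST",
    )


req = build_add_field_request(
    "http://localhost:8983/solr", "gettingstarted", "sell_by", "pdate"
)
print(req.full_url)
print(req.data.decode("utf-8"))
# To actually send it against a running Solr: urllib.request.urlopen(req)
```

Adding the field this way does not populate it for existing documents; they still have to be reindexed as described above.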

Solr atomic updates not working with date field

I am using Solr 6.6. I am trying atomic updates on a date field. The field is defined in schema as
<field name="inventory_update_time" type="date" indexed="true" stored="true" omitNorms="true" multiValued="false" omitTermFreqAndPositions="true"/>
and I am firing the curl request as
curl 'localhost:8081/solr/sitename/update' -H 'Content-type:application/json' -d '[{"id":"9988062","inventoryUpdateTime":"2018-07-03T06:29:29Z"}]'
but the date is not getting updated.
Any suggestions?
Your field name and your JSON key are not the same. You're also not doing an atomic update, since that would require a "set" command.
Your schema has the field name set as inventory_update_time, but in your JSON structure you're using inventoryUpdateTime as the key.
To actually perform an atomic update:
[
{
"id":"9988062",
"inventory_update_time":{
"set":"2018-07-03T06:29:29Z"
}
}
]
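If the update is generated from code, building the body with a JSON library avoids key and quoting mistakes. This is a small Python sketch; build_atomic_set is a made-up helper name, and the field name is taken from the schema line in the question.

```python
import json


def build_atomic_set(doc_id, field, value):
    """Return a JSON body that sets one field on one document (hypothetical helper)."""
    # The {"set": value} wrapper is what makes this an atomic update
    # instead of a replacement of the whole document.
    return json.dumps([{"id": doc_id, field: {"set": value}}])


body = build_atomic_set("9988062", "inventory_update_time", "2018-07-03T06:29:29Z")
print(body)
```

The resulting string can be sent with the same curl command as in the question, in place of the original -d argument.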

Solr add field to schema using curl

A:\DOS> curl -X POST -H 'Content-type:application/json' \
--data-binary '{"add-field":{"name":"timestamp","type":"date","indexed":true,"stored":true,\
"default":NOW,"multiValued":false}}' http://localhost:8983/solr/testt/schema
{
"responseHeader":{
"status":0,
"QTime":0},
"errors":"no stream"}
I am trying to add a 'timestamp' field to Solr, and this is the error I am getting. Can anyone help me figure out what I am doing wrong?
There may be two problems:
&commit=true
The commit parameter was not added at the end of the URL.
schema.xml
The schema does not contain the field you want to set.
There is a problem with your curl command: it works if you pass the JSON from a file instead.
curl -d @timestamp.json -X POST -H 'Content-type:application/json' http://localhost:8983/solr/testt/schema
Create the timestamp.json file with this content:
{
"add-field":{"name":"timestamp","type":"date","indexed":true,"stored":true,"default":"NOW","multiValued":false}
}
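One thing worth checking before sending such a payload is whether it is valid JSON at all. The sketch below (Python, standard library only) shows that a bare NOW, as in the command from the question, is rejected by a strict JSON parser, while the quoted "NOW" parses. Whether Solr's own parser tolerates the unquoted form is not something this sketch establishes. The shortened payloads and the is_strict_json helper are illustrative only.

```python
import json

# Shortened payloads for illustration; only the "default" value differs.
unquoted = '{"add-field":{"name":"timestamp","type":"date","default":NOW}}'
quoted = '{"add-field":{"name":"timestamp","type":"date","default":"NOW"}}'


def is_strict_json(text):
    """Return True if `text` parses as JSON with Python's json module."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(is_strict_json(unquoted))  # False
print(is_strict_json(quoted))    # True
```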

Delete all documents from Solr that have a certain empty field

Querying for those documents works with: "fq=-myfield:[* TO *]".
But how can I delete all those? It seems that the delete syntax update?stream.body=<delete><query>... accepts only a query, no filters...
Pass only -myfield[* TO *] inside the query tag, without the fq parameter; I think that should work. I once had to delete all documents whose id contained the word "data", so I passed id:*data* between the query tags, and it worked. Let me know if that helps.
The correct answer should be: -myfield:* or even -myfield:[* TO *], but the : is mandatory.
This is an example with curl:
curl http://localhost:8983/solr/collection/update \
-H "Content-Type: text/xml" \
--data-binary '<delete><query>-myfield:*</query></delete>'
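As an alternative to the XML body, Solr's update handler also accepts delete commands in JSON when the request is sent with Content-Type application/json. The Python sketch below builds such a body for the "field is missing" case. The helper name build_delete_missing_field is hypothetical, and it is worth confirming that your Solr version supports JSON update commands before relying on it.

```python
import json


def build_delete_missing_field(field):
    """Return a JSON delete-by-query body for documents that lack `field`."""
    # A leading "-" negates the range query; the ":" after the field name is required.
    return json.dumps({"delete": {"query": f"-{field}:[* TO *]"}})


print(build_delete_missing_field("myfield"))
```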

How to delete documents older than a month

We are using Solr 1.4.
How can we delete documents that are older than a month?
We are doing something similar where we purge items from one of our indexes, using curl and taking advantage of the timestamp field in the Solr schema.
Here is the curl command that you would issue to delete items older than 30 days (using DateMathParser to calculate based on current day), using the timestamp field in the schema.
curl "http://localhost:8983/solr/update?commit=true" -H "Content-Type: text/xml" \
--data-binary "<delete><query>timestamp:[* TO NOW/DAY-30DAYS]</query></delete>"
Of course you will need to change the url to match your solr instance and you may choose to use a different field.
Also here is the field definition for the timestamp field from the schema.xml that comes with the Solr distribution in the example folder.
<field name="timestamp" type="date" indexed="true" stored="true" default="NOW" multiValued="false"/>
You need to POST in order to do deletes, but if you use the post.jar from the example folder in the installation, it is simply:
java -Ddata=args -Dcommit=yes -jar post.jar "<delete><query>$DateField:[* TO $DateOneMonthAgo]</query></delete>"
where $DateField is the name of the field where the date is stored and $DateOneMonthAgo is the date one month before now (for example, 2011-11-09T11:48:00Z).
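If you prefer an explicit cutoff date over date math, the value for $DateOneMonthAgo has to be in the UTC format Solr uses for dates, as in the example above. Here is a minimal Python sketch that computes it. The function name solr_date_days_ago and the fixed "now" value are assumptions for illustration, and "one month" is approximated as 30 days.

```python
from datetime import datetime, timedelta, timezone


def solr_date_days_ago(days, now=None):
    """Return the UTC timestamp `days` days before `now`, formatted for Solr."""
    now = now or datetime.now(timezone.utc)
    # Normalise to UTC so the trailing "Z" is accurate.
    cutoff = now.astimezone(timezone.utc) - timedelta(days=days)
    return cutoff.strftime("%Y-%m-%dT%H:%M:%SZ")


# Hypothetical fixed "now" so the result is reproducible.
fixed_now = datetime(2011, 12, 9, 11, 48, 0, tzinfo=timezone.utc)
print(solr_date_days_ago(30, fixed_now))  # 2011-11-09T11:48:00Z
```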
