How to get the version of lucene index in solr - solr

My use case: I am currently on solr 5.5 and upgrading to solr 8.8
For this, I will need to do re-indexing on all machines where solr is installed. I need to do a check on the index version, if the index is made from the old version, then I will run the re-indexing logic, and if it already is the new version, I will skip the re-indexing.
Is there a way to detect the index version?
NOTE: the config files will already be updated to the new version so cannot use tag from solrconfig.xml

HTTP GET request to retrive the info :
yoursolrhost:8983/solr/admin/info/system
It would be something like below
http://yoursolrhost:8983/solr/admin/info/system?wt=json

Related

Sitecore and SolrCloud switch on rebuild

As you may already know that Sitecore, configured to work with SolrCloud, does not support index switch on rebuild. Is there a way to achieve this with version 4.10.3 of Solr and Sitecore 8.0?
We found a link - https://github.com/SitecoreSupport/Sitecore.Support.449298 - but this has only been tested from version 5.2.1 to 5.5.1. Does anyone have any experience implementing this for version 4.10.3? Any issues that we may need to be aware of?
Thanks
This patch was created as the old SwitchOnRebuild used the Solr Core switching API which is now deprecated in version 5.* and above. It was not recommended when running in SolrCloud mode due to an issue with Zookeeper.
This code uses the Solr 'collections' API (/solr/admin/collections?action=LIST) instead , you would need to check if this API is available for Solr 4.10 (I think it is but I'm not 100% sure)
You would then just need to ignore the parts about the schemaFactory as that is Solr 5.* specific.
Note that this patch relies on the 405677 patch to be applied too.

Solr - Migrate Documents from one Collection to another existing one

I need to move all Solr Documents from one collection to another (already existing collection) - there are 500,000 documents.
I have tried the solr migrate but cannot get the routing key correct. I have tried:
curl 'http://localhost:8983/solr/admin/collections?action=MIGRATE&collection=oldCollection&target.collection=newCollection&split.key=!'
I have solr 4.10.3 installed in a cloudera installation.
Copy your existing oldCollection, and rename the as newCollection,
After that you may need to update some config files for the same.
Or create a new one using the below api
https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api1
The answer and the question are quite old, starting from 8.1 solr version, there is a feature specific for this purpose which is the reindexcollection api which can directly be used to reindex docs from source to a target collection with a lot of configurable options. Here is the link to the official doc : https://lucene.apache.org/solr/guide/8_1/collections-api.html#reindexcollection

Mahout & Lucene Version Compatibility

I am trying to use Mahout to do some analysis on the term vectors stored in my Solr/Lucene index. Unfortunately, it seems that the latest Mahout release is behind the latest Solr/Lucene release.
My Solr/Lucene installation is 4.10.3. As far as I can tell, the latest Mahout release (1.0) expects Lucene indexes at version 4.6.1.
When I run mahout lucene.vector I get the error:
Exception in thread "main" org.apache.lucene.index.IndexFormatTooNewException: Format version is not supported (resource: MMapIndexInput(path="/path/to/data/index/segments.gen")): -3 (needs to be between -2 and -2)
I have tried two things so far to tackle this problem:
First, I edited my solrconfig.xml file to say:
<luceneMatchVersion>4.6.1</luceneMatchVersion>
delete my indexed data, and built a clean index from the original documents. This has done nothing to fix the error.
So secondly, I tried to change the lucene.version in the Mahout pom.xml file to 4.10.3 and recompile the binary to see if the capabilities had been added yet. I knew this was unlikely to work, but tried anyway.
My question is, how do I appropriately change the Lucene version that Solr uses for writing index files if it is not the above luceneMatchVersion setting in solrconfig.xml?
Mahout seems to support Solr 3.x for now. You may try this patch for mahout.

Solr Upgrade from 3.4 to 4

In order to make use of pivot feature present on Solr 4, I upgraded from 3.4.
Shall I proceed with a full reindex of the content due this upgrade or are they compatible somehow?
And regarding my client-applications that are currently accessing my solr server 3.4, will they present problem after upgrade? (The preliminary test I did they are running, seems the xml schema returned in a query response didn't changed when you don't use new features)
You need to do a full reindex if you want to use the Solr 4 index structure. Else you need to change the Lucene version in solrconfig to use the old index.
The schema will need a new field called _version_ if you want to use the Real Time Get functionality.
Other then that most things are pretty much the same for the client.

Upgrade solr 1.4 index to solr 3.3?

I have an existing index build using apache solr 1.4.
I want to use this existing index in version 3.3. As you know the index format is changed after 3.x, so how is it possible to do this?
I have exported the existing index (that is in 1.4 version) using Luke to XML.
There's two ways to do this:
if your index is unoptimized, then simply optimize it - this will upgrade the file format along the way.
if your index is already optimized, you can't do this. Instead, use the command line tool supplied with solr (your path may differ from mine
java -cp work/Jetty_0_0_0_0_8983_solr.war__solr__k1kf17/webapp/WEB-INF/lib/lucene-core-3.3.0.jar org.apache.lucene.index.IndexUpgrader -verbose /path/to/index/directory
However, note that this only changes the file format - it won't stop deprecation warnings because unless you tell it otherwise, solrconfig.xml defaults to still assuming you're using an old index format. see http://www.mail-archive.com/dev#lucene.apache.org/msg23233.html
You may still get lots of lines like this in your logfile:
WARNING: LowerCaseFilterFactory is using deprecated LUCENE_24 emulation. You should at some point declare and reindex to at least 3.0, because 2.x emulation is deprecated and will be removed in 4.0
until you tell solrconfig.xml that you're ready to use all the features of the new index format. You do this by adding the following to solrconfig.xml (at the top level, just after the abortOnConfigurationError setting).
<!-- Controls what version of Lucene various components of Solr
adhere to. Generally, you want to use the latest version to
get all bug fixes and improvements. It is highly recommended
that you fully re-index after changing this setting as it can
affect both how text is indexed and queried.
-->
<luceneMatchVersion>LUCENE_33</luceneMatchVersion>
If you have the data: the best way is indexing all the data new in solr 3.3
You can use the data import handler to index your exported XML files.
If building up a new index is not an solution for you, you have got different possibilities:
As far as i know, Solr 3.3 can read old indexes.
So one idea could be using shards. One shard for the old data (read only) an the other shard for the new data. Unfortunately, in this solution you will be unable to modify old data.

Resources