I cannot upload anymore CSV data in SOLR - database

i have tried multiple times to upload a big data set into solr, and i get this error, does anyone know what can i do ?
https://i.stack.imgur.com/DUIKC.png

I am able to do this in Solr 7.x (tried in 7.2 as well as 7.6) version. See if you can use that version for your project.
Probably in the newer version, solrconfig.xml needs some changes depending on which one you picked for your collection creation like _deafult or sample_techproducts_configs.

Related

upgrade solr from 4.2.1 to 5.3.1

I've been tasked with migrating from our solr 4.2.1 server to a new solr server, 5.3.1. I was hoping I could just pick up the cores, and move them over with a little but of editing files. But atlas, I can't quite figure it out.
I have tried moving a single core, and creating a core.properties files with the name of the core and I get:
testcore: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error loading class 'solr.JsonUpdateRequestHandler'
Any thoughts as to what the problem might be? Any thoughts would be appreciated, thank you!
I am in the final stages of the similar upgrade; here is how I suggest you proceed.
Install both versions side by side and create the collection in new solr
Take your default schema/solrconfig from the new solr and move stuff into it from your old schema/solrconfig. The formatting changed, so you will need to manually move all of your config.
Make sure that works
Move the indexes - once your solrconfig and schema match up you should be able to use your old indexes (data directory).
To complete the upgrade you will need to re-index into a new but similar collection. This will upgrade the underlying lucene indexes. Your new version of solr has cursor mark support so it becomes much simplier; especially if you are using collection aliases.
JSON does not have its own request handler any longer (changed in 4.x, removed in 5.x). It has now been merged into the standard solr.UpdateRequestHandler, and the request handler is selected internally based on the Content-Type header of the request.

Solr - Migrate Documents from one Collection to another existing one

I need to move all Solr Documents from one collection to another (already existing collection) - there are 500,000 documents.
I have tried the solr migrate but cannot get the routing key correct. I have tried:
curl 'http://localhost:8983/solr/admin/collections?action=MIGRATE&collection=oldCollection&target.collection=newCollection&split.key=!'
I have solr 4.10.3 installed in a cloudera installation.
Copy your existing oldCollection, and rename the as newCollection,
After that you may need to update some config files for the same.
Or create a new one using the below api
https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api1
The answer and the question are quite old, starting from 8.1 solr version, there is a feature specific for this purpose which is the reindexcollection api which can directly be used to reindex docs from source to a target collection with a lot of configurable options. Here is the link to the official doc : https://lucene.apache.org/solr/guide/8_1/collections-api.html#reindexcollection

Solr luceneMatchVersion syntax

I have Solr 4.10 and I have collection on it with solorconfig.xml has the value for <luceneMatchVersion> as follows:
<luceneMatchVersion>4.7</luceneMatchVersion>
Is this correct? I saw other examples that has values such as LUCENE_35 What I need to know also, how could I express LUCENE_xx from my current Solr version?
You should use:
<luceneMatchVersion>4.10.4</luceneMatchVersion>
I recommend you to check your current solr version, in my case was 4.10.4.
if you are going to reindex, then both numbers should match. The only reason you might want to have them different, is if you had and index created with say Lucene 4.7, then you would have
<luceneMatchVersion>4.7</luceneMatchVersion>
Then, you upgrade lucene to 4.10.
Now, if among the changes in between 4.7 and 4.10 there are things that work differently regarding analysis (you get the same sentence analysed in both versions and get different output as a result), then, you might want to keep the version number at 4.7, otherwise some queries that contain affected terms might not work (as they were analysed at index time in a different way than at query time). You have to asses how critical that issue might be.
That is why the recommendation is to upgrade, change the setting to the current number, and reindex. This way you are sure to avoid any issue.
If anyone is using Drupal, the Search API Solr (search_api_solr) module has config templates by version in /sites/all/modules/search_api_solr/solr-conf/.
The template README.md states the following:
The solr-conf-templates directory contains config-set templates for
different Solr versions.
These are templates and are not to be used as config-sets!
To get a functional config-set you need to generate it via the Drupal
admin UI or with drush solr-gsc. See README.md in the module
directory for details.
The module's README.md lists these instructions:
Make sure you have Apache Solr started and accessible (i.e. via port 8983). You can start it without having a core configured at
this stage.
Visit Drupal configuration (/admin/config/search/search-api) and create a new Search API Server according to the search_api
documentation using "Solr" as Backend and the connector that
matches your setup. Input the correct core name (which you will
create at step 4, below).
Download the config.zip from the server's details page or by using drush solr-gsc with proper options, for example for a server named
"my_solr_server": drush solr-gsc my_solr_server config.zip 8.4.
Copy the config.zip to the Solr server and extract.
I generated a config file for 8.x, and it uses this:
<luceneMatchVersion>${solr.luceneMatchVersion:LUCENE_80}</luceneMatchVersion>

Disappearing cores in Solr

I am new to Solr.
I have created two cores from the admin page, let's call them "books" and "libraries", and imported some data there. Everything works without a hitch until I restart the server. When I do so, one of these cores disappears, and the logging screen in the admin page contains:
SEVERE CoreContainer null:java.lang.NoClassDefFoundError: net/arnx/jsonic/JSONException
SEVERE SolrCore REFCOUNT ERROR: unreferenced org.apache.solr.core.SolrCore#454055ac (papers) has a reference count of 1
I was testing my query in the admin interface; when I refreshed it, the "libraries" core was gone, even though I could normally query it just a minute earlier. The contents of solr.xml are intact. Even if I restart Tomcat, it remains gone.
Additionally, I was trying to build a query similar to this: "Find books matching 'war peace' in libraries in Atlanta or New York". So given cores "books" and "libraries", I would issue "books" the following query (which might be wrong, if it is please correct me):
(title:(war peace) blurb:(war peace))
AND _query_:"{!join
fromIndex=libraries from=libraryid to=libraryid
v='city:(new york) city:(atlanta)'}"
When I do so, the query fails with "libraries" core disappears, with the above symptoms. If I re-add it, I can continue working (as long as I don't restart the server or issue another join query).
I am using Solr 4.0; if anyone has a clue what is happening, I would be very grateful. I could not find out anything about the meaning of the error message, so if anyone could suggest where to look for that, or how go about debugging this, it would be really great. I can't even find where the log file itself is located...
I would avoid the Debian package which may be misconfigured and quirky. And it contains (a very early build of?) solr 4.0, which itself may have lingering issues; being the first release in a new major version. The package maintainer may not have incorporated the latest and safest Solr release into his package.
A better way is to download Solr 4.1 yourself and set it up yourself with Tomcat or another servlet container.
In case you are looking to install SOLR 4.0 and configure, you can following the installation procedure from here
Update the solr config for the cores to be persistent.
In your solr.xml, update <solr> or <solr persistent="false"> to <solr persistent="true">

Upgrade solr 1.4 index to solr 3.3?

I have an existing index build using apache solr 1.4.
I want to use this existing index in version 3.3. As you know the index format is changed after 3.x, so how is it possible to do this?
I have exported the existing index (that is in 1.4 version) using Luke to XML.
There's two ways to do this:
if your index is unoptimized, then simply optimize it - this will upgrade the file format along the way.
if your index is already optimized, you can't do this. Instead, use the command line tool supplied with solr (your path may differ from mine
java -cp work/Jetty_0_0_0_0_8983_solr.war__solr__k1kf17/webapp/WEB-INF/lib/lucene-core-3.3.0.jar org.apache.lucene.index.IndexUpgrader -verbose /path/to/index/directory
However, note that this only changes the file format - it won't stop deprecation warnings because unless you tell it otherwise, solrconfig.xml defaults to still assuming you're using an old index format. see http://www.mail-archive.com/dev#lucene.apache.org/msg23233.html
You may still get lots of lines like this in your logfile:
WARNING: LowerCaseFilterFactory is using deprecated LUCENE_24 emulation. You should at some point declare and reindex to at least 3.0, because 2.x emulation is deprecated and will be removed in 4.0
until you tell solrconfig.xml that you're ready to use all the features of the new index format. You do this by adding the following to solrconfig.xml (at the top level, just after the abortOnConfigurationError setting).
<!-- Controls what version of Lucene various components of Solr
adhere to. Generally, you want to use the latest version to
get all bug fixes and improvements. It is highly recommended
that you fully re-index after changing this setting as it can
affect both how text is indexed and queried.
-->
<luceneMatchVersion>LUCENE_33</luceneMatchVersion>
If you have the data: the best way is indexing all the data new in solr 3.3
You can use the data import handler to index your exported XML files.
If building up a new index is not an solution for you, you have got different possibilities:
As far as i know, Solr 3.3 can read old indexes.
So one idea could be using shards. One shard for the old data (read only) an the other shard for the new data. Unfortunately, in this solution you will be unable to modify old data.

Resources