Solr data-import error log - solr

I have Solr 3.6 powering search on a WordPress site I maintain, and this morning I saw that Solr could not execute a data import. I was attempting to run http://example.com:9393/solr/wordpress/dataimport?command=full-import. Until today the import would chug happily along, but now I get only the message "Indexing failed. Rolled back all changes."
I'm probably missing something obvious, but where does Solr keep the data import logs? I would like to check them out to see what the problem is, but I have not been able to find the right logs.

Solr does not have a dedicated log file for data imports; log statements from the data-import process are written to the standard log file that Solr writes to. If you are using Tomcat, that should be ../logs/catalina.out.
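Since the import messages are mixed in with everything else in the standard log, it can help to filter them out. Below is a minimal sketch, assuming the data-import classes log under names containing "dataimport" (for example the org.apache.solr.handler.dataimport package); the pattern and the function names are illustrative, not something Solr provides.

```python
import re
from typing import Iterable, List

# Assumption: data-import log lines mention "dataimport" somewhere, e.g. via
# the logger name org.apache.solr.handler.dataimport.DataImporter.
DIH_PATTERN = re.compile(r"dataimport", re.IGNORECASE)


def dih_log_lines(lines: Iterable[str]) -> List[str]:
    """Return only the lines that look related to the data import."""
    return [line.rstrip("\n") for line in lines if DIH_PATTERN.search(line)]


def scan_log_file(path: str) -> List[str]:
    """Read a log file (for example ../logs/catalina.out) and filter it."""
    with open(path, errors="replace") as handle:
        return dih_log_lines(handle)
```

Calling scan_log_file("../logs/catalina.out") from the Tomcat directory would list the matching lines; the stack trace that follows a failed import usually sits right after them in the full log, so it is worth opening the file around those positions.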
The error could be caused by any number of problems in Solr, the data source, or perhaps the data itself. You might want to check the following questions as well:
Indexing failed. Rolled back all changes. (Solr DataImport)
solr dataimport error: Indexing failed. Rolled back all changes

Optimistic concurrency issue in SolrCloud

I am using Solr v7.7.1 in cloud mode. I am facing an issue related to optimistic concurrency:
I have a nested document which can be updated concurrently multiple times before committing the updates. During the process of indexing, we fetch the document which we want to modify along with its _version_, modify it and then send it to solr along with the same _version_. If the update happens more than once before committing, the following error is thrown:
Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://1.2.3.4:8983/solr/mcollection_shard1_replica_n2: version conflict for 1111 expected=1645085633861910528 actual=1645090791527284737
In the above error, we are basically trying to index a document with id 1111 before a previous version of the document was indexed and committed. The solution for this problem should be to simply commit all the updates and then try indexing the new document again. However, Solr gives the same error with the same version numbers even after committing. What could possibly be the issue?
A strange observation is that this problem does not occur when Solr is not running in cloud mode.
This seems to be a very specific issue with Solr when nested documents are used.
When a document is indexed with a _version_ specified, Solr checks the version of the latest existing document by doing a real-time get. The real-time get reads from the update logs, which means that data not yet open for search is also accessible. For this, Solr does something like the following:
http://1.2.3.4:8983/solr/mcollection/get?id=1111
Now suppose you have two nested documents where, in one document (doc1), the parent has id=1111 and, in the other document (doc2), a child has id=1111. Solr may then check the version of doc2 when you intended to index doc1. This is probably because Solr still indexes all documents in a flat structure and does not consider the parent-child relationship when doing a real-time get.
The solution is to make sure that parent and child documents never share an id.
The bug has been reported: https://issues.apache.org/jira/browse/SOLR-13785
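To check whether a batch is affected before sending it, one can look for ids that appear both on a parent and on a child. The following is a minimal sketch under the assumption that documents are plain dictionaries with an "id" field and children listed under the "_childDocuments_" key; the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Set, Tuple

# Assumption: children are nested under this key in each document dict.
CHILD_KEY = "_childDocuments_"


def collect_ids(doc: dict, role: str = "parent") -> Iterator[Tuple[str, str]]:
    """Yield (id, role) for a document and, recursively, its children."""
    yield doc["id"], role
    for child in doc.get(CHILD_KEY, []):
        yield from collect_ids(child, "child")


def find_shared_ids(docs: List[dict]) -> List[str]:
    """Return ids that are used by both a parent and a child document."""
    roles: Dict[str, Set[str]] = defaultdict(set)
    for doc in docs:
        for doc_id, role in collect_ids(doc):
            roles[doc_id].add(role)
    return sorted(doc_id for doc_id, seen in roles.items() if len(seen) > 1)
```

Any id this returns is a candidate for the version conflict described above; giving children a distinct id scheme (for example a prefix) avoids the overlap.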

Solr reindex is stopping prematurely when running Collective Solr for Plone

My team is working on a search application for our websites. We are using Collective Solr in Plone to index our intranet and documentation sites. We recently set up shared blob storage on our test instance of the intranet site because Solr was not indexing our PDF files. This appears to be working; however, each time I run the reindexing script (##solr-maintenance/reindex) it stops after about an hour and a half. I know that it is not indexing our entire site, as numerous pages, files, etc. are missing when I run a query in the Solr dashboard.
The warning below is the last thing I see in the Solr log before the script stops. I am very new to Solr so I'm not sure what it indicates. When I run the same script on our documentation site, it completes without error.
2017-04-14 18:05:37.259 WARN (qtp1989972246-970) [ ] o.a.s.h.a.LukeRequestHandler Error getting file length for [segments_284]
java.nio.file.NoSuchFileException: /var/solr/data/uvahealthPlone/data/index/segments_284
I'm hoping someone out there might have more experience with Collective Solr for Plone and could recommend some good resources for debugging this issue. I've done a lot of searching lately but haven't found much useful info.
This was a bug fixed some time ago with https://github.com/collective/collective.solr/pull/122

Sitecore SOLR Errors

I am using Solr with Sitecore in a production environment. I am getting a lot of errors in the Solr log, but the sites are working fine. I have 32 Solr cores, and I am using Solr version 4.10.3.0 with Sitecore 8.1 update 2. Below is a sample of these errors; can anyone explain them to me?
Most of the errors are self-descriptive, like this one:
undefined field: "Reckless"
tells you that the field in question is not defined in the Solr schema. Try to analyze the queries your system is accepting and which system is sending them in.
The less obvious one:
Overlapping onDeckSearchers=2
is a warning about warming searchers, in this case two of them concurrently. This means that there were commits to the Solr index in quick succession, each of which triggered a warming searcher. This is wasteful because, even though the first searcher has warmed up and is ready to serve queries, it will be thrown away as soon as the new searcher has warmed up and is ready to serve.
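The overlap can be illustrated with a toy model: each commit starts a searcher that warms for a fixed time, and the warning corresponds to more than one warm-up being in progress at once. This is only a sketch of the reasoning above, not how Solr counts internally; the function name and the fixed warm-up time are assumptions.

```python
from typing import List


def max_overlapping_warmups(commit_times: List[float], warm_seconds: float) -> int:
    """Peak number of searchers warming at once.

    Assumes each commit at time t warms a searcher during [t, t + warm_seconds).
    """
    events = []
    for t in commit_times:
        events.append((t, 1))                  # warm-up starts
        events.append((t + warm_seconds, -1))  # warm-up finishes
    # At equal timestamps, process finishes (-1) before starts (+1).
    events.sort(key=lambda event: (event[0], event[1]))
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak
```

With a 30-second warm-up, commits at 0, 10 and 70 seconds give a peak of 2 (the first two overlap), while commits at 0, 60 and 120 seconds give a peak of 1. Spacing commits further apart than the warm-up time is what makes the warning go away in this model.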

Solr nodes' replication is getting stuck

We have standalone Solr servers set up as master and slave, and a full indexer job that runs nightly. Generally, when the job executes successfully, everything is all right. But in the last few days we noticed that the indexer node has a different document count from the searching node, so the expected products are not available in our production system. We had to restart the nodes and start replication manually, after which the problem went away. We need to prevent this problem from occurring again. What do you suggest we check, or where should I look? I think the essential error for this issue is: "SEVERE: No files to download for index generation"
Regards

Disappearing cores in Solr

I am new to Solr.
I have created two cores from the admin page, let's call them "books" and "libraries", and imported some data there. Everything works without a hitch until I restart the server. When I do so, one of these cores disappears, and the logging screen in the admin page contains:
SEVERE CoreContainer null:java.lang.NoClassDefFoundError: net/arnx/jsonic/JSONException
SEVERE SolrCore REFCOUNT ERROR: unreferenced org.apache.solr.core.SolrCore#454055ac (papers) has a reference count of 1
I was testing my query in the admin interface; when I refreshed it, the "libraries" core was gone, even though I could normally query it just a minute earlier. The contents of solr.xml are intact. Even if I restart Tomcat, it remains gone.
Additionally, I was trying to build a query similar to this: "Find books matching 'war peace' in libraries in Atlanta or New York". So given cores "books" and "libraries", I would issue "books" the following query (which might be wrong, if it is please correct me):
(title:(war peace) blurb:(war peace))
AND _query_:"{!join
fromIndex=libraries from=libraryid to=libraryid
v='city:(new york) city:(atlanta)'}"
When I do so, the query fails and the "libraries" core disappears, with the above symptoms. If I re-add it, I can continue working (as long as I don't restart the server or issue another join query).
I am using Solr 4.0; if anyone has a clue what is happening, I would be very grateful. I could not find out anything about the meaning of the error message, so if anyone could suggest where to look for that, or how to go about debugging this, it would be really great. I can't even find where the log file itself is located...
I would avoid the Debian package, which may be misconfigured and quirky. It also contains (a very early build of?) Solr 4.0, which may itself have lingering issues as the first release in a new major version. The package maintainer may not have incorporated the latest and safest Solr release into the package.
A better way is to download Solr 4.1 yourself and set it up yourself with Tomcat or another servlet container.
If you are looking to install and configure Solr 4.0, you can follow the installation procedure from here.
Update the solr config for the cores to be persistent.
In your solr.xml, update <solr> or <solr persistent="false"> to <solr persistent="true">
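The edit is easy to make by hand, but for several installations it could be scripted. Here is a minimal sketch using Python's standard xml.etree module, assuming the legacy solr.xml layout with <solr> as the root element; make_persistent is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def make_persistent(solr_xml: str) -> str:
    """Return the solr.xml content with persistent="true" on the <solr> root.

    Note: ElementTree drops comments and the XML declaration when
    re-serializing, so keep a backup of the original file.
    """
    root = ET.fromstring(solr_xml)
    if root.tag != "solr":
        raise ValueError("expected <solr> as the root element")
    root.set("persistent", "true")
    return ET.tostring(root, encoding="unicode")
```

For example, passing '<solr persistent="false"><cores adminPath="/admin/cores" /></solr>' returns the same structure with the attribute set to "true". With that in place, cores created from the admin page should be written back to solr.xml and survive a restart.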
