Building Solr indexes through Haystack throws unknown field error - solr

I'm trying to integrate Haystack with Solr. When I try to build the index, I get an error
"Unknown field django_id" from SOLR. What's causing this to happen?

You also get this error if you haven't given Solr the schema.xml file which Haystack generates for you, as explained here in the docs.
django-haystack.readthedocs.io/en/latest/tutorial.html#reindex

The schema.xml was malformed as I had copied additional text from the console.
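Both causes above (a schema that was never installed, or one polluted with console text) can be caught before restarting Solr by parsing the generated file. The following is only a rough sketch: the helper name check_haystack_schema is made up for illustration, and it assumes the classic schema.xml layout in which fields are declared as <field name="..."> elements.

```python
import xml.etree.ElementTree as ET

# Fields that Haystack's Solr backend writes for every indexed object.
REQUIRED_FIELDS = {"id", "django_ct", "django_id"}


def check_haystack_schema(xml_text):
    """Return the required field names that the schema does not declare.

    Raises xml.etree.ElementTree.ParseError when the text is not
    well-formed XML, for example because console output was pasted in
    together with the schema. (Hypothetical helper, not part of Haystack.)
    """
    root = ET.fromstring(xml_text)
    declared = {field.get("name") for field in root.iter("field")}
    return REQUIRED_FIELDS - declared
```

Passing it the contents of the schema.xml that Solr actually loads should give an empty set; a ParseError points to stray text in the file, and a non-empty set points to an old or foreign schema.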

If you added new fields to your database and copied the generated XML files from Haystack, you might also be getting this error because you haven't restarted jetty/Tomcat/whatever server you are using. This solved it for me on Ubuntu and Jetty:
sudo /etc/init.d/jetty stop
sudo /etc/init.d/jetty start
(This is equivalent to the single command:)
sudo service jetty restart
Or, if you are using Tomcat:
sudo service tomcat6 restart
Edit: I tested this with Tomcat as well, and the restart solved the same problem there, just as it did with Jetty.

Related

Why can I not create a core when running a 2nd Apache Solr instance?

My first instance:
sudo bin/solr start -p 8983 -s ../coaps
My second instance:
sudo bin/solr start -p 8984 -s ../newcoaps
Using the python http utility I verified connections:
http :8983/solr/
http :8984/solr/
I can ping my first one with :8983/solr/samos/admin/ping/ but I can NOT ping the other one because the core located in ../newcoaps is not added upon startup.
The ../newcoaps directory looks like this before I started up Solr:
ls -R ../newcoaps/
../newcoaps/:
samos solr.xml
../newcoaps/samos:
conf data
../newcoaps/samos/conf:
schema.xml solrconfig.xml
../newcoaps/samos/data:
I copied the files in here directly from my other instance, which is running smoothly. Everything is default except for several fields I defined.
In the web browser, I see that the second instance has no cores, so I tried to add it manually but I get this response:
Error CREATEing SolrCore 'new_core': Unable to create core [new_core] Caused by: Can't find resource 'synonyms.txt' in classpath or '/opt/solr/newcoaps/samos'
What is going on here, and why is that file important enough to prevent me from adding this core? What steps can I take to figure out a solution to this problem?
Your schema (schema.xml) is referencing the synonyms.txt file (in a SynonymFilter definition). Remove the filter from the configuration if you're not expanding synonyms, or create an empty file named synonyms.txt to allow the core to start up.
As a possible explanation: If you started the first node without a schema.xml present the first time, it might have switched to using the managed schema functionality instead of reading the schema.xml, but when starting the second node with the schema present, it'll try to read and parse it.
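To see in advance which files a schema expects, the filter definitions can be scanned for attributes that name resource files. This is a sketch under assumptions: the function name missing_resources is invented here, the attribute list is illustrative rather than complete, and it only looks in the directory that holds schema.xml, whereas Solr also searches its classpath.

```python
import os
import xml.etree.ElementTree as ET

# Attribute names that commonly point at resource files in analyzer filters.
# Assumption for illustration: this list is not exhaustive.
RESOURCE_ATTRS = ("synonyms", "words", "protected")


def missing_resources(conf_dir, schema_name="schema.xml"):
    """List resource files named in the schema that are absent from conf_dir."""
    tree = ET.parse(os.path.join(conf_dir, schema_name))
    missing = set()
    for element in tree.iter("filter"):
        for attr in RESOURCE_ATTRS:
            value = element.get(attr)
            if not value:
                continue
            # Some filters accept a comma-separated list of file names.
            for name in (part.strip() for part in value.split(",")):
                if name and not os.path.exists(os.path.join(conf_dir, name)):
                    missing.add(name)
    return sorted(missing)
```

For the layout in the question, calling it on ../newcoaps/samos/conf would be expected to report synonyms.txt, since that directory holds only schema.xml and solrconfig.xml.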

Fail to load ExtractingRequestHandler when running the Solr Quickstart Tutorial

I installed Solr 5.0.0 on OS X 10.10.2 using Homebrew. I am trying to follow the quick start instructions and am getting errors when I try to index a directory of files.
I am able to successfully start the sample Solr server by running
bin/solr start -e cloud -noprompt
as directed by the tutorial. I then try to index a directory of files by running
./bin/post -c gettingstarted docs/
(Note that this has to be done from the libexec subdirectory of the Solr install root.)
I get a server error 500 for every file it tries to add. The relevant stack:
Caused by: org.apache.solr.common.SolrException: Error loading class 'solr.extraction.ExtractingRequestHandler'
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:492)
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:423)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:559)
at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:632)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.createRequestHandler(RequestHandlers.java:326)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:298)
... 30 more
The issue appears to be that ExtractingRequestHandler is not on the classpath.
ExtractingRequestHandler is in the solr-cell-5.0.0.jar.
jar tf dist/solr-cell-5.0.0.jar | grep ExtractingRequestHandler
org/apache/solr/handler/extraction/ExtractingRequestHandler.class
It's not clear to me if it needs to be on the classpath of the command doing the posting or the Solr instance. The answer to this question makes it sound like the latter. However, I tried setting
export CLASSPATH=dist/solr-cell-5.0.0.jar
before trying to index the files and saw the same error.
I don't see anything in the tutorial about how to configure this. What is the error and how do I get past it?
It looks like the problem is incorrect paths in the Solr example configuration. A workaround is to add symlinks named contrib and dist in SOLR_ROOT that point to the corresponding directories beneath SOLR_ROOT/libexec.
Details here and here.
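To confirm which directory really holds the handler class, the jar tf check from the question can be repeated over a whole directory tree. A minimal sketch, assuming nothing beyond the standard library; the function name jars_containing is hypothetical.

```python
import zipfile
from pathlib import Path


def jars_containing(directory, class_name):
    """Return names of .jar files under directory that hold class_name.

    class_name is a simple class name such as "ExtractingRequestHandler".
    """
    suffix = "/" + class_name + ".class"
    hits = []
    for jar in sorted(Path(directory).rglob("*.jar")):
        # A jar is a zip archive, so its entry list can be read directly.
        with zipfile.ZipFile(jar) as archive:
            if any(name.endswith(suffix) for name in archive.namelist()):
                hits.append(jar.name)
    return hits
```

Running it against the directories that the lib paths in solrconfig.xml resolve to shows whether Solr can see solr-cell-5.0.0.jar at all; an empty result for those directories is consistent with the path problem described above.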

can't index rich documents on both solr 3.6 and solr 4.0 using update/extract getting "#500 lazy loading error"

I've just started learning Solr, and for the last three days I've been stuck: I cannot
index rich documents on Solr 3.6 or 4.0. I am using Windows 7 64-bit.
Here is what I tried:
First I installed Solr 3.6 with tomcat-jetty using BitNami Apache.
1. Tried the -Durl command. What I got:
error #500 lazy loading error
2. Downloaded curl for my Windows machine and tried curl. I got: error #500 lazy loading error
3. Copied a program from the Solr tutorial that uploads a file using SolrJ,
ran it in the NetBeans IDE, and tried to index a PDF file using
update/extract.
Then I got:
org.apache.solr.common.SolrException: Server at
"myServer:port/solr" returned non ok status:500, message:Internal
Server Error
4. Changed solrconfig.xml to remove startup=lazy from the update/extract
request handler and got the same result.
I re-installed Solr 3.6 but still had no success. 4.0 gives the same error.
The same problem also occurs with some other request handlers, such as /browse.
Should I switch to Linux?
Looks like the packager (Bitnami) did not include that library, even though they left Solr configured to use that library. You may ask them to resolve it. Or you can deploy it yourself.
Here's how to deploy Solr on Tomcat. It's equally easy to install on Windows, and it starts as a Windows service. Once installed, to enable rich document support, copy the contents of contrib/extraction/lib/ to a directory and point sharedLib in solr.xml to that directory. If you have used that guide, you will understand those new terms :-)

NoClassDefFoundError MimeTypeException with PDF extraction

I am getting an exception trying to use update/extract with PDF files
My Set up is:-
Ubuntu Server 11.10
Tomcat 6
Solr 3.5.0.2011.11.22.15.54.38
I can browse to solr/admin OK
I have put all the contrib/extraction libraries and apache-solr-cell-3.5.0.jar into the Tomcat folder webapps/solr/WEB-INF/lib.
I am calling extract using:
curl "http://localhost:8080/solr/update/extract?uprefix=attr_&fmap.content=attr_content&commit=true" -F "file=/path/to/my.pdf"
The error is:
java.lang.NoClassDefFoundError: org/apache/tika/mime/MimeTypeException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:383)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:425)
at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:461)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:248)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:239)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372)
Would appreciate any pointers - the only time this error seems to come up elsewhere is with Nutch and cached results.
I have tried sending the mimetype in the querystring and also a *.doc file but got the same error.
According to the error message, you are not getting a MimeTypeException. The problem is a NoClassDefFoundError: Solr cannot load the class MimeTypeException.
Normally this class is present in tika-core.jar.
Make sure you actually have that file and also check if you have a lib statement in your solrconfig.xml pointing to the right directory.
This was due to the basic error of copying the necessary Tika libraries (to tomcat6/webapps/solr/WEB-INF/lib) but leaving the jar files owned by root instead of chown-ing them to tomcat6. After fixing the ownership and restarting Tomcat, it started working.
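An ownership problem like this is easy to miss, because the files are present and look correct. As a rough sketch (the function name jars_not_owned_by is hypothetical, and it assumes a Unix-like system where file owners are numeric user IDs), the jars in a lib directory can be checked against the user that runs the servlet container:

```python
import os
from pathlib import Path


def jars_not_owned_by(lib_dir, uid):
    """Return names of .jar files in lib_dir whose owner is not uid."""
    return sorted(
        jar.name
        for jar in Path(lib_dir).glob("*.jar")
        if os.stat(jar).st_uid != uid
    )
```

On Ubuntu, the numeric ID for the Tomcat user can be looked up with pwd.getpwnam("tomcat6").pw_uid from the standard library; any file names returned are candidates for chown.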
I found the solution to this problem. I was using SolrJ to index my PDF files.
After deploying Solr to Tomcat, I had not copied the following libraries into Tomcat's webapps directory,
so I got all the lazy loading errors and so on.
I even tried getting Apache Tika separately...
until I did this:
Shut down Tomcat.
Copy the libraries from
\apache-solr-3.5.0\contrib\extraction
to
\apache-tomcat-7.0.26\webapps\solr\WEB-INF\lib
Start Tomcat.
Cheers
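The copy step can be scripted so that no jar is left behind. This is only a sketch: copy_jars is a made-up helper, and it assumes that flattening every .jar found beneath the extraction directory into WEB-INF\lib is what is wanted. Tomcat should be stopped while it runs.

```python
import shutil
from pathlib import Path


def copy_jars(source_dir, target_dir):
    """Copy every .jar found under source_dir into target_dir (flat).

    Returns the names of the copied files, in path order.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    copied = []
    for jar in sorted(Path(source_dir).rglob("*.jar")):
        shutil.copy2(jar, target / jar.name)
        copied.append(jar.name)
    return copied
```

With the paths from this answer, source_dir would be the contrib\extraction directory of the Solr download and target_dir the WEB-INF\lib directory of the deployed Solr webapp.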

org.apache.solr.common.SolrException: missing content stream

I have installed Apache Solr with Tomcat and my /solr/admin is working fine. But when I try to issue /solr/update I am getting the following error. What could be the reason?
org.apache.solr.common.SolrException: missing content stream
If you add the commit parameter, i.e. ?commit=true, it will work.
/solr/update looks for input documents to index. Calling plain /solr/update causes this exception because there is no input for it. The easiest way to run it is:
java -Durl=http://localhost:8080/<your apache solr context path, mostly solr>/update -jar post.jar *.xml
This can also happen through SolrJ/spring-data-solr if you try to persist an empty collection of documents.
So solrClient.add(new ArrayList<SolrInputDocument>(), 10000);
would also cause the error.
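The common thread in these answers is that /update needs a request body. A minimal sketch of a client-side guard, using only the Python standard library: the function name build_update_request is hypothetical, and the body uses Solr's XML update message format (<add><doc><field name="...">).

```python
import urllib.request
from xml.sax.saxutils import escape, quoteattr


def build_update_request(update_url, docs):
    """Build a POST request carrying docs as a Solr XML <add> message.

    docs is a list of dicts mapping field names to values. An empty list
    is rejected up front, since sending nothing to /update is what
    produces "missing content stream".
    """
    if not docs:
        raise ValueError("no documents to index")
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for name, value in doc.items():
            # quoteattr adds the surrounding quotes and escapes the name.
            parts.append(
                "<field name=%s>%s</field>" % (quoteattr(name), escape(str(value)))
            )
        parts.append("</doc>")
    parts.append("</add>")
    body = "".join(parts).encode("utf-8")
    return urllib.request.Request(
        update_url,
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
```

The returned request can be sent with urllib.request.urlopen; the URL would be the same /update address used with post.jar above, optionally with ?commit=true appended.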
