NoClassDefFoundError MimeTypeException with PDF extraction - solr

I am getting an exception when trying to use update/extract with PDF files.
My setup is:
Ubuntu Server 11.10
Tomcat 6
Solr 3.5.0.2011.11.22.15.54.38
I can browse to solr/admin OK
I have put all the contrib/extraction libraries and apache-solr-cell-3.5.0.jar into the Tomcat folder webapps/solr/WEB-INF/lib
I am calling extract using:
curl "http://localhost:8080/solr/update/extract?uprefix=attr_&fmap.content=attr_content&commit=true" -F "file=@/path/to/my.pdf"
The error is:
java.lang.NoClassDefFoundError: org/apache/tika/mime/MimeTypeException
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:264)
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:383)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:425)
at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:461)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:248)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:239)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1372)
Would appreciate any pointers - the only time this error seems to come up elsewhere is with Nutch and cached results.
I have tried sending the mimetype in the querystring and also a *.doc file but got the same error.

According to the error message, what you are getting is not a MimeTypeException. The problem is a NoClassDefFoundError, raised because Solr cannot load the class MimeTypeException.
Normally this class is present in tika-core.jar.
Make sure you actually have that file, and also check that you have a lib statement in your solrconfig.xml pointing to the right directory.
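A quick way to check for the jar is sketched below. The helper name find_tika_core is made up for illustration, and the directory you pass in is whatever lib folder your Solr webapp actually uses.

```shell
# Hypothetical helper: list Tika core jars sitting directly in a lib directory.
# Prints nothing if no tika-core*.jar is present.
find_tika_core() {
  lib_dir="$1"
  find "$lib_dir" -maxdepth 1 -type f -name 'tika-core*.jar'
}
```

For example, find_tika_core webapps/solr/WEB-INF/lib should print one jar path. Running jar tf on that jar and piping it through grep MimeTypeException then shows whether the class is really inside it.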

This was due to the basic error of copying the necessary Tika libraries (to tomcat6/webapps/solr/WEB-INF/lib) but leaving the jar files owned by root instead of chown-ing them to tomcat6. After setting the right ownership and restarting Tomcat, it started working.
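To spot this kind of ownership problem, something like the following sketch may help. The function name jars_not_owned_by is hypothetical, and the user name and directory are assumptions that depend on how Tomcat is installed on your machine.

```shell
# Hypothetical helper: list jars under a directory that are NOT owned by
# the given user (a user name or a numeric uid both work with find -user).
jars_not_owned_by() {
  owner="$1"
  lib_dir="$2"
  find "$lib_dir" -type f -name '*.jar' ! -user "$owner"
}
```

Calling jars_not_owned_by tomcat6 on the webapp's WEB-INF/lib directory should print nothing once ownership is correct. Any file it does print can be fixed with chown (as root) followed by a Tomcat restart.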

I found the solution to this problem. I was using SolrJ to update my PDF index.
After deploying Solr to Tomcat, I had not copied the extraction libraries into the Tomcat webapp,
and I got all the lazy loading problems, etc.
I even tried getting Apache Tika separately...
until I did this:
shutdown tomcat
copy the libraries from
\apache-solr-3.5.0\contrib\extraction
to
\apache-tomcat-7.0.26\webapps\solr\WEB-INF\lib
startup tomcat
cheers
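On a Unix-like system the copy step could look like the sketch below (the answer above uses Windows paths). The function name copy_extraction_jars is made up, and both directories are placeholders for your own install locations. Run it with Tomcat stopped.

```shell
# Hypothetical helper: copy every jar found under the Solr extraction
# contrib directory (including subfolders) into the webapp's lib folder.
copy_extraction_jars() {
  src_dir="$1"    # e.g. .../apache-solr-3.5.0/contrib/extraction
  dest_dir="$2"   # e.g. .../webapps/solr/WEB-INF/lib
  find "$src_dir" -type f -name '*.jar' -exec cp {} "$dest_dir"/ \;
}
```

Only files ending in .jar are copied, so README files and the like stay behind.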

Related

SolrCore Initialization Failures on JBoss 7

I have deployed a standalone instance of Solr 5.2.1 on JBoss 7 using these very simple instructions from the Solr wiki:
http://wiki.apache.org/solr/SolrInstall and
http://wiki.apache.org/solr/SolrJBoss
I have set solr.solr.home, which points to the folder copied directly from example/example-DIH. This folder has 5 subfolders (db, mail, rss, solr and tika). I have made no modifications to any of them; each has a conf folder and some have lib folders, which I assume is what Solr is looking for.
I have not set solr.data.dir in my environment, as I don't know where it should point. I had planned to point it to coreName/data once I have everything running, but does that mean I can only have 1 core?
Solr does start and I can see the console, however I get the following Initialization Failures:
tika: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error loading class 'solr.DataImportHandler'
db: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error loading class 'solr.DataImportHandler'
mail: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error loading class 'solr.DataImportHandler'
rss: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error loading class 'solr.DataImportHandler'
solr: org.apache.solr.common.SolrException:org.apache.solr.common.SolrException: Error loading class 'solr.DataImportHandler'
Clearly Solr is looking for these libraries. I just don't know where they are or how to tell Solr to look in the 5 folders described above. I had hoped the war file would have all the libs it needed.
Unfortunately, I am not sure what needs to be configured or added next, and the wiki seems to lead one to a point and then stop short of explaining additional configuration steps.
Can somebody point me in the correct direction to go from here?
Also, is it possible to secure the admin console (JBoss 7, Solr 5.2.1)?
Thanks
Marc
In most cases, including the following in solrconfig.xml should resolve the issue
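The snippet this answer pointed to is not reproduced here. As an assumption, it was most likely a set of lib directives, since that is how Solr's example configurations pull in the DataImportHandler jars. The sketch below (with a made-up function name, has_dih_lib) only checks whether a given solrconfig.xml contains such a directive.

```shell
# Assumption: the DIH jars are loaded through <lib> directives such as
#   <lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />
# The relative path depends on where the core lives, so adjust it.

# Hypothetical helper: succeed if the file has a <lib ...> element that
# mentions "dataimporthandler" on the same line.
has_dih_lib() {
  grep -q '<lib[^>]*dataimporthandler' "$1"
}
```

Something like has_dih_lib db/conf/solrconfig.xml && echo ok, run for each of the five cores, shows which configs still lack the directive. If the directive is present but the class still fails to load, the dir attribute probably resolves to a folder that does not exist under the JBoss deployment.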

Fail to load ExtractingRequestHandler when running the Solr Quickstart Tutorial

I installed Solr 5.0.0 on OS X 10.10.2 using Homebrew. I am trying to follow the quick start instructions and am getting errors when I try to index a directory of files.
I am able to successfully start the sample Solr server by running
bin/solr start -e cloud -noprompt
as directed by the tutorial. I then try to index a directory of files by running
./bin/post -c gettingstarted docs/
(Note that this has to be done from the libexec subdirectory of the Solr install root.)
I get a server error 500 for every file it tries to add. The relevant stack:
Caused by: org.apache.solr.common.SolrException: Error loading class 'solr.extraction.ExtractingRequestHandler'
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:492)
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:423)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:559)
at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:632)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.createRequestHandler(RequestHandlers.java:326)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:298)
... 30 more
The issue appears to be that ExtractingRequestHandler is not on the classpath.
ExtractingRequestHandler is in the solr-cell-5.0.0.jar.
jar tf dist/solr-cell-5.0.0.jar | grep ExtractingRequestHandler
org/apache/solr/handler/extraction/ExtractingRequestHandler.class
It's not clear to me if it needs to be on the classpath of the command doing the posting or the Solr instance. The answer to this question makes it sound like the latter. However, I tried setting
export CLASSPATH=dist/solr-cell-5.0.0.jar
before trying to index the files and saw the same error.
I don't see anything in the tutorial about how to configure this. What is the error and how do I get past it?
It looks like the problem is incorrect paths in the Solr example configuration. A workaround is to add softlinks from SOLR_ROOT/contrib and SOLR_ROOT/dist to the corresponding directories beneath SOLR_ROOT/libexec.
Details here and here.
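A minimal sketch of that workaround, assuming the Homebrew layout where contrib and dist live under libexec. The function name link_solr_dirs is invented for illustration, and SOLR_ROOT is whatever directory your install uses.

```shell
# Hypothetical helper: create SOLR_ROOT/contrib and SOLR_ROOT/dist as
# symlinks to the matching directories under SOLR_ROOT/libexec.
link_solr_dirs() {
  solr_root="$1"
  for name in contrib dist; do
    # Leave anything that already exists at the link location untouched.
    if [ ! -e "$solr_root/$name" ] && [ -d "$solr_root/libexec/$name" ]; then
      ln -s "$solr_root/libexec/$name" "$solr_root/$name"
    fi
  done
}
```

After running link_solr_dirs on the install root, restart Solr so the lib paths in the example configuration are resolved again.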

Unable to create a new core Solr 3.5. Error in default implementation of CREATE

I am running Solr 3.5 and already have two cores set up by my senior. I need to add a new core. The Solr home is /runtime/local/solr/. This directory contains solr.xml. So I created a new directory here with my core's name and then ran the following:
http://localhost:7658/solr/admin/cores?action=CREATE&name=core0&instanceDir=/runtime/local/solr/core0/
And Apache tomcat keeps returning a 400 error with the message
Error executing default implementation of CREATE
and the description says
The request sent by the client was syntactically incorrect (Error executing default implementation of CREATE).
What is going wrong here? The syntax seems to be correct from what I've searched on the web.
Found a fix. Not sure if it's the right approach. I created a directory for my core in the Solr home folder and within it added a folder called conf. To this folder, I copied all the files from the conf folder of the other core, and after that ran the CREATE command. Booyah! It worked.
Each core requires its own configuration, so you do need to have the conf folder set up, but it does not need to be the same as the configuration for your first core.
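The preparation step described above can be sketched as follows. The function name prepare_core_dir and the core names are placeholders. The copied files are only a starting point and can be edited afterwards.

```shell
# Hypothetical helper: create <solr_home>/<new_core>/conf and seed it with
# a copy of an existing core's conf folder.
prepare_core_dir() {
  solr_home="$1"       # e.g. /runtime/local/solr
  template_core="$2"   # an existing core whose conf is used as a template
  new_core="$3"        # e.g. core0
  mkdir -p "$solr_home/$new_core/conf" &&
  cp -R "$solr_home/$template_core/conf/." "$solr_home/$new_core/conf/"
}
```

Once the conf folder exists, the CREATE request from the question can be sent again, for instance with curl and the same URL in quotes.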

Solr 4 Data Import Handler doesn't work

I am deploying Solr 4.3.0 in Tomcat 7.
Everything works fine but DataImportHandler. I can go to the
http://localhost:8080/solr/#/collection1/dataimport//dataimport
screen and see the dataimport options load at the UI.
Still, I can't see any of my entities load in the "entity" combo box. Inside the configuration box on the right side, I see the error below.
Apache Tomcat/7.0.41 - Error report
HTTP Status 500 - Filter execution threw an exception
type: Exception report
message: Filter execution threw an exception
description: The server encountered an internal error that prevented it from fulfilling this request.
exception:
javax.servlet.ServletException: Filter execution threw an exception
root cause:
java.lang.NoClassDefFoundError: org/apache/log4j/spi/LoggingEvent
org.apache.solr.logging.log4j.EventAppender.append(EventAppender.java:35)
org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
org.apache.log4j.Category.callAppenders(Category.java:206)
org.apache.log4j.Category.forcedLog(Category.java:391)
org.apache.log4j.Category.log(Category.java:856)
org.slf4j.impl.Log4jLoggerAdapter.error(Log4jLoggerAdapter.java:498)
org.apache.solr.common.SolrException.log(SolrException.java:119)
org.apache.solr.servlet.ResponseUtils.getErrorInfo(ResponseUtils.java:58)
org.apache.solr.servlet.SolrDispatchFilter.sendError(SolrDispatchFilter.java:691)
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:380)
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:155)
note: The full stack trace of the root cause is available in the Apache Tomcat/7.0.41 logs.
The problem is that I do have "log4j-1.2.16.jar" loaded on the classpath (it's in the Tomcat lib dir).
Has anyone run into this problem?
Try following the steps outlined in Using the example logging setup in containers other than Jetty. I encountered this same error when running Solr 4.3 until I followed these steps to configure logging.
After changing the directory, did you change the directory path in the solrconfig.xml file?
I just want to make sure: after making changes to the configuration file, did you restart Tomcat and the Solr server?
You need to copy the slf4j-log4j12-1.6.6.jar from the ext of Solr into the lib folder.
You also need to put the logging.properties file there.
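To see at a glance whether the two jars named in this thread are in place, a check along these lines may be useful. The function name missing_logging_jars is hypothetical, the patterns are based only on the jar names mentioned above, and the full logging setup may need more jars than these two.

```shell
# Hypothetical helper: print the name pattern of each logging jar that is
# NOT present directly in the given lib directory.
missing_logging_jars() {
  lib_dir="$1"
  for pattern in 'log4j-*.jar' 'slf4j-log4j12-*.jar'; do
    if [ -z "$(find "$lib_dir" -maxdepth 1 -type f -name "$pattern")" ]; then
      echo "$pattern"
    fi
  done
}
```

An empty result from missing_logging_jars on the Tomcat lib directory means both jars were found there.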

How to set solr/home in linux OS?

I know how to configure solr.home using Tomcat 6, but I don't know how to set solr.home using Glassfish (V2.1). I have tried to set solr.home in .profile as follows:
export solr.home=/home/huenzhao/search/solr
export solr/home=/home/huenzhao/search/solr
export solr.solr.home=/home/huenzhao/search/solr
export JAVA_OPTS=$JAVA_OPTS -Dsolr.solr.home=/home/huenzhao/search/solr
and none of them work. The error is:
HTTP Status 500 - Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configuration errors, change: <abortOnConfigurationError>false</abortOnConfigurationError> in null
-------------------------------------------------------------
java.lang.RuntimeException: Can't find resource 'solrconfig.xml' in classpath or 'solr/conf/', cwd=/home/huenzhao/search/glassfish/domains/domain1/config
at org.apache.solr.core.SolrResourceLoader.openResource(SolrResourceLoader.java:194)
at org.apache.solr.core.SolrResourceLoader.openConfig(SolrResourceLoader.java:162)
at org.apache.solr.core.Config.<init>(Config.java:100)
at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:113)
at org.apache.solr.core.SolrConfig.<init>(SolrConfig.java:70)
at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:117)
at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:273)
at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:385)
at org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:119)
at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:4498)
at org.apache.catalina.core.StandardContext.start(StandardContext.java:5317)
at com.sun.enterprise.web.WebModule.start(WebModule.java:353)
at org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:989)
at org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:973)
at org.apache.catalina.core.StandardHost.addChild(StandardHost.java:704)
at com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1627)
at com.sun.enterprise.web.WebContainer.loadWebModule(WebContainer.java:1232)
at com.sun.enterprise.server.WebModuleDeployEventListener.moduleDeployed(WebModuleDeployEventListener.java:182)
at com.sun.enterprise.server.WebModuleDeployEventListener.moduleDeployed(WebModuleDeployEventListener.java:278)
at com.sun.enterprise.admin.event.AdminEventMulticaster.invokeModuleDeployEventListener(AdminEventMulticaster.java:1005)
at ……
Does anybody know?
If you are running Solr inside Tomcat as a container, you can specify the Solr home inside the XML descriptor for this webapp (my terminology for this is probably a little off).
I've got xml fragments for each solr instance I want to run and they specify their own local solr home directory inside the xml fragment. The fragments live at /conf/Catalina/localhost and each one manages a solr instance. This way I can have multiple solr instances on the same machine each with their own solr home variable.
The info is here:
http://wiki.apache.org/solr/SolrTomcat
In particular:
Create a Tomcat Context fragment to point docBase to the $SOLR_HOME/apache-solr-1.3.0.war file and solr/home to $SOLR_HOME:
Symlink or place the file in $CATALINA_HOME/conf/Catalina/localhost/solr-example.xml, where Tomcat will automatically pick it up. Tomcat deletes the file on undeploy (which happens automatically if the configuration is invalid).
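The fragment itself is not included in the quote above. As a sketch, assuming the usual shape of a Tomcat Context file with a solr/home Environment entry, it could be generated like this. The function name write_context_fragment is made up, and the war path, Solr home and output file are all placeholders.

```shell
# Hypothetical helper: write a Tomcat Context fragment that points docBase
# at a Solr war file and sets the solr/home JNDI environment entry.
# Note: values are inserted as-is, with no XML escaping.
write_context_fragment() {
  war_path="$1"
  solr_home="$2"
  out_file="$3"
  cat > "$out_file" <<EOF
<Context docBase="$war_path" debug="0" crossContext="true">
  <Environment name="solr/home" type="java.lang.String" value="$solr_home" override="true"/>
</Context>
EOF
}
```

Writing one such file per Solr instance into conf/Catalina/localhost, each with its own solr_home value, matches the multi-instance setup described in this answer.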
Try to set the following:
export JAVA_OPTS="$JAVA_OPTS -Dsolr.solr.home=/home/huenzhao/search/solr/"
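The quotes matter here, and so does the form of the name. The sketch below spells out why the attempts in the question fail at the shell level. SOLR_HOME_DIR is just an illustrative variable name, and whether the container actually reads JAVA_OPTS depends on how it is launched.

```shell
# Shell variable names cannot contain dots or slashes, so lines such as
# `export solr.home=...` are rejected by the shell. The property has to
# reach the JVM as a -D option instead.
SOLR_HOME_DIR="/home/huenzhao/search/solr"

# Without quotes, the shell would treat "-Dsolr.solr.home=..." as a second
# argument to export rather than as part of the JAVA_OPTS value.
JAVA_OPTS="${JAVA_OPTS:-} -Dsolr.solr.home=$SOLR_HOME_DIR"
export JAVA_OPTS
```

Running echo "$JAVA_OPTS" afterwards should show the -Dsolr.solr.home option appended to whatever options were already set.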
Try setting a Java system property from your Java code, or edit your VM configuration:
System.setProperty("solr.solr.home", "/home/user/apache-solr-1.4/example/solr");
In my case I simply copied the 'solr' folder to glassfish/domains/domain1/config and it worked.
