Solr indexing fails on server.request(up) - solr

While indexing into Solr, I am getting the following error:
HTTP Status 500 - lazy loading error
org.apache.solr.common.SolrException: lazy loading error
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:260)
at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
[...]
The URL formed is: http://localhost:8080/solr/update/extract?Latitude=51.9125&Longitude=179.5&commit=true&waitFlush=true&waitSearcher=true&wt=javabin&version=2
(I have configured Tomcat using XAMPP on a Windows machine.)
I have been following Stack Overflow and various other blogs and forums and have spent hours debugging, but I could not find the cause.
I have added the following to solr.xml:
<maxFieldLength>10000</maxFieldLength>
<writeLockTimeout>60000</writeLockTimeout>
<commitLockTimeout>60000</commitLockTimeout>
<lockType>simple</lockType>
<unlockOnStartup>true</unlockOnStartup>
<reopenReaders>true</reopenReaders>
<requestParsers enableRemoteStreaming="true"
multipartUploadLimitInKB="2048000" />
<lst name="defaults">
<!--str name="echoParams">explicit</str-->
<!--int name="rows">10</int-->
<!--str name="df">text</str-->
<str name="Latitude">Latitude</str>
<str name="Longitude">Longitude</str>
</lst>
I even tried adding the following to solrconfig.xml and restarting Tomcat:
<requestHandler name="/update/extract" class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
<lst name="defaults">
<str name="ext.map.Last-Modified">last_modified</str>
<bool name="ext.ignore.und.fl">true</bool>
</lst>
</requestHandler>
The Java console still shows an error:
org.apache.solr.common.SolrException: Internal Server Error
I realized the issue might be caused by my Solr home path. I created a new directory, copied all the config files there, and set that directory as my Solr home. I then updated solrconfig.xml, correcting the paths for all the jars.
I also tried adding the pdfbox and fontbox jars to the Solr lib folder and restarting Tomcat.
My Java code is:
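import java.io.File;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.request.AbstractUpdateRequest;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;

String urlString = "http://localhost:8080/solr";
SolrServer server = new CommonsHttpSolrServer(urlString);
// Send the file to the extracting request handler
ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
String fileName = f.toString();
up.addFile(new File(fileName));
up.setParam("Latitude", Latitude);
up.setParam("Longitude", Longitude);
// Commit right away (waitFlush = true, waitSearcher = true)
up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
server.request(up);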
(Port 8080 is the one I have configured.)
Solr indexing is still not working on my end, even after hours of debugging. It would be really great if you could give me a hint or point out what I am doing wrong.
For your reference, I have already tried:
http://wiki.apache.org/solr/FrontPage
http://wiki.apache.org/solr/ContentStreamUpdateRequestExample
http://wiki.apache.org/solr/UpdateRichDocuments
http://wiki.apache.org/solr/ExtractingRequestHandler#Configuration
http://lucene.472066.n3.nabble.com/Problem-using-ExtractingRequestHandler-with-tomcat-td494930.html
http://lucene.472066.n3.nabble.com/Internal-Server-Error-td715713.html
How to index pdf's content with SolrJ?

I finally found a way to solve this.
Just modify SOLR_HOME/conf/solrconfig.xml: change the dir attribute of every <lib> tag from "../../dist/" to "../dist/", then save the file.
If you encounter this problem, the solr directory has most likely been moved from apache-solr-x.x.x\example to some other place, so the relative path "../../dist/" needs to be changed accordingly.
Remember to restart Tomcat and check whether it works.
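To spot stale relative paths like the one described above, a small check can list which dir attributes in solrconfig.xml no longer point at an existing directory. This is only a sketch: the class name LibDirCheck and its method are made up for illustration, and it assumes that relative dir values are resolved against the Solr instance directory (the folder that contains conf/).

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Solr.
final class LibDirCheck {

    // Returns the dir attributes of <lib> elements that do not resolve
    // to an existing directory. Assumption: relative paths are resolved
    // against instanceDir.
    static List<String> missingLibDirs(File solrConfig, File instanceDir) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(solrConfig);
        NodeList libs = doc.getElementsByTagName("lib");
        List<String> missing = new ArrayList<>();
        for (int i = 0; i < libs.getLength(); i++) {
            String dir = ((Element) libs.item(i)).getAttribute("dir");
            if (dir.isEmpty()) {
                continue; // <lib path="..."/> entries are not checked in this sketch
            }
            File asGiven = new File(dir);
            File resolved = asGiven.isAbsolute() ? asGiven : new File(instanceDir, dir);
            if (!resolved.isDirectory()) {
                missing.add(dir);
            }
        }
        return missing;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the Solr instance directory
        File instanceDir = new File(args[0]);
        File config = new File(instanceDir, "conf/solrconfig.xml");
        for (String dir : missingLibDirs(config, instanceDir)) {
            System.out.println("lib dir does not resolve: " + dir);
        }
    }
}
```

Run it with the instance directory as the only argument; any path it prints is a candidate for the "../../dist/" to "../dist/" kind of correction.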

Do you have solr-cell.jar on your classpath?
The ExtractingRequestHandler lives in solr-cell.jar, which is not packaged with the default Solr server.
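One way to verify this is to scan the lib folder for a jar that actually contains the class. The following is a minimal sketch using only the JDK; the class name JarClassFinder is invented for this example, and the folder you pass in is whichever directory your Solr instance loads jars from.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;

// Hypothetical helper, not part of Solr.
final class JarClassFinder {

    // Returns the names of the jars in libDir that contain the given class.
    static List<String> jarsContaining(File libDir, String className) throws IOException {
        // A class a.b.C is stored in a jar as the entry a/b/C.class
        String entry = className.replace('.', '/') + ".class";
        List<String> hits = new ArrayList<>();
        File[] files = libDir.listFiles();
        if (files == null) {
            return hits; // not a directory, or not readable
        }
        for (File f : files) {
            if (!f.getName().endsWith(".jar")) {
                continue;
            }
            try (JarFile jar = new JarFile(f)) {
                if (jar.getJarEntry(entry) != null) {
                    hits.add(f.getName());
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        // args[0] = lib folder, args[1] = fully qualified class name
        System.out.println(jarsContaining(new File(args[0]), args[1]));
    }
}
```

For the error in the question, the class to look for would be org.apache.solr.handler.extraction.ExtractingRequestHandler. An empty result means no jar in that folder provides it.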

Related

How can I use the /export request handler via SolrJ?

I'm using Solr 4.10.
I have enabled the /export request handler for an index by adding this to the solrconfig.xml (as mentioned here: https://cwiki.apache.org/confluence/display/solr/Exporting+Result+Sets):
<requestHandler name="/export" class="solr.SearchHandler">
<lst name="invariants">
<str name="rq">{!xport}</str>
<str name="wt">xsort</str>
<str name="distrib">false</str>
</lst>
<arr name="components">
<str>query</str>
</arr>
</requestHandler>
Now I can use: http://localhost:8983/solr/index/select?.... as well as http://localhost:8983/solr/index/export?.... from a browser or curl.
But, I cannot get it to run properly using SolrJ.
I tried (as suggested here: https://lucene.apache.org/solr/4_10_0/solr-solrj/index.html):
SolrQuery query = new SolrQuery();
...
query.setRequestHandler("/export");
...
httpSolrServer.query(query);
The query now has a parameter &qt=export. It blew up giving me:
org.apache.solr.client.solrj.SolrServerException: Error executing query
Further searching suggested using SolrRequest instead of SolrQuery, so I tried it:
SolrQuery query = new SolrQuery();
...
query.setRequestHandler("/export");
SolrRequest solrRequest = new QueryRequest(query);
httpSolrServer.request(solrRequest);
Now I get:
java.nio.charset.UnsupportedCharsetException: gzip
Any ideas?
---edit---
I found an option in httpSolrServer.request() to add a ResponseParser. I found four ResponseParsers and tried them all; the only one that worked was NoOpResponseParser. Now I have the correct results, but dumped as a plain string in a single entry of a NamedList. I tried to parse it as JSON, but it is not in a proper format: every 30,000 documents, a comma is missing.
I went back to solrconfig.xml and changed wt in the /export handler from xsort to json. The response format changed, but it is still not in a proper format (the data is incomplete). And XML is not supported.
I'm truly baffled.

Solr and schemaless

I'm using Cloudera 5.4 with Solr 4.10.2 and I would like to enable schemaless mode.
I've edited solrconfig.xml with:
<schemaFactory class="ManagedIndexSchemaFactory">
<bool name="mutable">true</bool>
<str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>
I don't know if I need something else for this version. I have read the Solr documentation (https://docs.lucidworks.com/display/solr/Managed+Schema+Definition+in+SolrConfig).
When I try to index a document, I get an exception:
Exception in thread "main" org.apache.solr.client.solrj.impl.CloudSolrServer$RouteException: ERROR: [doc=5.417393032179468E7] unknown field 'campo1'
I can see that there is a schema.xml.bak, as the documentation says there should be.
Do I need to do something else?

Set default search fields in Apache Solr

I am trying to implement Apache Solr search through the SolrNet library. So far I have managed to run an instance of Solr on my machine and make some queries based on specific fields.
My code looks like this:
var solr = ServiceLocator.Current.GetInstance<ISolrOperations<Product>>();
var results = solr.Query(new SolrQueryByField("id", "SP2514N"));
This works fine, but I would like to make queries without specifying a field, so that when I enter a search keyword Solr looks into all available fields and returns a result. I found the following code for this in the SolrNet documentation:
var solr = ServiceLocator.Current.GetInstance<ISolrOperations<Product>>();
var results = solr.Query(new SolrQuery("SP2514N"));
But this never worked. When I dug deeper, I found that I need to set default search fields in the Solr instance, so that Solr searches those fields when no field is specified (this is how I understood it; I am not sure).
So I went to set the default fields in Solr. I took solrconfig.xml and edited it like this:
<requestHandler name="/query" class="solr.SearchHandler" default="true">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="wt">json</str>
<str name="indent">true</str>
<str name="df">text</str>
<str name="df">id</str>
</lst>
</requestHandler>
(I only added the extra <str name="df">id</str> entry.) This did not help either, and I am stuck. Can anyone tell me how to set the default search field in Solr correctly? Or am I doing something else wrong?
I have uploaded my solrconfig.xml file here.

I do not know the SolrNet library, but to set a default field for search you need to define defaultSearchField in schema.xml, i.e. <defaultSearchField>FieldName</defaultSearchField>.
You can find this file at <SOLR_HOME>\apache-solr-3.6.0\example\example-DIH\solr\testsyndrome\conf\schema.xml
I hope that's what you are looking for.

Don't start from SolrNet; use Solr's built-in web admin interface. Iterate there until you understand the request handlers and their parameters. Then go back to SolrNet.
In your case, it seems that you changed the default request handler and tried to use the df parameter twice. I would stick to the original request handler for now, just to avoid that extra issue.
With the df parameter, are you trying to search a single field or multiple fields? For a single field, keep only one value for the parameter. For multiple fields, you need to switch to eDisMax, where you can provide a set of fields.
Again, the admin interface lets you experiment with this; once it works, you can add the parameters to the handler's defaults.
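To experiment with the multi-field case outside SolrNet, the request can be built by hand. The sketch below only assembles a URL that selects eDisMax through defType and lists the fields in qf. The class name EdismaxUrl is invented for illustration, and the /select path, host and port are assumptions about your installation.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper for trying out eDisMax requests by hand.
final class EdismaxUrl {

    // Builds a query URL that searches userQuery across all given fields.
    // Assumption: the core exposes a /select handler under baseUrl.
    static String build(String baseUrl, String userQuery, String... fields) {
        try {
            // qf takes a space-separated list of field names
            String qf = String.join(" ", fields);
            return baseUrl + "/select?defType=edismax"
                    + "&q=" + URLEncoder.encode(userQuery, "UTF-8")
                    + "&qf=" + URLEncoder.encode(qf, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on the JVM
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build("http://localhost:8983/solr", "SP2514N", "text", "id"));
    }
}
```

Paste the printed URL into a browser first. Once the results look right, the same defType and qf values can go into the handler's defaults instead of the duplicated df entries.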

Solr, clustering (carrot) and NoClassDefFoundError

I'm running Solr 3.4 and would like to use the ClusteringComponent.
Following this tutorial: http://wiki.apache.org/solr/ClusteringComponent in combination with the default entries in solrconfig.xml, I have the following configuration in solrconfig.xml:
<searchComponent name="clustering"
enable="${solr.clustering.enabled:true}"
class="org.apache.solr.handler.clustering.ClusteringComponent" >
<!-- Declare an engine -->
<lst name="engine">
<str name="name">default</str>
<str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
<str name="LingoClusteringAlgorithm.desiredClusterCountBase">20</str>
</lst>
<lst name="engine">
<str name="name">stc</str>
<str name="carrot.algorithm">org.carrot2.clustering.stc.STCClusteringAlgorithm</str>
</lst>
</searchComponent>
<requestHandler name="/cl" class="solr.SearchHandler" >
<lst name="defaults">
<str name="echoParams">explicit</str>
<bool name="clustering">true</bool>
<str name="clustering.engine">default</str>
<bool name="clustering.results">true</bool>
<!-- Fields to cluster on -->
<str name="carrot.title">UEBSCHRIFT</str>
<str name="carrot.snippet">TEXT</str>
</lst>
</requestHandler>
When I try to use the request handler at http://server:8080/solr/mycore/cl?q=*:* I get the following Java exception:
java.lang.NoClassDefFoundError: com.carrotsearch.hppc.ObjectContainer
at java.lang.J9VMInternals.verifyImpl(Native Method)
at java.lang.J9VMInternals.verify(J9VMInternals.java:72)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:134)
at org.carrot2.text.preprocessing.pipeline.BasicPreprocessingPipeline.<init>(BasicPreprocessingPipeline.java:106)
at org.carrot2.text.preprocessing.pipeline.CompletePreprocessingPipeline.<init>(CompletePreprocessingPipeline.java:32)
at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.<init>(LingoClusteringAlgorithm.java:129)
at java.lang.J9VMInternals.newInstanceImpl(Native Method)
at java.lang.Class.newInstance(Class.java:1325)
at org.carrot2.util.pool.SoftUnboundedPool.borrowObject(SoftUnboundedPool.java:80)
at org.carrot2.core.PoolingProcessingComponentManager.prepare(PoolingProcessingComponentManager.java:128)
at org.carrot2.core.Controller.process(Controller.java:333)
at org.carrot2.core.Controller.process(Controller.java:240)
at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:136)
at org.apache.solr.handler.clustering.ClusteringComponent.process(ClusteringComponent.java:91)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:735)
Caused by: java.lang.ClassNotFoundException: com.carrotsearch.hppc.ObjectContainer
at java.lang.Throwable.<init>(Throwable.java:80)
at java.lang.ClassNotFoundException.<init>(ClassNotFoundException.java:76)
at java.net.URLClassLoader.findClass(URLClassLoader.java:419)
at java.lang.ClassLoader.loadClass(ClassLoader.java)
at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:923)
at java.lang.ClassLoader.loadClass(ClassLoader.java:609)
... 31 more
The problem is that I have no idea what this means. I have been searching for hours without finding a solution.
By the way: I'm running on tomcat with the following options:
export CATALINA_OPTS="-Dsolr.clustering.enabled=true"
(is this still required in Solr 3.4?)
The catalina option is part of the java command, as you can see with ps -efa
/usr/lib64/jvm/java-1_6_0-ibm-1.6.0/jre//bin/java
-Djava.util.logging.config.file=/opt/tomcat6/conf/logging.properties -Xms2048m -Xmx2048m -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Dsolr.clustering.enabled=true -Djava.endorsed.dirs=/opt/tomcat6/endorsed -classpath /opt/tomcat6/bin/bootstrap.jar -Dcatalina.base=/opt/tomcat6
-Dcatalina.home=/opt/tomcat6 -Djava.io.tmpdir=/opt/tomcat6/temp org.apache.catalina.startup.Bootstrap start
Does anyone have an idea what I could do to solve this problem?
Update:
If I add hppc-0.3.4-jdk15.jar, I get the following error:
java.lang.NoClassDefFoundError: org.apache.mahout.math.matrix.DoubleMatrix2D
at java.lang.J9VMInternals.verifyImpl(Native Method)
at java.lang.J9VMInternals.verify(J9VMInternals.java:72)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:134)
at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.<init>(LingoClusteringAlgorithm.java:134)
[...]
Caused by: java.lang.ClassNotFoundException: org.apache.mahout.math.matrix.DoubleMatrix2D
at java.lang.Throwable.<init>(Throwable.java:80)
at java.lang.ClassNotFoundException.<init>(ClassNotFoundException.java:76)
at java.net.URLClassLoader.findClass(URLClassLoader.java:419)
at java.lang.ClassLoader.loadClass(ClassLoader.java)
at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:923)
at java.lang.ClassLoader.loadClass(ClassLoader.java:609)
... 29 more
It looks like I have to install a Mahout archive, but I thought all packages for clustering were included in Solr 3.4. Am I on the wrong track?

If you are using Solr with Tomcat as a separate instance, you need to copy the jars so that they are available to Solr.
Quote from README.txt
NOTE: This Solr example server references certain Solr jars outside of
this server directory for non-core modules with statements in
solrconfig.xml. If you make a copy of this example server and wish to
use the ExtractingRequestHandler (SolrCell), DataImportHandler (DIH),
UIMA, the clustering component, or other modules in "contrib", you
will need to copy the required jars into solr/lib or update the paths
to the jars in your solrconfig.xml.
Check the paths to the clustering and carrot jars in solrconfig.xml.
You are probably missing hppc-0.3.4-jdk15.jar.

Why not use Solr's default packaging (this is officially supported)? It ships with Jetty and will save you the classpath headaches, because things are already configured.
To answer your question: you'll need all the JARs from Solr's default clustering extension folder. For 4.0 alpha this would be contrib/clustering/lib/*.jar:
carrot2-core-3.5.0.jar
hppc-0.3.3.jar
jackson-core-asl-1.7.4.jar
jackson-mapper-asl-1.7.4.jar
mahout-collections-0.3.jar
mahout-math-0.3.jar
simple-xml-2.4.1.jar

Did you add the Mahout Math package?
It seems to be a separate package.
NoClassDefFoundError: org.apache.mahout.math.matrix.DoubleMatrix2D
^^^^^^^^^^^^^^^^^^^^^^

With Solr 4, copy this file into the conf folder:
http://svn.apache.org/repos/asf/labs/alike/trunk/demo/solrhome/collection1/conf/solrconfig.xml

Can SOLR configuration files be located in parent folders?

I have configured the QueryElevation searchComponent of SOLR as documented here:
http://wiki.apache.org/solr/SchemaXml#The_Unique_Key_Field
However, I would like to load the elevate.xml file from several folders above the default one.
I cannot get this to work... all of the following generate an error:
<str name="config-file">../../elevate.xml</str>
<str name="config-file">..\..\elevate.xml</str>
<str name="config-file">c:/elevate.xml</str>
<str name="config-file">c:\elevate.xml</str>
Per the Solr wiki:
Path to the file that defines query elevation. This file must exist in:
${instanceDir}/conf/${config-file} , or
${dataDir}/${config-file}

Resources