I'm using Cloudera 5.4 with Solr 4.10.2 and I would like to enable schemaless mode.
I've edited solrconfig.xml with:
<schemaFactory class="ManagedIndexSchemaFactory">
<bool name="mutable">true</bool>
<str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>
I don't know if I need anything else for this version. I have read the Solr documentation (https://docs.lucidworks.com/display/solr/Managed+Schema+Definition+in+SolrConfig).
When I try to index a document, I get an exception:
Exception in thread "main" org.apache.solr.client.solrj.impl.CloudSolrServer$RouteException: ERROR: [doc=5.417393032179468E7] unknown field 'campo1'
I have seen that there's a schema.xml.bak, as the documentation says.
Do I need to do something else?
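(For reference: on stock Solr 4.x the managed schema factory by itself does not add unknown fields; the example schemaless configs also wire in an update processor chain roughly like the sketch below. The chain name, handler, and field type names are taken from the stock example configs and are assumptions about this Cloudera setup.)
<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
  <processor class="solr.ParseLongFieldUpdateProcessorFactory"/>
  <processor class="solr.ParseDoubleFieldUpdateProcessorFactory"/>
  <processor class="solr.AddSchemaFieldsUpdateProcessorFactory">
    <str name="defaultFieldType">text_general</str>
    <lst name="typeMapping">
      <str name="valueClass">java.lang.Long</str>
      <str name="fieldType">tlong</str>
    </lst>
    <lst name="typeMapping">
      <str name="valueClass">java.lang.Double</str>
      <str name="fieldType">tdouble</str>
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
<!-- and point the update handler at the chain -->
<requestHandler name="/update" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">add-unknown-fields-to-the-schema</str>
  </lst>
</requestHandler>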
I am trying to get the more-like-this query parser working on my test system. The test system has SolrCloud 6.5.0 installed. The MLT handler is enabled with the following configuration:
<requestHandler name="/mlt" class="solr.MoreLikeThisHandler">
<lst name="defaults">
<str name="mlt.qf">search_text_st</str>
<str name="mlt.fl">search_text_st</str>
<int name="mlt.minwl">4</int>
<int name="mlt.maxwl">18</int>
</lst>
</requestHandler>
When I query for documents similar to a specific document with the handler, I get the expected results. For example:
http://localhost:8983/solr/MyCloud/mlt?q=id:123
The above query will return results:
"response":{"numFound":361,"start":0,"maxScore":113.24594,"docs":[...]}
However, when I try an equivalent query using the MLTQParser with {!mlt qf=search_text_st fl=search_text_st minwl=4 maxwl=18}123, I get no results:
http://localhost:8983/solr/MyCloud/select?q=%7B!mlt+qf%3Dsearch_text_st+fl%3Dsearch_text_st+minwl%3D4+maxwl%3D18%7D123
The response looks like this:
"response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]}
I have done nothing so far to enable or configure MLTQParser, but it does appear to be enabled because I get an error when using a document ID that doesn't exist.
Any idea why this is not working?
I eventually figured out why this was failing. The search_text_st field was being created using copyField. The Cloud MLT Query Parser uses the realtime get handler to retrieve the fields to be mined for keywords. Because of the way the realtime get handler is implemented, it does not return data for fields populated using copyField. (see https://issues.apache.org/jira/browse/SOLR-3743)
Changing the configuration to use the source fields made it work.
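For example, with a hypothetical source field description_t in place of the copyField target (the field name here is a placeholder, not from my schema), the working query looked roughly like:
http://localhost:8983/solr/MyCloud/select?q={!mlt qf=description_t fl=description_t minwl=4 maxwl=18}123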
I'm using Solr 4.10.
I have enabled the /export request handler for an index by adding this to the solrconfig.xml (as mentioned here: https://cwiki.apache.org/confluence/display/solr/Exporting+Result+Sets):
<requestHandler name="/export" class="solr.SearchHandler">
<lst name="invariants">
<str name="rq">{!xport}</str>
<str name="wt">xsort</str>
<str name="distrib">false</str>
</lst>
<arr name="components">
<str>query</str>
</arr>
</requestHandler>
Now I can use: http://localhost:8983/solr/index/select?.... as well as http://localhost:8983/solr/index/export?.... from a browser or curl.
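For reference, a typical export request looks like this (the handler requires explicit sort and fl parameters on docValues fields; the field name id here is an assumption):
http://localhost:8983/solr/index/export?q=*:*&sort=id+asc&fl=id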
But, I cannot get it to run properly using SolrJ.
I tried (as suggested here: https://lucene.apache.org/solr/4_10_0/solr-solrj/index.html):
SolrQuery query = new SolrQuery();
...
query.setRequestHandler("/export");
...
httpSolrServer.query(query);
The query now has a parameter &qt=export. It blew up giving me:
org.apache.solr.client.solrj.SolrServerException: Error executing query
Further searching suggested using SolrRequest instead of SolrQuery, so I tried it:
SolrQuery query = new SolrQuery();
...
query.setRequestHandler("/export");
SolrRequest solrRequest = new QueryRequest(query);
httpSolrServer.request(solrRequest);
Now I get:
java.nio.charset.UnsupportedCharsetException: gzip
Any ideas?
---edit---
I found an option in httpSolrServer.request() to add a ResponseParser. I found four ResponseParsers and tried them all; the only one that worked was NoOpResponseParser. Now I have the correct results, but dumped as a plain string in a single entry in a NamedList. I tried to parse it as JSON, but it's not in proper format: every 30,000 documents there is a missing comma.
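For reference, the variant that returned results looked roughly like this (a sketch; the sort/fl field and the NamedList key are assumptions):
SolrQuery query = new SolrQuery("*:*");
query.setRequestHandler("/export");
query.set("sort", "id asc"); // /export needs an explicit sort on a docValues field
query.set("fl", "id");       // and an explicit field list
SolrRequest solrRequest = new QueryRequest(query);
// pass the ResponseParser directly to request(); NoOpResponseParser hands back the raw body
NamedList<Object> result = httpSolrServer.request(solrRequest, new NoOpResponseParser());
String raw = (String) result.get("response"); // the whole response as one string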
I went back to solrconfig.xml and changed wt in the /export handler from xsort to json. Now the response format has changed, but it's still not well-formed (the data is incomplete). And XML is not supported.
I'm truly baffled.
Hi, I am using the SolrJ API to query Solr indexes, but I am getting an exception when I add the following query to a SolrQuery object.
When I run the following query in a browser, it works fine:
http://localhost:8983/solr/hellosolr/select?q=fkey:book+OR+bookstore+AND+whword:what&fl=fanswer
but when I run the same query using SolrQuery, I get the following exception:
SolrQuery solrQuery = new SolrQuery();
solrQuery.set("q", "fkey:book+OR+bookstore+AND+whword:what");
solrQuery.set("fl", "fanswer");
Exception:
org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://localhost:8983/solr/hellosolr: org.apache.solr.search.SyntaxError: Cannot parse 'fkey:book+OR+bookstore+AND+whword:what': Encountered " ":" ": "" at line 1, column 39.
Was expecting one of:
<EOF>
<AND> ...
<OR> ...
<NOT> ...
"+" ...
Please tell me how I can write the above HTTP query using the SolrQuery Java API.
Your error message is a syntax error. Take out the + signs.
For the exception, try this query instead:
(fkey:book OR fkey:bookstore) AND whword:what
or you can write it like this (note the parentheses):
(fkey:book bookstore) AND whword:what
If you have OR defined as your default, then Solr will insert it between book and bookstore for you. Otherwise it will do an AND. I believe OR is the default. Check your solrconfig.xml to be sure.
If you don't specify a field in front of a search term, Solr will use the default field (it's text in my version, for example). If Solr can't find it, you'll get the undefined field [name] error.
As for why the bad query works from the admin panel: in the browser URL the + signs are decoded as spaces before Solr ever sees the query, whereas SolrJ passes them through literally. Either way, this should solve your SolrJ problem.
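In SolrJ that would be something like this (a sketch; solrClient stands for whatever HttpSolrClient instance you already have):
SolrQuery solrQuery = new SolrQuery();
// plain spaces, not '+'; SolrJ URL-encodes the parameter value for you
solrQuery.set("q", "(fkey:book OR fkey:bookstore) AND whword:what");
solrQuery.set("fl", "fanswer");
QueryResponse response = solrClient.query(solrQuery);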
Assuming you're using the /select handler, you can go to solrconfig.xml and change the default:
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<str name="df">blah</str>
</lst>
</requestHandler>
I'm running Solr 3.4 and would like to use the ClusteringComponent.
Following this tutorial, http://wiki.apache.org/solr/ClusteringComponent, in combination with the default entries in solrconfig.xml, I have the following configuration in solrconfig.xml:
<searchComponent name="clustering"
enable="${solr.clustering.enabled:true}"
class="org.apache.solr.handler.clustering.ClusteringComponent" >
<!-- Declare an engine -->
<lst name="engine">
<str name="name">default</str>
<str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
<str name="LingoClusteringAlgorithm.desiredClusterCountBase">20</str>
</lst>
<lst name="engine">
<str name="name">stc</str>
<str name="carrot.algorithm">org.carrot2.clustering.stc.STCClusteringAlgorithm</str>
</lst>
</searchComponent>
<requestHandler name="/cl" class="solr.SearchHandler" >
<lst name="defaults">
<str name="echoParams">explicit</str>
<bool name="clustering">true</bool>
<str name="clustering.engine">default</str>
<bool name="clustering.results">true</bool>
<!-- Fields to cluster on -->
<str name="carrot.title">UEBSCHRIFT</str>
<str name="carrot.snippet">TEXT</str>
</lst>
</requestHandler>
So if I try to use the request handler at http://server:8080/solr/mycore/cl?q=*:*, I get the following Java exception:
java.lang.NoClassDefFoundError: com.carrotsearch.hppc.ObjectContainer
at java.lang.J9VMInternals.verifyImpl(Native Method)
at java.lang.J9VMInternals.verify(J9VMInternals.java:72)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:134)
at org.carrot2.text.preprocessing.pipeline.BasicPreprocessingPipeline.<init>(BasicPreprocessingPipeline.java:106)
at org.carrot2.text.preprocessing.pipeline.CompletePreprocessingPipeline.<init>(CompletePreprocessingPipeline.java:32)
at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.<init>(LingoClusteringAlgorithm.java:129)
at java.lang.J9VMInternals.newInstanceImpl(Native Method)
at java.lang.Class.newInstance(Class.java:1325)
at org.carrot2.util.pool.SoftUnboundedPool.borrowObject(SoftUnboundedPool.java:80)
at org.carrot2.core.PoolingProcessingComponentManager.prepare(PoolingProcessingComponentManager.java:128)
at org.carrot2.core.Controller.process(Controller.java:333)
at org.carrot2.core.Controller.process(Controller.java:240)
at org.apache.solr.handler.clustering.carrot2.CarrotClusteringEngine.cluster(CarrotClusteringEngine.java:136)
at org.apache.solr.handler.clustering.ClusteringComponent.process(ClusteringComponent.java:91)
at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:194)
at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:129)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1368)
at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:356)
at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:252)
at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:298)
at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:857)
at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:588)
at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)
at java.lang.Thread.run(Thread.java:735)
Caused by: java.lang.ClassNotFoundException: com.carrotsearch.hppc.ObjectContainer
at java.lang.Throwable.<init>(Throwable.java:80)
at java.lang.ClassNotFoundException.<init>(ClassNotFoundException.java:76)
at java.net.URLClassLoader.findClass(URLClassLoader.java:419)
at java.lang.ClassLoader.loadClass(ClassLoader.java)
at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:923)
at java.lang.ClassLoader.loadClass(ClassLoader.java:609)
... 31 more
The point is that I have no idea what this means. I've been searching for hours without finding a solution.
By the way, I'm running on Tomcat with the following options:
export CATALINA_OPTS="-Dsolr.clustering.enabled=true"
(is this still required in Solr 3.4?)
The Catalina option is part of the java command, as you can see with ps -efa:
/usr/lib64/jvm/java-1_6_0-ibm-1.6.0/jre//bin/java
-Djava.util.logging.config.file=/opt/tomcat6/conf/logging.properties -Xms2048m -Xmx2048m -Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager -Dsolr.clustering.enabled=true -Djava.endorsed.dirs=/opt/tomcat6/endorsed -classpath /opt/tomcat6/bin/bootstrap.jar -Dcatalina.base=/opt/tomcat6
-Dcatalina.home=/opt/tomcat6 -Djava.io.tmpdir=/opt/tomcat6/temp org.apache.catalina.startup.Bootstrap start
Does anyone have an idea what I could do to solve this problem?
//Update:
If I add hppc-0.3.4-jdk15.jar, I get the following error:
java.lang.NoClassDefFoundError: org.apache.mahout.math.matrix.DoubleMatrix2D
at java.lang.J9VMInternals.verifyImpl(Native Method)
at java.lang.J9VMInternals.verify(J9VMInternals.java:72)
at java.lang.J9VMInternals.initialize(J9VMInternals.java:134)
at org.carrot2.clustering.lingo.LingoClusteringAlgorithm.<init>(LingoClusteringAlgorithm.java:134)
[...]
Caused by: java.lang.ClassNotFoundException: org.apache.mahout.math.matrix.DoubleMatrix2D
at java.lang.Throwable.<init>(Throwable.java:80)
at java.lang.ClassNotFoundException.<init>(ClassNotFoundException.java:76)
at java.net.URLClassLoader.findClass(URLClassLoader.java:419)
at java.lang.ClassLoader.loadClass(ClassLoader.java)
at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:923)
at java.lang.ClassLoader.loadClass(ClassLoader.java:609)
... 29 more
It looks like I have to install a Mahout archive, but I thought all the packages needed for clustering were included in Solr 3.4?! Am I on the wrong track?
If you are running Solr under Tomcat as a separate instance, you need to copy the jars so that they are available to Solr.
Quote from README.txt:
NOTE: This Solr example server references certain Solr jars outside of
this server directory for non-core modules with statements in
solrconfig.xml. If you make a copy of this example server and wish to
use the ExtractingRequestHandler (SolrCell), DataImportHandler (DIH),
UIMA, the clustering component, or other modules in "contrib", you
will need to copy the required jars into solr/lib or update the paths
to the jars in your solrconfig.xml.
Check for the clustering and carrot jars in solrconfig.xml.
You are probably missing hppc-0.3.4-jdk15.jar.
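One way to make them visible is via <lib> directives in solrconfig.xml (a sketch; the paths are placeholders for wherever your Solr distribution actually lives, and the jar name pattern may differ between versions):
<lib dir="/path/to/apache-solr-3.4.0/contrib/clustering/lib/" regex=".*\.jar" />
<lib dir="/path/to/apache-solr-3.4.0/dist/" regex=".*clustering.*\.jar" />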
Why not use Solr's default packaging (it is officially supported)? It ships with Jetty and will save you the classpath headaches, because everything is already configured.
To answer your question, you'll need all the JARs from Solr's clustering contrib folder; for 4.0 alpha that would be contrib/clustering/lib/*.jar:
carrot2-core-3.5.0.jar
hppc-0.3.3.jar
jackson-core-asl-1.7.4.jar
jackson-mapper-asl-1.7.4.jar
mahout-collections-0.3.jar
mahout-math-0.3.jar
simple-xml-2.4.1.jar
Did you add the Mahout Math package?
It seems to be a separate package.
NoClassDefFoundError: org.apache.mahout.math.matrix.DoubleMatrix2D (note the org.apache.mahout.math package)
With Solr 4, copy this file into your conf folder:
http://svn.apache.org/repos/asf/labs/alike/trunk/demo/solrhome/collection1/conf/solrconfig.xml
While indexing into Solr, I am getting an error like this:
HTTP Status 500 - lazy loading error
org.apache.solr.common.SolrException: lazy loading error
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.getWrappedHandler(RequestHandlers.java:260)
    at org.apache.solr.core.RequestHandlers$LazyRequestHandlerWrapper.handleRequest(RequestHandlers.java:242)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1376)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:365)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:260)
    at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
    at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
    at ...
The URL formed is: http://localhost:8080/solr/update/extract?Latitude=51.9125&Longitude=179.5&commit=true&waitFlush=true&waitSearcher=true&wt=javabin&version=2
(I have configured Tomcat using XAMPP on a Windows machine.)
I have been following Stack Overflow and various other blogs/forums and tried to debug it, but after hours I could not find anything.
I have added the following things in solr.xml:
<maxFieldLength>10000</maxFieldLength>
<writeLockTimeout>60000</writeLockTimeout>
<commitLockTimeout>60000</commitLockTimeout>
<lockType>simple</lockType>
<unlockOnStartup>true</unlockOnStartup>
<reopenReaders>true</reopenReaders>
<requestParsers enableRemoteStreaming="true"
multipartUploadLimitInKB="2048000" />
<lst name="defaults">
<!--str name="echoParams">explicit</str-->
<!--int name="rows">10</int-->
<!--str name="df">text</str-->
<str name="Latitude">Latitude</str>
<str name="Longitude">Longitude</str>
</lst>
I even tried adding the following to solrconfig.xml and restarting Tomcat:
<requestHandler name="/update/extract" class="org.apache.solr.handler.extraction.ExtractingRequestHandler">
<lst name="defaults">
<str name="ext.map.Last-Modified">last_modified</str>
<bool name="ext.ignore.und.fl">true</bool>
</lst>
</requestHandler>
On the Java console it shows an error:
org.apache.solr.common.SolrException: Internal Server Error
I realized the issue might be because of my Solr home path. I created a new directory, copied all the config files there, and set that as my Solr home. I later updated solrconfig.xml, correcting the paths for all the jars.
I also tried adding the pdfbox and fontbox jars to the Solr lib folder and restarting Tomcat.
My Java code is:
String urlString = "http://localhost:8080/solr";
SolrServer server = new CommonsHttpSolrServer(urlString);
ContentStreamUpdateRequest up = new ContentStreamUpdateRequest("/update/extract");
String fileName=f.toString();
up.addFile(new File(fileName));
up.setParam("Latitude", Latitude);
up.setParam("Longitude", Longitude);
up.setAction(AbstractUpdateRequest.ACTION.COMMIT, true, true);
server.request(up);
(Port 8080 is the one I have configured.)
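Side note: per the ExtractingRequestHandler wiki linked below, literal field values are normally passed with the literal. prefix rather than as bare parameters; a sketch, with placeholder field names that would have to exist in schema.xml:
up.setParam("literal.latitude", Latitude);   // "latitude" is a placeholder field name
up.setParam("literal.longitude", Longitude); // the field must exist (or match a dynamic field) in schema.xml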
Solr indexing is still not working at my end. I have spent hours debugging this and trying to figure it out. It would be really great if you could give me a hint or point out anything I am doing wrong.
For reference, I have already tried:
http://wiki.apache.org/solr/FrontPage
http://wiki.apache.org/solr/ContentStreamUpdateRequestExample
http://wiki.apache.org/solr/UpdateRichDocuments
http://wiki.apache.org/solr/ExtractingRequestHandler#Configuration
http://lucene.472066.n3.nabble.com/Problem-using-ExtractingRequestHandler-with-tomcat-td494930.html
http://lucene.472066.n3.nabble.com/Internal-Server-Error-td715713.html
How to index pdf's content with SolrJ?
I finally found a way to solve this.
Just modify SOLR_HOME/conf/solrconfig.xml: change the dir attribute of every <lib> tag from "../../dist/" to "../dist/" and save the file.
You encounter this problem when the Solr directory has been moved from apache-solr-x.x.x\example to some other place, so the relative path "../../dist/" needs to be changed accordingly.
Remember to restart your Tomcat and see if it works.
Do you have solr-cell.jar in your classpath?
The ExtractingRequestHandler lives in solr-cell.jar, which is not packaged with the default Solr server.
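If it is missing, the usual fix is a pair of <lib> directives in solrconfig.xml along these lines (a sketch; the paths are placeholders for your actual dist/ and contrib/extraction/lib/ directories):
<lib dir="/path/to/solr/dist/" regex=".*solr-cell-.*\.jar" />
<lib dir="/path/to/solr/contrib/extraction/lib/" regex=".*\.jar" />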