Cassandra and Solr integration: Unable to execute query - solr

I'm trying to integrate Cassandra and Solr.
I'm using Solr 6.6.0, Cassandra 3.10 and Java 8.
To my solrconfig.xml I added these lines:
<lib dir="/home/bkoganti/solr-6.6.0/contrib/dataimporthandler/" regex="cassandra-jdbc-.*\.jar"/>
<lib dir="/home/bkoganti/solr-6.6.0/contrib/dataimporthandler/" regex="cassandra-all-.*\.jar"/>
<lib dir="/home/bkoganti/solr-6.6.0/contrib/dataimporthandler/" regex="cassandra-thrift-.*\.jar"/>
<lib dir="/home/bkoganti/solr-6.6.0/contrib/dataimporthandler/" regex="libthrift-.*\.jar"/>
<lib dir="/home/bkoganti/solr-6.6.0/contrib/dataimporthandler/" regex="cassandra-driver-core-.*\.jar"/>
...
<requestHandler name="/dataimport" class="solr.DataImportHandler">
<lst name="defaults">
<str name="config">sample-data-config.xml</str>
</lst>
</requestHandler>
sample-data-config.xml
<dataConfig>
<dataSource type="JdbcDataSource" driver="org.apache.cassandra.cql.jdbc.CassandraDriver" url="jdbc:cassandra://127.0.0.1:9160/demo" autoCommit="true"/>
<document name="content">
<entity name="test" query="SELECT id,org,name,dep,place,sal from tutor" autoCommit="true">
<field column="id" name="id" />
<field column="org" name="org" />
<field column="name" name="name" />
<field column="dep" name="dep" />
<field column="place" name="place" />
<field column="sal" name="sal" />
</entity>
</document>
</dataConfig>
To the managed schema I added these fields:
<field name="org" type="string" indexed="true" stored="true" required="true" />
<field name="dep" type="string" indexed="true" stored="true" required="true" />
<field name="place" type="string" indexed="true" stored="true" required="true" />
<field name="sal" type="string" indexed="true" stored="true" required="true" />
On running Solr and trying to import data from the sample core, the import fails and I keep getting this error.
I'm unable to figure out where I'm going wrong. Could someone help me out? Thanks in advance.

The last version of Cassandra supported by the old CQL JDBC driver is 1.2.5. Later, DataStax developed the drivers required to connect Cassandra to Java applications. However, the DataStax drivers cannot be used with Solr this way, as DataStax ships its own DSE search engine.
This JDBC driver is compatible with the latest versions of Cassandra; using it I was able to integrate.
And this JDBC driver also works.
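As a rough sketch, the data source entry for such a wrapper driver would look something like this; the driver class name below is only a placeholder for whichever wrapper you pick, and port 9042 (the native CQL port) is an assumption that depends on the driver:
<!-- driver class is a placeholder; substitute the class shipped by your chosen JDBC wrapper -->
<dataSource type="JdbcDataSource"
            driver="com.example.cassandra.jdbc.CassandraDriver"
            url="jdbc:cassandra://127.0.0.1:9042/demo"
            autoCommit="true"/>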

That error means Solr cannot reach Cassandra via JDBC. First, check that you can connect to Cassandra from the Solr host over JDBC, using some DB tool like SQuirreL SQL or something similar.
Once you have verified you can access it this way, move on to Solr. Right now something is preventing the connection (a firewall, a wrong port, who knows...).

Use nodetool enablethrift for Cassandra. Your error clearly says it is unable to connect: connection refused. After you enable Thrift, check it using nodetool info. Hope this solves your problem.

Related

Solr Cloud in Kubernetes Indexing error - HttpSolrCall Unable to write response

I am trying to do indexing with Solr Cloud running in a Kubernetes cluster. I defined a Data Import Handler and I can see the configuration in the Solr UI.
The Data Import Handler lets me trigger a SQL query and fetch the polygon data for building the index.
<dataConfig>
<dataSource
type="JdbcDataSource" processor="XPathEntityProcessor"
driver="oracle.jdbc.driver.OracleDriver" ...... />
<document>
<entity name="pcode" pk="PC" transformer="ClobTransformer"
query="select PCA as PC, GEOM as WPOLYGON,
SBJ,PD,CD
from SCPCA
where SBC is not null">
<field column="PC" name="pCode" />
<field column="WPOLYGON" name="wpolygon" clob="true"/>
<field column="SBJ" name="sbjcode" clob="true"/>
<field column="PD" name="portid"/>
<field column="CD" name="cancid"/>
</entity>
</document>
</dataConfig>
After triggering the index via the UI, it runs for around 1 minute and then fails with the following errors in the console:
qtp1046545660-14) [c:sba s:shard1 r:core_node6 x:sba_shard1_replica_n4] o.a.s.u.p.LogUpdateProcessorFactory [sba_shard1_replica_n4] webapp=/solr path=/dataimport params={core=sba&debug=true&optimize=false&indent=on&commit=true&name=dataimport&clean=true&wt=json&command=full-import&_=164589234356779&verbose=true}{deleteByQuery=*:*,commit=} 0 70343
2022-02-26 16:30:38.092 INFO (qtp10465423460-14) [c:sba s:shard1 r:core_node6 x:sba_shard1_replica_n4] o.a.s.s.HttpSolrCall Unable to write response, client closed connection or we are shutting down => org.eclipse.jetty.io.EofException: Reset cancel_stream_error
at org.eclipse.jetty.http2.server.HTTP2ServerConnectionFactory$HTTPServerSessionListener.onReset(HTTP2ServerConnectionFactory.java:159)
org.eclipse.jetty.io.EofException: Reset cancel_stream_error
I am using Solr Cloud 8.9 with Solr Operator 0.5.0, and I checked the Jetty config; it has an idle timeout of 120000.
Has anyone faced similar issues and fixed it?
Jetty's EofException almost always means one specific thing: the client closed the connection before Solr could respond, so when Solr finally finished processing and tried to have Jetty send the response, there was nowhere to send it -- the connection was gone.
In my case I was doing a full data import into Solr and it failed with this HttpSolrCall Unable to write response EofException. This was happening due to issues with my managed-schema / schema.xml: I forgot to add all the columns correctly to the schema, which caused the indexing to fail with the EofException. After correcting my schema.xml it worked fine.
It is a confusing error, since it is not obvious why a wrong schema should surface as an EofException. Still, in Solr always check the schema.xml / managed-schema for any discrepancies.
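As a rough illustration, every column mapped in the data-config needs a corresponding field in the schema. A sketch matching the entity above, where the field types are assumptions rather than the original setup:
<field name="pCode" type="string" indexed="true" stored="true" />
<field name="wpolygon" type="text_general" indexed="false" stored="true" />
<field name="sbjcode" type="text_general" indexed="true" stored="true" />
<field name="portid" type="string" indexed="true" stored="true" />
<field name="cancid" type="string" indexed="true" stored="true" />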

Unable to use Update Log: _version_ field must exist in schema

I am new to Apache Solr and I am trying to build a search engine based on this platform. I uploaded all the documents, but it keeps giving me this error and I don't know what is going on. The error brings other errors with it, like the CoreContainer being unable to create cores, and I am not able to perform a query search. Can you help me out?
org.apache.solr.common.SolrException: _version_ field must exist in schema and be searchable (indexed or docValues) and retrievable(stored or docValues) and not multiValued (_version_ does not exist)
at org.apache.solr.update.VersionInfo.getAndCheckVersionField(VersionInfo.java:68)
at org.apache.solr.update.VersionInfo.<init>(VersionInfo.java:94)
at org.apache.solr.update.UpdateLog.init(UpdateLog.java:308)
at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:137)
at org.apache.solr.update.UpdateHandler.<init>(UpdateHandler.java:94)
at org.apache.solr.update.DirectUpdateHandler2.<init>(DirectUpdateHandler2.java:102)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:706)
at org.apache.solr.core.SolrCore.createUpdateHandler(SolrCore.java:768)
at org.apache.solr.core.SolrCore.initUpdateHandler(SolrCore.java:1009)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:874)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:776)
at org.apache.solr.core.CoreContainer.create(CoreContainer.java:842)
at org.apache.solr.core.CoreContainer.lambda$load$0(CoreContainer.java:498)
at org.apache.solr.core.CoreContainer$$Lambda$24/135640095.call(Unknown Source)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:229)
at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor$$Lambda$25/328827614.run(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
These are the fields defined in my schema.xml:
<field name="_version_" type="long" indexed="true" stored="true"/>
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="title" type="string" indexed="true" stored="true"/>
<field name="revision" type="int" indexed="true" stored="false"/>
<field name="user" type="string" indexed="true" stored="false"/>
<field name="userId" type="int" indexed="true" stored="false"/>
<field name="text" type="text_en" indexed="true" stored="true"/>
<uniqueKey> id </uniqueKey>
First, download the latest configset associated with the schema. It can be downloaded with the following command: solr zk downconfig -d <directory to download to> -n <configset name> -z <ip:port of ZooKeeper>
For example: solr zk downconfig -d C:\solr\workingConfig -n configsetName -z localhost:2181
Make the required changes in schema.xml and add the _version_ field if it does not exist (see the sketch below).
Then upload the updated configset to ZooKeeper with the following command: solr zk upconfig -d <directory to upload> -n <configset name> -z <ip:port of ZooKeeper>
For example: solr zk upconfig -d C:\solr\workingConfig -n configsetName -z localhost:2181
Then reload the collection so that the changes take effect immediately. It can be reloaded with: http://ip:port/solr/admin/collections?action=RELOAD&name=<collection name>
And then try to add docs again.
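A sketch of the _version_ field definition to add, assuming a recent Solr release (older releases use type="long" with indexed="true" stored="true" instead of docValues):
<field name="_version_" type="plong" indexed="false" stored="false" docValues="true" multiValued="false"/>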

SOLR Field not reflected in schema browser

I created a Solr core using bin/solr create -c core1, then copied the schema.xml file from the basic configset to the core1/conf folder and added a field:
<field name="title" type="text" indexed="true" stored="true"/>
But this field is not reflected in the schema browser.
What configuration do I need so that new fields show up in the schema browser in the Solr admin UI?
I am using Solr 5.3.1.
By default, when you create a Solr core it will use the managed schema. You will see the following configuration in solrconfig.xml after the core is created:
<schemaFactory class="ManagedIndexSchemaFactory">
<bool name="mutable">true</bool>
<str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>
Above this configuration you will find comments on how to use the managed schema. Comment this block out and uncomment the following to use schema.xml:
<schemaFactory class="ClassicIndexSchemaFactory"/>
You then need to reload the core: go to http://yourhost:8983/solr/#/~cores/core1 and press the "Reload" button.
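After the edit, that part of solrconfig.xml would look roughly like this (a sketch; the surrounding comments vary by Solr version):
<!-- managed schema factory disabled in favour of the classic schema.xml -->
<!--
<schemaFactory class="ManagedIndexSchemaFactory">
<bool name="mutable">true</bool>
<str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>
-->
<schemaFactory class="ClassicIndexSchemaFactory"/>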

solr search over ms sql server - an updated tutorial needed

I am trying to configure Solr over MS SQL Server.
I found only this tutorial, which is a bit old (2011).
Is there an updated tutorial?
Is there a formal tutorial?
Steps to configure Solr on Tomcat:
http://zensarteam.wordpress.com/2011/11/25/6-steps-to-configure-solr-on-apache-tomcat-7-0-20/
Everything about the Data Import Handler can be found here:
http://wiki.apache.org/solr/DataImportHandler
After creating the data import configuration (e.g. a file named data-config.xml), you need to add the request handler to solrconfig.xml as below:
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
<lst name="defaults">
<str name="config">/home/username/data-config.xml</str>
</lst>
</requestHandler>
Sample MS SQL DIH configuration:
data-config.xml:
<dataConfig>
<dataSource type="JdbcDataSource"
driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
url="jdbc:sqlserver://myserver;databaseName=mydb;responseBuffering=adaptive;selectMethod=cursor"
user="sa"
password="password"/>
<document>
<entity name="results" query="SELECT statements">
<field column="fielda" name="fielda"/>
<field column="fieldb" name="fieldb"/>
<field column="fieldc" name="fieldc"/>
</entity>
</document>
</dataConfig>
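The SQL Server JDBC driver jar also has to be on Solr's classpath, typically via a <lib> directive in solrconfig.xml. A sketch, where the directory and the jar name pattern are assumptions that depend on where you drop the driver and which version you download (newer driver releases are named mssql-jdbc-*.jar):
<lib dir="/path/to/jdbc/drivers/" regex="sqljdbc.*\.jar"/>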

Can not use ICUTokenizerFactory in Solr

I am trying to use the ICUTokenizerFactory in my Solr schema. This is how I have defined the field and fieldType:
<fieldType name="text_icu" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.ICUTokenizerFactory"/>
</analyzer>
</fieldType>
<field name="fld_icu" type="text_icu" indexed="true" stored="true"/>
And when I start Solr, I get this error:
Plugin init failure for [schema.xml] fieldType "text_icu": Plugin init failure for [schema.xml] analyzer/tokenizer: Error loading class 'solr.ICUTokenizerFactory'
I have searched for this with no success. I don't know if I am missing something or if there is some problem in the schema.
If someone has tried the ICUTokenizerFactory, please suggest what the problem could be.
Add this at the top of your solrconfig.xml:
<config>
<lib dir="${user.dir}/../contrib/analysis-extras/lucene-libs/" />
<lib dir="${user.dir}/../contrib/analysis-extras/lib/" />
This assumes that you are running from the example directory with solr.solr.home set to your instance. Otherwise, just use an absolute path to your Solr installation.
You can also copy all those jars into the lib directory (under your core, not the Solr home). But the above is the easier way.
From the Wiki:
Lucene provides support for segmenting these languages into syllables with solr.ICUTokenizerFactory in the analysis-extras contrib module. To use this tokenizer, see solr/contrib/analysis-extras/README.txt for instructions on which jars you need to add to your SOLR_HOME/lib
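If you prefer to load only the jars that are actually needed, a sketch with regex patterns (the directories and exact jar names vary between Solr versions, so treat these as assumptions):
<lib dir="${solr.install.dir}/contrib/analysis-extras/lucene-libs/" regex="lucene-analyzers-icu-.*\.jar"/>
<lib dir="${solr.install.dir}/contrib/analysis-extras/lib/" regex="icu4j-.*\.jar"/>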
