Run Cassandra query using Solr without using DataStax - solr

I have set up a Solr indexing schema for one of my databases in Cassandra. I can run queries from the Solr Admin UI at my local IP address and the appropriate port, but I cannot run CQL Solr queries. A statement such as SELECT * FROM keyspace.table WHERE solr_query='name: cat name: dog -name:fish' fails with an "unknown column solr_query" error. I also tried adding the line <requestHandler class="com.datastax.bdp.search.solr.handler.component.CqlSearchHandler" name="solr_query" /> to my solrconfig.xml, as described in https://docs.datastax.com/en/datastax_enterprise/4.8/datastax_enterprise/srch/srchCql.html, but got a "Could not load class" error. Any idea what I'm doing wrong?
Thanks!
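Since the Solr Admin UI queries already work, the same query string can in principle be sent to Solr's HTTP /select handler directly. The following is only a minimal sketch of building such a request URL: the base URL, the port 8983, and the core name "mycore" are placeholders that are not taken from the question, and `build_select_url` is a hypothetical helper, not part of any Solr client library.

```python
from urllib.parse import urlencode


def build_select_url(base_url: str, core: str, query: str, rows: int = 10) -> str:
    """Return a Solr /select URL for the given core and query string.

    base_url and core are assumptions; substitute the values that the
    Solr Admin UI shows for your installation.
    """
    # urlencode percent-encodes ':' and turns spaces into '+'.
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/{core}/select?{params}"


# Placeholder host and core name, with the query from the question.
url = build_select_url(
    "http://localhost:8983/solr",
    "mycore",
    "name:cat name:dog -name:fish",
)
print(url)
```

The printed URL can be opened in a browser or fetched with any HTTP client to compare the results with what the Admin UI returns.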

Related

Apache SOLR - full import fetches only like half of the records

I'm using the JdbcDataSource to fetch records from a view in my MySQL database.
When I query the view in phpMyAdmin with a count(*) command, I get 97670 records.
But when running a full import in SOLR, only 56428 records are fetched and indexed. I don't get any errors, everything seems to be ok on SOLR (I did set batchSize to -1).
Any explanations for this behaviour?
I am assuming that you have checked the Solr logs and that no errors are logged there. In that case, try executing a commit on your Solr collection: documents are sometimes buffered in memory, and committing pushes them to disk.
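One way to issue the commit suggested above is through Solr's HTTP update handler with commit=true. This is a minimal sketch only: the base URL and collection name are placeholders, and `build_commit_url` and `commit` are hypothetical helper names introduced for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def build_commit_url(base_url: str, collection: str) -> str:
    """Return the update-handler URL that asks Solr for an explicit commit."""
    params = urlencode({"commit": "true", "wt": "json"})
    return f"{base_url.rstrip('/')}/{collection}/update?{params}"


def commit(base_url: str, collection: str, timeout: float = 30.0) -> str:
    """Send the commit request and return the raw response body."""
    with urlopen(build_commit_url(base_url, collection), timeout=timeout) as resp:
        return resp.read().decode("utf-8")


# Example call (needs a running Solr; host and collection are assumptions):
# print(commit("http://localhost:8983/solr", "mycollection"))
```

After the commit, comparing numFound for a *:* query against the count(*) from the database shows whether the missing records were only uncommitted.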

How to integrate Zeppelin Solr

After creating a Solr interpreter, I am trying to query a collection through ZooKeeper, but it throws an exception: Caused by: org.noggit.JSONParser$ParseException:
Are you trying to query it directly? You can go through the JDBC service instead; it wasn't too complicated:
https://lucene.apache.org/solr/guide/7_7/solr-jdbc-apache-zeppelin.html

Error while reloading the Solr core after schema.xml is modified: could not achieve replication factor 1 (found 0 replicas only)

I am currently running Cassandra in Solr mode.
Using DSE 4.7
Cassandra 2.1.8
./dsetool create_core vin_service_development.vinid_search1 generateResources=true reindex=true
The indexes were created successfully, and I can see the table in the Core Selector list at http://10.14.210.22:8983/solr/#/
I changed a field type in schema.xml from "TextField" to "StrField" and want to reload the core to pick up the change.
I executed the command below:
./dsetool reload_core vin_service_development.vinid_search1 reindex=true solrconfig=solr.xml
solr.xml is in the same directory as dsetool.
Error Info:
brsblcdb012:/apps/apg-data.cassandra/bin ./dsetool reload_core vin_service_development.vinid_search1 reindex=true solrconfig=solr.xml
WARN 20:21:14 Error while computing token map for datacenter datacenter1: could not achieve replication factor 1 (found 0 replicas only), check your keyspace replication settings. Note that this can affect the performance of the driver.
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error in xpath:/config/luceneMatchVersion for solrconfig.xml
at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:665)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:303)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:294)
at com.datastax.bdp.tools.SearchDseToolCommands.createOrReloadCore(SearchDseToolCommands.java:383)
at com.datastax.bdp.tools.SearchDseToolCommands.access$200(SearchDseToolCommands.java:53)
at com.datastax.bdp.tools.SearchDseToolCommands$ReloadCore.execute(SearchDseToolCommands.java:201)
at com.datastax.bdp.tools.DseTool.run(DseTool.java:114)
at com.datastax.bdp.tools.DseTool.run(DseTool.java:51)
at com.datastax.bdp.tools.DseTool.main(DseTool.java:174)
Is this the correct way to reload the core in Solr after making changes to the XML files?
Updated:
One of my keyspaces was using NetworkTopologyStrategy earlier. I changed it to SimpleStrategy. Now all the keyspaces use SimpleStrategy in the datacenter Solr.
After executing the same command, I got this error:
brsblcdb012:/apps/apg-data.cassandra/bin ./dsetool reload_core vin_service_development.vinid_search1 reindex=true solrconfig=solr.xml
org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException: Error in xpath:/config/luceneMatchVersion for solrconfig.xml
at org.apache.solr.client.solrj.impl.HttpSolrServer.executeMethod(HttpSolrServer.java:665)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:303)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:294)
at com.datastax.bdp.tools.SearchDseToolCommands.createOrReloadCore(SearchDseToolCommands.java:383)
at com.datastax.bdp.tools.SearchDseToolCommands.access$200(SearchDseToolCommands.java:53)
at com.datastax.bdp.tools.SearchDseToolCommands$ReloadCore.execute(SearchDseToolCommands.java:201)
at com.datastax.bdp.tools.DseTool.run(DseTool.java:114)
at com.datastax.bdp.tools.DseTool.run(DseTool.java:51)
at com.datastax.bdp.tools.DseTool.main(DseTool.java:174)
What would be the recommended change now?
To sum up the conversation:
The keyspace replication configuration was initially wrong (since updated to SimpleStrategy, RF 2):
Your nodes are now in datacenter 'Solr', but one of your keyspaces is configured with NetworkTopologyStrategy and a replication factor referencing 'datacenter1'.
You had accidentally replaced your solrconfig with the wrong XML, which caused this error. To fix it, you can recreate your Solr core.
In DSE 4.8 you can remove your Solr core using unload_core and recreate it. On an older version of DSE, you can follow 'Remove core from Datastax Solr'.
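The error message names the xpath /config/luceneMatchVersion, so a quick local check before passing a file as solrconfig= is to confirm that it has a <config> root with a non-empty <luceneMatchVersion> child. The sketch below does only that; it is not DSE's own validation, `looks_like_solrconfig` is a hypothetical helper, and the version value in the sample is a placeholder.

```python
import xml.etree.ElementTree as ET


def looks_like_solrconfig(xml_text: str) -> bool:
    """Return True if the XML has /config/luceneMatchVersion with a value.

    This mirrors only the xpath mentioned in the error message; a file
    that passes may still be rejected by Solr for other reasons.
    """
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    if root.tag != "config":
        return False
    version = root.find("luceneMatchVersion")
    return version is not None and bool((version.text or "").strip())


# Placeholder documents: a solrconfig-like file and a solr.xml-like file.
solrconfig_sample = "<config><luceneMatchVersion>placeholder</luceneMatchVersion></config>"
solr_xml_sample = "<solr><str name='host'>localhost</str></solr>"

print(looks_like_solrconfig(solrconfig_sample))
print(looks_like_solrconfig(solr_xml_sample))
```

A file like solr.xml, whose root element is not <config>, fails this check, which is consistent with the diagnosis above that the wrong XML was uploaded as the solrconfig.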

Cloudera cannot find class 'org.apache.hadoop.hive.solr.SolrStorageHandler'

I'm using SolrStorageHandler to insert data into Solr from Hive tables.
I'm following this tutorial, but when launching the Oozie workflow I get an error:
"Error while compiling statement: FAILED: SemanticException Cannot find class 'org.apache.hadoop.hive.solr.SolrStorageHandler' (state=42000,code=40000)".
I cannot find SolrStorageHandler.jar on the web to add it manually to the Hive script. Am I doing something wrong, or is this approach deprecated? If it is deprecated, how can I insert data into Solr from Hive in an efficient way?
By the way, the Cloudera version is 5.4.3.

Solr Write-lock issue

Environment: Solr 1.4 on Windows/MS SQL Server
A write lock is created whenever I try to do a full-import of documents using DIH. The logs say "Creating a connection with the database....." and the process does not move forward (it never gets a database connection), so the indexes are not created. Note that no other process is accessing the index, and I even restarted my MS SQL Server service. However, I still see a write.lock file in my index directory.
What could be the reason for this? Even though I have set the unlockOnStartup flag in solrconfig to true, indexing still does not happen.
The problem was resolved. There was an issue with the Java update and the Microsoft JDBC driver.
