I've installed Solr and ZooKeeper on two different machines and edited the zoo.cfg file as instructed on the Solr wiki. ZooKeeper launched and connected successfully, but when I ingest data on one machine, it does not show up on the other. I expected the indexed files to go into the ZooKeeper data folder, but they are stored in the Solr data folder.
Can anyone help with this, or give me the steps from scratch on how to configure it and verify that it is working?
You should have a ZooKeeper ensemble set up, which you mention you already have.
You should set up a Solr cluster across your multiple machines.
Once the SolrCloud setup is done, start Solr against the ZooKeeper ensemble using the -z parameter (e.g. bin/solr start -c -z zookeepermachineIP:2181); see the sketch below for a quick check.
Everything is explained in detail here. Also refer to the wiki on setting up a ZooKeeper ensemble.
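To confirm the cluster is wired up correctly, here is a minimal SolrJ sketch that creates a collection through the ZooKeeper ensemble. The collection name, config set, shard/replica counts, and the ZooKeeper address are placeholders for your own values, not anything mandated by Solr:

    import java.util.Collections;
    import java.util.Optional;

    import org.apache.solr.client.solrj.impl.CloudSolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;

    public class CreateCollectionExample {
        public static void main(String[] args) throws Exception {
            // Connect through the ZooKeeper ensemble rather than a single Solr node;
            // "zookeepermachineIP:2181" is the address used in the answer above.
            try (CloudSolrClient client = new CloudSolrClient.Builder(
                    Collections.singletonList("zookeepermachineIP:2181"),
                    Optional.empty()).build()) {

                // Placeholder names: collection "mycollection" on the built-in
                // "_default" config set, with 2 shards and 2 replicas so the data
                // is spread across both machines.
                CollectionAdminRequest
                    .createCollection("mycollection", "_default", 2, 2)
                    .process(client);
            }
        }
    }

Once the collection exists, documents indexed through either node are visible from both, because the cluster state lives in ZooKeeper. Note that ZooKeeper never stores the index files themselves, only configuration and state, so the index staying in the Solr data folder is expected.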
Right after the SolrCloud setup with ZooKeeper, I tried to copy my collection over from stand-alone Solr, but the DIH (DataImportHandler) was not showing up.
Then I copied the example DIH named db from the solr/example directory, changed it to match my connection and query, placed it in the configsets directory, and put the necessary JARs in lib. Also, the node name has to be the same on both machines. It's working fine now.
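For anyone following along: assuming the DIH ends up registered at its default /dataimport path in solrconfig.xml, a full import can also be triggered from SolrJ. A minimal sketch; the collection name db comes from the example above, and the Solr URL is a placeholder:

    import org.apache.solr.client.solrj.SolrRequest;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.request.GenericSolrRequest;
    import org.apache.solr.common.params.ModifiableSolrParams;

    public class TriggerDataImport {
        public static void main(String[] args) throws Exception {
            // "db" is the collection copied from the DIH example; adjust the URL.
            try (HttpSolrClient client =
                    new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {

                ModifiableSolrParams params = new ModifiableSolrParams();
                params.set("command", "full-import");

                // Assumes the DataImportHandler is registered at its default
                // /dataimport path in solrconfig.xml.
                GenericSolrRequest request = new GenericSolrRequest(
                        SolrRequest.METHOD.GET, "/dataimport", params);
                System.out.println(client.request(request, "db"));
            }
        }
    }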
Related
Is there any way to replicate index data from AWS CloudSearch to Apache Solr hosted on EC2 in real time?
Not really. To make sure that all of your original documents are indexed correctly, you have to reindex them into Solr in their original form.
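Since there is no live replication path, the migration boils down to plain reindexing. A minimal SolrJ sketch of what that looks like; the host, collection, and field names here are hypothetical:

    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class ReindexIntoSolr {
        public static void main(String[] args) throws Exception {
            // Placeholder URL and collection; point this at your Solr on EC2.
            try (HttpSolrClient solr = new HttpSolrClient.Builder(
                    "http://ec2-host:8983/solr/mycollection").build()) {

                // In a real migration you would loop over the original documents
                // from your system of record (database, S3, files), not CloudSearch.
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "doc-1");       // hypothetical field names
                doc.addField("title", "Example");
                solr.add(doc);

                solr.commit();
            }
        }
    }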
I can create and restore Solr backups via CollectionAdminRequest.Backup and CollectionAdminRequest.Restore.
It looks like it's possible via the HTTP API, e.g.:
http://localhost:8983/solr/gettingstarted/replication?command=details&wt=xml
But is it possible to list all backups and drop one by name from SolrJ?
I'm using Solr 7.5.
From what I found, it's not possible to do this from SolrJ directly.
So I ended up working with HDFS directly: I configured Solr to use HDFS as backup storage, and from my code I access it via the HDFS client, which lets me list and remove backups.
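For reference, a rough sketch of that setup: the backup itself goes through SolrJ, while listing and deleting go through the HDFS client. The repository name, paths, and hosts below are assumptions from my own configuration, not Solr defaults:

    import java.net.URI;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.request.CollectionAdminRequest;

    public class BackupManager {
        public static void main(String[] args) throws Exception {
            // Placeholder locations: adjust the Solr URL, HDFS URI, backup path,
            // and repository name to match your solr.xml configuration.
            String backupDir = "/solr/backups";

            try (SolrClient solr =
                    new HttpSolrClient.Builder("http://localhost:8983/solr").build()) {
                // Creating a backup is the part SolrJ supports directly.
                CollectionAdminRequest.backupCollection("mycollection", "backup-2019-01-01")
                    .setRepositoryName("hdfs")      // repository defined in solr.xml
                    .setLocation(backupDir)
                    .process(solr);
            }

            // Listing and deleting have no SolrJ equivalent (as of 7.5), so go to
            // HDFS directly: each backup is a directory under the backup location.
            Configuration conf = new Configuration();
            try (FileSystem fs = FileSystem.get(URI.create("hdfs://namenode:8020"), conf)) {
                for (FileStatus status : fs.listStatus(new Path(backupDir))) {
                    System.out.println(status.getPath().getName());
                }
                // Drop a backup by name by removing its directory recursively.
                fs.delete(new Path(backupDir, "backup-2019-01-01"), true);
            }
        }
    }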
I was excited to hear that Solr 6 has an SQL interface, but soon found that it only works with SolrCloud, not with a single Solr instance. We currently have two Solr servers: a master production server that is replicated to a slave reporting server. I would love to be able to use SQL on the slave.
So a couple questions.
Is it actually possible to use SQL on a single Solr instance, and I just missed something?
If I need SolrCloud for SQL, how can I set that up while maintaining an architecture similar to what I have now? That is, I have only two hosts; all production traffic, including writes, goes to one host, and all background reporting goes to the other.
I welcome any other suggestions you might have.
This blog explains how SQL is integrated, and it looks as if you cannot do it without SolrCloud.
https://sematext.com/blog/2016/04/18/solr-6-solrcloud-sql-support/
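For completeness: once you are on SolrCloud, you can query the SQL interface from Java through Solr's JDBC driver, which connects via ZooKeeper. A minimal sketch, assuming a hypothetical collection named reports and a placeholder ZooKeeper address:

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class SolrSqlExample {
        public static void main(String[] args) throws Exception {
            // The Solr JDBC driver (in solr-solrj) connects through ZooKeeper;
            // "zkhost:2181" and "reports" are placeholders for your own setup.
            String url = "jdbc:solr://zkhost:2181?collection=reports";

            try (Connection conn = DriverManager.getConnection(url);
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery(
                         "SELECT id, title FROM reports LIMIT 10")) {
                while (rs.next()) {
                    System.out.println(rs.getString("id") + " " + rs.getString("title"));
                }
            }
        }
    }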
I have an HBase database that holds crawled information from wikipedia.org. My machine is on Amazon Web Services.
I have downloaded Solr, and I want to index the data in HBase so that I can then search it.
I am new to Solr and HBase; how can I do that?
All you have to do is run this command: sudo bin/nutch solrindex http://localhost:8983/solr/ -reindex
But before you do that, please make sure your Solr instance is up and running, which you can verify by visiting http://localhost:8983/solr/; if you see the Solr admin UI, your Solr instance is running.
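Note that nutch solrindex only applies if the HBase data was written by a Nutch crawl (via the Gora HBase backend). If you need to index arbitrary HBase rows into Solr yourself, a rough SolrJ sketch like the following would work; the table name, column family, qualifier, and core name are all hypothetical:

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.hbase.HBaseConfiguration;
    import org.apache.hadoop.hbase.TableName;
    import org.apache.hadoop.hbase.client.Connection;
    import org.apache.hadoop.hbase.client.ConnectionFactory;
    import org.apache.hadoop.hbase.client.Result;
    import org.apache.hadoop.hbase.client.ResultScanner;
    import org.apache.hadoop.hbase.client.Scan;
    import org.apache.hadoop.hbase.client.Table;
    import org.apache.hadoop.hbase.util.Bytes;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class HBaseToSolr {
        public static void main(String[] args) throws Exception {
            // Hypothetical table and column names; adjust to your crawl schema.
            Configuration conf = HBaseConfiguration.create();
            try (Connection hbase = ConnectionFactory.createConnection(conf);
                 Table table = hbase.getTable(TableName.valueOf("wikipedia"));
                 HttpSolrClient solr = new HttpSolrClient.Builder(
                         "http://localhost:8983/solr/wikipedia").build()) {

                try (ResultScanner scanner = table.getScanner(new Scan())) {
                    for (Result row : scanner) {
                        SolrInputDocument doc = new SolrInputDocument();
                        doc.addField("id", Bytes.toString(row.getRow()));
                        doc.addField("content", Bytes.toString(
                                row.getValue(Bytes.toBytes("p"), Bytes.toBytes("content"))));
                        solr.add(doc);
                    }
                }
                solr.commit();
            }
        }
    }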
How can I migrate Solr index files from one server to another?
Just copy your data directory, found under the path you configured as your Solr home, to wherever you want on the new server; this works because it is a Lucene index (see this). Or you can use the Solr backup tool.
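If you would rather not copy files by hand while Solr is running, the replication handler's backup command can produce a consistent snapshot first. A sketch of triggering it from SolrJ; the core name, host, and location are placeholders:

    import org.apache.solr.client.solrj.SolrRequest;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.client.solrj.request.GenericSolrRequest;
    import org.apache.solr.common.params.ModifiableSolrParams;

    public class CoreBackup {
        public static void main(String[] args) throws Exception {
            try (HttpSolrClient client =
                    new HttpSolrClient.Builder("http://oldserver:8983/solr").build()) {
                ModifiableSolrParams params = new ModifiableSolrParams();
                params.set("command", "backup");
                // The location must be a directory writable by the Solr process;
                // "mycore" is a placeholder for your core name.
                params.set("location", "/var/solr/snapshots");

                GenericSolrRequest request = new GenericSolrRequest(
                        SolrRequest.METHOD.GET, "/replication", params);
                System.out.println(client.request(request, "mycore"));
            }
        }
    }

The resulting snapshot directory can then be copied to the new server and placed under that core's data directory.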