Setting up Kibana with an Existing Solr Instance

I have an existing Solr instance (4.9) up and running with several cores. I've been trying to set up Kibana for the majority of the day and I can't figure out how to incorporate it with my instance of Solr. I'm running locally on Windows 7 for dev purposes, but production is Linux. I've read through here and here a few times and I'm not picking up on how to get this done. The banana project seemed like the easiest choice, but adding it to the banana/ directory did nothing. I started up LucidWorks but wasn't able to figure out how to get my existing cores into it. I have about 1.5TB of data across all of my cores (9 of them), so re-indexing is not an option.
Can someone provide me with resources or tutorials on how to incorporate Kibana with an existing Solr instance or a tutorial on how this is done?

If banana can fulfill your requirements, it is still the easiest choice. I recently set up banana for one of my Solr 5 cores (even though it was released for a 4.x series as far as I can tell), and also had the initial problem that "nothing happened".
Get it from https://github.com/LucidWorks/banana/ and then follow the instructions in the QUICKSTART file:
Copy Banana folder to $SOLR_HOME/example/solr-webapp/webapp/
Browse to http://localhost:8983/solr/banana/src/index.html#/dashboard
Perform the copy so that you end up with a banana folder inside webapp/; do not copy the contents of the banana folder directly into webapp/.
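As a concrete sketch of those two steps (this assumes the stock Solr 4.x example layout and a checkout of the banana repository; adjust the paths to your install):

    # get banana (clone or download a release)
    git clone https://github.com/LucidWorks/banana.git

    # copy the whole banana/ folder -- not just its contents -- into the webapp
    cp -r banana $SOLR_HOME/example/solr-webapp/webapp/

    # then point a browser at the dashboard
    #   http://localhost:8983/solr/banana/src/index.html#/dashboard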
Further information on the banana dashboard configuration and its options, which I found useful, is here: https://docs.lucidworks.com/display/SiLK/Dashboard+Configuration

Related

Change "Solr Cluster" in Lucidworks Fusion 4

I am running Fusion 4.2.4 with external Zookeeper (3.5.6) and Solr (7.7.2). I have been running a local set of servers and am trying to move to AWS instances. All of the configuration from my local Zookeepers has been duplicated to the AWS instances so they should be functionally equivalent.
I am to the point where I want to shut down the old (local) Zookeeper instances and just use the ones running in AWS. I have changed the configuration for Solr and Fusion (fusion.properties) so that they only use the AWS instances.
The problem I have is that the Solr cluster in Fusion (System->Solr Clusters) associated with all of my collections is still set to the old Zookeepers :9983,:9983,:9983, so if I turn off all of the old Zookeeper instances my queries through Fusion's Query API no longer work. When I try to change the "Connect String" for that cluster it fails because the cluster is currently in use by collections. I am able to create a new cluster, but there is no way that I can see to associate the new cluster with any of my collections. In a test environment set up similarly to production, I changed the searchClusterId for a specific collection using Fusion's Collections API; however, after doing so the queries still fail when I turn off all of the "old" Zookeeper instances. It seems like this is the way to go, so I'm surprised that it doesn't seem to work.
So far, Lucidworks's support has not been able to provide a solution - I am open to suggestions.
This is what I came up with to solve this problem.
I created a test environment with an AWS Fusion UI/API/etc., local Solr, AWS Solr, local ZK, and AWS ZK.
1. Configure Fusion and Solr to only have the AWS ZK configured
2. Configure the two ZKs to be an ensemble
3. Create a new Solr Cluster in Fusion containing only the AWS ZK
4. For each collection in Solr:
a. GET the JSON from <fusion_url>:8764/api/collections/<collection>
b. Edit the JSON to change "searchClusterId" to the new cluster defined in Fusion
c. PUT the new JSON back to <fusion_url>:8764/api/collections/<collection> (see the curl sketch after this list)
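A minimal sketch of step 4 with curl (the endpoint comes from the steps above; the credentials, the jq edit, and the file names are my own assumptions, not part of the original procedure):

    # a. fetch the collection definition (authentication details depend on your Fusion setup)
    curl -u admin:PASSWORD "http://<fusion_url>:8764/api/collections/<collection>" -o collection.json

    # b. change searchClusterId to the id of the new cluster (here a hypothetical "aws-cluster")
    jq '.searchClusterId = "aws-cluster"' collection.json > collection-new.json

    # c. put the edited definition back
    curl -u admin:PASSWORD -X PUT -H "Content-Type: application/json" \
         -d @collection-new.json "http://<fusion_url>:8764/api/collections/<collection>"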
After doing all of this, I was able to change the “default” Solr cluster in the Fusion Admin UI to confirm that no collections were using it (I wasn’t sure if anything would use the ‘default’ cluster so I thought it would be wise to not take the chance).
I was able to then stop the local ZK, put the AWS ZK in standalone mode, and restart Fusion. Everything seems to have started without issues.
I am not sure that this is the best way to do it, but it solved the problem as far as I could determine.

Solr luceneMatchVersion syntax

I have Solr 4.10 and I have a collection on it whose solrconfig.xml has the value for <luceneMatchVersion> as follows:
<luceneMatchVersion>4.7</luceneMatchVersion>
Is this correct? I saw other examples that have values such as LUCENE_35. What I also need to know is how I can derive the LUCENE_xx value from my current Solr version.
You should use:
<luceneMatchVersion>4.10.4</luceneMatchVersion>
I recommend checking your current Solr version; in my case it was 4.10.4.
If you are going to reindex, then both numbers should match. The only reason you might want them to differ is if you had an index created with, say, Lucene 4.7; then you would have
<luceneMatchVersion>4.7</luceneMatchVersion>
Then you upgrade Lucene to 4.10.
Now, if among the changes between 4.7 and 4.10 there are things that work differently regarding analysis (the same sentence analysed in both versions produces different output), you might want to keep the version number at 4.7; otherwise some queries containing the affected terms might not work, as those terms were analysed at index time in a different way than at query time. You have to assess how critical that issue might be.
That is why the recommendation is to upgrade, change the setting to the current number, and reindex. This way you are sure to avoid any issue.
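If you are unsure what your running instance reports, the admin system-info endpoint shows both the Solr and Lucene versions (the URL assumes the default local port):

    curl "http://localhost:8983/solr/admin/info/system?wt=json"
    # check lucene.solr-spec-version and lucene.lucene-spec-version in the response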
If anyone is using Drupal, the Search API Solr (search_api_solr) module has config templates by version in /sites/all/modules/search_api_solr/solr-conf/.
The template README.md states the following:
The solr-conf-templates directory contains config-set templates for different Solr versions.
These are templates and are not to be used as config-sets!
To get a functional config-set you need to generate it via the Drupal admin UI or with drush solr-gsc. See README.md in the module directory for details.
The module's README.md lists these instructions:
1. Make sure you have Apache Solr started and accessible (i.e. via port 8983). You can start it without having a core configured at this stage.
2. Visit Drupal configuration (/admin/config/search/search-api) and create a new Search API Server according to the search_api documentation, using "Solr" as Backend and the connector that matches your setup. Input the correct core name (which you will create at step 4, below).
3. Download the config.zip from the server's details page or by using drush solr-gsc with proper options, for example for a server named "my_solr_server": drush solr-gsc my_solr_server config.zip 8.4.
4. Copy the config.zip to the Solr server and extract it.
I generated a config file for 8.x, and it uses this:
<luceneMatchVersion>${solr.luceneMatchVersion:LUCENE_80}</luceneMatchVersion>
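As a rough command-line sketch of steps 3 and 4 above (the server name, target Solr version, host, and config-set path are all assumptions; adjust them to your environment):

    # 3. generate the config-set zip for a Search API server named "my_solr_server", targeting Solr 8.4
    drush solr-gsc my_solr_server config.zip 8.4

    # 4. copy it to the Solr box and unpack it into a config-set directory
    scp config.zip solr-host:/tmp/
    ssh solr-host 'mkdir -p /opt/solr/server/solr/configsets/drupal/conf'
    ssh solr-host 'unzip /tmp/config.zip -d /opt/solr/server/solr/configsets/drupal/conf'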

Getting Banana to work with Solr 4.2

I'm running Solr 4.2 and would like to try out LucidWorks Banana product. However, when I navigate to the banana directory on my Solr server, I receive a 404 error.
I'm following the instructions from their github site here, basically dropping the banana src directory into my SOLR_HOME\solr-webapp\webapp directory.
I've tried modifying the src\config.js and src\app\dashboards\default.json files as suggested by the readme file to change the localhost URL to the actual server name or the IP address. Both attempts still resulted in the 404 error.
Has anyone had luck in getting banana working with Solr 4.2? Is it not supported on this version of Solr? Hopefully I'm missing something really simple.
Thanks!
At LucidWorks, we have only tested Banana on Solr 4.4 and above, so I am not sure whether all the functionality will work with Solr 4.2. However, I do know of a couple of users on Solr 4.3 (who might have made some small code changes).
That said, the webapp is just javascript/html and should at least come up (maybe with errors on the dashboard, but not a 404), even without modifying config.js/default.json.
Just to confirm your process (from https://github.com/LucidWorks/banana/blob/release/QUICKSTART):
Copy Banana folder to $SOLR_HOME/example/solr-webapp/webapp/
Browse to http://localhost:8983/solr/banana/src/index.html#/dashboard
If you dropped the src folder (and not the banana folder), then you will need to browse to http://solrserver.yourdomain.com:8983/solr/src/index.html#/dashboard
I figured this out thanks to Ravi's suggestion that made me re-examine my configuration.
Turns out my web app was actually running from C:\Windows\solr-webapp, and not $SOLR_HOME\solr-webapp\webapp. I missed that my CWD was set to C:\Windows even though everything else was pointing to $SOLR_HOME directories.
Once I dropped the Banana folder into C:\Windows\solr-webapp\webapp, I was able to bring up the dashboard. Now I have other errors, but that's a different set of issues.
Hope this helps someone. TL;DR - make sure your CWD is consistent with the destination of banana.

Install Jetty or run embedded for Solr install

I am about to install Solr on a production box. It will be the only Java application running, and it will sit on the same box as the web server (nginx).
It seems there are two options.
1. Install Jetty separately and configure it for use with Solr
2. Set Solr's embedded Jetty server to start as a service and just use that
Is there any performance benefit in having them separate?
I am a big fan of KISS, the less setup the better.
Thanks
If you want KISS there is no question: option 2. Stick to the vanilla Solr distribution with its included Jetty.
Doing the work of installing an external servlet engine would make sense if you needed Tomcat, for example, but not just to run the same thing (Jetty) that Solr already includes.
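For reference, option 2 on a Solr 4.x distribution comes down to the following (a minimal sketch; turning it into a proper service/init script is platform-specific and left out here):

    cd $SOLR_HOME/example
    java -jar start.jar
    # Solr then answers on http://localhost:8983/solr/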
Solr still ships with Jetty 6, so there could be some benefit if you can get the Solr application to run in a recent Jetty distribution. For example, you could use Jetty 9 and features like SPDY to improve the response times of your application.
However, I have no experience with whether it is actually possible to run the Solr application in a separate servlet engine like that.
Another option for running Solr and keeping it simple is Solr-Undertow, a high-performance, small-footprint server for Solr. It is easy to use on local machines for development and also in production. It supports simple config files for running instances with different data directories, ports and more. It can also run by pointing it at a distribution .zip file without needing to unpack it.
(note, I am the author of Solr-Undertow)
Link here: https://github.com/bremeld/solr-undertow with releases under the "Releases" tab.

Apache Solr setup for two different projects

I just started using Apache Solr with its data import functionality on my project, following the steps in http://tudip.blogspot.in/2013/02/install-apache-solr-on-ubuntu.html.
Now I have to set up two different instances of my project on the same server, with different databases but with the same Solr configuration for both projects. How can I do that?
Can anyone help?
Probably the closest you can get is having two different Solr cores. They will run under the same server but will have different configuration (which you can copy paste).
When you say "different databases", do you mean you want to copy from several databases into one join collection/core? If so, you just define multiple entities and possibly multiple datasources in your DataImportHandler config and run either all of them or individually. See the other question for some tips.
If, on the other hand, you mean different Solr cores/collections, then you just want to run Solr with multiple cores. It is very easy; you just need a solr.xml file above your collection level, as described on the Solr wiki. Once you get the basics, you may want to look at sharing instance directories and having separate data directories to avoid changing the same config twice (the instanceDir vs. dataDir settings on each core).
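For the multi-core route on Solr 4.x, a minimal legacy-style solr.xml sketch that shares one config directory while keeping separate indexes might look like this (core names and paths are placeholders):

    <solr persistent="true">
      <cores adminPath="/admin/cores">
        <!-- both cores share one instanceDir (config) but keep separate dataDirs (indexes) -->
        <core name="project_a" instanceDir="shared_conf" dataDir="/var/solr/data/project_a"/>
        <core name="project_b" instanceDir="shared_conf" dataDir="/var/solr/data/project_b"/>
      </cores>
    </solr>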
