How do you change the default location of all your data / cores under Solr?

How do you change the default location of where your data is kept under Solr?
On an AWS setup, everything I create goes into /var/solr/data.
In what config file is this default location stored?
I'd like to point it at the /data directory, which is a 100GB mounted hard drive.

Aside from editing each solrconfig.xml for each collection (or index), there is a workaround that I've come up with in past installations.
Instead of editing four files to point to the new location, it is easier to simply "trick" SOLR into storing on your mounted drive using a symlink.
You will point your SOLR directory to your mounted drive:
/var/solr/data -> /data/whatever/directory
Copy your SOLR files:
cp -r /var/solr/data /data/whatever/directory
Back up your current data:
mv /var/solr/data/ /var/solr/data_backup
Create your symlink (target -> symlink):
ln -s /data/whatever/directory /var/solr/data
After all is said and done, you probably need to repair permissions, setting ownership to SOLR and ensuring that the linking worked correctly. Inside the /var/solr directory you can run ls -lah, and you should be able to see whether the link is correctly routed. If it is not, it will be highlighted in red on most Debian systems. It should look something like:
lrwxrwxrwx 1 solr solr 31 Apr 30 2021 data -> /data/whatever/directory
Once all finished up, restart the SOLR service and re-index your collections.
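Putting the whole move together, it looks something like this (a sketch only; /data/whatever/directory stands in for your mounted path, and solr:solr assumes the default service user):

sudo service solr stop                               # stop SOLR before touching the data
sudo cp -a /var/solr/data /data/whatever/directory   # recursive copy preserving ownership (destination must not exist yet)
sudo mv /var/solr/data /var/solr/data_backup         # keep the original as a backup
sudo ln -s /data/whatever/directory /var/solr/data   # target -> symlink
sudo chown -R solr:solr /data/whatever/directory     # repair permissions for the solr user
sudo service solr start
ls -lah /var/solr                                    # the data link should resolve, not show red

As for the config-file question: on a stock service install the data root normally comes from the SOLR_HOME variable in /etc/default/solr.in.sh, so pointing that at the mounted drive is the non-symlink alternative.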

Related

Solr Error: Unable to create core [mycore] Caused by solr.ICUCollationField

I am trying to create a Solr core. I am using Drupal VM with Vagrant and VirtualBox.
When setting up solr with this command:
sudo su - solr -c "/opt/solr/bin/solr create -c m4m -d /tmp/search_api_solr/solr-conf/7.x/"
I am getting this error:
INFO - 2018-11-05 19:21:45.804; org.apache.solr.util.configuration.SSLCredentialProviderFactory; Processing SSL Credential Provider chain: env;sysprop
ERROR: Error CREATEing SolrCore 'mycore': Unable to create core [mycore] Caused by: solr.ICUCollationField
Creating a core without specifying the -d <confdir> option is successful, but gives me some really weird errors in the Solr dashboard and Drupal UI, which research indicates has something to do with a corrupted core.
Any help with why I am getting this error would be much appreciated. Other developers using the same Vagrant installation are running without issue.
If you create the core without the config directory, Solr will use its default configuration.
Which, in turn, will have none of the Drupal-needed field definitions, and so forth.
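For context, the Drupal-supplied schema declares collation field types along these lines (the field name here is illustrative, not copied from your config, but the class is what drags in the ICU dependency):

<fieldType name="collated_und" class="solr.ICUCollationField" locale="" strength="primary"/>

Without the analysis-extras jars on the classpath, that class can't be loaded, hence the error.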
What you need to do, if you know a little bit about Solr's structure and you use Solr version 7 or later, is:
Go to where your Solr installation is:
cd /PATH_TO_SOLR/server/solr-webapp/webapp/WEB-INF/lib
Copy all jars from the contrib/analysis-extras/lib folder to your WEB-INF/lib folder:
cp /PATH_TO_SOLR/contrib/analysis-extras/lib/*.jar ./
Restart Solr the way you normally do, specifying your -d config directory. That's important.
Hope this helps.
OR...
Save yourself the hassle and let the pros handle all this for you with a SaaS such as https://opensolr.com.
You can create your Solr index with one click, and you need two more clicks to upload your config files and you're done.
I needed jars from two directories:
cd /PATH_TO_SOLR
cp solr/contrib/analysis-extras/lib/*.jar solr/server/solr-webapp/webapp/WEB-INF/lib/
cp solr/contrib/analysis-extras/lucene-libs/*.jar solr/server/solr-webapp/webapp/WEB-INF/lib/
See solr/contrib/analysis-extras/README.txt.
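With the jars in place, restart Solr and re-run the create command from the question (assuming the stock solr service); it should now succeed:

sudo service solr restart
sudo su - solr -c "/opt/solr/bin/solr create -c m4m -d /tmp/search_api_solr/solr-conf/7.x/"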

How to correctly add additional SOLR 5 (vm) nodes to SOLR Cloud

I have a SOLR / Zookeeper / Kafka setup, each on separate VMs.
I have successfully run this all using two SOLR 4.9 VMs (Ubuntu).
Now I wish to build two SOLR 5.4 VMs and get it all working again.
Essentially, "Upgrade by Replacement".
I have "hacked" a solution to my problem but that makes me very nervous.
To begin, Zookeeper is running. I turn off my SOLR 4.9 vms and delete the config out of Zookeeper (not necessarily in that order... ;-) )
Now, I start up my 'solr5' VM (and SOLR in cloud mode) where I have installed SOLR 5.4 according to the "Production Install" instructions on the SOLR Wiki. I have also installed 5.4 on 'solr6', but it's not running yet.
I issue this command on the 'solr5' machine:
/opt/solr/bin/solr create -c fooCollection -d /home/john/conf -shards 1 -replicationFactor 1
and I get the following output:
Connecting to ZooKeeper at 192.168.56.5,192.168.56.6,192.168.56.7/solr ...
Re-using existing configuration directory statdx
Creating new collection 'fooCollection' using command:
http://localhost:8983/solr/admin/collections?action=CREATE&name=fooCollection&numShards=1&replicationFactor=1&maxShardsPerNode=1&collection.configName=fooCollection
{
  "responseHeader":{
    "status":0,
    "QTime":3822},
  "success":{"":{
      "responseHeader":{
        "status":0,
        "QTime":3640},
      "core":"fooCollection_shard1_replica1"}}}
Everything is working great. I turn on my microservice, and it pumps all my SOLR docs from Kafka into 'solr5'.
Now, I want to add 'solr6' to the collection. I can't find a way to do this besides my hack (which I'll describe later).
The command I used before to create a collection errors out with the observation that my collection already exists.
There seems to be no zkcli.sh or solr command that will do what I want. None of the api commands seem to do this either.
Is there not a simple way to tell (SOLR? Zookeeper?) that I want to add another machine to my SOLR nodes, and have it configured like the first (solr5) and begin replicating data?
Maybe I should have had both machines running when I issued the create command?
I'd be grateful for some "approved" method for doing this since I need to come up with a "solution" to do the same kind of approach in Prod every time there is a need to upgrade SOLR.
Now for my hack. Keep in mind I'm now two days trying to find clear docs on this. No flames please, I totally get that this is not the way to do things. At least, I HOPE this is not the way to do things...
Copy the fooCollection directory from where the create collection command put it on 'solr5' (which was /opt/solr/server/solr/fooCollection_shard1_replica1) to the same location on my 'solr6' VM.
Rename the collection directory as seems logical (it becomes fooCollection_shard1_replica2).
Make what changes seem logical in the core.properties file:
For reference, here's the core.properties file that was created by the create command.
#Written by CorePropertiesLocator
#Wed Jan 20 18:59:08 UTC 2016
numShards=1
name=fooCollection_shard1_replica1
shard=shard1
collection=fooCollection
coreNodeName=core_node1
Here is what the file looked like on 'solr6' when I was done hacking.
#Written by CorePropertiesLocator
#Wed Jan 20 18:59:08 UTC 2016
numShards=1
name=fooCollection_shard1_replica2
shard=shard1
collection=fooCollection
coreNodeName=core_node2
When I did this and rebooted 'solr6' everything appeared golden. The "Cloud" web page looked right in the Admin web page - and when I added documents to 'solr5' they were available in 'solr6' if I hit it directly from the Admin web pages.
I would be grateful if someone can tell me how to achieve this without a hack like this... or if this IS the right way to do this...
=============================
In answer to #Mani and the suggested procedure
Thanks Mani - I did try this very carefully following your steps.
In the end, I get this output from the collection status query:
john@solr6:/opt/solr$ ./bin/solr healthcheck -z 192.168.56.5,192.168.56.6,192.168.56.7/solr5_4 -c fooCollection
{
  "collection":"fooCollection",
  "status":"healthy",
  "numDocs":0,
  "numShards":1,
  "shards":[{
      "shard":"shard1",
      "status":"healthy",
      "replicas":[{
          "name":"core_node1",
          "url":"http://192.168.56.15:8983/solr/fooCollection_shard1_replica1/",
          "numDocs":0,
          "status":"active",
          "uptime":"0 days, 0 hours, 6 minutes, 24 seconds",
          "memory":"31 MB (%6.3) of 490.7 MB",
          "leader":true}]}]}
This is the kind of result I've been finding in my experimentation all along. The core will get created on one of the SOLR VM's (the one I issue the command line to create the collection on) but I don't get anything created on the other VM -- which, based on your steps below, I believe you also thought should occur, yes?
Also, I'll note for anyone reading that in 5.4, the command is "healthcheck" and not healthstatus. The command line shows you immediately, so it's no big deal.
===============
Update 1 :: Manual add of 2nd core
If I go to the other VM and manually add the following:
sudo mkdir /opt/solr/server/solr/fooCollection_shard1_replica2
sudo mkdir /opt/solr/server/solr/fooCollection_shard1_replica2/data
nano /opt/solr/server/solr/fooCollection_shard1_replica2/core.properties
(in here I add only collection=fooCollection and then save/close)
Then I reboot my SOLR server on that same VM:
sudo /opt/solr/bin/solr restart -c -z zoo1,zoo2,zoo3/solr
I will find a second node magically appearing in my Admin console. It will be a "follower" (i.e. not the leader), and both will branch off "shard1" in the cloud UI.
I don't know if this is "the way" but it's the only way I've found so far. I'm going to reproduce to that point and try with the Admin UI and see what I get. That would be a little easier for my IT guys when the time comes - if it works.
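A note for scripting this: the CoreAdmin API should be the HTTP equivalent of the manual mkdir/core.properties route; a sketch, with the host and names assumed from this setup rather than verified:

http://192.168.56.16:8983/solr/admin/cores?action=CREATE&name=fooCollection_shard1_replica2&collection=fooCollection&shard=shard1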
===============
Update 2 :: Slight modification of create command
#Mani -- I believe I have success following your steps - and like many things, it's simple once you understand.
I reset everything (deleted directories, cleared out Zookeeper with rmr /solr) and redid everything from scratch.
I changed the "create" command slightly thus:
./bin/solr create -c fooCollection -d /home/john/conf -shards 1 -replicationFactor 2
Note the "replicationFactor 2" rather than 1.
Suddenly I did indeed have cores on both VMs.
A couple of notes:
I found that I couldn't get a happy result from the status call just by starting the SOLR 5.4 servers in Cloud mode with the Zookeeper IP addresses. The "node" in Zookeeper was not yet created.
The create command also failed at that point.
The way I found around this was to use the zkcli.sh to load the configs like this:
sudo /opt/solr/server/scripts/cloud-scripts/zkcli.sh -cmd upconfig -confdir /home/john/conf/ -confname fooCollection -z 192.168.56.5/solr
When I checked Zookeeper immediately after running this command, there was a /solr/configs/fooCollection "path".
NOW the create command works and I assume that if I had wanted to override the configs, I could have done so at that point although I haven't tried.
I'm not positive at what point, but it seems I needed to restart the SOLR servers (probably after the create command) in order for the status call and so on to find everything. I may be misremembering that because I've been through it so many times; if in doubt after the create command, try restarting the servers. (The -z argument can be IP addresses or names that resolve correctly.)
sudo /opt/solr/bin/solr restart -c -z zoo1,zoo2,zoo3/solr
sudo /opt/solr/bin/solr restart -c -z 192.168.56.5,192.168.56.6,192.168.56.7/solr
After doing these slight modifications to #Mani's recommended procedure, I get a Leader and a "follower", each on a different VM, in the /opt/solr/server/solr directory (fooCollection in this case), and I was able to send data into one and search the other via the Admin console, hitting the IP addresses.
=============
Variations
One thing anyone reading this may want to try is simply making another "node" in Zookeeper (solr5_4 for example).
I tried this and it works like a charm. Everywhere you see the /solr chroot associated with the Zookeeper ensemble, you could replace it with /solr5_4. This would allow the older SOLR VM's to keep functioning in Prod while you build out your new SOLR 5.4 "environment" and the same Zookeeper VM's could be used for both -- because a different chroot should guarantee no interaction or overlap.
Again, the "node" in Zookeeper won't be created until you do the config upload, but you need to start your SOLR process like this or you'd be in the wrong context later on. Note the "solr5_4" as the chroot.
sudo /opt/solr/bin/solr restart -c -z zoo1,zoo2,zoo3/solr5_4
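The matching config upload under the new chroot is the same zkcli command shown earlier, with /solr5_4 substituted for /solr:

sudo /opt/solr/server/scripts/cloud-scripts/zkcli.sh -cmd upconfig -confdir /home/john/conf/ -confname fooCollection -z 192.168.56.5/solr5_4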
Once done with testing, the solr5_4 "environment" becomes what matters for Prod and the SOLR 4.x VM's and Zookeeper "node" of solr can be removed. It should be a fairly simple matter to point a load balancer at the new SOLR VM's and do a switchover without users really even noticing.
This strategy will work for SOLR 6, 6.5, 7, and so on.
This command also worked to add the collections/cores; however, the Solr server had to be running first.
http://192.168.56.16:8983/solr/admin/collections?action=CREATE&name=fooCollection&numShards=1&replicationFactor=2&collection.configName=fooCollection
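Worth noting: the Collections API also has an ADDREPLICA action aimed at exactly the original question (attaching a new replica to an existing collection without touching core.properties). A sketch, assuming the node name follows Solr's host:port_solr convention:

http://192.168.56.15:8983/solr/admin/collections?action=ADDREPLICA&collection=fooCollection&shard=shard1&node=192.168.56.16:8983_solr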
==================
Use as Upgrade By Replacement
In case it's not obvious, this technique (especially if using a "new" chroot in Zookeeper of something like /solr5_4 or similar) gives you the luxury of leaving your older version of SOLR running for as long as you want, allowing a re-indexing of all your data to take days if needed.
I haven't tried, but I'm guessing a backup of the index could be dropped into the new machines as well.
I just wanted readers to understand that this was an approach intended to make upgrades really low stress and straightforward. (Don't need to upgrade in place, just build new VMs and install latest version of SOLR.)
This would allow the switch-over to occur without affecting prod until you're ready to drop the hammer and re-direct your load balancer at the new SOLR ip addresses (Which you will have already tested of course...)
The one assumption here is that you have the resources to bring up a set of SOLR VMs or physical servers to match whatever you already have in Production. Obviously, if you're resource-limited to only the boxes or VMs you have, upgrade-in-place may be your only option.
This is how I would do it. I am assuming that you have the luxury of downtime and the ability to completely reindex the documents, since you are essentially upgrading from 4.9 to 5.4.
Stop the 4.9 solr nodes and uninstall solr.
Remove the config from zk nodes using zkcli.sh with the clear command.
Install the solr on both solr5 & solr6 vm
Start both the solr nodes and make sure both can talk to zk. =>
On solr5 vm ./bin/solr start -c -z zk1:port1,zk2:port1,zk3:port1
On solr6 vm ./bin/solr start -c -z zk1:port1,zk2:port1,zk3:port1
Verify the status of Solrcloud using ./bin/solr status => this should return liveNodes as 2
Now create the fooCollection using the Collections API from any one of the Solr nodes. This uploads the configsets to Zookeeper and also creates the collection =>
./bin/solr create -c fooCollection -d /home/john/conf -shards 1 -replicationFactor 1
Verify the healthstatus of the fooCollection =>
./bin/solr healthstatus -z zk1:port1,zk2:port1,zk3:port1 -c fooCollection
Now verify the config is present in Zookeeper by checking Solr-AdminConsole -> CloudSection -> Tree .. /configs
And also check the CloudSection -> Graph showing the active status on the nodes. That indicates that everything is good.
Now start pushing documents into the collection
The wiki below is very helpful for doing the above.
https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Reference

Solr 5: not loading shards from symlinked directories

I've tried to upgrade from Solr 4.10.3 to 5.4.1. Solr shards are placed on different disks, and symlinks (ln -s) are created in SOLR_HOME (SOLR_HOME itself is set as an absolute path and works fine).
When Solr starts, it loads only shards placed in the home directory, but not symlinked ones.
If I copy a shard to the home directory (the file system path remains unchanged, like SOLR_HOME/my_shard1, for both the symlinked and the copied versions), it works.
Are there any ways to overcome this issue?
CorePropertiesLocator does NOT follow symlinks. It's a known bug that appeared in 5.4 and will be fixed in the following release (and there is a ready patch to fix 5.4):
https://issues.apache.org/jira/browse/SOLR-8548
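Until you are on a release with the fix (or have applied the patch), one workaround that sidesteps symlinks entirely is a bind mount, which core discovery sees as a plain directory. A sketch, assuming a shard living on /mnt/disk2:

sudo mkdir "$SOLR_HOME/my_shard1"                          # plain directory inside SOLR_HOME
sudo mount --bind /mnt/disk2/my_shard1 "$SOLR_HOME/my_shard1"   # shard data stays on the other disk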

Subversion Repository has weird files in it

I am managing a TortoiseSVN application rollout for a number of software developers.
An administrator has created a test repository for me to test before rollout, and there are a number of directories here that I didn't create (or at least I don't remember creating), nor do I have access to modify the files in those directories.
Here are the directories: conf, db, format and hooks
There was a post where someone else ran into something like this (however, I don't have enough posts to comment on theirs, so I had to create a new post).
Unusual Subversion Folders Appeared After Update
I'm not understanding how a repository could have been created on the Subversion server from TortoiseSVN. I don't have access to the server, just the application. I'm not able to right-click from Windows 7 or the TortoiseSVN Repository Browser and create a new repository within a repository.
And can I go ahead and remove these files and not cause any issues?
the directories: conf, db, format and hooks
That's the server-side tree; you don't have to worry about it. For client-side usage, all you need to know (as the SVN admin will tell you) is the URL of the repository; how it maps to the physical tree on the server isn't your question or your headache.
how a repository could have been created to the Subversion server from TortoiseSVN
No way. TortoiseSVN is a pure client-side tool, and creating a repository is a server-side administrative job (unless you have some server space mounted as a local drive, in which case you can use "Create repository here" on any empty folder):
>dir /b /s
z:\Repo
After the "Create repository here" magic is cast on z:\Repo:
>dir /b /s
z:\Repo
z:\Repo\locks
z:\Repo\hooks
z:\Repo\conf
z:\Repo\README.txt
z:\Repo\db
z:\Repo\format
z:\Repo\svn.ico
...
but from the client side it's just:
>svn ls -R -v file:///Z:/Repo
1 Badger Jun 28 01:19 ./
1 Badger Jun 28 01:19 branches/
1 Badger Jun 28 01:19 tags/
1 Badger Jun 28 01:19 trunk/
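For reference, on the server itself an administrator creates that same layout (conf, db, format, hooks and so on) with svnadmin rather than TortoiseSVN:

svnadmin create /path/to/Repo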
can I go ahead and remove these files and not cause any issues?
It depends. Where did you get these files? Inside a working copy?

From where does Nagios run custom plugins?

I am using Nagios 3.2 for monitoring. I have a custom plugin which I have placed in
/usr/local/nagios/libexec for Nagios monitoring.
My custom plugin reads a configuration file to function properly, and this configuration file should be in the same directory.
From this directory (../nagios/libexec), I am able to execute the binary.
However, when Nagios tries to run it, it is not able to read the associated configuration file.
Troubleshooting tried:
1) I have given full privileges to both the binary and the configuration file:
-rwsrwxrwx 1 root root 2102 Mar 7 04:53 ------.properties
-rwsrwxrwx 1 root root 2079462 Mar 6 12:03 binary
Please let me know if Nagios runs custom plugins from some other directory, or offer any other suggestions.
Thanks,
Ruchir
Check the $USER1$ variable in /usr/local/nagios/etc/resource.cfg. It points to the plugin directory.
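In a default install that line typically reads:

$USER1$=/usr/local/nagios/libexec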
Does your plugin need any privileges to access a specific dir or something? Maybe the nagios user doesn't have access to it, or you need to add nagios to sudoers.
I was able to find out by replacing my plugin with a script that prints its working directory (pwd),
and found that the Nagios daemon runs it from the / (root) directory.
So I placed my configuration file there and it worked.
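For anyone wanting to reproduce the diagnosis, the stand-in "plugin" needs to be nothing more than something like this (check_cwd is a hypothetical name):

#!/bin/sh
# check_cwd - report the working directory Nagios runs plugins from
echo "OK - running from $(pwd)"
exit 0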
Thanks everyone for your suggestions!!!
What language is this plugin written in (this sometimes makes a difference in how the plugin will handle your environment vars)? Have you tried using the FULL path to your configuration file in the plugin (not just "./conffile")? If you su to the Nagios user and attempt to execute said plugin (with config), does it work?
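On the full-path point: rather than depending on the daemon's working directory at all, the plugin can resolve the config next to its own binary. A sketch, with myplugin.properties as a hypothetical file name:

#!/bin/sh
# resolve the config file relative to this script's own location,
# not the working directory the daemon happens to use
CONF="$(cd "$(dirname "$0")" && pwd)/myplugin.properties"
[ -r "$CONF" ] || { echo "UNKNOWN - cannot read $CONF"; exit 3; }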
