Solr 9.1 Collection creation

I've installed ZooKeeper 3.7.1 and Solr 9.1 on three EC2 instances running Ubuntu 22.04.1 LTS in a SolrCloud deployment. The zoo.cfg is as follows:
tickTime=2500
dataDir=/zookeeper
clientPort=2181
maxClientCnxns=80
initLimit=10
syncLimit=5
server.1=10.9.9.x:2888:3888
server.2=10.9.10.y:2888:3888
server.3=10.9.13.z:2888:3888
4lw.commands.whitelist=*
The Solr deployment is almost straight out of the box. The solr.xml is unmodified. Here is the solrcloud section:
<solrcloud>
<str name="host">${host:}</str>
<int name="hostPort">${solr.port.advertise:0}</int>
<str name="hostContext">${hostContext:solr}</str>
<bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
<int name="zkClientTimeout">${zkClientTimeout:30000}</int>
<int name="distribUpdateSoTimeout">${distribUpdateSoTimeout:600000}</int>
<int name="distribUpdateConnTimeout">${distribUpdateConnTimeout:60000}</int>
<str name="zkCredentialsProvider">${zkCredentialsProvider:org.apache.solr.common.cloud.DefaultZkCredentialsProvider}</str>
<str name="zkACLProvider">${zkACLProvider:org.apache.solr.common.cloud.DefaultZkACLProvider}</str>
<str name="zkCredentialsInjector">${zkCredentialsInjector:org.apache.solr.common.cloud.DefaultZkCredentialsInjector}</str>
<bool name="distributedClusterStateUpdates">${distributedClusterStateUpdates:false}</bool>
<bool name="distributedCollectionConfigSetExecution">${distributedCollectionConfigSetExecution:false}</bool>
</solrcloud>
The config in /etc/default/solr.in.sh is as follows:
SOLR_JETTY_HOST="0.0.0.0"
ZK_HOST="10.9.9.x:2181,10.9.13.y:2181,10.9.10.z:2181"
SOLR_JAVA_MEM="-Xms2G -Xmx4G"
SOLR_PID_DIR="/srv/apps_data/solrcloud"
SOLR_HOME="/srv/apps_data/solrcloud/data"
LOG4J_PROPS="/srv/apps_data/solrcloud/log4j2.xml"
SOLR_LOGS_DIR="/srv/apps_data/solrcloud/logs"
SOLR_PORT="8983"
# The following lines added by ./solr for enabling BasicAuth
SOLR_AUTH_TYPE="basic"
SOLR_AUTHENTICATION_OPTS="-Dsolr.httpclient.config=/srv/apps_data/solrcloud/data/basicAuth.conf"
I enabled basic authentication and everything looks good in all three Admin UIs. Zookeeper status is good and the three servers are now deployed in solrcloud mode. The security section is as follows:
So far so good.
Next, I create a configset with a request authenticated by the basic authentication credentials, http://solr:solr@10.9.9.x:8983/solr/admin/configs?action=UPLOAD&name=calls, with a zip file containing the two files
managed-schema.xml
solrconfig.xml
When I look at it under the zookeeper tree, /configs/calls, I see a little annotation {"trusted":true}. This all seems good so far.
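For anyone scripting this upload step, here is a minimal Python sketch. It is an illustration, not the exact client used above: the helper names build_configset_zip and build_upload_request are hypothetical, the host is a placeholder, the file contents are stubs, and it assumes the two config files sit at the root of the zip.

```python
import base64
import io
import urllib.parse
import urllib.request
import zipfile


def build_configset_zip(files):
    """Return zip bytes with each file at the archive root (no parent folder)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, content in files.items():
            zf.writestr(name, content)
    return buf.getvalue()


def build_upload_request(base_url, name, zip_bytes, user, password):
    """Build (but do not send) a configset UPLOAD request with Basic auth."""
    query = urllib.parse.urlencode({"action": "UPLOAD", "name": name})
    req = urllib.request.Request(
        f"{base_url}/solr/admin/configs?{query}", data=zip_bytes, method="POST"
    )
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req.add_header("Authorization", f"Basic {token}")
    req.add_header("Content-Type", "application/octet-stream")
    return req


# Stub contents for illustration; in practice read the real files from disk.
payload = build_configset_zip(
    {"managed-schema.xml": "<schema/>", "solrconfig.xml": "<config/>"}
)
request = build_upload_request(
    "http://solr.example:8983", "calls", payload, "solr", "solr"
)
print(request.get_method(), request.full_url)
# urllib.request.urlopen(request) would send it to a live node.
```

Sending the credentials in an Authorization header rather than in the URL keeps them out of access logs and shell history.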
The problem comes when trying to create a collection. I use the collections API V1, with 3 shards and 2 replicas, using the async approach
http://solr:solr@10.9.9.x:8983/solr/admin/collections?action=CREATE&name=calls&numShards=3&replicationFactor=2&collection.configName=calls&async=123456
"1234562330250161386315": {
"responseHeader": {
"status": 0,
"QTime": 0
},
"STATUS": "failed",
"msg": "Error CREATEing SolrCore 'calls_shard1_replica_n1': Unable to create core [calls_shard1_replica_n1] Caused by: solr.XSLTResponseWriter"
},
What am I missing here? My research indicates that the issue with the XSLTResponseWriter arises because of a lack of trust or authentication. What is the correct way to configure Solr 9.1 to allow collections to be created?
Any help will be greatly appreciated!!
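Since the request is asynchronous, the outcome has to be fetched afterwards with the Collections API REQUESTSTATUS action. The sketch below shows one way to build that polling URL and to pull failure messages out of a response shaped like the fragment above. The helper names and the host are hypothetical, and the enclosing JSON object around the task entry is an assumption, since only a fragment of the response is shown.

```python
import urllib.parse


def request_status_url(base_url, request_id):
    """URL for polling an async Collections API call via REQUESTSTATUS."""
    query = urllib.parse.urlencode(
        {"action": "REQUESTSTATUS", "requestid": request_id}
    )
    return f"{base_url}/solr/admin/collections?{query}"


def failed_tasks(response):
    """Collect {task_id: msg} for every entry whose STATUS is 'failed'."""
    return {
        task_id: entry.get("msg", "")
        for task_id, entry in response.items()
        if isinstance(entry, dict) and entry.get("STATUS") == "failed"
    }


# Sample shaped like the fragment in the question; the wrapper dict is assumed.
sample = {
    "1234562330250161386315": {
        "responseHeader": {"status": 0, "QTime": 0},
        "STATUS": "failed",
        "msg": "Error CREATEing SolrCore 'calls_shard1_replica_n1': "
               "Unable to create core [calls_shard1_replica_n1] "
               "Caused by: solr.XSLTResponseWriter",
    }
}

print(request_status_url("http://solr.example:8983", "123456"))
for task_id, msg in failed_tasks(sample).items():
    print(task_id, "->", msg)
```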

I figured out the issue. The release notes for 9.0 include this passing comment:
To improve security, XSLTResponseWriter has been moved to the
scripting Module instead of shipping as part of Solr core. This
module needs to be enabled explicitly.
In my deployment, collection creation in cloud mode did not work at all until the out-of-the-box configuration was corrected. The required change is to enable the scripting module, which on my Ubuntu deployment is done in /etc/default/solr.in.sh.
The following lines appear in the freshly installed solr.in.sh file (in /etc/default):
# The bundled plugins in the "modules" folder can easily be enabled as a comma-separated list in SOLR_MODULES variable
# SOLR_MODULES=extraction,ltr
To be able to create collections with this configset, the following line needs to be included:
SOLR_MODULES="extraction,ltr,scripting"
With this change, solrcloud works as expected!
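If the same edit has to be applied on several nodes, it can be scripted. The following is a rough Python sketch that works on the text of solr.in.sh; the function name enable_module is hypothetical, and it assumes the file contains at most one uncommented SOLR_MODULES line.

```python
import re

# Matches an active (uncommented) SOLR_MODULES assignment, quoted or not.
_MODULES_LINE = re.compile(r'^SOLR_MODULES="?([^"\n]*)"?[ \t]*$', re.MULTILINE)


def enable_module(solr_in_sh, module):
    """Return solr.in.sh text with `module` listed in an active SOLR_MODULES line."""
    match = _MODULES_LINE.search(solr_in_sh)
    if match is None:
        # No active setting (the shipped line is commented out): append one.
        sep = "" if solr_in_sh.endswith("\n") else "\n"
        return f'{solr_in_sh}{sep}SOLR_MODULES="{module}"\n'
    modules = [m for m in match.group(1).split(",") if m]
    if module not in modules:
        modules.append(module)
    joined = ",".join(modules)
    new_line = f'SOLR_MODULES="{joined}"'
    return solr_in_sh[: match.start()] + new_line + solr_in_sh[match.end():]


original = '# SOLR_MODULES=extraction,ltr\nSOLR_PORT="8983"\n'
print(enable_module(original, "scripting"))
```

The commented-out sample line is left untouched, and running the function twice does not duplicate the module. Solr reads solr.in.sh at startup, so each node needs a restart after the change.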

Related

What determines whether InstanceDir has a full or relative path?

With Solr 4.x, http://localhost:8983/solr/admin/cores returns an XML description of loaded cores, which indicates the file path location of the instanceDir.
...
<lst name="collection1">
<str name="name">collection1</str>
<bool name="isDefaultCore">true</bool>
<str name="instanceDir">C:\solr\solr-4.10.1\example\solr\collection1\</str>
...
On my Windows 7 PC, this is presented as a full path, but others have reported this as relative path. What factors can cause this value to be presented as a relative path, and is there a way to force this to be presented as a full path?
Can you confirm whether you have set solr.home? Please check the Solr Wiki for more details. Setting solr.home should resolve the issue.
You can add it as a JVM argument:
java -Dsolr.solr.home=/your/solr/home/path/here -jar start.jar
In the case of Tomcat, you can also do the following:
export JAVA_OPTS="$JAVA_OPTS -Dsolr.solr.home=/your/solr/home/path/here"
Thanks

SOLR 5: Error loading class 'solr.XsltUpdateRequestHandler'

On Solr 4.2.1, I was able to use the Solr post tool. I have now upgraded to 5.4, and when I use the post method I get an Error loading class 'solr.XsltUpdateRequestHandler' error.
Here is the complete error:
null:org.apache.solr.common.SolrException: Error loading class 'solr.XsltUpdateRequestHandler'
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:559)
at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:490)
Caused by: java.lang.ClassNotFoundException: solr.XsltUpdateRequestHandler
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
I have the following in solrconfig.xml, and there are no errors during Solr startup.
<requestHandler name="/update/xslt" class="solr.XsltUpdateRequestHandler" startup="lazy" />
Any help is much appreciated.
Thank you
Looks like that handler was deprecated back in 4.8 or possibly even sooner than that. I wouldn't be surprised if it was removed from Solr completely by 5.0, but I don't see anything in the docs about it. I pulled up the 4.2.1 docs and even those have the deprecated annotation.
You might want to try to downgrade all the way back to 4.10 at the latest to see if you can use that handler, then start upgrading versions one at a time until the error happens again. Jumping from 4.2.1 to 5.4.1 is huge. There have been quite a few changes between these versions.
If you don't want to bother, I did a quick search in my 5.4.1 solrconfig.xml and found the following:
<!-- XSLT response writer transforms the XML output by any xslt file found
in Solr's conf/xslt directory. Changes to xslt files are checked for
every xsltCacheLifetimeSeconds.
-->
<queryResponseWriter name="xslt" class="solr.XSLTResponseWriter">
<int name="xsltCacheLifetimeSeconds">5</int>
</queryResponseWriter>
I don't know how to use this as I've never had a reason to bother with xslt, but it looks like a good place to start.
Thanks @TMBT for your response. I found the solution: use UpdateRequestHandler, since XsltUpdateRequestHandler is deprecated. After changing the class to solr.UpdateRequestHandler, I was able to run the Solr POST successfully.
<requestHandler name="/update/xslt" class="solr.UpdateRequestHandler" startup="lazy" />
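Before an upgrade, it can help to scan solrconfig.xml for class names that are known to cause load errors. Below is a small Python sketch of that idea. The watch list contains only the two classes named in this thread and is illustrative, not exhaustive; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Illustrative watch list built from the classes named in this thread.
SUSPECT_CLASSES = {"solr.XsltUpdateRequestHandler", "solr.XSLTResponseWriter"}


def find_suspect_classes(solrconfig_xml, suspects=SUSPECT_CLASSES):
    """Return (tag, name, class) for each element whose class is in `suspects`."""
    root = ET.fromstring(solrconfig_xml)
    return [
        (el.tag, el.get("name"), el.get("class"))
        for el in root.iter()
        if el.get("class") in suspects
    ]


sample = """<config>
  <requestHandler name="/update/xslt" class="solr.XsltUpdateRequestHandler" startup="lazy" />
  <requestHandler name="/update" class="solr.UpdateRequestHandler" />
</config>"""

for tag, name, cls in find_suspect_classes(sample):
    print(f"{tag} {name!r} uses {cls}")
```

A handler declared with startup="lazy" is not loaded until its first request, which would explain why Solr started without errors and the failure only appeared on POST.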

Schema.xml is not created in the core directory

I was following this article to start with my apache solr expedition.
http://examples.javacodegeeks.com/enterprise-java/apache-solr/apache-solr-tutorial-beginners/
I have created a Solr core using the command below:
> solr create -c mycorename
This has created a core, but a schema.xml file was not created inside the conf directory. Instead, I see a managed-schema.xml file. Does this command create a schemaless core? Please let me know how I can create a core that also has a schema.xml file in it.
Yes, I ran into the same issue.
In short, Solr can use either a static schema or a dynamic schema (managed through the REST API), so you should choose which one you'll use.
You can do it in solrconfig.xml, like this:
<schemaFactory class="ManagedIndexSchemaFactory">
<bool name="mutable">true</bool>
<str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>
For more info, read this guide.
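To check which of the two modes a given core is configured for, the schemaFactory element can be inspected. The Python sketch below does this; the function name schema_mode and its return labels are hypothetical. ClassicIndexSchemaFactory is the factory that reads a hand-edited schema.xml, while ManagedIndexSchemaFactory (shown above) uses the managed schema file.

```python
import xml.etree.ElementTree as ET


def schema_mode(solrconfig_xml):
    """Classify the top-level schemaFactory declared in a solrconfig.xml string."""
    root = ET.fromstring(solrconfig_xml)
    factory = root.find("schemaFactory")
    if factory is None:
        return "unspecified"  # no explicit factory; Solr's default applies
    cls = factory.get("class", "")
    if cls.endswith("ClassicIndexSchemaFactory"):
        return "classic"  # static schema.xml, edited by hand
    if cls.endswith("ManagedIndexSchemaFactory"):
        return "managed"  # managed schema, editable through the Schema API
    return "other"


managed = """<config>
  <schemaFactory class="ManagedIndexSchemaFactory">
    <bool name="mutable">true</bool>
    <str name="managedSchemaResourceName">managed-schema</str>
  </schemaFactory>
</config>"""
classic = '<config><schemaFactory class="ClassicIndexSchemaFactory"/></config>'

print(schema_mode(managed))
print(schema_mode(classic))
```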

Solrcloud multicore configuration

I have a standalone Solr instance with 4 different cores working fine using the embedded Jetty server. I configured the cores for v4.10.3, but I have since moved to v5.1 and everything seems to work fine without any changes.
Before going into production, I need to set it up as a Solrcloud installation, initially with 2 nodes (two different machines) with 1 shard per node (to keep it simple). I have been trying to get it to work but I have not been able to do it.
I tried to run it like this (I think using start.jar is not the preferred way), having read that Solr will look for multiple configured cores in any nested folders (which works for standalone Solr):
java -DzkRun -DnumShards=2 -Dbootstrap_confdir=solr/ -jar start.jar
but that did not work; it does not find the needed solrconfig.xml file.
My Solr directory looks like this:
My solr.xml file is the standard one:
<solr>
<solrcloud>
<str name="host">${host:}</str>
<int name="hostPort">${jetty.port:8983}</int>
<str name="hostContext">${hostContext:solr}</str>
<int name="zkClientTimeout">${zkClientTimeout:30000}</int>
<bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
</solrcloud>
<shardHandlerFactory name="shardHandlerFactory"
class="HttpShardHandlerFactory">
<int name="socketTimeout">${socketTimeout:0}</int>
<int name="connTimeout">${connTimeout:0}</int>
</shardHandlerFactory>
</solr>
Each core looks like this:
And the core.properties just has the name of the core:
name=users
My question is:
How do I start Solrcloud v5.1 so the 4 cores are picked up?
In SolrCloud, each of your cores will become a collection.
Each collection will have its own set of config files and data.
You might find this helpful: Moving multi-core SOLR instance to cloud
Solr 5.0 (and later) changed how you create a SolrCloud setup with shards, how you add collections, and so on.
Everything listed below is my understanding of the Solr Reference Guide. I highly recommend going through it thoroughly.
https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide
I set up my servers on Linux (CentOS), but the steps can also be used to set up Solr on a Windows system. For example, there is a solr.cmd file instead of the solr shell script.
Here are the steps I followed to create a simple two-shard SolrCloud setup.
Set up the ZooKeeper ensemble. I am assuming you are trying to use the embedded ZK in Solr. For a production system, it is highly recommended to create an external ZK ensemble. You can find steps to install an external ensemble in this section of the reference guide.
Download solr to /opt folder.
Extract the install file ONLY.
tar xzf solr-5.0.0.tgz solr-5.0.0/bin/install_solr_service.sh --strip-components=2
This command will install solr on your system
sudo bash ./install_solr_service.sh solr-5.0.0.tgz
The above command will create a new user called "solr" if it does not exist.
These are some of the default options it will assume. You can view this in /var/solr/solr.in.sh . This is the include file where you can specify other options.
* SOLR_PID_DIR=/var/solr
* SOLR_HOME=/var/solr/data
* LOG4J_PROPS=/var/solr/log4j.properties
* SOLR_LOGS_DIR=/var/solr/logs
* SOLR_PORT=8983
Running install_solr_service.sh in the above step also starts a Solr server. Stop the server using service solr stop before making any of the changes below.
Change the Java heap value (optional):
SOLR_HEAP="3g"
This sets both Xmx and Xms to 3 GB.
This variable is not mentioned in the solr.in.sh file in Solr 5.1. It's a bug that has been fixed, and the fix will be released in the next version.
Set the Solr mode (required):
SOLR_MODE="solrcloud"
This is what you need to start Solr in cloud mode.
Set the ZooKeeper hosts (required):
ZK_HOST=ZK1:2181,ZK2:2181,ZK3:2181
(Replace ZK1, ZK2, and ZK3 with your ZooKeeper host names.)
Running the install_solr_service.sh command also creates an init.d file at /etc/init.d/solr
This init.d script in turn calls the /opt/solr/bin/solr script and includes all the variables from /var/solr/solr.in.sh
Once you have made the above changes, start solr again using service solr start
You can check the status using service solr status
Creating Collections Shards and Replicas
- All shard-, collection-, and replica-related commands are now made using the Collections API.
Before creating a collection a config folder should be uploaded to ZK .
This can be done using the zkcli.sh script in the solr folder (not on the zookeeper servers)
Folder: /opt/solr/server/scripts/cloud-scripts
The command to upload the config folder is
sh zkcli.sh -cmd upconfig -zkhost zk1:2181,zk2:2181,zk3:2181 -confname yourconfigname -confdir /var/solr/configs/conf
You will run this command 4 times, once for each of your 4 cores, each time changing the path of the conf folder and the config name.
Each run uploads all the config files in the conf folder to ZooKeeper under the name 'yourconfigname'.
Creating a collection
I used the following command to create a new collection.
http://1.1.1.1:8983/solr/admin/collections?action=CREATE&name=yourcollectionname&numShards=2&replicationFactor=1&maxShardsPerNode=1&createNodeSet=1.1.1.1:8983_solr,2.2.2.2:8983_solr&collection.configName=yourconfigname
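Long Collections API URLs like this one are easy to mistype, so it may be worth building them from a parameter dictionary. A minimal Python sketch follows; the function name create_collection_url is hypothetical, and the values are the placeholders from the URL above.

```python
import urllib.parse


def create_collection_url(base_url, name, config_name, num_shards,
                          replication_factor, **extra):
    """Build a Collections API CREATE URL; extra keyword args become query params."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "collection.configName": config_name,
    }
    params.update(extra)
    # Keep ':' and ',' readable so node lists look like the hand-written URL.
    query = urllib.parse.urlencode(params, safe=":,")
    return f"{base_url}/solr/admin/collections?{query}"


url = create_collection_url(
    "http://1.1.1.1:8983",
    "yourcollectionname",
    "yourconfigname",
    num_shards=2,
    replication_factor=1,
    createNodeSet="1.1.1.1:8983_solr,2.2.2.2:8983_solr",
)
print(url)
```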
Happy Searching!
SolrCloud does not use the configuration files stored in a core's conf directory. To make your cores visible in the SolrCloud structure, you need to upload the configuration files to ZooKeeper and let it manage the files for you. Every time a Solr instance comes up, it gets the configuration files stored in ZooKeeper, so your cores don't need a conf directory to work. To upload your core configuration files to ZooKeeper, follow the link below and take a look at "Upload a configuration directory":
https://cwiki.apache.org/confluence/display/solr/Command+Line+Utilities

Solr (4.4+) solrconfig.xml location when creating cores

I'm trying to set up a multi-core Solr server for our web application, but I'm having trouble creating a new core through the CoreAdmin service.
I'm using Solr 4.4 because 4.3 ran into problems persisting the cores in solr.xml (datadir wasn't preserved), so I'm using the new solr.xml configuration format for 4.4 and beyond.
My solr.xml currently looks like:
<solr>
<str name="coreRootDirectory">default-instance/cores/</str>
</solr>
solrconfig.xml is located at (solrhome)/default-instance/conf/solrconfig.xml
Trying to create a core with the URL
http://example.org/solr/admin/cores?action=CREATE&name=test-name&schema=schema-test.xml&loadOnStartup=false
gives me the error:
Error CREATEing SolrCore 'test-name': Unable to create core: test-name
Caused by: Can't find resource 'solrconfig.xml' in classpath or
'default-instance/cores/test-name/conf/', cwd=/var/lib/tomcat7
The following seems to work:
http://example.org/solr/admin/cores?action=CREATE&name=test-name&schema=schema-test.xml&loadOnStartup=false&config=/absolute/file/path/to/solrconfig.xml
The problem is that this only seems to work with an absolute path (or possibly a path relative to /var/lib/tomcat7), which is not a workable solution.
What I'm looking for is a way to place solrconfig.xml so it can be used to create new cores with that config (or a way to create those cores with the current location).
More or less the same will be needed for schemas.
This worked. Ran on command line and was viewable in admin console:
solr create -c (name for core or collection)
See README.txt for more info.
In my case I took advantage of the Core Discovery feature in 4.4+, rather than creating the core using the management web interface.
This simply involved copying the example collection1 folder from the examples directory (which I usually use as a starting point).
Then I had to make sure that there is core.properties in the root of my new core with name=<new core name> inside. Solr automatically detected the new core and allowed me to use it without any fuss.
This avoided the trouble of having to copy solrconfig.xml and schema.xml into any special location.
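The copy-and-name steps described in this answer can be scripted. The Python sketch below is one way to do it; the function name add_discoverable_core is hypothetical, and the demo runs entirely inside a temporary directory with a stub template rather than a real collection1 folder.

```python
import shutil
import tempfile
from pathlib import Path


def add_discoverable_core(template_dir, cores_root, core_name):
    """Copy a template core folder and give it a core.properties naming the core."""
    core_dir = Path(cores_root) / core_name
    shutil.copytree(template_dir, core_dir)  # raises if core_dir already exists
    # Overwrites any core.properties inherited from the template.
    (core_dir / "core.properties").write_text(f"name={core_name}\n")
    return core_dir


# Self-contained demo in a temp directory; "mycore" is a made-up core name.
with tempfile.TemporaryDirectory() as tmp:
    template = Path(tmp) / "collection1"
    (template / "conf").mkdir(parents=True)
    (template / "conf" / "solrconfig.xml").write_text("<config/>")
    (template / "core.properties").write_text("name=collection1\n")

    new_core = add_discoverable_core(template, Path(tmp) / "cores", "mycore")
    print((new_core / "core.properties").read_text().strip())
    print(sorted(p.name for p in (new_core / "conf").iterdir()))
```

When copying a real example core, any existing data directory in the template is copied too, so it is worth starting from a template without an index.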
I had the same problem: solrconfig.xml was not in the classpath. I solved it by copying my configuration file templates into the classpath.
I took a look at http://localhost:8983/solr/#/~java-properties to see Solr's classpath definition, and then I copied the template solrconfig.xml and schema.xml into the folder C:\servers\solr-4.4.0\example\resources. I also copied all the stopwords files there.
This solution is not fully satisfying, but it works. Adding another path to the classpath should work, too. I'm slightly astonished that no default configuration for new cores can be declared within solr.xml.
I recommend the new Config Sets for this use case.
If you place your schema.xml and solrconfig.xml (and other config files like stopwords etc.) in a directory $SOLR_HOME/configsets/myConfig/conf, you can create a new core with this config by calling:
http://solr/admin/cores?action=CREATE&name=mycore&instanceDir=my_instance&configSet=myConfig
See https://cwiki.apache.org/confluence/display/solr/Config+Sets
But they are not available until Solr 4.8, see https://issues.apache.org/jira/browse/SOLR-4478
