Converting a DSE Search node to a DSE Spark node

I saw from the FAQ that a DSE node can be reprovisioned from RT mode to Hadoop mode. Is something similar supported with DSE Search and DSE Spark? I have an existing 6-node DSE Search cluster. I want to test DSE Spark but I have very limited time left for development so if possible, I'd like to skip the bootstrap process by simply restarting my cluster as an Analytics DC instead of adding new nodes in a separate DC.
UPDATE:
I tried to find an answer on my own. These are the closest that I found:
http://www.datastax.com/wp-content/uploads/2012/03/WP-DataStax-WhatsNewDSE2.pdf
http://www.datastax.com/doc-source/pdf/dse20.pdf
These documents are for a very old release of DSE. Both documents say that only RT and Analytics nodes can be re-provisioned. The second document even explicitly says that a Solr node cannot be re-provisioned. Unfortunately, there is no mention of re-provisioning in more recent documentation.
Can anybody confirm whether this is still true with DSE 4.5.1? (preferably with a link to a reference)
I also saw this forum thread which explains why the section about re-provisioning was removed from recent documentation. However, in my case, I plan to re-provision all of my Search nodes as Analytics nodes (in contrast to re-provisioning only a subset), and the re-provisioning would only be temporary.

Yes, you can do that. Just start it using 'dse cassandra -k'.
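A minimal sketch of the restart on one node, assuming a tarball install of DSE 4.5 (on a package install you would instead set SPARK_ENABLED=1 in /etc/default/dse and restart the service):

# Stop the node, then bring it back in Analytics (Spark) mode
bin/dse cassandra-stop
bin/dse cassandra -k    # -k starts the node as an Analytics (Spark) node

Repeat for each node in the DC; -k takes the place of the -s flag you would use for Search mode.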

Solr AutoScaling - Add replicas on new nodes

Using Solr version 7.3.1
Starting with 3 nodes:
I have created a collection like this:
wget "localhost:8983/solr/admin/collections?action=CREATE&autoAddReplicas=true&collection.configName=my_col_config&maxShardsPerNode=1&name=my_col&numShards=1&replicationFactor=3&router.name=compositeId&wt=json" -O /dev/null
In this way I have a replica on each node.
GOAL:
Each shard should add a replica to new nodes joining the cluster.
When a node is shut down, it should just go away.
Only one replica for each shard on each node.
I know that it should be possible with the new Autoscaling API, but I am having a hard time finding the right syntax. The API is very new and all I can find is the documentation. It's not bad, but I am missing some more examples.
This is how it looks today: there are many small shards, each with a replication factor that matches the number of nodes. Right now there are 3 nodes.
This video was uploaded yesterday (2018-06-13), and around 30 minutes into the video there is an example of the solr.HttpTriggerListener, which can be used to call any kind of service, for example an AWS Lambda, to add new nodes.
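For reference, a sketch of wiring up such a listener through the autoscaling write API, assuming a trigger named node_added_trigger is already registered and using a placeholder webhook URL:

# Register an HTTP listener that fires when the trigger's actions complete
curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/admin/autoscaling' -d '{
  "set-listener": {
    "name": "node_added_webhook",
    "trigger": "node_added_trigger",
    "stage": ["SUCCEEDED", "FAILED"],
    "class": "solr.HttpTriggerListener",
    "url": "https://example.com/autoscaling-hook"
  }
}'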
The short answer is that your goals are not achievable today (as of Solr 7.4).
The NodeAddedTrigger only moves replicas from other nodes to the new node in an attempt to balance the cluster. It does not support adding new replicas. I have opened SOLR-12715 to add this feature.
Similarly, the NodeLostTrigger adds new replicas on other nodes to replace the ones on the lost node. It, too, has no support for merely deleting replicas from the cluster state. I have opened SOLR-12716 to address that issue. I hope to have both enhancements released in Solr 7.5.
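To make the current behavior concrete, a sketch of registering a nodeAdded trigger via the autoscaling write API (the trigger name is illustrative); on 7.3/7.4 this only moves existing replicas onto the new node, per the limitations above:

curl -X POST -H 'Content-Type: application/json' \
  'http://localhost:8983/solr/admin/autoscaling' -d '{
  "set-trigger": {
    "name": "node_added_trigger",
    "event": "nodeAdded",
    "waitFor": "5s",
    "enabled": true
  }
}'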
As for the third goal:
Only one replica for each shard on each node.
To achieve this, a policy rule given in the "Limit Replica Placement" example should suffice. However, looking at the screenshot you've posted, you actually mean a (collection,shard) pair, which is unsupported today. You'd need a policy rule like the following (this does not work because collection:#EACH is not supported):
{"replica": "<2", "collection": "#EACH", "shard": "#EACH", "node": "#ANY"}
I have opened SOLR-12717 to add this feature.
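For comparison, the supported rule from the "Limit Replica Placement" example, which caps replicas per node at the collection level rather than per (collection,shard) pair, looks like this:

{"replica": "<2", "shard": "#EACH", "node": "#ANY"}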
Thank you for these excellent use-cases. I'd recommend asking questions such as these on the solr-user mailing list, because not a lot of Solr developers frequent Stack Overflow. I could only find this question because it was posted on the docker-solr project.

Can I use 2 Solr cores for reindexing and searching simultaneously?

I am using the Collective Solr 4.1.0 Search on our Plone 4.2.6 system.
I am running a solr_core on my server that is currently being used by our Plone live system's search. Now I want to build a new index but without shutting down the live system search for 10 or more hours (time for reindexing). Doing that on the same core is only available on collective.solr 5.0 and higher versions. See collective.solr changelog.
Is there a way for me to build a new index on another core while still being able to use the search on the currently used core? I thought of it like this: live_system uses core_1 for queries and builds a new index on core_2. Once the index is built, switch both cores so that the live_system now uses core_2 for its search.
I know there is a way to load an already built Solr index into a Solr core, but I can't figure out how to accomplish this switcheroo I'm thinking of.
Kindly check the master-slave architecture. That might help here!
Check the following link: https://cwiki.apache.org/confluence/display/solr/Index+Replication
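For the switcheroo itself, Solr's CoreAdmin API has a SWAP action that atomically exchanges two cores; a minimal sketch, using the asker's hypothetical core names:

# Build the new index on core_2, then swap: queries that address core_1
# will transparently hit the index that was built on core_2
curl "http://localhost:8983/solr/admin/cores?action=SWAP&core=core_1&other=core_2"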

Is it possible to upgrade from Solr 4.x directly to Solr 6.1?

We are looking to upgrade from SolrCloud 4.10.3 to SolrCloud 6.1. The documentation for Solr 6.1 is not very clear on backward compatibility.
I came across this post on the LucidWorks site.
The index format is backward compatible between two consecutive major Solr versions. So a Solr 3.x index is compatible with a Solr 4.x index. However, if you have a Solr 1.x index and want to upgrade to Solr 4.x, then you would need to first upgrade to Solr 3.x.
It was written before Solr 6.x was out, and the wording of "between two consecutive major Solr versions" is unclear. The example skips the exact scenario that I'm interested in (skipping exactly 1 major version).
Do I have to first upgrade to Solr 5.x and then go to Solr 6.1?
Since I faced the same situation upgrading Solr from 4.x to 6.x, I was lucky and found the following script on GitHub, which performs the upgrade:
https://github.com/cominvent/solr-tools.git/
All the credit goes to "cominvent" for this script.
Since the core folder structure of version 4.x is not the same as in version 6.x, I made a script that creates the right tree structure and then applies upgradeindex.sh.
The script (buildsorltree.sh) can be found at https://github.com/cradules/bash_scripts, and the repo has upgradeindex.sh too. Since the two scripts are linked, I put them in the same repo. Good luck!
I was able to find this on the Apache website.
Solr 6 has no support for reading Lucene/Solr 4.x and earlier indexes. Be sure to run the Lucene IndexUpgrader included with Solr 5.5 if you might still have old 4.x formatted segments in your index. Alternatively: fully optimize your index with Solr 5.5 to make sure it consists only of one up-to-date index segment.
So this means that you can upgrade directly, but only if you run the IndexUpgrader from Solr 5.5 first.
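A sketch of that step, assuming a core whose index lives at /var/solr/data/my_core/data/index and using the Lucene jars shipped in a Solr 5.5 download (both paths are illustrative):

# Run once per core while the node is stopped; rewrites any 4.x segments
# into the 5.x format so that Solr 6.1 can open the index
java -cp lucene-core-5.5.0.jar:lucene-backward-codecs-5.5.0.jar \
  org.apache.lucene.index.IndexUpgrader -delete-prior-commits \
  /var/solr/data/my_core/data/index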

How to find out what Solr version DSE is using

I am trying to find out what Solr version our DSE setup is using. I know it uses a custom-modified Solr, but I want to know the Lucene version of the index.
Apart from opening an index with Luke, is there somewhere where DSE shows this info? I don't see it in the Solr admin overview.
EDIT: I am counting only on looking at the setup, not any docs.
Check the release notes:
http://docs.datastax.com/en/datastax_enterprise/4.8/datastax_enterprise/RNdse.html
You can also see it in your system.log on startup.
Note: Solr and Lucene versions are the same now that they are a single project:
https://github.com/apache/lucene-solr/releases
In the solrconfig.xml, there is usually a line such as this:
<luceneMatchVersion>5.3.0</luceneMatchVersion>
This gives you the minimum version of Lucene required.
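If the node's Solr HTTP interface is reachable, the standard Solr system info handler is another place to look; assuming the default port 8983, it reports both solr-spec-version and lucene-spec-version:

# Query the system info handler on a DSE Search node
curl "http://localhost:8983/solr/admin/info/system?wt=json"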

Rebuild Solr index in a specific Search node

I've accidentally cancelled the Solr index build in one my Search nodes. How do I restart the indexing on that node?
nodetool rebuild_index doesn't work. The command exits almost immediately - probably because it is meant to work with native Cassandra indexes, whereas my table's indexes are of custom type "com.datastax.bdp.search.solr.Cql3SolrSecondaryIndex".
Clicking the "Reindex"/"Full reindex" button in the Solr core admin UI, on the other hand, will trigger the re-indexing of the whole columnfamily across all Search nodes.
Is there a way to trigger the indexing in that node only? I'm using DSE 4.0.1 (Cassandra 2.0.5, Solr 4.6.0.1)
In order to reindex a single node, you have to reload its core with the reindex=true and distributed=false parameters, as explained in: http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/srch/srchReldCore.html
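A sketch of that call, run against the node that needs reindexing, assuming a core named keyspace.table (DSE derives Solr core names from the keyspace and column family) and the default port:

# distributed=false confines the reindex to this node; reindex=true
# rebuilds in place (add deleteAll=true to discard the old index first)
curl "http://localhost:8983/solr/admin/cores?action=RELOAD&name=keyspace.table&reindex=true&distributed=false"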
