DSE Solr nodes and vnodes

The following documentation pages say that it is not recommended to use vnodes for Solr/Hadoop nodes:
http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/srch/srchIntro.html
http://www.datastax.com/documentation/datastax_enterprise/4.0/datastax_enterprise/deploy/deployConfigRep.html#configReplication
What is the exact problem with using vnodes for these node types? I inherited a DSE setup wherein the Search nodes all use vnodes, and I wonder if I should take down the cluster and disable vnodes. Is there any harm in leaving vnodes enabled in such a case?

It is primarily a performance concern with DSE/Search: a query needs to fan out internally to enough nodes (or vnodes) to cover the full range of Cassandra rows in the DC, and that means many more sub-queries when vnodes are enabled.
But, if your performance with vnodes in a DSE/Search DC is acceptable, then you have nothing to worry about.
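To get a feel for the difference, here is a rough back-of-the-envelope sketch (an assumption-laden illustration, not a model of the actual DSE query planner, which merges contiguous ranges owned by the same node):

```python
# Back-of-the-envelope sketch: more tokens per node means more token
# ranges a distributed search has to account for in the DC.
def token_ranges_in_dc(nodes: int, num_tokens: int) -> int:
    """Total number of token ranges in the DC."""
    return nodes * num_tokens

print(token_ranges_in_dc(nodes=6, num_tokens=1))    # 6    single-token nodes
print(token_ranges_in_dc(nodes=6, num_tokens=256))  # 1536 default vnode setting
```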

Isn't that answer applicable only when the number of virtual nodes is greater than the number of actual nodes, i.e. when we do not configure token ranges manually? So can the number of virtual nodes actually be higher?
If they are the same, then whether we assign token ranges manually or assign pieces of the ranges as virtual nodes, we eventually end up with the same number of nodes, each holding a bunch of tokens.
Solr would have to go to as many nodes as there are nodes in the DC anyway, unless there are more virtual nodes than actual nodes.

Related

Too many documents: an index cannot exceed 2147483519 but readers have total maxDoc=2147483531

Looking for a better solution to avoid Lucene's hard limit on total docs. Is there a way to increase the limit?
We are running DSE Search in one of the datacenters and we are hitting Lucene's hard limit on the number of documents.
Possible solutions we thought of were:
1) Add a new node, so data gets redistributed with new tokens and search can be functional. Not viable in our case as of now.
2) Decommission one of the nodes and rebuild it with a higher num_tokens so that it can accommodate the Lucene docs across a larger number of partitions (this is my assumption).
FYI: I know that DSE prefers single tokens for Search, but my organisation is using the virtual token system.
Below is the actual entry from the system log file:
Caused by: org.apache.lucene.index.CorruptIndexException: Too many documents: an index cannot exceed 2147483519 but readers have total maxDoc=2147483531 (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/data/cassandra/data/solr.data/keyspace.table_name/index/segments_2lj")))
Any suggestions appreciated.
You're limited by Lucene, which can't have more than ~2B documents in a single index. You can decrease the number of documents by:
Adding new nodes to the cluster (as you already mentioned);
Not indexing UDTs, which get indexed as separate documents.
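If it helps to see how quickly that ceiling is reached, here is a rough, hypothetical capacity check (the row counts, RF and node count are placeholders, and it assumes, as noted above, that each indexed UDT adds an extra document per row):

```python
# Hypothetical back-of-the-envelope check against Lucene's per-index cap.
LUCENE_MAX_DOCS = 2_147_483_519  # Integer.MAX_VALUE - 128

def lucene_docs_per_node(total_rows, rf, nodes, indexed_udt_fields_per_row=0):
    # Each indexed UDT/collection element becomes an extra Lucene document,
    # multiplying the document count beyond the raw Cassandra rows.
    docs_per_row = 1 + indexed_udt_fields_per_row
    return total_rows * rf * docs_per_row // nodes

projected = lucene_docs_per_node(total_rows=3_000_000_000, rf=3,
                                 nodes=6, indexed_udt_fields_per_row=1)
print(f"{projected:,} docs per node; over the cap: {projected > LUCENE_MAX_DOCS}")
```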

What to do when nodes in a Cassandra cluster reach their limit?

I am studying up on Cassandra and am in the process of setting up a cluster for a project that I'm working on. Consider this example:
Say I set up a 5-node cluster with 200 GB of space for each node. That adds up to 1000 GB (roughly 1 TB) of space overall. Assuming that my partitions are equally split across the cluster, I can easily add nodes and achieve linear scalability. However, what if these 5 nodes start approaching the SSD limit of 200 GB? In that case, I can add 5 more nodes and the partitions would then be split across 10 nodes. But the older nodes would still be written to, as they are part of the cluster. Is there a way to make these 5 older nodes 'read-only'? I want to shoot off random read queries across the entire cluster, but don't want to write to the older nodes anymore (as they are capped by the 200 GB limit).
Help would be greatly appreciated. Thank you.
Note: I can say that 99% of the queries will be write queries, with 1% or less for reads. The app has to persist click events in Cassandra.
Usually when a cluster reaches its limit, we add a new node to the cluster. After adding a new node, the old Cassandra nodes distribute part of their data to the new node, and after that we run nodetool cleanup on every old node to clean up the data that was handed off to the new node. The entire scenario happens within a single DC.
For example:
Suppose you have 3 nodes (A, B, C) in DC1 and 1 node (D) in DC2, and your nodes are reaching their limit. So you decide to add a new node (E) to DC1. Nodes A, B and C will distribute part of their data to node E, and we then use nodetool cleanup on A, B and C to reclaim the space.
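A minimal sketch of that last step, assuming SSH access to each old node (hostnames are hypothetical):

```python
# Sketch: run "nodetool cleanup" on each pre-existing node after the new
# node has finished joining. Cleanup is run one node at a time because it
# is I/O heavy.
import subprocess

old_nodes = ["node-a.example.com", "node-b.example.com", "node-c.example.com"]

for host in old_nodes:
    subprocess.run(["ssh", host, "nodetool", "cleanup"], check=True)
    print(f"cleanup finished on {host}")
```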
I had some trouble understanding the question precisely, so let me state my assumptions.
I am assuming you know that by adding 5 new nodes, some of the data load is transferred to them, since some token ranges get assigned to the new nodes.
Given that, if you are concerned that the old 5 nodes would not be able to accept writes because they have reached their limit, that won't happen: the new nodes have taken over part of the data load, so the old nodes now have free space for further writes.
Isolating reads and writes to particular nodes is a different problem entirely. But if you want to isolate reads to the old 5 nodes only and writes to the new 5 nodes, then the best way to do this is to add the 5 new nodes as another datacenter within the same cluster and then use different consistency levels for reads and writes to make the old datacenter effectively read-only.
However, the new datacenter will not lighten the data load of the first one; it will take on the same load itself. So you would need more than 5 new nodes to solve both problems at once: a few nodes to lighten the load, and others to isolate reads and writes by forming a new datacenter with them (and that new datacenter should itself have more than 5 nodes). Best practice is to monitor data load and fix it before such a problem happens, by adding new nodes or increasing capacity.
Having done that, you will also need to ensure that the nodes you use as contact points for reads and writes are in different datacenters.
Consider the following situation:
dc1(n1, n2, n3, n4, n5)
dc2(n6, n7, n8, n9, n10)
Now, for reads you connect through node n1 and for writes through node n6.
The read/write isolation can then be achieved by choosing the right consistency level from the options below:
LOCAL_QUORUM
or
LOCAL_ONE
These basically confine the search for replicas to the local datacenter only.
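For example, with the DataStax Python driver, the application that should only read from the old datacenter could pin itself to that DC and use a datacenter-local consistency level (the DC name, keyspace, table and contact point below are hypothetical):

```python
# Sketch: a client pinned to dc1 (the "read" datacenter) using
# datacenter-local consistency levels. Names are hypothetical.
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster, ExecutionProfile, EXEC_PROFILE_DEFAULT
from cassandra.policies import DCAwareRoundRobinPolicy
from cassandra.query import SimpleStatement

profile = ExecutionProfile(
    load_balancing_policy=DCAwareRoundRobinPolicy(local_dc="dc1"),
    consistency_level=ConsistencyLevel.LOCAL_QUORUM,
)
cluster = Cluster(["n1"], execution_profiles={EXEC_PROFILE_DEFAULT: profile})
session = cluster.connect("my_keyspace")

# Per-statement override, e.g. a cheaper read at LOCAL_ONE:
stmt = SimpleStatement("SELECT * FROM clicks LIMIT 10",
                       consistency_level=ConsistencyLevel.LOCAL_ONE)
rows = session.execute(stmt)
```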
Look at these references for more:
Adding a datacenter to a cluster
and
Consistency Levels

Can we have cassandra only nodes and solr enabled nodes in same datacenter?

I just started with Solr and would like your suggestion on the scenario below. We have 2 datacenters with 3 nodes in each (in different AWS regions for location advantage). We have a requirement for which I was asked whether we can have 2 Solr nodes in each datacenter, so it would be 2 Solr nodes and 1 Cassandra-only node per datacenter. I want to understand whether it's fine to have this kind of setup, and I am a little confused about whether the Solr nodes will hold data along with the indexes. Do all 6 nodes share the data, with the 4 Solr nodes holding indexes in addition to data? Kindly provide some information on this. Thanks.
Short answer is no, this will not work. If you turn on DSE Search on one node in a DC you need to turn it on for all the nodes in the DC.
But why??
DSE Search builds Lucene indexes on the data that is stored locally on a node. Say you have a 3-node DC with RF=1 (so each node only has 1/3 of the data) and you only turn on Search on one of the nodes: 1/3 of your search queries will fail.
So I should just turn search on everywhere?
If you have relatively small workloads with loose SLAs (for both C* and Search) and/or you are overprovisioned, you may be fine turning on Search on your main Cassandra nodes. However, in many cases with heavy C* workloads and tight SLAs, Search queries will negatively affect Cassandra performance (because they are contending for the same hardware).
I need search nodes in both Physical DC's
If you want Search enabled on only two out of your three nodes in a physical DC, the only way to do this is to split your physical DC into two logical DCs. In your case you would have:
US - Cassandra
US - Search
Singapore - Cassandra
Singapore - Search
This gives you geographic locality for your Search and C* queries, and also provides workload isolation between your C* and Search workloads, since they no longer contend for the same hardware and OS subsystems.
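For example, the keyspace replication for the split might look roughly like this (a sketch via the DataStax Python driver; the DC names, keyspace and replica counts are hypothetical and must match what your snitch actually reports):

```python
# Sketch: NetworkTopologyStrategy replication across the four logical DCs.
# DC names, keyspace and replica counts are hypothetical placeholders.
from cassandra.cluster import Cluster

cluster = Cluster(["us-contact-point.example.com"])
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS clickstream
    WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'US_Cassandra': 3,
        'US_Search': 2,
        'SG_Cassandra': 3,
        'SG_Search': 2
    }
""")
```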

What is the unidirectional property and why does it help with hotspots?

In the Kademlia paper it's written that the XOR metric is unidirectional. What does that mean precisely?
More importantly, in what way does it alleviate the problem of a frequently queried node?
Could you explain that from the point of view of a node? I mean, if a hotspot is requested frequently by different nodes, do they exchange cached nodes to get to the target? Can't they just exchange the target IP?
Furthermore, it doesn't seem to me that lookups converge along the same path as written; I think it's more logical that each node follows a different path while getting farther and farther from itself.
The XOR metric is symmetric (A^B gives the same distance as B^A) and unidirectional: for a fixed point A and a distance d, there is exactly one point B with A^B = d. I'm not sure that it directly alleviates the problem of a frequently queried node; it's more that nodes at different addresses in the network will perceive nodes on a search path as being at different distances from themselves, thereby caching different nodes after a query completes. Subsequent queries to local nodes will be given different remote nodes in response, thereby potentially spreading the load around the DHT network somewhat.
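A tiny sketch of both properties, using 4-bit IDs purely for illustration:

```python
# Symmetry and unidirectionality of the XOR metric on tiny 4-bit IDs.
def distance(a: int, b: int) -> int:
    return a ^ b

a, b = 0b1010, 0b0110
assert distance(a, b) == distance(b, a)  # symmetric: A^B == B^A

# Unidirectional: for a fixed point a and distance d, exactly one point
# lies at that distance, namely a ^ d.
d = 0b0111
assert [y for y in range(16) if distance(a, y) == d] == [a ^ d]
```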
When querying the DHT network, the more common query is to ask for data regarding a particular info hash. That data is stored by the nodes with the smallest distances between their node IDs and the info hash in question. It's only when you begin querying nodes that are close to the target info hash that nodes start to respond with IP addresses of peers for that torrent. Nodes can't just arbitrarily return peer IPs, as that would require that all nodes store all IPs for all torrents, or that nodes perform subsequent queries on your behalf, which would lead to suboptimal network use and be open to exploitation.
Your observation that lookups don't converge on the same path is only correct when there are a surfeit of nodes at the distance being queried. Eventually as you get closer to nodes storing data for the desired info hash, there will be fewer and fewer nodes with such proximity to the target. Thus toward the end of queries, most querying nodes will converge on similar nodes. It's worth keeping in mind that this isn't a problem. Those nodes will only be "hot" for data related to that one particular info hash as the distance between info hashes is going to be very large on average on account of the enormous size of the hash space used. Additionally, were it a popular info hash to be querying for, nodes close to that hash that aren't coping with the traffic will be penalized by the network, and returned less often by nodes on the search path.

DSE SOLR OOMing

We have had a 3-node DSE SOLR cluster running and recently added a new core. After about a week of running fine, all of the SOLR nodes are now OOMing. They fill up both the JVM heap (set at 8 GB) and the system memory. They are also constantly flushing the memtables to disk.
The cluster is DSE 3.2.5 with RF=3
here is the solrconfig from the new core:
http://pastie.org/8973780
How big is your Solr index relative to the amount of system memory available for the OS to cache file system pages? Basically, your Solr index needs to fit in the OS file system cache (the amount of system memory still available after DSE has started but before it has processed any significant amount of data).
Also, how many Solr documents (Cassandra rows) and how many fields (Cassandra columns) are populated on each node? There is no hard limit, but 40 to 100 million is a good guideline as an upper limit - per node.
And, how much system memory and how much JVM heap is available if you restart DSE, but before you start putting load on the server?
For RF=N, where N is the total number of nodes in the cluster or at least the search data center, all of the data will be stored on all nodes, which is okay for smaller datasets, but not okay for larger datasets.
For RF=n, this means that each node will have X/N*n rows or documents, where X is the total number of rows or documents across all column families in the data center. X/N*n is the number that you should try to keep below 100 million. That's not a hard limit - some datasets and hardware might be able to handle substantially more, and some datasets and hardware might not even be able to hold that much. You'll have to discover the number that works best for your own app, but the 40 million to 100 million range is a good start.
In short, the safest estimate is to keep X/N*n under 40 million for Solr nodes. 100 million may be fine for some datasets and beefier hardware.
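As a quick worked example of that estimate (the counts below are placeholders):

```python
# X/N*n estimate from above, with placeholder numbers.
def docs_per_node(total_docs_x: int, nodes_n: int, rf: int) -> float:
    return total_docs_x / nodes_n * rf

estimate = docs_per_node(total_docs_x=120_000_000, nodes_n=3, rf=3)
print(f"{estimate:,.0f} docs per node")  # 120,000,000 -> well above the guideline
```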
As far as tuning, one common source of using lots of heap is heavy use of Solr facets and filter queries.
One technique is to use "DocValues" fields for facets since DocValues can be stored off-heap.
Filter queries can be marked as cache=false to save heap memory.
Also, the various Solr caches can be reduced in size or even set to zero. That's in solrconfig.xml.
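On the query side, a facet on a docValues-backed field combined with an uncached filter query might look like the following against the Solr HTTP interface (host, core and field names are hypothetical):

```python
# Sketch: facet on a docValues field plus a filter query marked cache=false.
# Host, core ("keyspace.table") and field names are hypothetical.
import requests

resp = requests.get(
    "http://solr-node.example.com:8983/solr/ks.clicks/select",
    params={
        "q": "*:*",
        "fq": "{!cache=false}event_type:click",  # bypass the filter cache
        "facet": "true",
        "facet.field": "country",  # declared with docValues="true" in the schema
        "rows": 0,
        "wt": "json",
    },
)
print(resp.json()["facet_counts"]["facet_fields"]["country"])
```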
