I am trying to profile Cassandra on a single-node cluster to see how many inserts one node can handle, and then add more nodes based on that result.
I have changed the following parameters in cassandra.yaml:
memtable_offheap_space_in_mb: 4096
memtable_allocation_type: offheap_buffers
concurrent_compactors: 8
compaction_throughput_mb_per_sec: 32768
concurrent_reads: 64
concurrent_writes: 128
concurrent_counter_writes: 128
write_request_timeout_in_ms: 200000
Cassandra node: JVM heap size 12GB
I have also set these parameters through the Cassandra C++ driver API:
cass_cluster_set_num_threads_io(cluster, 8);
cass_cluster_set_core_connections_per_host(cluster,8);
cass_cluster_set_write_bytes_high_water_mark(cluster,32768000);
cass_cluster_set_pending_requests_high_water_mark(cluster,16384000);
With these parameters I get a write throughput of 13k inserts/sec with a payload size of 1250 bytes.
I wanted to know whether I am missing anything in terms of parameter tuning to achieve better performance.
Cassandra DB node details:
VM
CentOS 6
16 GB RAM
8 cores. It is running on a separate box from the machine I am pumping data from.
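For illustration, the driver setup plus an asynchronous insert loop would look roughly like the sketch below (the contact point, the test.data table and its columns, and the 1024-request window are placeholders, not my actual schema):

#include <cassandra.h>
#include <stdio.h>
#include <string>
#include <vector>

int main() {
  CassCluster* cluster = cass_cluster_new();
  CassSession* session = cass_session_new();

  /* placeholder contact point; the real node IP goes here */
  cass_cluster_set_contact_points(cluster, "127.0.0.1");

  /* same driver tuning as above */
  cass_cluster_set_num_threads_io(cluster, 8);
  cass_cluster_set_core_connections_per_host(cluster, 8);
  cass_cluster_set_write_bytes_high_water_mark(cluster, 32768000);
  cass_cluster_set_pending_requests_high_water_mark(cluster, 16384000);

  CassFuture* connect_future = cass_session_connect(session, cluster);
  if (cass_future_error_code(connect_future) != CASS_OK) {
    fprintf(stderr, "connect failed\n");
    cass_future_free(connect_future);
    cass_session_free(session);
    cass_cluster_free(cluster);
    return 1;
  }
  cass_future_free(connect_future);

  /* keep a window of asynchronous inserts in flight so the driver can
     pipeline requests instead of paying one round trip per insert */
  const size_t window = 1024;            /* placeholder; tune against the server */
  std::string payload(1250, 'x');        /* stand-in for the 1250-byte record   */
  std::vector<CassFuture*> futures;
  futures.reserve(window);

  for (size_t i = 0; i < window; ++i) {
    /* test.data(id, payload) is a placeholder table, not the real schema */
    CassStatement* stmt = cass_statement_new(
        "INSERT INTO test.data (id, payload) VALUES (?, ?)", 2);
    cass_statement_bind_int64(stmt, 0, (cass_int64_t)i);
    cass_statement_bind_string(stmt, 1, payload.c_str());
    futures.push_back(cass_session_execute(session, stmt));
    cass_statement_free(stmt);
  }

  for (CassFuture* f : futures) {
    if (cass_future_error_code(f) != CASS_OK) {
      const char* msg;
      size_t msg_len;
      cass_future_error_message(f, &msg, &msg_len);
      fprintf(stderr, "insert failed: %.*s\n", (int)msg_len, msg);
    }
    cass_future_free(f);
  }

  cass_session_free(session);
  cass_cluster_free(cluster);
  return 0;
}

The point of the window is simply to keep many requests in flight per connection instead of waiting for each insert to complete before sending the next one.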
Any insight will be highly appreciated.
Related
After upgrading Solr from version 6.5 to 8.7, we observed that query time increased by 40%.
On Solr 8.7 the difference between the optimized and unoptimized index is also very large: 350 ms on the optimized core versus 650 ms on the unoptimized one, even though the two cores differ by only 5 GB in size. The segment count is 1 in the optimized index and 20 in the unoptimized index.
I wanted to ask: is this normal behavior on Solr 8.7, or is there some setting that we forgot to add? Please also tell us how we can reduce the response time on the unoptimized core.
Specifications
We are using a master-slave architecture; the polling interval is 3 hours.
RAM - 96 GB
CPU - 14
Heap - 30 GB
Index Size - 95 GB
Segment count - 20
Merge Policy :
mergePolicyFactory : org.apache.solr.index.TieredMergePolicyFactory
maxMergeAtOnce : 5
segmentsPerTier : 3
In Solr 8 the maxMergedSegmentMB limit is honored, which caps each segment at roughly 5 GB by default. If your index is much larger than 5 GB, this means the number of segments could stay small in Solr 6 but grows in Solr 8 because of the per-segment size limit.
More open segments at runtime mean that a request (the search terms) must be looked up in more index segments. Furthermore, memory allocation will be higher too, causing the GC to run more frequently.
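If you want fewer, larger segments without running a forced optimize, one option worth testing is to raise the per-segment cap in your existing merge policy configuration. This is only a sketch: the 20000 value is illustrative, and larger segments take longer to merge, so verify it against your indexing load.
mergePolicyFactory : org.apache.solr.index.TieredMergePolicyFactory
maxMergeAtOnce : 5
segmentsPerTier : 3
maxMergedSegmentMB : 20000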
Recently I have been facing arbitrary cluster restarts (outside the maintenance window) in AWS Redshift, triggered from the AWS end. They have not been able to identify the exact root cause of the reboots. The error the AWS team captured is "out of object memory".
In the meantime, I am trying to scale up the cluster to avoid this out-of-object-memory error (as a blind try). Currently I am using the ds2.xlarge node type, but I am not sure which of the options below I should choose:
Many smaller nodes (increase the number of ds2.xlarge nodes)
Few larger nodes (change to ds2.8xlarge and have fewer nodes but more capacity per node)
Has anyone faced a similar issue in Redshift? Any advice?
Given your configuration, for better performance in this case you should opt for the ds2.8xlarge node type.
One ds2.xlarge node has 13 GB of RAM and 2 slices to run your workload, compared with a ds2.8xlarge node, which has 244 GB of RAM and 16 slices.
Even if you choose 8 ds2.xlarge nodes, you get at most 8 × 13 = 104 GB of memory, against 244 GB in a single ds2.8xlarge node.
So you should go with the ds2.8xlarge node type to handle the memory issue, and it also comes with a large amount of storage.
We need to estimate the data capacity per node based on the system and cluster information below.
Could you please tell us the (rough) data capacity per (Solr) node, without changing the system specs or the number of nodes?
System Information per node (DSE)
CPU: 2 CPUs / 16 cores
Memory: 32 GB
HDD: 1 TB
(DSE) Solr Heap Size: 15 GB
DSE Cluster information
Total Node # (all Solr nodes): 4
Average Data Size per node: 24 GB
Average (Solr) index size per node: 11 GB
Replication Factor: 2
DSE version: 4.8.3
Is the Solr heap size (15 GB) enough to support big data (e.g. 100 GB of Solr data) during general Solr operations (e.g. querying or indexing)?
P.S.: If you have any capacity formula or calculation tool, please let me know.
Q - I am forced to set the Java Xmx as high as 3.5 GB for my Solr app. If I keep it lower, my CPU hits 100% and indexing response time increases a lot. I have also hit OOM errors when this value is low.
Is this too high? If so, can I reduce it?
Machine Details
4 GB RAM, SSD
Solr App Details (Standalone solr app, no shards)
num. of Solr Cores = 5
Index Size - 2 GB
num. of Search Hits per sec - 10 [IMP - all search queries have faceting]
num. of times Re-Indexing happens per hour per core - 10 (it may happen at the same moment for all 5 cores)
Query Result Cache, Document Cache, and Filter Cache are all the default size - 4 KB.
top stats -
VIRT RES SHR S %CPU %MEM
6446600 3.478g 18308 S 11.3 94.6
iotop stats
DISK READ DISK WRITE SWAPIN IO>
0-1200 K/s 0-100 K/s 0 0-5%
Try either increasing the RAM or reducing the frequency of index rebuilds. If you are rebuilding the index 10 times an hour, then Solr may not be the right choice. Solr gets its speed from letting the OS keep the index files cached in memory; with Xmx at 3.5 GB on a 4 GB machine there is almost nothing left for the OS page cache, so the 2 GB index keeps falling back to disk.
Solr always uses more than 90% of the physical memory.
I know Solr search is I/O bound. If I have a 4-node cluster and an index split into 4 blocks, which architecture below will give better search performance?
1) Run 4 Solr instances on ONE single node and put one block of the index on each of the 4 Solr instances.
2) Run one Solr instance on each node (a 4-node cluster in total) and put one block of the index on each Solr instance.
Thanks!
The 2nd option will probably be better, and here is why.
A Solr core runs inside a Java process that holds several cache objects. When you put 4 Solr cores on the same node, they share the same JVM RAM and the same CPU.
In the 1st option, the same JVM has to run 4 Solr cores and collect the garbage of 4 cores instead of 1.
When you use 4 different nodes (4 JVMs), you will probably get better performance, even if you host the 4 nodes on the same physical machine.