Redshift: many small nodes vs. fewer bigger nodes


Recently I have been facing arbitrary cluster restarts (outside the maintenance window) in AWS Redshift, triggered from the AWS end. The AWS team has not been able to identify the exact root cause of these reboots. The error they captured is "out of object memory".
In the meantime, I am trying to scale up the cluster to avoid this out of object memory error (as a blind try). Currently I am using the ds2.xlarge node type, but I am not sure which of the options below I should choose:
Many smaller nodes (increase the number of ds2.xlarge nodes)
Fewer larger nodes (change to ds2.8xlarge and have fewer nodes with more capacity each)
Has anyone faced a similar issue in Redshift? Any advice?

Given this configuration, you should opt for the ds2.8xlarge node type for better performance in this case.
One ds2.xlarge node has 31 GiB of RAM and 2 slices to run your workload, whereas one ds2.8xlarge node has 244 GiB of RAM and 16 slices.
Even if you choose 8 ds2.xlarge nodes, each node is still limited to 31 GiB (about 248 GiB in aggregate, spread across the nodes), against 244 GiB in a single ds2.8xlarge node.
So go with the ds2.8xlarge node type to address the memory issue; it also gives you a large amount of storage.

Related

Why can't AppEngine load a big dictionary into its memory?

I have a BFF service that calls a microservice, and the second one pulls a long list from the DB, formats it and returns the list to the BFF.
When I'm trying to run it through AppEngine I receive the following error:
Exceeded hard memory limit of 256 MB with XXX MB after servicing 0 requests total. Consider setting a larger instance class in app.yaml.
Where XXX is a different number each time, starting from 266 MB.
I tried to stop using pydantic (since it takes a lot of memory) and to scale the instance up to a huge machine, but the problem remains.
So I copied the response (as I can run it locally) into the BFF (i.e. skipping the whole microservice logic and storing the response as a constant dictionary in the BFF).
And then, when the BFF had no logic besides loading a constant variable, I again received the following error:
Exceeded hard memory limit of 256 MB with 919 MB after servicing 0 requests total. Consider setting a larger instance class in app.yaml.
The file that contains the data is a 9 MB file and the response that we create is around 3 MB, but it seems that AppEngine can't handle loading this dictionary into memory in the BFF either.
As there is no memory-profiling tool for AppEngine, I'm not really sure what DOES take the memory and how I can make it work. Any ideas?
Thank you!

Apparently, in Python a dictionary's in-memory size includes per-object metadata on top of the data itself, and when the dictionary is big and has a complicated hierarchy, its size can grow to many times the size of the serialized data.
That was the reason why 9 MB of data became an object of over 250 MB at runtime.
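To see this overhead on a small scale, here is a minimal sketch that compares the serialized size of a nested structure with an approximation of its in-memory size on CPython. The `deep_sizeof` helper and the sample `records` are hypothetical illustrations, not the data or code from the question, and `sys.getsizeof` only gives an approximation.

```python
import json
import sys


def deep_sizeof(obj, seen=None):
    """Approximate bytes used by obj plus everything it references (CPython)."""
    if seen is None:
        seen = set()
    if id(obj) in seen:  # count shared objects only once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_sizeof(key, seen) + deep_sizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_sizeof(item, seen)
    return size


# Hypothetical payload: many small nested records, not the data from the question.
records = [
    {"id": i, "name": f"item-{i}", "tags": ["a", "b"], "meta": {"rank": i % 10}}
    for i in range(10_000)
]

serialized_bytes = len(json.dumps(records).encode("utf-8"))
in_memory_bytes = deep_sizeof(records)

print(f"serialized: {serialized_bytes:,} bytes")
print(f"in memory:  {in_memory_bytes:,} bytes")
print(f"ratio:      {in_memory_bytes / serialized_bytes:.1f}x")
```

The exact ratio depends on the Python version and the shape of the data; many small nested dicts and lists tend to inflate it the most.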

Why am I sometimes getting an OOM when getting all documents from an 800MB index with 8GB of heap?

I need to refresh an index governed by SOLR 7.4. I use SOLRJ to access it on a 64 bit Linux machine with 8 CPUs and 32GB of RAM (8GB of heap for the indexing part and 24GB for SOLR server). The index to be refreshed is around 800MB in size and counts around 36k documents (according to Luke).
Before starting the indexing process itself, I need to "clean" the index and remove the documents that do not match an actual file on disk (e.g. a document was indexed previously and has since been moved, so the user won't be able to open it if it appears on the result page).
To do so, I first need to get the list of documents in the index:
final SolrQuery query = new SolrQuery("*:*"); // Content fields are not loaded to reduce memory footprint
query.addField(PATH_DESCENDANT_FIELDNAME);
query.addField(PATH_SPLIT_FIELDNAME);
query.addField(MODIFIED_DATE_FIELDNAME);
query.addField(TYPE_OF_SCANNED_DOCUMENT_FIELDNAME);
query.addField("id");
query.setRows(Integer.MAX_VALUE); // we want ALL documents in the index not only the first ones
SolrDocumentList results = this.getSolrClient()
        .query(query)
        .getResults(); // This line sometimes gives OOM
When the OOM appears on the production machine, it appears during that "index cleaning" part and the stack trace reads :
Exception in thread "Timer-0" java.lang.OutOfMemoryError: Java heap space
at org.noggit.CharArr.resize(CharArr.java:110)
at org.noggit.CharArr.reserve(CharArr.java:116)
at org.apache.solr.common.util.ByteUtils.UTF8toUTF16(ByteUtils.java:68)
at org.apache.solr.common.util.JavaBinCodec.readStr(JavaBinCodec.java:868)
at org.apache.solr.common.util.JavaBinCodec.readStr(JavaBinCodec.java:857)
at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:266)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256)
at org.apache.solr.common.util.JavaBinCodec.readSolrDocument(JavaBinCodec.java:541)
at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:305)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256)
at org.apache.solr.common.util.JavaBinCodec.readArray(JavaBinCodec.java:747)
at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:272)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256)
at org.apache.solr.common.util.JavaBinCodec.readSolrDocumentList(JavaBinCodec.java:555)
at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:307)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256)
at org.apache.solr.common.util.JavaBinCodec.readOrderedMap(JavaBinCodec.java:200)
at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:274)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:256)
at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:178)
at org.apache.solr.client.solrj.impl.BinaryResponseParser.processResponse(BinaryResponseParser.java:50)
at org.apache.solr.client.solrj.impl.HttpSolrClient.executeMethod(HttpSolrClient.java:614)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:255)
at org.apache.solr.client.solrj.impl.HttpSolrClient.request(HttpSolrClient.java:244)
at org.apache.solr.client.solrj.SolrRequest.process(SolrRequest.java:194)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:942)
at org.apache.solr.client.solrj.SolrClient.query(SolrClient.java:957)
I've already removed the content fields from the query because there were OOMs before, so I thought that fetching only "small" data would avoid them, but they are still there. Moreover, when I started the project for the customer we had only 8GB of RAM (so a heap of 2GB), then we increased it to 20GB (heap of 5GB), and now to 32GB (heap of 8GB), and the OOM still appears, although the index is not that large compared to what is described in other SO questions (featuring millions of documents).
Please note that I cannot reproduce it on my less powerful dev machine (16GB of RAM, so 4GB of heap) after copying the 800MB index from the production machine to it.
So to me there could be a memory leak. That's why I followed the Netbeans post on memory leaks on my dev machine with the 800MB index. From what I see, I guess there is a memory leak, since indexing after indexing the number of surviving generations keeps increasing during the "index cleaning" (steep lines below):
What should I do? 8GB of heap is already a huge amount compared to the index characteristics. Increasing the heap does not seem to make sense, because the OOM only appears during the "index cleaning", not while actually indexing large documents, and it seems to be caused by the surviving generations, doesn't it? Would creating a query object and then calling getResults on it help the garbage collector?
Is there another method to get all document paths? Or maybe retrieving them chunk by chunk (pagination) would help, even for that small number of documents?
Any help appreciated

After a while I finally came across this post. It describes my issue exactly:
An out of memory (OOM) error typically occurs after a query comes in with a large rows parameter. Solr will typically work just fine up until that query comes in.
So they advise (emphasis mine):
The rows parameter for Solr can be used to return more than the default of 10 rows. I have seen users successfully set the rows parameter to 100-200 and not see any issues. However, setting the rows parameter higher has a big memory consequence and should be avoided at all costs.
And this is what I see while retrieving 100 results per page:
The number of surviving generations has decreased dramatically, although the garbage collector's activity is much more intensive and the computation time is much greater. But if this is the cost of avoiding the OOM, it is OK (the program loses a few seconds per index update, and an update can last several hours)!
Increasing the number of rows to 500 already makes the memory leak happen again (number of surviving generations increasing):
Please note that setting the number of rows to 200 did not cause the number of surviving generations to increase a lot (I did not measure it), but it did not perform much better in my test case (less than 2%) than the "100" setting:
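So here is the code I used to retrieve all documents from an index (from Solr's wiki):
import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrQuery.SortClause;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.params.CursorMarkParams;

// Cursor paging needs a sort that includes the uniqueKey field ("id" here).
SolrQuery q = (new SolrQuery(some_query)).setRows(r).setSort(SortClause.asc("id"));
String cursorMark = CursorMarkParams.CURSOR_MARK_START;
boolean done = false;
while (!done) {
    q.set(CursorMarkParams.CURSOR_MARK_PARAM, cursorMark);
    QueryResponse rsp = solrServer.query(q); // fetches one page of at most r documents
    String nextCursorMark = rsp.getNextCursorMark();
    doCustomProcessingOfResults(rsp);
    if (cursorMark.equals(nextCursorMark)) { // an unchanged cursor means no more results
        done = true;
    }
    cursorMark = nextCursorMark;
}
TL;DR: Don't pass too large a number to query.setRows, i.e. nothing greater than 100-200, as a higher number is very likely to cause an OOM.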
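The paging pattern itself is independent of Solr. As a rough illustration, here is a self-contained Python sketch of the same cursor loop run against an in-memory stand-in. `INDEX`, `fetch_page` and `iter_all_ids` are hypothetical names invented for this sketch, not part of SolrJ or any Solr client; the point is only that the caller never holds more than one page at a time.

```python
import bisect
from typing import Iterator, List, Optional, Tuple

# Hypothetical stand-in for the index: ids kept sorted, since cursor paging
# relies on a stable sort order (the Java loop sorts on "id").
INDEX: List[str] = [f"doc-{i:05d}" for i in range(1_000)]


def fetch_page(cursor: Optional[str], rows: int) -> Tuple[List[str], Optional[str]]:
    """Hypothetical server call: up to `rows` ids after `cursor`, plus the next cursor."""
    start = 0 if cursor is None else bisect.bisect_right(INDEX, cursor)
    page = INDEX[start:start + rows]
    next_cursor = page[-1] if page else cursor  # unchanged cursor means "no more data"
    return page, next_cursor


def iter_all_ids(rows: int = 100) -> Iterator[str]:
    """Yield every id while holding at most one page in memory at a time."""
    cursor: Optional[str] = None
    while True:
        page, next_cursor = fetch_page(cursor, rows)
        yield from page
        if next_cursor == cursor:  # same stop condition as the cursorMark loop
            return
        cursor = next_cursor


total = sum(1 for _ in iter_all_ids(rows=100))
print(total)
```

With a page size of 100 the loop makes eleven calls for 1,000 ids: ten full pages and one empty page that signals the end.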

One of the OSB nodes has a high response time

We have two OSB nodes in a cluster. One node, osb1, has a low overall response time (1 sec) when measured in AppDynamics; the other node, osb2, has a high response time (20 sec). We brought down each of these nodes in turn and tested the other individually, and we see the same behavior. Any suggestions on what to look into to identify the issue? The OSB configuration is identical across both nodes, and so is the JVM configuration. Heap usage is the same. CPU usage differs a bit.

Cassandra insert performance on a single node C APIs

I am trying to profile Cassandra on a single-node cluster to see how many inserts one node can handle, and then add more nodes based on the result.
I have changed certain parameters in cassandra.yaml. They are as follows.
memtable_offheap_space_in_mb: 4096
memtable_allocation_type: offheap_buffers
concurrent_compactors: 8
compaction_throughput_mb_per_sec: 32768
concurrent_reads: 64
concurrent_writes: 128
concurrent_counter_writes: 128
write_request_timeout_in_ms: 200000
Cassandra node: JVM heap size 12GB
I have added these parameters to the cassandra C++ driver APIs
cass_cluster_set_num_threads_io(cluster, 8);
cass_cluster_set_core_connections_per_host(cluster,8);
cass_cluster_set_write_bytes_high_water_mark(cluster,32768000);
cass_cluster_set_pending_requests_high_water_mark(cluster,16384000);
With these parameters I get a write speed of 13k/sec with a data size of 1250 bytes.
I would like to know whether I am missing anything in terms of parameter tuning that would give better performance.
Cassandra DB node details:
VM
CentOs 6
16GB RAM
8 cores. It is running on a separate box from the machine I am pumping data from.
Any insight will be highly appreciated.

What is the length of time to send a list of 200,000 integers from a client's browser to an internet server?

Over the connections that most people in the USA have in their homes, what is the approximate length of time to send a list of 200,000 integers from a client's browser to an internet server (say Google App Engine)? Does it change much if the data is sent from an iPhone?
How does the length of time increase as the size of the integer list increases (say with a list of a million integers) ?
Context: I wasn't sure if I should write code to do some simple computations and sorting of such lists for the browser in javascript or for the server in python, so I wanted to explore this issue of how long it takes to send the output data from a browser to a server over the web in order to help me decide where (client's browser or app engine server) is the best place for such computations to be processed.
More Context:
Type of Integers: I am dealing with 2 lists of integers. One is a list of ids for the 200,000 objects whose integers look like {0,1,2,3,...,99,999}. The second list of 100,000 is just single digits {...,4,5,6,7,8,9,0,1,...} .
Type of Computations: From the browser a person will create her own custom index (or rankings) by changing the weights associated with about 10 variables that describe the 100,000 objects. INDEX = w1*Var1 + w2*Var2 + ... + wN*VarN. So the computations are multiplication of a vector (array) by a scalar and addition of 2 vectors, as well as sorting the final INDEX vector of 100,000 values.
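The computation described above can be sketched in a few lines of plain Python. The function names and the toy data are hypothetical and only illustrate the formula and the sort; a real implementation over 100,000 objects would probably use an array library.

```python
from typing import List, Sequence


def weighted_index(weights: Sequence[float],
                   variables: Sequence[Sequence[float]]) -> List[float]:
    """INDEX[j] = w1*Var1[j] + w2*Var2[j] + ... + wN*VarN[j] for every object j."""
    if len(weights) != len(variables):
        raise ValueError("need exactly one weight per variable")
    n_objects = len(variables[0])
    return [
        sum(w * var[j] for w, var in zip(weights, variables))
        for j in range(n_objects)
    ]


def rank_objects(index: Sequence[float]) -> List[int]:
    """Object ids ordered from highest to lowest index value."""
    return sorted(range(len(index)), key=lambda j: index[j], reverse=True)


# Hypothetical toy data: 2 variables over 4 objects
# (the question has about 10 variables over 100,000 objects).
weights = [0.5, 2.0]
variables = [[4, 6, 0, 2], [1, 0, 3, 2]]

index = weighted_index(weights, variables)
ranking = rank_objects(index)
print(index, ranking)
```

The work is one multiply-add per variable per object plus one sort, so the cost grows roughly linearly with the number of objects, apart from the sort.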

In a nutshell...
This is probably a bad idea,
in particular for mobile devices where, aside from the delay associated with the transfer(s), limits and/or extra fees for monthly volumes exceeding the plan's allowance make this a lousy option economically...
A rough estimate (more info below) is that the one-way transmission takes between 0.7 and 5 seconds.
There is a lot of variability in this estimate, due mainly to two factors:
Network technology and plan
Compression ratio that can be obtained for the 200k integers
Since the network characteristics are more or less a given, the most significant improvement would come from the compression ratio. This in turn depends greatly on the statistical distribution of the 200,000 integers. For example, if most of them are smaller than, say, 65,000, the list would quite likely compress to about 25% of its original size (a 75% size reduction). The time estimates provided assume only a 25 to 50% size reduction.
Another network consideration is the availability of binary mime extension (8 bits mime) which would avoid the 33% overhead of B64 for example.
Other considerations / idea:
This type of network usage for iPhone / mobile devices plans will not fare very well!!!
ATT will love you (maybe), your end-users will hate you at least the ones with plan limits, which many (most?) have.
Rather than sending one big list, you could split the list over 3 or 4 chunks, allowing the server-side sorting to take place [mostly] in parallel to the data transfer.
One gets better compression ratio for integers when they are [roughly] sorted, maybe you can have a first pass sorting of some kind client-side.
How do I figure? ...
1) Amount of data to transfer (one-way)
200,000 integers
= 800,000 bytes (assumes 4 bytes integers)
= 400,000 to 600,000 bytes compressed (you'll want to compress!)
= 533,000 to 800,000 bytes in B64 format for MIME encoding
2) Time to upload (varies greatly...)
Low-end home setup (ADSL) = 3 to 5 seconds
broadband (eg DOCSIS) = 0.7 to 1 second
iPhone = 0.7 to 5 seconds possibly worse;
possibly a bit better with high-end plan
3) Time to download (back from server, once list is sorted)
Assume same or slightly less than upload time.
With portable devices, the differential is more notable.
The question is unclear about what would have to be done with the resulting
(sorted) array, so I didn't worry too much about the "return trip".
==> Multiply by 2 (or 1.8) for a safe estimate of a round trip, or inquire
about the specific network/technology.
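The size figures in step 1 can be reproduced with a short calculation. This is a sketch only: the uplink speeds in the loop are assumed values picked for illustration, not numbers taken from this answer, and the time formula ignores latency and protocol overhead.

```python
import math


def base64_len(n_bytes: int) -> int:
    """Length of the Base64 encoding of n_bytes (with padding)."""
    return 4 * math.ceil(n_bytes / 3)


def upload_seconds(n_bytes: int, uplink_bits_per_second: float) -> float:
    """Pure transfer time; ignores latency, protocol overhead and congestion."""
    return n_bytes * 8 / uplink_bits_per_second


raw_bytes = 200_000 * 4                            # 4-byte integers
compressed = (raw_bytes // 2, raw_bytes * 3 // 4)  # 50% and 25% size reduction
encoded = tuple(base64_len(n) for n in compressed)
print(raw_bytes, compressed, encoded)

# Assumed uplink speeds, for illustration only (not taken from the answer).
for label, bps in [("1 Mbit/s", 1_000_000), ("5 Mbit/s", 5_000_000)]:
    best, worst = (upload_seconds(n, bps) for n in encoded)
    print(f"{label}: {best:.1f} to {worst:.1f} s")
```

Swapping in a measured uplink speed for a given connection gives a more realistic one-way estimate.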

By default, typically integers are stored in a 32-bit value, or 4 bytes. 200,000 integers would then be 800,000 bytes, or 781.25 kilobytes. It would depend on the client's upload speed, but at 640Kbps upload, that's about 10 seconds.

Well, that is 800,000 bytes or 781.3 KB, or, you could say, the size of a normal JPEG photo. For broadband, that would be within seconds, and you could always consider compression (there are libraries for this).
The time increases linearly with the amount of data.

Since you're sending the data from JavaScript to the server, you'll be using a text representation. The size will depend a lot on the number of digits in each integer. Are we talking about 200,000 two- to three-digit integers or six- to eight-digit integers? It also depends on whether HTTP compression is enabled and whether Safari on the iPhone supports it (I'm not sure).
The amount of time will be linear depending on the size. Typical upload speeds on an iPhone will vary a lot depending on if the user is on a business wifi, public wifi, home wifi, 3G, or Edge network.
If you're so dependent on performance perhaps this is more appropriate for a native app than an HTML app. Even if you don't do the calculations on the client, you can send/receive binary data and compress it which will reduce time.
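To get a feel for the text-representation size, here is a small sketch that serializes a list of 200,000 ids as JSON and compresses it with zlib (deflate). The id range 0 to 199,999 is an assumption made for this sketch; real data with a different digit distribution will give different numbers.

```python
import json
import zlib

ids = list(range(200_000))  # hypothetical id list: 0 .. 199,999

# Compact JSON text, as a browser might send it (no spaces after commas).
text = json.dumps(ids, separators=(",", ":")).encode("ascii")
packed = zlib.compress(text, 6)

text_bytes = len(text)
packed_bytes = len(packed)
print(f"JSON text: {text_bytes:,} bytes")
print(f"deflated:  {packed_bytes:,} bytes")
```

Note that the text form of these ids is larger than the 800,000 bytes a 4-byte binary encoding would take, which is one reason the binary and compression suggestions above matter.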
