OutOfMemoryException while reading Excel data - JExcelApi

I am trying to read data from an Excel file (xlsx format) that is about 100 MB in size. While reading the data I get an OutOfMemoryException. I tried increasing the JVM heap size to 1024 MB, but it did not help, and I cannot increase it beyond that. I also tried forcing garbage collection, with no effect. Can anyone help me resolve this issue?
Thanks
Pavan Kumar O V S.

By default, a JVM places an upper limit on the amount of memory available to the current process, to prevent runaway processes from gobbling system resources and grinding the machine to a halt. When reading or writing large spreadsheets, the process may require more memory than has been allocated to the JVM by default; this normally manifests itself as a java.lang.OutOfMemoryError.
For command-line processes, you can allocate more memory to the JVM using the -Xms and -Xmx options. For example, to allocate an initial heap of 10 MB with 100 MB as the upper bound:
java -Xms10m -Xmx100m -classpath jxl.jar:. YourReaderClass spreadsheet.xls
(YourReaderClass stands in for whichever main class in your own application reads the spreadsheet.)
You can refer to http://www.andykhan.com/jexcelapi/tutorial.html#introduction for further details

Related

Flink RocksDB per-slot memory configuration issue

I have 32 GB of managed memory and 8 task slots.
Since state.backend.rocksdb.memory.managed is set to true, the RocksDB instance in each task slot uses 4 GB of memory.
Some of my tasks do not need a RocksDB backend, so I want to raise this 4 GB to 6 GB by setting state.backend.rocksdb.memory.fixed-per-slot: 6000m.
The problem is that when I set state.backend.rocksdb.memory.fixed-per-slot: 6000m, I can no longer see the allocated managed memory on the task manager page of the Flink UI.
When state.backend.rocksdb.memory.fixed-per-slot is not set and state.backend.rocksdb.memory.managed is true, 4 GB of usage appears under managed memory for each running task that uses the RocksDB backend.
But after setting state.backend.rocksdb.memory.fixed-per-slot: 6000m, managed memory always shows zero!
1- How can I watch the managed memory allocation after setting state.backend.rocksdb.memory.fixed-per-slot: 6000m?
2- Should state.backend.rocksdb.memory.managed be set to true even if I set fixed-per-slot?
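For reference, the two settings under discussion, as they would appear in flink-conf.yaml (values taken from the question; as the answers below note, fixed-per-slot wins when both are set):
state.backend.rocksdb.memory.managed: true
state.backend.rocksdb.memory.fixed-per-slot: 6000m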
Shared your question with the Speedb hive on Discord, and here's the "honey" we got for you:
We don't have much experience with Flink setups or how to configure the memory limits and their different parameters. However, RocksDB uses a shared Block Cache to control the memory limits of your state. So for question 1, you could grep "block_cache:" and "capacity :" from all the LOG files of all the DBs (operators). The total memory limit allocated to RocksDB through the block cache is the sum of the capacities over all the unique pointers; the same block cache (memory) can be shared across DBs.
Do note that RocksDB might use more memory than the block cache capacity.
Hope this helps. If you have follow-up questions or want more help with this, send us a message on Discord.
Another reply we got from the hive:
"fixed-per-slot overrides the managed memory setting, so a managed value of zero is expected (it is either fixed-per-slot or managed). As Yuval wrote, you can see the memory instances by checking the LRU caches.
One more thing to check is the write_buffer_manager pointer in the RocksDB LOG file. It will be different for each operator if neither fixed-per-slot nor managed memory is used, and shared between instances otherwise."
Let us know if this is useful.
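A hedged sketch of that grep over the RocksDB LOG files (the path is a placeholder; the real location depends on state.backend.rocksdb.localdir and your deployment):
grep -E "block_cache:|capacity :" /path/to/rocksdb/working/dirs/*/LOG
Summing the capacities over the distinct block_cache pointers gives the total memory limit allocated to RocksDB through the block cache.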

Can I force memory usage of vespa-proton-bin?

I found that vespa-proton-bin was already using 68 GB of memory on my system. I tried limiting memory at the Docker level, but that randomly kills the process, which can be a huge problem.
Is there any Vespa setting to force vespa-proton-bin to use only a certain amount of memory? Thanks.
Great question!
There is no explicit way to tell Vespa to use only x GB of memory, but by default Vespa will block feeding if 80% of the memory is already in use; see https://docs.vespa.ai/documentation/writing-to-vespa.html#feed-block. Using Docker limits will only cause random OOM kills, which is not what you want.
I'm guessing that you have a lot of attribute fields, which are in-memory structures; see https://docs.vespa.ai/documentation/performance/attribute-memory-usage.html.
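For reference, a minimal services.xml sketch of where the feed-block limit can be tuned, assuming a Vespa version that supports the resource-limits tuning element (verify against the docs for your version):
<content id="mycluster" version="1.0">
  <tuning>
    <resource-limits>
      <memory>0.70</memory>
    </resource-limits>
  </tuning>
  ...
</content>
This lowers the block threshold rather than hard-capping proton's usage; there is still no strict memory limit.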

Tomcat 6 Memory Consumption

Tomcat's memory usage keeps increasing every minute. Currently the max limit is set to 1024 MB. If I increase the max limit, Tomcat does not start. Please let me know whether there is any way to reduce Tomcat's memory usage.
I assume that by "max limit" you mean you have set the maximum heap size (i.e., -Xmx1024M) as one of the options in catalina.bat.
It may be that your machine does not have enough RAM, and hence with a higher heap size Tomcat fails to start (or other processes are running and Tomcat does not get enough memory to start up).
With 1 GB (the value you currently have), set the flags below as well, to get more information when the server goes out of memory and to enable concurrent garbage collection:
-XX:+UseConcMarkSweepGC -XX:+HeapDumpOnOutOfMemoryError
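For example, in catalina.bat (or a setenv.bat next to it) this could look like the line below; the heap-dump path is a made-up example:
set CATALINA_OPTS=-Xmx1024m -XX:+UseConcMarkSweepGC -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=C:\tomcat\dumps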
Also, 'jvisualvm.exe' (shipped in the JDK's bin directory) is the best way to start analyzing your server to find where the memory leak actually is.

Need a way to optimize heap usage in C - FreeBSD/glibc

I have an application that reads an external XML file, parses the content, and creates a list.
The file is updated frequently by external vendors, so I have no control over the number of entries or the contents of the file. My problem is that my application uses an inordinate amount of memory, as evidenced by the ps aux output (RSS column). I traced it to the fact that, while parsing the file, my application links against libxml2.so.3.
It does this to parse the XML file; the outline of the culprit function is below:
processxml_func() {
    /* 1) Purely a libxml2 call; it internally allocates around 7 MB,
          so I have no control over its memory allocations. */
    xmlDocPtr doc = xmlReadFile(app_xml, NULL, 0);
    /* 2) Parse the XML contents, again using libxml2 calls, and create
          the list (using malloc). The list has members like name
          (variable length) and description (variable length), is
          long-lived, and must survive until the program terminates. */
    /* 3) Free the memory allocated in 1. */
    xmlFreeDoc(doc);
}
Using pmap, I traced that after the call to processxml_func, my program's heap memory has increased by ~10 MB. Single-stepping through gdb, I noticed that:
step 1: allocates 7 MB --> heap usage is 7 MB
step 2: allocates 3 MB --> heap usage is 7+3 MB
step 3: frees the memory from step 1 --> usage is still 10 MB
I believe the heap usage is still 10 MB after step 3 because, during step 2, we allocated some of the memory from the fragmented blocks/remnants of step 1, so the allocator cannot release the entire 7 MB even after step 3. I commented out step 2 and confirmed that heap usage is close to zero after we exit the function.
Given the above, I need suggestions/tweaks to reduce the heap footprint of my application.
One approach I was considering is to fork a new process for processxml_func, use IPC to transfer the list to the parent process (the list would have to be reconstructed there), and then kill the child process.
Just wondering: a) is there a better way of doing this?
b) are there flags to control malloc behaviour?
Thanks for your time.
My environment:
C language / FreeBSD, /usr/lib/libc.so.6
The general solution is not to read the complete file into memory. Instead, read a chunk, extract what you need from it, read some more, extract some more. Your XML parser must support this mode.
Googling "streaming XML parser" gives many results, which seems to indicate that such parsers are not uncommon.
P.S. 10 MB is almost nothing. I wouldn't worry unless the file is expected to grow into the GB range.
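Since the application already links against libxml2, one candidate is its streaming xmlReader interface. A minimal sketch, with error handling and the list-building reduced to a placeholder:

#include <stdio.h>
#include <libxml/xmlreader.h>

static void process_file(const char *filename)
{
    xmlTextReaderPtr reader = xmlReaderForFile(filename, NULL, 0);
    if (reader == NULL) {
        fprintf(stderr, "unable to open %s\n", filename);
        return;
    }
    /* Pull one node at a time: only a small window of the document is
       in memory, instead of the full ~7 MB tree that xmlReadFile builds. */
    while (xmlTextReaderRead(reader) == 1) {
        if (xmlTextReaderNodeType(reader) == XML_READER_TYPE_ELEMENT) {
            /* Copy name/description into your own malloc'd list here. */
            printf("element: %s\n",
                   (const char *) xmlTextReaderConstName(reader));
        }
    }
    xmlFreeTextReader(reader);
}

Because only your own long-lived list survives the parse, the fragmentation interleaving described in the question (step 2 allocating out of step 1's blocks) largely disappears.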
Mini-XML (http://www.msweet.org/projects.php?Z3)
I recommend this XML parser; I have used this xml-lib in my DOCSIS NMS application (under Debian 7.0) for almost two years. I think its documentation is very good, it is easy to learn, and the API is stable.

Uploading Large (8 GB) File Issue Using Weka

I am trying to load an 8 GB file into Weka to run the Apriori algorithm. The server configuration is as follows:
It is an 8-processor server with 4 cores each; physical address space = 40 bits and virtual address space = 48 bits. It is a 64-bit processor.
Physical memory = 26 GB and swap = 27 GB.
The JVM is 64-bit. We have allocated 32 GB for the JVM heap using the -Xmx option. Our concern is that loading such a huge file is taking a very long time (around 8 hours); Java is using 107% CPU and 91% memory, it has not thrown an out-of-memory exception, and Weka still shows "reading from file".
Please help me understand how to handle such a huge file and what exactly is happening here.
Regards,
Aniket
I can't speak to Weka, and I don't know your data set or how many elements are in it. The number of elements matters: in a 64-bit JVM the pointers are huge, and they add up.
But do NOT create a JVM heap larger than physical RAM. Swap is simply not an option for Java; a swapping JVM is a dead JVM. Swap is for idle, rarely used processes.
Also note that the -Xmx value and the physical heap size are not the same; the physical size will always be larger than the -Xmx value.
You should pre-allocate your JVM heap (-Xms == -Xmx) and try out various values until MOST of your physical RAM is consumed. This limits full GCs and memory fragmentation. It also helps (a little) to do this on a freshly started system when you are allocating such a large portion of the total memory space.
But whatever you do, do not let Java swap. Swapping and garbage collectors do not mix.
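As a hedged example on this box (26 GB physical RAM, so the heap must stay comfortably below that; the jar path and the Apriori class/flag are the standard Weka CLI, but verify them against your install):
java -Xms20g -Xmx20g -cp weka.jar weka.associations.Apriori -t data.arff
Note this contradicts the 32 GB -Xmx in the question: with only 26 GB of physical memory, a 32 GB heap guarantees swapping.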
