LIBUMEM says NO memory leaks, but PRSTAT on Solaris shows leaking? - c

I have an application I have been trying to get "memory leak free". I have done thorough testing on Linux using TotalView's MemoryScape and found no leaks. I have ported the application to Solaris (SPARC) and there is a leak I am trying to find...
I have used libumem on Solaris and it seems to me like it also picks up NO leaks...
Here is my startup command:
LD_PRELOAD=libumem.so UMEM_DEBUG=audit ./link_outbound config.ini
Then I immediately checked prstat on Solaris to see what the startup memory usage was:
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
9471 root 44M 25M sleep 59 0 0:00:00 1.1% link_outbou/3
Then I started to send thousands of messages to the application... and over time the numbers in prstat grew:
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
9471 root 48M 29M sleep 59 0 0:00:36 3.5% link_outbou/3
And just before I eventually stopped it:
PID USERNAME SIZE RSS STATE PRI NICE TIME CPU PROCESS/NLWP
9471 root 48M 48M sleep 59 0 0:01:05 5.3% link_outbou/3
Now the interesting part: when I use libumem on this application while it is showing 48 MB of memory, I get the following:
pgrep link
9471
# gcore 9471
gcore: core.9471 dumped
# mdb core.9471
Loading modules: [ libumem.so.1 libc.so.1 ld.so.1 ]
> ::findleaks
BYTES LEAKED VMEM_SEG CALLER
131072 7 ffffffff79f00000 MMAP
57344 1 ffffffff7d672000 MMAP
24576 1 ffffffff7acf0000 MMAP
458752 1 ffffffff7ac80000 MMAP
24576 1 ffffffff7a320000 MMAP
131072 1 ffffffff7a300000 MMAP
24576 1 ffffffff79f20000 MMAP
------------------------------------------------------------------------
Total 7 oversized leaks, 851968 bytes
CACHE LEAKED BUFCTL CALLER
----------------------------------------------------------------------
Total 0 buffers, 0 bytes
>
The "7 oversized leaks, 851968 bytes" never changes if I send 10 messages through the application or 10000 messages...it is always "7 oversized leaks, 851968 bytes". Does that mean that the application is not leaking according to "libumem"?
What is so frustrating is that on Linux the memory stays constant, never changes....yet on Solaris I see this slow, but steady growth.
Any idea what this means? Am I using libumem correctly? What could be causing the PRSTAT to be showing memory growth here?
Any help on this would be greatly appreciated....thanks a million.

If the SIZE column doesn't grow, you're not leaking.
RSS (resident set size) is how much of that memory you are actively using; it's normal for that value to change over time. If you were leaking, SIZE would grow over time (and RSS could stay constant, or even shrink).
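You can see the difference with a trivial program (a sketch, nothing to do with your application): a single malloc fixes SIZE up front, and RSS then creeps up only as the pages are actually written, which is the same pattern your prstat output shows.

/* Sketch: SIZE grows at the malloc, RSS grows only as pages are touched. */
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    size_t len = 32 * 1024 * 1024;            /* 32 MB of address space */
    char *buf = malloc(len);                  /* SIZE jumps here */
    if (buf == NULL)
        return 1;

    long pagesize = sysconf(_SC_PAGESIZE);
    for (size_t off = 0; off < len; off += (size_t)pagesize) {
        buf[off] = 1;                         /* RSS grows here, one page at a time */
        usleep(10000);                        /* slow enough to watch with prstat */
    }

    free(buf);
    return 0;
}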

Check out this page.
The preferred options are UMEM_DEBUG=default UMEM_LOGGING=transaction LD_PRELOAD=libumem.so.1. Those are the options I use for debugging Solaris memory-leak problems, and they work fine for me.
Based on my experience with RHEL 5 and Solaris SunOS 5.9/5.10, a Linux process's memory footprint doesn't increase gradually; instead it seems to grab a large chunk of memory when it needs more and keeps using it for a long time (purely an observation, I haven't researched its memory allocation mechanism). So you should send a lot more data (10K messages is not much).
You can also try the dtrace tool to investigate memory problems on Solaris.
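If you want to confirm that your libumem setup catches ordinary heap leaks at all, a deliberately leaking test program (purely hypothetical, just for verification) is useful: run it under the same LD_PRELOAD and UMEM settings, gcore it, and ::findleaks should report the lost buffers. If leaks show up there but not in your real application, the C heap itself is probably clean.

/* Deliberate leak, only to verify that ::findleaks reports malloc'd buffers
   under LD_PRELOAD=libumem.so.1 with UMEM_DEBUG/UMEM_LOGGING set. */
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    for (int i = 0; i < 100; i++) {
        char *p = malloc(1024);     /* never freed: should show up as leaked */
        if (p != NULL)
            memset(p, 0xAB, 1024);
        sleep(1);                   /* leaves time to gcore the process */
    }
    return 0;
}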
Jack

Related

Why is my program out of memory when the OS still has plenty of it?

I added some features that require a lot of memory to a C program, and now it crashes every time I run it, but it works fine when I reduce the memory allocation. Debugging with GDB, I find some bizarre results: an assignment changes other pointers, and printf doesn't work. I thought this might be a memory problem, but using top to monitor memory usage I find there is still plenty of memory.
Finally I used Valgrind to monitor memory, and even Valgrind ran out of memory:
==3449== Valgrind's memory management: out of memory:
==3449== memcheck:allocate new SecMap's request for 16384 bytes failed.
==3449== 7,625,605,120 bytes have already been mmap-ed ANONYMOUS.
==3449== Valgrind cannot continue. Sorry.
==3449==
==3449== There are several possible reasons for this.
==3449== - You have some kind of memory limit in place. Look at the
==3449== output of 'ulimit -a'. Is there a limit on the size of
==3449== virtual memory or address space?
==3449== - You have run out of swap space.
==3449== - Valgrind has a bug. If you think this is the case or you are
==3449== not sure, please let us know and we'll try to fix it.
==3449== Please note that programs can take substantially more memory than
==3449== normal when running under Valgrind tools, eg. up to twice or
==3449== more, depending on the tool. On a 64-bit machine, Valgrind
==3449== should be able to make use of up 32GB memory. On a 32-bit
==3449== machine, Valgrind should be able to use all the memory available
==3449== to a single process, up to 4GB if that's how you have your
==3449== kernel configured. Most 32-bit Linux setups allow a maximum of
==3449== 3GB per process.
==3449==
==3449== Whatever the reason, Valgrind cannot continue. Sorry.
But according to ulimit -a there is no limit on virtual memory:
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 96254
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 96254
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
The output of top shows the server's memory configuration:
KiB Mem : 24678616 total, 13461572 free, 3457528 used, 7759516 buff/cache
KiB Swap: 0 total, 0 free, 0 used. 20754480 avail Mem
So what could be causing this result?
update:
OS:
Linux astl09 4.4.0-83-generic #106-Ubuntu SMP Mon Jun 26 17:54:43 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
The program is disksim 4.0, which I am using for some research.
valgrind:
valgrind-3.11.0

Why am I getting a discrepancy in memory usage in my C application?

How can I understand the discrepancy between the 21 MiB of private memory reported for my C application by ps_mem and the much smaller numbers reported by Valgrind for current allocations?
The system monitor shows the readings below.
I've also used ps_mem to check used memory.
Running /proc/2101/smaps outputs the following; 4432 is the pid of my application.

How can malloc() allocate more memory than RAM on Red Hat?

System information: Linux version 2.6.32-573.12.1.el6.x86_64 (mockbuild@x86-031.build.eng.bos.redhat.com) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-16) (GCC) ) #1 SMP Mon Nov 23 12:55:32 EST 2015
RAM 48 GB
Problem: I want to malloc() 100 GB of memory, but it fails to allocate on the Red Hat system.
I find that 100 GB can be allocated on macOS with 8 GB of RAM (compiled with clang). I am very confused by that.
Maybe it is the lazy allocation described in this link? Why malloc() doesn't stop on OS X?
But why can't a Linux system do that? I tried Ubuntu and Red Hat; both fail to do it.
Result:
After investigation, I found that the following two steps let malloc() allocate without limit:
echo "1" > /proc/sys/vm/overcommit_memory
ulimit -v unlimited
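A minimal test along these lines (a sketch, not my real program) shows the effect: with the two settings above, the huge allocation succeeds because no pages are actually touched; without them it fails.

/* Sketch: try to reserve 100 GB of address space. With
   vm.overcommit_memory=1 and no ulimit -v cap, the malloc normally
   succeeds even on a 48 GB machine, because nothing is written to the
   pages yet. Touching all of it would eventually invoke the OOM killer. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t want = 100ULL * 1024 * 1024 * 1024;   /* 100 GB */
    void *p = malloc(want);
    printf("malloc(100 GB) %s\n", p != NULL ? "succeeded" : "failed");
    free(p);
    return 0;
}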
Possible reasons:
The system is out of physical RAM or swap space.
In 32 bit mode, the process size limit was hit.
Possible solutions:
Reduce memory load on the system.
Increase physical memory or swap space.
Check if swap backing store is full.
I think these points can help you understand the malloc failure issue.
Input validation. For example, you've asked for many gigabytes of memory in a single allocation. The exact limit (if any) differs by malloc implementation; POSIX says nothing about a maximum value, short of its type being a size_t.
Allocation failures. With a data segment-based malloc, this means brk failed. This occurs when the new data segment is invalid, the system is low on memory, or the process has exceeded its maximum data segment size (as specified by RLIMIT_DATA). With a mapping-based malloc, this means mmap failed. This occurs when the process is out of virtual memory (specified by RLIMIT_AS), has exceeded its limit on mappings, or has requested too large a mapping. Note most modern Unix systems use a combination of brk and mmap to implement malloc, choosing between the two based on the size of the requested allocation.
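If you want to check which of those limits actually applies to your process, getrlimit(2) reports them directly; a short sketch:

/* Print the address-space and data-segment limits mentioned above
   (RLIMIT_AS and RLIMIT_DATA), the two rlimits that can make a large
   malloc() fail even when the machine has memory to spare. */
#include <stdio.h>
#include <sys/resource.h>

static void show(const char *name, int resource)
{
    struct rlimit rl;
    if (getrlimit(resource, &rl) == 0) {
        if (rl.rlim_cur == RLIM_INFINITY)
            printf("%s: unlimited\n", name);
        else
            printf("%s: %llu bytes\n", name, (unsigned long long)rl.rlim_cur);
    }
}

int main(void)
{
    show("RLIMIT_AS  ", RLIMIT_AS);
    show("RLIMIT_DATA", RLIMIT_DATA);
    return 0;
}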

pthread_create fails with ENOMEM on low free memory scenario

I have an SH4 board; here are the specs...
uname -a
Linux LINUX7109 2.6.23.17_stm23_A18B-HMP_7109-STSDK #1 PREEMPT Fri Aug 6 16:08:19 ART 2010
sh4 unknown
and suppose I have eaten pretty much all the memory, and have only 9 MB left.
free
total used free shared buffers cached
Mem: 48072 42276 5796 0 172 3264
-/+ buffers/cache: 38840 9232
Swap: 0 0 0
Now, when I try to launch a single thread with the default stack size (8 MB), pthread_create fails with ENOMEM. If I strace my test code, I can see that the call that is failing is mmap:
old_mmap(NULL, 8388608, PROT_READ|PROT_WRITE|PROT_EXEC,
MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = -1 ENOMEM (Cannot allocate memory)
However, when I set the stack size to a lower value using ulimit -s:
ulimit -s 7500
I can now launch 10 threads. Each thread does not allocate anything, so it is only consuming the minimum overhead (approx. 8 kB per thread, right?).
So, my question is:
Knowing that mmap doesn't actually consume the memory, why is pthread_create() (or mmap) failing when the available memory is below the thread stack size?
The VM setting /proc/sys/vm/overcommit_memory (aka. sysctl vm.overcommit_memory) controls whether Linux is willing to hand out more address space than the combined RAM+swap of the machine. (Of course, if you actually try to access that much memory, something will crash. Try a search on "linux oom-killer"...)
The default for this setting is 0. I am going to speculate that someone set it to something else on your system.
Under glibc, the default stack size for threads is 2-10 megabytes (often 8). You should use pthread_attr_setstacksize and call pthread_create with the resulting attributes object to request a thread with a smaller stack.
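For example (a sketch, not your code), requesting a 256 kB stack instead of the default:

/* Sketch: ask for a smaller stack so thread creation doesn't need an
   8 MB anonymous mapping. 256 kB is an arbitrary example value; pick
   what your threads actually need (at least PTHREAD_STACK_MIN). */
#include <pthread.h>
#include <stdio.h>

static void *worker(void *arg)
{
    (void)arg;
    return NULL;
}

int main(void)
{
    pthread_attr_t attr;
    pthread_t tid;

    pthread_attr_init(&attr);
    pthread_attr_setstacksize(&attr, 256 * 1024);   /* instead of the 8 MB default */

    int err = pthread_create(&tid, &attr, worker, NULL);
    if (err != 0) {
        fprintf(stderr, "pthread_create failed: %d\n", err);
        return 1;
    }
    pthread_join(tid, NULL);
    pthread_attr_destroy(&attr);
    return 0;
}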
mmap consumes address space.
Pointers have to uniquely identify a piece of "memory" (including mmap'd files) in the address space.
A 32-bit pointer can only address 2-3 GB of memory (32 bits = 2^32 = 4 GB, but some of the address space is reserved by the kernel). That address space is limited.
All threads in a process share the same address space, but different processes have separate address spaces.
This is the operating system's only chance to fail the operation gracefully. If the implementation allows this operation to succeed, it could run out of memory during an operation that it cannot return a failure code for, such as the stack growing. The operating system prefers to let an operation fail gracefully than risk having to kill a completely innocent process.

Freeing of allocated memory in Solaris/Linux

I have written a small program and compiled it on both Solaris and Linux to measure the performance of applying this code to my application.
The program works like this: first, using the sbrk(0) system call, I take the base address of the heap region. After that I allocate 1.5 GB of memory with malloc (a library call, not a system call), then use memcpy to copy 1.5 GB of content into the allocated area. Then I free the allocated memory.
After freeing, I use sbrk(0) again to view the heap size.
This is where I get a little confused. On Solaris, even though I freed the memory I allocated (nearly 1.5 GB), the heap size of the process stays huge. But when I run the same application on Linux, after freeing, I find that the heap size of the process is back to what it was before the 1.5 GB allocation.
I know Solaris does not free memory immediately, but I don't know how to tune the Solaris kernel to release the memory immediately after the free() call.
Why don't I have the same problem under Linux?
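For reference, the test program is essentially this (a simplified sketch, using memset in place of the memcpy from a second buffer):

/* Simplified sketch of the test described above: note where the program
   break sits before allocation, after the 1.5 GB malloc and write, and
   after free(). Only sbrk/brk and mmap are system calls here; malloc,
   memset and free are libc calls on top of them. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define SIZE (1500UL * 1024 * 1024)   /* roughly 1.5 GB */

int main(void)
{
    printf("break before malloc: %p\n", sbrk(0));

    char *p = malloc(SIZE);
    if (p == NULL)
        return 1;
    memset(p, 'x', SIZE);             /* stands in for the 1.5 GB memcpy */
    printf("break after malloc:  %p\n", sbrk(0));

    free(p);
    printf("break after free:    %p\n", sbrk(0));
    return 0;
}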
I found the answer to the question I asked.
Application Memory Allocators:
C and C++ developers must manually manage memory allocation and free memory. The default memory allocator is in the libc library.
Libc
Note that after free() is executed, the freed space is made available for further allocation by the application and not returned to the system. Memory is returned to the system only when the application terminates. That's why the application's process size usually never decreases. But for a long-running application, the application process size usually remains in a stable state because the freed memory can be reused. If this is not the case, then most likely the application is leaking memory, that is, allocated memory is used but never freed when no longer in use, and the pointer to the allocated memory is not tracked by the application, basically lost.
The default memory allocator in libc is not good for multi-threaded applications when a concurrent malloc or free operation occurs frequently, especially for multi-threaded C++ applications. This is because creating and destroying C++ objects is part of C++ application development style. When the default libc allocator is used, the heap is protected by a single heap-lock, causing the default allocator not to be scalable for multi-threaded applications due to heavy lock contentions during malloc or free operations. It's easy to detect this problem with Solaris tools, as follows.
First, use prstat -mL -p to see if the application spends much time on locks; look at the LCK column. For example:
-bash-3.2# prstat -mL -p 14052
PID USERNAME USR SYS TRP TFL DFL LCK SLP LAT VCX ICX SCL SIG PROCESS/LWPID
14052 root 0.6 0.7 0.0 0.0 0.0 35 0.0 64 245 13 841 0 test_vector_/721
14052 root 1.0 0.0 0.0 0.0 0.0 35 0.0 64 287 5 731 0 test_vector_/941
14052 root 1.0 0.0 0.0 0.0 0.0 35 0.0 64 298 3 680 0 test_vector_/181
14052 root 1.0 0.1 0.0 0.0 0.0 35 0.0 64 298 3 1K 0 test_vector_/549
....
It shows that the application spends about 35 percent of its time waiting for locks.
Then, using the plockstat(1M) tool, find what locks the application is waiting for. For example, trace the application for 5 seconds with process ID 14052, and then filter the output with the c++filt utility for demangling C++ symbol names. (The c++filt utility is provided with the Sun Studio software.) Filtering through c++filt is not needed if the application is not a C++ application.
-bash-3.2# plockstat -e 5 -p 14052 | c++filt
Mutex block
Count nsec Lock Caller
-------------------------------------------------------------------------------
9678 166540561 libc.so.1`libc_malloc_lock libCrun.so.1`void operator
delete(void*)+0x26
5530 197179848 libc.so.1`libc_malloc_lock libCrun.so.1`void*operator
new(unsigned)+0x38
......
From the preceding, you can see that the heap-lock libc_malloc_lock is heavily contended for and is a likely cause for the scaling issue. The solution for this scaling problem of the libc allocator is to use an improved memory allocator like the libumem library.
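The kind of workload that exposes this is many threads doing malloc/free at the same time. A small hypothetical stress test makes it easy to compare the allocators: run it once as-is (libc allocator), then again with LD_PRELOAD=libumem.so.1 (or linked with -lumem), and watch the LCK column in prstat -mL.

/* Hypothetical stress sketch: many threads hammer malloc/free. With the
   default libc allocator the single heap-lock is heavily contended;
   preloading libumem.so.1 swaps in the scalable slab/magazine allocator,
   so prstat -mL should show far less time in LCK. */
#include <pthread.h>
#include <stdlib.h>

#define THREADS 16
#define ROUNDS  1000000

static void *churn(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        void *p = malloc(64);
        free(p);
    }
    return NULL;
}

int main(void)
{
    pthread_t tid[THREADS];
    for (int i = 0; i < THREADS; i++)
        pthread_create(&tid[i], NULL, churn, NULL);
    for (int i = 0; i < THREADS; i++)
        pthread_join(tid[i], NULL);
    return 0;
}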
Also visit: http://developers.sun.com/solaris/articles/solaris_memory.html
Thanks to all who tried to answer my question,
Santhosh.
