I'm running valgrind on my program and I'm getting the following output (I'm going to omit the 83 errors above this; let me know if I should include them in the log):
==9723== LEAK SUMMARY:
==9723== definitely lost: 0 bytes in 0 blocks
==9723== indirectly lost: 0 bytes in 0 blocks
==9723== possibly lost: 4,676 bytes in 83 blocks
==9723== still reachable: 88,524 bytes in 579 blocks
==9723== suppressed: 0 bytes in 0 blocks
==9723== Reachable blocks (those to which a pointer was found) are not shown.
==9723== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==9723==
==9723== For counts of detected and suppressed errors, rerun with: -v
==9723== ERROR SUMMARY: 83 errors from 83 contexts (suppressed: 3 from 3)
This is the output I get from valgrind no matter how long I run my program, whether it's 2 seconds or 2 minutes.
Since 'possibly lost' doesn't increase over time, is it safe to assume that I do not have a memory leak?
The errors all seem to come from libglib and revolve around g_malloc0 and g_realloc.
"Possibly lost" errors in valgrind cover a subset of scenarios involving pointer chains: typically the only surviving pointer to a block points into its interior rather than at its start. I would definitely chase the cause of this down until you can confirm it's not an issue (at the very least, your memory footprint shouldn't be growing over time), since it can indicate other logic problems in your code.
This post has an answer that addresses it in more detail.
For more information, you can also have a look at the relevant section in the valgrind manual.
Related
I am currently working on unit tests with GLib's testing framework for a C library I am writing. Part of these tests checks that the code fails on expected occasions (I am used to this sort of test from Python, where you would assert that a certain exception was raised). I am using the recipe in the GLib testing manual for g_test_trap_subprocess() (see the minimal example below), which works fine from the unit-testing point of view and gives the correct test results.
My problem is when I run valgrind on the following minimal example (test_glib.c):
#include <glib.h>

void test_possibly_lost(void) {
    if (g_test_subprocess()) {
        g_assert(1 > 2);
    }
    g_test_trap_subprocess(NULL, 0, 0);
    g_test_trap_assert_failed();
}

int main(int argc, char **argv) {
    g_test_init(&argc, &argv, NULL);
    g_test_add_func("/set1/test", test_possibly_lost);
    return g_test_run();
}
compiled with
gcc `pkg-config --libs --cflags glib-2.0` test_glib.c
The output of valgrind --leak-check=full ./a.out is then
==15260== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==15260== Using Valgrind-3.14.0 and LibVEX; rerun with -h for copyright info
==15260== Command: ./a.out
==15260==
/set1/test: OK
==15260==
==15260== HEAP SUMMARY:
==15260== in use at exit: 24,711 bytes in 40 blocks
==15260== total heap usage: 2,507 allocs, 2,467 frees, 235,121 bytes allocated
==15260==
==15260== 272 bytes in 1 blocks are possibly lost in loss record 36 of 40
==15260== at 0x483AB65: calloc (vg_replace_malloc.c:752)
==15260== by 0x4012AC1: allocate_dtv (in /usr/lib/ld-2.29.so)
==15260== by 0x4013431: _dl_allocate_tls (in /usr/lib/ld-2.29.so)
==15260== by 0x4BD51AD: pthread_create@@GLIBC_2.2.5 (in /usr/lib/libpthread-2.29.so)
==15260== by 0x48BE42A: ??? (in /usr/lib/libglib-2.0.so.0.6000.6)
==15260== by 0x48BE658: g_thread_new (in /usr/lib/libglib-2.0.so.0.6000.6)
==15260== by 0x48DCBF0: ??? (in /usr/lib/libglib-2.0.so.0.6000.6)
==15260== by 0x48DCC43: ??? (in /usr/lib/libglib-2.0.so.0.6000.6)
==15260== by 0x48DCD11: g_child_watch_source_new (in /usr/lib/libglib-2.0.so.0.6000.6)
==15260== by 0x48B7DF4: ??? (in /usr/lib/libglib-2.0.so.0.6000.6)
==15260== by 0x48BEA93: g_test_trap_subprocess (in /usr/lib/libglib-2.0.so.0.6000.6)
==15260== by 0x1091DD: test_possibly_lost (in /dir/to/aout/a.out)
==15260==
==15260== LEAK SUMMARY:
==15260== definitely lost: 0 bytes in 0 blocks
==15260== indirectly lost: 0 bytes in 0 blocks
==15260== possibly lost: 272 bytes in 1 blocks
==15260== still reachable: 24,439 bytes in 39 blocks
==15260== suppressed: 0 bytes in 0 blocks
==15260== Reachable blocks (those to which a pointer was found) are not shown.
==15260== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==15260==
==15260== For counts of detected and suppressed errors, rerun with: -v
==15260== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
The possibly lost memory bothers me; coincidentally, my real code also possibly loses 272 bytes, so I think this might be a problem with the way I use GLib rather than with my own structs. Personally, I would treat possibly lost memory as definitely lost, and I would like to get rid of it.
So my question is: is there a free that I could cleverly insert to release this memory, is there a different recipe for checking failed asserts, or are these lost 272 bytes just something I will have to live with?
That's a somewhat odd stack trace for the allocation. g_test_trap_subprocess() is supposed to run the specified test(s) in a subprocess, but it is creating a thread. These are not mutually exclusive -- a subprocess may well also be forked -- but mixing threads with forking is a tricky, finicky business.
In any event, the trace seems to indicate that the problem arises from glib starting a thread that is not properly terminated and cleaned up before your program exits. Since the issue is with an internal thread, the best solution would involve calling an appropriate shutdown function. I don't see such a function documented either specifically for g_test or more generally among the GLib utility functions, nor do I see any documentation of a need to call such a function, so I'm going to attribute the issue to a minor flaw in Glib.
Unless you can find a glib-based solution that I missed, your best alternative is probably to accept that what you're seeing is a glib quirk, and to write a Valgrind suppression file that you can then use to instruct Valgrind not to report on it. Note that although you can write such a file by hand, using the information provided in the leak report, the easiest way to get one is to run Valgrind with the --gen-suppressions=yes option. However you get it, you can instruct valgrind to use it on subsequent runs by using a --suppressions=/path/to/file.supp option on the Valgrind command line.
Do consult the Valgrind manual (linked above) for details on suppression files, including format, how to create and modify them, and how to use them.
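A suppression entry for this particular report might look something like the following (the suppression name is made up, and the frame list is abridged from the stack trace above; running with `--gen-suppressions=yes` will produce the exact frames for your system):

```
{
   glib_test_trap_subprocess_tls
   Memcheck:Leak
   match-leak-kinds: possible
   fun:calloc
   fun:allocate_dtv
   fun:_dl_allocate_tls
   ...
   fun:g_test_trap_subprocess
}
```

The `...` wildcard lets the entry tolerate differing intermediate frames across glibc and GLib versions, which keeps the suppression working after library upgrades.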
Here is a very simple program I wrote to show the differences between the valgrind outputs on Mac (El Capitan) and Linux Mint 17.2.
Is there a way to get the same type of output on the Mac? I don't understand why it shows more heap usage on the Mac than it does on Linux.
For some strange reason, Linux Mint shows the memory being freed, whereas OS X doesn't:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main(int argc, char const *argv[]) {
    char *str = (char *)malloc(15);
    strcpy(str, "Hello World!");
    printf("%s\n", str);
    free(str);
    return 0;
}
Linux Mint 17.2
Mac OSX El Capitan
The C standard library and runtime do stuff before main is called, and printf also does stuff internally. This "stuff" may include memory allocations. It is implementation-specific, so it's no surprise that completely different implementations show different amounts of allocation.
Then, when the program exits, it might not actually be necessary to free any heap allocations: when the process terminates, the heap is gone, poof, removed by the OS (on a desktop operating system, which both of the above are). Application code should free all memory it allocates (for portability, because it lets you actually use tools like valgrind, and because it's "clean"). But for platform- and compiler-specific library code, freeing everything would just slow down the exit of every program, for no gain. So the library not doing it is basically an optimization, one you normally shouldn't imitate in your own program (unless you can actually measure that it makes a difference somewhere).
So tools like valgrind generally contain suppression lists for known un-freed memory blocks. You can also configure your own suppression lists for any libraries which you use, and which don't release all memory on program exit. But when working with suppressions, better be sure you are suppressing safe cases, and not hiding actual memory leaks.
Speculation: Because here the difference in number of allocations is quite big, one might hazard a guess, that Linux implementation uses only static/global variables, where Mac implementation also uses heap allocations. And the actual data stored there might include things like stdin/stdout/stderr buffers. Now this is just a guess, I did not check the source code, but the purpose is to give an idea of what the allocations might be needed for.
You need to learn how to interpret the results, especially on Mac OS X.
Your Mac output says (I wish I didn't have to type this — dammit, screen images are such a pain!):
definitely lost: 0 bytes in 0 blocks
indirectly lost: 0 bytes in 0 blocks
possibly lost: 0 bytes in 0 blocks
still reachable: 0 bytes in 0 blocks
suppressed: 26,091 bytes in 184 blocks
That means what it says: you have no memory leaks. The suppressed stuff is from the Mac C runtime library start-up code. It allocates quite a lot of space (most of 26 KiB on your machine, across 184 separate allocations) and doesn't explicitly free it before the program exits. That's why they're suppressed: they're not a fault of your program, and there's essentially nothing you can do about it. That's the way of life on Mac. FWIW, I just ran a program of mine and got:
==57081==
==57081== HEAP SUMMARY:
==57081== in use at exit: 38,858 bytes in 419 blocks
==57081== total heap usage: 550 allocs, 131 frees, 46,314 bytes allocated
==57081==
==57081== LEAK SUMMARY:
==57081== definitely lost: 0 bytes in 0 blocks
==57081== indirectly lost: 0 bytes in 0 blocks
==57081== possibly lost: 0 bytes in 0 blocks
==57081== still reachable: 0 bytes in 0 blocks
==57081== suppressed: 38,858 bytes in 419 blocks
==57081==
==57081== For counts of detected and suppressed errors, rerun with: -v
==57081== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
I don't know why I have 12 KiB more space and 235 more allocations from the run-time system than you do. However, this is absolutely the normal behaviour on Mac.
If there's a major upgrade to the o/s, the old suppressions may stop being effective, and you may suddenly get a whole lot more 'still reachable' or other memory 'problems'; at that point, you examine the reports carefully and then generate new suppressions. I have a file with 84 suppressions in it that I was using at one point — then I got a new version of Valgrind and they were already in place.
(Update: Problem solved. It all came down to a stupid typo on my part, causing me to write the wrong part of memory, which in turn caused some pointer to point to someplace that was off limits.)
So, I'm taking a course that involves some programming, and we've essentially been thrown in the deep in of the C pool. I've programmed in other languages before, so it's not all new, but I don't have a solid set of tools to debug my code when the proverbial shit hits the fan.
I had, essentially, the following
int nParticles = 32;
int nSteps = 10000;
double u[nParticles], v[nParticles];

for (i = 0; i < nSteps; i++) {
    ...
    for (j = 0; j < nParticles; j++) {
        u[j] = 0.001 * v[j];
    }
    ...
}
as one part of a bigger program, and I was getting segmentation fault. In order to pinpoint the problem, I added a bunch of
printf("foo\n");
and eventually it turned out that I got to step i = 209 and particle j = 31 before the segmentation fault occurred.
After a bit of googling, I realised there's a tool called gdb, and with the extra printfs in there, doing bt in gdb tells me that it's now printf that's segfaulting. Keep in mind, though, that I got segfaults before adding the diagnostic printfs as well.
This doesn't make much sense to me. How do I proceed from here?
Update:
valgrind gives me the following
==18267== Invalid read of size 8
==18267== at 0x400EA6: main (in [path redacted])
==18267== Address 0x7ff001000 is not stack'd, malloc'd or (recently) free'd
==18267==
==18267==
==18267== Process terminating with default action of signal 11 (SIGSEGV)
==18267== Access not within mapped region at address 0x7FF001000
==18267== at 0x400EA6: main (in [path redacted])
==18267== If you believe this happened as a result of a stack
==18267== overflow in your program's main thread (unlikely but
==18267== possible), you can try to increase the size of the
==18267== main thread stack using the --main-stacksize= flag.
==18267== The main thread stack size used in this run was 10485760.
==18267==
==18267== HEAP SUMMARY:
==18267== in use at exit: 1,136 bytes in 2 blocks
==18267== total heap usage: 2 allocs, 0 frees, 1,136 bytes allocated
==18267==
==18267== LEAK SUMMARY:
==18267== definitely lost: 0 bytes in 0 blocks
==18267== indirectly lost: 0 bytes in 0 blocks
==18267== possibly lost: 0 bytes in 0 blocks
==18267== still reachable: 1,136 bytes in 2 blocks
==18267== suppressed: 0 bytes in 0 blocks
==18267== Rerun with --leak-check=full to see details of leaked memory
==18267==
==18267== For counts of detected and suppressed errors, rerun with: -v
==18267== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 6 from 6)
Segmentation fault (core dumped)
I don't know what that means.
Update:
I tried commenting out the array assignment that initially caused the segfault. When I do that, but leave most of the diagnostic printfs in, I get a segfault at i = 207 instead.
Update:
Problem solved. In the outer loop (where i is the counter, representing time steps), I had a couple of inner loops (all of which reused j as a counter, iterating over a bunch of particles). In one of the inner loops (not the one that segfaulted, though), I was accidentally assigning values to E[i], where E is an array of size nParticles, so I was running way out of bounds. Fixing this stops the segfault from happening.
So, it all came down to a silly, silly typo on my part.
Update:
I spoke to my brother, and he explained the problem in a way that at least satisfies my limited understanding of the situation.
By accidentally writing things into E way beyond the end of that array, I probably overwrote the pointers associated with my other arrays, so when I went to access those other arrays, I tried to access memory that wasn't mine, and I got a segfault.
Thank you all so much for helping me out and putting up with my lack of knowledge!
This got too long for a comment, but is not a complete answer. It's impossible to tell why your program dies without seeing a minimal example of the code.
printf normally segfaults for one of three reasons: Either
you are passing it a parameter it can't access (for instance a char * that does not point to correctly allocated memory containing a zero terminated string), or
you've run out of stack space (stack overflow) due to too much recursion, or
you have corrupted the heap by writing beyond a dynamic memory allocation.
My advice would be:
Compile your program with the -Wall option. Fix every warning, and understand why the warning occurs rather than just hiding the problem. Compiling with -Wall is a good habit to get into, and finds a large number of errors.
Download valgrind and run your program in valgrind. This will catch a lot of stuff.
Learn to use gdb. Tutorial here.
Comment or #if out large chunks of your program until you come down to the simplest case that does not error. Add bits back in and find out what the issue is.
These kinds of errors mostly happen when the stack is corrupted.
Please check whether you have somewhere crossed the boundaries of an array (specifically, a local array).
I'm using Valgrind version 3.8.0 on OS X 10.8.1, Mountain Lion. Regarding compatibility with 10.8.1, Valgrind's site says (italics mine):
Valgrind 3.8.0 works on {x86,amd64}-darwin (Mac OS X 10.6 and 10.7, with limited support for 10.8).
I know, then, that there is only "limited support" for 10.8.1. Nonetheless, this bug report says (italics mine):
This (the latest 3.8.0 release) makes Valgrind
compile and able to run small programs on OSX 10.8. Be warned however
that it still asserts with bigger apps, and 32 bit programs are not
checked properly at all (most errors are missed by Memcheck).
Ok, that's fine. So Valgrind should work, if temperamentally, on 10.8.1. So now my question:
I was able to get Valgrind to compile on 10.8.1 with little trouble, but I saw some weird results when I ran it on a couple small C programs. To try and reduce the possible causes of the issue, I eventually wrote the following "program":
int main() {
    return 0;
}
Not very exciting, and little room for bugs, I'd say. Then, I compiled and ran it through Valgrind:
gcc testC.c
valgrind ./a.out
Here's my output:
==45417== Command: ./a.out
==45417==
==45417== WARNING: Support on MacOS 10.8 is experimental and mostly broken.
==45417== WARNING: Expect incorrect results, assertions and crashes.
==45417== WARNING: In particular, Memcheck on 32-bit programs will fail to
==45417== WARNING: detect any errors associated with heap-allocated data.
==45417==
--45417-- ./a.out:
--45417-- dSYM directory is missing; consider using --dsymutil=yes
==45417==
==45417== HEAP SUMMARY:
==45417== in use at exit: 58,576 bytes in 363 blocks
==45417== total heap usage: 514 allocs, 151 frees, 62,442 bytes allocated
==45417==
==45417== LEAK SUMMARY:
==45417== definitely lost: 8,624 bytes in 14 blocks
==45417== indirectly lost: 1,168 bytes in 5 blocks
==45417== possibly lost: 4,925 bytes in 68 blocks
==45417== still reachable: 43,859 bytes in 276 blocks
==45417== suppressed: 0 bytes in 0 blocks
==45417== Rerun with --leak-check=full to see details of leaked memory
==45417==
==45417== For counts of detected and suppressed errors, rerun with: -v
==45417== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
I know that Valgrind is not ready for prime-time on 10.8.1. Nonetheless, I would love to be able to use it here – I only need to use it on small programs, and nothing is mission-critical about the results being spot-on. But clearly, it is reporting a ton of leaks in a program that seems very unlikely to be leaking. Thus:
What should I do to fix this?
Other info:
Adding an intentionally leaked integer does increment the "definitely lost" count by the appropriate 4 bytes.
Similarly, intentionally leaking a call to malloc by not freeing the memory does increment the heap alloc count appropriately.
Compiling with the -g flag and then running through Valgrind (to address the "dSYM directory is missing" error) does make that warning disappear, but does not change the issue of tons of memory leaks being reported.
It tells you right there:
Expect incorrect results, assertions and crashes.
If you still want to run it, print detailed information about spurious leaks (--leak-check=full) and use it to suppress messages about them.
Valgrind trunk seems to have improved to the point where it is usable now. I haven't seen it crash yet, but I do get lots of false positives, which can be dealt with using a suppression file.
Right now, my suppression file looks like this:
# OS X 10.8 isn't supported, has a bunch of 'leaks' in the loader
{
osx_1080_loader_false_positive_1
Memcheck:Leak
...
fun:_ZN11ImageLoader23recursiveInitializationERKNS_11LinkContextEjRNS_21InitializerTimingListE
...
}
{
osx_1080_loader_false_positive_2
Memcheck:Leak
...
fun:_ZN16ImageLoaderMachO16doInitializationERKN11ImageLoader11LinkContextE
...
}
{
osx_1080_loader_false_positive_3
Memcheck:Leak
...
fun:map_images_nolock
...
}
{
osx_1080_loader_false_positive_4
Memcheck:Leak
...
fun:_objc_fetch_pthread_data
fun:_ZL27_fetchInitializingClassLista
fun:_class_initialize
fun:_class_initialize
fun:_class_initialize
fun:_class_initialize
fun:prepareForMethodLookup
fun:lookUpMethod
fun:objc_msgSend
fun:_libxpc_initializer
fun:libSystem_initializer
}
I am also running Valgrind from MacPorts on Mac OS X 10.8. It runs without crashing but does produce some crazy results like the ones in this Stack Overflow post: Confusing output from Valgrind shows indirectly lost memory leaks but no definitely lost or possibly lost.
This question already has answers here:
Still Reachable Leak detected by Valgrind
(5 answers)
Closed 7 years ago.
I made a post earlier asking about checking for memory leaks, etc. I did say I wasn't too familiar with the terminal in Linux, but someone told me it was easy with Valgrind.
I have managed to get it running, but I am not too sure what the output means. Glancing over it, all looks good to me, but I would like to run it past you experienced folk for confirmation if possible. The output is as follows:
^C==2420==
==2420== HEAP SUMMARY:
==2420== in use at exit: 2,240 bytes in 81 blocks
==2420== total heap usage: 82 allocs, 1 frees, 2,592 bytes allocated
==2420==
==2420== LEAK SUMMARY:
==2420== definitely lost: 0 bytes in 0 blocks
==2420== indirectly lost: 0 bytes in 0 blocks
==2420== possibly lost: 0 bytes in 0 blocks
==2420== still reachable: 2,240 bytes in 81 blocks
==2420== suppressed: 0 bytes in 0 blocks
==2420== Reachable blocks (those to which a pointer was found) are not shown.
==2420== To see them, rerun with: --leak-check=full --show-reachable=yes
==2420==
==2420== For counts of detected and suppressed errors, rerun with: -v
==2420== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 13 from 8)
Is all good here? The only thing concerning me is the still reachable part. Is that OK?
I suggest you stop, and read the Valgrind Quick Start, paying particular attention to section 4, "Interpreting Memcheck's output," and look over the FAQ.
Afterward, I think you could benefit from reading How to Ask Questions The Smart Way (aka Smart Questions) to improve your problem solving skills, and improve your asking for assistance in community sites like Stack Overflow, where better questions are rewarded with better answers.
This is not intended to be an insult or personal attack, but a suggestion on how you can ask questions better, so you can get better answers. You will also learn how to answer your own basic questions yourself more often in the process, speeding up your overall efforts. Good luck.
The output you pasted shows:
==2420== total heap usage: 82 allocs, 1 frees, 2,592 bytes allocated
...
==2420== still reachable: 2,240 bytes in 81 blocks
82 allocations and only one free, so in the end there are 81 blocks still 'reachable' on the heap. As the Valgrind FAQ states, this may indicate that the code uses a memory-pool allocator, which does not free memory as soon as it's unused but keeps it around for later use, or it may be an actual memory leak (unlikely, though). Follow the steps in the link to check whether this is due to the STL's use of memory caching.
This may be of use to you:
5.2. With Memcheck's memory leak detector, what's the difference between "definitely lost", "indirectly lost", "possibly lost", "still reachable", and "suppressed"?