(Update: Problem solved. It all came down to a stupid typo on my part, causing me to write to the wrong part of memory, which in turn caused some pointer to point to someplace that was off limits.)
So, I'm taking a course that involves some programming, and we've essentially been thrown in the deep end of the C pool. I've programmed in other languages before, so it's not all new, but I don't have a solid set of tools to debug my code when the proverbial shit hits the fan.
I had, essentially, the following
int nParticles = 32;
int nSteps = 10000;
int i, j;
double u[nParticles], v[nParticles];
for (i = 0; i < nSteps; i++) {
    ...
    for (j = 0; j < nParticles; j++) {
        u[j] = 0.001 * v[j];
    }
    ...
}
as one part of a bigger program, and I was getting a segmentation fault. In order to pinpoint the problem, I added a bunch of
printf("foo\n");
and eventually it turned out that I got to step i = 209, and particle j = 31, before the segmentation fault occurred.
After a bit of googling, I realised there's a tool called gdb, and with the extra printfs in there, doing bt in gdb tells me that now it's printf that's segfaulting. Keep in mind, though, I got segfaults before adding the diagnostic printfs as well.
This doesn't make much sense to me. How do I proceed from here?
Update:
valgrind gives me the following
==18267== Invalid read of size 8
==18267== at 0x400EA6: main (in [path redacted])
==18267== Address 0x7ff001000 is not stack'd, malloc'd or (recently) free'd
==18267==
==18267==
==18267== Process terminating with default action of signal 11 (SIGSEGV)
==18267== Access not within mapped region at address 0x7FF001000
==18267== at 0x400EA6: main (in [path redacted])
==18267== If you believe this happened as a result of a stack
==18267== overflow in your program's main thread (unlikely but
==18267== possible), you can try to increase the size of the
==18267== main thread stack using the --main-stacksize= flag.
==18267== The main thread stack size used in this run was 10485760.
==18267==
==18267== HEAP SUMMARY:
==18267== in use at exit: 1,136 bytes in 2 blocks
==18267== total heap usage: 2 allocs, 0 frees, 1,136 bytes allocated
==18267==
==18267== LEAK SUMMARY:
==18267== definitely lost: 0 bytes in 0 blocks
==18267== indirectly lost: 0 bytes in 0 blocks
==18267== possibly lost: 0 bytes in 0 blocks
==18267== still reachable: 1,136 bytes in 2 blocks
==18267== suppressed: 0 bytes in 0 blocks
==18267== Rerun with --leak-check=full to see details of leaked memory
==18267==
==18267== For counts of detected and suppressed errors, rerun with: -v
==18267== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 6 from 6)
Segmentation fault (core dumped)
I don't know what that means.
Update:
I tried commenting out the array assignment that initially caused the segfault. When I do that, but leave most of the diagnostic printfs in, I get a segfault at i = 207 instead.
Update:
Problem solved. In the outer loop (where i is the counter, representing time steps), I had a couple of inner loops (all of which reused j as a counter, iterating over a bunch of particles). In one of the inner loops (not the one that segfaulted, though), I was accidentally assigning values to E[i], where E is an array of size nParticles, so I was running way out of bounds. Fixing this stops the segfault from happening.
So, it all came down to a silly silly typo on my part.
Update:
I spoke to my brother, and he explained the problem in a way that at least satisfies my limited understanding of the situation.
By accidentally writing things in E way beyond the end of that array, I probably overwrote the pointers associated with my other arrays, and then when I go to access those other arrays, I try to access memory that's not mine, and I get a segfault.
Thank you all so much for helping me out and putting up with my lack of knowledge!
This got too long for a comment, but is not a complete answer. It's impossible to tell why your program dies without seeing a minimal example of the code.
printf normally segfaults for one of three reasons: Either
you are passing it a parameter it can't access (for instance a char * that does not point to correctly allocated memory containing a zero terminated string), or
you've run out of stack space (stack overflow) due to too much recursion, or
you have corrupted the heap by writing beyond a dynamic memory allocation.
My advice would be:
Compile your program with the -Wall option. Fix every warning, and understand why the warning occurs rather than just hiding the problem. Compiling with -Wall is a good habit to get into, and finds a large number of errors.
Download valgrind and run your program in valgrind. This will catch a lot of stuff.
Learn to use gdb. Tutorial here.
Comment or #if out large chunks of your program until you come down to the simplest case that does not error. Add bits back in and find out what the issue is.
Errors like these mostly happen when the stack is corrupted.
Please check whether you have crossed the boundaries of an array somewhere (specifically local arrays).
Here is a very simple program I wrote to show the differences between the valgrind outputs on Mac (El Capitan) and Linux Mint 17.2.
Is there a way to get the same type of output on the Mac? I don't understand why it shows more heap usage on the Mac than it does on Linux.
For some strange reason, Linux Mint shows the memory being freed, whereas OS X doesn't.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(int argc, char const *argv[]) {
char *str = (char *)malloc(15);
strcpy(str, "Hello World!");
printf("%s\n", str);
free(str);
return 0;
}
Linux Mint 17.2
Mac OSX El Capitan
The C standard library and runtime do stuff before main is called, and printf also does stuff internally. This "stuff" may include memory allocations. This is implementation-specific, so it's no surprise that completely different implementations show different amount of allocations.
Then, when the program exits, it might not actually be necessary to free any heap allocations, because when the process terminates, the heap will be gone, poof, removed by the OS (in a desktop operating system like both of the above). Application code should free all memory it allocates (for portability, because then you can actually use tools like valgrind, and because it's "clean"). But for platform- and compiler-specific library code, freeing everything would just slow down the exit of every program, for no gain. So the library not doing it is basically an optimization, one you normally shouldn't imitate in your own program (unless you can actually measure that it makes a difference somewhere).
So tools like valgrind generally contain suppression lists for known un-freed memory blocks. You can also configure your own suppression lists for any libraries which you use, and which don't release all memory on program exit. But when working with suppressions, better be sure you are suppressing safe cases, and not hiding actual memory leaks.
Speculation: Because here the difference in number of allocations is quite big, one might hazard a guess, that Linux implementation uses only static/global variables, where Mac implementation also uses heap allocations. And the actual data stored there might include things like stdin/stdout/stderr buffers. Now this is just a guess, I did not check the source code, but the purpose is to give an idea of what the allocations might be needed for.
You need to learn how to interpret the results, especially on Mac OS X.
Your Mac output says (I wish I didn't have to type this — dammit, screen images are such a pain!):
definitely lost: 0 bytes in 0 blocks
indirectly lost: 0 bytes in 0 blocks
possibly lost: 0 bytes in 0 blocks
still reachable: 0 bytes in 0 blocks
suppressed: 26,091 bytes in 184 blocks
That means what it says — you have no memory leaks. The suppressed stuff is from the Mac C runtime library start-up code. It allocates quite a lot of space (most of 26 KiB on your machine, with 184 separate allocations) and doesn't explicitly free it before the program exits. That's why they're suppressed — they're not a fault of your program, and there's essentially nothing you can do about it. That's the way of life on Mac. FWIW, I just ran a program of mine and got:
==57081==
==57081== HEAP SUMMARY:
==57081== in use at exit: 38,858 bytes in 419 blocks
==57081== total heap usage: 550 allocs, 131 frees, 46,314 bytes allocated
==57081==
==57081== LEAK SUMMARY:
==57081== definitely lost: 0 bytes in 0 blocks
==57081== indirectly lost: 0 bytes in 0 blocks
==57081== possibly lost: 0 bytes in 0 blocks
==57081== still reachable: 0 bytes in 0 blocks
==57081== suppressed: 38,858 bytes in 419 blocks
==57081==
==57081== For counts of detected and suppressed errors, rerun with: -v
==57081== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
I don't know why I have 12 KiB more space and 235 more allocations from the run-time system than you do. However, this is absolutely the normal behaviour on Mac.
If there's a major upgrade to the o/s, the old suppressions may stop being effective, and you may suddenly get a whole lot more 'still reachable' or other memory 'problems'; at that point, you examine the reports carefully and then generate new suppressions. I have a file with 84 suppressions in it that I was using at one point — then I got a new version of Valgrind and they were already in place.
I'm using Valgrind version 3.8.0 on OS X 10.8.1, Mountain Lion. Regarding compatibility with 10.8.1, Valgrind's site says (italics mine):
Valgrind 3.8.0 works on {x86,amd64}-darwin (Mac OS X 10.6 and 10.7, with limited support for 10.8).
I know, then, that there is only "limited support" for 10.8.1. Nonetheless, this bug report says (italics mine):
This (the latest 3.8.0 release) makes Valgrind
compile and able to run small programs on OSX 10.8. Be warned however
that it still asserts with bigger apps, and 32 bit programs are not
checked properly at all (most errors are missed by Memcheck).
Ok, that's fine. So Valgrind should work, if temperamentally, on 10.8.1. So now my question:
I was able to get Valgrind to compile on 10.8.1 with little trouble, but I saw some weird results when I ran it on a couple small C programs. To try and reduce the possible causes of the issue, I eventually wrote the following "program":
int main () {
return 0;
}
Not very exciting, and little room for bugs, I'd say. Then, I compiled and ran it through Valgrind:
gcc testC.c
valgrind ./a.out
Here's my output:
==45417== Command: ./a.out
==45417==
==45417== WARNING: Support on MacOS 10.8 is experimental and mostly broken.
==45417== WARNING: Expect incorrect results, assertions and crashes.
==45417== WARNING: In particular, Memcheck on 32-bit programs will fail to
==45417== WARNING: detect any errors associated with heap-allocated data.
==45417==
--45417-- ./a.out:
--45417-- dSYM directory is missing; consider using --dsymutil=yes
==45417==
==45417== HEAP SUMMARY:
==45417== in use at exit: 58,576 bytes in 363 blocks
==45417== total heap usage: 514 allocs, 151 frees, 62,442 bytes allocated
==45417==
==45417== LEAK SUMMARY:
==45417== definitely lost: 8,624 bytes in 14 blocks
==45417== indirectly lost: 1,168 bytes in 5 blocks
==45417== possibly lost: 4,925 bytes in 68 blocks
==45417== still reachable: 43,859 bytes in 276 blocks
==45417== suppressed: 0 bytes in 0 blocks
==45417== Rerun with --leak-check=full to see details of leaked memory
==45417==
==45417== For counts of detected and suppressed errors, rerun with: -v
==45417== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
I know that Valgrind is not ready for prime-time on 10.8.1. Nonetheless, I would love to be able to use it here – I only need to use it on small programs, and nothing is mission-critical about the results being spot-on. But clearly, it is reporting a ton of leaks in a program that seems very unlikely to be leaking. Thus:
What should I do to fix this?
Other info:
Adding an intentionally leaked integer does increment the "definitely lost" count by the appropriate 4 bytes.
Similarly, intentionally leaking a call to malloc by not freeing the memory does increment the heap alloc count appropriately.
Compiling with the -g flag and then running through Valgrind (to address the "dSYM directory is missing" error) does make that error disappear, but does not change the issue of tons of memory leaks being reported.
It tells you right there:
Expect incorrect results, assertions and crashes.
If you still want to run it, print detailed information about spurious leaks (--leak-check=full) and use it to suppress messages about them.
Valgrind trunk seems to have improved to the point where it is usable now. I haven't seen it crash yet, but I do get lots of false positives, which can be dealt with using a suppression file.
Right now, my suppression file looks like this:
# OS X 10.8 isn't supported, has a bunch of 'leaks' in the loader
{
osx_1080_loader_false_positive_1
Memcheck:Leak
...
fun:_ZN11ImageLoader23recursiveInitializationERKNS_11LinkContextEjRNS_21InitializerTimingListE
...
}
{
osx_1080_loader_false_positive_2
Memcheck:Leak
...
fun:_ZN16ImageLoaderMachO16doInitializationERKN11ImageLoader11LinkContextE
...
}
{
osx_1080_loader_false_positive_3
Memcheck:Leak
...
fun:map_images_nolock
...
}
{
osx_1080_loader_false_positive_4
Memcheck:Leak
...
fun:_objc_fetch_pthread_data
fun:_ZL27_fetchInitializingClassLista
fun:_class_initialize
fun:_class_initialize
fun:_class_initialize
fun:_class_initialize
fun:prepareForMethodLookup
fun:lookUpMethod
fun:objc_msgSend
fun:_libxpc_initializer
fun:libSystem_initializer
}
I am also running valgrind from MacPorts on Mac OS X 10.8. It runs without crashing but does produce some crazy results, like the ones in this Stack Overflow post: Confusing output from Valgrind shows indirectly lost memory leaks but no definitely lost or possibly lost.
I'm writing a compiler that produces C code. The programs produced consist only of the main function, and they use a lot of memory, that is allocated with malloc(). Most of the memory allocated is used only in a small part of the program, and I thought it would be a good idea to free() it after use, since it's not going to be used again. I would be glad, then, if valgrind would report to me about memory not free()d in the end of the program, that is, still reachable memory. I'm using valgrind with --error-exitcode=1 inside a Makefile, to check for this kind of problem automatically.
The question is: is there a way to make valgrind exit with 1 in case there are still reachable allocs?
An alternative to grepping through Valgrind output: modify your compiler so it emits:
int foo_main() { /* whatever you've emitted before */ }
int main() { return foo_main(); }
Assuming you are not assigning allocated blocks to global variables (which would make no sense since you only have one function), you've just transformed "still reachable" into "definitely leaked".
Possibly even better transformation: don't call exit(0) in your main; change it to return 0; instead. The net effect should be the same as above -- __libc_start_main will now call exit for you, and all local variables in main will be out of scope by that time.
The valgrind manual says:
Indirectly lost and still reachable
blocks are not counted as true
"errors", even if --show-reachable=yes
is specified and they are printed;
this is because such blocks don't need
direct fixing by the programmer.
I have found no way to make valgrind report "still reachable"s as error. It seems to be that your only option to do this (other than patching valgrind) is to capture the output of valgrind and parse the "still reachable" line.
The proper options to use to exit with an error when there is a reachable block at exit:
valgrind --tool=memcheck --error-exitcode=1 --leak-check=full --show-reachable=yes --errors-for-leak-kinds=all
From Valgrind manual:
Because there are different kinds of leaks with different severities, an interesting question is: which leaks should be counted as true "errors" and which should not?
The answer to this question affects the numbers printed in the ERROR SUMMARY line, and also the effect of the --error-exitcode option. First, a leak is only counted as a true "error" if --leak-check=full is specified. Then, the option --errors-for-leak-kinds= controls the set of leak kinds to consider as errors. The default value is --errors-for-leak-kinds=definite,possible
Alternatively you can have a small shell script in your makefile to grep through output logs of valgrind and exit accordingly.
Hello, I've got a problem using valgrind.
When I run it with valgrind --leak-check=full followed by the name of the executable, it tells me in which blocks the memory leak is, but I can't work out which pointer I forgot to free.
Is there some sort of flag that tells me the name of the pointer?
If there is any way to find where the leak is in Visual Studio, I would very much like to hear about that too.
It can't tell you the name of the pointer, because the whole idea of a memory leak is that no pointer points at the memory any more (at least, for the kinds of leaks that Valgrind describes as "definitely lost").
What it can tell you is the source file and line number where the memory was allocated - you then will need to look up that line in your source to figure out where the memory is supposed to be deallocated. For example, if the Valgrind loss record looks like:
==17110== 49 bytes in 1 blocks are definitely lost in loss record 17 of 35
==17110== at 0x4023D6E: malloc (vg_replace_malloc.c:207)
==17110== by 0x80C4CF8: do_foo (foo.c:1161)
==17110== by 0x80AE325: xyzzy (bar.c:466)
==17110== by 0x8097C46: io (bar.c:950)
==17110== by 0x8098163: main (quux.c:1291)
Then you need to look at line 1161 in foo.c, which is within the function do_foo(). That's where the memory was allocated (with malloc()), and only you can say where it should have been freed.
You didn't say which compiler you are using, I suppose gcc?
Do you use -g to have debugging symbols included?
Valgrind gives me the following leak summary on my code. However, I have freed all malloc'ed memory. Is this a bad thing, or is this normal? My program is in C.
==3513== LEAK SUMMARY:
==3513== definitely lost: 0 bytes in 0 blocks.
==3513== possibly lost: 0 bytes in 0 blocks.
==3513== still reachable: 568 bytes in 1 blocks.
==3513== suppressed: 0 bytes in 0 blocks.
The valgrind message still reachable: 568 bytes in 1 blocks. means that there was memory allocated in your application that was never freed but is still "reachable", which means that you still have a pointer to it somewhere. At shutdown, this probably means a global variable of some kind. However, since the number of bytes "definitely lost" or "possibly lost" is zero, this condition is completely benign. Don't worry about it.
Still reachable memory means that it is being pointed to by a global or static pointer. What you want to do is run valgrind with --show-reachable=yes to see whether it's a problem.
Often times it's harmless and comes from a function like this:
void foo()
{
static char *buffer = 0;
if (buffer == 0)
{
buffer = (char *)malloc(...);
}
}
That malloc will still be reachable. But no matter how many times foo is called, you allocate the buffer exactly once so there is no harm here in not freeing it.
But consider a function like this:
void foo()
{
static node_t *head = 0;
node_t *node = (node_t *)malloc(sizeof(node_t));
if (node)
{
node->next = head;
head = node;
}
...
}
Every time this function is called, another node will be allocated. While you may only leak a few nodes for your test runs, in a production run, you could leak enough that you run out of memory.
One way to tell the difference is to see whether different runs always leak the same amount of memory, or whether you leak more memory on test runs with larger inputs.
But again, if you want to be safe, use --show-reachable=yes and see what's happening.
These are not leaked and nothing to be concerned about. The memory was probably allocated by the C library. If you really want to know where they were allocated run with --leak-check=full --show-reachable=yes.
If you are sure that you "have freed all malloc'ed memory", then, no there's nothing wrong. You are not directly responsible for memory leaks in components from other parties even though you may often have to work around them.
The reports from valgrind don't really give enough information for us to help you out.
I've seen memory checking tools come up with false positives many times but I don't have any direct experience with valgrind itself.
Does it give you the address of the block? sometimes you can learn a lot by looking at what sort of data is in those 568 bytes.
Hmm, 568 bytes, that's about the size of a MAX_PATH unicode string.
It would be a good idea to zero out pointers that have been free()'d; then incorrectly trying to dereference one again causes an immediate crash rather than silently reading freed memory.
Personally, I keep forgetting and running something like valgrind make test, which always adds at least a couple of additional still-reachable bytes... Make sure your application is being run directly by valgrind. :)