Valgrind reports memory leaks of SDL2 - c

This is my first time working with Valgrind, and I am wondering whether these errors are something serious that I should worry about or whether I can just ignore them. My program is just a simple SDL2 2D space game, and I have no clue where these memory leaks could come from.
==9173== Conditional jump or move depends on uninitialised value(s)
==9173== at 0xA0E1343: ??? (in /usr/lib/x86_64-linux-gnu/libLLVM-10.so.1)
==9173== by 0xA0215E7: llvm::MachineFunctionPass::runOnFunction(llvm::Function&) (in /usr/lib/x86_64-linux-gnu/libLLVM-10.so.1)
==9173== by 0x9E8BD75: llvm::FPPassManager::runOnFunction(llvm::Function&) (in /usr/lib/x86_64-linux-gnu/libLLVM-10.so.1)
==9173== by 0x9E8BFF2: llvm::FPPassManager::runOnModule(llvm::Module&) (in /usr/lib/x86_64-linux-gnu/libLLVM-10.so.1)
==9173== by 0x9E8C49F: llvm::legacy::PassManagerImpl::run(llvm::Module&) (in /usr/lib/x86_64-linux-gnu/libLLVM-10.so.1)
==9173== by 0xAFD7B34: llvm::MCJIT::emitObject(llvm::Module*) (in /usr/lib/x86_64-linux-gnu/libLLVM-10.so.1)
==9173== by 0xAFD7F1D: llvm::MCJIT::generateCodeForModule(llvm::Module*) (in /usr/lib/x86_64-linux-gnu/libLLVM-10.so.1)
==9173== by 0xAFD86AD: llvm::MCJIT::finalizeObject() (in /usr/lib/x86_64-linux-gnu/libLLVM-10.so.1)
==9173== by 0xAF9C87F: LLVMGetPointerToGlobal (in /usr/lib/x86_64-linux-gnu/libLLVM-10.so.1)
==9173== by 0x84B0041: ??? (in /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so)
==9173== by 0x84A49EF: ??? (in /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so)
==9173== by 0x8490937: ??? (in /usr/lib/x86_64-linux-gnu/dri/swrast_dri.so)
And here it mentions some memory leaks. But I have checked my code for leaks so many times that I think they must be in the SDL library.
17 bytes in 1 blocks are definitely lost in loss record 10 of 1,977
==9173== at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==9173== by 0x4EC85A6: _XlcDefaultMapModifiers (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==9173== by 0x4EC897A: XSetLocaleModifiers (in /usr/lib/x86_64-linux-gnu/libX11.so.6.3.0)
==9173== by 0x4923824: ??? (in /home/coder/Desktop/game/libSDL2-2.0.so.0)
==9173== by 0x492A45A: ??? (in /home/coder/Desktop/game/libSDL2-2.0.so.0)
==9173== by 0x48FCF6A: ??? (in /home/coder/Desktop/game/libSDL2-2.0.so.0)
==9173== by 0x486C8E6: ??? (in /home/coder/Desktop/game/libSDL2-2.0.so.0)
==9173== by 0x10972A: main (projekt.c:115)
==9173==
==9173== 112 (56 direct, 56 indirect) bytes in 1 blocks are definitely lost in loss record 1,922 of 1,977
==9173== at 0x483DD99: calloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==9173== by 0x880D07E: ???
==9173== by 0x8488C3B: ???
==9173== by 0x84737A5: ???
==9173== by 0x847386C: ???
==9173== by 0x8474479: ???
==9173== by 0x8437A33: ???
==9173== by 0x843A35C: ???
==9173== by 0x84352BC: ???
==9173== by 0x83F8357: ???
==9173== by 0x841C33D: ???
==9173== by 0x8419C76: ???
==9173==
==9173== LEAK SUMMARY:
==9173== definitely lost: 73 bytes in 2 blocks
==9173== indirectly lost: 56 bytes in 1 blocks
==9173== possibly lost: 0 bytes in 0 blocks
==9173== still reachable: 330,333 bytes in 2,678 blocks
==9173== suppressed: 0 bytes in 0 blocks
Could someone explain to me what those errors mean?

Graphics libraries like this will almost always show some leaks under Valgrind, unfortunately. You can check this post for further details, or find more answers about SDL, OpenGL, or whichever graphics library you use, but long story short, it will almost always happen.
The leaks you should focus on are the ones that trace back to code you wrote yourself.
I recommend running valgrind --leak-check=full --show-reachable=yes instead of plain valgrind; it will report the leaks in more detail.

Related

PulseAudio-related leaks in SDL2 program under Valgrind's Memcheck?

I'm currently on Kubuntu, writing a program with SDL 2.
My goal is to do ray casting.
As far as I can tell there is no problem in my code: gdb reports nothing and the program exits normally, but valgrind reports one error:
==1894== Memcheck, a memory error detector
==1894== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==1894== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==1894== Command: ./ray
==1894==
==1894== Conditional jump or move depends on uninitialised value(s)
==1894== at 0x50B8565: pa_shm_cleanup (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-13.99.so)
==1894== by 0x50B87A1: pa_shm_create_rw (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-13.99.so)
==1894== by 0x50A84B6: pa_mempool_new (in /usr/lib/x86_64-linux-gnu/pulseaudio/libpulsecommon-13.99.so)
==1894== by 0x4E149B1: pa_context_new_with_proplist (in /usr/lib/x86_64-linux-gnu/libpulse.so.0.21.2)
==1894== by 0x493ED5E: ??? (in /usr/lib/x86_64-linux-gnu/libSDL2-2.0.so.0.10.0)
==1894== by 0x493F65A: ??? (in /usr/lib/x86_64-linux-gnu/libSDL2-2.0.so.0.10.0)
==1894== by 0x4891D9B: ??? (in /usr/lib/x86_64-linux-gnu/libSDL2-2.0.so.0.10.0)
==1894== by 0x488D906: ??? (in /usr/lib/x86_64-linux-gnu/libSDL2-2.0.so.0.10.0)
==1894== by 0x10941D: main (main.c:9)
==1894==
==1894==
==1894== HEAP SUMMARY:
==1894== in use at exit: 349,601 bytes in 2,981 blocks
==1894== total heap usage: 220,203 allocs, 217,222 frees, 32,111,232 bytes allocated
==1894==
==1894== LEAK SUMMARY:
==1894== definitely lost: 377 bytes in 3 blocks
==1894== indirectly lost: 3,071 bytes in 24 blocks
==1894== possibly lost: 0 bytes in 0 blocks
==1894== still reachable: 346,153 bytes in 2,954 blocks
==1894== suppressed: 0 bytes in 0 blocks
==1894== Rerun with --leak-check=full to see details of leaked memory
==1894==
==1894== Use --track-origins=yes to see where uninitialised values come from
==1894== For lists of detected and suppressed errors, rerun with: -s
==1894== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 2 from 2)
If I understand correctly, my code is fine but there is a problem in a PulseAudio library?
To test this, I wrote just SDL_Init(SDL_INIT_EVERYTHING) and SDL_Quit() in the main function, and valgrind reported the same thing. So it is caused by SDL together with a PulseAudio library.
Can someone help me track down and remove that error?
The problem is likely in the SDL2 or PulseAudio library. Having the exact test code and the compilation command would confirm that you are not doing something wrong, but it is unlikely to be a bug of yours and I would ignore it.
Valgrind supports suppression files to silence these annoyances. How do you tell Valgrind to completely suppress a particular .so file? might help you.
Also avoid using SDL_INIT_EVERYTHING; you likely only need SDL_INIT_VIDEO or SDL_INIT_VIDEO|SDL_INIT_TIMER depending on what you do. Check out the SDL_Init() documentation.

Valgrind - understanding output for conditional jump

I've been testing a C shared library for memory leaks. I got the output below, and I'd like to make sure my understanding of the output is correct.
I'm fairly well-acquainted with valgrind, but I'm used to the output having just one line below the "heap allocation" section, so I'd like to make sure I get this right. I tried to find more info in the valgrind manuals and here and other forums, but couldn't find anything.
Anyway, I ran with these parameters:
valgrind --leak-check=full --log-fd=1 --keep-debuginfo=yes --track-origins=yes
And got this output:
==28303== Conditional jump or move depends on uninitialised value(s)
==28303== at 0x82C80C0: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x830AA05: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x8301896: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0xB4E7E9A: func1 (in /opt/MyDir/MyExit.so_r)
==28303== by 0xB4E9FFD: func2 (in /opt/MyDir/MyExit.so_r)
==28303== by 0xB4EC62E: func3 (in /opt/MyDir/MyExit.so_r)
==28303== by 0xB4ECB7C: func4 (in /opt/MyDir/MyExit.so_r)
==28303== by 0x83AD1B7: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x831DA90: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x8301A1C: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x838DBCA: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x606E8A: ??? (in /opt/mqm/bin/dmpmqmsg)
==28303== Uninitialised value was created by a heap allocation
==28303== at 0x6C29F73: malloc (vg_replace_malloc.c:309)
==28303== by 0x8200401: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x848331B: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x839DC40: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x83167C4: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x83197D1: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x830092D: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x83833B4: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x838EE0C: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x60316B: ??? (in /opt/mqm/bin/dmpmqmsg)
==28303== by 0x7680554: (below main) (in /usr/lib64/libc-2.17.so)
(changed func names for security reasons)
Here's my understanding:
Memory was first allocated in "libc-2.17.so" (or was it in "dmpmqmsg"?), and was not initialized.
After that, the call sequence was as follows:
libc-2.17.so >> dmpmqmsg >> libmqe_r.so >> MyExit.so_r (func4) >> ... >> MyExit.so_r (func1) >> libmqe_r.so
Finally the "conditional jump" which valgrind notified about was in libmqe_r.so at 0x82C80C0, probably an if(pmem != NULL) somewhere in this library
Is this the right interpretation?
In case it's relevant - I'm running on RedHat linux. My code is an MQ Exit which I compiled as a shared library.
First, looking at the "heap allocation" callstack. At the top is malloc (which has been intercepted by memcheck). Then you have a series of 8 calls in libmqe_r.so, all without debug info. Then there's the call from the guest executable, dmpmqmsg. The last line in the callstack, in libc, is the startup function that calls main.
Next, the actual error. Without debug info, it will be difficult to be certain. It looks like your func1 is calling a chain of 3 functions in libmqe_r.so, and passing in some uninitialized heap memory. It's also possible that your code is innocent and the topmost function is accessing some uninitialized static or global object.
I have almost no experience with MQ. There may be a package related to MQ that contains debugging symbols. Installing that would probably help.

Valgrind reports errors for a very simple C program

I'm learning the C language from Learn C The Hard Way. I'm on exercise 6, and while I can make it work, valgrind reports a lot of errors.
Here's the stripped down minimal program from a file ex6.c:
#include <stdio.h>

int main(int argc, char *argv[])
{
    char initial = 'A';
    float power = 2.345f;

    printf("Character is %c.\n", initial);
    printf("You have %f levels of power.\n", power);
    return 0;
}
Content of Makefile is just CFLAGS=-Wall -g.
I compile the program with $ make ex6 (there are no compiler warnings or errors). Executing with $ ./ex6 produces the expected output.
When I run the program with $ valgrind ./ex6 I get errors which I can't solve. Here's the full output:
==69691== Memcheck, a memory error detector
==69691== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==69691== Using Valgrind-3.11.0.SVN and LibVEX; rerun with -h for copyright info
==69691== Command: ./ex6
==69691==
--69691-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option
--69691-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated 2 times)
--69691-- UNKNOWN mach_msg unhandled MACH_SEND_TRAILER option (repeated 4 times)
==69691== Conditional jump or move depends on uninitialised value(s)
==69691== at 0x1003FBC3F: _platform_memchr$VARIANT$Haswell (in /usr/lib/system/libsystem_platform.dylib)
==69691== by 0x1001EFBB6: __sfvwrite (in /usr/lib/system/libsystem_c.dylib)
==69691== by 0x1001FA005: __vfprintf (in /usr/lib/system/libsystem_c.dylib)
==69691== by 0x10021F9CE: __v2printf (in /usr/lib/system/libsystem_c.dylib)
==69691== by 0x10021FCA0: __xvprintf (in /usr/lib/system/libsystem_c.dylib)
==69691== by 0x1001F5B91: vfprintf_l (in /usr/lib/system/libsystem_c.dylib)
==69691== by 0x1001F39F7: printf (in /usr/lib/system/libsystem_c.dylib)
==69691== by 0x100000F1B: main (ex6.c:8)
==69691==
Character is A.
==69691== Invalid read of size 32
==69691== at 0x1003FBC1D: _platform_memchr$VARIANT$Haswell (in /usr/lib/system/libsystem_platform.dylib)
==69691== by 0x1001EFBB6: __sfvwrite (in /usr/lib/system/libsystem_c.dylib)
==69691== by 0x1001FA005: __vfprintf (in /usr/lib/system/libsystem_c.dylib)
==69691== by 0x10021F9CE: __v2printf (in /usr/lib/system/libsystem_c.dylib)
==69691== by 0x10021FCA0: __xvprintf (in /usr/lib/system/libsystem_c.dylib)
==69691== by 0x1001F5B91: vfprintf_l (in /usr/lib/system/libsystem_c.dylib)
==69691== by 0x1001F39F7: printf (in /usr/lib/system/libsystem_c.dylib)
==69691== by 0x100000F31: main (ex6.c:9)
==69691== Address 0x100809680 is 32 bytes before a block of size 32 in arena "client"
==69691==
You have 2.345000 levels of power.
==69691==
==69691== HEAP SUMMARY:
==69691== in use at exit: 39,365 bytes in 429 blocks
==69691== total heap usage: 510 allocs, 81 frees, 45,509 bytes allocated
==69691==
==69691== LEAK SUMMARY:
==69691== definitely lost: 16 bytes in 1 blocks
==69691== indirectly lost: 0 bytes in 0 blocks
==69691== possibly lost: 13,090 bytes in 117 blocks
==69691== still reachable: 26,259 bytes in 311 blocks
==69691== suppressed: 0 bytes in 0 blocks
==69691== Rerun with --leak-check=full to see details of leaked memory
==69691==
==69691== For counts of detected and suppressed errors, rerun with: -v
==69691== Use --track-origins=yes to see where uninitialised values come from
==69691== ERROR SUMMARY: 5 errors from 2 contexts (suppressed: 0 from 0)
I'm on OS X Yosemite. Valgrind is installed via brew with this command: $ brew install valgrind --HEAD.
So, does anyone know what's the issue here? How do I fix the valgrind errors?
If the programme you are running through Valgrind is exactly the one you posted in your question, it clearly doesn't have any memory leaks. In fact, you don't even use malloc/free yourself!
It looks to me like these are spurious errors / false positives that Valgrind detects on OS X (only!), similar to what happened to me some time ago.
If you have access to a different operating system, e.g. a Linux machine, try to analyze the programme using Valgrind on that system.
EDIT: I haven't tried this myself, since I don't have access to a Mac right now, but you should try what M Oehm suggested: use a suppressions file as mentioned in this other SO question.
This issue is fixed for Darwin 14.3.0 (Mac OS X 10.10.2) using Valgrind r14960 with VEX r3124 for Xcode6.2 and Valgrind r15088 for Xcode 6.3.
If you are using MacPorts (at the time of writing), sudo port install valgrind-devel will give you Valgrind r14960 with VEX r3093.
Here's my build script to install Valgrind r14960 with VEX r3124:
#! /usr/bin/env bash
mkdir -p buildvalgrind
cd buildvalgrind
svn co svn://svn.valgrind.org/valgrind/trunk/@14960 valgrind
cd valgrind
./autogen.sh
./configure --prefix=/usr/local
make && sudo make install
# check that we have our valgrind installed
/usr/local/bin/valgrind --version
(reference: http://calvinx.com/2015/04/10/valgrind-on-mac-os-x-10-10-yosemite/)
My macports-installed valgrind is located at /opt/local/bin/valgrind.
If I now run
/opt/local/bin/valgrind --leak-check=yes --suppressions=`pwd`/objc.supp ./ex6
I will get exactly the same errors you described above. (Using my objc.supp file here https://gist.github.com/calvinchengx/0b1d45f67be9fdca205b)
But if I run
/usr/local/bin/valgrind --leak-check=yes --suppressions=`pwd`/objc.supp ./ex6
Everything works as expected and I do not get the system level memory leak errors showing up.
Judging from this topic, I assume that valgrind is not guaranteed to give correct results on your platform. If you can, try this code on another platform.
The culprit is either in valgrind itself or in your system's implementation of printf, both of which would be impractical for you to fix.
Rerun with --leak-check=full to see details of leaked memory. This should give you some more information about the leak you are experiencing. If nothing helps, you can create a suppression file to stop the errors from being displayed.

Valgrind indentation (and below main)

Here is the valgrind output from a project:
==2433== Invalid free() / delete / delete[] / realloc()
==2433== at 0x402B06C: free (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==2433== by 0x43F345B: av_freep (mem.c:172)
==2433== by 0x5A6F4D2: (below main) (libc-start.c:226)
==2433== Address 0xb3fd830 is 48 bytes inside a block of size 111,634 alloc'd
==2433== at 0x402BE68: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==2433== by 0x80BB6B5: _talloc_realloc (talloc.c:997)
The line starting with Address is indented by one space more compared to the line starting with Invalid. Does that mean one leads onto another? Or are they separate?
If they are separate, where does the by 0x5A6F4D2: (below main) (libc-start.c:226) come from? I get the feeling (below main) has something to do with it, but I can't find libc-start.c anywhere on my hard drive.
Yes, it is providing you with additional details on the invalid free. The first four lines describe the invalid call (free in this case) and the call stack at the time of the free. The following three lines provide additional data. In this case, valgrind recognizes that the address passed to free is contained within an allocated region, and it provides the offset, size of the block, and call stack of that allocation.
According to valgrind.org, the hierarchy should be flat, as shown below:
==3016== Invalid write of size 1
==3016== at 0x80484DA: main (in /jfs/article/sample2)
==3016== by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==3016== by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample2)
==3016== Address 0x40CA0224 is 0 bytes after a block of size 512 alloc'd
==3016== at 0x400483E4: malloc (vg_clientfuncs.c:100)
==3016== by 0x80484AA: main (in /jfs/article/sample2)
==3016== by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==3016== by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample2)
I would treat the indentation of Address in your output as the above, as it may be a version-specific change to make the output more readable.

Tips on debugging segmentation faults when no leaks are found

I wrote a C-based application that appears to run fine, except on very large datasets as input.
With large input, I get a segmentation fault at the end steps of the binary's functionality.
I ran the binary (with the test input) with valgrind:
valgrind --tool=memcheck --leak-check=yes /foo/bar/baz inputDataset > outputAnalysis
This job normally takes a few hours, but with valgrind it took seven days.
Unfortunately, at this point, I don't know how to read the results I am getting from this run.
I get a lot of these warnings:
...
==4074== Conditional jump or move depends on uninitialised value(s)
==4074== at 0x435900: ??? (in /foo/bar/baz)
==4074== by 0x439CC5: ??? (in /foo/bar/baz)
==4074== by 0x400BF2: ??? (in /foo/bar/baz)
==4074== by 0x402086: ??? (in /foo/bar/baz)
==4074== by 0x402A0F: ??? (in /foo/bar/baz)
==4074== by 0x41684F: ??? (in /foo/bar/baz)
==4074== by 0x4001B8: ??? (in /foo/bar/baz)
==4074== by 0x7FEFFFF57: ???
==4074== Uninitialised value was created
==4074== at 0x461D3A: ??? (in /foo/bar/baz)
==4074== by 0x43F926: ??? (in /foo/bar/baz)
==4074== by 0x416B9B: ??? (in /foo/bar/baz)
==4074== by 0x416725: ??? (in /foo/bar/baz)
==4074== by 0x4001B8: ??? (in /foo/bar/baz)
==4074== by 0x7FEFFFF57: ???
...
There are no parts of code hinted at, no names of variables, etc. What can I do with this information?
At the end, I finally get the following error, but — as with smaller datasets that do not crash — valgrind finds no leaks:
...
==4074== Process terminating with default action of signal 11 (SIGSEGV)
==4074== Access not within mapped region at address 0x7158E7F7
==4074== at 0x7158E7F7: ???
==4074== by 0x4020B8: ??? (in /foo/bar/baz)
==4074== by 0x6322203A22656D6E: ???
==4074== by 0x306C675F6E557267: ???
==4074== by 0x202C22373232302F: ???
==4074== by 0x6D616E656C696621: ???
==4074== by 0x72686322203A2264: ???
==4074== by 0x3030306C675F6E54: ???
==4074== by 0x346469702E373231: ???
==4074== by 0x646469662E34372F: ???
==4074== by 0x722E64616568656B: ???
==4074== by 0x63656D6F6C756764: ???
==4074== If you believe this happened as a result of a stack
==4074== overflow in your program's main thread (unlikely but
==4074== possible), you can try to increase the size of the
==4074== main thread stack using the --main-stacksize= flag.
==4074== The main thread stack size used in this run was 10485760.
==4074==
==4074== HEAP SUMMARY:
==4074== in use at exit: 0 bytes in 0 blocks
==4074== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==4074==
==4074== All heap blocks were freed -- no leaks are possible
==4074==
==4074== For counts of detected and suppressed errors, rerun with: -v
==4074== ERROR SUMMARY: 1603141870 errors from 86 contexts (suppressed: 0 from 0)
Segmentation fault
Everything I allocate space for gets an equivalent free statement, after which I set pointers to NULL.
At this point, how can I best debug this application, to determine what else is causing the segmentation fault?
22 Dec 2011 - Edit
I compiled a debug-version of my binary, called debug-binary, using the following compilation flags:
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE=1 -DUSE_ZLIB -g -O0 -Wformat -Wall -pedantic -std=gnu99
When I run it with valgrind, I don't get much more information:
valgrind -v --tool=memcheck --leak-check=yes --error-limit=no --track-origins=yes debug-binary input > output
Here's a snippet of output:
==25116== 2 errors in context 14 of 14:
==25116== Invalid read of size 4
==25116== at 0x4045E8: ??? (in /foo/bar/debug-binary)
==25116== by 0x40682F: ??? (in /foo/bar/debug-binary)
==25116== by 0x404F0C: ??? (in /foo/bar/debug-binary)
==25116== by 0x401FA4: ??? (in /foo/bar/debug-binary)
==25116== by 0x402016: ??? (in /foo/bar/debug-binary)
==25116== by 0x403B27: ??? (in /foo/bar/debug-binary)
==25116== by 0x40295E: ??? (in /foo/bar/debug-binary)
==25116== by 0x31A021D993: (below main) (in /lib64/libc-2.5.so)
==25116== Address 0x539f188 is 24 bytes inside a block of size 48 free'd
==25116== at 0x4A05D21: free (vg_replace_malloc.c:325)
==25116== by 0x401F6B: ??? (in /foo/bar/debug-binary)
==25116== by 0x402016: ??? (in /foo/bar/debug-binary)
==25116== by 0x403B27: ??? (in /foo/bar/debug-binary)
==25116== by 0x40295E: ??? (in /foo/bar/debug-binary)
==25116== by 0x31A021D993: (below main) (in /lib64/libc-2.5.so)
Is this an issue with my binary, or with a system library (libc) that my application is dependent upon?
I also don't know what to do about interpreting the ??? entries. Is there another compilation flag I need to get valgrind to provide more information?
Valgrind basically says there are no notable heap management issues. The program is segfaulting from a less complex programming fault.
If it were me, I would
compile it with gcc -g,
enable core dump files (ulimit -c unlimited),
run the program normally,
and let it fault
use gdb to examine the core file and look at what it was doing when it faulted:
gdb (programfile) (corefile)
bt
I don't believe valgrind is able to find all errors where you've overrun a value on the stack (but not overrun the stack itself). So, you may want to try gcc's -fstack-protector-all option.
You should also try mudflap, with -fmudflap (single-threaded) or -fmudflapth (multi-threaded).
Both mudflap and stack protector should be much faster than valgrind.
In addition, it looks like you don't have debug symbols, making reading backtraces difficult. Add -ggdb.
You probably also want to enable core-file generation (try ulimit -c unlimited). This way, you can try to debug the process post-crash by using gdb program core.
As @wallyk indicates, your segfault may actually be something fairly easy to find. For example, maybe you're dereferencing NULL, and gdb can point you to the exact line (or, well, close to it unless you compile with -O0). This would make sense if you're just running out of memory for your larger datasets, so malloc returns NULL, and you forgot to check that somewhere.
Lastly, if nothing else makes sense, there is always the possibility of hardware issues. But those would be expected to be fairly random, e.g., different values getting corrupted on different runs. If you try a different machine and it happens there too, it's extremely unlikely to be a hardware issue.
The "Conditional jump or move depends on uninitialised value" is a serious bug you need to fix. It indicates that the behaviour of your program is affected by the contents of an uninitialised variable (including an uninitialised memory region returned by malloc()).
To get readable backtraces from valgrind you need to compile with -g.

Resources