Before my program even starts I am receiving uninitialized value messages that reference function calls that are not in my program. Why am I receiving these messages, and how can I clean them up?
==24266== Conditional jump or move depends on uninitialised value(s)
==24266== at 0x809098A: __linkin_atfork (in /home/mbarry/workspace/datapup/src/plugin)
==24266== by 0x80919EB: _dl_non_dynamic_init (in /home/mbarry/workspace/datapup/src/plugin)
==24266== by 0x80921B1: __libc_init_first (in /home/mbarry/workspace/datapup/src/plugin)
==24266== by 0x805F60B: (below main) (in /home/mbarry/workspace/datapup/src/plugin)
==24266== Uninitialised value was created
==24266== at 0x8091662: _dl_sysinfo_int80 (in /home/mbarry/workspace/datapup/src/plugin)
==24266== by 0x80BE31F: brk (in /home/mbarry/workspace/datapup/src/plugin)
==24266== by 0x808DE99: sbrk (in /home/mbarry/workspace/datapup/src/plugin)
==24266== by 0x805F96B: __libc_setup_tls (in /home/mbarry/workspace/datapup/src/plugin)
==24266== by 0x805FB66: __pthread_initialize_minimal (in /home/mbarry/workspace/datapup/src/plugin)
==24266== by 0x805F5A3: (below main) (in /home/mbarry/workspace/datapup/src/plugin)
The memory issue was caused by incorrect use of the -D_THREAD_SAFE -D_REENTRANT -static flags in my gcc makefile.
I've been testing a C shared library for memory leaks. I got the output below, and I'd like to make sure my understanding of the output is correct.
I'm fairly well-acquainted with valgrind, but I'm used to the output having just one line below the "heap allocation" section, so I'd like to make sure I get this right. I tried to find more info in the valgrind manuals and here and other forums, but couldn't find anything.
Anyway, I ran with these parameters:
valgrind --leak-check=full --log-fd=1 --keep-debuginfo=yes --track-origins=yes
And got this output:
==28303== Conditional jump or move depends on uninitialised value(s)
==28303== at 0x82C80C0: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x830AA05: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x8301896: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0xB4E7E9A: func1 (in /opt/MyDir/MyExit.so_r)
==28303== by 0xB4E9FFD: func2 (in /opt/MyDir/MyExit.so_r)
==28303== by 0xB4EC62E: func3 (in /opt/MyDir/MyExit.so_r)
==28303== by 0xB4ECB7C: func4 (in /opt/MyDir/MyExit.so_r)
==28303== by 0x83AD1B7: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x831DA90: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x8301A1C: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x838DBCA: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x606E8A: ??? (in /opt/mqm/bin/dmpmqmsg)
==28303== Uninitialised value was created by a heap allocation
==28303== at 0x6C29F73: malloc (vg_replace_malloc.c:309)
==28303== by 0x8200401: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x848331B: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x839DC40: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x83167C4: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x83197D1: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x830092D: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x83833B4: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x838EE0C: ??? (in /opt/mqm/lib64/libmqe_r.so)
==28303== by 0x60316B: ??? (in /opt/mqm/bin/dmpmqmsg)
==28303== by 0x7680554: (below main) (in /usr/lib64/libc-2.17.so)
(changed func names for security reasons)
Here's my understanding:
Memory was first allocated in "libc-2.17.so" (or was it in "dmpmqmsg"?), and was not initialized.
After that, the call sequence was as follows:
libc-2.17.so >> dmpmqmsg >> libmqe_r.so >> MyExit.so_r (func4) >> ... >> MyExit.so_r (func1) >> libmqe_r.so
Finally the "conditional jump" which valgrind notified about was in libmqe_r.so at 0x82C80C0, probably an if(pmem != NULL) somewhere in this library
Is this the right interpretation?
In case it's relevant - I'm running on RedHat linux. My code is an MQ Exit which I compiled as a shared library.
First, looking at the "heap allocation" callstack. At the top is malloc (which has been intercepted by memcheck). Then you have a series of 8 calls in libmqe_r.so, all without debug info. Then there's the call from the guest executable, dmpmqmsg. The last line in the callstack, in libc, is the startup function that calls main.
Next, the actual error. Without debug info, it will be difficult to be certain. It looks like your func1 is calling a chain of 3 functions in libmqe_r.so, and passing in some uninitialized heap memory. It's also possible that your code is innocent and the topmost function is accessing some uninitialized static or global object.
I have almost no experience with MQ. There may be a package related to MQ that contains debugging symbols. Installing that would probably help.
I have a Raspberry Pi 4B running the latest, fully updated Raspbian. I am trying to make things work like they do on a Raspberry Pi 3B, but even a simple hello_world.c executed via valgrind is not without errors. The installed valgrind version is valgrind-3.7.0.
When I run this hello world program:
#include <stdio.h>
int main() {
puts("Hello, World!");
return 0;
}
compiled using gcc t.c -o t -g and executed using valgrind ./t I get tons of errors like this:
==5542== Memcheck, a memory error detector
==5542== Copyright (C) 2002-2011, and GNU GPL'd, by Julian Seward et al.
==5542== Using Valgrind-3.7.0 and LibVEX; rerun with -h for copyright info
==5542== Command: ./t
==5542==
--5542-- WARNING: Serious error when reading debug info
--5542-- When reading debug info from /lib/arm-linux-gnueabihf/ld-2.28.so:
--5542-- Ignoring non-Dwarf2/3/4 block in .debug_info
--5542-- WARNING: Serious error when reading debug info
--5542-- When reading debug info from /lib/arm-linux-gnueabihf/ld-2.28.so:
--5542-- Last block truncated in .debug_info; ignoring
==5542== Conditional jump or move depends on uninitialised value(s)
==5542== at 0x401A5D0: index (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==5542==
==5542== Conditional jump or move depends on uninitialised value(s)
==5542== at 0x401A5D4: index (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==5542==
==5542== Conditional jump or move depends on uninitialised value(s)
==5542== at 0x4008040: _dl_dst_count (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==5542==
==5542== Conditional jump or move depends on uninitialised value(s)
==5542== at 0x4008288: expand_dynamic_string_token (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==5542==
==5542== Conditional jump or move depends on uninitialised value(s)
==5542== at 0x401AA80: strlen (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==5542==
==5542== Conditional jump or move depends on uninitialised value(s)
==5542== at 0x401AA84: strlen (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==5542==
==5542== Conditional jump or move depends on uninitialised value(s)
==5542== at 0x4017F68: malloc (in /lib/arm-linux-gnueabihf/ld-2.28.so)
.....
==5542== Use of uninitialised value of size 4
==5542== at 0x40103D4: _dl_init (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==5542==
==5542== Use of uninitialised value of size 4
==5542== at 0x400FA00: _dl_fixup (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==5542==
==5542== Conditional jump or move depends on uninitialised value(s)
==5542== at 0x400FA8C: _dl_fixup (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==5542==
==5542== Use of uninitialised value of size 4
==5542== at 0x400FA8C: _dl_fixup (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==5542==
==5542== Use of uninitialised value of size 4
==5542== at 0x4015B4C: _dl_runtime_resolve (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==5542==
Hello, World!
==5542== Conditional jump or move depends on uninitialised value(s)
==5542== at 0x40105D0: _dl_fini (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==5542==
==5542== Conditional jump or move depends on uninitialised value(s)
==5542== at 0x4016178: _dl_sort_maps (in /lib/arm-linux-gnueabihf/ld-2.28.so)
==5542==
==5542== Use of uninitialised value of size 4
==5542== at 0x48F8824: free (in /lib/arm-linux-gnueabihf/libc-2.28.so)
==5542==
==5542== Use of uninitialised value of size 4
==5542== at 0x499F050: free_mem (in /lib/arm-linux-gnueabihf/libc-2.28.so)
==5542==
==5542== Conditional jump or move depends on uninitialised value(s)
==5542== at 0x499F0D0: free_mem (in /lib/arm-linux-gnueabihf/libc-2.28.so)
==5542==
==5542== Use of uninitialised value of size 4
==5542== at 0x499EF64: free_slotinfo (in /lib/arm-linux-gnueabihf/libc-2.28.so)
==5542==
==5542==
==5542== HEAP SUMMARY:
==5542== in use at exit: 0 bytes in 0 blocks
==5542== total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==5542==
==5542== All heap blocks were freed -- no leaks are possible
==5542==
==5542== For counts of detected and suppressed errors, rerun with: -v
==5542== Use --track-origins=yes to see where uninitialised values come from
==5542== ERROR SUMMARY: 4732 errors from 193 contexts (suppressed: 87 from 1)
Does anyone know what to do about this?
Thanks in advance!
I am a C newbie and learning valgrind.
The following is my program. It compiles fine, but when I run valgrind, I see a "Conditional jump or move depends on uninitialised value(s)" stack trace.
I am trying to find where is the uninitialised value in the program. I am seeing a similar output when I use "--track-origins=yes" as well.
I tried looking at other questions on stack overflow, but could not find a definitive answer on this.
Where is uninitialised value?
Code:
1 #include <stdio.h>
2
3 int main()
4 {
5 int x = 5;
6 x = 6;
7 printf ("Hello World %d\n", x);
8
9 return 0;
10 }
Valgrind output is below.
==10154== Conditional jump or move depends on uninitialised value(s)
==10154== at 0x1003FAC3F: _platform_memchr$VARIANT$Haswell (in /usr/lib/system/libsystem_platform.dylib)
==10154== by 0x1001EEB96: __sfvwrite (in /usr/lib/system/libsystem_c.dylib)
==10154== by 0x1001F8FE5: __vfprintf (in /usr/lib/system/libsystem_c.dylib)
==10154== by 0x10021E9AE: __v2printf (in /usr/lib/system/libsystem_c.dylib)
==10154== by 0x10021EC80: __xvprintf (in /usr/lib/system/libsystem_c.dylib)
==10154== by 0x1001F4B71: vfprintf_l (in /usr/lib/system/libsystem_c.dylib)
==10154== by 0x1001F29D7: printf (in /usr/lib/system/libsystem_c.dylib)
==10154== by 0x100000F5D: main (ex1.c:7)
==10154==
Hello World 6
Valgrind might be indicating that the uninitialised value is used within a system library beyond the scope of your program, as evidenced by the number of times "(in /usr/lib/system/libsystem_c.dylib)" appears in the trace you've quoted
It might or might not be a Valgrind error. Valgrind has historically had some serious problems when running on OS X. There are other, more stable options: I've heard quite a bit about Xcode Instruments, /usr/bin/leaks and /usr/bin/malloc_history, for example...
Here is the valgrind output from a project:
==2433== Invalid free() / delete / delete[] / realloc()
==2433== at 0x402B06C: free (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==2433== by 0x43F345B: av_freep (mem.c:172)
==2433== by 0x5A6F4D2: (below main) (libc-start.c:226)
==2433== Address 0xb3fd830 is 48 bytes inside a block of size 111,634 alloc'd
==2433== at 0x402BE68: malloc (in /usr/lib/valgrind/vgpreload_memcheck-x86-linux.so)
==2433== by 0x80BB6B5: _talloc_realloc (talloc.c:997)
The line starting with Address is indented by one more space than the line starting with Invalid. Does that mean one leads on to the other? Or are they separate?
If they are separate, where does the by 0x5A6F4D2: (below main) (libc-start.c:226) come from? I get the feeling (below main) has something to do with it, but I can't find libc-start.c anywhere on my hard drive.
Yes, it is providing you with additional details on the invalid free. The first four lines describe the invalid call (free in this case) and the call stack at the time of the free. The following three lines provide additional data. In this case, valgrind recognizes that the address passed to free is contained within an allocated region, and it provides the offset, size of the block, and call stack of that allocation.
According to valgrind.org, the hierarchy should be flat, as shown below:
==3016== Invalid write of size 1
==3016== at 0x80484DA: main (in /jfs/article/sample2)
==3016== by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==3016== by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample2)
==3016== Address 0x40CA0224 is 0 bytes after a block of size 512 alloc'd
==3016== at 0x400483E4: malloc (vg_clientfuncs.c:100)
==3016== by 0x80484AA: main (in /jfs/article/sample2)
==3016== by 0x40271507: __libc_start_main (../sysdeps/generic/libc-start.c:129)
==3016== by 0x80483B1: free@@GLIBC_2.0 (in /jfs/article/sample2)
I would read your output the same way as the example above, regardless of how the Address line is indented; the extra space may simply be a version-specific change to make the output more readable.
I wrote a C-based application that appears to run fine, except on very large datasets as input.
With large input, I get a segmentation fault at the end steps of the binary's functionality.
I ran the binary (with the test input) with valgrind:
valgrind --tool=memcheck --leak-check=yes /foo/bar/baz inputDataset > outputAnalysis
This job normally takes a few hours, but with valgrind it took seven days.
Unfortunately, at this point, I don't know how to read the results I am getting from this run.
I get a lot of these warnings:
...
==4074== Conditional jump or move depends on uninitialised value(s)
==4074== at 0x435900: ??? (in /foo/bar/baz)
==4074== by 0x439CC5: ??? (in /foo/bar/baz)
==4074== by 0x400BF2: ??? (in /foo/bar/baz)
==4074== by 0x402086: ??? (in /foo/bar/baz)
==4074== by 0x402A0F: ??? (in /foo/bar/baz)
==4074== by 0x41684F: ??? (in /foo/bar/baz)
==4074== by 0x4001B8: ??? (in /foo/bar/baz)
==4074== by 0x7FEFFFF57: ???
==4074== Uninitialised value was created
==4074== at 0x461D3A: ??? (in /foo/bar/baz)
==4074== by 0x43F926: ??? (in /foo/bar/baz)
==4074== by 0x416B9B: ??? (in /foo/bar/baz)
==4074== by 0x416725: ??? (in /foo/bar/baz)
==4074== by 0x4001B8: ??? (in /foo/bar/baz)
==4074== by 0x7FEFFFF57: ???
...
There are no parts of code hinted at, no names of variables, etc. What can I do with this information?
At the end, I finally get the following error, but — as with smaller datasets that do not crash — valgrind finds no leaks:
...
==4074== Process terminating with default action of signal 11 (SIGSEGV)
==4074== Access not within mapped region at address 0x7158E7F7
==4074== at 0x7158E7F7: ???
==4074== by 0x4020B8: ??? (in /foo/bar/baz)
==4074== by 0x6322203A22656D6E: ???
==4074== by 0x306C675F6E557267: ???
==4074== by 0x202C22373232302F: ???
==4074== by 0x6D616E656C696621: ???
==4074== by 0x72686322203A2264: ???
==4074== by 0x3030306C675F6E54: ???
==4074== by 0x346469702E373231: ???
==4074== by 0x646469662E34372F: ???
==4074== by 0x722E64616568656B: ???
==4074== by 0x63656D6F6C756764: ???
==4074== If you believe this happened as a result of a stack
==4074== overflow in your program's main thread (unlikely but
==4074== possible), you can try to increase the size of the
==4074== main thread stack using the --main-stacksize= flag.
==4074== The main thread stack size used in this run was 10485760.
==4074==
==4074== HEAP SUMMARY:
==4074== in use at exit: 0 bytes in 0 blocks
==4074== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==4074==
==4074== All heap blocks were freed -- no leaks are possible
==4074==
==4074== For counts of detected and suppressed errors, rerun with: -v
==4074== ERROR SUMMARY: 1603141870 errors from 86 contexts (suppressed: 0 from 0)
Segmentation fault
Everything I allocate space for gets an equivalent free statement, after which I set pointers to NULL.
At this point, how can I best debug this application, to determine what else is causing the segmentation fault?
22 Dec 2011 - Edit
I compiled a debug-version of my binary, called debug-binary, using the following compilation flags:
-D_FILE_OFFSET_BITS=64 -D_LARGEFILE64_SOURCE=1 -DUSE_ZLIB -g -O0 -Wformat -Wall -pedantic -std=gnu99
When I run it with valgrind, I don't get much more information:
valgrind -v --tool=memcheck --leak-check=yes --error-limit=no --track-origins=yes debug-binary input > output
Here's a snippet of output:
==25116== 2 errors in context 14 of 14:
==25116== Invalid read of size 4
==25116== at 0x4045E8: ??? (in /foo/bar/debug-binary)
==25116== by 0x40682F: ??? (in /foo/bar/debug-binary)
==25116== by 0x404F0C: ??? (in /foo/bar/debug-binary)
==25116== by 0x401FA4: ??? (in /foo/bar/debug-binary)
==25116== by 0x402016: ??? (in /foo/bar/debug-binary)
==25116== by 0x403B27: ??? (in /foo/bar/debug-binary)
==25116== by 0x40295E: ??? (in /foo/bar/debug-binary)
==25116== by 0x31A021D993: (below main) (in /lib64/libc-2.5.so)
==25116== Address 0x539f188 is 24 bytes inside a block of size 48 free'd
==25116== at 0x4A05D21: free (vg_replace_malloc.c:325)
==25116== by 0x401F6B: ??? (in /foo/bar/debug-binary)
==25116== by 0x402016: ??? (in /foo/bar/debug-binary)
==25116== by 0x403B27: ??? (in /foo/bar/debug-binary)
==25116== by 0x40295E: ??? (in /foo/bar/debug-binary)
==25116== by 0x31A021D993: (below main) (in /lib64/libc-2.5.so)
Is this an issue with my binary, or with a system library (libc) that my application is dependent upon?
I also don't know what to do about interpreting the ??? entries. Is there another compilation flag I need to get valgrind to provide more information?
Valgrind basically says there are no notable heap management issues. The program is segfaulting from a less complex programming fault.
If it were me, I would
compile it with gcc -g,
enable core dump files (ulimit -c unlimited),
run the program normally,
and let it fault
use gdb to examine the core file and look at what it was doing when it faulted:
gdb (programfile) (corefile)
bt
I don't believe valgrind is able to find all errors where you've overrun a value on the stack (but not overrun the stack itself). So, you may want to try gcc's -fstack-protector-all option.
You should also try mudflap, with -fmudflap (single-threaded) or -fmudflapth (multi-threaded). Note that mudflap was removed in GCC 4.9; on newer compilers, AddressSanitizer (-fsanitize=address) is the usual replacement.
Both mudflap and stack protector should be much faster than valgrind.
In addition, it looks like you don't have debug symbols, which makes reading backtraces difficult. Add -ggdb.
You probably also want to enable core-file generation (try ulimit -c unlimited). This way, you can try to debug the process post-crash by using gdb program core.
As @wallyk indicates, your segfault may actually be something fairly easy to find. For example, maybe you're dereferencing NULL, and gdb can point you to the exact line (or, well, close to it unless you compile with -O0). This would make sense if, for example, you're just running out of memory for your larger datasets, so malloc returns NULL, and you forgot to check that somewhere.
Lastly, if nothing else makes sense, there is always the possibility of hardware issues. But those would be expected to be fairly random, e.g., different values getting corrupted on different runs. If you try a different machine and it happens there too, it's extremely unlikely to be a hardware issue.
The "Conditional jump or move depends on uninitialised value" is a serious bug you need to fix. It indicates that the behaviour of your program is affected by the contents of an uninitialised variable (including an uninitialised memory region returned by malloc()).
To get readable backtraces from valgrind you need to compile with -g.