What could cause malloc to initialise memory? - c

I am writing code to use a library called SCIP (solves optimisation problems). The library itself can be compiled in two ways: create a set of .a files, then the binary, OR create a set of shared objects. In both cases, SCIP is compiled with its own, rather large, Makefile.
I have two implementations, one which compiles with the .a files (I'll call this program 1), the other links with the shared objects (I'll call this program 2). Program 1 is compiled using a SCIP-provided makefile, whereas program 2 is compiled using my own, much simpler makefile.
The behaviour I'm encountering occurs in the SCIP code, not in code that I wrote. The code extract is as follows:
void* BMSallocMemory_call(size_t size)
{
void* ptr;
size = MAX(size, 1);
ptr = malloc(size);
// This is where I call gdb print statements.
if( ptr == NULL )
{
printf("ERROR - unable to allocate memory for a SCIP*.\n");
}
return ptr;
}
void SCIPcreate(SCIP** A)
{
*A = (SCIP*)BMSallocMemory_call(sizeof(**(A)));
.
.
.
}
If I debug this code in gdb, and step through BMSallocMemory_call() in order to see what's happening, and view the contents of *((SCIP*)(ptr)), I get the following output:
Program 1 gdb output:
289 size = MAX(size, 1);
(gdb) step
284 {
(gdb)
289 size = MAX(size, 1);
(gdb)
290 ptr = malloc(size);
(gdb) print ptr
$1 = <value optimised out>
(gdb) step
292 if( ptr == NULL )
(gdb) print ptr
$2 = <value optimised out>
(gdb) step
290 ptr = malloc(size);
(gdb) print ptr
$3 = (void *) 0x8338448
(gdb) print *((SCIP*)(ptr))
$4 = {mem = 0x0, set = 0x0, interrupt = 0x0, dialoghdlr = 0x0, totaltime = 0x0, stat = 0x0, origprob = 0x0, eventfilter = 0x0, eventqueue = 0x0, branchcand = 0x0, lp = 0x0, nlp = 0x0, relaxation = 0x0, primal = 0x0, tree = 0x0, conflict = 0x0, cliquetable = 0x0, transprob = 0x0, pricestore = 0x0, sepastore = 0x0, cutpool = 0x0}
Program 2 gdb output:
289 size = MAX(size, 1);
(gdb) step
290 ptr = malloc(size);
(gdb) print ptr
$1 = (void *) 0xb7fe450c
(gdb) print *((SCIP*)(ptr))
$2 = {mem = 0x1, set = 0x8232360, interrupt = 0x1, dialoghdlr = 0xb7faa6f8, totaltime = 0x0, stat = 0xb7fe45a0, origprob = 0xb7fe4480, eventfilter = 0xfffffffd, eventqueue = 0x1, branchcand = 0x826e6a0, lp = 0x8229c20, nlp = 0xb7fdde80, relaxation = 0x822a0d0, primal = 0xb7f77d20, tree = 0xb7fd0f20, conflict = 0xfffffffd, cliquetable = 0x1, transprob = 0x8232360, pricestore = 0x1, sepastore = 0x822e0b8, cutpool = 0x0}
The only reason I can think of is that in either program 1's or SCIP's makefile, there is some sort of option that forces malloc to initialise memory it allocates. I simply must learn why the structure is initialised in the compiled implementation, and is not in the shared object implementation.

I doubt the difference has to do with how the two programs are built.
malloc does not initialize the memory it allocates. It may so happen by chance that the memory you get back is filled with zeroes. For example, a program that's just started is more likely to get zero-filled memory from malloc than a program that's been running for a while and allocating/deallocating memory.
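If code needs zeroed memory, it has to ask for it rather than rely on what malloc happens to return. A minimal sketch, assuming a hypothetical struct demo_t standing in for something like SCIP (its fields and the function names are made up for illustration); calloc is the standard call that guarantees zero-filled bytes:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a struct such as SCIP; the fields are made up. */
typedef struct demo {
    void *mem;
    void *set;
    int   count;
} demo_t;

/* calloc guarantees that every byte of the returned block is zero. */
demo_t *demo_create_zeroed(void)
{
    return calloc(1, sizeof(demo_t));
}

/* Equivalent with malloc: the block's contents are indeterminate until we
 * overwrite them, so zero them explicitly. */
demo_t *demo_create_malloc_then_zero(void)
{
    demo_t *d = malloc(sizeof *d);
    if (d != NULL)
        memset(d, 0, sizeof *d);
    return d;
}
```

Either function returns a block whose bytes are all zero regardless of how the program was linked or how long it has been running; the caller still has to check for NULL and free the result.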
Edit: You may find the following past questions of interest:
malloc zeroing out memory?
Create a wrapper function for malloc and free in C
When and why will an OS initialise memory to 0xCD, 0xDD, etc. on malloc/free/new/delete?

Initialization of malloc-ed memory may be implementation dependent. Implementations are free not to do so for performance reasons, but they could initialize the memory for example in debug mode.
One more note. Even uninitialized memory may contain zeros.
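One way to flush out code that silently depends on zero-filled memory is to do what some debug allocators do and fill each new block with a recognisable non-zero pattern. A sketch, where debug_malloc and the 0xCD fill byte are illustrative choices and not part of SCIP or the C library:

```c
#include <stdlib.h>
#include <string.h>

#define DEBUG_FILL_BYTE 0xCD  /* illustrative pattern, chosen arbitrarily */

/* Hypothetical wrapper: allocate like malloc, then poison the block so that
 * any read-before-write shows up as 0xCDCDCDCD... instead of a lucky zero. */
void *debug_malloc(size_t size)
{
    void *ptr = malloc(size);
    if (ptr != NULL)
        memset(ptr, DEBUG_FILL_BYTE, size);
    return ptr;
}
```

If a program behaves differently when its allocations go through such a wrapper, it was probably relying on memory that only happened to be zero.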

On Linux, according to this thread, memory will be zero-filled when first handed to the application. Thus, if your call to malloc() caused the program's heap to grow, the "new" memory will be zero-filled.
One way to verify is of course to just step into malloc() from your routine, that should make it pretty clear whether or not it contains code to initialize the memory, directly.

Related

gdb - how to call memset for the array of pointers

I debug an example program which defines the array of pointers:
int a = 1, b = 2, c = 3;
int* t[] = {&a, &b, &c};
I would like to set all pointers in the array to NULL during debugging. When I use the following command:
call memset(t, 0x0, sizeof(int*)*3)
I get this output:
$3 = (void *(*)(void *, int, size_t)) 0x7ffff77e7e10 <__memset_avx2_unaligned_erms>
When I print the array pointers are not set to NULL:
(gdb) print t
$4 = {0x7fffffffddc0, 0x7fffffffddc4, 0x7fffffffddc8}
What is wrong ?
"I get this output:"
You get this output because in your version of GLIBC memset is a GNU indirect function. It doesn't write any memory, it returns an address of the actual implementation (__memset_avx2_unaligned_erms in your case).
You can verify that this is the case:
$ readelf -Ws /lib64/libc.so.6 | grep ' memset'
1233: 00000000000b2df0 241 IFUNC GLOBAL DEFAULT 14 memset@@GLIBC_2.2.5
557: 00000000000b2df0 241 FUNC LOCAL DEFAULT 14 memset_ifunc
6000: 00000000000b2df0 241 IFUNC GLOBAL DEFAULT 14 memset
To actually set the memory, you need to call the implementation function, such as __memset_avx2_unaligned_erms.
P.S. To memset an array of 3 pointers, it's easier to simply set each one individually: (gdb) set var t[0] = 0. But I assume the object you actually want to zero out is larger.
For ease of debugging, you may write a trivial local_memset() and call it instead.
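A minimal sketch of such a helper follows. The name local_memset comes from the answer above; the body is only one possible implementation, and the volatile qualifier is an assumption about how to keep the compiler from turning the loop back into a call to the library memset:

```c
#include <stddef.h>

/* Hypothetical helper: an ordinary function (not an IFUNC), so gdb's
 * `call` runs it directly. The volatile pointer is there to discourage the
 * compiler from replacing the loop with a call to the library memset. */
void *local_memset(void *dst, int value, size_t n)
{
    volatile unsigned char *p = dst;
    for (size_t i = 0; i < n; i++)
        p[i] = (unsigned char)value;
    return dst;
}
```

With this compiled into the program being debugged, the gdb command becomes call local_memset(t, 0x0, sizeof(int*)*3). The function has to survive linking, so it may need to be referenced somewhere or built without dead-code elimination.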
Building on Employed Russian's answer, insert () and use
call memset()(t, 0x0, sizeof(int*)*3)
That works because memset() returns the function you actually want to call.

How do I assign an array to a set of GtkWidget * in C?

I have an array of label names FlagEnt[NFLAGS] and I want to assign a GtkWidget * to an array flentry[NFLAGS]. I've tried to do it like this:
for (i = 0; i < NFLAGS; i++)
flentry[i] = GTK_WIDGET(gtk_builder_get_object(builder, FlagEnt[i]));
That didn't work and I ended up with:
(gdb) p flentry
$1 = {0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
not the array of pointers I was expecting, so I looked at what flentry should look like:
(gdb) p &flentry
$3 = (GtkWidget *(*)[8]) 0x5555557cdb60 <flentry>
I noticed the extra * in the definition; that must be from the fact that I declared flentry as:
static GtkWidget * flentry[NFLAGS];
ie as an array. I tested this hypothesis with:
static GtkWidget * tflentry;
tflentry = GTK_WIDGET(gtk_builder_get_object(builder, FlagEnt[0]));
(gdb) p tflentry
$1 = 0x555555aa1d60
(gdb) p *tflentry
$2 = {parent_instance = {g_type_instance = {g_class = 0x5555558e9110}, ref_count = 1, qdata = 0x555555aabdf0}, priv = 0x555555aa1c70}
And, Hey presto! tflentry comes up as a valid pointer. Array problems it is, then.
Then I tried to directly assign an element of the array to a pointer I knew worked:
static GtkWidget * tflentry;
tflentry = GTK_WIDGET(gtk_builder_get_object(builder, FlagEnt[0]));
flentry[0] = tflentry;
(gdb) p tflentry
$2 = 0x555555ab2a00
(gdb) p flentry[0]
$8 = 0x0
I don't know how to assign the pointer tflentry to the array flentry[].
Anyone able to give me a clue? :)

Extern variable seems to have two addresses?

I am working with a library that declares a uint32_t in a .h and defines it in a .c. The program can't read the variable and crashes.
Declaration :
extern uint32_t SystemCoreClock;
Definition :
uint32_t SystemCoreClock = 4000000;
Using the variable :
uint32_t tickNum=123, *ptr1=NULL, *ptr2=NULL; // added by me to debug
ptr1 = &tickNum; // added by me to debug
ptr2 = &SystemCoreClock; // added by me to debug
tickNum = SystemCoreClock; // added by me to debug
Then I check the addresses with gdb.
Execute first 3 lines.
Normal stuff (ptr1 == &ticknum and *ptr1 == ticknum):
(gdb) p &tickNum
$3 = (uint32_t *) 0x20017fdc
(gdb) p ptr1
$4 = (uint32_t *) 0x20017fdc
(gdb) p *ptr1
$5 = 123
Now execute the last line of code.
Strange stuff (ptr2 != &SystemCoreClock and *ptr2 != SystemCoreClock) :
(gdb) p &SystemCoreClock
$6 = (uint32_t *) 0x20000004 <SystemCoreClock>
(gdb) p ptr2
$7 = (uint32_t *) 0x681b4b20 // should be 0x20000004, beyond end of memory (see memory map below)
(gdb) p SystemCoreClock
$8 = 4000000
(gdb) p *ptr2
$9 = 0 // should be 4000000
(gdb) p tickNum
$10 = 0 // should be 4000000
How can that be ?!
Memory map, my guess from the linker script :
0x20000000 - 0x20018000, LENGTH = 96K RAM
0x10000000 - 0x10008000, LENGTH = 32K RAM
0x08000000 - 0x08100000, LENGTH = 1024K FLASH
Real linker script :
/* Specify the memory areas */
MEMORY
{
RAM (xrw) : ORIGIN = 0x20000000, LENGTH = 96K
RAM2 (xrw) : ORIGIN = 0x10000000, LENGTH = 32K
FLASH (rx) : ORIGIN = 0x8000000, LENGTH = 1024K
}
Target : STM32L4 ARM-CortexM4 MCU
Host : Linux, Qt Creator, arm-none-eabi-gcc, OpenOCD
Versions : CubeMX v4.18.0 , STM32L4 package v1.6.0, gcc v5.4.1

Memory Corruption in C linked list

My program randomly crashes with a segmentation fault after running for a few hours. My environment is Ubuntu (Linux).
When I try to print the data structure that was being accessed when it crashed, the pointer is always pointing to invalid memory.
(gdb) p *xxx_info[8]
Cannot access memory at address 0x7fd200000000
(gdb)
In order to detect data corruption I added two fence variables, hardcoded to a well-defined value. From my earlier logs I can see that the fence variables of the linked list node that caused my process to crash had been violated.
(gdb) p xxx_info
$2 = {0x7fd248000c30, 0x7fd248001050, 0x7fd248000b30, 0x7fd248000d40, 0x7fd248000f50, 0x0,
0x7fd248001160, 0x7fd248001280, 0x7fd200000000, 0x7fd2480008c0,
0x7fd248003000, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0}
(gdb) p xxx_info[8]
$3 = (xxx_info_t *) 0x7fd200000000
(gdb) p *xxx_info[8]
Cannot access memory at address 0x7fd200000000
(gdb)
I added the two fence variables at the start and the end of the data structures.
(gdb) pt xxx_info_t
type = struct xxx_info_t {
uint32_t begin_fence;
char *str;
int cluster_id;
uint32_t end_fence;
}
Under normal circumstances the fence variables MUST always be 0xdeaddead as shown below:
(gdb) p/x *xxx_info[7]
$4 = {begin_fence= 0xdeaddead, str= 0x7fd248002f10, cluster_id= 0x1f2, end_fence= 0xdeaddead}
(gdb)
Whenever I access the array of pointers xxx_info I check for the fence values for each index. I noticed that I would get error messages saying that the fence variables for index 8 look corrupted. Later the code crashed when accessing index 8.
This means that somewhere I am overwriting the memory address pointed to by the index 8 of the array xxx_info .
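The fence check described above might look roughly like the following. This is a sketch under stated assumptions: the struct layout is copied from the ptype output, while FENCE_VALUE, xxx_info_fences_ok and xxx_info_first_bad are hypothetical names invented here, not taken from the original program:

```c
#include <stdint.h>
#include <stddef.h>

#define FENCE_VALUE 0xdeaddeadu  /* the well-defined fence value */

typedef struct xxx_info_t {
    uint32_t begin_fence;
    char    *str;
    int      cluster_id;
    uint32_t end_fence;
} xxx_info_t;

/* Hypothetical helper: returns 1 if both fences are intact, 0 otherwise.
 * A NULL entry counts as "nothing to check" and is reported as intact. */
int xxx_info_fences_ok(const xxx_info_t *info)
{
    if (info == NULL)
        return 1;
    return info->begin_fence == FENCE_VALUE && info->end_fence == FENCE_VALUE;
}

/* Scan the pointer array; return the index of the first node whose fences
 * are damaged, or -1 if every non-NULL node looks intact. */
int xxx_info_first_bad(xxx_info_t *const *table, size_t count)
{
    for (size_t i = 0; i < count; i++)
        if (!xxx_info_fences_ok(table[i]))
            return (int)i;
    return -1;
}
```

Note that this kind of check dereferences the stored pointer, so it can only detect damage to the node itself. It cannot safely detect a damaged pointer value in the array, such as 0x7fd200000000 at index 8, because reading through that pointer is exactly what crashes.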
My question is:
How can I debug these errors? Can I dynamically set breakpoints so that whenever somebody overwrites that memory address I complain and assert? If this were a global variable that somebody was corrupting, I could have set a HW breakpoint. Since this is a list created on the heap (I use malloc), the addresses I get are dynamic, and hence I can't use memory breakpoints/watchpoints.
Any ideas on what I can do?

mudflap error while using socket()

When compiling like this I get the following mudflap violation and I have no clue what it means:
(I am using Debian squeeze, gcc 4.4.5 and eglibc 2.11.2)
mudflap:
myuser@linux:~/Desktop$ export MUDFLAP_OPTIONS="-mode-check -viol-abort -internal-checking -print-leaks -check-initialization -verbose-violations -crumple-zone=32"
myuser@linux:~/Desktop$ gcc -std=c99 -D_POSIX_C_SOURCE=200112L -ggdb3 -O0 -fmudflap -funwind-tables -lmudflap -rdynamic myprogram.c
myuser@linux:~/Desktop$ ./a.out
*******
mudflap violation 1 (check/read): time=1303221485.951128 ptr=0x70cf10 size=16
pc=0x7fc51c9b1cc1 location=`myprogram.c:22:18 (main)'
/usr/lib/libmudflap.so.0(__mf_check+0x41) [0x7fc51c9b1cc1]
./a.out(main+0x113) [0x400b97]
/lib/libc.so.6(__libc_start_main+0xfd) [0x7fc51c665c4d]
Nearby object 1: checked region begins 0B into and ends 15B into
mudflap object 0x70cf90: name=`malloc region'
bounds=[0x70cf10,0x70cf5b] size=76 area=heap check=1r/0w liveness=1
alloc time=1303221485.949881 pc=0x7fc51c9b1431
/usr/lib/libmudflap.so.0(__mf_register+0x41) [0x7fc51c9b1431]
/usr/lib/libmudflap.so.0(__wrap_malloc+0xd2) [0x7fc51c9b2a12]
/lib/libc.so.6(+0xaada5) [0x7fc51c6f1da5]
/lib/libc.so.6(getaddrinfo+0x162) [0x7fc51c6f4782]
Nearby object 2: checked region begins 640B before and ends 625B before
mudflap dead object 0x70d3f0: name=`malloc region'
bounds=[0x70d190,0x70d3c7] size=568 area=heap check=0r/0w liveness=0
alloc time=1303221485.950059 pc=0x7fc51c9b1431
/usr/lib/libmudflap.so.0(__mf_register+0x41) [0x7fc51c9b1431]
/usr/lib/libmudflap.so.0(__wrap_malloc+0xd2) [0x7fc51c9b2a12]
/lib/libc.so.6(+0x6335b) [0x7fc51c6aa35b]
/lib/libc.so.6(+0xac964) [0x7fc51c6f3964]
dealloc time=1303221485.950696 pc=0x7fc51c9b0fe6
/usr/lib/libmudflap.so.0(__mf_unregister+0x36) [0x7fc51c9b0fe6]
/usr/lib/libmudflap.so.0(__real_free+0xa0) [0x7fc51c9b2f40]
/lib/libc.so.6(fclose+0x14d) [0x7fc51c6a9a1d]
/lib/libc.so.6(+0xacc1a) [0x7fc51c6f3c1a]
number of nearby objects: 2
Aborted (core dumped)
myuser@linux:~/Desktop$
gdb:
(gdb) bt
#0 0x00007fd30f18136e in __libc_waitpid (pid=, stat_loc=0x7fff3689d75c, options=) at ../sysdeps/unix/sysv/linux/waitpid.c:32
#1 0x00007fd30f11f299 in do_system (line=) at ../sysdeps/posix/system.c:149
#2 0x00007fd30f44a9c3 in __mf_violation (ptr=, sz=, pc=0, location=0x7fff3689d880 "\360\323p", type=)
at ../../../src/libmudflap/mf-runtime.c:2174
#3 0x00007fd30f44ba5d in __mfu_check (ptr=0x70cf10, sz=, type=, location=)
at ../../../src/libmudflap/mf-runtime.c:1037
#4 0x00007fd30f44bcc1 in __mf_check (ptr=0x70cf10, sz=16, type=0, location=0x400e5a "myprogram.c:22:18 (main)") at ../../../src/libmudflap/mf-runtime.c:816
#5 0x0000000000400b97 in main () at myprogram.c:5
(gdb) bt full
#0 0x00007fd30f18136e in __libc_waitpid (pid=, stat_loc=0x7fff3689d75c, options=) at ../sysdeps/unix/sysv/linux/waitpid.c:32
oldtype =
result =
#1 0x00007fd30f11f299 in do_system (line=) at ../sysdeps/posix/system.c:149
__result = -512
_buffer = {__routine = 0x7fd30f11f5f0 , __arg = 0x7fff3689d758, __canceltype = 915003406, __prev = 0x7fd30f459348}
_avail = 0
status =
save =
pid = 5385
sa = {__sigaction_handler = {sa_handler = 0x1, sa_sigaction = 0x1}, sa_mask = {__val = {65536, 0 }}, sa_flags = 0, sa_restorer = 0x7fd30f0ec578}
omask = {__val = {0, 4294967295, 206158430240, 1, 2212816, 0, 140734108391560, 3, 140544470949888, 140544474854386, 140544214827009, 0, 7394247, 140544467453304,
140544471045644, 140734108391424}}
#2 0x00007fd30f44a9c3 in __mf_violation (ptr=, sz=, pc=0, location=0x7fff3689d880 "\360\323p", type=)
at ../../../src/libmudflap/mf-runtime.c:2174
buf = "gdb --pid=5384\000\000\037\317p\000\000\000\000\000\377\377\377\377\000\000\000\000(\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000`\306!", '\000' , "\037\317p\000\000\000\000\000\020\317p\000\000\000\000\000\000 D\017\323\177\000\000\362\263\177\017\323\177\000\000\001\000\000\000\377\177\000\000\000\000\000\000\000\000\000\000\340Pp\000\000\000\000\000hHD\017\323\177\000"
violation_number = 1
#3 0x00007fd30f44ba5d in __mfu_check (ptr=0x70cf10, sz=, type=, location=)
at ../../../src/libmudflap/mf-runtime.c:1037
entry_idx = 1
entry = 0x604ec0
judgement = -512
ptr_high = 140734108391840
__PRETTY_FUNCTION__ = "__mfu_check"
#4 0x00007fd30f44bcc1 in __mf_check (ptr=0x70cf10, sz=16, type=0, location=0x400e5a "myprogram.c:22:18 (main)") at ../../../src/libmudflap/mf-runtime.c:816
__PRETTY_FUNCTION__ = "__mf_check"
#5 0x0000000000400b97 in main () at myprogram.c:5
hints = {ai_flags = 0, ai_family = 0, ai_socktype = 1, ai_protocol = 6, ai_addrlen = 0, ai_addr = 0x0, ai_canonname = 0x0, ai_next = 0x0}
result = 0x70cf10
newsocket = 0
(gdb) quit
source code:
#include <stdio.h>
#include <sys/socket.h>
#include <netdb.h>
int main(void)
{
struct addrinfo hints, *result;
hints.ai_flags = 0;
hints.ai_family = AF_UNSPEC;
hints.ai_socktype = SOCK_STREAM;
hints.ai_protocol = IPPROTO_TCP;
hints.ai_addrlen = 0;
hints.ai_canonname = NULL;
hints.ai_addr = NULL;
hints.ai_next = NULL;
if(getaddrinfo("localhost", "25", &hints, &result) != 0)
{
return -1;
}
int newsocket = socket(result->ai_family, result->ai_socktype, result->ai_protocol); // line 22
if(newsocket == -1)
{
freeaddrinfo(result);
return -2;
}
return 0;
}
It appears to be complaining about a read of uninitialized data ("mudflap violation 1 (check/read)"). It looks like there are a couple of known regions near the bad address. One a bit further on ("checked region begins 640B before and ends 625B before") has already been freed ("mudflap dead object"). The other actually begins in the same place as the bad read ("checked region begins 0B into and ends 15B into mudflap object 0x70cf90: name=`malloc region'").
Why don't you set -viol-gdb in MUDFLAP_OPTIONS and use GDB to examine the erroneous code?
ETA: The violation occurs because the access history for this region is "check=1r/0w". This indicates that you are reading from it, but, as far as libmudflap knows, the region has never been written to. The read thus represents a "use before initialization" error. This is exactly what the -check-initialization flag you supplied to libmudflap is intended to catch.
Of course, the problem is just that your libc is not instrumented by libmudflap, so while libmudflap can intercept the malloc call, it cannot intercept the pointer accesses that are used to initialize the memory. When your program tries to work with the pointer, it thus looks like all its memory has been allocated but never written to (indeed, never accessed at all).
You can ignore this error, drop -check-initialization so it stops being flagged as an error, or build a libc instrumented for libmudflap and link your executable against that version of libc.

Resources