When compiling like this I get the following mudflap violation and I have no clue what it means:
(I am using Debian squeeze, gcc 4.4.5 and eglibc 2.11.2)
mudflap:
myuser@linux:~/Desktop$ export MUDFLAP_OPTIONS="-mode-check -viol-abort -internal-checking -print-leaks -check-initialization -verbose-violations -crumple-zone=32"
myuser@linux:~/Desktop$ gcc -std=c99 -D_POSIX_C_SOURCE=200112L -ggdb3 -O0 -fmudflap -funwind-tables -lmudflap -rdynamic myprogram.c
myuser@linux:~/Desktop$ ./a.out
*******
mudflap violation 1 (check/read): time=1303221485.951128 ptr=0x70cf10 size=16
pc=0x7fc51c9b1cc1 location=`myprogram.c:22:18 (main)'
/usr/lib/libmudflap.so.0(__mf_check+0x41) [0x7fc51c9b1cc1]
./a.out(main+0x113) [0x400b97]
/lib/libc.so.6(__libc_start_main+0xfd) [0x7fc51c665c4d]
Nearby object 1: checked region begins 0B into and ends 15B into
mudflap object 0x70cf90: name=`malloc region'
bounds=[0x70cf10,0x70cf5b] size=76 area=heap check=1r/0w liveness=1
alloc time=1303221485.949881 pc=0x7fc51c9b1431
/usr/lib/libmudflap.so.0(__mf_register+0x41) [0x7fc51c9b1431]
/usr/lib/libmudflap.so.0(__wrap_malloc+0xd2) [0x7fc51c9b2a12]
/lib/libc.so.6(+0xaada5) [0x7fc51c6f1da5]
/lib/libc.so.6(getaddrinfo+0x162) [0x7fc51c6f4782]
Nearby object 2: checked region begins 640B before and ends 625B before
mudflap dead object 0x70d3f0: name=`malloc region'
bounds=[0x70d190,0x70d3c7] size=568 area=heap check=0r/0w liveness=0
alloc time=1303221485.950059 pc=0x7fc51c9b1431
/usr/lib/libmudflap.so.0(__mf_register+0x41) [0x7fc51c9b1431]
/usr/lib/libmudflap.so.0(__wrap_malloc+0xd2) [0x7fc51c9b2a12]
/lib/libc.so.6(+0x6335b) [0x7fc51c6aa35b]
/lib/libc.so.6(+0xac964) [0x7fc51c6f3964]
dealloc time=1303221485.950696 pc=0x7fc51c9b0fe6
/usr/lib/libmudflap.so.0(__mf_unregister+0x36) [0x7fc51c9b0fe6]
/usr/lib/libmudflap.so.0(__real_free+0xa0) [0x7fc51c9b2f40]
/lib/libc.so.6(fclose+0x14d) [0x7fc51c6a9a1d]
/lib/libc.so.6(+0xacc1a) [0x7fc51c6f3c1a]
number of nearby objects: 2
Aborted (core dumped)
myuser@linux:~/Desktop$
gdb:
(gdb) bt
#0 0x00007fd30f18136e in __libc_waitpid (pid=, stat_loc=0x7fff3689d75c, options=) at ../sysdeps/unix/sysv/linux/waitpid.c:32
#1 0x00007fd30f11f299 in do_system (line=) at ../sysdeps/posix/system.c:149
#2 0x00007fd30f44a9c3 in __mf_violation (ptr=, sz=, pc=0, location=0x7fff3689d880 "\360\323p", type=)
at ../../../src/libmudflap/mf-runtime.c:2174
#3 0x00007fd30f44ba5d in __mfu_check (ptr=0x70cf10, sz=, type=, location=)
at ../../../src/libmudflap/mf-runtime.c:1037
#4 0x00007fd30f44bcc1 in __mf_check (ptr=0x70cf10, sz=16, type=0, location=0x400e5a "myprogram.c:22:18 (main)") at ../../../src/libmudflap/mf-runtime.c:816
#5 0x0000000000400b97 in main () at myprogram.c:5
(gdb) bt full
#0 0x00007fd30f18136e in __libc_waitpid (pid=, stat_loc=0x7fff3689d75c, options=) at ../sysdeps/unix/sysv/linux/waitpid.c:32
oldtype =
result =
#1 0x00007fd30f11f299 in do_system (line=) at ../sysdeps/posix/system.c:149
__result = -512
_buffer = {__routine = 0x7fd30f11f5f0 , __arg = 0x7fff3689d758, __canceltype = 915003406, __prev = 0x7fd30f459348}
_avail = 0
status =
save =
pid = 5385
sa = {__sigaction_handler = {sa_handler = 0x1, sa_sigaction = 0x1}, sa_mask = {__val = {65536, 0 }}, sa_flags = 0, sa_restorer = 0x7fd30f0ec578}
omask = {__val = {0, 4294967295, 206158430240, 1, 2212816, 0, 140734108391560, 3, 140544470949888, 140544474854386, 140544214827009, 0, 7394247, 140544467453304,
140544471045644, 140734108391424}}
#2 0x00007fd30f44a9c3 in __mf_violation (ptr=, sz=, pc=0, location=0x7fff3689d880 "\360\323p", type=)
at ../../../src/libmudflap/mf-runtime.c:2174
buf = "gdb --pid=5384\000\000\037\317p\000\000\000\000\000\377\377\377\377\000\000\000\000(\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000`\306!", '\000' , "\037\317p\000\000\000\000\000\020\317p\000\000\000\000\000\000 D\017\323\177\000\000\362\263\177\017\323\177\000\000\001\000\000\000\377\177\000\000\000\000\000\000\000\000\000\000\340Pp\000\000\000\000\000hHD\017\323\177\000"
violation_number = 1
#3 0x00007fd30f44ba5d in __mfu_check (ptr=0x70cf10, sz=, type=, location=)
at ../../../src/libmudflap/mf-runtime.c:1037
entry_idx = 1
entry = 0x604ec0
judgement = -512
ptr_high = 140734108391840
__PRETTY_FUNCTION__ = "__mfu_check"
#4 0x00007fd30f44bcc1 in __mf_check (ptr=0x70cf10, sz=16, type=0, location=0x400e5a "myprogram.c:22:18 (main)") at ../../../src/libmudflap/mf-runtime.c:816
__PRETTY_FUNCTION__ = "__mf_check"
#5 0x0000000000400b97 in main () at myprogram.c:5
hints = {ai_flags = 0, ai_family = 0, ai_socktype = 1, ai_protocol = 6, ai_addrlen = 0, ai_addr = 0x0, ai_canonname = 0x0, ai_next = 0x0}
result = 0x70cf10
newsocket = 0
(gdb) quit
source code:
#include <stdio.h>
#include <sys/socket.h>
#include <netdb.h>
int main(void)
{
struct addrinfo hints, *result;
hints.ai_flags = 0;
hints.ai_family = AF_UNSPEC;
hints.ai_socktype = SOCK_STREAM;
hints.ai_protocol = IPPROTO_TCP;
hints.ai_addrlen = 0;
hints.ai_canonname = NULL;
hints.ai_addr = NULL;
hints.ai_next = NULL;
if(getaddrinfo("localhost", "25", &hints, &result) != 0)
{
return -1;
}
int newsocket = socket(result->ai_family, result->ai_socktype, result->ai_protocol); // line 22
if(newsocket == -1)
{
freeaddrinfo(result);
return -2;
}
return 0;
}
It appears to be complaining about a read of uninitialized data ("mudflap violation 1 (check/read)"). There are a couple of known regions near the bad address. One, a bit further on ("checked region begins 640B before and ends 625B before"), has already been freed ("mudflap dead object"). The other begins at the same address as the bad read ("checked region begins 0B into and ends 15B into mudflap object 0x70cf90: name=`malloc region'").
Why don't you set -viol-gdb in MUDFLAP_OPTIONS and use GDB to examine the erroneous code?
ETA: The violation occurs because the access history for this region is "check=1r/0w". This indicates that you are reading from it but, as far as libmudflap knows, the region has never been written to. The read thus represents a "use before initialization" error. This is exactly what the -check-initialization flag you supplied to libmudflap is intended to catch.
Of course, the problem is just that your libc is not instrumented by libmudflap, so while libmudflap can intercept the malloc call, it cannot intercept the pointer accesses that are used to initialize the memory. When your program tries to work with the pointer, it thus looks like all its memory has been allocated but never written to (indeed, never accessed at all).
You can ignore this error, drop -check-initialization so it stops being flagged as an error, or build a libc instrumented for libmudflap and link your executable against that version of libc.
I use gdb test core and get this:
Program terminated with signal SIGSEGV, Segmentation fault.
#0 0x0000557ce64b63f8 in _create (str=str@entry=0x557ce80a8820 "SEND")
at system.c:708
708 data->res = command->data->res;
(gdb) bt
#0 0x0000557ce64b63f8 in _create (str=str@entry=0x557ce80a8820 "SEND")
at system.c:708
#1 0x0000557ce64b2ef1 in make_command (s=<optimized out>, cmd=0x557ce809cb70) at command.c:121
#2 0x0000557ce63aefdf in main (argc=<optimized out>, argv=0x7fff19053278) at main.c:394
(gdb) p *command
$1 = {status = 1, data = 0x7f21027e9a80, sum = 1543465568, time = 0, msg = { str = 0x7f20fd19f080 "GOOD", len = 4}, id = 2}
(gdb) p *command->data
$2 = {status = 1, item = 0x7f21027eb780, res = 0x7f2100990b00, sum = 1133793665}
(gdb) p *command->data->res
$3 = {msg = { str = 0x7f21010a5500 "Hi, test, test"..., len = 14}, status = 1}
(gdb) p *data
$4 = {status = 1, type = 5, res = 0x0, id = 2}
As you can see, the pointers command, command->data and data are all valid, so why did this SIGSEGV happen?
why this SIGSEGV happened?
We can't tell.
One possible reason: some other code is actually executing and crashing.
This could happen if system.c has been edited or updated, but the program has not been rebuilt with the new source. Or if the compiler mapping of program counter to file/line is inaccurate (this often happens with optimized code).
If you edit your question to show the output from list _create, disas $pc and info registers, we may be able to tell you more.
I have an issue with a pointer to a structure variable. I just started using GDB to debug the issue. The application stops with a segmentation fault when it reaches the line of code below. ptr_var is a pointer to a structure.
ptr_var->page = 0;
I discovered that ptr_var is set to the invalid address 0x0 after a series of function calls, which causes the segmentation fault when assigning the value "0" to the struct member "page". The series of function calls does not have a reference to ptr_var. The old address that used to be assigned to ptr_var is still in memory. I can still print the values of members from the struct ptr_var using the old address. The GDB session below shows that I am printing a string member of the struct ptr_var using its address
(gdb) x /s *0x7e11c0
0x7e0810: "Sample String"
I couldn't tell when the variable ptr_var gets assigned an invalid address 0x0. I'm a newbie to GDB and an average C programmer. Your assistance in this matter is greatly appreciated. Thank you.
What you want to do is set a watchpoint. GDB will then stop execution every time a member of the struct is modified.
With the following example code
typedef struct {
int val;
} Foo;
int main(void) {
Foo foo;
foo.val = 5;
foo.val = 10;
}
Drop a breakpoint at the creation of the struct and execute watch -l foo.val. Then every time that member is changed you will get a break. The following is my GDB session, with my input
(gdb) break test.c:8
Breakpoint 3 at 0x4006f9: file test.c, line 8.
(gdb) run
Starting program: /usr/home/sean/a.out
Breakpoint 3, main () at test.c:9
9 foo.val = 5;
(gdb) watch -l foo.val
Hardware watchpoint 4: -location foo.val
(gdb) cont
Continuing.
Hardware watchpoint 4: -location foo.val
Old value = 0
New value = 5
main () at test.c:10
(gdb) cont
Continuing.
Hardware watchpoint 4: -location foo.val
Old value = 5
New value = 10
main () at test.c:11
(gdb) cont
If you can rerun the program, break at a point where ptr_var is still correct and set a watchpoint on ptr_var like this: (gdb) watch ptr_var. Now, when you continue, gdb should stop every time ptr_var is modified.
Here's an example. This does contain undefined behaviour, as I'm trying to reproduce a bug, but hopefully it should be good enough to show you what I'm suggesting:
#include <stdio.h>
#include <stdint.h>
int target1;
int target2;
void
bad_func (int **bar)
{
/* Set contents of bar. */
uintptr_t ptr = (uintptr_t) bar;
printf ("Should clear %p\n", (void *) ptr);
ptr += sizeof (int *);
printf ("Will clear %p\n", (void *) ptr);
/* Bad! We just corrupted foo (maybe). */
*((int **) ptr) = NULL;
}
int
main ()
{
int *foo = &target1;
int *bar = &target2;
printf ("&foo = %p\n", (void *) &foo);
printf ("&boo = %p\n", (void *) &bar);
bad_func (&bar);
return *foo;
}
And here's a gdb session:
(gdb) break bad_func
Breakpoint 1 at 0x400542: file watch.c, line 11.
(gdb) r
&foo = 0x7fffffffdb88
&boo = 0x7fffffffdb80
Breakpoint 1, bad_func (bar=0x7fffffffdb80) at watch.c:11
11 uintptr_t ptr = (uintptr_t) bar;
(gdb) up
#1 0x00000000004005d9 in main () at watch.c:27
27 bad_func (&bar);
(gdb) watch foo
Hardware watchpoint 2: foo
(gdb) c
Continuing.
Should clear 0x7fffffffdb80
Will clear 0x7fffffffdb88
Hardware watchpoint 2: foo
Old value = (int *) 0x60103c <target1>
New value = (int *) 0x0
bad_func (bar=0x7fffffffdb80) at watch.c:18
18 }
(gdb)
For some reason the watchpoint appears to trigger on the line after the change was made, even though I compiled this with -O0, which is a bit of a shame. Still, it's usually close enough to help identify the problem.
For this kind of problem I often use the old Electric Fence library. It can be used to find bugs in "software that overruns the boundaries of a malloc() memory allocation". You will find all the instructions and basic usage on this page:
http://elinux.org/Electric_Fence
(At the very end of the page linked above you will find the download link)
Background
I am writing a tool to boot up an embedded ARM system over USB. This particular ARM system has a boot loader which can load a system over USB by emulating a Mass storage device and implementing some vendor SCSI opcodes which allow the host to write information to memory. My tool, which runs on the host to which the embedded ARM system is attached, is to send a zImage or other binary to the device using these vendor commands.
I use the Linux generic SCSI interface to send the commands.
After sending a few commands to write values into the registers that control the RAM controller, my program opens a file, then enters a loop within which it reads 4096 bytes at a time from the file, then sends them to the device.
I do not have any documentation for the SCSI commands that need to be sent. I have determined the protocol to use by capturing and analyzing the USB traffic which is sent by an equivalent windows-only tool that the vendor provides. There are some strange aspects to this protocol, particularly that it accepts addresses and values in little endian format and that 32 bit values within the SCSI commands are not word aligned, however I don't think these have any bearing to the problem at hand.
The Problem
After sending the first 7 buffers, the program segfaults.
The section that segfaults is as follows:
int ak_usbboot_writefile(ak_usbboot_dev* dev, const char *filename, uint32_t addr) {
uint8_t dataBuff[DATABUFF_SIZE];
size_t len;
printf("STOREFILE: FILENAME=%s ADDR=%08x\n", filename, addr);
ak_usbboot_errno = AK_USBBOOT_OK;
FILE *f = fopen(filename, "rb");
if (f==NULL) {
ak_usbboot_errno = errno;
return errno;
}
/* Segfault occurs on the next line */
while ( (len = fread(dataBuff, 1, DATABUFF_SIZE, f)) > 0) {
printf("read len=%zu\n", len);
int r = ak_usbboot_storemem(dev, dataBuff, len, addr);
if (r!=AK_USBBOOT_OK) {
goto EXIT;
}
addr += len;
}
The segfault occurs calling fread. The backtrace looks like this:
#0 __memcpy_sse2 () at ../sysdeps/x86_64/memcpy.S:272
#1 0x00007f92907b9233 in __GI__IO_file_xsgetn (fp=0x1f10030, data=<optimized out>, n=4096) at fileops.c:1427
#2 0x00007f92907ae9d8 in __GI__IO_fread (buf=<optimized out>, size=1, count=4096, fp=0x1f10030) at iofread.c:42
#3 0x0000000000401492 in ak_usbboot_writefile (dev=0x1f10010, filename=0x7fff078b0718 "/home/harmic/git/Lamobo-D1s/tool/burntool/zImage", addr=2174808064) at ak_usbboot.c:217
#4 0x0000000000400c4d in ak_boot (dev_name=0x7fff078b070f "/dev/sg2", file=0x7fff078b0718 "/home/harmic/git/Lamobo-D1s/tool/burntool/zImage") at main.c:86
#5 0x0000000000400d68 in cmd_boot (argc=2, argv=0x7fff078af538) at main.c:114
#6 0x0000000000400dfc in main (argc=4, argv=0x7fff078af528) at main.c:130
I can't see anything wrong with the way the file is being handled, and if I comment out the call to ak_usbboot_storemem then the loop completes with no problems.
ak_usbboot_storemem looks like this:
int ak_usbboot_storemem(ak_usbboot_dev* dev, const void* buffer, uint32_t len, uint32_t addr) {
uint8_t cmdBuff[16] = {
0xf1, 0x3f, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0x68, 0, 0
};
printf("STORE: INBUFF=%p LEN=%08x ADDR=%08x\n", buffer, len, addr);
memcpy(&cmdBuff[5], &addr, 4);
memcpy(&cmdBuff[9], &len, 4);
return _sendCmd(dev, &cmdBuff, sizeof(cmdBuff), (void*)buffer, len, SG_DXFER_TO_DEV);
}
_sendCmd looks like this:
int _sendCmd(ak_usbboot_dev* dev, const void* cmdBuff, int cmdLen, void* dataBuff, int dataLen, int sg_dir) {
fputs("CMD: ", stdout);
const uint8_t* p = (const uint8_t*)cmdBuff;
for (int i=0; i<cmdLen; i++) {
printf("%02x ", *p++);
}
fputs("\n", stdout);
sg_io_hdr_t io_hdr = {
.interface_id = 'S',
.dxfer_direction = sg_dir,
.cmd_len = cmdLen,
.mx_sb_len = sizeof(dev->sense_buffer),
.iovec_count = 0,
.dxfer_len = dataLen,
.dxferp = dataBuff,
.cmdp = (void*)cmdBuff,
.sbp = dev->sense_buffer,
.timeout = 10000,
.flags = 0,
.pack_id = 0,
};
if (ioctl(dev->fd, SG_IO, &io_hdr) < 0) {
ak_usbboot_errno = errno;
return ak_usbboot_errno;
}
if ((io_hdr.info & SG_INFO_OK_MASK) != SG_INFO_OK) {
dev->sb_len = io_hdr.sb_len_wr;
dev->driver_status = io_hdr.driver_status;
dev->masked_status = io_hdr.masked_status;
dev->host_status = io_hdr.host_status;
ak_usbboot_errno = AK_USBBOOT_SCSIERR;
return AK_USBBOOT_SCSIERR;
} else {
dev->err = AK_USBBOOT_OK;
return AK_USBBOOT_OK;
}
}
I am guessing something I am doing with the SCSI Generic IOCTL is causing this, but I have not been able to spot anything so far.
Any insights welcomed!
The comment from @Andrew Medico put me on the right track. I should have thought of using valgrind earlier.
Valgrind reported multiple errors like this:
==28114== Invalid write of size 4
==28114== at 0x400FF5: _sendCmd (ak_usbboot.c:73)
==28114== by 0x4010D7: ak_usbboot_open (ak_usbboot.c:104)
==28114== by 0x400B7E: ak_boot (main.c:70)
==28114== by 0x400D67: cmd_boot (main.c:114)
==28114== by 0x400DFB: main (main.c:130)
==28114== Address 0x51f3074 is not stack'd, malloc'd or (recently) free'd
When running under valgrind, the program completed normally, booting the device as it should!
ak_usbboot.c:73 is this line:
dev->err = AK_USBBOOT_OK;
That led me to look more closely at where dev was being allocated:
ak_usbboot_dev* dev = malloc(sizeof(dev));
Oops. I was allocating enough space for a pointer to a struct, rather than for the struct itself. As a result, writing to the struct was corrupting the heap.
Of course it should have been:
ak_usbboot_dev* dev = malloc(sizeof(*dev));
This answer is probably not much use to anyone else, other than as a tip as to how to track down such problems - valgrind is a godsend.
Ok, this is really freaking me out. I have the following function that just reads input and returns a string
unsigned char* readFromIn() {
unsigned char* text = malloc(1024);
if (fgets(text, 1024, stdin) != NULL) { /* <-- this is what's causing the segmentation fault */
int textLen = strlen(text);
if (textLen > 0 && text[textLen - 1] == '\n')
text[textLen - 1] = '\0'; // getting rid of newline character
return text;
}
else {
free(text);
return NULL;
}
}
The thing is, this function isn't called anywhere and just to confirm, I changed the name of the function to something crazy like 9rawiohawr90awrhiokawrioawr and put printf statement on the top of the function.
I'm genuinely not sure why an uncalled function might cause a segmentation fault error.
I'm using gcc 4.6.3 on ubuntu.
Edit: I know that the line
if (fgets(text, 1024, stdin) != NULL) {
is the offending code because as soon as I comment out that conditional, no segmentation fault occurs.
I know that the function is NOT being called because i'm seeing no output of the printf debug statement I put.
Edit2: I've tried changing the type from unsigned char to char. Still segmentation error. I will try to get gdb output.
Edit3: gdb backtrace produced the following
#0 0xb7fa5ac2 in _IO_2_1_stdin_ () from /lib/i386-linux-gnu/libc.so.6
#1 0xb7faf2fb in libwebsocket_create_context (info=0xbffff280) at libwebsockets.c:2125
#2 0x0804a5bb in main()
doing frame 0,1,2 doesn't output anything interesting in particular.
Edit4: I've tried all of the suggestions in the comment, but to no avail, I still get the same segmentation fault.
So I installed a fresh copy of Ubuntu on a virtual OS and recompiled my code. Still the same issue occurs.
It seems to me the problem is in either some obscurity going on in my code or the library itself. I've created a minimal example demonstrating the problem:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <libwebsockets.h>
unsigned char* readFromIn() {
unsigned char* text = malloc(1024);
if (fgets(text, 1024, stdin) != NULL) { /* <-- SEGMENTATION FAULT HERE */
int textLen = strlen(text);
if (textLen > 0 && text[textLen - 1] == '\n')
text[textLen - 1] = '\0';
return text;
}
else {
free(text);
return NULL;
}
}
int callback_http(struct libwebsocket_context *context,
struct libwebsocket *wsi,
enum libwebsocket_callback_reasons reason, void *user,
void *in, size_t len)
{
return 0;
}
static struct libwebsocket_protocols protocols[] = {
/* first protocol must always be HTTP handler */
{
"http-only", // name
callback_http, // callback
0 // per_session_data_size
}
};
int main(void) {
printf("Initializing Web Server\n");
// server url will be http://localhost:8081
int port = 8081;
const char *interface = NULL;
struct libwebsocket_context *context;
// we're not using ssl
const char *cert_path = NULL;
const char *key_path = NULL;
// no special options
int opts = 0;
struct lws_context_creation_info info;
memset(&info, 0, sizeof info);
info.port = port;
info.iface = interface;
info.protocols = protocols;
info.extensions = libwebsocket_get_internal_extensions();
info.ssl_cert_filepath = NULL;
info.ssl_private_key_filepath = NULL;
info.gid = -1;
info.uid = -1;
info.options = opts;
context = libwebsocket_create_context(&info);
if (context == NULL) {
fprintf(stderr, "libwebsocket init failed\n");
return 0;
}
printf("starting server...\n");
while (1) {
libwebsocket_service(context, 50);
}
printf("Shutting server down...\n");
libwebsocket_context_destroy(context);
return 0;
}
And here's how I compiled my code
gcc -g testbug.c -o test -lwebsockets
Here's the library I'm using
http://git.libwebsockets.org/cgi-bin/cgit/libwebsockets/tag/?id=v1.23-chrome32-firefox24
You will see that I'm not calling the function readFromIn() yet, segmentation fault occurs as soon as you try to run the executable.
I've rerun gdb, and this time the backtrace and the frames tell me a little more.
(gdb) run
Starting program: /home/l46kok/Desktop/websocketserver/test
Initializing Web Server
[1384002761:2270] NOTICE: Initial logging level 7
[1384002761:2270] NOTICE: Library version: 1.3 unknown-build-hash
[1384002761:2271] NOTICE: Started with daemon pid 0
[1384002761:2271] NOTICE: static allocation: 4448 + (12 x 1024 fds) = 16736 bytes
[1384002761:2271] NOTICE: canonical_hostname = ubuntu
[1384002761:2271] NOTICE: Compiled with OpenSSL support
[1384002761:2271] NOTICE: Using non-SSL mode
[1384002761:2271] NOTICE: per-conn mem: 124 + 1360 headers + protocol rx buf
[1384002761:2294] NOTICE: Listening on port 8081
Program received signal SIGSEGV, Segmentation fault.
0xb7fb1ac0 in _IO_2_1_stdin_ () from /lib/i386-linux-gnu/libc.so.6
(gdb) backtrace
#0 0xb7fb1ac0 in _IO_2_1_stdin_ () from /lib/i386-linux-gnu/libc.so.6
#1 0xb7fcc2c6 in libwebsocket_create_context () from /usr/local/lib/libwebsockets.so.4.0.0
#2 0x080488c4 in main () at testbug.c:483
(gdb) frame 1
#1 0xb7fcc2c6 in libwebsocket_create_context () from /usr/local/lib/libwebsockets.so.4.0.0
(gdb) frame 2
#2 0x080488c4 in main () at testbug.c:483
483 context = libwebsocket_create_context(&info);
So yeah.. I think I gave all the information at hand.. but I'm genuinely not sure what the issue is. The program causes segmentation fault at line 483 but the issue is gone when I comment out the offending function that's not being called.
Probably you're missing something when initializing libwebsockets.
Indeed, recompiling libwebsockets with debug reveals that:
GNU gdb (GDB) 7.6.1 (Debian 7.6.1-1)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/vili/x...done.
(gdb) r
Starting program: /home/vili/./x
warning: Could not load shared library symbols for linux-vdso.so.1.
Do you need "set solib-search-path" or "set sysroot"?
Initializing Web Server
[1384020141:5692] NOTICE: Initial logging level 7
[1384020141:5692] NOTICE: Library version: 1.2
[1384020141:5693] NOTICE: Started with daemon pid 0
[1384020141:5693] NOTICE: static allocation: 5512 + (16 x 1024 fds) = 21896 bytes
[1384020141:5693] NOTICE: canonical_hostname = x220
[1384020141:5693] NOTICE: Compiled with OpenSSL support
[1384020141:5693] NOTICE: Using non-SSL mode
[1384020141:5693] NOTICE: per-conn mem: 248 + 1328 headers + protocol rx buf
[1384020141:5713] NOTICE: Listening on port 8081
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7bc2080 in _IO_2_1_stderr_ () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007ffff7bc2080 in _IO_2_1_stderr_ () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007ffff7bcd83c in libwebsocket_create_context (info=0x7fffffffe580)
at libwebsockets.c:2093
#2 0x0000000000400918 in main () at x.c:66
(gdb) up
#1 0x00007ffff7bcd83c in libwebsocket_create_context (info=0x7fffffffe580)
at libwebsockets.c:2093
2093 info->protocols[context->count_protocols].callback(context,
(gdb) p context->count_protocols
$1 = 1
(gdb) p info->protocols[1]
$2 = {
name = 0x7ffff7bc2240 <_IO_2_1_stdin_> "\210 \255", <incomplete sequence \373>, callback = 0x7ffff7bc2080 <_IO_2_1_stderr_>,
per_session_data_size = 140737349689696, rx_buffer_size = 0,
owning_server = 0x602010, protocol_index = 1}
(gdb)
Quite likely you need to close the array of libwebsocket_protocols with a special entry (NULL) so that the lib will know how many entries it got via info->protocols.
Edit: yep, check the docs: http://jsk.pp.ua/knowledge/libwebsocket.html
Array of structures listing supported protocols and a protocol-specific callback for each one. The list is ended with an entry that has a NULL callback pointer.
I am writing code to use a library called SCIP (solves optimisation problems). The library itself can be compiled in two ways: create a set of .a files, then the binary, OR create a set of shared objects. In both cases, SCIP is compiled with its own, rather large, Makefile.
I have two implementations, one which compiles with the .a files (I'll call this program 1), the other links with the shared objects (I'll call this program 2). Program 1 is compiled using a SCIP-provided makefile, whereas program 2 is compiled using my own, much simpler makefile.
The behaviour I'm encountering occurs in the SCIP code, not in code that I wrote. The code extract is as follows:
void* BMSallocMemory_call(size_t size)
{
void* ptr;
size = MAX(size, 1);
ptr = malloc(size);
// This is where I call gdb print statements.
if( ptr == NULL )
{
printf("ERROR - unable to allocate memory for a SCIP*.\n");
}
return ptr;
}
void SCIPcreate(SCIP** A)
{
*A = (SCIP*)BMSallocMemory_call(sizeof(**(A)));
.
.
.
}
If I debug this code in gdb, and step through BMSallocMemory_call() in order to see what's happening, and view the contents of *((SCIP*)(ptr)), I get the following output:
Program 1 gdb output:
289 size = MAX(size, 1);
(gdb) step
284 {
(gdb)
289 size = MAX(size, 1);
(gdb)
290 ptr = malloc(size);
(gdb) print ptr
$1 = <value optimised out>
(gdb) step
292 if( ptr == NULL )
(gdb) print ptr
$2 = <value optimised out>
(gdb) step
290 ptr = malloc(size);
(gdb) print ptr
$3 = (void *) 0x8338448
(gdb) print *((SCIP*)(ptr))
$4 = {mem = 0x0, set = 0x0, interrupt = 0x0, dialoghdlr = 0x0, totaltime = 0x0, stat = 0x0, origprob = 0x0, eventfilter = 0x0, eventqueue = 0x0, branchcand = 0x0, lp = 0x0, nlp = 0x0, relaxation = 0x0, primal = 0x0, tree = 0x0, conflict = 0x0, cliquetable = 0x0, transprob = 0x0, pricestore = 0x0, sepastore = 0x0, cutpool = 0x0}
Program 2 gdb output:
289 size = MAX(size, 1);
(gdb) step
290 ptr = malloc(size);
(gdb) print ptr
$1 = (void *) 0xb7fe450c
(gdb) print *((SCIP*)(ptr))
$2 = {mem = 0x1, set = 0x8232360, interrupt = 0x1, dialoghdlr = 0xb7faa6f8, totaltime = 0x0, stat = 0xb7fe45a0, origprob = 0xb7fe4480, eventfilter = 0xfffffffd, eventqueue = 0x1, branchcand = 0x826e6a0, lp = 0x8229c20, nlp = 0xb7fdde80, relaxation = 0x822a0d0, primal = 0xb7f77d20, tree = 0xb7fd0f20, conflict = 0xfffffffd, cliquetable = 0x1, transprob = 0x8232360, pricestore = 0x1, sepastore = 0x822e0b8, cutpool = 0x0}
The only reason I can think of is that in either program 1's or SCIP's makefile, there is some sort of option that forces malloc to initialise memory it allocates. I simply must learn why the structure is initialised in the compiled implementation, and is not in the shared object implementation.
I doubt the difference has to do with how the two programs are built.
malloc does not initialize the memory it allocates. It may so happen by chance that the memory you get back is filled with zeroes. For example, a program that's just started is more likely to get zero-filled memory from malloc than a program that's been running for a while and allocating/deallocating memory.
Edit: You may find the following past questions of interest:
malloc zeroing out memory?
Create a wrapper function for malloc and free in C
When and why will an OS initialise memory to 0xCD, 0xDD, etc. on malloc/free/new/delete?
Initialization of malloc-ed memory may be implementation dependent. Implementations are free not to do so for performance reasons, but they could initialize the memory for example in debug mode.
One more note. Even uninitialized memory may contain zeros.
On Linux, according to this thread, memory will be zero-filled when first handed to the application. Thus, if your call to malloc() caused the program's heap to grow, the "new" memory will be zero-filled.
One way to verify is of course to just step into malloc() from your routine, that should make it pretty clear whether or not it contains code to initialize the memory, directly.