I usually love well-explained questions and answers, but in this case I really can't give any more clues.
The question is: why is malloc() giving me SIGSEGV? The debug session below shows that the program never gets the chance to test the returned pointer against NULL and exit. The program dies INSIDE malloc!
I'm assuming the malloc in glibc is just fine. I have a Debian/Linux wheezy system, updated, on an old Pentium (i386/i486 arch).
To be able to track it down, I generated a core dump. Let's follow it:
iguana$gdb xadreco core-20131207-150611.dump
Core was generated by `./xadreco'.
Program terminated with signal 11, Segmentation fault.
#0 0xb767fef5 in ?? () from /lib/i386-linux-gnu/libc.so.6
(gdb) bt
#0 0xb767fef5 in ?? () from /lib/i386-linux-gnu/libc.so.6
#1 0xb76824bc in malloc () from /lib/i386-linux-gnu/libc.so.6
#2 0x080529c3 in enche_pmovi (cabeca=0xbfd40de0, pmovi=0x...) at xadreco.c:4519
#3 0x0804b93a in geramov (tabu=..., nmovi=0xbfd411f8) at xadreco.c:1473
#4 0x0804e7b7 in minimax (atual=..., deep=1, alfa=-105000, bet...) at xadreco.c:2778
#5 0x0804e9fa in minimax (atual=..., deep=0, alfa=-105000, bet...) at xadreco.c:2827
#6 0x0804de62 in compjoga (tabu=0xbfd41924) at xadreco.c:2508
#7 0x080490b5 in main (argc=1, argv=0xbfd41b24) at xadreco.c:604
(gdb) frame 2
#2 0x080529c3 in enche_pmovi (cabeca=0xbfd40de0, pmovi=0x ...) at xadreco.c:4519
4519 movimento *paux = (movimento *) malloc (sizeof (movimento));
(gdb) l
4516
4517 void enche_pmovi (movimento **cabeca, movimento **pmovi, int c0, int c1, int c2, int c3, int p, int r, int e, int f, int *nmovi)
4518 {
4519 movimento *paux = (movimento *) malloc (sizeof (movimento));
4520 if (paux == NULL)
4521 exit(1);
Of course I need to look at frame 2, the last one on the stack related to my own code. But line 4519 itself gives the SIGSEGV! The program never gets to test, on line 4520, whether paux == NULL or not.
Here is "movimento" (abbreviated):
typedef struct smovimento
{
int lance[4]; //move in integer notation
int roque; // etc. ...
struct smovimento *prox;// pointer to next
} movimento;
This program can use a LOT of memory, and I know memory is at its limit. But I thought malloc would behave more gracefully when memory is not available.
Running free -h during execution, I can see free memory drop as low as 1 MB! That's expected: the old computer only has 96 MB, and 50 MB is used by the OS.
I don't know where to start looking. Maybe check available memory BEFORE each malloc call? But that sounds like a waste of computer power, since malloc is supposed to do that itself. sizeof (movimento) is about 48 bytes. If I test before, at least I'll have some confirmation of the bug.
Any ideas, please share. Thanks.
Any crash inside malloc (or free) is an almost sure sign of heap corruption, which can come in many forms:
overflowing or underflowing a heap buffer
freeing something twice
freeing a non-heap pointer
writing to a freed block
etc.
These bugs are very hard to catch without tool support, because the crash often comes many thousands of instructions later, possibly after many more calls to malloc or free, in code that is in a completely different part of the program and very far from where the bug actually is.
The good news is that tools like Valgrind or AddressSanitizer usually point you straight at the problem.
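For illustration, here is a minimal sketch (not taken from the question) of the bug patterns listed above. Every one of them corrupts the allocator's bookkeeping or touches memory it doesn't own, and both Valgrind and AddressSanitizer (gcc/clang -fsanitize=address) will typically report them at the point of the bad access rather than at the much later crash:
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *p = malloc(16);
    char stack_buf[8];

    if (!p)
        return 1;

    memset(p, 'x', 32);   /* heap buffer overflow: writes past the 16 bytes owned */

    free(p);
    p[0] = 'y';           /* write to a freed block */
    free(p);              /* double free */

    free(stack_buf);      /* freeing a non-heap pointer */
    return 0;
}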
Why does the code below work without any crash at runtime?
Also, the behaviour is completely dependent on machine/platform/compiler! I can even go up to index 200 on a 64-bit machine. How would a segmentation fault in the main function get detected by the OS?
int main(int argc, char* argv[])
{
int arr[3];
arr[4] = 99;
}
Where does this buffer space come from? Is this the stack allocated to the process?
Something I wrote some time ago for educational purposes...
Consider the following C program:
int q[200];
int main(void) {
    int i;
    for (i = 0; i < 2000; i++) {
        q[i] = i;
    }
}
After compiling and executing it, a core dump is produced:
$ gcc -ggdb3 segfault.c
$ ulimit -c unlimited
$ ./a.out
Segmentation fault (core dumped)
Now use gdb to perform a post-mortem analysis:
$ gdb -q ./a.out core
Program terminated with signal 11, Segmentation fault.
[New process 7221]
#0 0x080483b4 in main () at s.c:8
8 q[i]=i;
(gdb) p i
$1 = 1008
(gdb)
Huh, the program didn't segfault when we wrote outside the 200 items allocated; instead it crashed when i=1008. Why?
Enter pages.
One can determine the page size in several ways on UNIX/Linux; one way is to use the library function sysconf() like this:
#include <stdio.h>
#include <unistd.h> // sysconf(3)
int main(void) {
printf("The page size for this system is %ld bytes.\n",
sysconf(_SC_PAGESIZE));
return 0;
}
which gives the output:
The page size for this system is 4096 bytes.
Or one can use the command-line utility getconf like this:
$ getconf PAGESIZE
4096
post mortem
It turns out that the segfault occurs not at i=200 but at i=1008; let's figure out why. Start gdb to do some post-mortem analysis:
$gdb -q ./a.out core
Core was generated by `./a.out'.
Program terminated with signal 11, Segmentation fault.
[New process 4605]
#0 0x080483b4 in main () at seg.c:6
6 q[i]=i;
(gdb) p i
$1 = 1008
(gdb) p &q
$2 = (int (*)[200]) 0x804a040
(gdb) p &q[199]
$3 = (int *) 0x804a35c
q ends at address 0x804a35c - or rather, the last element q[199] starts at that location and its final byte is at 0x804a35f. The page size is, as we saw earlier, 4096 bytes, and the 32-bit address size of the machine means a virtual address breaks down into a 20-bit page number and a 12-bit offset.
q[] ended in virtual page number:
0x804a = 32842
at offset:
0x35c = 860
The last element occupies bytes 860 through 863, so 864 bytes of that page were in use, which means there were still:
4096 - 864 = 3232
bytes left on the page on which q[] was allocated. That space can hold:
3232 / 4 = 808
integers, and the code treated it as if it contained elements of q at positions 200 up to 1007.
We all know that those elements don't exist, and the compiler didn't complain; neither did the hardware, since we have write permission to that page. In other words, each integer being 4 bytes, the page holds 808 (3232/4) additional "fake" elements, so it is still perfectly legal, as far as the hardware is concerned, to access q[200], q[201], all the way up to element 199+808=1007 (q[1007]) without triggering a segfault. Only when i=1008 did q[i] refer to an address on a different page, for which we don't have write permission; the virtual memory hardware detected this and triggered the segfault.
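As a quick sanity check on that arithmetic, here is a small standalone snippet (not part of the original write-up) that splits the address from the gdb session into the 20-bit page number and 12-bit offset, assuming 4096-byte pages:
#include <stdio.h>

int main(void) {
    unsigned long addr = 0x804a35cUL;    /* &q[199] from the gdb session above */
    unsigned long page = addr >> 12;     /* 20-bit virtual page number: 0x804a  */
    unsigned long off  = addr & 0xfffUL; /* 12-bit offset within the page: 0x35c */

    printf("page = 0x%lx (%lu), offset = 0x%lx (%lu)\n", page, page, off, off);
    printf("bytes left on the page after q[199]: %lu\n", 4096 - (off + 4));
    return 0;
}
It prints page 0x804a (32842), offset 0x35c (860) and 3232 bytes left, matching the numbers above.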
Since you're writing outside the boundaries of your array, the behaviour of your code is undefined.
It is the nature of undefined behaviour that anything can happen, including lack of segfaults (the compiler is under no obligation to perform bounds checking).
You're writing to memory you haven't allocated but that happens to be there and that -- probably -- is not being used for anything else. Your code might behave differently if you make changes to seemingly unrelated parts of the code, to your OS, compiler, optimization flags etc.
In other words, once you're in that territory, all bets are off.
Exactly when and where a local-variable buffer overflow crashes depends on a few factors:
The amount of data already on the stack at the time the function containing the overflowed variable is called
The amount of data written into the overflowing variable/array in total
Remember that stacks grow downwards. I.e. process execution starts with a stack pointer close to the end of the memory to be used as stack. It doesn't start at the last mapped word, though, because the system's initialization code may decide to pass some sort of "startup info" to the process at creation time, and often does so on the stack.
If the total amount of data written into a buffer on the stack is larger than the total amount of stack space used previously (by callers / initialization code / other variables), then you'll get a crash at whatever memory access first runs beyond the top (beginning) of the stack. The crashing address will be just past a page boundary - SIGSEGV due to accessing memory beyond the top of the stack, where nothing is mapped.
If that total is less than the size of the used part of the stack at that point, then it'll work just fine for now and crash later - in fact, on platforms that store return addresses on the stack (which is true for x86/x64), when returning from your function. That's because the CPU instruction ret actually takes a word from the stack (the return address) and redirects execution there. If, instead of the expected code location, this address contains garbage, an exception occurs and your program dies. That is the usual failure mode - a crash when returning from the function that contained the overflowing code.
To illustrate this: When main() is called, the stack looks like this (on a 32bit x86 UNIX program):
[ esp ] <return addr to caller> (which exits/terminates process)
[ esp + 4 ] argc
[ esp + 8 ] argv
[ esp + 12 ] envp <third arg to main() on UNIX - environment variables>
[ ... ]
[ ... ] <other things - like actual strings in argv[], envp[]>
[ END ] PAGE_SIZE-aligned stack top - unmapped beyond
When main() starts, it will allocate space on the stack for various purposes, amongst others to host your to-be-overflowed array. This will make it look like:
[ esp ] <current bottom end of stack>
[ ... ] <possibly local vars of main()>
[ esp + X ] arr[0]
[ esp + X + 4 ] arr[1]
[ esp + X + 8 ] arr[2]
[ esp + X + 12 ] <possibly other local vars of main()>
[ ... ] <possibly other things (saved regs)>
[ old esp ] <return addr to caller> (which exits/terminates process)
[ old esp + 4 ] argc
[ old esp + 8 ] argv
[ old esp + 12 ] envp <third arg to main() on UNIX - environment variables>
[ ... ]
[ ... ] <other things - like actual strings in argv[], envp[]>
[ END ] PAGE_SIZE-aligned stack top - unmapped beyond
This means you can happily access way beyond arr[2].
For a taster of different crashes resulting from buffer overflows, attempt this one:
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char **argv)
{
int i, arr[3];
for (i = 0; i < atoi(argv[1]); i++)
arr[i] = i;
do {
printf("argv[%d] = %s\n", argc, argv[argc]);
} while (--argc);
return 0;
}
and see how different the crash will be when you overflow the buffer by a little (say, 10 elements), compared to when you overflow it beyond the end of the stack. Try it with different optimization levels and different compilers. It's quite illustrative, as it shows both misbehaviour (it won't always print all of argv[] correctly) as well as crashes in various places, maybe even endless loops (if, e.g., the compiler places i or argc on the stack and the code overwrites it during the loop).
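For example, assuming the program above is saved as overflow.c and compiled with gcc (the file name and exact behaviour are illustrative only - results will differ per machine, compiler and optimization level), a session might look like:
gcc -O0 -o overflow overflow.c
./overflow 10        # small overrun: may print garbled argv[] entries or crash on return from main
./overflow 100000    # huge overrun: likely crashes as soon as a write runs past the top of the stack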
By using an array type, which C++ has inherited from C, you are implicitly asking not to have a range check.
If you try this instead
#include <vector>

int main(int argc, char* argv[])
{
    std::vector<int> arr(3);
    arr.at(4) = 99;
}
you will get an exception thrown.
So C++ offers both a checked and an unchecked interface. It is up to you to select the one you want to use.
That's undefined behavior - you simply don't observe any problems. The most likely reason is that you overwrite an area of memory whose contents the program's behavior doesn't depend on - that memory is technically writable (the stack is about 1 megabyte in most cases), so you see no error indication. You shouldn't rely on this.
To answer your question of why it goes "undetected": most C compilers do not analyse at compile time what you are doing with pointers and memory, so nobody notices at compile time that you've written something dangerous. At runtime, there is also no controlled, managed environment that babysits your memory references, so nobody stops you from touching memory that you aren't entitled to. The memory happens to be allocated to your process at that point (because it's just part of the stack, not far from your function), so the OS doesn't have a problem with it either.
If you want hand-holding while you access your memory, you need a managed environment like Java or CLI, where your entire program is run by another, managing program that looks out for those transgressions.
Your code has Undefined Behavior. That means it can do anything or nothing. Depending on your compiler and OS etc., it could crash.
That said, with many if not most compilers your code will not even compile.
That's because you have void main, while both the C standard and the C++ standard requires int main.
About the only compiler that's happy with void main is Microsoft's Visual C++.
That's a compiler defect, but since Microsoft has lots of example documentation and even code generation tools that generate void main, they will likely never fix it. However, consider that writing Microsoft-specific void main is one character more to type than standard int main. So why not go with the standards?
Cheers & hth.,
A segmentation fault occurs when a process tries to access a page of memory in a way it isn't allowed to (for example, writing to a page it doesn't own). Unless you run a long way past the end of your buffer, you aren't going to trigger a segfault.
The stack is located somewhere in one of the blocks of memory owned by your application. In this instance you have just been lucky: you have probably only overwritten some unused memory. If you were a bit more unlucky, you might have overwritten the stack frame of another function on the call stack.
So apparently, when you ask the computer to allocate a certain number of bytes in memory, say:
char array[10]
there are often a few extra writable bytes around it, so a small overrun doesn't immediately hit a segfault. However, it's still not safe to use those bytes, and reaching further into memory will eventually cause the program to crash.
I wrote some code to see how malloc() and memset() behave, and I found a case where I don't know what's going on.
I used malloc() to allocate 15 bytes of memory for a character array, and I
wanted to see what would happen if I used memset() incorrectly to set 100 bytes of memory in the pointer I created. I expected to see that memset() had set 15 bytes (and possibly trashed some other memory). What I'm seeing when I run the program is that it's setting 26 bytes of memory to the character that I coded.
Any idea why there are 26 bytes allocated for the pointer I created? I'm compiling with gcc and glibc. Here's the code:
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <stdio.h>
#define ARRLEN 14
int main(void) {
/* + 1 for the null terminator */
char *charptr = malloc((sizeof(*charptr) * ARRLEN) + 1);
if (!charptr)
exit(EXIT_FAILURE);
memset(charptr, '\0', (sizeof(*charptr) * ARRLEN) + 1);
/* here's the intentionally incorrect call to memset() */
memset(charptr, 'a', 100);
printf("sizeof(char) ------ %ld\n", sizeof(char));
printf("sizeof(charptr) --- %ld\n", sizeof(charptr));
printf("sizeof(*charptr) --- %ld\n", sizeof(*charptr));
printf("sizeof(&charptr) --- %ld\n", sizeof(&charptr));
printf("strlen(charptr) --- %ld\n", strlen(charptr));
printf("charptr string ---- >>%s<<\n", charptr);
free(charptr);
return 0;
}
This is the output I get:
sizeof(char) ------ 1
sizeof(charptr) --- 8
sizeof(*charptr) --- 1
sizeof(&charptr) --- 8
strlen(charptr) --- 26
charptr string ---- >>aaaaaaaaaaaaaaaaaaaaaaaa<<
First of all, this is undefined behavior, so anything can happen. As said in a comment, on my machine I get your exact same behavior with optimizations disabled, but with optimizations turned on I get a warning about a potential buffer overflow at compile time (impressive job, gcc!) and a big crash at runtime. Even better, if I print the string with a puts before the printf calls, I get a different number of a's.
Still, I have the dubious luck of seeing the exact same behavior as you, so let's investigate. I compiled your program with no optimizations and with debug information:
[matteo#teokubuntu ~/scratch]$ gcc -g memset_test.c
then I fired up the debugger and added a breakpoint on the first printf, just after the memset.
Reading symbols from a.out...done.
(gdb) break 20
Breakpoint 1 at 0x87e: file memset_test.c, line 20.
(gdb) r
Starting program: /home/matteo/scratch/a.out
Breakpoint 1, main () at memset_test.c:20
20 printf("sizeof(char) ------ %ld\n", sizeof(char));
Now we can set a hardware write watchpoint on charptr[26]:
(gdb) p charptr
$1 = 0x555555756260 'a' <repeats 100 times>
(gdb) watch charptr[26]
Hardware watchpoint 2: charptr[26]
... and so...
(gdb) c
Continuing.
Hardware watchpoint 2: charptr[26]
Old value = 97 'a'
New value = 0 '\000'
_int_malloc (av=av#entry=0x7ffff7dcfc40 <main_arena>, bytes=bytes#entry=1024) at malloc.c:4100
4100 malloc.c: File o directory non esistente.
(gdb) bt
#0 _int_malloc (av=av#entry=0x7ffff7dcfc40 <main_arena>, bytes=bytes#entry=1024) at malloc.c:4100
#1 0x00007ffff7a7b0fc in __GI___libc_malloc (bytes=1024) at malloc.c:3057
#2 0x00007ffff7a6218c in __GI__IO_file_doallocate (fp=0x7ffff7dd0760 <_IO_2_1_stdout_>) at filedoalloc.c:101
#3 0x00007ffff7a72379 in __GI__IO_doallocbuf (fp=fp#entry=0x7ffff7dd0760 <_IO_2_1_stdout_>) at genops.c:365
#4 0x00007ffff7a71498 in _IO_new_file_overflow (f=0x7ffff7dd0760 <_IO_2_1_stdout_>, ch=-1) at fileops.c:759
#5 0x00007ffff7a6f9ed in _IO_new_file_xsputn (f=0x7ffff7dd0760 <_IO_2_1_stdout_>, data=<optimized out>, n=23)
at fileops.c:1266
#6 0x00007ffff7a3f534 in _IO_vfprintf_internal (s=0x7ffff7dd0760 <_IO_2_1_stdout_>,
format=0x5555555549c8 "sizeof(char) ------ %ld\n", ap=ap#entry=0x7fffffffe330) at vfprintf.c:1328
#7 0x00007ffff7a48f26 in __printf (format=<optimized out>) at printf.c:33
#8 0x0000555555554894 in main () at memset_test.c:20
(gdb)
So, it's just malloc's own code, invoked (more or less indirectly) by printf, doing its bookkeeping on the memory block immediately adjacent to the one it gave you (possibly marking it as used).
Long story short: you took memory that wasn't yours, and it is now getting modified by its rightful owner the first time it actually needs it; nothing particularly strange or interesting.
I used malloc() to allocate 15 bytes of memory for a character array,
and I wanted to see what would happen if I used memset() incorrectly
to set 100 bytes of memory in the pointer I created.
What happens is undefined behavior as far as the language standard is concerned, and whatever actually does happen in a particular case is not predictable from the source code and may not be consistent across different C implementations or even different runs of the same program.
I expected to see
that memset() had set 15 bytes (and possibly trashed some other memory).
That would be one plausible result, but having any particular expectation at all is dangerous. Do not assume that you can predict what manifestation UB will take, not even based on past experience. And since you should not do that, and cannot learn anything useful that way, it is not worthwhile to experiment with UB.
What I'm seeing when I run the program is that it's setting 26 bytes
of memory to the character that I coded.
Any idea why there are 26 bytes allocated for the pointer I created?
Who says your experiment demonstrates that to be the case? Not only the memset() but also the last printf() exhibits UB. The output tells you nothing other than what the output was. That time.
Now in general, malloc may reserve a larger block than you request. Many implementations internally manage memory in chunks larger than one byte, such as 16 or 32 bytes. But that has nothing to do one way or another with the definedness of your program's behavior, and it does not speak to your output.
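As an aside - and only as an illustration, not something the standard guarantees - glibc exposes the actual reserved size of a block through the non-portable malloc_usable_size() extension. A minimal sketch:
#include <malloc.h>   /* malloc_usable_size() - glibc extension */
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    char *p = malloc(15);
    if (!p)
        return 1;
    /* Typically prints a value somewhat larger than 15, showing that the chunk
       is rounded up - but none of those extra bytes are yours to use as far as
       the C standard is concerned. */
    printf("requested 15, usable %zu\n", malloc_usable_size(p));
    free(p);
    return 0;
}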
I have a code base where a process running on Linux is crashing in free and sometimes in malloc:
#0 0xffffe430 in __kernel_vsyscall ()
#1 0xf7424e30 in raise () from /lib/libc.so.6
#2 0xf7426765 in abort () from /lib/libc.so.6
#3 0xf7469d33 in malloc_printerr () from /lib/libc.so.6
#4 0xf746e7bc in free () from /lib/libc.so.6
#5 0xf6047e25 in myFree (mem_ptr=0x82165d8) at ../my_code/mylib.c:78
#6 0xf6014a10 in FreeBuffer (buffer=0x82042f8)
In the code I am not seeing anything fishy at the place where memory freeing is happening.
Function myFree() has nothing but a free() function call.
void FreeBuffer(struct MY_BUFFER *buffer)
{
if (buffer)
{
myFree(buffer);
}
}
void myFree(void *mem_ptr)
{
free(mem_ptr);
}
I tried using MALLOC_CHECK_ but it was not of any help.
I suspect the heap is getting corrupted somewhere, and I want to find where.
Any hints on how to debug the process in such cases?
It is hard to be 100% sure, but most probably you are trying to free memory twice. It's great that you are checking that the pointer is not NULL before trying to free it
if (buffer)
{
myFree(buffer);
}
but you are probably NOT setting the pointer to NULL after actually freeing it - that's my guess.
Check whether you have something like this:
struct MY_BUFFER *buffer;
//do something with buffer
FreeBuffer(buffer);
buffer = NULL;
You can also do this inside the FreeBuffer() function, but to do so you will have to pass the address of the pointer (a pointer to a pointer), so the definition will look like FreeBuffer(struct MY_BUFFER **buffer), as sketched below.
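A minimal sketch of that variant (reusing the MY_BUFFER and myFree names from the question; treat it as an illustration, not a drop-in patch):
void FreeBuffer(struct MY_BUFFER **buffer)
{
    if (buffer && *buffer)
    {
        myFree(*buffer);
        *buffer = NULL;   /* the caller's pointer is cleared, so a second FreeBuffer(&buffer) is harmless */
    }
}

/* call site changes from FreeBuffer(buffer) to: */
FreeBuffer(&buffer);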
I'm trying to understand what's going wrong with a program running on HP-UX 11.11 that results in a SIGSEGV (signal 11, segmentation fault):
(gdb) bt
#0 0x737390e8 in _sigfillset+0x618 () from /usr/lib/libc.2
#1 0x73736a8c in _sscanf+0x55c () from /usr/lib/libc.2
#2 0x7373c23c in malloc+0x18c () from /usr/lib/libc.2
#3 0x7379e3f8 in _findbuf+0x138 () from /usr/lib/libc.2
#4 0x7379c9f4 in _filbuf+0x34 () from /usr/lib/libc.2
#5 0x7379c604 in __fgets_unlocked+0x84 () from /usr/lib/libc.2
#6 0x7379c7fc in fgets+0xbc () from /usr/lib/libc.2
#7 0x7378ecec in __nsw_getoneconfig+0xf4 () from /usr/lib/libc.2
#8 0x7378f8b8 in __nsw_getconfig+0x150 () from /usr/lib/libc.2
#9 0x737903a8 in __thread_cond_init_default+0x100 () from /usr/lib/libc.2
#10 0x737909a0 in nss_search+0x80 () from /usr/lib/libc.2
#11 0x736e7320 in __gethostbyname_r+0x140 () from /usr/lib/libc.2
#12 0x736e74bc in gethostbyname+0x94 () from /usr/lib/libc.2
#13 0x11780 in dnetResolveName (name=0x400080d8 "smtp.org.com", hent=0x737f3334) at src/dnet.c:64
..
The problem seems to be occurring somewhere inside libc! A system call trace ends with:
Connecting to server smtp.org.com on port 25
write(1, "C o n n e c t i n g t o s e ".., 51) .......................... = 51
open("/etc/nsswitch.conf", O_RDONLY, 0666) ............................... [entry]
open("/etc/nsswitch.conf", O_RDONLY, 0666) ................................... = 5
Received signal 11, SIGSEGV, in user mode, [SIG_DFL], partial siginfo
Siginfo: si_code: I_NONEXIST, faulting address: 0x400118fc, si_errno: 0
PC: 0xc01980eb, instruction: 0x0d3f1280
exit(11) [implicit] ............................ WIFSIGNALED(SIGSEGV)|WCOREDUMP
The last thing the program did was:
struct hostent *him;
him = gethostbyname(name); // name == "smtp.org.com" as shown by gdb
Is this a problem with the system, or am I missing something?
Any guidance for digging deeper would be appreciated.
Thx.
Long story short: vsnprintf corrupted my heap under HP-UX 11.11.
vsnprintf was introduced in C99 (ISO/IEC 9899:1999) and "is equivalent to snprintf, with the variable argument list" (§7.19.6.12.2); for snprintf, the standard says (§7.19.6.5.2): "If n is zero, nothing is written".
Well, HP-UX 11.11 doesn't comply with this specification. When the 2nd argument == 0, the formatted output is written at the end of the 1st argument.. which, of course, corrupts the heap (I allocate no space when maxsize==0, given that nothing should be written).
The HP manual pages are unclear ("It is the user's responsibility to ensure that enough storage is available."); nothing is said regarding the case of maxsize==0. Nice trap.. at the very least, the WARNINGS section of the man page should warn standard-compliant users..
It's a chicken-and-egg problem: vsnprintf is variadic, so to take "responsibility to ensure that enough storage is available", the user must first know how much space is needed. And the best way to do that is to call vsnprintf with the 2nd argument == 0: it should then return the amount of space required while printing nothing.. well, except HP's!
One way to still use vsnprintf to determine the needed space despite this standard violation: malloc 1 byte more for your buffer (1st argument) and call vsnprintf(buf+buf.length, 1, ..). This only puts a '\0' in the extra byte you allocated. Silly, but effective. (If you're working with wide characters, make the extra allocation the size of the wide character type instead.)
Anyway, the workaround is trivial: never call v/snprintf under HP-UX with maxsize==0!
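A minimal sketch of that length-probe workaround (a slight variation: it uses a separate 1-byte scratch buffer rather than an extra byte at the end of the real buffer, and it assumes HP-UX honours the size limit for nonzero sizes, which is what the workaround above relies on):
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical helper: returns the number of characters the formatted
   output would need (excluding the terminating '\0'), without ever
   calling vsnprintf with a size of 0. */
int needed_len(const char *fmt, ...)
{
    va_list ap;
    char scratch[1];   /* one spare byte; only a '\0' ever lands here */
    int n;

    va_start(ap, fmt);
    n = vsnprintf(scratch, 1, fmt, ap);
    va_end(ap);
    return n;
}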
I now have a happy stable program!
Thanks to all contributors.
Heap corruption through vsnprintf under HP-UX B11.11
This program prints "##" under Linux/Cygwin/..
It prints "#fooo#" under HP-UX B11.11:
#include <stdarg.h>
#include <stdio.h>
#include <strings.h>   /* bzero() */
const int S=2;
void f (const char *fmt, ...) {
va_list ap;
int actualLen=0;
char buf[S];
bzero(buf, S);
va_start(ap, fmt);
actualLen = vsnprintf(buf, 0, fmt, ap);
va_end(ap);
printf("#%s#\n", buf);
}
int main () {
f("%s", "fooo");
return 0;
}
Whenever this situation happens to me (an unexpected segfault in a system lib), it is usually because I did something foolish somewhere else, e.g. a buffer overrun, a double delete on a pointer, etc.
In those instances where my mistake is not obvious, I use valgrind. Something like the following is usually sufficient:
valgrind -v --leak-check=yes --show-reachable=yes ./myprog
I assume valgrind may be used in HP-UX...
Your stack trace is in malloc which almost certainly means that somewhere you corrupted one of malloc's data structures. As a previous answer said, you likely have a buffer overrun or underrun and corrupted one of the items allocated off the heap.
Another explanation is that you tried to do a free on something that didn't come from the heap, but that's less likely--that would probably have crashed right in free.
Reading the (OS X) manpage, gethostbyname() returns a pointer, but as far as I can tell it may not be allocating memory for that pointer. Do you need to malloc() first? Try this:
struct hostent *him = malloc(sizeof(struct hostent));
him = gethostbyname(name);
...
free(him);
Does that work any better?
EDIT: I tested this and it's probably wrong. Granted I used the bare string "stmp.org.com" instead of a variable, but both versions (with and without malloc()ing) worked on OS X. Maybe HP-UX is different.
I have been facing a weird issue in a piece of code.
void app_ErrDesc(char *ps_logbuf, char *pc_buf_err_recno)
{
char *pc_logbuf_in;
char rec_num[10];
char *y = "|";
int i, j;
memset(rec_num, 0, sizeof(rec_num));
memset(pc_buf_err_recno, 0, LOGBUFF);
.....
.....
}
For some reason the first memset call raises a SIGSEGV. What's stranger is that, inside gdb, the same line appears to execute about 30 times, though the function is called only once and there are no loops inside! Here's a piece of the gdb session.
7295 /*Point to logbuffer string*/
(gdb)
7292 memset(rec_num, 0, sizeof(rec_num));
(gdb)
7295 /*Point to logbuffer string*/
(gdb)
7292 memset(rec_num, 0, sizeof(rec_num));
(gdb) n
7295 /*Point to logbuffer string*/
(gdb)
7292 memset(rec_num, 0, sizeof(rec_num));
(gdb)
Program received signal SIGSEGV, Segmentation fault.
I have also tried running the program through valgrind's memcheck tool, but I'm not getting anything significant about the above piece of code.
The file that I'm parsing has just one record.
Any pointers are appreciated. Thanks.
It's likely that it's actually the second memset, and the reason is that the function is being called with an insufficient buffer for pc_buf_err_recno. Debuggers can report your location incorrectly. Try adding logging after each step to find out exactly what crashes.
I suspect the call to the function, so make sure the call site is not something like this:
char pc_buf_err_recno[SMALLER_THAN_LOGBUFF];
char ps_logbuf[TOO_SMALL];
app_ErrDesc(ps_logbuf, pc_buf_err_recno);
Debuggers can be incorrect, particularly if you're getting SEGV. Remember, it's quite possible you've trashed the stack when you get a segmentation fault and the debugger will get confused if that happens.
It's also quite possible the calling function has made a mess, not the current one.
What's stranger is that, inside gdb, the same line appears to execute about 30 times, though the function is called only once and there are no loops inside!
This sounds symptomatic of having compiled with optimizations. You may have an easier time pinpointing the problem in GDB if you compile with optimizations turned off.
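For example (assuming gcc and a source file named app.c - adjust to your actual build), rebuilding with optimizations off and full debug info looks something like:
gcc -O0 -g app.c -o app
gdb ./app
With -O0 the compiler no longer reorders or merges statements, so gdb's reported source lines should match what is actually executing.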