Let's say I've got a program foo that allocates and frees memory. I run it like this:
./foo something.foo
It works perfectly, exits with no errors. Now, if I set the first line of the file to #!/path/foo, change the permissions, and run it as ./something.foo, the program runs correctly, but before exiting, I see this:
*** Error in '/path/foo': free(): invalid next size(fast): 0x019e2008 ***
Aborted
I've seen a lot of questions about free(): invalid next size (fast), all with specific code examples. So I've got two questions:
Why might the error appear when using #!/path/foo instead of ./foo?
What exactly does the error mean - what conditions must be present for it to appear?
Huh, fixed this by changing
some_var = malloc(sizeof(char *));
to
some_var = malloc(CONSTANT);
It means you have heap corruption in your program. The message is telling you how the C library detected the corruption, not how the corruption occurred.
Heap corruption is a particularly insidious bug to track down because it generally does not manifest at the point where the bug occurs, but at some later point. It's quite possible for the program to continue to work despite the corruption, meaning the bug may have been present in your code for weeks or months and may have nothing to do with the recent changes you are testing and debugging.
The best response to heap corruption is usually a tool like valgrind, which can run along with your program and will often (though not always) be able to pinpoint where the offending code is.
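The fix mentioned in the question (replacing malloc(sizeof(char *)) with a larger size) suggests the buffer was sized by the pointer, not by the data stored in it. Here is a minimal sketch of that pattern, assuming the buffer held a string; copy_string is a hypothetical name, not something from the original program. It would also be consistent with the #! behaviour: when the kernel runs a script through its interpreter line, the arguments the program receives are different strings with different lengths, so an undersized buffer can overflow in one case and happen to survive in the other.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of src, or NULL on failure.
   The caller owns the result and must free() it. */
char *copy_string(const char *src)
{
    assert(src != NULL);

    /* The mistake: malloc(sizeof(char *)) reserves only the size of a
       pointer (typically 4 or 8 bytes), no matter how long src is.
       Size the block from the data instead. */
    size_t needed = strlen(src) + 1;   /* +1 for the terminating '\0' */

    char *dst = malloc(needed);
    if (dst == NULL)
        return NULL;

    memcpy(dst, src, needed);          /* copies the '\0' as well */
    return dst;
}
```

Writing past a block that is too small tramples the allocator's bookkeeping for the next chunk, and that is the kind of damage free() later reports as "invalid next size".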
I have a critical bug in my project. When I open the .core file in gdb, it shows something like this (I've trimmed the gdb output for readability):
This is the very suspicious, newly written part of the code:
0x00000000004579fe in http_chunk_count_loop
(f=0x82e68dbf0, pl=0x817606e8a Address 0x817606e8a out of bounds)
This is a very mature part of the code that has worked for a long time without problems:
0x000000000045c8a5 in packet_handler_http
(f=0x82e68dbf0, pl=0x817606e8a Address 0x817606e8a out of bounds)
What confuses me is pl=0x817606e8a Address 0x817606e8a out of bounds: gdb shows the pointer was already out of bounds before it reached the newly written code. This makes me think the problem is caused by the function that calls packet_handler_http.
But packet_handler_http is very mature and has worked for a long time without problems, which makes me think I am misunderstanding the gdb output.
The problem is with packet_handler_http I guess, but because this code was already working I am confused. Am I right, or am I missing something?
To detect "memory errors" you might like to run the program under Valgrind: http://valgrind.org
If the program is compiled with symbols (-g for gcc), Valgrind can quite reliably pinpoint "out of bounds" accesses down to the line of code where the error occurs, as well as the line of code that allocated the memory (if any).
The problem is with packet_handler_http I guess
That guess is unlikely to be correct: if packet_handler_http really is receiving an invalid pointer, then the corruption has happened "upstream" from it.
This is a very mature part of the code that has worked for a long time without problems
I routinely find bugs in code that worked "without problem" for 10+ years. Also, the corruption may be happening in newly-added code, but causing problems elsewhere. Heap and stack buffer overflows are often just like that.
As alk already suggested, run your executable under Valgrind, or Address Sanitizer (also included in GCC-4.8), and fix any problems they find.
Thanks for your contributions. Despite what gdb says, it turned out the pointer was good.
A part of the new code was causing the out-of-bounds problem.
There was a line like (goodpointer + offset), where offset was the HTTP chunk size, which I was taking from the network (data sniffing). In one kind of attack this offset was extremely large, which caused an integer overflow and, in turn, the out-of-bounds access.
My conclusions: never trust parameters that come from the network, and gdb may not always show parameters correctly in a core dump, because at the moment of the crash things can get messy on the stack.
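To make the (goodpointer + offset) lesson concrete, here is a rough sketch of validating an untrusted chunk size before using it. The function name chunk_at and its parameters are made up for illustration; the real http_chunk_count_loop is not shown in the thread, so this is only one way the check could look.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns a pointer to the chunk that starts at
   buf + pos, or NULL if chunk_size bytes would not fit in the buffer.
   chunk_size is parsed from network data and must not be trusted. */
const uint8_t *chunk_at(const uint8_t *buf, size_t buf_len,
                        size_t pos, uint64_t chunk_size)
{
    assert(buf != NULL);

    if (pos > buf_len)
        return NULL;

    /* Compare against the space that is left instead of computing
       pos + chunk_size, which can wrap around for huge values. */
    if (chunk_size > (uint64_t)(buf_len - pos))
        return NULL;

    return buf + pos;
}
```

The point of the design is that the subtraction buf_len - pos cannot overflow once pos <= buf_len has been checked, whereas an addition involving the attacker's value can.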
My program is crashing on the second run on this line:
char* temp_directive = (char *)malloc(7);
with this error:
Critical error detected c0000374
Windows has triggered a breakpoint in Maman14.exe.
This may be due to a corruption of the heap, which indicates a bug in Maman14.exe or any of the DLLs it has loaded.
This may also be due to the user pressing F12 while Maman14.exe has focus.
I can't understand why; it always happens on the second run.
I've tried adding free(temp_directive), but it didn't help.
Is anyone familiar with this issue?
http://blogs.msdn.com/b/jiangyue/archive/2010/03/16/windows-heap-overrun-monitoring.aspx
Sounds like you ran off the end of the array earlier in the code, and your memory management isn't picking it up until you try to malloc that memory space.
Found the problem: it was caused by a different realloc. Thanks everyone!
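The thread doesn't say what was wrong with that realloc, so the following is only a generic sketch of a pattern that avoids the usual mistakes: passing an element count where a byte count is needed, and overwriting the only copy of the pointer before checking the result. The int_vec type and int_vec_push function are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable array of int. */
struct int_vec {
    int    *data;
    size_t  len;   /* elements in use */
    size_t  cap;   /* elements allocated */
};

/* Appends value. Returns 0 on success, -1 on allocation failure;
   on failure the vector is left unchanged. */
int int_vec_push(struct int_vec *v, int value)
{
    assert(v != NULL);

    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        if (new_cap > SIZE_MAX / sizeof *v->data)
            return -1;               /* byte count would overflow */

        /* Size in bytes, not elements, and keep the old pointer
           until realloc is known to have succeeded. */
        int *tmp = realloc(v->data, new_cap * sizeof *v->data);
        if (tmp == NULL)
            return -1;

        v->data = tmp;
        v->cap  = new_cap;
    }

    v->data[v->len++] = value;
    return 0;
}
```

Start from a zeroed struct (struct int_vec v = {0};) and free(v.data) when done. An undersized realloc elsewhere in the program would explain why an innocent malloc(7) is where the heap check finally fires.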
I've got a problem that makes no sense to me. Here goes:
I have a function that counts how many times a word appears in a file, so it returns an integer (int). Another function uses that count. For some reason the program has started failing with a "stack smashing detected" error. I had been testing the whole program for two weeks and it worked perfectly. Now I get this error, which really makes no sense. What in the world is going on? The error appears right after the counting function has finished and returns.
Edit:
I keep searching, and yes, I get a stack smashing detected error when returning from a function that returns int. Any ideas? If I take that code out, it does not crash. I really have no idea.
Any suggestion?
Thanks...
May I suggest compiling your program with debugging information and running it under Valgrind? See also this related question.
If you need it, I have posted some hints on using Valgrind in an older answer of mine.
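The question doesn't show the counting function, so this is a guess at the usual cause: "stack smashing detected" is reported when a function returns and the guard value next to its local arrays has been overwritten, which typically means a local buffer was written past its end. Below is a sketch of a word counter that bounds every write into its stack buffer. It reads from a string instead of a file to stay self-contained, and count_word and WORD_MAX are invented names.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define WORD_MAX 63   /* hypothetical limit; must match the %63s below */

/* Counts the whitespace-separated tokens in text that equal word. */
int count_word(const char *text, const char *word)
{
    assert(text != NULL && word != NULL);

    char token[WORD_MAX + 1];   /* stack buffer: 63 chars plus '\0' */
    int  count = 0;
    int  consumed = 0;

    /* %63s stores at most 63 characters, so token cannot overflow.
       A bare %s would write past the array on a longer token.
       %n records how many characters this call consumed. */
    while (sscanf(text, "%63s%n", token, &consumed) == 1) {
        if (strcmp(token, word) == 0)
            count++;
        text += consumed;
    }
    return count;
}
```

A token longer than 63 characters is simply split into pieces here, which is crude but safe. The two weeks of successful testing fit this picture too: nothing breaks until an input contains a word longer than the buffer.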
I've had trouble before with this same program because it makes lots of memory allocations. I got rid of most problems but I'm still having trouble with one in particular. When I run my program in Eclipse it compiles fine, but it crashes with this message:
*** glibc detected *** /home/user/workspace/TTPrueba/Debug/TTPrueba: free(): invalid pointer: 0xb6bc0588 ***
When I run it under Valgrind it tells me this:
==31580== Process terminating with default action of signal 11 (SIGSEGV)
==31580== Access not within mapped region at address 0x0
==31580== at 0x804BEA3: termino (Menu.c:899)
==31580== by 0x804BE05: computar_transformadas (Menu.c:840)
So the problem is that it is trying to free an invalid memory address, but when I go step by step in debug mode the program never crashes!
Any idea why such a thing could happen? How come it works while debugging but not while running? This is pretty strange behavior.
for(phi=0;phi<360;phi++){
    for(j=0;j<par.param1[phi][0];j++){
        for(o=0;o<(par.prueba[phi][j][1]-par.prueba[phi][j][0]);o++){//HERE 849
            free(par.pixels[phi][j][o]);//HERE IS LINE 899 WHERE IT ALWAYS CRASHES
            if(o==(par.prueba[phi][j][1]-par.prueba[phi][j][0]-1))
                free(par.pixels[phi][j]);
        }
        free(par.prueba[phi][j]);
    }
}
Thanks for the help!
One likely reason -- the debugger could be changing the memory layout of things, so when memory is corrupted, it happens to be in an "out of the way" place.
Or the debugger might be causing allocated memory to be zeroed which may not be happening in a production run.
It is not surprising. For example, if par.pixels[phi][j][o] is not initialized, it can contain anything. Under a debugger the memory layout is different, so par.pixels[phi][j][o] may happen to be 0, in which case free doesn't crash.
One problem that I see is that you free par.pixels[phi][j][o] with o looping from zero, and then access par.pixels[phi][j][0], which has just been freed!
You also free par.pixels[phi][j] but continue looping, accessing par.pixels[phi][j] and freeing pointers that are no longer valid.
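A cleanup order that sidesteps these questions is to free the children in a loop and then free the parent exactly once after the loop, rather than from inside it on the last iteration (which never runs when the count is zero, so the parent leaks). Here is a small sketch using a made-up struct row as a stand-in for one par.pixels[phi][j] entry; none of these names come from the original program.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one par.pixels[phi][j] row. */
struct row {
    size_t  count;   /* number of valid entries in items */
    int   **items;   /* each items[o] is a separate heap block */
};

/* Frees children first, then the pointer array, then the row.
   Returns the number of heap blocks released. */
size_t row_destroy(struct row *r)
{
    if (r == NULL)
        return 0;
    assert(r->items != NULL);

    size_t freed = 0;
    for (size_t o = 0; o < r->count; o++) {
        free(r->items[o]);
        freed++;
    }
    free(r->items);   /* once, after the loop, even when count == 0 */
    free(r);
    return freed + 2;
}

/* Allocates a row of count one-int blocks; returns NULL on failure. */
struct row *row_create(size_t count)
{
    struct row *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;

    r->count = 0;
    r->items = malloc((count ? count : 1) * sizeof *r->items);
    if (r->items == NULL) {
        free(r);
        return NULL;
    }
    for (size_t o = 0; o < count; o++) {
        r->items[o] = malloc(sizeof **r->items);
        if (r->items[o] == NULL) {
            row_destroy(r);      /* count covers only initialized slots */
            return NULL;
        }
        *r->items[o] = (int)o;
        r->count++;
    }
    return r;
}
```

Because count only ever covers slots that were actually assigned, row_destroy never passes an uninitialized pointer to free, which is the failure mode the first answer describes.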
Let's state the conditions under which sqlcxt() can cause a segmentation fault. I am working on Unix, using ProC for database connections to an Oracle database.
My program crashes and the core file shows that the crash is due to the sqlcxt() function
A loadobject was found with an unexpected checksum value.
See `help core mismatch' for details, and run `proc -map'
to see what checksum values were expected and found.
...
dbx: warning: Some symbolic information might be incorrect.
...
t#null (l#1) program terminated by signal SEGV
(no mapping at the fault address)0xffffffffffffffff:
<bad address 0xffffffffffffffff>
Current function is dbMatchConsortium
442 **sqlcxt((void **)0, &sqlctx, &sqlstm, &sqlfpn);**
There is a decent chance that the problem you are having is some sort of pointer error or memory allocation error in your C code. These things are never easy to find. Some things that you might try:
See if you can comment out (or #ifdef out) sections of your program and check whether the problem disappears. If so, you can close in on the bad section.
Run your program in a debugger.
Do a code review with somebody else - this will often lead to finding more than one problem (Usually works in my code).
I hope that this helps. Please add more details and I will check back on this question and see if I can help you.
It's probably an allocation error in your program. When I got this kind of behaviour it was always my fault. I develop on Solaris/SPARC and Oracle 10g. Once it was a double free (i.e. I freed the same pointer twice), and the second time I had a core dump in the Oracle part of the program, it was because I freed a pointer that was not an allocated memory block.
If you're on Solaris you can try the libumem allocation library (google it for details) to see if the behaviour changes.
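For the double-free case, one cheap habit is to clear the pointer at the same moment it is freed, so that a second release becomes a harmless free(NULL). This is a minimal sketch, with release_buffer as a hypothetical helper name:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: frees *p and clears it, so calling it again
   on the same variable does nothing. */
void release_buffer(char **p)
{
    assert(p != NULL);

    free(*p);     /* free(NULL) is defined to be a no-op */
    *p = NULL;
}
```

This only protects the one variable that is passed in. If another copy of the pointer exists elsewhere it can still be freed twice, and the helper does nothing for the other mistake mentioned above, passing free() an address that never came from malloc.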
A solution that worked for me: delete the C files generated by ProC and run make (recompile).
ProC files (*.pc) are 'compiled'/preprocessed into C files, and sometimes errors occur during that step (in my case there was no space left). Even if the build succeeds, I would get a SIGSEGV in sqlcxt in libclntsh.so when executing them.
pstack and gdb could help you debug if that's not the case.