Why doesn't stack overflow/underflow trigger a run-time error? - c

I use this code snippet:
// stackoverflow.c
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
int main(int argc, char** argv)
{
int i;
int a[10];
// init
a[-1] = -1;
a[11] = 11;
printf(" a[-1]= = %d, a[11] = %d\n", a[-1], a[11]);
printf("I am finished.\n");
return a[-1];
}
The compiler is GCC on Linux x86. The program runs without any run-time error. I also ran it under Valgrind, which doesn't report any memory error either.
$ gcc -O0 -g -o stack_overflow stack_overflow.c
$ ./stack_overflow
a[-1]= = -1, a[11] = 11
I am finished.
$ valgrind ./stack_overflow
==3705== Memcheck, a memory error detector
==3705== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==3705== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==3705== Command: ./stack_overflow
==3705==
a[-1]= = -1, a[11] = 11
I am finished.
==3705==
==3705== HEAP SUMMARY:
==3705== in use at exit: 0 bytes in 0 blocks
==3705== total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==3705==
==3705== All heap blocks were freed -- no leaks are possible
==3705==
==3705== For counts of detected and suppressed errors, rerun with: -v
==3705== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
From my understanding, the heap and the stack are the same kind of memory. The only difference is that they grow in opposite directions.
So my questions are:
Why does a heap overflow/underflow trigger a run-time error, while a stack overflow/underflow does not?
Why didn't the C language designers take this into account, as with the heap, rather than leaving it as undefined behaviour?

valgrind does not detect stack buffer overflows. Use AddressSanitizer. At least gcc 4.8 is required and libasan must be installed.
gcc -g -fsanitize=address stackbufferoverflow.c
==1955==ERROR: AddressSanitizer: stack-buffer-underflow on address 0x7fffff438d4c at pc 0x000000400a1d bp 0x7fffff438d10 sp 0x7fffff438d00
WRITE of size 4 at 0x7fffff438d4c thread T0
#0 0x400a1c in main /home/m/stackbufferoverflow.c:9
#1 0x7fe7e24e178f in __libc_start_main (/lib64/libc.so.6+0x2078f)
#2 0x400888 in _start (/home/m/a.out+0x400888)
Address 0x7fffff438d4c is located in stack of thread T0 at offset 28 in frame
#0 0x400965 in main /home/m/stackbufferoverflow.c:5
This frame has 1 object(s):
[32, 72) 'a' <== Memory access at offset 28 underflows this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism or swapcontext
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-underflow /home/m/stackbufferoverflow.c:9 main

Why didn't the C language designers take this into account, as with the heap, rather than leaving it as undefined behaviour?
The original C language designers wrote a kind of more comfortable and portable assembler for themselves. The original language was not designed to be bullet-proof against programmers' errors.
If you are interested in an opposite example then look at Ada (http://en.wikipedia.org/wiki/Ada_%28programming_language%29).

C doesn't check things like out-of-bounds array indexing. It just does what you told it to, in this case to change the element at index 11 in an array of 10 elements. Typically, this means that your program writes to the location in memory where this item would have been stored, if it had existed. This may or may not cause some sort of visible error, such as a crash. It might have no effect, or it could make your program do something strange. It depends on what, if anything, happened to be stored at that place in memory, and how it is used.
Some other programming languages do perform checks such as these, and guarantee that an error will be reported. The C standard gives no such guarantees, and just says that it will cause "undefined behaviour". One reason for this is that it should be possible to write very efficient programs in C, where checks would cause a small, but in some cases perhaps unacceptable, delay. Also, back when C was designed, computers were slower, and the delay would have been a worse problem.
There is also no guarantee in C that heap errors will be detected or reported. Valgrind is not part of the C language, but a different tool, and it does its best to find errors using other and more effective mechanisms than C would, but there is no guarantee that it will find all errors.
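To make the trade-off concrete, here is a minimal sketch of the kind of test a bounds-checking language performs on every access, written by hand in C. The helper name checked_store is hypothetical and not part of standard C; it only illustrates the extra comparison that C leaves out by default.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper (not part of standard C): store value at arr[index]
 * only if index lies inside [0, len). Returns false instead of writing
 * out of bounds. */
static bool checked_store(int *arr, size_t len, long index, int value)
{
    if (index < 0 || (size_t)index >= len) {
        return false;   /* the plain assignment would be undefined behaviour */
    }
    arr[index] = value;
    return true;
}
```

With int a[10], checked_store(a, 10, -1, -1) and checked_store(a, 10, 11, 11) both return false and leave memory untouched, whereas the plain assignments a[-1] = -1 and a[11] = 11 compile and run with no such test.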

EDIT
Here's an interesting tutorial:
http://gribblelab.org/CBootcamp/7_Memory_Stack_vs_Heap.html
BTW, Clang (OSX) detects it, but that's just an extra feature; good old gcc would let you do it.
ctest.c:6:5: warning: array index 42 is past the end of the array (which contains 1 element) [-Warray-bounds]
a[42] = 42;
^ ~~
cpp.cpp:4:5: note: array 'a' declared here
int a[1];
^
1 warning generated.
Old
a[11] = 11;
Could trigger a segmentation fault (but here the write lands only just past the end of the array, so most likely it just overwrites the value of another variable). If you want a stack overflow, try something that does infinite recursion.
Also, if you want to make your code segfault-proof (for malloc only), I suggest you compile it with Electric Fence for your tests. It will prevent your program from going beyond its allocated memory (starting from the first byte).
http://linux.die.net/man/3/efence
As suggested in the comments Valgrind is also a useful tool.
http://valgrind.org/

Why doesn't stack overflow/underflow trigger a run-time error?
C is not limited to "heap" and "stack" implementations. Example: variables in main() need not be in a "stack". Even GCC may optimize in ways that defy a simple understanding. Many memory architectures are possible. Since C does not specify the underlying memory architecture, the following is simply undefined behavior. #Karoly Horvath
// Undefined behavior: accessing memory outside array's range.
int a[10];
a[-1] = -1;
a[11] = 11;
Any analysis may make sense with a given memory model on a given day of the week, but that behavior is just one of many possibilities.

Allocating heap storage always includes a test for insufficient memory; for stack space this is less critical due to the way stack space is reused over and over again. If they share the same block of storage, then they could collide.
GCC won't do this because heap space and stack space are separate; I don't know about Valgrind.
In at least one old language (Turbo C), an alloc() will fail if less than 256 bytes of storage remain between top-of-heap and bottom-of-stack. It is assumed 256 bytes is enough to accommodate stack growth. If it's not, you get some very weird run-time errors.
Turbo C has a compile-time option, -N, to check for stack overflow more thoroughly. Other languages may have a similar option.

Related

Valgrind conditional jump ... error with PCRE2 JIT when reading from file

I have a very interesting problem.
I'd like to use PCRE2, and its JIT function. The task is simple: read lines from a file, and find patterns.
Here is the sample code:
#include <stdio.h>
#include <string.h>
#define PCRE2_CODE_UNIT_WIDTH 8
#include <pcre2.h>
int search(pcre2_code *re, unsigned char * subject) {
pcre2_match_data *match_data_real = pcre2_match_data_create_from_pattern(re, NULL);
size_t len_subject = strlen((const char *)subject);
int rc = pcre2_match(
re,
(PCRE2_SPTR)subject,
len_subject,
0,
0,
match_data_real,
NULL
);
pcre2_match_data_free(match_data_real);
return rc;
}
int main(int argc, char ** argv) {
unsigned char subject[][100] = {
"this is a foobar",
"this is a barfoo",
"this is a barbar",
"this is a foofoo"
};
pcre2_code *re;
PCRE2_SPTR pattern = (unsigned char *)"foo";
int errornumber;
PCRE2_SIZE erroroffset;
re = pcre2_compile(
pattern,
PCRE2_ZERO_TERMINATED,
0,
&errornumber,
&erroroffset,
NULL
);
pcre2_jit_compile(re, PCRE2_JIT_COMPLETE);
FILE *fp;
int s = 0;
while(s < 2) {
search(re, subject[s++]);
}
if (argc >= 2) {
fp = fopen(argv[1], "r");
if (fp != NULL) {
char tline[2048];
while(fgets(tline, 2048, fp) != NULL) {
search(re, (unsigned char *)tline);
}
fclose(fp);
}
}
pcre2_code_free(re);
return 0;
}
Compile the code:
gcc -Wall -O2 -g pcretest.c -o pcretest -lpcre2-8
As you can see, the code checks whether an argument was given (if (argc >= 2)) and, if so, tries to open it as a file.
Also, as you can see from the pcre2_jit_compile() call, I'd like to use PCRE2's JIT.
The code works, but when I checked it with Valgrind I found an interesting behavior:
If I pass a file as an argument, Valgrind reports Conditional jump or move depends on uninitialised value(s) and Uninitialised value was created by a stack allocation, but it points to main(). The command:
valgrind --tool=memcheck --leak-check=full --show-leak-kinds=all --track-origins=yes -s ./pcretest myfile.txt
Without an argument, Valgrind reports nothing. Command:
valgrind --tool=memcheck --leak-check=full --show-leak-kinds=all --track-origins=yes -s ./pcretest
If I comment out the pcre2_jit_compile(re, PCRE2_JIT_COMPLETE); call, then everything works and Valgrind reports nothing. Command:
valgrind --tool=memcheck --leak-check=full --show-leak-kinds=all --track-origins=yes -s ./pcretest myfile.txt
The Valgrind's relevant output:
==31385== Conditional jump or move depends on uninitialised value(s)
==31385== at 0x4EECD1A: ???
==31385== by 0x1FFEFFFC1F: ???
==31385== Uninitialised value was created by a stack allocation
==31385== at 0x1090FA: main (pcretest.c:27)
...
==31385== HEAP SUMMARY:
==31385== in use at exit: 0 bytes in 0 blocks
==31385== total heap usage: 12 allocs, 12 frees, 13,486 bytes allocated
==31385==
==31385== All heap blocks were freed -- no leaks are possible
==31385==
==31385== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
==31385==
==31385== 1 errors in context 1 of 1:
==31385== Conditional jump or move depends on uninitialised value(s)
==31385== at 0x4EECD1A: ???
==31385== by 0x1FFEFFFC1F: ???
==31385== Uninitialised value was created by a stack allocation
==31385== at 0x1090FA: main (pcretest.c:27)
==31385==
==31385== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
In line 27 there is the int main(...).
What do I miss?
Observations:
The Valgrind report is telling you that the uninitialized data being accessed are in the stack frame of the initial call to main(). However,
even though you're compiling with debug information, the Valgrind report does not implicate a specific variable. Also,
the report's stack trace for the error does not present function names, and does not trace back to main(). And of course,
the error is not reported when you disable JIT compilation of the pattern.
Apparently, then, the error is associated with the machine code generated by PCRE2's JIT compiler. If you don't perform JIT compilation then you get correct operation via the ordinary matching path. If you do perform JIT compilation then the JIT-generated code is engaged, and that code triggers the Valgrind error. You might nevertheless get correct matching, but I would not rely on that for code that triggers the Valgrind error observed.
I played around with variations on your code, and discovered that the error is specifically associated with the calls to pcre2_match_data_create_from_pattern() and pcre2_match() in function search(). Either one will cause Valgrind to report the error. But why does the error occur only in some calls to search()?
It seems likely to be because the JIT compilation sets up data structures in main()'s stack frame that are clobbered by executing the body of the if (argc >= 2) statement. This is consistent with the fact that I was able to avoid the error by adding an initializer for variable tline in that block:
char tline[2048] = {0};
I can imagine a variety of scenarios for why that might make a difference, all having to do with how the JIT-generated code and the compiler-generated code manipulate the stack pointer.
Personally, discovering such an issue would likely persuade me to stay far away from PCRE's JIT compiler. Definitely I would do that at least until I had evidence of pattern matching being a performance hotspot for my program. If you must engage the JIT, however, then here are some recommendations that might (or might not) help you avoid trouble:
Take "just in time" to heart: perform JIT as close as possible to when you actually use the pattern.
Do not assume that the JIT code is long-term viable. In particular, it probably is unsafe to use after the function that calls the JIT compiler returns, but it might not be good even that long.
Use the JIT-compiled regex (only) in the same function that runs the JIT compiler.
Make that function as simple as possible.
Declare all local variables of that function at the beginning, with initializers.
Test thoroughly.
That's more than seems to have been necessary to resolve the issue for your particular example code, but it's aimed more generally at reducing the cross section for the compiled program violating assumptions made by the JIT.
This is indeed caused by efficient use of SSE2. CPUs map memory in pages of 1K or larger, so a 16-byte-aligned 16-byte read (SSE2 registers are 16 bytes long) that intersects a valid buffer is always valid. However, bytes before the start or after the end of the buffer might never have been initialized. The algorithm ignores these bytes, so the random data (regardless of whether it is initialized) has no effect on any computation.
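The page argument above can be checked with plain address arithmetic. The sketch below is not PCRE2's code: the helper names are hypothetical, and the 4096-byte page size is an assumption (typical on x86, larger elsewhere). It rounds an address down to a 16-byte boundary and checks that the resulting 16-byte block lies in the same page as the original address.

```c
#include <stdbool.h>
#include <stdint.h>

#define VECTOR_SIZE        16u    /* SSE2 register width in bytes */
#define ASSUMED_PAGE_SIZE  4096u  /* assumption; real pages may be larger */

/* Round an address down to the start of its 16-byte aligned block. */
static uintptr_t align_down_16(uintptr_t addr)
{
    return addr & ~(uintptr_t)(VECTOR_SIZE - 1);
}

/* True if the aligned 16-byte block containing addr lies entirely inside
 * the page that contains addr. */
static bool aligned_block_stays_in_page(uintptr_t addr)
{
    uintptr_t first = align_down_16(addr);
    uintptr_t last  = first + VECTOR_SIZE - 1;
    return first / ASSUMED_PAGE_SIZE == addr / ASSUMED_PAGE_SIZE
        && last  / ASSUMED_PAGE_SIZE == addr / ASSUMED_PAGE_SIZE;
}
```

Because the page size is a multiple of 16, the check holds for every address, so the aligned load cannot fault. Memcheck tracks definedness at a much finer granularity than pages, which is why it can still flag the neighbouring bytes as uninitialised even though the read itself is harmless.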

Why is valgrind memcheck not finding errors?

I haven't used valgrind before, but I think it should detect some memory errors.
My code:
#include <stdio.h>
unsigned int a[2];
int main()
{
a[-1] = 21;
printf("%d,", a[-1]);
return 1;
}
As you can see, I am accessing a[-1] which I should not.
How am I using valgrind?
I am compiling with gcc -g -O0 codeFile.c
And executing: valgrind -s ./a.out
Result is:
==239== Memcheck, a memory error detector
==239== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==239== Using Valgrind-3.16.0.GIT and LibVEX; rerun with -h for copyright info
==239== Command: ./a.out
==239== 21,==239==
==239== HEAP SUMMARY:
==239== in use at exit: 0 bytes in 0 blocks
==239== total heap usage: 1 allocs, 1 frees, 1,024 bytes allocated
==239==
==239== All heap blocks were freed -- no leaks are possible
==239==
==239== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Shouldn't valgrind find these errors, or am I using it wrong?
EDIT:
It seems that valgrind memcheck does not do anything for global variables and, as suggested in the answers/comments, it should work with indexes further from the pointer. Therefore:
I removed the global declaration and added it inside main, and accessed a[-10] instead of a[-1]. Same behaviour.
int main()
{
unsigned int a[2];
a[-10] = 21;
printf("%d,", a[-10]);
return 1;
}
It actually reports an error if I use a[-100], though. What's the deal?
EDIT 2
Furthermore, why does this produce no errors
while (i <= 2)
{
j = a[i];
i++;
}
But this does
while (i <= 2)
{
printf("%d,", a[i]);
i++;
}
Valgrind usually can't find memory errors where the memory being modified is at a negative offset from the current stack pointer or memory that coincides with another variable in memory.
For example, if a was on the stack, a[3] would trigger memcheck. a[-1] would not, because that, for all Valgrind knows, could easily be valid memory.
To expand on that, here's a quote from the documentation with my emphasis added:
In this example, Memcheck can't identify the address. Actually the address is on the stack, but, for some reason, this is not a valid stack address -- it is below the stack pointer and that isn't allowed.
This quote is actually partially incorrect; when it says "below the stack pointer" it really means at a positive offset from the stack pointer, or interfering with another function's stack memory.
I should also note that (from your second edit) Valgrind doesn't actually complain until the value is used in some meaningful way. Assignment is, in Valgrind's eyes, not using the value in a meaningful way. Here's another quote to back that up with my emphasis added:
It is important to understand that your program can copy around junk (uninitialised) data as much as it likes. Memcheck observes this and keeps track of the data, but does not complain. A complaint is issued only when your program attempts to make use of uninitialised data in a way that might affect your program's externally-visible behaviour.
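A minimal sketch of the distinction the quote draws, using hypothetical function names. Copying only moves bytes around, so Memcheck stays quiet even if the source was never initialised; branching on the values is a use that can change visible behaviour, and that is where a complaint would typically be issued.

```c
#include <stddef.h>
#include <string.h>

/* Copying: if src holds uninitialised bytes, Memcheck records that dst is
 * now uninitialised too, but it reports nothing. */
static void copy_values(int *dst, const int *src, size_t n)
{
    memcpy(dst, src, n * sizeof *src);
}

/* Using: control flow depends on each value, so with uninitialised input
 * Memcheck would typically report "Conditional jump or move depends on
 * uninitialised value(s)" for this function. */
static size_t count_positive(const int *values, size_t n)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (values[i] > 0) {
            count++;
        }
    }
    return count;
}
```

This mirrors the two loops in the question: j = a[i] is a copy, while printf("%d,", a[i]) has to inspect the value in order to format it.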
Because a is a global variable, you'll have a hard time trying to check the memory of it. One Valgrind tool I've used before that deals with this is exp-sgcheck (experimental static and global variable check), although I've found it to be unreliable (most likely due to it being experimental).
An easier and better way to detect these would be to enable compiler warnings or use a static analyzer (my favorite is LLVM's scan-build).
You declared a as a global array, so use --tool=exp-sgcheck to check for stack and global array overruns. Keep in mind that --tool=exp-sgcheck is an experimental implementation, so it doesn't show up when you enable -s or --show-error-list=yes; you can read more about it here.

Segmentation fault from pointer in fprintf

I am trying to run a program that takes an input file, reads it, and then runs a simulation based on the parameters specified in the input file. The code is a mixture of C and Fortran77, with the main executable entirely in Fortran77. I did not write the code and I'm inexperienced with both languages (so if I say something stupid, that's why).
Upon running the program, I get the following error regardless of the input file: Segmentation fault (core dumped)
This is the output I get from valgrind, with a few small omissions:
Invalid read of size 4
at 0x55ACFDD: vfprintf (vfprintf.c:1295)
by 0x55B76B6: fprintf (fprintf.c:32)
by 0x578C64: cfprintf_ (in [path omitted])
by 0x56D288: writebuf_ (writebuf.F:22)
by 0x56D3CC: writemsg_ (writemsg.F:15)
by 0x4C09E7: init0_ (init0.F:68)
by 0x4D6C65: [omitted]
by 0x4E9EC4: main (main.c:113)
Address 0xffffffff0592bbc0 is not stack'd, malloc'd or (recently) free'd
Process terminating with default action of signal 11 (SIGSEGV)
Access not within mapped region at address 0xFFFFFFFF0592BBC0
at 0x55ACFDD: vfprintf (vfprintf.c:1295)
by 0x55B76B6: fprintf (fprintf.c:32)
by 0x578C64: cfprintf_ (in [path omitted])
by 0x56D288: writebuf_ (writebuf.F:22)
by 0x56D3CC: writemsg_ (writemsg.F:15)
by 0x4C09E7: init0_ (init0.F:68)
by 0x4D6C65: [omitted]
by 0x4E9EC4: main (main.c:113)
If you believe this happened as a result of a stack
overflow in your program's main thread (unlikely but
possible), you can try to increase the size of the
main thread stack using the --main-stacksize= flag.
The main thread stack size used in this run was 8388608.
HEAP SUMMARY:
in use at exit: 11,354 bytes in 7 blocks
total heap usage: 46 allocs, 39 frees, 22,590 bytes allocated
LEAK SUMMARY:
definitely lost: 0 bytes in 0 blocks
indirectly lost: 0 bytes in 0 blocks
possibly lost: 8,008 bytes in 2 blocks
still reachable: 3,346 bytes in 5 blocks
suppressed: 0 bytes in 0 blocks
Rerun with --leak-check=full to see details of leaked memory
For counts of detected and suppressed errors, rerun with: -v
ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
It seems the vfprintf.c and fprintf.c files are part of some internal library, because I can't find those files on my computer. cfprintf.c is part of my code, shown below:
#include <stdio.h>
#include <fcntl.h>
#ifdef _CRAY
#ifdef FORTRAN_PASSES_ASSUMED_CHAR_LEN
int CFPRINTF (fileptr, s1, cl1, num_chars)
int *cl1;
#else
int CFPRINTF (fileptr, s1, num_chars)
#endif
#else
#ifdef POST_UNDERSCORE
int cfprintf_ (fileptr, s1, num_chars)
#else
int cfprintf (fileptr, s1, num_chars)
#endif
#endif
/* write num_chars characters from string s1 to file associated with fileptr */
FILE **fileptr;
char *s1;
int *num_chars;
{
char format_string [11];
char buffer[256];
sprintf (format_string, "%c.%ds", '%', *num_chars);
/* write to an intermediate buffer to avoid problems with '\n' */
sprintf (buffer, format_string, s1);
fprintf (*fileptr, "%s\n", buffer);
}
The line that throws the error is fprintf (*fileptr, "%s\n", buffer);
Specifically, I think it's the pointer that is causing the error, but I don't know what's wrong or how to fix it.
I had issues compiling the code initially because I am using a 64-bit machine. The code, which relies heavily on pointers, implicitly declares many if not all of them, so I think that means they take the default 32-bit format of 4-byte integers. However, on my machine these pointers are 8-byte integers because I compiled in 64 bits.
If *fileptr is of one format, but the code is expecting another, perhaps that is what's generating the Invalid read of size 4 message from valgrind? But like I said, if that's the problem I still don't know how to fix it.
Thanks for your help, and please let me know if there's any additional code I should post.
First of all, you definitely shouldn't be constructing your format string at runtime like that. Use the following instead:
snprintf(buffer, 256, "%.*s", *num_chars, s1);
or even just
fprintf(*fileptr, "%.*s\n", *num_chars, s1);
.* allows the precision, that is, the maximum number of characters printed from the string, to be specified dynamically.
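A minimal sketch of the "%.*s" approach, wrapped in a hypothetical helper (format_prefix is not part of the original code). The argument consumed by .* must be an int and caps how many characters of s1 are read and printed, so s1 does not need a terminating null within that length.

```c
#include <stdio.h>

/* Hypothetical helper: write at most num_chars characters of s1 into out.
 * Returns what snprintf returns: the length the full result needs, or a
 * negative value on error. */
static int format_prefix(char *out, size_t out_size,
                         const char *s1, int num_chars)
{
    /* ".*" takes the precision from the int argument that precedes s1. */
    return snprintf(out, out_size, "%.*s", num_chars, s1);
}
```

Calling format_prefix(buf, sizeof buf, "hello world", 5) leaves "hello" in buf. No format string is built at runtime, so the 11-byte format_string buffer from the original code is no longer needed.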
Next, I would check that your fileptr is valid, and that it was successfully opened. The address 0xffffffff0592bbc0 is very suspicious due to the four high bytes all being 0xff, so it is possible that you are having 32-bit/64-bit issues.
As I understand from your question, the application has been known to work in the past, but compiled on 32-bit Unix systems? (That's the only environment I've seen old Fortran and C code mixed in a single program in general.)
Unless you want to port the application to 64-bit, I would suggest you simply compile it as a 32-bit native application, assuming your operating system allows 32-bit applications to run in a 64-bit environment.
For example, with the gcc compiler the flag -m32 is all you need to add for 64-bit x86 processor systems. I believe clang supports the same flag, but if not, check the documentation.
A second potential complicating factor is whether the program / application (or its data) is endian-dependent, which may be an issue if it was written under the assumption of being run on BIG ENDIAN CPUs, such as the majority of Unix workstations with RISC-style processors (MIPS, PowerPC, etc.), and you are now trying to run it on a 64-bit x86 Intel or AMD processor (aka amd64, x86-64).
My advice in this case is to hurt someone..., er, review the importance of the application and its usage.

How do I use valgrind to find memory leaks?

How do I use valgrind to find the memory leaks in a program?
Could someone please describe the steps to carry out the procedure?
I am using Ubuntu 10.04 and I have a program a.c, please help me out.
How to Run Valgrind
Not to insult the OP, but for those who come to this question and are still new to Linux—you might have to install Valgrind on your system.
sudo apt install valgrind # Ubuntu, Debian, etc.
sudo yum install valgrind # RHEL, CentOS, Fedora, etc.
sudo pacman -Syu valgrind # Arch, Manjaro, Garuda, etc
Valgrind is readily usable for C/C++ code, but can even be used for other
languages when configured properly (see this for Python).
To run Valgrind, pass the executable as an argument (along with any
parameters to the program).
valgrind --leak-check=full \
--show-leak-kinds=all \
--track-origins=yes \
--verbose \
--log-file=valgrind-out.txt \
./executable exampleParam1
The flags are, in short:
--leak-check=full: "each individual leak will be shown in detail"
--show-leak-kinds=all: Show all of "definite, indirect, possible, reachable" leak kinds in the "full" report.
--track-origins=yes: Favor useful output over speed. This tracks the origins of uninitialized values, which could be very useful for memory errors. Consider turning off if Valgrind is unacceptably slow.
--verbose: Can tell you about unusual behavior of your program. Repeat for more verbosity.
--log-file: Write to a file. Useful when output exceeds terminal space.
Finally, you would like to see a Valgrind report that looks like this:
HEAP SUMMARY:
in use at exit: 0 bytes in 0 blocks
total heap usage: 636 allocs, 636 frees, 25,393 bytes allocated
All heap blocks were freed -- no leaks are possible
ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
I have a leak, but WHERE?
So, you have a memory leak, and Valgrind isn't saying anything meaningful.
Perhaps, something like this:
5 bytes in 1 blocks are definitely lost in loss record 1 of 1
at 0x4C29BE3: malloc (vg_replace_malloc.c:299)
by 0x40053E: main (in /home/Peri461/Documents/executable)
Let's take a look at the C code I wrote too:
#include <stdlib.h>
int main() {
char* string = malloc(5 * sizeof(char)); //LEAK: not freed!
return 0;
}
Well, there were 5 bytes lost. How did it happen? The error report just says
main and malloc. In a larger program, that would be seriously troublesome to
hunt down. This is because of how the executable was compiled. We can
actually get line-by-line details on what went wrong. Recompile your program
with a debug flag (I'm using gcc here):
gcc -o executable -std=c11 -Wall main.c # suppose it was this at first
gcc -o executable -std=c11 -Wall -ggdb3 main.c # add -ggdb3 to it
Now with this debug build, Valgrind points to the exact line of code
allocating the memory that got leaked! (The wording is important: it might not
be exactly where your leak is, but what got leaked. The trace helps you find
where.)
5 bytes in 1 blocks are definitely lost in loss record 1 of 1
at 0x4C29BE3: malloc (vg_replace_malloc.c:299)
by 0x40053E: main (main.c:4)
Techniques for Debugging Memory Leaks & Errors
Make use of www.cplusplus.com! It has great documentation on C/C++ functions.
General advice for memory leaks:
Make sure your dynamically allocated memory does in fact get freed.
Don't allocate memory and forget to assign the pointer.
Don't overwrite a pointer with a new one unless the old memory is freed.
General advice for memory errors:
Access and write to addresses and indices you're sure belong to you. Memory
errors are different from leaks; they're often just IndexOutOfBoundsException
type problems.
Don't access or write to memory after freeing it.
Sometimes your leaks/errors can be linked to one another, much like an IDE discovering that you haven't typed a closing bracket yet. Resolving one issue can resolve others, so look for one that looks like a good culprit and apply some of these ideas:
List the functions in your code that depend on, or are depended on by, the
"offending" code that has the memory error. Follow the program's execution
(perhaps in gdb), and look for precondition/postcondition errors. The idea is to trace your program's execution while focusing on the lifetime of allocated memory.
Try commenting out the "offending" block of code (within reason, so your code
still compiles). If the Valgrind error goes away, you've found where it is.
If all else fails, try looking it up. Valgrind has documentation too!
A Look at Common Leaks and Errors
Watch your pointers
60 bytes in 1 blocks are definitely lost in loss record 1 of 1
at 0x4C2BB78: realloc (vg_replace_malloc.c:785)
by 0x4005E4: resizeArray (main.c:12)
by 0x40062E: main (main.c:19)
And the code:
#include <stdlib.h>
#include <stdint.h>
struct _List {
int32_t* data;
int32_t length;
};
typedef struct _List List;
List* resizeArray(List* array) {
int32_t* dPtr = array->data;
dPtr = realloc(dPtr, 15 * sizeof(int32_t)); //doesn't update array->data
return array;
}
int main() {
List* array = calloc(1, sizeof(List));
array->data = calloc(10, sizeof(int32_t));
array = resizeArray(array);
free(array->data);
free(array);
return 0;
}
As a teaching assistant, I've seen this mistake often. The student makes use of
a local variable and forgets to update the original pointer. The error here is
not noticing that realloc can actually move the allocated memory somewhere else
and return a different pointer. We then leave resizeArray without telling
array->data where the array was moved to.
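One possible fix, sketched with an equivalent List definition. The new_length parameter is hypothetical (the original hard-codes 15), and the temporary pointer is an extra precaution: it keeps the old block reachable if realloc fails, and array->data is updated only on success.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    int32_t *data;
    int32_t length;
} List;

/* Resize array->data to hold new_length elements.
 * Returns array on success, or NULL if new_length is not positive or
 * realloc fails; in both failure cases the old data is left untouched
 * and is still owned by array. */
static List *resizeArray(List *array, int32_t new_length)
{
    if (new_length <= 0) {
        return NULL;
    }
    int32_t *resized = realloc(array->data,
                               (size_t)new_length * sizeof *resized);
    if (resized == NULL) {
        return NULL;
    }
    array->data = resized;       /* tell the struct where the block is now */
    array->length = new_length;
    return array;
}
```

With this version, the free(array->data) call in main releases the block that realloc actually returned, so the 60-byte loss record should no longer appear.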
Invalid write
1 errors in context 1 of 1:
Invalid write of size 1
at 0x4005CA: main (main.c:10)
Address 0x51f905a is 0 bytes after a block of size 26 alloc'd
at 0x4C2B975: calloc (vg_replace_malloc.c:711)
by 0x400593: main (main.c:5)
And the code:
#include <stdlib.h>
#include <stdint.h>
int main() {
char* alphabet = calloc(26, sizeof(char));
for(uint8_t i = 0; i < 26; i++) {
*(alphabet + i) = 'A' + i;
}
*(alphabet + 26) = '\0'; //null-terminate the string?
free(alphabet);
return 0;
}
Notice that Valgrind points us to the commented line of code above. The array
of size 26 is indexed [0,25] which is why *(alphabet + 26) is an invalid
write—it's out of bounds. An invalid write is a common result of
off-by-one errors. Look at the left side of your assignment operation.
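A sketch of the corrected allocation, wrapped in a hypothetical make_alphabet function so it can be reused. The only substantive change is allocating 27 bytes instead of 26, which makes index 26 a valid place for the terminator.

```c
#include <stdlib.h>

/* Returns a newly allocated, null-terminated "A".."Z" string, or NULL if
 * allocation fails. The caller must free() the result.
 * Assumes an ASCII-compatible character set, as the original code does. */
static char *make_alphabet(void)
{
    enum { LETTERS = 26 };
    char *alphabet = calloc(LETTERS + 1, sizeof *alphabet); /* +1 for '\0' */
    if (alphabet == NULL) {
        return NULL;
    }
    for (int i = 0; i < LETTERS; i++) {
        alphabet[i] = (char)('A' + i);
    }
    alphabet[LETTERS] = '\0';    /* index 26 is now inside the block */
    return alphabet;
}
```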
Invalid read
1 errors in context 1 of 1:
Invalid read of size 1
at 0x400602: main (main.c:9)
Address 0x51f90ba is 0 bytes after a block of size 26 alloc'd
at 0x4C29BE3: malloc (vg_replace_malloc.c:299)
by 0x4005E1: main (main.c:6)
And the code:
#include <stdlib.h>
#include <stdint.h>
int main() {
char* destination = calloc(27, sizeof(char));
char* source = malloc(26 * sizeof(char));
for(uint8_t i = 0; i < 27; i++) {
*(destination + i) = *(source + i); //Look at the last iteration.
}
free(destination);
free(source);
return 0;
}
Valgrind points us to the commented line above. Look at the last iteration here,
which is *(destination + 26) = *(source + 26);. However, *(source + 26) is
out of bounds again, similarly to the invalid write. Invalid reads are also a
common result of off-by-one errors. Look at the right side of your assignment
operation.
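One way to avoid this class of invalid read is to let the smaller of the two buffer sizes bound the copy. The helper below is a hypothetical sketch, not a standard function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy only as many bytes as fit in both buffers.
 * Returns the number of bytes copied. The buffers must not overlap. */
static size_t copy_bounded(char *dst, size_t dst_size,
                           const char *src, size_t src_size)
{
    size_t n = dst_size < src_size ? dst_size : src_size;
    memcpy(dst, src, n);
    return n;
}
```

For the example above, copy_bounded(destination, 27, source, 26) copies 26 bytes and never touches *(source + 26). Note that source comes from malloc and is never filled in, so the copied bytes are still uninitialised data.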
The Open Source (U/Dys)topia
How do I know when the leak is mine? How do I find my leak when I'm using
someone else's code? I found a leak that isn't mine; should I do something? All
are legitimate questions. First, 2 real-world examples that show 2 classes of
common encounters.
Jansson: a JSON library
#include <jansson.h>
#include <stdio.h>
int main() {
char* string = "{ \"key\": \"value\" }";
json_error_t error;
json_t* root = json_loads(string, 0, &error); //obtaining a pointer
json_t* value = json_object_get(root, "key"); //obtaining a pointer
printf("\"%s\" is the value field.\n", json_string_value(value)); //use value
json_decref(value); //Do I free this pointer?
json_decref(root); //What about this one? Does the order matter?
return 0;
}
This is a simple program: it reads a JSON string and parses it. In the making,
we use library calls to do the parsing for us. Jansson makes the necessary
allocations dynamically since JSON can contain nested structures of itself.
However, this doesn't mean we decref or "free" the memory given to us from
every function. In fact, this code I wrote above throws both an "Invalid read"
and an "Invalid write". Those errors go away when you take out the decref line
for value.
Why? The variable value is considered a "borrowed reference" in the Jansson
API. Jansson keeps track of its memory for you, and you simply have to decref
JSON structures independent of each other. The lesson here:
read the documentation. Really. It's sometimes hard to understand, but
they're telling you why these things happen. Instead, we have
existing questions about this memory error.
SDL: a graphics and gaming library
#include "SDL2/SDL.h"
int main(int argc, char* argv[]) {
if (SDL_Init(SDL_INIT_VIDEO|SDL_INIT_AUDIO) != 0) {
SDL_Log("Unable to initialize SDL: %s", SDL_GetError());
return 1;
}
SDL_Quit();
return 0;
}
What's wrong with this code? It consistently leaks ~212 KiB of memory for me. Take a moment to think about it. We turn SDL on and then off. Answer? There is nothing wrong.
That might sound bizarre at first. Truth be told, graphics are messy and sometimes you have to accept some leaks as being part of the standard library. The lesson here: you need not quell every memory leak. Sometimes you just need to suppress the leaks because they're known issues you can't do anything about. (This is not my permission to ignore your own leaks!)
Answers unto the void
How do I know when the leak is mine?
It is. (99% sure, anyway)
How do I find my leak when I'm using someone else's code?
Chances are someone else already found it. Try Google! If that fails, use the skills I gave you above. If that fails and you mostly see API calls and little of your own stack trace, see the next question.
I found a leak that isn't mine; should I do something?
Yes! Most APIs have ways to report bugs and issues. Use them! Help give back to the tools you're using in your project!
Further Reading
Thanks for staying with me this long. I hope you've learned something, as I tried to cater to the broad spectrum of people arriving at this answer. Some things I hope you've asked along the way: How does C's memory allocator work? What actually is a memory leak and a memory error? How are they different from segfaults? How does Valgrind work? If you had any of these, please do feed your curiosity:
More about malloc, C's memory allocator
Definition of a segmentation fault
Definition of a memory leak
Definition of a memory access error
How does Valgrind work?
Try this:
valgrind --leak-check=full -v ./your_program
As long as valgrind is installed, it will go through your program and tell you what's wrong. It can give you pointers and approximate places where your leaks may be found. If you're segfaulting, try running the program through gdb.
You can run:
valgrind --leak-check=full --log-file="logfile.out" -v [your_program(and its arguments)]
You can create an alias in your .bashrc file as follows:
alias vg='valgrind --leak-check=full -v --track-origins=yes --log-file=vg_logfile.out'
Then, whenever you want to check for memory leaks, simply run:
vg ./<name of your executable> <command line parameters to your executable>
This will generate a Valgrind log file in the current directory.

Cannot free memory after using strdup

gcc 4.5.1 c89
I am trying to free some memory. However, when I check with valgrind the memory hasn't been freed. I am wondering what I am doing wrong.
I have the following structure:
typedef struct tag_cand_results {
char *candidate_winners[NUMBER_OF_CANDIDATES];
} cand_results;
I create an object of this structure:
cand_results *results = NULL;
I allocate some memory for the structure.
results = calloc(1, sizeof *results);
Assign some data to it
results->candidate_winners[0] = strdup("Steve Martin");
results->candidate_winners[1] = strdup("Jack Jones");
Then I try to free all the memory allocated:
free(results->candidate_winners[0]);
free(results->candidate_winners[1]);
free(results);
Just to be safe assign to NULL
results = NULL;
I get the following output from valgrind.
==8119== 72 bytes in 6 blocks are definitely lost in loss record 1 of 2
==8119== at 0x4A05E46: malloc (vg_replace_malloc.c:195)
==8119== by 0x3FE2E82A91: strdup (strdup.c:43)
==8119== by 0x400E5A: main (driver.c:116)
==8119==
==8119== 72 bytes in 6 blocks are definitely lost in loss record 2 of 2
==8119== at 0x4A05E46: malloc (vg_replace_malloc.c:195)
==8119== by 0x3FE2E82A91: strdup (strdup.c:43)
==8119== by 0x400E72: main (driver.c:117)
I don't know why the memory is not being freed.
Many thanks for any suggestions,
If that is actually the sequence of events, then valgrind is wrong. The memory is being freed.
As to the best technique requested in your comment, normally I would say valgrind but perhaps not in this case :-)
Some things to check.
What happens if you just call malloc(30) instead of strdup(some_string) (in both cases)?
Remove the (malloc-or-strdup)/free pairs one at a time to see what happens.
I haven't seen your actual code so put a printf before and after every strdup and free line to make sure they're all being run.
Post a full small program here (that exhibits the problem) so we can check it out.
For what it's worth, the following small (complete) program:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NUMBER_OF_CANDIDATES 10
typedef struct tag_cand_results {
char *candidate_winners[NUMBER_OF_CANDIDATES];
} cand_results;
int main (void) {
cand_results *results = NULL;
results = calloc(1, sizeof *results);
results->candidate_winners[0] = strdup("Steve Martin");
results->candidate_winners[1] = strdup("Jack Jones");
free(results->candidate_winners[0]);
free(results->candidate_winners[1]);
free(results);
results = NULL;
return 0;
}
results in the following valgrind output:
==9649== Memcheck, a memory error detector
==9649== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==9649== Using Valgrind-3.6.0.SVN-Debian and LibVEX; rerun with -h for copyright info
==9649== Command: ./qq
==9649==
==9649==
==9649== HEAP SUMMARY:
==9649== in use at exit: 0 bytes in 0 blocks
==9649== total heap usage: 3 allocs, 3 frees, 64 bytes allocated
==9649==
==9649== All heap blocks were freed -- no leaks are possible
==9649==
==9649== For counts of detected and suppressed errors, rerun with: -v
==9649== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 13 from 8)
In other words, no problems. So it may be something else in your case (an environment issue perhaps). This particular run was done on Ubuntu Lucid (10.04), gcc 4.4.3, c89 mode.
I'd suggest typing in that code exactly on your system to see what happens. The command line I used to compile and test was:
gcc -std=c89 -o qq qq.c
valgrind ./qq
You can also debug your application with gdb and watch if any pointer gets changed with the "watch" command. Place a breakpoint on your main function and do a step by step followup to discover where the problem resides.
Regards,
Miguel
There is no obvious error in your allocations/frees.
It looks like the content of result has been changed somehow (overwritten through a wild pointer?).
One easy way to check is to print the pointer values (using printf("%p", ...)) immediately after the strdup allocation and again just before freeing. If they changed: bingo!
Do the same with result; another explanation could be that the pointer to result itself has changed (and hence the values it points to).
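The pointer-printing check could look like the following minimal sketch. The helper log_ptr and the function pointer_unchanged_demo are hypothetical names chosen for illustration, and the _POSIX_C_SOURCE define is an assumption needed only so that strdup is declared under strict -std modes.

```c
#define _POSIX_C_SOURCE 200809L  /* assumption: exposes strdup() under strict -std modes */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: prints a label and the pointer value to stderr. */
static void log_ptr(const char *label, void *p) {
    fprintf(stderr, "%s: %p\n", label, p);
}

/* Duplicates a string, logs the pointer right after allocation and right
   before free, and returns 1 if both values match (0 otherwise). */
int pointer_unchanged_demo(void) {
    char *name = strdup("Steve Martin");
    char *at_alloc = name;         /* value of the pointer at allocation time */
    int same;
    if (name == NULL) return 0;
    log_ptr("after strdup", name);
    /* ... the rest of the program would run here ... */
    log_ptr("before free ", name);
    same = (name == at_alloc);
    assert(same);                  /* a mismatch means the pointer was overwritten */
    free(name);
    return same;
}
```

If the two printed values differ in the real program, something between the allocation and the free has overwritten the pointer, and the original block can no longer be freed through it.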
Now, if the pointer has indeed changed, how do you pinpoint where it happens?
One solution is to run the program under a debugger. This can be very time consuming in some cases, but it usually works. If this is not an option, there is another way, which I usually find faster than using a debugger.
Keep a copy of the allocated pointer in another variable, preferably one located far from the memory chunk that holds your corrupted pointer (a global will usually do).
Now in the control flow put assertions like:
assert(result == saved_result);
At some place the assertion should fail and you will eventually find the problem.
Afterward, do not forget to remove these assertions, which should not be left in the final project. To be sure of that, just remove the saved_result variable: the program won't compile in debug mode if any assertion is left.
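The saved-pointer technique can be sketched as below. The type results_t, the helper check_result and the function saved_pointer_demo are hypothetical names for illustration; only the assertion result == saved_result comes from the answer itself.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the structure from the question. */
typedef struct {
    char *names[2];
} results_t;

/* Global copy of the pointer, stored away from the heap block it guards. */
static results_t *saved_result = NULL;

static int checkpoints_passed = 0;

/* Checkpoint: aborts (in debug builds) if the pointer no longer matches. */
static void check_result(const results_t *result) {
    assert(result == saved_result);
    checkpoints_passed++;
}

/* Allocates the structure, runs two checkpoints around the suspect code,
   and returns how many checkpoints were passed (-1 on allocation failure). */
int saved_pointer_demo(void) {
    results_t *result = calloc(1, sizeof *result);
    if (result == NULL) return -1;
    checkpoints_passed = 0;
    saved_result = result;         /* remember the original value */

    check_result(result);          /* checkpoint 1 */
    /* ... suspect code that might overwrite the pointer goes here ... */
    check_result(result);          /* checkpoint 2 */

    free(result);
    saved_result = NULL;
    return checkpoints_passed;
}
```

Moving the checkpoints closer together narrows down the statement that overwrites the pointer: the first assertion that fails sits just after the culprit.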
"72 bytes in 6 blocks" doesn't sound like "Steve Martin" or "Jack Jones". Are you sure you're not overwriting the pointers at some point?

Resources