GCC warn for non-freed heap blocks - c

The question is simple: is there a way to tell GCC that I want to be warned if I do not free a heap-allocated block? I know that non-freed blocks can be intentional, for example when the program is about to exit anyway.
#include <stdlib.h>
int main(){
    int *a = malloc(sizeof(int));
    return 0;
}
If I can get a warning even for this it would be awesome.

This is not a job GCC can do. Static analysis cannot prove that a free has been forgotten; that is the job of run-time analysers such as valgrind's memcheck, or eventually gcc -fsanitize=leak, which I haven't seen in gcc yet, only as clang -fsanitize=leak.
But you won't get a compile-time warning, even when gcc or clang supports it. It will be a run-time warning.

The compiler cannot predict and warn about non-freed blocks. This is a run-time job, not a compile-time one. You can implement your own malloc-free-check subsystem or modify the memory management library.
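As a rough illustration of the "own malloc-free-check subsystem" idea, here is a minimal sketch. It assumes you are willing to route every allocation through wrapper functions. The names tracked_malloc, tracked_free and tracked_report are made up for this example and are not part of GCC or the C library; real tools such as valgrind record far more (sizes, call sites).

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical names, for illustration only. */
static size_t live_blocks = 0;  /* blocks allocated but not yet freed */

/* Allocate like malloc() and count the block if allocation succeeded. */
void *tracked_malloc(size_t size)
{
    void *p = malloc(size);
    if (p != NULL)
        live_blocks++;
    return p;
}

/* Free like free() and uncount the block; freeing NULL is a no-op. */
void tracked_free(void *p)
{
    if (p != NULL) {
        assert(live_blocks > 0);  /* a free with no matching tracked_malloc */
        live_blocks--;
        free(p);
    }
}

/* Report outstanding blocks at run time; returns how many are still live. */
size_t tracked_report(void)
{
    if (live_blocks != 0)
        fprintf(stderr, "warning: %zu block(s) never freed\n", live_blocks);
    return live_blocks;
}
```

Calling tracked_report() just before main returns would flag the example from the question, but only when the program actually runs, which is the point both answers make.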

Related

memcpy behaves differently with optimization flags compared to without

Consider this demo programme:
#include <string.h>
#include <unistd.h>
typedef struct {
    int a;
    int b;
    int c;
} mystruct;
int main() {
    int TOO_BIG = getpagesize();
    int SIZE = sizeof(mystruct);
    mystruct foo = {
        123, 323, 232
    };
    mystruct bar;
    memset(&bar, 0, SIZE);
    memcpy(&bar, &foo, TOO_BIG);
}
I compile this two ways:
gcc -O2 -o buffer -Wall buffer.c
gcc -g -o buffer_debug -Wall buffer.c
i.e. the first time with optimizations enabled, the second time with debug flags and no optimization.
The first thing to notice is that there are no warnings when compiling, despite getpagesize returning a value that will cause buffer overflow with memcpy.
Secondly, running the first programme produces:
*** buffer overflow detected ***: terminated
Aborted (core dumped)
whereas the second produces
*** stack smashing detected ***: terminated
Aborted (core dumped)
or, and you'll have to believe me here since I can't reproduce this with the demo programme, sometimes no warning at all. The programme doesn't even interrupt, it runs as normal. This was a behaviour I encountered with some more complex code, which made it difficult to debug until I realised that there was a buffer overflow happening.
My question is: why are there two different behaviours with different build flags? And why does this sometimes execute with no errors when built as a debug build, but always errors when built with optimizations?
..I can't reproduce this with the demo program, sometimes no warning at all...
The rules on undefined behavior are very broad; there is no requirement for the compiler to issue any warning for a program that exhibits it.
why are there two different behaviours with different build flags? And why does this sometimes execute with no errors when built as a debug build, but always errors when built with optimizations?
Compilers tend to optimize away unused variables. If I compile your code with optimizations enabled, I don't get a segmentation fault: looking at the generated assembly, you'll notice that the problematic variables are optimized away and memcpy is never called, so the program runs to completion and exits with success code 0. If I don't optimize, the undefined behavior manifests itself and the program exits with code 139, the classic segmentation fault exit code.
As you can see these results are different from yours and that is one of the features of undefined behavior, different compilers, systems or even compiler versions can behave in a completely different way.
Accessing memory beyond what has been allocated is undefined behavior, which means the compiler is allowed to do anything. When there are no optimizations, the compiler may try to guess and do something reasonable. When optimizations are turned on, the compiler may take advantage of the fact that any behavior is allowed in order to do something that runs faster.
The first thing to notice is that there are no warnings when compiling, despite getpagesize returning a value that will cause buffer overflow with memcpy.
That is the programmer's responsibility to fix, not the compiler. You'll be very lucky if a compiler manages to find potential buffer overflows for you. Its job is to check that your code is valid C then translate it to machine code.
If you want a tool that catches bugs, those are called static analysers and they are a different type of program. To some extent, static analysis may be integrated into a compiler as a feature. There is one for clang, but most static analysers are commercial tools and not open source.
Secondly, running the first programme produces: ... whereas the second produces
Undefined behavior simply means there is no defined behavior (see "What is undefined behavior and how does it work?"). That means there is not likely to be anything to learn from examining the results, no interesting mystery to solve. In one case it apparently accessed forbidden memory, in the other case it mangled a poor little "stack canary". The difference will be related to different memory layouts. Who cares, bugs are bugs. Focus on why the bug happened (you already know!) instead of trying to make sense of the undefined results.
Now when I run your code with optimizations actually enabled for real (gcc -O2 on an x86 Linux), the compiler gives me
main:
subq $8, %rsp
call getpagesize
xorl %eax, %eax
addq $8, %rsp
ret
With optimizations actually enabled, it didn't even bother calling memcpy & friends because there are no side effects and the variables aren't used, so they can be safely removed from the executable.
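Since the fix is the programmer's job, one way to guard a copy like the one in the demo is to pass the destination size along and refuse oversized requests. This is only a sketch: copy_bounded is a made-up helper, not a standard function, and it assumes the caller already knows that the source really holds n bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a standard function): copy n bytes into a
 * destination of dst_size bytes, refusing instead of overflowing.
 * Returns 0 on success, -1 if the copy would not fit. */
int copy_bounded(void *dst, size_t dst_size, const void *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    if (n > dst_size)
        return -1;          /* would write past the end of dst */
    memcpy(dst, src, n);
    return 0;
}
```

With the demo's variables, copy_bounded(&bar, sizeof bar, &foo, TOO_BIG) returns -1 and leaves bar untouched, while passing sizeof foo as the length copies the struct as intended.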

warning for not using free() malloc

Are there any safeguards built into GCC that check for memory leaks? If so how can I use them? When I compile with "gcc -Wall -o run run.c", the compiler does not seem to care if any allocated heap-space is being freed at the end of the code. I could not find any simple fixes for this on Google.
Thanks much for your time.
EDIT:
Google searches did point to Valgrind among other tools. But I was curious as to why the compiler can't deal with this issue. As a newbie, it seemed a simple enough task to check whether every "malloc" has a "free" associated with it.
There are two ways to analyze code for problems - static analysis and run-time analysis. Static analysis reads the code - this is what compilers do really well. Run-time analysis for code problems happens when the code is linked against another set of libraries that see what the code actually does as it runs under surveillance. Finding memory leaks is difficult for static analysis but not for a run-time analysis package.
Other run-time analyses are things like code coverage: do all parts of your code run? gcov does this, much as valgrind and Electric Fence look for memory problems like leaks.
So, no, there are no really good compiler safeguards for testing memory leaks.
There is the -fsanitize=leak GCC flag.
It overrides malloc/calloc/free so that they keep track of allocated and freed blocks of memory.
If your program is compiled with this flag, it prints information about detected leaks to the terminal after execution.
The GCC manual documents it. I have never used it myself, so this answer is based entirely on that manual.

Mudflap and pointer arrays

I've just implemented a pretty complicated piece of software, but my school's testing system won't take it.
The system uses the so-called mudflap library which should be able to prevent illegal memory accesses better. As a consequence, my program generates segfaults when run on the school's testing system (I submit the source code and the testing system compiles it for itself, using the mudflap library).
I tried to isolate the problematic code in my program, and it seems that it all boils down to something as simple as pointer arrays. Mudflap doesn't seem to like them.
Below is a very simple piece of code that works with a pointer array:
#include <stdlib.h>
#include <string.h>
int main()
{
    char** rows;
    rows=(char**)malloc(sizeof(char*)*3);
    rows[0]=(char*)malloc(sizeof(char)*4);
    rows[1]=(char*)malloc(sizeof(char)*4);
    rows[2]=(char*)malloc(sizeof(char)*4);
    strcpy(rows[0], "abc");
    strcpy(rows[1], "abc");
    strcpy(rows[2], "abc");
    free(rows[0]); free(rows[1]); free(rows[2]);
    free(rows);
    return 0;
}
This will generate a segfault with mudflap. In my opinion, this is perfectly legal code.
Could you please explain to me what is wrong with it, and why it generates a segfault with mudflap?
Note: The program should be compiled under an amd64 linux system with g++ using the following commands:
export MUDFLAP_OPTIONS='-viol-segv -print-leaks';
g++ -Wall -pedantic -fmudflap -fmudflapir -lmudflap -g file.cpp
You have at least one problem here:
char** rows;
rows=(char**)malloc(3);
This allocates 3 bytes. On most platforms the allocator probably has a minimum of at least 4 bytes which lets you get away with overwriting the buffer a bit. I'm guessing your mudflap library is more strict in its checking and catches the overwrite.
However, if you want an array of 3 char * pointers, you need at least 12 bytes with 4-byte pointers, and 24 bytes on a 64-bit system such as amd64.
Try changing these lines to:
char** rows;
rows=(char**)malloc(3 * sizeof(char *));
EDIT: Based on your modified code, I agree it looks correct now. The only thing I can suggest is that perhaps malloc() is failing and causing a NULL pointer access. If that's not the case, it sounds like a bug or misconfiguration of mudflap.
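To rule out the malloc() failure theory, every allocation can be checked before it is used. The following is a minimal sketch in plain C; make_rows and free_rows are made-up helper names, not anything mudflap or the C library provides.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper names, for illustration only. */

/* Free the first n rows and then the array itself. */
void free_rows(char **rows, size_t n)
{
    if (rows == NULL)
        return;
    for (size_t i = 0; i < n; i++)
        free(rows[i]);
    free(rows);
}

/* Allocate n strings, each a copy of text; returns NULL on any failure. */
char **make_rows(size_t n, const char *text)
{
    assert(text != NULL);
    char **rows = calloc(n, sizeof *rows);   /* room for n pointers */
    if (rows == NULL)
        return NULL;
    for (size_t i = 0; i < n; i++) {
        rows[i] = malloc(strlen(text) + 1);  /* +1 for the terminator */
        if (rows[i] == NULL) {
            free_rows(rows, i);              /* release rows 0..i-1 */
            return NULL;
        }
        strcpy(rows[i], text);
    }
    return rows;
}
```

If make_rows(3, "abc") returns a non-NULL pointer and the program still faults under mudflap, an allocation failure is not the cause.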

How can I do automatic memory management in C?

In C, memory allocation/deallocation is done by malloc and free.
In C++, memory allocation/deallocation is done by new and delete.
There are some solutions in C++ for automatic memory management like:
Smart Pointers.
RAII (Resource Acquisition Is Initialization)
Reference counting and cyclic references
...
But how can I do automatic memory management in C?
Are there any solutions for AUTOMATIC memory management in C?
Are there any guidelines or something like that for C?
When I forget to free a block of memory, I want either:
My code doesn't compile
-- or --
Memory automatically deallocated
And then I say: Oh, C is better than C++, Java and C#. :-)
You may use a Boehm garbage collector library.
As answered by Juraj Blaho, you can use a garbage collection library, such as the Boehm conservative garbage collector, but there are others: Ravenbrook's memory pool system, my (unmaintained) Qish GC, Matthew Plant's GC, etc.
And often, you can write your own garbage collector specialized for your use case. You could use in C the techniques mentioned in your question (smart pointers, reference counting), but you can also implement a mark & sweep GC, or a copying GC.
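Reference counting, for instance, can be done by hand in C. Below is a minimal sketch under some loud assumptions: the type rc_buf and the functions rc_new, rc_retain and rc_release are invented for this example, the code is not thread-safe, and it does nothing about cyclic references.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type and function names, for illustration only. */
typedef struct {
    size_t refs;           /* number of owners still holding this buffer */
    size_t size;           /* payload size in bytes */
    unsigned char data[];  /* payload (C99 flexible array member) */
} rc_buf;

/* Create a zero-filled buffer owned by one reference; NULL on failure. */
rc_buf *rc_new(size_t size)
{
    if (size > SIZE_MAX - sizeof(rc_buf))
        return NULL;       /* header + payload would overflow size_t */
    rc_buf *b = calloc(1, sizeof *b + size);
    if (b != NULL) {
        b->refs = 1;
        b->size = size;
    }
    return b;
}

/* Take an additional reference. */
rc_buf *rc_retain(rc_buf *b)
{
    assert(b != NULL && b->refs > 0);
    b->refs++;
    return b;
}

/* Drop one reference; frees the buffer when the last one goes away.
 * Returns the number of references remaining. */
size_t rc_release(rc_buf *b)
{
    assert(b != NULL && b->refs > 0);
    size_t left = --b->refs;
    if (left == 0)
        free(b);
    return left;
}
```

Unlike a smart pointer in C++, nothing calls rc_release for you, so a forgotten release still leaks. That is why a tracing collector such as Boehm's is usually the more "automatic" option.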
An important issue when coding your GC is to keep track of local pointer variables (to garbage collected data). You could keep them in local struct and chain them together.
I strongly suggest reading more about GC, e.g. the GC handbook. The algorithms there are useful in many situations.
You could even customize your GCC compiler (e.g. using MELT) to add checks or to generate code (e.g. code to scan local variables) for your particular GC implementation. Or you could use some pre-processor (e.g. GPP) for that.
In practice, Boehm's GC is often good enough.
Notice that liveness of some data is a whole-program property, so it is better to think about GC very early in the design phase of your software development.
Notice also that reliably detecting memory leaks by static source code analysis is in general impossible (undecidable), since it can be proven equivalent to the halting problem.
For Linux, I use valgrind. Sure, the original reason valgrind was built was to debug your code, but it does a lot more. It will even tell you where potentially erroneous code could be, in a non-invasive way. My own command line of choice is as follows.
# Install valgrind. Remove this line of code if you already have it installed
apt install valgrind
# Now, compile and valgrind the C
gcc main.c -Werror -fshort-enums -std=gnu11 -Og -g3 -dg -gdwarf-2 -rdynamic -o main
valgrind --quiet --leak-check=yes --tool=memcheck ./main
Hope this helps. ~ Happy Coding!

How do I know which illegal address the program access when a segmentation fault happens

Plus, the program runs on an ARM device running Linux; I can print out stack info and register values in the SIGSEGV handler I install.
The problem is that I can't add the -g option when compiling, since the bug may not reproduce due to the performance degradation.
Compiling with the -g option to gcc does not cause a "performance downgrade". All it does is cause debugging symbols to be included; it does not affect the optimisation or code generation.
If you install your SIGSEGV handler using the sa_sigaction member of the sigaction struct passed to sigaction(), then the si_addr member of the siginfo_t structure passed to your handler contains the faulting address.
I tend to use valgrind which indicates leaks and memory access faults.
This seems to work
http://tlug.up.ac.za/wiki/index.php/Obtaining_a_stack_trace_in_C_upon_SIGSEGV
static void signal_segv(int signum, siginfo_t* info, void*ptr) {
// info->si_addr is the illegal address
}
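To make the handler above concrete, here is a fuller sketch of installing it with sigaction() and printing si_addr. The function names (format_hex, put, segv_handler, install_segv_handler) are made up for this example. The address is formatted by hand because printf() is not async-signal-safe, and the handler ends the process with _exit() rather than returning.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical names throughout, for illustration only. */

/* Write v as 0x-prefixed hex into buf, which must hold at least
 * 2 + 2 * sizeof(uintptr_t) + 1 bytes. Returns the string length. */
size_t format_hex(uintptr_t v, char *buf)
{
    static const char digits[] = "0123456789abcdef";
    char tmp[2 * sizeof v];
    size_t n = 0, len = 0;
    assert(buf != NULL);
    do {                        /* collect digits, least significant first */
        tmp[n++] = digits[v & 0xf];
        v >>= 4;
    } while (v != 0);
    buf[len++] = '0';
    buf[len++] = 'x';
    while (n > 0)               /* emit them most significant first */
        buf[len++] = tmp[--n];
    buf[len] = '\0';
    return len;
}

/* write() to stderr, deliberately ignoring the result. */
static void put(const char *s, size_t n)
{
    ssize_t r = write(STDERR_FILENO, s, n);
    (void)r;
}

/* SA_SIGINFO handler: info->si_addr is the address that faulted. */
void segv_handler(int signum, siginfo_t *info, void *context)
{
    static const char msg[] = "SIGSEGV at address ";
    char buf[2 + 2 * sizeof(uintptr_t) + 1];
    (void)signum;
    (void)context;
    size_t len = format_hex((uintptr_t)info->si_addr, buf);
    put(msg, sizeof msg - 1);
    put(buf, len);
    put("\n", 1);
    _exit(1);   /* returning would re-run the faulting instruction */
}

/* Install the handler; returns 0 on success, -1 on error (like sigaction). */
int install_segv_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = segv_handler;
    sa.sa_flags = SA_SIGINFO;   /* request the three-argument handler form */
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Call install_segv_handler() early in main. The register values mentioned in the question would come from the third handler argument, which this sketch ignores.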
If you are worried about using -g on the binary that you load on the device, you may be able to use gdbserver on the ARM device with a stripped version of the executable and run arm-gdb on your development machine with the unstripped version of the executable. The stripped version and the unstripped version need to match up to do this, so do this:
# You may add your own optimization flags
arm-gcc -g program.c -o program.debug
arm-strip --strip-debug program.debug -o program
# or
arm-strip --strip-unneeded program.debug -o program
You'll need to read the gdb and gdbserver documentation to figure out how to use them. It's not that difficult, but it isn't as polished as it could be. Mainly it's very easy to accidentally tell gdb to do something that it ends up thinking you meant to do locally, so it will switch out of remote debugging mode.
You may also want to use the backtrace() function if available; it provides the call stack at the time of the crash. This can be used to dump the stack, as happens in a high-level programming language, when a C program gets a segmentation fault, bus error, or other memory violation error.
backtrace() is available on both Linux and Mac OS X.
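A minimal sketch of dumping the stack with backtrace(), assuming a C library that ships <execinfo.h> (glibc does; some other Linux C libraries do not). dump_stack and MAX_FRAMES are names made up for this example.

```c
#include <assert.h>
#include <execinfo.h>   /* backtrace(); not part of standard C */
#include <unistd.h>

#define MAX_FRAMES 64   /* arbitrary limit chosen for this sketch */

/* Hypothetical helper: dump the current call stack to stderr.
 * Returns the number of frames captured. */
int dump_stack(void)
{
    void *frames[MAX_FRAMES];
    int n = backtrace(frames, MAX_FRAMES);
    assert(n <= MAX_FRAMES);
    /* backtrace_symbols_fd() writes straight to the file descriptor and
     * does not call malloc(), unlike backtrace_symbols(). */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);
    return n;
}
```

With glibc, function names from the executable itself generally only appear if it is linked with -rdynamic; otherwise you get raw addresses that can be resolved against the unstripped binary afterwards.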
If the -g option makes the error disappear, then knowing where it crashes is unlikely to be useful anyway. It's probably writing to an uninitialized pointer in function A, and then function B tries to legitimately use that memory, and dies. Memory errors are a pain.
