I'm working on an embedded program. I use the avr-gcc toolchain to compile the C source on my MacBook Pro. Until recently things have been going pretty well. In my latest development iteration, though, I seem to have introduced an intermittent bug that I suspect is some sort of stack or other memory corruption error.
I've never used Valgrind. It seems to get rave reviews, but most of the references seem to refer to malloc/free types of errors. I don't do any malloc'ing. It's a smallish embedded program, no OS. Can Valgrind help me? Any pointers on how I would use it to help find static memory mismanagement errors in a cross-compiled scenario would be really helpful!
Or is there a different tool or technique I should look at to validate my code's memory management?
Yes, valgrind can definitely help you. In addition to a lot of heap-based analysis (illegal frees, memory leaks, etc.) its memcheck tool detects illegal reads and writes, i.e. situations when your program accesses memory that it should not access. This analysis is not limited to the heap: it also reports accesses outside the valid part of the stack. Be aware, though, that memcheck generally cannot detect an overrun of a statically allocated or stack array when the access lands in other valid memory. It also detects uses of variables that have not been initialized previously. Both situations are undefined behavior and can lead to a crash.
Frama-C is a static analysis framework (as opposed to Valgrind, which provides dynamic analysis). It was originally designed with embedded, possibly low-level code in mind. Frama-C's “value analysis” plug-in basically detects all C undefined behaviors that you may want to know about in embedded code (including accessing an invalid pointer).
Since it is a static analyzer, it does not execute the code(*) and is thus ideal in a cross-compiled context. Look for option -machdep. Values for this option include x86_64 x86_32 ppc_32 x86_16.
Disclaimer: I am one of the contributors of Frama-C's “value analysis” plug-in.
(*) though if you provide all inputs and set precision on maximum, it can interpret the source code as precisely as any cross-compilation+execution would.
Consider the following two lines of C:
int a[1] = {0};
a[1] = 0;
The second line makes a write access somewhere in the memory where it should not. Sometimes such programs will give a segfault during the execution, and sometimes not, depending on the environment I suppose, and maybe other things.
I wonder if there is a way to force, as much as possible, such programs to segfault (by compiling them in a special way for instance, or execute them in some virtual machine, I don't know).
This is for pedagogic purpose.
According to the C language standard these kinds of accesses are undefined behaviour and the compiler and runtime are not obliged to make them segfault (though they obviously do sometimes).
For pedagogical purposes you can have a look at the address sanitizers in popular compilers like GCC (-fsanitize=address in https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html) and Clang (https://clang.llvm.org/docs/AddressSanitizer.html).
In simple terms these options cause the compiler to instrument memory accesses with extra logic to catch out-of-bounds memory accesses and produce a user-visible error (though not exactly a segfault message), allowing users to spot such errors and fix them.
This might be what you are looking for.
Valgrind on Linux, stack guards for most compilers, debug options for your selected runtime (e.g. Application Verifier on Windows): there are plenty of options.
In your example the overflow is on the stack, which will always require the compiler to emit the appropriate guards. For dynamic memory allocations it's either up to the used C/C++ runtime library or a custom wrapper inside your application to catch this.
Tools like valgrind catch heap-based buffer overflows as they happen, because they actually execute the code in a VM.
Compiler assisted options work with canaries which are placed in front and back of the buffer, and which are typically checked again when the buffer is released. Options from the address sanitizer family may also add additional checks to all accesses on fields of a fixed size, but this won't work if raw pointers are involved.
Debug options for the runtime typically only provide a very rough granularity. Often they work by simply placing each allocation in a dedicated page in a non-contiguous address space. Accessing the gaps in between the pages is then an instant error. However, minor buffer overflows that stay within the allocation's own page are typically not detected immediately.
Finally there is also static code analysis which all modern compilers support to some extent, which can easily detect at least trivial mistakes like the one in your example.
None of these options is able to catch all possible errors though. The C language gives you plenty of options to achieve undefined behavior which none of these tools can detect.
I know this might be a bit vague and far-fetched (sorry, stackoverflow police!).
Is there a way, without external forces, to instrument (track, basically) each pointer access and track reads and writes, either general reads/writes or quantity of reads/writes per access? Bonus if it can be done for all variables and differentiate between stack and heap ones.
Is there a way to wrap pointers in general or should this be done via custom heap? Even with custom heap I can't think of a way.
Ultimately I'd like to see a visual representation of said logs that would show me variables represented as blocks (of bytes or multiples of) and heatmap over them for reads and writes.
Ultra simple example:
int i = 5;
int *j = &i;
printf("%d", *j); /* Log would write: *j was accessed for read, and sizeof(int) bytes were read */
Attempt of rephrasing in more concise manner:
(How) can I intercept (and log) access to a pointer in C without external instrumentation of binary? - bonus if I can distinguish between read and write and get name of the pointer and size of read/write in bytes.
I guess (or hope for you) that you are developing on Linux/x86-64 with a recent GCC (5.2 in October 2015) or perhaps Clang/LLVM compiler (3.7).
I also guess that you are tracking a naughty bug, and not asking this (too broad) question from a purely theoretical point of view.
(Notice that practically there is no simple answer to your question, because in practice C compilers produce machine code close to the hardware, and most hardware does not have sophisticated instrumentation like the one you dream of.)
Of course, compile with all warnings and debug info (gcc -Wall -Wextra -g). Use the debugger (gdb), notably its watchpoint facilities which are related to your issue. Use also valgrind.
Notice also that GDB (recent versions like 7.10) is scriptable in Python (or Guile), and you could code some scripts for GDB to assist you.
Notice also that recent GCC & Clang/LLVM have several sanitizers. Use some of the -fsanitize= debugging options, notably the address sanitizer with -fsanitize=address; they instrument the code to help detect invalid pointer accesses, so they sort of do what you want. Of course, the instrumented code runs slower (depending on the sanitizer, the slowdown can be 10 or 20%, or a factor of 50x).
Finally, you might even consider adding your own instrumentation by customizing your compiler, e.g. with MELT, a high-level domain-specific language designed for such customization tasks for GCC. This would take months of work, unless you are already familiar with GCC internals (then, only several weeks). You could add an "optimization" pass inside GCC which would instrument (by changing the Gimple code) whatever accesses or stores you want.
Read more about aspect-oriented programming.
Notice also that if your C code is generated, that is if you are meta-programming, then changing the C code generator might be very relevant. Read more about reflection and homoiconicity. Dynamic software updating is also related to your issues.
Look also into profiling tools like oprofile and into sound static source analyzers like Frama-C.
You could also run your program inside some (instrumenting) emulator (like Qemu, Unisim, etc...).
You might also compile for a fictitious architecture like MMIX and instrument its emulator.
I am debugging a program which links against a commercial API library (under Linux). I am using valgrind memcheck, because I am experiencing strange behavior which could be due to writes beyond allocated blocks of memory:
valgrind --tool=memcheck --error-limit=no --log-file=memcheck.log ./executable
The first thing which jumps to my eye, however, are many errors of the types
Use of uninitialised value of size (4/8/16)
Invalid read of size (4/8/16)
Conditional jump or move depends on uninitialised value(s)
all of which appear in the library. Some, but not all, of these occur in __intel_sse2_strcpy or __intel_sse2_strlen. Furthermore, according to valgrind there are definite memory leaks.
The errors also appear when I compile one of the examples that ship with the library, so they are not my programming errors. Furthermore, they consistently occur with different versions of the library. Since the library is closed-source, I cannot determine whether the errors are fatal or not.
Practically this makes it hard for me to identify my own potential errors. I am a bit surprised to see so many warnings because I tend to fix my own programs until memcheck does not print these anymore (before I give it away at least). The question is: Can I consider such errors safe to ignore, do they commonly appear in packaged software, or are they likely even false positives (for instance because the library was compiled with optimizations)?
I would say:
No, you can't consider them safe to ignore. Valgrind is good.
Yes, they can be pretty common: if the original developers have never used Valgrind or a similar tool on their code, it's reasonable to expect some hits.
I don't think they are false positives; such are rare.
Quoting an answer from here which might explain the false positives encountered in string operations:
https://www.intel.com/content/www/us/en/developer/articles/troubleshooting/false-positive-diagnostic-on-string-operations-reported-by-intel-inspector.html
''' there are certain string operations that use vector(SIMD) instructions to calculate the string length. They read a string pointer in 32 byte chunks and check for a NULL character in each chunk that it reads. If the string size is not a multiple of 32, then it reads garbage in the memory region after the NULL '''
I have some software that I have working on a redhat system with icc and it is working fine. When I ported the code to an IRIX system running with MIPS then I get some calculations that come out as "nan" when there should definitely be values there.
I don't have any good debuggers on the non-redhat system, but I have tracked down that some of my arrays are getting "nan" sporadically in them and that is causing my dot product calculation to come back as "nan."
Seeing as how I can't track it down with a debugger, I am thinking that the problem may be with a memcpy. Are there any issues with the MIPS compiler memcpy() function with dynamically allocated arrays? I am basically using
memcpy(to, from, n*sizeof(double));
And I can't really prove it, but I think this may be the issue. Is there some workaround? Perhaps some data is misaligned? How do I fix that?
I'd be surprised if your problem came from a bug in memcpy. It may be an alignment issue: are your doubles sufficiently aligned? (They will be if you only store them in double or double[] objects or through double* pointers but might not be if you move them around via void* pointers). X86 platforms are more tolerant to misalignment than most.
Did you try compiling your code with gcc at a high warning level? (Gcc is available just about everywhere that's not a microcontroller or mainframe. It may produce slower code but better diagnostics than the “native” compiler.)
Of course, it could always be a buffer overflow or other memory management problem in some unrelated part of the code that just happened not to cause any visible bug on your original platform.
If you can't get access to a good debugger, try at least printf'ing stuff in key places.
Is it possible for the memory regions to and from to overlap? memcpy isn't required to handle overlapping memory regions. If this is your problem then the solution is as simple as using memmove instead.
Is sizeof() definitely supported?
What are some techniques in detecting/debugging memory leak if you don't have trace tools?
Intercept all functions that allocate and deallocate memory (depending on the platform, the list may look like: malloc, calloc, realloc, strdup, getcwd, free), and in addition to performing what these functions originally do, save information about the calls somewhere, in a dynamically growing global array probably, protected by synchronization primitives for multithreaded programs.
This information may include function name, amount of memory requested, address of the successfully allocated block, stack trace that lets you figure out what the caller was, and so on. In free(), remove the corresponding element from the array (if there is none, a wrong pointer was passed to free, which is also an error that's good to detect early). When the program ends, dump the remaining elements of the array: they will be the blocks that leaked. Don't forget about global objects that allocate and deallocate resources before and after main(), respectively. To properly count those resources, you will need to dump the remaining resources after the last global object gets destroyed, so a small hack of your compiler runtime may be necessary.
Check out your loops
Look at where you are allocating variables - do you ever de-allocate them?
Try and reproduce the leak with a small subset of suspected code.
MAKE trace tools - you can always log to a file.
One possibility could be to compile the code and execute it on a system where you can take advantage of built in tools (e.g. libumem on Solaris, or the libc capability on Linux)
Divide and conquer is the best approach. If you have written your code in a systematic way, it should be pretty easy to call subsets of your code. Your best bet is to execute each section of code over and over and see if your memory usage steadily climbs; if not, move on to the next section of code.
Also, the Wikipedia article on memory leaks has several great links in the references section on detecting memory leaks for different systems (Windows, macOS, Linux, etc.).
Similar questions on SO:
Memory leak detectors for C
Strategies For Tracking Down Memory Leaks When You’ve Done Everything Wrong
In addition to the manual inspection techniques mentioned by others, you should consider a code analysis tool such as valgrind.
Introduction from their site:
Valgrind is an award-winning instrumentation framework for building dynamic analysis tools. There are Valgrind tools that can automatically detect many memory management and threading bugs, and profile your programs in detail. You can also use Valgrind to build new tools.
The Valgrind distribution currently includes six production-quality tools: a memory error detector, two thread error detectors, a cache and branch-prediction profiler, a call-graph generating cache profiler, and a heap profiler. It also includes two experimental tools: a heap/stack/global array overrun detector, and a SimPoint basic block vector generator. It runs on the following platforms: X86/Linux, AMD64/Linux, PPC32/Linux, PPC64/Linux, and X86/Darwin (Mac OS X).
I have used memtrace
http://www.fpx.de/fp/Software/MemTrace/
http://sourceforge.net/projects/memtrace/
You may need to call the statistics function to print out whether there are any leaks. The best approach is to call this statistics function before and after a module or piece of code gets executed.
* Warning * Memtrace is kind enough to allow memory overwrite/double free. It detects these anomalies and gracefully avoids any crash.