How to Find Stack Overflow in Microcontrollers using C in Run Time - c

I am creating the application on STM32 microcontroller. I am using some library stack. Above that stack, I am using my application. I have two questions:
How can I detect and handle the stack over flow in runtime. Because I don't know how much memory that library is using.
How can I detect and handle the stack over flow in runtime if I am developing the code from the scratch. I read some where we have to maintain some count for each declaration. Is this correct way or any standard way to find it.

if you are limited to your device and no "high sophisticated" tools available, you could at least try "the old way". A simple stack guard may help. Somewhere in your code (depends on the tools you use), there must be the definition of the stack area. Something similar to:
.equ stacksize, 1024
stack: .space stacksize,0
(gnu as syntax, your's might be different)
with your device's startup code somewhere initializing the stack register to the top address of the stack area.
A stack guard would then just add a "magic number" to the stack top and bottom:
.equ stackmagic,0xaffeaffe
.equ stacksize, 1024
stacktop: .int stackmagic
stack: .space stacksize,0
stackbottom: .int stackmagic
with some code at least periodically checking (e.g. in a timer interrupt routine or - if available - in your debugger) if the stackmagic values are still there.

If your software is not tiny, I would first try to debug most of it on your laptop or desktop or tablet (perhaps running Linux, because it has good tools and very standard compliant compilers, similar to the cross-compiler you are using). Then you can profit from tools like valgrind or GCC compilation options like -Wall -Wextra -g -fsanitize=address etc....
You might store an approximation of the top of stack at start of your main function (e.g. by doing extern int* start_top_of_stack; then int i=0; start_top_of_stack= &i; near beginning of your main function. You could then have some local int j=0; in several functions and check at start of them that &j - start_top_of_stack is not too big.
But there is no silver bullet. I am just suggesting a trick.
If your application is critical to the point of accepting costly development efforts, you could use some formal method & source static program analysis tools (e.g. Frama-C, or make your own using MELT). If you are cross-compiling with a recent GCC you might want to use -Wstack-usage=some length and/or -fstack-usage to check that every call frame is not too big, or to compute manually the required stack depth.

Related

How to instrument/profile memory(heap, pointers) reads and writes in C?

I know this might be a bit vague and far-fetched (sorry, stackoverflow police!).
Is there a way, without external forces, to instrument (track basically) each pointer access and track reads and writes - either general reads/writes or quantity of reads/writes per access. Bonus if it can be done for all variables and differentiate between stack and heap ones.
Is there a way to wrap pointers in general or should this be done via custom heap? Even with custom heap I can't think of a way.
Ultimately I'd like to see a visual representation of said logs that would show me variables represented as blocks (of bytes or multiples of) and heatmap over them for reads and writes.
Ultra simple example:
int i = 5;
int *j = &i;
printf("%d", *j); /* Log would write *j was accessed for read and read sizeof(int) bytes
Attempt of rephrasing in more concise manner:
(How) can I intercept (and log) access to a pointer in C without external instrumentation of binary? - bonus if I can distinguish between read and write and get name of the pointer and size of read/write in bytes.
I guess (or hope for you) that you are developing on Linux/x86-64 with a recent GCC (5.2 in october 2015) or perhaps Clang/LLVM compiler (3.7).
I also guess that you are tracking a naughty bug, and not asking this (too broad) question from a purely theoretical point of view.
(Notice that practically there is no simple answer to your question, because in practice C compilers produce machine code close to the hardware, and most hardware do not have sophisticated instrumentations like the one you dream of)
Of course, compile with all warnings and debug info (gcc -Wall -Wextra -g). Use the debugger (gdb), notably its watchpoint facilities which are related to your issue. Use also valgrind.
Notice also that GDB (recent versions like 7.10) is scriptable in Python (or Guile), and you could code some scripts for GDB to assist you.
Notice also that recent GCC & Clang/LLVM have several sanitizers. Use some of the -fsanitize= debugging options, notably the address sanitizer with -fsanitize=address; they are instrumenting the code to help in detecting pointer accesses, so they are sort-of doing what you want. Of course, the performance of the instrumented generated code is decreasing (depending on the sanitizer, can be 10 or 20% or a factor of 50x).
At last, you might even consider adding your own instrumentation by customizing your compiler, e.g. with MELT -a high level domain specific language designed for such customization tasks for GCC. This would take months of work, unless you are already familiar with GCC internals (then, only several weeks). You could add an "optimization" pass inside GCC which would instrument (by changing the Gimple code) whatever accesses or stores you want.
Read more about aspect-oriented programming.
Notice also that if your C code is generated, that is if you are meta-programming, then changing the C code generator might be very relevant. Read more about reflection and homoiconicity. Dynamic software updating is also related to your issues.
Look also into profiling tools like oprofile and into sound static source analyzers like Frama-C.
You could also run your program inside some (instrumenting) emulator (like Qemu, Unisim, etc...).
You might also compile for a fictitious architecture like MMIX and instrument its emulator.

How can I optimize GCC compilation for memory usage?

I am developing a library which should use as little memory as possible (I am not concerned about anything else, like the binary size, or speed optimizations).
Are there any GCC flags (or any other GCC-related options) I can use? Should I avoid some level of -O* optimization?
You library -or any code in idiomatic C- has several kinds of memory usage :
binary code size, and indeed -Os should optimize that
heap memory, using C dynamic allocation, that is malloc; you obviously should know how, and how much, heap memory is allocated (and later free-d). The actual memory consumption would depend upon your particular malloc implementation (e.g. many implementations, when calling malloc(25) could in fact consume 32 bytes), not on the compiler. BTW, you might design your library to use some memory pools or even implement your own allocator (above OS syscalls like mmap, or above malloc etc...)
local variables, that is the call frames on the call stack. This mostly depend upon your code (but an optimizing compiler, e.g. -Os or -O2 for gcc, would probably use more registers and perhaps slightly less stack when optimizing). You could pass -fstack-usage to gcc to ask it to give the size of every call frame and you might give -Wstack-usage=len to be warned when a call frame exceeds len bytes.
global or static variables. You should know how much memory they need (and you might use nm or some other binutils program to query them). BTW, declaring carefully some variables inside a function as static would lower the stack consumption (but you cannot do that for every variable or every function).
Notice also that in some limited cases, GCC is doing tail calls, and then the stack usage is lowered (since the stack frame of the caller is reused in the callee). (See also this old question).
You might also ask the compiler to pack some particular struct-s (beware, this could slowdown the performance significantly). You'll want to use some type attributes like __attribute__((packed)), etc... and perhaps also some variable attributes etc...
Perhaps you should read more about Garbage Collection, since GC techniques, concepts, and terminology might be relevant. See this answer.
If on Linux, the valgrind tool should be useful too... (and during the debugging phase the -fsanitize=address option of recent GCC).
You might perhaps also use some code generation options like -fstack-reuse= or -fshort-enums or -fpack-struct or -fstack-limit-symbol= or -fsplit-stack ; be very careful: some such options make your binary code incompatible with your existing C (and others!) libraries (then you might need to recompile all used libraries, including your libc, with the same code generation flags).
You probably should enable link-time optimizations by compiling and linking with -flto (in addition of other optimization flags like -Os).
You certainly should use a recent version of GCC. Notice that GCC 5.1 has been released a few days ago (in april 2015).
If your library is large enough to worth the effort, you might even consider customizing your GCC compiler with MELT (to help you find out how to spend less memory). This might take weeks or months of work.
there are advantages to using 'stack frames', but that does use more stack space to save the stack frame pointer.
You can tell the compiler to not use stack frames. This will (generally) slightly increase the code size but will reduce the amount of stack used.
you can only use char and short for values rather than int.
It is poor programing practice, but can re-use variables and arrays for multiple purposes.
if some set of variables are mutually exclusive on usage, then can place them in a union.
If the function parameter lists are all very short, then can for the compiler to pass all the parameters in registers. (having an architecture with lots of general purpose registers really helps here.
Only use one malloc that contains ALL the area needed for malloc kind of operations, so as to minimize the amount of allocated memory overhead.
there are many techniques. Most make the code much more difficult to debug/maintain and often make the code much harder for humans to read
When possible, you can use -m32 option to compile your application for 32-bit. So, the application will consume only half of the memory on 64-bit systems.
apt-get install libc6-dev-i386
gcc -m32 application.c -o application

C embedded systems stack and heap size

How could I determine the current stack and heap size of a running C program on an embedded system? Also, how could I discover the maximum stack and heap sizes that my embedded system will allow? I thought about linearly calling malloc() with an increasing size until it fails to find the heap size, however I am more interested in the size of the stack.
I am using an mbed NXP LPC1768, and I am using an offline compiler developed on GitHub called gcc4mbed.
Any better ideas? All help is greatly appreciated!
For this look at your linker script, this will define how much space you have allocated to each.
For stack size usage do this:
At startup (before C main()) during initialization of memory, init all your stack bytes to known values such as 0xAA, or 0xCD. Run your program, at any point you can stop and see how many magic values you have left. If you don't see any magic values then you have overflowed your stack and weirdness may start to happen.
At runtime you can also check the last 4 bytes or so (maybe last two words, this is really up to you). If they don't match your magic value then force a reset. This only works if your system is well behaved on reset and it is best if it starts up quick and isn't doing something "real time" or mission critical.
Here's a really helpful whitepaper from IAR on the subject.
A crude way of measuring at runtime the current stack size is to declare
static void* mainsp;
then start your main with e.g:
int main(int argc, char**argv) {
int here;
mainsp = (void*) &here;
then inside some leaf routine, when the call stack is deep enough, do something similar to
int local;
printf ("stack size = %ld\n",
(long) ((intptr_t) &local - (intptr_t) mainsp));
Statically estimating from full source code of an application the required stack size is in general undecidable (think of recursion, function pointers), and in practice very difficult (even on a severely restricted class of applications). Look into Couverture. You might also consider customizing a recent GCC compiler with your plugin (perhaps Bismon in mid 2021; email me to basile.starynkevitch#cea.fr about it) for such purposes, but that won't be easy and will give you over-approximations.
If compiling with GCC, you might use the return address bultins to query the stack frame pointer at run time. On some architectures it is not available with some optimization flags. You could also use the -Wstack-usage=byte-size and/or -Wframe-larger-than=byte-size warning options to recent GCC.
As to how heap and stack spaces are distributed, this is system dependent. You might parse /proc/self/maps file on Linux. See proc(5). You could limit stack space on Linux in user-space using setrlimit(2).
Be however aware of Rice's theorem.
With multi-threaded applications things could be more difficult. Read some Pthread tutorial.
Notice that in simple cases, GCC may be capable of tail-call optimizations. You could compile your foo.c with gcc -Os -fverbose-asm -S foo.c and look inside the generated foo.s assembler code.
If you don't care about portability, consider also using the extended asm features of GCC.

Tools to show spills in a c code

Is there a tool to where I have spills in my c code?
I mean see what block of code potentially make a register move to memory.
EDIT: what is a spill:
In the process of compiling your code at some point you will have to do register allocation. The compiler will do an interference graph ( "variables" are nodes and they are connected if they are alive at the same time ). From this point there is a linear process that will do graph coloring: for each variable assign a register that wont interfere with other variables... If you don't have enough register to color the graph the algorithm will fail
and a variable(register) will be spilled ( moved to memory ).
From a software engineering point of view, this mean you should always minimize a variable live so you can minimize the chance of having a spill.
When you want to optimize code you should look for those kinds of things since a spill will give an extra time to read/write memory. I was looking for a tool or a compiler flag that could tell me where is spill so I can optimize.
I'm aware of no such tool.
Because decisions about spills vary from compiler to compiler, and version of the compiler and even by settings within a given version of a given compiler, any such tool would have to be tightly coupled to a compiler and would likely only support one.
On the other hand, you can always look at the generated assembly yourself and see if a given variable is spilled or not.
Generally either disassemble or compile to assembler instead of an object.
For specific compilers like gcc and llvm (where you have the source and can easily re-build the compiler), modify the compiler to print some sort of output to indicate how many times it had to spill, as you call it, to memory. Perhaps as you find the register allocation routine, you may find that the compiler already has such output. Personally I just disassemble or compile to assembler.
A generic assembler analysis tool is possible, but is it worth the effort? You would want to know where function/optimization boundaries are. You would want to distinguish volatile variables, or hardware registers where the write to ram was intentional. You could just look for stack based writes only. Or look for cases where there is a write to the stack that is not a push, where the register is destroyed on the next instruction. Actually it would be pretty easy to search for writes to a stack pointer relative address, with the next instruction destroying the register, with that stack based relative address being read back in a relatively nearby execution path where the stack frame has not been cleaned up in that execution path. Do I know of such a tool? Nope.

What kind of C is an operating system written in?

It makes sense that something like an operating system would be written in C. But how much of it, and what kind of C? I mean, in C, if you needed some heap memory, you would call malloc. But, does an OS even have a heap? As far as I know, malloc asks the operating system for memory and then adds it to a linked list, or binary tree, or something. What about a call stack? The OS is responsible for setting up all of this stuff that other applications use, but how does it do that? When you want to open or create a file in C, the appropriate functions ask the operating system for that file. so... What kind of C is on the other side of that call? Or on the other end of a memory allocation?
Also, how much of an operating system would actually be written in C? All of it? What about architecture dependent code? What about the higher levels of abstraction--does that ever get written in higher level languages, like C++?
I mean, I'm just asking this out of sheer curiosity. I'm downloading the latest linux kernel now but it's taking forever. I'm not sure if I'll wind up being able to follow the code--or if I'll be caught in an inescapably complex web of stuff I've never seen before.
Excellent questions, all. The answer is: little to none of the standard C library is available in the "dialect" of C used to write an operating system. In the Linux kernel, for example, the standard memory allocation functions malloc, nmalloc, free etc. are replaced with special kernel-internel memory allocation functions kmalloc and kfree, with special restrictions on their use. The operating system must provide its own "heap" -- in the Linux kernel, physical memory pages that have been allocated for kernel use must be non-pageable and often physically continguous. See This linux journal article on kmalloc and kfree. Similarly, the operating system kernel maintains its own special call stack, the use of which requires, from memory, special support from the GCC compiler.
Also, how much of an operating system would actually be written in C? All of
it?
As far as I'm aware, operating systems are overwhelmingly written in C. Some architecture-specific features are coded in assembler, but usually very little to improve portability and maintainability: the Linux kernel has some assembler but tries to minimize it as much as possible.
What about architecture dependent
code? What about the higher levels of
abstraction--does that ever get
written in higher level languages,
like C++?
Usually the kernel will be written in pure C, but sometimes the higher level frameworks and APIs are written in a higher level language. For example, the Cocoa framework/API on MacOS is written in Objective C, and the BeOS higher level APIs were written in C++. Much of Microsoft's .NET framework was written in C#, with the "Common Language Runtime" written in a mix of C++ and assembler. The QT widget set most often used on Linux is written in C++. Of course, this introduces philosophical questions about what counts as "the operating system."
The Linux kernel is definitely worth looking at for this, although, it must be said, it is huge and intimidating for anyone to read from scratch.
What kind of C?
Mostly ANSI C, with a lot of time looking at the machine code it generates.
But, does an OS even have a heap?
Malloc asks the operating system for a pointer to some memory it is allowed to use. If a program running on an OS (user mode) tries to access memory it doesn't own, it will give a segmentation fault. An OS is allowed to directly access all the physical memory on the system, malloc not needed, no seg-faults on any address that exists.
What about a call stack?
The call stack actually often works at the hardware level, with a link register.
For file access, the OS needs access to a disk driver, which needs to know how to read the file system that's on the disk (there are a lot of different kinds) Sometimes the OS has one built in, but I think it's more common that the boot loader hands it one to start with, and it loads another (bigger) one. The disk driver has access to the hardware IO of the physical disk, and builds from that.
C is a very low level language, and you can do a lot of things directly. Any of the C library methods (like malloc, printf, crlscr etc) need to be implemented first, to invoke them from C (Have a look at libc concepts for example). I'll give an example below.
Let us see how the C library methods are implemented under the hood. We'll go with a clrscr example. When you implement such methods, you'll access system devices directly. For ex, for clrscr (clearing the screen) we know that the video memory is resident at 0xB8000. Hence, to write to screen or to clear it, we start by assigning a pointer to that location.
In video.c
void clrscr()
{
unsigned char *vidmem = (unsigned char *)0xB8000;
const long size = 80*25;
long loop;
for (loop=0; loop<size; loop++) {
*vidmem++ = 0;
*vidmem++ = 0xF;
}
}
Let us write our mini kernel now. This will clear the screen when the control is handed over to our 'kernel' from the boot loader. In main.c
void main()
{
clrscr();
for(;;);
}
To compile our 'kernel', you might use gcc to compile it to a pure bin format.
gcc -ffreestanding -c main.c -o main.o
gcc -c video.c -o video.o
ld -e _main -Ttext 0x1000 -o kernel.o main.o video.o
ld -i -e _main -Ttext 0x1000 -o kernel.o main.o video.o
objcopy -R .note -R .comment -S -O binary kernel.o kernel.bin
If you noticed the ld parameters above, you see that we are specifying the default load location of your Kernel as 0x1000. Now, you need to create a boot loader. From your boot loader logic, you might want to pass control to your Kernel, like
jump 08h:01000h
You normally write your boot loader logic in Asm. Even before that, you may need to have a look at how a PC Boots - Click Here.
Better start with a tinier Operating system to explore. See this Roll Your Own OS Tutorial
http://www.acm.uiuc.edu/sigops/roll_your_own/
But how much of it, and what kind of C?
Some parts must be written in assembly
I mean, in C, if you needed some heap memory, you would call malloc. But, does an OS even have a heap? As far as I know, malloc asks the operating system for memory and then adds it to a linked list, or binary tree, or something.
Some OS's have a heap. At a lowest level, they are slabs of memory that are dolled out called pages. Your C library then partitions with its own scheme in a variable sized manner with malloc. You should learn about virtual memory which is a common memory scheme in modern OS's.
When you want to open or create a file in C, the appropriate functions ask the operating system for that file. so... What kind of C is on the other side of that call?
You call into assembly routines that query hardware with instructions like IN and OUT. With raw memory access sometimes you have regions of memory that are dedicated to communicating to and from hardware. This is called DMA.
I'm not sure if I'll wind up being able to follow the code--or if I'll be caught in an inescapably complex web of stuff I've never seen before.
Yes you will. You should pick up a book on hardware and OS's first.
I mean, in C, if you needed some heap memory, you would call malloc. But, does an OS even have a heap? As far as I know, malloc asks the operating system for memory and then adds it to a linked list, or binary tree, or something. What about a call stack?
A lot of what you say in your question is actually done by the runtime library in userspace.
All that OS needs to do is to load the program into memory and jump to it's entry point, most details after that can be done by the user space program. Heap and stack are just areas of the processes virtual memory. Stack is just a pointer register in the cpu.
Allocating physical memory is something that is done on the OS level. OS usually allocates fixed size pages, which are then mapped to a user space process.
You should read the Linux Device Drivers 3. It explains pretty well the internals of the linux kernel.
I wouldn't start reading the Linux kernel, It's too complicated for starters.
Osdev is an excellent place to start reading.
I have done a little os with information from Osdev for an school subject. It runs on vmware, bochs, and qemu so it's easy to test it. Here is the source code.
Traditionally, C is mostly needed for the kernel and device drivers due to interaction with hardware. However, languages such as C++ and Java could be used for the entire operating system
For more information, I've found Operating Systems Design and Implementation by Andrew Tannenbaum particularly useful with LOTS of code samples.
malloc and memory management functions aren't keywords in C. This is functions of standard OS libraries. I don't know the name of this standard (it is unlikely that it's POSIX standard - I haven't found any mention), but it's exists - you use malloc in C applications on most platforms.
If you want to know how Linux kernel works I advice this book http://oreilly.com/catalog/9780596005658/ . I think it's good explanation with some C code inserted :).

Resources