How to break up C-Functions without increasing stacksize - c

Working in embedded systems design, I am often confronted with legacy code in which somebody wrote an ISR that consists of a huge if/else jungle, sometimes spanning multiple screen lengths. Trying to be a good programmer, I refactor the function using the paradigms I learned, one of them being: "A function should do one thing only".
So I break the function down into multiple static sub-functions, which have descriptive names and encapsulate variables. But since I am working on an embedded device, I need to be considerate of stack size and the number of jumps (especially in an ISR that might get called often and might itself get interrupted by something else).
Of course, most (or even all) compilers can be forced to inline a function (as __attribute__((always_inline)) does with gcc). But even that can increase stack size if I have to pass parameters (they do not necessarily get optimized away), even if it is just a few bytes per parameter.
Now for my actual question: Is there a way not to increase stack size while breaking up functions in C?
To make my question clearer, here is an example of some code where I just shifted some of the code to inline functions.
Static stack usage is 144 without inline functions and 160 with inline functions.
#include <stdio.h>
#include <string.h>
int main(){
    char inputString[100]; // contents are not initialized in this example
    static char delimiterArray[] = {' ','+','-','/','*','='};
    for(int i = 0; i<sizeof(inputString); i++){
        char* inputChar = inputString + i;
        for(int j = 0; j<sizeof(delimiterArray);j++){
            if( *inputChar == delimiterArray[j]){
                printf("DELIMITER: %c",delimiterArray[j]);
            }
        }
        if(inputString[i] == '\0'){
            printf("\nNuberOfChars: %d\n",i);
            break;
        }
    }
    return 0;
}
With inline-functions:
#include <stdio.h>
#include <string.h>
static inline void checkForDelimiters(char* inputChar)__attribute__((always_inline));
static inline void decomposeString(char* inputString)__attribute__((always_inline));
int main(){
    char inputString[100]; // contents are not initialized in this example
    decomposeString(inputString);
    return 0;
}
static void checkForDelimiters(char* inputChar){
    static char delimiterArray[] = {' ','+','-','/','*','='};
    for(int j = 0; j<sizeof(delimiterArray);j++){
        if(*inputChar == delimiterArray[j]){
            printf("DELIMITER: %c",delimiterArray[j]);
        }
    }
}
static void decomposeString(char* inputString){
    // NOTE: sizeof(inputString) is the size of a pointer here, not 100;
    // pass the buffer length explicitly to scan the whole array.
    for(int i = 0; i<sizeof(inputString); i++){
        checkForDelimiters(inputString + i);
        if(inputString[i] == '\0'){
            printf("\nNuberOfChars: %d\n",i);
            return;
        }
    }
}


Optimization with interrupts or threads and global variables

I find that I have some difficulty with how to best write communication between functions that are out of the normal flow of code. A simple example is:
int a = 0;
volatile int v = 0;
void __attribute__((used)) interrupt() {
    /* handler body not shown in the source; it updates a and v */
}
int main() {
    while(1) {
        // asm("nop");
        if (v > 10 && a > 10)
            return 0;
    }
}
It is not surprising that the main while loop can keep the variable a in a register and thus never see any changes from the interrupt. If the variable is volatile, it is annoying that every time it is used it needs to be reread from or rewritten to memory. And with that technique, any communication variable shared across threads would need to be volatile. A synchronization primitive (or even the commented-out "nop") solves the problem because it seemingly has a side effect that creates a compiler barrier. But if I understand correctly, that would mean flushing the entire state of all the registers used in main, where maybe it is less harsh to just make a few variables volatile. I currently use both techniques, but I wish I had a more standard method for dealing with the issue. Can anyone comment on the best strategies here?
A link to some assembly
So you want a means of reducing the number of times a is looked up. The following reduces it to once a loop:
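volatile int a = 0;
volatile int v = 0;
void __attribute__((used)) interrupt() {
    /* handler body not shown in the source; it updates a and v */
}
int main() {
    while(1) {
        int b = --a; // a is read and written exactly once per iteration
        if (v > 10 && b > 10)
            return 0;
    }
}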
Nothing stops you from checking even less often similarly.
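A middle ground between making a fully volatile and relying on a full compiler barrier is to leave a as a plain int and force a real load only at the points where a fresh value matters. The sketch below assumes GCC or Clang, which treat an access through a volatile-qualified lvalue as a volatile access. LOAD_ONCE, fake_interrupt and poll_until_above are hypothetical names, and the "interrupt" is called synchronously so the sketch can be run on a host.

```c
/* Hypothetical macro: performs one real load of x through a
   volatile-qualified lvalue (assumes GCC/Clang behaviour). */
#define LOAD_ONCE(x) (*(volatile int *)&(x))

int a = 0;            /* plain global: ordinary accesses stay optimisable */
volatile int v = 0;

/* Stand-in for the interrupt handler. It is called synchronously here;
   on a real target it would fire asynchronously. */
void fake_interrupt(void) {
    a++;
    v++;
}

/* Polls until both counters exceed limit. Returns the number of polls
   needed, or -1 if max_polls is reached first. */
int poll_until_above(int limit, int max_polls) {
    for (int polls = 1; polls <= max_polls; polls++) {
        fake_interrupt();
        int b = LOAD_ONCE(a);   /* exactly one forced read of a per iteration */
        if (v > limit && b > limit)
            return polls;
    }
    return -1;
}
```

This only controls what the compiler does. It does not make the access atomic on targets where an int is wider than the bus, and it gives no ordering guarantees between different variables.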

Is it possible to make a loading animation in a Console Application using C?

I would like to know if it is possible to make a loading animation in a Console Application that would always appear in the same line, like a flashing dot or a more complex ASCII animation.
Perhaps like this
#include <stdio.h>
#include <time.h>
#define INTERVAL (0.1 * CLOCKS_PER_SEC) // tenth second
int main(void) {
    int i = 0;
    clock_t target;
    char spin[] = "\\|/-"; // '\' needs escape seq
    printf(" ");
    while(1) {
        printf("\b%c", spin[i]);
        fflush(stdout); // stdout is usually line buffered, so push the frame out
        i = (i + 1) % 4;
        target = clock() + (clock_t)INTERVAL;
        while (clock() < target); // busy-wait on CPU time until the next frame
    }
    return 0;
}
The more portable way would be to use termcap/terminfo or (n)curses.
If you send ANSI escape sequences you assume the terminal to be capable of interpreting them (and if it isn't it'll result in a big mess.)
It's essentially a system that describes the capabilities of the terminal (if there's one connected at all).
These days one tends to forget, but the original tty didn't have a way to remove ink from the paper it typed the output on ...
Termcap tutorials are easy enough to find on Google. Just one in the GNU flavor here: (old, but should still be good)
(n)curses is a library that allows you to control and build entire text-based user interfaces if you want to.
Yes it is.
One line
First, if you want the animation on a single line only, you can use putchar('\b') to step back over the last character, or putchar('\r') to return to the beginning of the line, and then rewrite it.
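#include <stdio.h>
#include <unistd.h>
int main() {
    int num;
    while (1) {
        for (num = 1; num <= 3; num++) {
            putchar('.'); fflush(stdout); sleep(1); // assumed: the drawing lines are missing in the source
        }
        printf("\r   \r"); // or printf("\b\b\b");
    }
    return 0;
}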
But if you want to place it at a specified line, you can clear and redraw every frame, or use libraries.
Clearing method
You can do this with system("clear") or with printf("\e[1;1H\e[2J") (\e is a GCC extension; \033 is the portable spelling of the escape character).
After that you'll need to redraw your frame. I don't recommend this method, because it is really unportable.
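If the terminal understands ANSI escape sequences, redrawing a single cell does not require clearing the whole screen: the cursor-position sequence ESC [ row ; col H moves the cursor first. A minimal sketch, assuming an ANSI-capable terminal; format_frame is a hypothetical helper that only builds the string, so the caller decides when to print and flush it.

```c
#include <stdio.h>

/* Hypothetical helper: writes "ESC[<row>;<col>H" followed by one spinner
   glyph into out. Rows and columns are 1-based, as in the ANSI sequence.
   Returns the value of snprintf (number of characters needed). */
int format_frame(char *out, size_t cap, int row, int col, unsigned step) {
    static const char spin[] = "-\\|/";
    return snprintf(out, cap, "\033[%d;%dH%c", row, col, spin[step % 4]);
}
```

A caller would loop over step, pass the buffer to fputs(buf, stdout), call fflush(stdout) and then wait for the frame interval.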
Other libraries
You can use ncurses (ncurses.h) on Unix-like systems or conio.h with DOS/Windows compilers, depending on the system type.
Ncurses example:
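#include <stdio.h>
#include <unistd.h>
#include <ncurses.h>
int main() {
    int row, col;
    initscr(); // must be called before any other curses function
    getmaxyx(stdscr, row, col);
    char loading[] = "-\\|/";
    while (1) {
        for (int i = 0; i < 8; i++) {
            mvaddch(row/2, col/2, loading[i%4]);
            refresh(); // make the change visible
            usleep(200000); // frame delay (assumed value)
            mvaddch(row/2, col/2, '\b');
        }
    }
    endwin(); // not reached; call it on any real exit path
    return 0;
}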

c - Avoid if in loop

Debian 64.
Core 2 duo.
Fiddling with a loop. I came up with different variations of the same loop, but I would like to avoid conditional branching if possible.
But even so, I think it will be difficult to beat.
I thought about SSE or bit shifting, but it would still require a jump (look at the computed goto below). Spoiler: a computed jump doesn't seem to be the way to go.
The code is compiled without PGO, because on this piece of code it makes the code slower.
flags :
gcc -march=native -O3 -std=c11 test_comp.c
Unrolling the loop didn't help here.
63 in ascii is '?'.
The printf is here to force the code to execute. Nothing more.
My need :
A logic to avoid the condition. I take this as a challenge for my holidays :)
The code :
Test with the sentence. The character '?' is guaranteed to be there but at a random position.
#include <stdlib.h>
#include <stdio.h>
int main(int argc, char **argv){
    /* This is quite slow. Average actually.
       Executes in 369,041 cycles here (cachegrind) */
    for (int x = 0; x < 100; ++x){
        if (argv[1][x] == 63){
            printf("%d\n",x);
            break;
        }
    }
    /* This is the slowest.
       Executes in 370,385 cycles here (cachegrind) */
    register unsigned int i = 0;
    static void * restrict table[] = {&&keep,&&end}; /* GCC computed goto */
keep:
    ++i;
    goto *table[(argv[1][i-1] == 63)];
end:
    printf("i = %u",i-1);
    /* This is slower. Because of the calculation..
       Executes in 369,109 cycles here (cachegrind) */
    for (int x = 100; ; --x){
        if (argv[1][100 - x ] == 63){printf("%d\n",100-x);break;}
    }
    return 0;
}
Is there a way to make it faster, maybe by avoiding the branch?
The branch misprediction rate is huge at 11.3% (cachegrind with --branch-sim=yes).
I cannot believe this is the best one can achieve.
If any of you are skilled enough with assembly, please chime in.
Assuming you have a buffer of well-known size that can hold the maximum number of chars to test against, like
char buffer[100];
make it one byte larger
char buffer[100 + 1];
then fill it with the sequence to test against
read(fileno(stdin), buffer, 100);
and put your test-char '?' at the very end
buffer[100] = '?';
This allows a loop with only one test condition:
size_t i = 0;
while ('?' != buffer[i])
    ++i;
if (100 == i) {
    /* test failed */
} else {
    /* test passed for i */
}
Leave all other optimisation to the compiler.
However, I couldn't resist, so here's a possible approach to micro-optimisation:
char buffer[100 + 1];
read(fileno(stdin), buffer, 100);
buffer[100] = '?';
char * p = buffer;
while ('?' != *p)
    ++p;
if ((p - buffer) == 100) {
    /* test failed */
} else {
    /* test passed for (p - buffer) */
}
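The sentinel idea above can be wrapped in a small function. This is a minimal sketch, not the answerer's code: find_with_sentinel is a hypothetical name, and the caller is assumed to provide a buffer with room for len + 1 bytes, because the byte after the data is overwritten with the sentinel.

```c
#include <stddef.h>

/* Hypothetical helper. buf must have room for len + 1 bytes; buf[len]
   is overwritten with the sentinel. Returns the index of the first
   occurrence of target within the first len bytes, or -1 if only the
   sentinel was hit. */
ptrdiff_t find_with_sentinel(char *buf, size_t len, char target) {
    buf[len] = target;            /* sentinel guarantees the loop ends */
    size_t i = 0;
    while (buf[i] != target)      /* single test per iteration, no bounds check */
        ++i;
    return (i == len) ? -1 : (ptrdiff_t)i;
}
```

The loop body no longer compares the index against the length. The only extra cost is one comparison after the loop to tell a real match from the sentinel. The standard library's memchr from <string.h> does the same search without needing the extra byte.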

program for interpretation of a simple instruction set

I have a problem I need to solve and I have no freaking idea how to do it. If someone would be willing to help I would very much appreciate it. I know I'm asking for a lot, but I really need it.
Create a program for interpretation of a simple instruction set consisting of the instructions: MVI, MOV, AND, OR, NOT, LESS, LEQ, GRE, GEQ, JMP, PRN, SUM, SUB, PRB, SL and SR, described in this document. Your task is to make a program that takes as input a binary representation of a list of instructions and prints the corresponding result as output (after the execution of the instructions). The input can contain all the instructions except SUB and PRB, which you do not have to implement. Conversion from the binary system to any other numeral system should not be made, except at the moment when you need to find the line that should be executed next when the condition is satisfied (GRE, GEQ, LESS, LEQ, JMP), but the comparison of the numbers in the condition should be made based on the binary representation. All data are represented in the SM binary system. There are eight 16-bit registers available, enumerated from 0 to 7.
#include <stdio.h>
#define MAX 1000
char registers[8][16];
void MVI(int reg, char *value) {
    // code here
}
void MOV(int reg1, int reg2) {
    // code here
}
void AND(int reg1, int reg2, int reg3) {
    // code here
}
void OR(int reg1, int reg2, int reg3) {
    // code here
}
void NOT(int reg1, int reg2) {
    // code here
}
void PRN(int reg) {
    // code here
}
void SUM(int reg1, int reg2, int reg3) {
    // code here
}
void SL(int reg) {
    // code here
}
void SR(int reg) {
    // code here
}
int main() {
    int i,j,k;
    int N = 0; // number of lines in the input
    char c;
    char lines[MAX][16];
    while (1) {
        scanf("%c", &c);
        if (c == '\n') {
            break; // an empty line ends the input
        }
        lines[N][0] = c;
        for (i=1;i<16;i++) {
            scanf("%c", &lines[N][i]);
        }
        scanf("%c", &c); // consume the newline after the 16 characters
        N++;
    }
    for (i = 0; i < 8; i++) {
        for (j = 0; j < 16; j++) {
            registers[i][j] = '0';
        }
    }
    // code here
    return 0;
}
I think the big piece you need is dispatching the functions based on the source line. There are a number of ways you can do this, but a useful piece for all of them is strstr(a,b)==a which will check if the string a begins with the contents of the string b.
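The prefix check can be written as a small helper. The sketch below shows the strstr form from the text next to a strncmp form that stops after strlen(b) characters instead of searching the whole string; both function names are hypothetical.

```c
#include <string.h>

/* The check from the text: b is found at the very start of a. */
int begins_with_strstr(const char *a, const char *b) {
    return strstr(a, b) == a;
}

/* Alternative: compare only the first strlen(b) characters of a. */
int begins_with_strncmp(const char *a, const char *b) {
    return strncmp(a, b, strlen(b)) == 0;
}
```

Both variants need NUL-terminated strings, which the 16-character rows of lines in the question's skeleton are not guaranteed to be.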
You can do a chain of if-else blocks.
if (strstr(line[i], "SUM")==line[i]){
    /* handle SUM */
} else if (strstr(line[i], "AND")==line[i]) {
    /* handle AND */
}
Or you can precompile the user program by scanning for the opcodes when you read the source and store them as single-byte small codes. You would want the uppercase identifiers to be enum values, and use the lowercase versions for the function names. Then the chain is simpler.
if (line[i][0] == SUM) {
    /* handle SUM */
} else if (line[i][0] == AND) {
    /* handle AND */
}
But, with small integer codes, there are even better ways. A switch.
switch (line[i][0]) {
case SUM: sum(...); break;
case AND: and(...); break;
}
A function table. But this is where you have to be clever. A function must always be called with arguments of the correct type, but function pointers allow you to bypass the compiler's ability to check that this is so. So for this method, all functions should have the same arguments, since they are all called by a single function-call line.
void (*optab[])(...) = { sum, and, ... };
optab[ line[i][0] ](...); // calls sum() or and() by using the opcode in the array lookup
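A compilable version of the function-table idea could look like the following. It is a simplified sketch: the registers are plain ints rather than the 16-character bit strings the assignment asks for, and the opcode numbering, regs, op_sum, op_and and run_op are all hypothetical names chosen for illustration.

```c
/* Hypothetical opcode numbering; OP_COUNT is the table size. */
enum opcode { OP_SUM, OP_AND, OP_COUNT };

static int regs[8];  /* simplified: ints instead of 16-char bit strings */

/* Every handler has the same signature, so one call line can invoke any of them. */
static void op_sum(int dst, int r1, int r2) { regs[dst] = regs[r1] + regs[r2]; }
static void op_and(int dst, int r1, int r2) { regs[dst] = regs[r1] & regs[r2]; }

/* The table is indexed directly by the opcode. */
static void (*const optab[OP_COUNT])(int, int, int) = {
    [OP_SUM] = op_sum,
    [OP_AND] = op_and,
};

/* Dispatches one instruction. Returns 0 on success, -1 for an unknown opcode. */
int run_op(int op, int dst, int r1, int r2) {
    if (op < 0 || op >= OP_COUNT)
        return -1;
    optab[op](dst, r1, r2);
    return 0;
}
```

Instructions that take fewer operands (NOT, SL, SR) would simply ignore the arguments they do not need, which is the price of giving every handler the same signature.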

Simple C implementation to track memory malloc/free?

programming language: C
platform: ARM
Compiler: ADS 1.2
I need to keep track of simple malloc/free calls in my project. I just need to get a very basic idea of how much heap memory is required when the program has allocated all its resources. Therefore, I have provided a wrapper for the malloc/free calls. In these wrappers I need to increment a current memory count when malloc is called and decrement it when free is called. The malloc case is straightforward, as I have the size to allocate from the caller. I am wondering how to deal with the free case, as I need to store the pointer/size mapping somewhere. This being C, I do not have a standard map to implement this easily.
I am trying to avoid linking in any libraries so would prefer *.c/h implementation.
So I am wondering if there already is a simple implementation someone can point me to. If not, this is motivation to go ahead and implement one.
EDIT: Purely for debugging and this code is not shipped with the product.
EDIT: Initial implementation based on answer from Makis. I would appreciate feedback on this.
EDIT: Reworked implementation
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <string.h>
#include <limits.h>
#include <stdbool.h> // for bool and true (C99)
static size_t gnCurrentMemory = 0;
static size_t gnPeakMemory = 0;
void *MemAlloc (size_t nSize)
{
    void *pMem = malloc(sizeof(size_t) + nSize);
    if (pMem)
    {
        size_t *pSize = (size_t *)pMem;
        // Store the size in front of the block handed to the caller
        memcpy(pSize, &nSize, sizeof(nSize));
        gnCurrentMemory += nSize;
        if (gnCurrentMemory > gnPeakMemory)
        {
            gnPeakMemory = gnCurrentMemory;
        }
        printf("PMemAlloc (%p) - Size (%lu), Current (%lu), Peak (%lu)\n",
               (void *)(pSize + 1), (unsigned long)nSize,
               (unsigned long)gnCurrentMemory, (unsigned long)gnPeakMemory);
        return(pSize + 1);
    }
    return NULL;
}
void MemFree (void *pMem)
{
    if (pMem)
    {
        size_t *pSize = (size_t *)pMem;
        // Get the size
        --pSize;
        assert(gnCurrentMemory >= *pSize);
        printf("PMemFree (%p) - Size (%lu), Current (%lu), Peak (%lu)\n",
               pMem, (unsigned long)*pSize,
               (unsigned long)gnCurrentMemory, (unsigned long)gnPeakMemory);
        gnCurrentMemory -= *pSize;
        free(pSize);
    }
}
#define BUFFERSIZE (1024*1024)
typedef struct
{
    bool flag;
    int buffer[BUFFERSIZE];
    bool bools[BUFFERSIZE];
} sample_buffer;
typedef struct
{
    unsigned int whichbuffer;
    char ch;
} buffer_info;
int main(void)
{
    unsigned int i;
    buffer_info *bufferinfo;
    sample_buffer *mybuffer;
    char *pCh;
    printf("Testing MemAlloc - MemFree\n");
    mybuffer = (sample_buffer *) MemAlloc(sizeof(sample_buffer));
    if (mybuffer == NULL)
    {
        printf("ERROR ALLOCATING mybuffer\n");
        return EXIT_FAILURE;
    }
    bufferinfo = (buffer_info *) MemAlloc(sizeof(buffer_info));
    if (bufferinfo == NULL)
    {
        printf("ERROR ALLOCATING bufferinfo\n");
        return EXIT_FAILURE;
    }
    pCh = (char *)MemAlloc(sizeof(char));
    printf("finished malloc\n");
    // fill allocated memory with integers and read back some values
    for(i = 0; i < BUFFERSIZE; ++i)
    {
        mybuffer->buffer[i] = i;
        mybuffer->bools[i] = true;
        bufferinfo->whichbuffer = (unsigned int)(i/100);
    }
    // the rest of main is cut off in the source; releasing the blocks here is an assumption
    MemFree(pCh);
    MemFree(bufferinfo);
    MemFree(mybuffer);
    return 0;
}
You could allocate a few extra bytes in your wrapper and put either an id (if you want to be able to couple malloc() and free()) or just the size there. Just malloc() that much more memory, store the information at the beginning of your memory block and move the pointer you return that many bytes forward.
This can, btw, also easily be used for fence pointers/finger-prints and such.
Either you can have access to internal tables used by malloc/free (see this question: Where Do malloc() / free() Store Allocated Sizes and Addresses? for some hints), or you have to manage your own tables in your wrappers.
You could always use valgrind instead of rolling your own implementation. If you don't care about the amount of memory you allocate you could use an even simpler implementation: (I did this really quickly so there could be errors and I realize that it is not the most efficient implementation. The pAllocedStorage should be given an initial size and increase by some factor for a resize etc. but you get the idea.)
EDIT: I missed that this was for ARM, to my knowledge valgrind is not available on ARM so that might not be an option.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
static size_t indexAllocedStorage = 0;
static size_t *pAllocedStorage = NULL;
static unsigned int free_calls = 0;
static unsigned long long int total_mem_alloced = 0;
void *
my_malloc(size_t size){
    size_t *temp;
    void *p = malloc(size);
    if(p == NULL){
        fprintf(stderr,"my_malloc malloc failed, %s", strerror(errno));
        exit(EXIT_FAILURE);
    }
    total_mem_alloced += size;
    /* grow the table of live pointers by one slot */
    temp = (size_t *)realloc(pAllocedStorage, (indexAllocedStorage+1) * sizeof(size_t));
    if(temp == NULL){
        fprintf(stderr,"my_malloc realloc failed, %s", strerror(errno));
        exit(EXIT_FAILURE);
    }
    pAllocedStorage = temp;
    pAllocedStorage[indexAllocedStorage++] = (size_t)p;
    return p;
}
void
my_free(void *p){
    size_t i;
    int found = 0;
    for(i = 0; i < indexAllocedStorage; i++){
        if(pAllocedStorage[i] == (size_t)p){
            pAllocedStorage[i] = (size_t)NULL; /* mark the slot as freed */
            found = 1;
            break;
        }
    }
    if(!found){
        printf("Free Called on unknown\n");
    }
    free_calls++;
    free(p);
}
void
free_check(void) {
    size_t i;
    printf("checking freed memory\n");
    for(i = 0; i < indexAllocedStorage; i++){
        if(pAllocedStorage[i] != (size_t)NULL){
            printf( "Memory leak %p\n", (void *)pAllocedStorage[i]);
            free((void *)pAllocedStorage[i]);
        }
    }
    free(pAllocedStorage);
    pAllocedStorage = NULL;
}
I would use rmalloc. It is a simple library (actually it is only two files) to debug memory usage, but it also has support for statistics. Since you already have wrapper functions, it should be very easy to use rmalloc for it. Keep in mind that you also need to replace strdup, etc.
Your program may also need to intercept realloc(), calloc(), getcwd() (as it may allocate memory when the buffer is NULL in some implementations) and maybe strdup() or a similar function, if it is supported by your compiler.
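Extending a size-header wrapper to calloc() and realloc() might look like the sketch below. All track_* names are hypothetical. The header is a union so that the pointer handed back to the caller stays suitably aligned, which is an assumption about the target's strictest alignment rather than a guarantee.

```c
#include <stdlib.h>
#include <string.h>

/* Header placed in front of every block. The extra union members only
   pad the header so the user pointer keeps a generous alignment. */
typedef union {
    size_t size;
    long double pad_ld;
    void *pad_ptr;
} track_header;

static size_t track_current = 0;
static size_t track_peak = 0;

static void track_add(size_t n) {
    track_current += n;
    if (track_current > track_peak)
        track_peak = track_current;
}

void *track_malloc(size_t n) {
    track_header *h = malloc(sizeof *h + n);
    if (h == NULL)
        return NULL;
    h->size = n;
    track_add(n);
    return h + 1;                 /* caller sees the bytes after the header */
}

void track_free(void *p) {
    if (p == NULL)
        return;
    track_header *h = (track_header *)p - 1;
    track_current -= h->size;
    free(h);
}

void *track_calloc(size_t count, size_t n) {
    if (n != 0 && count > (size_t)-1 / n)
        return NULL;              /* count * n would overflow */
    void *p = track_malloc(count * n);
    if (p != NULL)
        memset(p, 0, count * n);
    return p;
}

void *track_realloc(void *p, size_t n) {
    if (p == NULL)
        return track_malloc(n);
    track_header *h = (track_header *)p - 1;
    size_t old = h->size;
    track_header *nh = realloc(h, sizeof *nh + n);
    if (nh == NULL)
        return NULL;              /* old block is still valid and still counted */
    nh->size = n;
    track_current -= old;
    track_add(n);
    return nh + 1;
}

size_t track_in_use(void) { return track_current; }
size_t track_peak_use(void) { return track_peak; }
```

A strdup() replacement can be built on top of track_malloc() in the same way. Functions inside the C library that allocate on their own, such as getcwd() with a NULL buffer, are not covered by this approach.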
If you are running on x86 you could just run your binary under valgrind and it would gather all this information for you, using the standard implementation of malloc and free. Simple.
I've been trying out some of the same techniques mentioned on this page and wound up here from a google search. I know this question is old, but wanted to add for the record...
1) Does your operating system not provide any tools to see how much heap memory is in use in a running process? I see you're talking about ARM, so this may well be the case. In most full-featured OSes, this is just a matter of using a cmd-line tool to see the heap size.
2) If available in your libc, sbrk(0) on most platforms will tell you the end address of your data segment. If you have it, all you need to do is store that address at the start of your program (say, startBrk=sbrk(0)), then at any time your allocated size is sbrk(0) - startBrk.
3) If shared objects can be used, you're dynamically linking to your libc, and your OS's runtime loader has something like an LD_PRELOAD environment variable, you might find it more useful to build your own shared object that defines the actual libc functions with the same symbols (malloc(), not MemAlloc()), then have the loader load your lib first and "interpose" the libc functions. You can further obtain the addresses of the actual libc functions with dlsym() and the RTLD_NEXT flag, so you can do what you are doing above without having to recompile all your code to use your malloc/free wrappers. It is then just a runtime decision when you start your program (or any program that fits the description in the first sentence): you set an environment variable such as LD_PRELOAD and then run it. (Google for shared object interposition. It's a great technique and one used by many debuggers/profilers.)
