Simple C implementation to track memory malloc/free? - c

programming language: C
platform: ARM
Compiler: ADS 1.2
I need to keep track of simple melloc/free calls in my project. I just need to get very basic idea of how much heap memory is required when the program has allocated all its resources. Therefore, I have provided a wrapper for the malloc/free calls. In these wrappers I need to increment a current memory count when malloc is called and decrement it when free is called. The malloc case is straight forward as I have the size to allocate from the caller. I am wondering how to deal with the free case as I need to store the pointer/size mapping somewhere. This being C, I do not have a standard map to implement this easily.
I am trying to avoid linking in any libraries so would prefer *.c/h implementation.
So I am wondering if there already is a simple implementation one may lead me to. If not, this is motivation to go ahead and implement one.
EDIT: Purely for debugging and this code is not shipped with the product.
EDIT: Initial implementation based on answer from Makis. I would appreciate feedback on this.
EDIT: Reworked implementation
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <string.h>
#include <limits.h>
static size_t gnCurrentMemory = 0;
static size_t gnPeakMemory = 0;
void *MemAlloc (size_t nSize)
{
void *pMem = malloc(sizeof(size_t) + nSize);
if (pMem)
{
size_t *pSize = (size_t *)pMem;
memcpy(pSize, &nSize, sizeof(nSize));
gnCurrentMemory += nSize;
if (gnCurrentMemory > gnPeakMemory)
{
gnPeakMemory = gnCurrentMemory;
}
printf("PMemAlloc (%#X) - Size (%d), Current (%d), Peak (%d)\n",
pSize + 1, nSize, gnCurrentMemory, gnPeakMemory);
return(pSize + 1);
}
return NULL;
}
void MemFree (void *pMem)
{
if(pMem)
{
size_t *pSize = (size_t *)pMem;
// Get the size
--pSize;
assert(gnCurrentMemory >= *pSize);
printf("PMemFree (%#X) - Size (%d), Current (%d), Peak (%d)\n",
pMem, *pSize, gnCurrentMemory, gnPeakMemory);
gnCurrentMemory -= *pSize;
free(pSize);
}
}
#define BUFFERSIZE (1024*1024)
typedef struct
{
bool flag;
int buffer[BUFFERSIZE];
bool bools[BUFFERSIZE];
} sample_buffer;
typedef struct
{
unsigned int whichbuffer;
char ch;
} buffer_info;
int main(void)
{
unsigned int i;
buffer_info *bufferinfo;
sample_buffer *mybuffer;
char *pCh;
printf("Tesint MemAlloc - MemFree\n");
mybuffer = (sample_buffer *) MemAlloc(sizeof(sample_buffer));
if (mybuffer == NULL)
{
printf("ERROR ALLOCATING mybuffer\n");
return EXIT_FAILURE;
}
bufferinfo = (buffer_info *) MemAlloc(sizeof(buffer_info));
if (bufferinfo == NULL)
{
printf("ERROR ALLOCATING bufferinfo\n");
MemFree(mybuffer);
return EXIT_FAILURE;
}
pCh = (char *)MemAlloc(sizeof(char));
printf("finished malloc\n");
// fill allocated memory with integers and read back some values
for(i = 0; i < BUFFERSIZE; ++i)
{
mybuffer->buffer[i] = i;
mybuffer->bools[i] = true;
bufferinfo->whichbuffer = (unsigned int)(i/100);
}
MemFree(bufferinfo);
MemFree(mybuffer);
if(pCh)
{
MemFree(pCh);
}
return EXIT_SUCCESS;
}

You could allocate a few extra bytes in your wrapper and put either an id (if you want to be able to couple malloc() and free()) or just the size there. Just malloc() that much more memory, store the information at the beginning of your memory block and and move the pointer you return that many bytes forward.
This can, btw, also easily be used for fence pointers/finger-prints and such.

Either you can have access to internal tables used by malloc/free (see this question: Where Do malloc() / free() Store Allocated Sizes and Addresses? for some hints), or you have to manage your own tables in your wrappers.

You could always use valgrind instead of rolling your own implementation. If you don't care about the amount of memory you allocate you could use an even simpler implementation: (I did this really quickly so there could be errors and I realize that it is not the most efficient implementation. The pAllocedStorage should be given an initial size and increase by some factor for a resize etc. but you get the idea.)
EDIT: I missed that this was for ARM, to my knowledge valgrind is not available on ARM so that might not be an option.
static size_t indexAllocedStorage = 0;
static size_t *pAllocedStorage = NULL;
static unsigned int free_calls = 0;
static unsigned long long int total_mem_alloced = 0;
void *
my_malloc(size_t size){
size_t *temp;
void *p = malloc(size);
if(p == NULL){
fprintf(stderr,"my_malloc malloc failed, %s", strerror(errno));
exit(EXIT_FAILURE);
}
total_mem_alloced += size;
temp = (size_t *)realloc(pAllocedStorage, (indexAllocedStorage+1) * sizeof(size_t));
if(temp == NULL){
fprintf(stderr,"my_malloc realloc failed, %s", strerror(errno));
exit(EXIT_FAILURE);
}
pAllocedStorage = temp;
pAllocedStorage[indexAllocedStorage++] = (size_t)p;
return p;
}
void
my_free(void *p){
size_t i;
int found = 0;
for(i = 0; i < indexAllocedStorage; i++){
if(pAllocedStorage[i] == (size_t)p){
pAllocedStorage[i] = (size_t)NULL;
found = 1;
break;
}
}
if(!found){
printf("Free Called on unknown\n");
}
free_calls++;
free(p);
}
void
free_check(void) {
size_t i;
printf("checking freed memeory\n");
for(i = 0; i < indexAllocedStorage; i++){
if(pAllocedStorage[i] != (size_t)NULL){
printf( "Memory leak %X\n", (unsigned int)pAllocedStorage[i]);
free((void *)pAllocedStorage[i]);
}
}
free(pAllocedStorage);
pAllocedStorage = NULL;
}

I would use rmalloc. It is a simple library (actually it is only two files) to debug memory usage, but it also has support for statistics. Since you already wrapper functions it should be very easy to use rmalloc for it. Keep in mind that you also need to replace strdup, etc.

Your program may also need to intercept realloc(), calloc(), getcwd() (as it may allocate memory when buffer is NULL in some implementations) and maybe strdup() or a similar function, if it is supported by your compiler

If you are running on x86 you could just run your binary under valgrind and it would gather all this information for you, using the standard implementation of malloc and free. Simple.

I've been trying out some of the same techniques mentioned on this page and wound up here from a google search. I know this question is old, but wanted to add for the record...
1) Does your operating system not provide any tools to see how much heap memory is in use in a running process? I see you're talking about ARM, so this may well be the case. In most full-featured OSes, this is just a matter of using a cmd-line tool to see the heap size.
2) If available in your libc, sbrk(0) on most platforms will tell you the end address of your data segment. If you have it, all you need to do is store that address at the start of your program (say, startBrk=sbrk(0)), then at any time your allocated size is sbrk(0) - startBrk.
3) If shared objects can be used, you're dynamically linking to your libc, and your OS's runtime loader has something like an LD_PRELOAD environment variable, you might find it more useful to build your own shared object that defines the actual libc functions with the same symbols (malloc(), not MemAlloc()), then have the loader load your lib first and "interpose" the libc functions. You can further obtain the addresses of the actual libc functions with dlsym() and the RTLD_NEXT flag so you can do what you are doing above without having to recompile all your code to use your malloc/free wrappers. It is then just a runtime decision when you start your program (or any program that fits the description in the first sentence) where you set an environment variable like LD_PRELOAD=mymemdebug.so and then run it. (google for shared object interposition.. it's a great technique and one used by many debuggers/profilers)

Related

String parsing failes when using a function doing substring in C

I am having an issue with parsing a string in C. It causes a HardFault eventually.
MCU: LPC1769,
OS: FreeRTOS 10,
Toolchain: IAR
In order to test, If I keep sending the same data frame (you may see the sample below in message variable in parseMessage function),
after 5-6 times parsing it goes OK, parsing works as I expected, and then suddenly falls in HardFault when I send one more the exact same string to the function.
I tested the function in OnlineGDB. I haven't observed any issue.
I have couple of slightly different version of that function below although the result is the same;
char *substr3(char const *input, size_t start, size_t len) {
char *ret = malloc(len+1);
memcpy(ret, input+start, len);
ret[len] = '\0';
return ret;
}
I've extracted the function piece for a better overveiw:
(don't pay attention to stripEOL(message); call, it just strips out end-of-line characters, but you can see it in the gdbonline share of mine)
void parseMessage(char * message){
//char* message= "7E00002A347C31323030302D3132353330387C33302E30372E323032307C31317C33307C33317C31352D31367C31357C317C57656E67657274880D";
// Parsing the frame
char* start;
char* len;
char* cmd;
char* data;
char* chksum;
char* end;
stripEOL(message);
unsigned int messagelen = strlen(message);
start = substr3(message, 0, 2);
len = substr3(message, 2, 4);
cmd = substr3(message, 6, 2);
data = substr3(message, 8, messagelen-8-4);
chksum = substr3(message, messagelen-4, 2);
end = substr3(message, messagelen-2, 2);
}
Only the data variable differs in length.
e.g. data --> "347C31323030302D3132353330387C33302E30372E323032307C31317C33307C33317C31352D31367C31357C317C57656E67657274"
A HardFault debug log:
LR = 0x8667 in disassembly
PC = 0x2dd0 in disassembly
I appreciate to the contributors which they led me to find the solution for my instance.
Since there wasn't a total solution by the contributors and I found a working solution, I'd better be writing for whom may interest in future.
Since I am developing my application on top of FreeRTOS 10 and using malloc from the C library, apparently it wasn't cooping at least with my implementations. It's been said in some resources, you can use standard malloc within FreeRTOS, I couldn't manage myself for some unknown reason. It might have been a help, if I had increased the heap memory, I don't know but I didn't have intention on that as well.
I've just placed that two wrapper functions (somewhere in a common file) without even changing my malloc and free calls.;
Creating a malloc/free functions that work with the built-in FreeRTOS heap is quite simple. We just wrap the pvPortMalloc/pvPortFree calls:
void* malloc(size_t size)
{
void* ptr = NULL;
if(size > 0)
{
// We simply wrap the FreeRTOS call into a standard form
ptr = pvPortMalloc(size);
} // else NULL if there was an error
return ptr;
}
void free(void* ptr)
{
if(ptr)
{
// We simply wrap the FreeRTOS call into a standard form
vPortFree(ptr);
}
}
Note that: You can't use that with heap schema #1 but with the others (2, 3, 4 and 5).
I would recommend start using portable/MemMang/heap_4.c

What's a good way to implement simple clone() based multithread library?

I'm trying to build simple multithread library based on linux using clone() and other kernel utilities.I've come to a point where I'm not really sure what's the correct way to do things. I tried going trough original NPTL code but it's a bit too much.
That's how for instance I imagine the create method:
typedef int sk_thr_id;
typedef void *sk_thr_arg;
typedef int (*sk_thr_func)(sk_thr_arg);
sk_thr_id sk_thr_create(sk_thr_func f, sk_thr_arg a){
void* stack;
stack = malloc( 1024*64 );
if ( stack == 0 ){
perror( "malloc: could not allocate stack" );
exit( 1 );
}
return ( clone(f, (char*) stack + FIBER_STACK, SIGCHLD | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | CLONE_VM, a ) );
}
1: I'm not really sure what the correct clone() flags should be. I just found these being used in a simple example. Any general directions here will be welcome.
Here are parts of the mutex primitives created using futexes(not my own code for now):
#define cmpxchg(P, O, N) __sync_val_compare_and_swap((P), (O), (N))
#define cpu_relax() asm volatile("pause\n": : :"memory")
#define barrier() asm volatile("": : :"memory")
static inline unsigned xchg_32(void *ptr, unsigned x)
{
__asm__ __volatile__("xchgl %0,%1"
:"=r" ((unsigned) x)
:"m" (*(volatile unsigned *)ptr), "0" (x)
:"memory");
return x;
}
static inline unsigned short xchg_8(void *ptr, char x)
{
__asm__ __volatile__("xchgb %0,%1"
:"=r" ((char) x)
:"m" (*(volatile char *)ptr), "0" (x)
:"memory");
return x;
}
int sys_futex(void *addr1, int op, int val1, struct timespec *timeout, void *addr2, int val3)
{
return syscall(SYS_futex, addr1, op, val1, timeout, addr2, val3);
}
typedef union mutex mutex;
union mutex
{
unsigned u;
struct
{
unsigned char locked;
unsigned char contended;
} b;
};
int mutex_init(mutex *m, const pthread_mutexattr_t *a)
{
(void) a;
m->u = 0;
return 0;
}
int mutex_lock(mutex *m)
{
int i;
/* Try to grab lock */
for (i = 0; i < 100; i++)
{
if (!xchg_8(&m->b.locked, 1)) return 0;
cpu_relax();
}
/* Have to sleep */
while (xchg_32(&m->u, 257) & 1)
{
sys_futex(m, FUTEX_WAIT_PRIVATE, 257, NULL, NULL, 0);
}
return 0;
}
int mutex_unlock(mutex *m)
{
int i;
/* Locked and not contended */
if ((m->u == 1) && (cmpxchg(&m->u, 1, 0) == 1)) return 0;
/* Unlock */
m->b.locked = 0;
barrier();
/* Spin and hope someone takes the lock */
for (i = 0; i < 200; i++)
{
if (m->b.locked) return 0;
cpu_relax();
}
/* We need to wake someone up */
m->b.contended = 0;
sys_futex(m, FUTEX_WAKE_PRIVATE, 1, NULL, NULL, 0);
return 0;
}
2: The main question for me is how to implement the "join" primitive? I know it's supposed to be based on futexes too. It's a struggle for me for now to come up with something.
3: I need some way to cleanup stuff(like the allocated stack) after a thread has finished. I can't really thing of a good way to do this too.
Probably for these I'll need to have additional structure in user space for every thread with some information saved in it. Can someone point me in good direction for solving these issues?
4: I'll want to have a way to tell how much time a thread has been running, how long it's been since it's last being scheduled and other stuff like that. Are there some kernel calls providing such info?
Thanks in advance!
The idea that there can exist a "multithreading library" as a third-party library separate from the rest of the standard library is an outdated and flawed notion. If you want to do this, you'll have to first drop all use of the standard library; particularly, your call to malloc is completely unsafe if you're calling clone yourself, because:
malloc will have no idea that multiple threads exist, and therefore may fail to perform proper synchronization.
Even if it knew they existed, malloc will need to access an unspecified, implementation-specific structure located at the address given by the thread pointer. As this structure is implementation-specific, you have no way of creating such a structure that will be interpreted correctly by both the current and all future versions of your system's libc.
These issues don't apply just to malloc but to most of the standard library; even async-signal-safe functions may be unsafe to use, as they might dereference the thread pointer for cancellation-related purposes, performing optimal syscall mechanisms, etc.
If you really insist on making your own threads implementation, you'll have to abstain from using glibc or any modern libc that's integrated with threads, and instead opt for something much more naive like klibc. This could be an educational experiment, but it would not be appropriate for a deployed application.
1) You are using an example of LinuxThreads. I will not rewrite good references for directions, but I advise you "The Linux Programming interface" of Michael Kerrisk, chapter 28. It explains in 25 pages, what you need.
2) If you set the CLONE_CHILD_CLEARID flag, when the child terminates, the ctid argument of clone is cleared. If you treat that pointer as a futex, you can implement the join primitive. Good luck :-) If you don't want to use futexes, have also a look to wait3 and wait4.
3) I do not know what you want to cleanup, but you can use the clone tls arugment. This is a thread local storage buffer. If the thread is finished, you can clean that buffer.
4) See getrusage.

C - shared memory - dynamic array inside shared struct

i'm trying to share a struct like this
example:
typedef struct {
int* a;
int b;
int c;
} ex;
between processes, the problem is that when I initialize 'a' with a malloc, it becomes private to the heap of the process that do this(or at least i think this is what happens). Is there any way to create a shared memory (with shmget, shmat) with this struct that works?
EDIT: I'm working on Linux.
EDIT: I have a process that initialize the buffer like this:
key_t key = ftok("gr", 'p');
int mid = shmget(key, sizeof(ex), IPC_CREAT | 0666);
ex* e = NULL;
status b_status = init(&e, 8); //init gives initial values to b c and allocate space for 'a' with a malloc
e = (ex*)shmat(mid, NULL, 0);
the other process attaches himself to the shared memory like this:
key_t key = ftok("gr", 'p');
int shmid = shmget(key, sizeof(ex), 0);
ex* e;
e = (ex*)shmat(shmid, NULL, 0);
and later get an element from a, in this case that in position 1
int i = get_el(e, 1);
First of all, to share the content pointed by your int *a field, you will need to copy the whole memory related to it. Thus, you will need a shared memory that can hold at least size_t shm_size = sizeof(struct ex) + get_the_length_of_your_ex();.
From now on, since you mentioned shmget and shmat, I will assume you run a Linux system.
The first step is the shared memory segment creation. It would be a good thing if you can determine an upper bound to the size of the int *a content. This way you would not have to create/delete the shared memory segment over and over again. But if you do so, an extra overhead to state how long is the actual data will be needed. I will assume that a simple size_t will do the trick for this purpose.
Then, after you created your segment, you must set the data correctly to make it hold what you want. Notice that while the physical address of the memory segment is always the same, when calling shmat you will get virtual pointers, which are only usable in the process that called shmat. The example code below should give you some tricks to do so.
#include <sys/types.h>
#include <sys/ipc.h>
/* Assume a cannot point towards an area larger than 4096 bytes. */
#define A_MAX_SIZE (size_t)4096
struct ex {
int *a;
int b;
int c;
}
int shm_create(void)
{
/*
* If you need to share other structures,
* You'll need to pass the key_t as an argument
*/
key_t k = ftok("/a/path/of/yours");
int shm_id = 0;
if (0 > (shm_id = shmget(
k, sizeof(struct ex) + A_MAX_SIZE + sizeof(size_t), IPC_CREAT|IPC_EXCL|0666))) {
/* An error occurred, add desired error handling. */
}
return shm_id;
}
/*
* Fill the desired shared memory segment with the structure
*/
int shm_fill(int shmid, struct ex *p_ex)
{
void *p = shmat(shmid, NULL, 0);
void *tmp = p;
size_t data_len = get_my_ex_struct_data_len(p_ex);
if ((void*)(-1) == p) {
/* Add desired error handling */
return -1;
}
memcpy(tmp, p_ex, sizeof(struct ex));
tmp += sizeof(struct ex);
memcpy(tmp, &data_len, sizeof(size_t);
tmp += 4;
memcpy(tmp, p_ex->a, data_len);
shmdt(p);
/*
* If you want to keep the reference so that
* When modifying p_ex anywhere, you update the shm content at the same time :
* - Don't call shmdt()
* - Make p_ex->a point towards the good area :
* p_ex->a = p + sizeof(struct ex) + sizeof(size_t);
* Never ever modify a without detaching the shm ...
*/
return 0;
}
/* Get the ex structure from a shm segment */
int shm_get_ex(int shmid, struct ex *p_dst)
{
void *p = shmat(shmid, NULL, SHM_RDONLY);
void *tmp;
size_t data_len = 0;
if ((void*)(-1) == p) {
/* Error ... */
return -1;
}
data_len = *(size_t*)(p + sizeof(struct ex))
if (NULL == (tmp = malloc(data_len))) {
/* No memory ... */
shmdt(p);
return -1;
}
memcpy(p_dst, p, sizeof(struct ex));
memcpy(tmp, (p + sizeof(struct ex) + sizeof(size_t)), data_len);
p_dst->a = tmp;
/*
* If you want to modify "globally" the structure,
* - Change SHM_RDONLY to 0 in the shmat() call
* - Make p_dst->a point to the good offset :
* p_dst->a = p + sizeof(struct ex) + sizeof(size_t);
* - Remove from the code above all the things made with tmp (malloc ...)
*/
return 0;
}
/*
* Detach the given p_ex structure from a shm segment.
* This function is useful only if you use the shm segment
* in the way I described in comment in the other functions.
*/
void shm_detach_struct(struct ex *p_ex)
{
/*
* Here you could :
* - alloc a local pointer
* - copy the shm data into it
* - detach the segment using the current p_ex->a pointer
* - assign your local pointer to p_ex->a
* This would save locally the data stored in the shm at the call
* Or if you're lazy (like me), just detach the pointer and make p_ex->a = NULL;
*/
shmdt(p_ex->a - sizeof(struct ex) - sizeof(size_t));
p_ex->a = NULL;
}
Excuse my laziness, it would be space-optimized to not copy at all the value of the int *a pointer of the struct ex since it is completely unused in the shared memory, but I spared myself extra-code to handle this (and some pointer checkings like the p_ex arguments integrity).
But when you are done, you must find a way to share the shm ID between your processes. This could be done using sockets, pipes ... Or using ftok with the same input.
The memory you allocate to a pointer using malloc() is private to that process. So, when you try to access the pointer in another process(other than the process which malloced it) you are likely going to access an invalid memory page or a memory page mapped in another process address space. So, you are likely to get a segfault.
If you are using the shared memory, you must make sure all the data you want to expose to other processes is "in" the shared memory segment and not private memory segments of the process.
You could try, leaving the data at a specified offset in the memory segment, which can be concretely defined at compile time or placed in a field at some known location in the shared memory segment.
Eg:
If you are doing this
char *mem = shmat(shmid2, (void*)0, 0);
// So, the mystruct type is at offset 0.
mystruct *structptr = (mystruct*)mem;
// Now we have a structptr, use an offset to get some other_type.
other_type *other = (other_type*)(mem + structptr->offset_of_other_type);
Other way would be to have a fixed size buffer to pass the information using the shared memory approach, instead of using the dynamically allocated pointer.
Hope this helps.
Are you working in Windows or Linux?
In any case what you need is a memory mapped file. Documentation with code examples here,
http://msdn.microsoft.com/en-us/library/aa366551%28VS.85%29.aspx
http://menehune.opt.wfu.edu/Kokua/More_SGI/007-2478-008/sgi_html/ch03.html
You need to use shared memory/memory mapped files/whatever your OS gives you.
In general, IPC and sharing memory between processes is quite OS dependent, especially in low-level languages like C (higher-level languages usually have libraries for that - for example, even C++ has support for it using boost).
If you are on Linux, I usually use shmat for small amount, and mmap (http://en.wikipedia.org/wiki/Mmap) for larger amounts.
On Win32, there are many approaches; the one I prefer is usually using page-file backed memory mapped files (http://msdn.microsoft.com/en-us/library/ms810613.aspx)
Also, you need to pay attention to where you are using these mechanism inside your data structures: as mentioned in the comments, without using precautions the pointer you have in your "source" process is invalid in the "target" process, and needs to be replaced/adjusted (IIRC, pointers coming from mmap are already OK(mapped); at least, under windows pointers you get out of MapViewOfFile are OK).
EDIT: from your edited example:
What you do here:
e = (ex*)shmat(mid, NULL, 0);
(other process)
int shmid = shmget(key, sizeof(ex), 0);
ex* e = (ex*)shmat(shmid, NULL, 0);
is correcty, but you need to do it for each pointer you have, not only for the "main" pointer to the struct. E.g. you need to do:
e->a = (int*)shmat(shmget(another_key, dim_of_a, IPC_CREAT | 0666), NULL, 0);
instead of creating the array with malloc.
Then, on the other process, you also need to do shmget/shmat for the pointer.
This is why, in the comments, I said that I usually prefer to pack the structs: so I do not need to go through the hassle to to these operations for every pointer.
Convert the struct:
typedef struct {
int b;
int c;
int a[];
} ex;
and then on parent process:
int mid = shmget(key, sizeof(ex) + arraysize*sizeof(int), 0666);
it should work.
In general, it is difficult to work with dynamic arrays inside structs in c, but in this way you are able to allocate the proper memory (this will also work in malloc: How to include a dynamic array INSIDE a struct in C?)

Run-time mocking in C?

This has been pending for a long time in my list now. In brief - I need to run mocked_dummy() in the place of dummy() ON RUN-TIME, without modifying factorial(). I do not care on the entry point of the software. I can add up any number of additional functions (but cannot modify code within /*---- do not modify ----*/).
Why do I need this?
To do unit tests of some legacy C modules. I know there are a lot of tools available around, but if run-time mocking is possible I can change my UT approach (add reusable components) make my life easier :).
Platform / Environment?
Linux, ARM, gcc.
Approach that I'm trying with?
I know GDB uses trap/illegal instructions for adding up breakpoints (gdb internals).
Make the code self modifiable.
Replace dummy() code segment with illegal instruction, and return as immediate next instruction.
Control transfers to trap handler.
Trap handler is a reusable function that reads from a unix domain socket.
Address of mocked_dummy() function is passed (read from map file).
Mock function executes.
There are problems going ahead from here. I also found the approach is tedious and requires good amount of coding, some in assembly too.
I also found, under gcc each function call can be hooked / instrumented, but again not very useful since the the function is intended to be mocked will anyway get executed.
Is there any other approach that I could use?
#include <stdio.h>
#include <stdlib.h>
void mocked_dummy(void)
{
printf("__%s__()\n",__func__);
}
/*---- do not modify ----*/
void dummy(void)
{
printf("__%s__()\n",__func__);
}
int factorial(int num)
{
int fact = 1;
printf("__%s__()\n",__func__);
while (num > 1)
{
fact *= num;
num--;
}
dummy();
return fact;
}
/*---- do not modify ----*/
int main(int argc, char * argv[])
{
int (*fp)(int) = atoi(argv[1]);
printf("fp = %x\n",fp);
printf("factorial of 5 is = %d\n",fp(5));
printf("factorial of 5 is = %d\n",factorial(5));
return 1;
}
test-dept is a relatively recent C unit testing framework that allows you to do runtime stubbing of functions. I found it very easy to use - here's an example from their docs:
void test_stringify_cannot_malloc_returns_sane_result() {
replace_function(&malloc, &always_failing_malloc);
char *h = stringify('h');
assert_string_equals("cannot_stringify", h);
}
Although the downloads section is a little out of date, it seems fairly actively developed - the author fixed an issue I had very promptly. You can get the latest version (which I've been using without issues) with:
svn checkout http://test-dept.googlecode.com/svn/trunk/ test-dept-read-only
the version there was last updated in Oct 2011.
However, since the stubbing is achieved using assembler, it may need some effort to get it to support ARM.
This is a question I've been trying to answer myself. I also have the requirement that I want the mocking method/tools to be done in the same language as my application. Unfortunately this cannot be done in C in a portable way, so I've resorted to what you might call a trampoline or detour. This falls under the "Make the code self modifiable." approach you mentioned above. This is were we change the actually bytes of a function at runtime to jump to our mock function.
#include <stdio.h>
#include <stdlib.h>
// Additional headers
#include <stdint.h> // for uint32_t
#include <sys/mman.h> // for mprotect
#include <errno.h> // for errno
void mocked_dummy(void)
{
printf("__%s__()\n",__func__);
}
/*---- do not modify ----*/
void dummy(void)
{
printf("__%s__()\n",__func__);
}
int factorial(int num)
{
int fact = 1;
printf("__%s__()\n",__func__);
while (num > 1)
{
fact *= num;
num--;
}
dummy();
return fact;
}
/*---- do not modify ----*/
typedef void (*dummy_fun)(void);
void set_run_mock()
{
dummy_fun run_ptr, mock_ptr;
uint32_t off;
unsigned char * ptr, * pg;
run_ptr = dummy;
mock_ptr = mocked_dummy;
if (run_ptr > mock_ptr) {
off = run_ptr - mock_ptr;
off = -off - 5;
}
else {
off = mock_ptr - run_ptr - 5;
}
ptr = (unsigned char *)run_ptr;
pg = (unsigned char *)(ptr - ((size_t)ptr % 4096));
if (mprotect(pg, 5, PROT_READ | PROT_WRITE | PROT_EXEC)) {
perror("Couldn't mprotect");
exit(errno);
}
ptr[0] = 0xE9; //x86 JMP rel32
ptr[1] = off & 0x000000FF;
ptr[2] = (off & 0x0000FF00) >> 8;
ptr[3] = (off & 0x00FF0000) >> 16;
ptr[4] = (off & 0xFF000000) >> 24;
}
int main(int argc, char * argv[])
{
// Run for realz
factorial(5);
// Set jmp
set_run_mock();
// Run the mock dummy
factorial(5);
return 0;
}
Portability explanation...
mprotect() - This changes the memory page access permissions so that we can actually write to memory that holds the function code. This isn't very portable, and in a WINAPI env, you may need to use VirtualProtect() instead.
The memory parameter for mprotect is aligned to the previous 4k page, this also can change from system to system, 4k is appropriate for vanilla linux kernel.
The method that we use to jmp to the mock function is to actually put down our own opcodes, this is probably the biggest issue with portability because the opcode I've used will only work on a little endian x86 (most desktops). So this would need to be updated for each arch you plan to run on (which could be semi-easy to deal with in CPP macros.)
The function itself has to be at least five bytes. The is usually the case because every function normally has at least 5 bytes in its prologue and epilogue.
Potential Improvements...
The set_mock_run() call could easily be setup to accept parameters for reuse. Also, you could save the five overwritten bytes from the original function to restore later in the code if you desire.
I'm unable to test, but I've read that in ARM... you'd do similar but you can jump to an address (not an offset) with the branch opcode... which for an unconditional branch you'd have the first bytes be 0xEA and the next 3 bytes are the address.
Chenz
An approach that I have used in the past that has worked well is the following.
For each C module, publish an 'interface' that other modules can use. These interfaces are structs that contain function pointers.
struct Module1
{
int (*getTemperature)(void);
int (*setKp)(int Kp);
}
During initialization, each module initializes these function pointers with its implementation functions.
When you write the module tests, you can dynamically changes these function pointers to its mock implementations and after testing, restore the original implementation.
Example:
void mocked_dummy(void)
{
printf("__%s__()\n",__func__);
}
/*---- do not modify ----*/
void dummyFn(void)
{
printf("__%s__()\n",__func__);
}
static void (*dummy)(void) = dummyFn;
int factorial(int num)
{
int fact = 1;
printf("__%s__()\n",__func__);
while (num > 1)
{
fact *= num;
num--;
}
dummy();
return fact;
}
/*---- do not modify ----*/
int main(int argc, char * argv[])
{
void (*oldDummy) = dummy;
/* with the original dummy function */
printf("factorial of 5 is = %d\n",factorial(5));
/* with the mocked dummy */
oldDummy = dummy; /* save the old dummy */
dummy = mocked_dummy; /* put in the mocked dummy */
printf("factorial of 5 is = %d\n",factorial(5));
dummy = oldDummy; /* restore the old dummy */
return 1;
}
You can replace every function by the use of LD_PRELOAD. You have to create a shared library, which gets loaded by LD_PRELOAD. This is a standard function used to turn programs without support for SOCKS into SOCKS aware programs. Here is a tutorial which explains it.

Copy a function in memory and execute it

I would like to know how in C in can copy the content of a function into memory and the execute it?
I'm trying to do something like this:
typedef void(*FUN)(int *);
char * myNewFunc;
char *allocExecutablePages (int pages)
{
template = (char *) valloc (getpagesize () * pages);
if (mprotect (template, getpagesize (),
PROT_READ|PROT_EXEC|PROT_WRITE) == -1) {
perror ("mprotect");
}
}
void f1 (int *v) {
*v = 10;
}
// allocate enough spcae but how much ??
myNewFunc = allocExecutablePages(...)
/* Copy f1 somewere else
* (how? assume that i know the size of f1 having done a (nm -S foo.o))
*/
((FUN)template)(&val);
printf("%i",val);
Thanks for your answers
You seem to have figured out the part about protection flags. If you know the size of the function, now you can just do memcpy() and pass the address of f1 as the source address.
One big caveat is that, on many platforms, you will not be able to call any other functions from the one you're copying (f1), because relative addresses are hardcoded into the binary code of the function, and moving it into a different location it the memory can make those relative addresses turn bad.
This happens to work because function1 and function2 are exactly the same size in memory.
We need the length of function2 for our memcopy so what should be done is:
int diff = (&main - &function2);
You'll notice you can edit function 2 to your liking and it keeps working just fine!
Btw neat trick. Unfurtunate the g++ compiler does spit out invalid conversion from void* to int... But indeed with gcc it compiles perfectly ;)
Modified sources:
//Hacky solution and simple proof of concept that works for me (and compiles without warning on Mac OS X/GCC 4.2.1):
//fixed the diff address to also work when function2 is variable size
#include "stdio.h"
#include "stdlib.h"
#include "string.h"
#include <sys/mman.h>
int function1(int x){
return x-5;
}
int function2(int x){
//printf("hello world");
int k=32;
int l=40;
return x+5+k+l;
}
int main(){
int diff = (&main - &function2);
printf("pagesize: %d, diff: %d\n",getpagesize(),diff);
int (*fptr)(int);
void *memfun = malloc(4096);
if (mprotect(memfun, 4096, PROT_READ|PROT_EXEC|PROT_WRITE) == -1) {
perror ("mprotect");
}
memcpy(memfun, (const void*)&function2, diff);
fptr = &function1;
printf("native: %d\n",(*fptr)(6));
fptr = memfun;
printf("memory: %d\n",(*fptr)(6) );
fptr = &function1;
printf("native: %d\n",(*fptr)(6));
free(memfun);
return 0;
}
Output:
Walter-Schrepperss-MacBook-Pro:cppWork wschrep$ gcc memoryFun.c
Walter-Schrepperss-MacBook-Pro:cppWork wschrep$ ./a.out
pagesize: 4096, diff: 35
native: 1
memory: 83
native: 1
Another to note is calling printf will segfault because printf is most likely not found due to relative address going wrong...
Hacky solution and simple proof of concept that works for me (and compiles without warning on Mac OS X/GCC 4.2.1):
#include "stdio.h"
#include "stdlib.h"
#include "string.h"
#include <sys/mman.h>
int function1(int x){
return x-5;
}
int function2(int x){
return x+5;
}
int main(){
int diff = (&function2 - &function1);
printf("pagesize: %d, diff: %d\n",getpagesize(),diff);
int (*fptr)(int);
void *memfun = malloc(4096);
if (mprotect(memfun, 4096, PROT_READ|PROT_EXEC|PROT_WRITE) == -1) {
perror ("mprotect");
}
memcpy(memfun, (const void*)&function2, diff);
fptr = &function1;
printf("native: %d\n",(*fptr)(6));
fptr = memfun;
printf("memory: %d\n",(*fptr)(6) );
fptr = &function1;
printf("native: %d\n",(*fptr)(6));
free(memfun);
return 0;
}
I have tried this issue many times in C and came to the conclusion that it cannot be accomplished using only the C language. My main thorn was finding the length of the function to copy.
The Standard C language does not provide any methods to obtain the length of a function. However, one can use assembly language and "sections" to find the length. Once the length is found, copying and executing is easy.
The easiest solution is to create or define a linker segment that contains the function. Write an assembly language module to calculate and publicly declare the length of this segment. Use this constant for the size of the function.
There are other methods that involve setting up the linker, such as predefined areas or fixed locations and copying those locations.
In embedded systems land, most of the code that copies executable stuff into RAM is written in assembly.
This might be a hack solution here. Could you make a dummy variable or function directly after the function (to be copied), obtain that dummy variable's/function's address and then take the functions address to do sum sort of arithmetic using addresses to obtain the function size? This might be possible since memory is allocated linearly and orderly (rather than randomly). This would also keep function copying within a ANSI C portable nature rather than delving into system specific assembly code. I find C to be rather flexible, one just needs to think things out.

Resources