Optimal Memory Utilization in realloc (splitting?) - c

I'm having difficulty with coding my realloc function.
I have it working through standard memcpy procedure, but I can't get it optimized. I know there are two other cases I need to accommodate for: expanding the current block forward, and checking if the current sized block is large enough (and if too large, split it to free memory).
However, I can't seem to get it right. I always get errors. To clarify, these are not compile errors; they are heap integrity checks that fail in a trace driver. If I do it without splitting, I run out of memory, and if I try to split, it says it "failed to preserve the original block/data."
Below is my normal memcpy code. The commented section in the middle is my attempt to expand, but I think I need to split because it's causing a ton of fragmentation, which makes me run out of memory and error out during one of the realloc tests. If I leave that block commented out, it works fine, but there is zero optimization.
My attempts to split always fail; commented code at the bottom is my attempt. What am I doing wrong here?
I would very much appreciate any assistance, thank you. :)
#define PACK(size, alloc) ((size) | (alloc))
#define GET_SIZE(p) (GET(p) & ~0x7)
#define GET_ALLOC(p) (GET(p) & 0x1)
#define HDRP(bp) ((char *)(bp) - WSIZE)
#define FTRP(bp) ((char *)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)
#define NEXT_BLKP(bp) ((char *)(bp) + GET_SIZE(((char *)(bp) - WSIZE)))
void *mm_realloc(void *oldptr, size_t size)
{
void *newptr;
size_t copySize;
copySize = GET_SIZE(HDRP(oldptr));
size_t next_alloc = GET_ALLOC(HDRP(NEXT_BLKP(oldptr)));
// if (copySize > size) return oldptr;
/*if (!next_alloc) {
if ((GET_SIZE(HDRP(oldptr)) + GET_SIZE(HDRP(NEXT_BLKP(oldptr))))>size) {
copySize += GET_SIZE(HDRP(NEXT_BLKP(oldptr)));
PUT(HDRP(oldptr), PACK(copySize,1));
PUT(FTRP(oldptr), PACK(copySize,1));
return oldptr;
}
}*/
newptr = mm_malloc(size);
if (newptr == NULL)
return NULL;
if (size < copySize)
copySize = size;
memcpy(newptr, oldptr, copySize);
PUT(newptr,GET(oldptr));
mm_free(oldptr);
return newptr;
}
// int total_avail = (GET_SIZE(HDRP(oldptr)) + GET_SIZE(HDRP(NEXT_BLKP(oldptr))));
// copySize -= (total_avail - size);
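For comparison, here is a minimal sketch of the two in-place cases described above (shrinking with a split, and growing into a free next block). It is not the asker's allocator: WSIZE, DSIZE, GET and PUT are not shown in the question, so the usual 4-byte-word definitions are assumed, and MIN_BLOCK, block_size_for, place_and_split and realloc_in_place are hypothetical names. The point it tries to illustrate is that the requested payload size is first converted to a full block size (header and footer included) before it is compared with any block size, and that the split writes the remainder's header and footer only past the end of the kept block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* ASSUMED definitions (not shown in the question): 4-byte words. */
#define WSIZE 4
#define DSIZE 8
#define MIN_BLOCK (2 * DSIZE)  /* hypothetical: header + footer + 8 bytes */
#define GET(p) (*(uint32_t *)(p))
#define PUT(p, val) (*(uint32_t *)(p) = (uint32_t)(val))

/* Macros as given in the question. */
#define PACK(size, alloc) ((size) | (alloc))
#define GET_SIZE(p) (GET(p) & ~0x7)
#define GET_ALLOC(p) (GET(p) & 0x1)
#define HDRP(bp) ((char *)(bp) - WSIZE)
#define FTRP(bp) ((char *)(bp) + GET_SIZE(HDRP(bp)) - DSIZE)
#define NEXT_BLKP(bp) ((char *)(bp) + GET_SIZE(((char *)(bp) - WSIZE)))

/* Payload request -> block size: add header/footer, round up to 8 bytes. */
static size_t block_size_for(size_t payload)
{
    size_t asize = (payload + DSIZE + (DSIZE - 1)) & ~(size_t)(DSIZE - 1);
    return asize < MIN_BLOCK ? MIN_BLOCK : asize;
}

/* Mark bp allocated with asize bytes out of total; if the remainder can
 * hold a minimal block, turn it into a free block, else keep it all. */
static void place_and_split(void *bp, size_t total, size_t asize)
{
    if (total - asize >= MIN_BLOCK) {
        PUT(HDRP(bp), PACK(asize, 1));
        PUT(FTRP(bp), PACK(asize, 1));
        char *rest = NEXT_BLKP(bp);
        PUT(HDRP(rest), PACK(total - asize, 0));
        PUT(FTRP(rest), PACK(total - asize, 0));
    } else {
        PUT(HDRP(bp), PACK(total, 1));
        PUT(FTRP(bp), PACK(total, 1));
    }
}

/* Returns bp if the block was resized in place, or NULL if the caller
 * has to fall back to malloc + memcpy + free. */
static void *realloc_in_place(void *bp, size_t size)
{
    assert(bp != NULL && GET_ALLOC(HDRP(bp)));
    size_t asize = block_size_for(size);
    size_t cur = GET_SIZE(HDRP(bp));
    char *next = NEXT_BLKP(bp);

    if (asize <= cur) {                 /* shrink, or already big enough */
        place_and_split(bp, cur, asize);
        return bp;
    }
    if (!GET_ALLOC(HDRP(next)) && cur + GET_SIZE(HDRP(next)) >= asize) {
        /* With an explicit free list, `next` must be unlinked here. */
        place_and_split(bp, cur + GET_SIZE(HDRP(next)), asize);
        return bp;
    }
    return NULL;
}
```

In a real allocator the split-off remainder should also go through the normal free/coalesce path (and be inserted into the free list, if there is one); the sketch skips that. The fallback memcpy should copy only the old payload, i.e. GET_SIZE(HDRP(oldptr)) - DSIZE bytes at most, not the full block size.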

Related

String parsing fails when using a function doing substring in C

I am having an issue with parsing a string in C. It causes a HardFault eventually.
MCU: LPC1769,
OS: FreeRTOS 10,
Toolchain: IAR
To test, I keep sending the same data frame (see the sample in the message variable in the parseMessage function below).
Parsing works as expected the first 5-6 times, and then it suddenly ends in a HardFault when I send the exact same string to the function once more.
I tested the function in OnlineGDB and haven't observed any issue there.
I have a couple of slightly different versions of that function, all with the same result; here is one of them:
char *substr3(char const *input, size_t start, size_t len) {
char *ret = malloc(len+1);
memcpy(ret, input+start, len);
ret[len] = '\0';
return ret;
}
I've extracted the relevant part of the function for a better overview:
(Don't pay attention to the stripEOL(message); call. It just strips out end-of-line characters, and you can see it in my OnlineGDB share.)
void parseMessage(char * message){
//char* message= "7E00002A347C31323030302D3132353330387C33302E30372E323032307C31317C33307C33317C31352D31367C31357C317C57656E67657274880D";
// Parsing the frame
char* start;
char* len;
char* cmd;
char* data;
char* chksum;
char* end;
stripEOL(message);
unsigned int messagelen = strlen(message);
start = substr3(message, 0, 2);
len = substr3(message, 2, 4);
cmd = substr3(message, 6, 2);
data = substr3(message, 8, messagelen-8-4);
chksum = substr3(message, messagelen-4, 2);
end = substr3(message, messagelen-2, 2);
}
Only the data variable differs in length.
e.g. data --> "347C31323030302D3132353330387C33302E30372E323032307C31317C33307C33317C31352D31367C31357C317C57656E67657274"
A HardFault debug log:
LR = 0x8667 in disassembly
PC = 0x2dd0 in disassembly
I appreciate the contributors, who led me to the solution for my case.
Since none of the answers gave a complete solution and I found one that works, I'm writing it up for anyone who may be interested in the future.
I am developing my application on top of FreeRTOS 10 and was using malloc from the C library, and apparently the two weren't cooperating, at least in my implementation. Some resources say you can use the standard malloc within FreeRTOS, but I couldn't get it to work, for a reason I haven't identified. Increasing the heap size might have helped; I don't know, and I didn't intend to go that way anyway.
I just placed these two wrapper functions in a common file, without changing any of my malloc and free calls.
Creating malloc/free functions that work with the built-in FreeRTOS heap is quite simple. We just wrap the pvPortMalloc/vPortFree calls:
void* malloc(size_t size)
{
void* ptr = NULL;
if(size > 0)
{
// We simply wrap the FreeRTOS call into a standard form
ptr = pvPortMalloc(size);
} // else NULL if there was an error
return ptr;
}
void free(void* ptr)
{
if(ptr)
{
// We simply wrap the FreeRTOS call into a standard form
vPortFree(ptr);
}
}
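Note that you can't use this with heap scheme 1 (heap_1.c), which never frees memory, nor with scheme 3 (heap_3.c), which itself calls the C library malloc and free, so these wrappers would end up calling themselves. It works with the others (2, 4 and 5).
I would recommend starting with portable/MemMang/heap_4.c.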

Producer-consumer algorithm to use full buffer

I was reading the Galvin OS book about the producer-consumer problem and came across this piece of code.
Global definitions
#define BUFFER_SIZE 10
typedef struct {
. . .
} item;
int in = 0;
int out = 0;
Producer
while (((in + 1) % BUFFER_SIZE) == out)
; /* do nothing */
buffer[in] = next_produced;
in = (in + 1) % BUFFER_SIZE ;
Consumer
while (in == out)
; /* do nothing */
next_consumed = buffer[out];
out = (out + 1) % BUFFER_SIZE;
Now this is what Galvin book says:
This scheme allows at most BUFFER_SIZE − 1 items in the buffer at the
same time. We leave it as an exercise for you to provide a solution in which
BUFFER_SIZE items can be in the buffer at the same time.
This is what I came up with. Is this correct?
Producer
buffer[in] = next_produced; //JUST MOVED THIS LINE!
while (((in + 1) % BUFFER_SIZE ) == out)
; /* do nothing */
in = (in + 1) % BUFFER_SIZE;
Consumer
while (in == out)
; /* do nothing */
next_consumed = buffer[out];
out = (out + 1) % BUFFER_SIZE;
I think this solves it, but is it correct? Is there a better solution?
In the original piece of code, in == out could mean the buffer is either empty or full. To avoid that ambiguity, the original code never lets the buffer become full; it always leaves at least one slot empty.
I am not sure you are solving this problem with your change: you will be able to put BUFFER_SIZE items in, but you will not be able to consume them. So you have literally solved it, but it will not function properly.
Basically, to solve this problem you need an extra piece of information, so that you can distinguish between an empty buffer and a full one. There are a variety of solutions for that; the most obvious is to add an extra flag.
The most elegant, IMO, is to use the in and out counters as is, wrapping them only to access the buffer, so:
when in == out -- the buffer is empty
when abs(in - out) == BUFFER_SIZE -- the buffer is full
to access the buffer we should use buffer[in % BUFFER_SIZE] or buffer[out % BUFFER_SIZE]
We leave it as an exercise for you to provide a complete solution ;)
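For readers who want to check their own attempt, here is one possible sketch of the counter idea above. It is a single-threaded illustration only: rb_put and rb_get return false instead of busy-waiting, item is a placeholder typedef (the book elides the struct fields), and the names rb_count, rb_put and rb_get are made up for this example. One assumption differs slightly from the description: the counters are wrapped at 2 * BUFFER_SIZE rather than left to run freely, so that the arithmetic stays correct for a BUFFER_SIZE that is not a power of two.

```c
#include <assert.h>
#include <stdbool.h>

#define BUFFER_SIZE 10

typedef int item;               /* placeholder for the book's struct */

static item buffer[BUFFER_SIZE];
static unsigned in = 0;         /* both counters live in [0, 2*BUFFER_SIZE) */
static unsigned out = 0;

/* Number of items currently stored: 0 .. BUFFER_SIZE inclusive. */
static unsigned rb_count(void)
{
    unsigned n = (in + 2 * BUFFER_SIZE - out) % (2 * BUFFER_SIZE);
    assert(n <= BUFFER_SIZE);
    return n;
}

/* Producer side: returns false when all BUFFER_SIZE slots are in use. */
static bool rb_put(item v)
{
    if (rb_count() == BUFFER_SIZE)
        return false;                       /* full */
    buffer[in % BUFFER_SIZE] = v;           /* wrap only for indexing */
    in = (in + 1) % (2 * BUFFER_SIZE);
    return true;
}

/* Consumer side: returns false when the buffer is empty. */
static bool rb_get(item *v)
{
    if (in == out)
        return false;                       /* empty */
    *v = buffer[out % BUFFER_SIZE];
    out = (out + 1) % (2 * BUFFER_SIZE);
    return true;
}
```

Because in and out can differ by exactly BUFFER_SIZE without being equal, "full" and "empty" are now distinct states. A version shared between real concurrent threads would additionally need atomic loads and stores (or a lock) on in and out.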

Memory allocation threshold (mmap vs malloc)

I'd like to point out that I'm new to this, so I'm trying to understand and explain it as best I can.
I am basically trying to figure out if it's possible to keep memory allocation under a threshold due to the memory limitation of my project.
Here is how memory is currently allocated, using the third-party libsodium:
alloc_region(escrypt_region_t *region, size_t size)
{
uint8_t *base, *aligned;
#if defined(MAP_ANON) && defined(HAVE_MMAP)
if ((base = (uint8_t *) mmap(NULL, size, PROT_READ | PROT_WRITE,
#ifdef MAP_NOCORE
MAP_ANON | MAP_PRIVATE | MAP_NOCORE,
#else
MAP_ANON | MAP_PRIVATE,
#endif
-1, 0)) == MAP_FAILED)
base = NULL; /* LCOV_EXCL_LINE */
aligned = base;
#elif defined(HAVE_POSIX_MEMALIGN)
if ((errno = posix_memalign((void **) &base, 64, size)) != 0) {
base = NULL;
}
aligned = base;
#else
base = aligned = NULL;
if (size + 63 < size)
errno = ENOMEM;
else if ((base = (uint8_t *) malloc(size + 63)) != NULL) {
aligned = base + 63;
aligned -= (uintptr_t) aligned & 63;
}
#endif
region->base = base;
region->aligned = aligned;
region->size = base ? size : 0;
return aligned;
}
So, for example, this currently calls posix_memalign to allocate (e.g.) 32 MB of memory.
32 MB exceeds the memory cap given to me (it does not trigger memory warnings, as the actual memory capacity is far greater; the cap is just what I'm allowed to use).
From some googling, I'm under the impression I can use mmap and virtual memory.
I can see that the function above already has an mmap path, but it is never called.
Is it possible to convert the above code so that I never exceed my 30 MB memory limit?
From my understanding, if this allocation exceeded my free memory, it would automatically be allocated in virtual memory? So can I force this to happen and pretend that my free space is lower than what is actually available?
Any help is appreciated.
UPDATE
/* Allocate memory. */
B_size = (size_t) 128 * r * p;
V_size = (size_t) 128 * r * N;
need = B_size + V_size;
if (need < V_size) {
errno = ENOMEM;
return -1;
}
XY_size = (size_t) 256 * r + 64;
need += XY_size;
if (need < XY_size) {
errno = ENOMEM;
return -1;
}
if (local->size < need) {
if (free_region(local)) {
return -1;
}
if (!alloc_region(local, need)) {
return -1;
}
}
B = (uint8_t *) local->aligned;
V = (uint32_t *) ((uint8_t *) B + B_size);
XY = (uint32_t *) ((uint8_t *) V + V_size);
I am basically trying to figure out if it's possible to keep memory allocation under a threshold due to the memory limitation of my project.
On Linux or POSIX systems, you might consider using setrlimit(2) with RLIMIT_AS:
This is the maximum size of the process's virtual memory
(address space) in bytes. This limit affects calls to brk(2),
mmap(2), and mremap(2), which fail with the error ENOMEM upon
exceeding this limit.
Above this limit, mmap would fail, and so would, for instance, the call to malloc(3) that triggered that particular use of mmap.
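As an illustration of the setrlimit(2) suggestion, here is a minimal sketch. cap_address_space is a hypothetical helper name, not part of libsodium or libc; it lowers only the soft limit and leaves the hard limit alone, so the process could raise it again later.

```c
#include <assert.h>
#include <sys/resource.h>

/* Hypothetical helper: lower the soft RLIMIT_AS limit to max_bytes.
 * Returns 0 on success, -1 on failure (errno set by the failing call). */
int cap_address_space(rlim_t max_bytes)
{
    struct rlimit rl;

    assert(max_bytes > 0);
    if (getrlimit(RLIMIT_AS, &rl) != 0)
        return -1;

    /* The soft limit may not exceed the hard limit, so clamp it. */
    if (rl.rlim_max != RLIM_INFINITY && max_bytes > rl.rlim_max)
        max_bytes = rl.rlim_max;

    rl.rlim_cur = max_bytes;    /* hard limit (rlim_max) left untouched */
    return setrlimit(RLIMIT_AS, &rl);
}
```

It would be called once, early in main, before the library allocates. Keep in mind that RLIMIT_AS covers the whole address space of the process (code, shared libraries, stacks and heap), so the cap usually needs some headroom above the 30 MB budget for the data itself. The limit does not make the request smaller: with the cap in place, the mmap or posix_memalign call in alloc_region fails and alloc_region returns NULL, which the caller in the UPDATE snippet already handles by returning -1.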
I'm under the impression I can use mmap
Notice that malloc(3) will call mmap(2) (or sometimes sbrk(2)...) to retrieve (virtual) memory from the kernel, thus growing your virtual address space. However, malloc often prefers to reuse previously free-d memory (when available). And free usually won't call munmap(2) to release memory chunks but prefers to keep them for future malloc-s. Actually, most C standard libraries distinguish between "small" and "large" allocations (in practice, a malloc for a gigabyte will use mmap, and the corresponding free will munmap immediately).
See also mallopt(3) and madvise(2). In case you need to lock some pages (obtained by mmap) into physical RAM, consider mlock(2).
Look also into this answer (explaining that the notion of RAM used by a particular process is not that easy).
For malloc related bugs (including memory leaks) use valgrind.

Sending/Handling partial data writes in TCP

So, I have the following code which sends out my packet on TCP. It's working pretty well. I just have to test partial writes, so I write 1 byte at a time, either by setting sendbuf to 1 or with a hack as shown below. When I took a tcpdump, everything was incorrect except the first byte. What am I doing wrong?
int tmi_transmit_packet(struct tmi_msg_pdu *tmi_pkt, int len, int *written_len)
{
int bytes;
// This works
bytes = write(g_tmi_mgr->tmi_conn_fd, (void*) tmi_pkt, len);
// This doesn't:
// bytes = write(g_tmi_mgr->tmi_conn_fd, (void*) tmi_pkt, 1);
if (bytes < 0) {
if (errno == EAGAIN) {
return (TMI_SOCK_FULL);
}
return (TMI_WRITE_FAILED);
} else if (bytes < len) {
*written_len += bytes;
tmi_pkt += bytes;
return (tmi_transmit_packet(tmi_pkt, len - bytes, written_len));
} else {
*written_len += len;
}
return TMI_SUCCESS;
}
This line
tmi_pkt += bytes;
most probably does not do what you expect.
It increments tmi_pkt by sizeof(*tmi_pkt) * bytes, not by bytes alone. For a nice explanation of pointer arithmetic, you might like to have a look at Binky.
To get around this, you might modify your code as follows:
...
else if (bytes < len) {
void * pv = ((char *) tmi_pkt) + bytes;
*written_len += bytes;
return (tmi_transmit_packet(pv, len - bytes, written_len));
}
...
Anyhow, this smells somewhat dirty, as the data pointed to by the pointer passed into the write function does not necessarily correspond to its type.
So a cleaner solution would be to use void * or char * rather than struct tmi_msg_pdu *tmi_pkt as the function parameter declaration.
Although quite extravagant, the use of recursive calls here is neither necessary nor recommended. With a lot of data and/or a slow transmission, it may run out of stack memory. A simple loop would do as well. A loop has the advantage that you could use a temporary pointer to the buffer to be written and stick to a typed interface.
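A loop-based version could look roughly like the sketch below. write_all is a hypothetical helper name, and its return convention (0 or -1) is an assumption for illustration rather than the TMI_* codes from the question; the caller can keep the typed struct tmi_msg_pdu * interface and simply pass the pointer in as const void *.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write len bytes from buf to fd, retrying after
 * partial writes. *written always holds the number of bytes sent so far.
 * Returns 0 when everything was written, -1 on error (errno is set,
 * e.g. EAGAIN on a full non-blocking socket). */
int write_all(int fd, const void *buf, size_t len, size_t *written)
{
    const char *p = buf;        /* byte pointer: += n advances n bytes */
    size_t left = len;

    assert(written != NULL && (buf != NULL || len == 0));
    *written = 0;

    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return -1;
        }
        p += n;
        left -= (size_t)n;
        *written += (size_t)n;
    }
    return 0;
}
```

On EAGAIN the caller can wait for the socket to become writable and call again with the buffer advanced by *written bytes, which replaces the recursion with a bounded amount of stack.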

Simple C implementation to track memory malloc/free?

programming language: C
platform: ARM
Compiler: ADS 1.2
I need to keep track of simple malloc/free calls in my project. I just need a very basic idea of how much heap memory is required when the program has allocated all its resources. Therefore, I have provided wrappers for the malloc/free calls. In these wrappers I need to increment a current memory count when malloc is called and decrement it when free is called. The malloc case is straightforward, as I have the size to allocate from the caller. I am wondering how to deal with the free case, as I need to store the pointer/size mapping somewhere. This being C, I do not have a standard map to implement this easily.
I am trying to avoid linking in any libraries, so I would prefer a *.c/h implementation.
So I am wondering if there is already a simple implementation someone could point me to. If not, this is motivation to go ahead and implement one.
EDIT: Purely for debugging and this code is not shipped with the product.
EDIT: Initial implementation based on answer from Makis. I would appreciate feedback on this.
EDIT: Reworked implementation
#include <stdlib.h>
#include <stdio.h>
#include <assert.h>
#include <string.h>
#include <limits.h>
#include <stdbool.h> /* bool, true (C99); on a C90-only compiler, define these by hand */
static size_t gnCurrentMemory = 0;
static size_t gnPeakMemory = 0;
void *MemAlloc (size_t nSize)
{
void *pMem = malloc(sizeof(size_t) + nSize);
if (pMem)
{
size_t *pSize = (size_t *)pMem;
memcpy(pSize, &nSize, sizeof(nSize));
gnCurrentMemory += nSize;
if (gnCurrentMemory > gnPeakMemory)
{
gnPeakMemory = gnCurrentMemory;
}
printf("PMemAlloc (%p) - Size (%lu), Current (%lu), Peak (%lu)\n",
(void *)(pSize + 1), (unsigned long)nSize, (unsigned long)gnCurrentMemory, (unsigned long)gnPeakMemory);
return(pSize + 1);
}
return NULL;
}
void MemFree (void *pMem)
{
if(pMem)
{
size_t *pSize = (size_t *)pMem;
// Get the size
--pSize;
assert(gnCurrentMemory >= *pSize);
printf("PMemFree (%p) - Size (%lu), Current (%lu), Peak (%lu)\n",
pMem, (unsigned long)*pSize, (unsigned long)gnCurrentMemory, (unsigned long)gnPeakMemory);
gnCurrentMemory -= *pSize;
free(pSize);
}
}
#define BUFFERSIZE (1024*1024)
typedef struct
{
bool flag;
int buffer[BUFFERSIZE];
bool bools[BUFFERSIZE];
} sample_buffer;
typedef struct
{
unsigned int whichbuffer;
char ch;
} buffer_info;
int main(void)
{
unsigned int i;
buffer_info *bufferinfo;
sample_buffer *mybuffer;
char *pCh;
printf("Testing MemAlloc - MemFree\n");
mybuffer = (sample_buffer *) MemAlloc(sizeof(sample_buffer));
if (mybuffer == NULL)
{
printf("ERROR ALLOCATING mybuffer\n");
return EXIT_FAILURE;
}
bufferinfo = (buffer_info *) MemAlloc(sizeof(buffer_info));
if (bufferinfo == NULL)
{
printf("ERROR ALLOCATING bufferinfo\n");
MemFree(mybuffer);
return EXIT_FAILURE;
}
pCh = (char *)MemAlloc(sizeof(char));
printf("finished malloc\n");
// fill allocated memory with integers and read back some values
for(i = 0; i < BUFFERSIZE; ++i)
{
mybuffer->buffer[i] = i;
mybuffer->bools[i] = true;
bufferinfo->whichbuffer = (unsigned int)(i/100);
}
MemFree(bufferinfo);
MemFree(mybuffer);
if(pCh)
{
MemFree(pCh);
}
return EXIT_SUCCESS;
}
You could allocate a few extra bytes in your wrapper and put either an id (if you want to be able to couple malloc() and free()) or just the size there. Just malloc() that much more memory, store the information at the beginning of your memory block, and move the pointer you return that many bytes forward.
This can, btw, also easily be used for fence pointers/finger-prints and such.
Either you can have access to internal tables used by malloc/free (see this question: Where Do malloc() / free() Store Allocated Sizes and Addresses? for some hints), or you have to manage your own tables in your wrappers.
You could always use valgrind instead of rolling your own implementation. If you don't care about the amount of memory you allocate you could use an even simpler implementation: (I did this really quickly so there could be errors and I realize that it is not the most efficient implementation. The pAllocedStorage should be given an initial size and increase by some factor for a resize etc. but you get the idea.)
EDIT: I missed that this was for ARM, to my knowledge valgrind is not available on ARM so that might not be an option.
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
static size_t indexAllocedStorage = 0;
static size_t *pAllocedStorage = NULL;
static unsigned int free_calls = 0;
static unsigned long long int total_mem_alloced = 0;
void *
my_malloc(size_t size){
size_t *temp;
void *p = malloc(size);
if(p == NULL){
fprintf(stderr,"my_malloc malloc failed, %s", strerror(errno));
exit(EXIT_FAILURE);
}
total_mem_alloced += size;
temp = (size_t *)realloc(pAllocedStorage, (indexAllocedStorage+1) * sizeof(size_t));
if(temp == NULL){
fprintf(stderr,"my_malloc realloc failed, %s", strerror(errno));
exit(EXIT_FAILURE);
}
pAllocedStorage = temp;
pAllocedStorage[indexAllocedStorage++] = (size_t)p;
return p;
}
void
my_free(void *p){
size_t i;
int found = 0;
for(i = 0; i < indexAllocedStorage; i++){
if(pAllocedStorage[i] == (size_t)p){
pAllocedStorage[i] = (size_t)NULL;
found = 1;
break;
}
}
if(!found){
printf("Free Called on unknown\n");
}
free_calls++;
free(p);
}
void
free_check(void) {
size_t i;
printf("checking freed memory\n");
for(i = 0; i < indexAllocedStorage; i++){
if(pAllocedStorage[i] != (size_t)NULL){
printf( "Memory leak %p\n", (void *)pAllocedStorage[i]); /* %p avoids truncating the address */
free((void *)pAllocedStorage[i]);
}
}
free(pAllocedStorage);
pAllocedStorage = NULL;
}
I would use rmalloc. It is a simple library (actually it is only two files) to debug memory usage, but it also has support for statistics. Since you already have wrapper functions, it should be very easy to use rmalloc for it. Keep in mind that you also need to replace strdup, etc.
Your program may also need to intercept realloc(), calloc(), getcwd() (as it may allocate memory when the buffer is NULL in some implementations) and maybe strdup() or a similar function, if it is supported by your compiler.
If you are running on x86 you could just run your binary under valgrind and it would gather all this information for you, using the standard implementation of malloc and free. Simple.
I've been trying out some of the same techniques mentioned on this page and wound up here from a google search. I know this question is old, but wanted to add for the record...
1) Does your operating system not provide any tools to see how much heap memory is in use in a running process? I see you're talking about ARM, so this may well be the case. In most full-featured OSes, this is just a matter of using a cmd-line tool to see the heap size.
2) If available in your libc, sbrk(0) on most platforms will tell you the end address of your data segment. If you have it, all you need to do is store that address at the start of your program (say, startBrk=sbrk(0)), then at any time your allocated size is sbrk(0) - startBrk.
3) If shared objects can be used, you're dynamically linking to your libc, and your OS's runtime loader has something like an LD_PRELOAD environment variable, you might find it more useful to build your own shared object that defines the actual libc functions with the same symbols (malloc(), not MemAlloc()), then have the loader load your lib first and "interpose" the libc functions. You can further obtain the addresses of the actual libc functions with dlsym() and the RTLD_NEXT flag so you can do what you are doing above without having to recompile all your code to use your malloc/free wrappers. It is then just a runtime decision when you start your program (or any program that fits the description in the first sentence) where you set an environment variable like LD_PRELOAD=mymemdebug.so and then run it. (google for shared object interposition.. it's a great technique and one used by many debuggers/profilers)
