Memory usage by arrays in C - c

#include <stdlib.h>

int main() {
    int i = 0, ARRAY_SIZE = 500000;
    char **char_array;
    char_array = (char **)malloc(ARRAY_SIZE * sizeof(char*));
    //physical memory used before loop = M KB
    for (i = 0; i < ARRAY_SIZE; i++) {
        char_array[i] = (char *)malloc(16 * sizeof(char));
    }
    //physical memory usage after loop = M+19532 KB
    return 0;
}
I have the above piece of code. I don't understand where the 19532 KB of memory use is coming from. On my machine (64-bit), sizeof(char*) should be 8 bytes. How is memory used when each row of the array is allocated like this? I'm a beginner in C, so any help would be appreciated.

Every call to malloc uses some additional "overhead" memory beyond the amount you actually request. The library needs some space to keep track of those blocks to know how to free them later, and it may round up the size of small allocations for alignment.
Your allocation of char_array itself needs 500000 * 8 bytes, about 4 megabytes of memory, and the overhead on that single block is probably negligible. That leaves roughly 16 megabytes for the loop, so it looks like your 500000 small allocations are actually using about 32 bytes each rather than the 16 you requested. This wouldn't be too unusual; for instance, malloc might be keeping each block at 16 bytes for alignment, and then needing an additional 16 bytes for bookkeeping (say, 8 bytes each for a size and a pointer to the next block to make a linked list).
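The estimate above can be reproduced with a little arithmetic. This is only a sketch: the helper name bytes_per_allocation is made up for illustration, and it assumes the whole 19532 KB increase is explained by the pointer array plus the 500000 small blocks.

```c
#include <stddef.h>

/* Hypothetical helper: given the observed growth in memory (in KB), the number
   of small allocations, and the size of one entry in the pointer array,
   estimate how many bytes each small allocation really costs. */
size_t bytes_per_allocation(size_t growth_kb, size_t count, size_t pointer_size)
{
    size_t growth_bytes = growth_kb * 1024;       /* 19532 KB -> 20000768 bytes */
    size_t pointer_array = count * pointer_size;  /* 500000 * 8 = 4000000 bytes */

    if (count == 0 || growth_bytes < pointer_array)
        return 0;                                 /* inputs do not make sense */
    return (growth_bytes - pointer_array) / count; /* integer division */
}
```

With the numbers from the question, bytes_per_allocation(19532, 500000, 8) comes out at 32.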
Obviously, this is a very inefficient way to allocate space for 500000 small buffers. If they are all the same size, you should instead allocate a single array of 500000 * 16 char (not pointers) and index into it directly.
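A minimal sketch of the single-allocation approach, assuming every entry needs the same fixed size (16 bytes here, as in the question's loop). The names SLOT_SIZE, pool_create and pool_slot are invented for this example.

```c
#include <stdlib.h>

#define SLOT_SIZE 16  /* bytes per entry, matching the 16 in the question */

/* Allocate one contiguous block holding `count` slots of SLOT_SIZE bytes.
   Returns NULL on failure or if the total size would overflow size_t. */
char *pool_create(size_t count)
{
    if (count > (size_t)-1 / SLOT_SIZE)
        return NULL;
    return malloc(count * SLOT_SIZE);
}

/* Address of slot i inside the pool; no per-slot malloc is needed. */
char *pool_slot(char *pool, size_t i)
{
    return pool + i * SLOT_SIZE;
}
```

The whole pool pays the allocator's bookkeeping cost once, and a single free(pool) releases every slot.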

malloc is allowed to arrange and "align" the blocks of memory you allocate in really any way it wishes, and in practice there are both minimum allocation sizes and internal bookkeeping overhead for each block. Tiny allocations (inside the loop you're allocating small blocks of char, not char *s) are uncommon because they're inefficient to do like this; in practice you're probably actually allocating (at least) 16 bytes plus the internal overhead, each.
Whatever "physical memory" measure you're looking at does not map precisely to what you're looking for here. malloc generally takes pools of memory from the OS and then doles it out into the smaller allocated blocks (with internal overhead) that you see at the level of your program. I would not expect any kind of exact correspondence between bytes allocated by the user program via malloc, and what the OS reports about the process. Perhaps a directional correlation holding all else equal, but not something worth spending much time trying to deduce conclusions from.

Related

calling malloc multiple times - higher than expected memory usage

When I run the following 32-bit application (debug mode) under Windows, memory usage reaches the 2 GB limit and the loop breaks when i equals 42885988:
for(int i = 0; i < 104857600; ++i)
{
    uint8_t* ptr = (uint8_t*)malloc(1);
    if (!ptr)
    {
        break;
    }
    *ptr = 0;
}
104857600 bytes is only 100 MB, so how can the behavior of the above program be explained?
malloc(1) doesn't allocate one byte.
The malloc man page notes that the memory returned "is suitably aligned for any built-in type." So if the first call to malloc returns address 0x1000, the second call probably can't return 0x1001, because that address might not be "suitably aligned for any built-in type." (Some processors can't access words at odd addresses, or generally N-byte values at addresses not evenly divisible by N, and some of those that can do so less efficiently.) So the second malloc call will have to return at least 0x1004 or even 0x1008.
Also, malloc has to allocate extra memory to store information about the buffer it returns to you. When you later call free, that function has to know the size of the buffer, for example. On a 64-bit machine that's at least another 8 bytes. Depending on how the runtime manages the heap, it may have to store additional information.
If you assume that each malloc actually allocates at least 8 bytes (for alignment) plus another 8 or 16 for housekeeping, you can see that 100 million calls to malloc of one byte each can get you over 2GB.
I'm not sure if each of your calls is actually using 16 or 24 bytes or whatever; the point is that it's a lot more than one.
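The back-of-the-envelope estimate above can be written down as code. This is a sketch under the same assumptions as the text (each request padded to an 8-byte boundary, plus a fixed number of housekeeping bytes); the function names are made up, and a real heap may behave differently.

```c
/* Round `size` up to the next multiple of `alignment` (alignment must be > 0). */
unsigned long long round_up(unsigned long long size, unsigned long long alignment)
{
    return (size + alignment - 1) / alignment * alignment;
}

/* Estimated total footprint of `count` allocations of `request` bytes each,
   assuming each one is padded to `alignment` and carries `header` extra bytes. */
unsigned long long estimated_total(unsigned long long count,
                                   unsigned long long request,
                                   unsigned long long alignment,
                                   unsigned long long header)
{
    return count * (round_up(request, alignment) + header);
}
```

For 104857600 one-byte requests with 8-byte alignment and 16 bytes of housekeeping, estimated_total(104857600ULL, 1, 8, 16) is 2516582400 bytes, which is already past the 2 GB limit of a 32-bit process.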
2GB/42885988 is a shade over 50 bytes per allocation.
This is more than would be expected from a simple Windows heap allocation, so I suspect you are running a DEBUG build, in which case there is extra overhead of guard bytes around your allocated memory. More details can be found in this article - http://www.nobugs.org/developer/win32/debug_crt_heap.html .

Having malloced some memory, I couldn't calculate the proper size of the memory I malloced. I don't know why

Having malloced some memory, I couldn't calculate the proper size of the memory I malloced. The system told me that I had malloced 2 GBytes, but my code told me that I had malloced just 119 MBytes. I don't know what was wrong.
#include <stdio.h>
#include <stdlib.h>
int main(void){
    long long size = 0;
    while(malloc(1) != NULL){
        size = size + 1;
    }
    long long res = size>>20;
    printf("%lld MBytes\n",res);
    scanf("%lld",&size);
    return 0;
}
The code for malloc and free has to maintain data structures keeping track of each block of memory it's given you. Let's imagine there's a 16-byte data structure per block of memory returned by malloc. You said you allocated 119 MB, which since you allocated 1-byte blocks, suggests you had something like 124780544 blocks. If each block has 16 bytes of overhead, that's 124780544 x 16 = 1996488704 bytes of overhead. 124780544 + 1996488704 = 2121269248, or just about exactly 2 GB.
(This doesn't prove that your system is, in fact, using exactly 16 bytes of overhead for each returned block -- it's probably more complicated than that -- but the result is certainly suggestive.)
The moral is that allocating lots and lots of tiny blocks of memory can be pretty wasteful.
If you change your test program to allocate blocks of, say, 1k at a time, you'll probably get a more palatable result.
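A hedged sketch of such a variant follows. Unlike the original test it stops at a caller-supplied cap instead of exhausting memory, and it frees everything afterwards; the function name allocate_up_to and the chaining trick are choices made for this example, not part of the original program.

```c
#include <stdlib.h>

/* Allocate blocks of `block_size` bytes until malloc fails or `max_bytes`
   have been requested, then free them all. Returns the number of bytes that
   were successfully requested. Each block stores a pointer to the previous
   one, so block_size is raised to sizeof(void *) if it is smaller. */
size_t allocate_up_to(size_t block_size, size_t max_bytes)
{
    void *head = NULL;
    size_t total = 0;

    if (block_size < sizeof(void *))
        block_size = sizeof(void *);

    while (total + block_size <= max_bytes) {
        void **block = malloc(block_size);
        if (block == NULL)
            break;
        *block = head;          /* chain the blocks so they can be freed */
        head = block;
        total += block_size;
    }

    while (head != NULL) {
        void *next = *(void **)head;
        free(head);
        head = next;
    }
    return total;
}
```

Calling allocate_up_to(1024, 1024 * 1024) requests 1 MB in 1k blocks. Comparing the system's view of the process for different block sizes should show the gap between requested and used memory shrinking as the blocks get larger.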

Why are addresses not consecutive when allocating single bytes?

I am dynamically allocating memory as follows:
char* heap_start1 = (char*) malloc(1);
char* heap_start2 = (char*) malloc(1);
When I print them as follows, the addresses are surprisingly not consecutive.
printf("%p, %p \n",heap_start1,heap_start2);
Result:
0x8246008, 0x8246018
As you can see, 15 bytes of extra memory are left unused between the two blocks. It's definitely not because of word alignment. Any idea what is behind this peculiar alignment?
Thanks in advance!
I am using gcc on Linux, if that matters.
glibc's malloc, for small memory allocations less than 16 bytes, simply allocates the memory as 16 bytes. This is to prevent external fragmentation upon the freeing of this memory, where blocks of free memory are too small to be used in the general case to fulfill new malloc operations.
A block allocated by malloc must also be large enough to store the data required to track it in the data structure which stores free blocks.
This behaviour, while increasing internal fragmentation, decreases overall fragmentation throughout the system.
Source:
http://repo.or.cz/w/glibc.git/blob/HEAD:/malloc/malloc.c
(Read line 108 in particular)
/*
...
Minimum allocated size: 4-byte ptrs: 16 bytes (including 4 overhead)
...
*/
Furthermore, all addresses returned by the malloc call in glibc are aligned to 2 * sizeof(size_t) bytes, which is 64 bits for 32-bit systems (such as yours) and 128 bits for 64-bit systems.
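The two addresses from the question can be checked against these claims with a couple of small helpers. This is an illustrative sketch; address_gap and is_aligned are names invented here, and the addresses are passed as plain integers so the check does not depend on any particular allocator.

```c
#include <stddef.h>
#include <stdint.h>

/* Distance in bytes between two addresses, whichever order they come in. */
uintptr_t address_gap(uintptr_t first, uintptr_t second)
{
    return second > first ? second - first : first - second;
}

/* Nonzero if `address` is a multiple of `alignment` (alignment must be > 0). */
int is_aligned(uintptr_t address, size_t alignment)
{
    return alignment != 0 && address % alignment == 0;
}
```

For 0x8246008 and 0x8246018 the gap is 16 bytes, matching the 16-byte minimum allocated size, and both addresses are multiples of 8, matching 2 * sizeof(size_t) on a 32-bit system. To apply the helpers to live pointers, cast them with (uintptr_t)ptr.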
At least three possible reasons:
malloc needs to produce memory that is suitably-aligned for all primitive types. Data for SSE instructions needs to be 128-bit aligned. (There may also be other 128-bit primitive types that your platform supports that don't occur to me at the moment.)
A typical implementation of malloc involves "over-allocation" in order to store bookkeeping information for a speedy free. Not sure if GCC on Linux does this.
It may be allocating guard bytes in order to allow detection of buffer overflows and so on.
malloc guarantees that the returned memory is properly aligned for any basic type. Moreover, the memory block could be padded with some guard bytes to check for memory corruption, depending on settings.
If you want consecutive addresses, you should allocate them in the same malloc call:
char *heap_start1, *heap_start2;
heap_start1 = (char*) malloc(2 * sizeof(char));
heap_start2 = heap_start1 + 1;

What is the difference between the two allocation methods?

I want to test how much memory the OS actually allocates when I request 24M.
for (i = 0; i < 1024*1024; i++)
    ptr = (char *)malloc(24);
When I write it like this, the top command reports a RES of 32M.
ptr = (char *)malloc(24*1024*1024);
But when I make this small change, RES is 244. What is the difference between them? Why is the result 244?
The allocator has its own data structures about the bookkeeping that require memory as well. When you allocate in small chunks (the first case), the allocator has to keep a lot of additional data about where each chunk is allocated and how long it is. Moreover, you may get gaps of unused memory in between the chunks because malloc has a requirement to return a sufficiently aligned block, most usually on an 8-byte boundary.
In the second case, the allocator gives you just one contiguous block and does bookkeeping only for that block.
Always be careful with a large number of small allocations, as the bookkeeping memory overhead may even outweigh the amount of the data itself.
The second allocation barely touches the memory. The allocator tells you "okay, you can have it" but if you don't actually touch the memory, the OS never actually gives it to you, hoping you'll never use it. Bit like a Ponzi scheme. On the other hand, the other method writes something (a few bytes at most) to many pages, so the OS is forced to actually give you the memory.
Try this to verify; you should get about 24M of usage:
memset(ptr, 1, 1024 * 1024 * 24);
In short, top doesn't tell you how much you allocated, i.e. what you asked from malloc. It tells you what the OS allocated to your process.
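Writing to every byte, as memset does, is more than is strictly needed to see the effect. A minimal sketch, assuming the OS hands out memory lazily in fixed-size pages (the page size is passed in by the caller; 4096 is a common value but is an assumption here, and touch_pages is a made-up name):

```c
#include <stdlib.h>

/* Write one byte at every `page_size`-byte step of the block. On systems
   that commit memory lazily this is roughly enough to make the OS back the
   block with physical memory. Returns the number of writes performed. */
size_t touch_pages(char *block, size_t length, size_t page_size)
{
    size_t touched = 0;

    if (block == NULL || page_size == 0)
        return 0;
    for (size_t offset = 0; offset < length; offset += page_size) {
        block[offset] = 1;
        touched++;
    }
    return touched;
}
```

After ptr = malloc(24 * 1024 * 1024), a call such as touch_pages(ptr, 24 * 1024 * 1024, 4096) performs 6144 writes, and the RES column in top would be expected to rise to roughly the size of the block.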
In addition to what has been said:
It could be that some compilers notice how you allocate multiple 24 Byte Blocks in a loop, assigning their addresses to the same pointer and keeping only the last block you allocated, effectively rendering every other malloc from before useless. So it may optimize your whole loop into something like this:
ptr = (char *)malloc(24);
i = 1024*1024;

Manual allocation in a stringbuffer object

For a small to-be-embedded application, I wrote a few functions + struct that work as String Buffer (similar to std::stringstream in C++).
While the code as such works fine, there are a few not-so-minor problems:
I have never before written functions in C that manually allocate and use growing memory, so I'm afraid there are still some quirks that need to be addressed
It seems the code allocates far more memory than it actually needs, which is VERY BAD
Due to warnings reported by valgrind I have switched from malloc to calloc in one place in the code, which successfully removed the warning, but I'm not entirely sure if I'm actually using it correctly
Example of what I mean that it allocates more than it really needs (using a 56k file):
==23668== HEAP SUMMARY:
==23668== in use at exit: 0 bytes in 0 blocks
==23668== total heap usage: 49,998 allocs, 49,998 frees, 1,249,875,362 bytes allocated
... It just doesn't look right ...
The code in question is here (too large to copy it in a <code> field on SO): http://codepad.org/LQzphUzd
Help is needed, and I'm grateful for any advice!
The way you are growing your buffer is rather inefficient. For each little piece of string, you realloc() memory, which can mean new memory is allocated and the contents of the "old" memory are copied over. That is slow and fragments your heap.
Better is to grow in fixed amounts, or in fixed percentages, i.e. make the new size 1.5 or 2 times the size of the old size. That also wastes some memory, but will keep the heap more usable and not so many copies are made.
This means you'll have to keep track of two values: capacity (number of bytes allocated) and length (actual length of the string). But that should not be too hard.
I would introduce a function "FstrBuf_Grow" which takes care of all of that. You just call it with the amount of memory you want to add, and FstrBuf_Grow will take care that the capacity matches the requirements by reallocing when necessary and at least as much as necessary.
...
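void FstrBuf_Grow(FstringBuf *buf, size_t more)
{
    /* Assumes buf->capacity >= 2; with 0 or 1 the 1.5x step never makes progress. */
    while ((buf->length + more) > buf->capacity)
        buf->capacity = 3 * buf->capacity / 2;
    /* +1 leaves room for the terminating '\0'. Check the result in real code:
       on failure realloc returns NULL and the old block is still allocated. */
    buf->data = realloc(buf->data, buf->capacity + 1);
}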
That multiplies capacity by 1.5 until data is large enough. You can choose different strategies, depending on your needs.
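For completeness, here is a self-contained sketch of the capacity/length idea with an append function on top. The original FstringBuf definition is not shown, so the StrBuf type, its field names, the starting capacity of 16 and both function names are assumptions made for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer type; the original FstringBuf definition is not shown. */
typedef struct {
    char  *data;      /* NUL-terminated contents, or NULL when empty */
    size_t length;    /* bytes in use, excluding the terminator */
    size_t capacity;  /* bytes available for contents, excluding the terminator */
} StrBuf;

/* Make room for `more` extra bytes. Returns 0 on success, -1 on failure. */
int strbuf_reserve(StrBuf *buf, size_t more)
{
    size_t needed = buf->length + more;
    size_t capacity = buf->capacity ? buf->capacity : 16;
    char *data;

    if (buf->data != NULL && needed <= buf->capacity)
        return 0;                       /* enough room already: no realloc */
    while (capacity < needed)
        capacity += capacity / 2 + 1;   /* grow by roughly 1.5x */
    data = realloc(buf->data, capacity + 1);
    if (data == NULL)
        return -1;                      /* the old buffer is still valid */
    buf->data = data;
    buf->capacity = capacity;
    return 0;
}

/* Append a NUL-terminated string. Returns 0 on success, -1 on failure. */
int strbuf_append(StrBuf *buf, const char *str)
{
    size_t len = strlen(str);

    if (strbuf_reserve(buf, len) != 0)
        return -1;
    memcpy(buf->data + buf->length, str, len + 1);  /* copies the terminator too */
    buf->length += len;
    return 0;
}
```

Starting from StrBuf buf = {NULL, 0, 0}, appending "hello" and then ", world" triggers a single realloc; the second append fits in the existing capacity. The caller releases the buffer with free(buf.data).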
Move the strncat(ptr->data, str, len); call before the ptr->length = ((ptr->length) + len); line, and use strncpy(ptr->data+ptr->length.... instead. Also, the ptr = NULL; in Destroy is useless.
The code of the "library" seems to be correct, BUT be aware that you are continuously reallocating the buffer. Normally you should try to grow the buffer only rarely (for example, every time you need to grow the buffer, use max(2 * the current size, 4) as the new size) because growing the buffer is O(n). The big allocation total is probably because you first allocate a small buffer, then realloc it into a bigger buffer, then realloc it into an even bigger one, and so the heap grows.
It looks like you're re-allocating the buffer on every append. Shouldn't you grow it only when you want to append more than it can hold?
When reallocating you want to increase the size of the buffer using a strategy that gives you the best trade off between the number of allocations and the amount of memory allocated. Just doubling the size of the buffer every time you hit the limit might not be ideal for an embedded program.
Generally for embedded applications it is much better to allocate a circular FIFO buffer 1-3 times the maximum message size.

Resources