gcc memory alignment using malloc

gcc memory alignment using malloc - c

I've the following struct:
#define M 3
#pragma pack(push)
#pragma pack(1)
struct my_btree_node {
struct my_btree_node *pointers[M];
unsigned char *keys[M - 1];
int data[M - 1];
unsigned char number_of_keys;
};
#pragma pack(pop)
The sizeof(struct my_btree_node) function returns a value of 49 byte for this struct. Does allocating memory for this struct using malloc return a 64 byte block because on 64 bit systems pointers are 16-byte-aligned or will it indeed be 49 bytes?
Is there a way to align memory with a smaller power of two than 16 and is it possible to get the true size of the allocated memory inside the application?
I want to reduce the number of padding bytes in order to save memory. My applications allocates millions of those structs and I do not want to waste memory.

malloc(3) is defined to
The malloc() and calloc() functions return a pointer to the allocated
memory, which is suitably aligned for any built-in type. On error,
these functions return NULL. NULL may also be returned by a
successful call to malloc() with a size of zero, or by a successful
call to calloc() with nmemb or size equal to zero.
So a conforming implementation has to return a pointer aligned to the largest possible machine alignment (with GCC, it is the macro __BIGGEST_ALIGNMENT__)
If you want less, implement your own allocation routine. You could for example allocate a large array of char and do your allocation inside it. That would be painful, perhaps slower (processors dislike unaligned data, e.g. because of CPU cache constraints), and probably not worthwhile (current computers have several gigabytes of RAM, so a few millions of hundred-byte sized data chunks is not a big deal).
BTW, malloc is practically implemented in the C standard library (but -on Linux at least- the compiler knows about it, thanks to __attribute__-s in GNU glibc headers; so some internal optimizations inside GCC know and take care of calls to malloc).

malloc uses internal heap-structure. It is implementation-dependent yet one may expect that the memory is allocated by a whole number of (internal) blocks. So usually it's not possible to allocate exactly 49 bytes by a single malloc call. You can build some subsystem of your own on top of malloc to do this, yet I see no reason why you may want it.
P.S. To reduce memory wasting, you can pre-allocate an array consisting of, say, 100 structs, when you need just one more, and return &a[i] until all free indexes are wasted. As arrays are never padded, the memory wasting would be reduced in about 100 times.

Related

Calloc vs. malloc for pointer to struct [duplicate]

What is the difference between doing:
ptr = malloc(MAXELEMS * sizeof(char *));
And:
ptr = calloc(MAXELEMS, sizeof(char*));
When is it a good idea to use calloc over malloc or vice versa?

calloc() gives you a zero-initialized buffer, while malloc() leaves the memory uninitialized.
For large allocations, most calloc implementations under mainstream OSes will get known-zeroed pages from the OS (e.g. via POSIX mmap(MAP_ANONYMOUS) or Windows VirtualAlloc) so it doesn't need to write them in user-space. This is how normal malloc gets more pages from the OS as well; calloc just takes advantage of the OS's guarantee.
This means calloc memory can still be "clean" and lazily-allocated, and copy-on-write mapped to a system-wide shared physical page of zeros. (Assuming a system with virtual memory.) The effects are visible with performance experiments on Linux, for example.
Some compilers even can optimize malloc + memset(0) into calloc for you, but it's best to just use calloc in the source if you want zeroed memory. (Or if you were trying to pre-fault it to avoid page faults later, that optimization will defeat your attempt.)
If you aren't going to ever read memory before writing it, use malloc so it can (potentially) give you dirty memory from its internal free list instead of getting new pages from the OS. (Or instead of zeroing a block of memory on the free list for a small allocation).
Embedded implementations of calloc may leave it up to calloc itself to zero memory if there's no OS, or it's not a fancy multi-user OS that zeros pages to stop information leaks between processes.
On embedded Linux, malloc could mmap(MAP_UNINITIALIZED|MAP_ANONYMOUS), which is only enabled for some embedded kernels because it's insecure on a multi-user system.

A less known difference is that in operating systems with optimistic memory allocation, like Linux, the pointer returned by malloc isn't backed by real memory until the program actually touches it.
calloc does indeed touch the memory (it writes zeroes on it) and thus you'll be sure the OS is backing the allocation with actual RAM (or swap). This is also why it is slower than malloc (not only does it have to zero it, the OS must also find a suitable memory area by possibly swapping out other processes)
See for instance this SO question for further discussion about the behavior of malloc

One often-overlooked advantage of calloc is that (conformant implementations of) it will help protect you against integer overflow vulnerabilities. Compare:
size_t count = get_int32(file);
struct foo *bar = malloc(count * sizeof *bar);
vs.
size_t count = get_int32(file);
struct foo *bar = calloc(count, sizeof *bar);
The former could result in a tiny allocation and subsequent buffer overflows, if count is greater than SIZE_MAX/sizeof *bar. The latter will automatically fail in this case since an object that large cannot be created.
Of course you may have to be on the lookout for non-conformant implementations which simply ignore the possibility of overflow... If this is a concern on platforms you target, you'll have to do a manual test for overflow anyway.

The documentation makes the calloc look like malloc, which just does zero-initialize the memory; this is not the primary difference! The idea of calloc is to abstract copy-on-write semantics for memory allocation. When you allocate memory with calloc it all maps to same physical page which is initialized to zero. When any of the pages of the allocated memory is written into a physical page is allocated. This is often used to make HUGE hash tables, for example since the parts of hash which are empty aren't backed by any extra memory (pages); they happily point to the single zero-initialized page, which can be even shared between processes.
Any write to virtual address is mapped to a page, if that page is the zero-page, another physical page is allocated, the zero page is copied there and the control flow is returned to the client process. This works same way memory mapped files, virtual memory, etc. work.. it uses paging.
Here is one optimization story about the topic:
http://blogs.fau.de/hager/2007/05/08/benchmarking-fun-with-calloc-and-zero-pages/

There's no difference in the size of the memory block allocated. calloc just fills the memory block with physical all-zero-bits pattern. In practice it is often assumed that the objects located in the memory block allocated with calloc have initilial value as if they were initialized with literal 0, i.e. integers should have value of 0, floating-point variables - value of 0.0, pointers - the appropriate null-pointer value, and so on.
From the pedantic point of view though, calloc (as well as memset(..., 0, ...)) is only guaranteed to properly initialize (with zeroes) objects of type unsigned char. Everything else is not guaranteed to be properly initialized and may contain so called trap representation, which causes undefined behavior. In other words, for any type other than unsigned char the aforementioned all-zero-bits patterm might represent an illegal value, trap representation.
Later, in one of the Technical Corrigenda to C99 standard, the behavior was defined for all integer types (which makes sense). I.e. formally, in the current C language you can initialize only integer types with calloc (and memset(..., 0, ...)). Using it to initialize anything else in general case leads to undefined behavior, from the point of view of C language.
In practice, calloc works, as we all know :), but whether you'd want to use it (considering the above) is up to you. I personally prefer to avoid it completely, use malloc instead and perform my own initialization.
Finally, another important detail is that calloc is required to calculate the final block size internally, by multiplying element size by number of elements. While doing that, calloc must watch for possible arithmetic overflow. It will result in unsuccessful allocation (null pointer) if the requested block size cannot be correctly calculated. Meanwhile, your malloc version makes no attempt to watch for overflow. It will allocate some "unpredictable" amount of memory in case overflow happens.

from an article Benchmarking fun with calloc() and zero pages on Georg Hager's Blog
When allocating memory using calloc(), the amount of memory requested is not allocated right away. Instead, all pages that belong to the memory block are connected to a single page containing all zeroes by some MMU magic (links below). If such pages are only read (which was true for arrays b, c and d in the original version of the benchmark), the data is provided from the single zero page, which – of course – fits into cache. So much for memory-bound loop kernels. If a page gets written to (no matter how), a fault occurs, the “real” page is mapped and the zero page is copied to memory. This is called copy-on-write, a well-known optimization approach (that I even have taught multiple times in my C++ lectures). After that, the zero-read trick does not work any more for that page and this is why performance was so much lower after inserting the – supposedly redundant – init loop.

Number of blocks:
malloc() assigns single block of requested memory,
calloc() assigns multiple blocks of the requested memory
Initialization:
malloc() - doesn't clear and initialize the allocated memory.
calloc() - initializes the allocated memory by zero.
Speed:
malloc() is fast.
calloc() is slower than malloc().
Arguments & Syntax:
malloc() takes 1 argument:
bytes
The number of bytes to be allocated
calloc() takes 2 arguments:
length
the number of blocks of memory to be allocated
bytes
the number of bytes to be allocated at each block of memory
void *malloc(size_t bytes);
void *calloc(size_t length, size_t bytes);
Manner of memory Allocation:
The malloc function assigns memory of the desired 'size' from the available heap.
The calloc function assigns memory that is the size of what’s equal to ‘num *size’.
Meaning on name:
The name malloc means "memory allocation".
The name calloc means "contiguous allocation".

calloc is generally malloc+memset to 0
It is generally slightly better to use malloc+memset explicitly, especially when you are doing something like:
ptr=malloc(sizeof(Item));
memset(ptr, 0, sizeof(Item));
That is better because sizeof(Item) is know to the compiler at compile time and the compiler will in most cases replace it with the best possible instructions to zero memory. On the other hand if memset is happening in calloc, the parameter size of the allocation is not compiled in in the calloc code and real memset is often called, which would typically contain code to do byte-by-byte fill up until long boundary, than cycle to fill up memory in sizeof(long) chunks and finally byte-by-byte fill up of the remaining space. Even if the allocator is smart enough to call some aligned_memset it will still be a generic loop.
One notable exception would be when you are doing malloc/calloc of a very large chunk of memory (some power_of_two kilobytes) in which case allocation may be done directly from kernel. As OS kernels will typically zero out all memory they give away for security reasons, smart enough calloc might just return it withoud additional zeroing. Again - if you are just allocating something you know is small, you may be better off with malloc+memset performance-wise.

There are two differences.
First, is in the number of arguments. malloc() takes a single argument (memory required in bytes), while calloc() needs two arguments.
Secondly, malloc() does not initialize the memory allocated, while calloc() initializes the allocated memory to ZERO.
calloc() allocates a memory area, the length will be the product of its parameters. calloc fills the memory with ZERO's and returns a pointer to first byte. If it fails to locate enough space it returns a NULL pointer.
Syntax: ptr_var = calloc(no_of_blocks, size_of_each_block);
i.e. ptr_var = calloc(n, s);
malloc() allocates a single block of memory of REQUSTED SIZE and returns a pointer to first byte. If it fails to locate requsted amount of memory it returns a null pointer.
Syntax: ptr_var = malloc(Size_in_bytes);
The malloc() function take one argument, which is the number of bytes to allocate, while the calloc() function takes two arguments, one being the number of elements, and the other being the number of bytes to allocate for each of those elements. Also, calloc() initializes the allocated space to zeroes, while malloc() does not.

Difference 1:
malloc() usually allocates the memory block and it is initialized memory segment.
calloc() allocates the memory block and initialize all the memory block to 0.
Difference 2:
If you consider malloc() syntax, it will take only 1 argument. Consider the following example below:
data_type ptr = (cast_type *)malloc( sizeof(data_type)*no_of_blocks );
Ex: If you want to allocate 10 block of memory for int type,
int *ptr = (int *) malloc(sizeof(int) * 10 );
If you consider calloc() syntax, it will take 2 arguments. Consider the following example below:
data_type ptr = (cast_type *)calloc(no_of_blocks, (sizeof(data_type)));
Ex: if you want to allocate 10 blocks of memory for int type and Initialize all that to ZERO,
int *ptr = (int *) calloc(10, (sizeof(int)));
Similarity:
Both malloc() and calloc() will return void* by default if they are not type casted .!

The calloc() function that is declared in the <stdlib.h> header offers a couple of advantages over the malloc() function.
It allocates memory as a number of elements of a given size, and
It initializes the memory that is allocated so that all bits are
zero.

malloc() and calloc() are functions from the C standard library that allow dynamic memory allocation, meaning that they both allow memory allocation during runtime.
Their prototypes are as follows:
void *malloc( size_t n);
void *calloc( size_t n, size_t t)
There are mainly two differences between the two:
Behavior: malloc() allocates a memory block, without initializing it, and reading the contents from this block will result in garbage values. calloc(), on the other hand, allocates a memory block and initializes it to zeros, and obviously reading the content of this block will result in zeros.
Syntax: malloc() takes 1 argument (the size to be allocated), and calloc() takes two arguments (number of blocks to be allocated and size of each block).
The return value from both is a pointer to the allocated block of memory, if successful. Otherwise, NULL will be returned indicating the memory allocation failure.
Example:
int *arr;
// allocate memory for 10 integers with garbage values
arr = (int *)malloc(10 * sizeof(int));
// allocate memory for 10 integers and sets all of them to 0
arr = (int *)calloc(10, sizeof(int));
The same functionality as calloc() can be achieved using malloc() and memset():
// allocate memory for 10 integers with garbage values
arr= (int *)malloc(10 * sizeof(int));
// set all of them to 0
memset(arr, 0, 10 * sizeof(int));
Note that malloc() is preferably used over calloc() since it's faster. If zero-initializing the values is wanted, use calloc() instead.

A difference not yet mentioned: size limit
void *malloc(size_t size) can only allocate up to SIZE_MAX.
void *calloc(size_t nmemb, size_t size); can allocate up about SIZE_MAX*SIZE_MAX.
This ability is not often used in many platforms with linear addressing. Such systems limit calloc() with nmemb * size <= SIZE_MAX.
Consider a type of 512 bytes called disk_sector and code wants to use lots of sectors. Here, code can only use up to SIZE_MAX/sizeof disk_sector sectors.
size_t count = SIZE_MAX/sizeof disk_sector;
disk_sector *p = malloc(count * sizeof *p);
Consider the following which allows an even larger allocation.
size_t count = something_in_the_range(SIZE_MAX/sizeof disk_sector + 1, SIZE_MAX)
disk_sector *p = calloc(count, sizeof *p);
Now if such a system can supply such a large allocation is another matter. Most today will not. Yet it has occurred for many years when SIZE_MAX was 65535. Given Moore's law, suspect this will be occurring about 2030 with certain memory models with SIZE_MAX == 4294967295 and memory pools in the 100 of GBytes.

Both malloc and calloc allocate memory, but calloc initialises all the bits to zero whereas malloc doesn't.
Calloc could be said to be equivalent to malloc + memset with 0 (where memset sets the specified bits of memory to zero).
So if initialization to zero is not necessary, then using malloc could be faster.

What is a memory block?

The C books keep talking about memory blocks, but I never understood what exactly they are. Is a memory block an array? A large memory cell? For example:
malloc(2*sizeof(int)); /*This allocates a block*/

A "memory block" is a contiguous chunk of memory.
An array in C is also a contiguous chunk of memory. However, using the less-generic term "array" has implications that the generic term "memory block" does not (you can, after all, have data of multiple types within a block, whereas the term "array" implies uniformity of use).
Using malloc gives you memory dynamically -- the alternative is to allocate memory from the stack, as with int my_ints[2], but that doesn't give you control over the size of the block after your function is already running, or let you allocate more blocks after your function has started.
Also, stack size is relatively limited.

A memory block is a group of one or more contiguous chars ("bytes" - see note) of (real or virtual) memory.
The malloc(size_t size) function allocates a memory block. The size is how large (in chars) the block should be. Note that sizeof(int) is the number of chars that an int consumes, so malloc(2*sizeof(int)); allocates a memory block that is large enough to store 2 ints.
Because C is designed for many very different architectures; there's no guarantee that there's any relationship between things in different memory blocks. For example, you can't allocate 2 memory blocks and then calculate the difference between them (at least, not without relying on implementation defined behaviour).
For arrays, there must be a relationship between elements in the array; and for structures there must be a relationship between members of the structure. For this reason, each array and each structure must be contained within a memory block.
Note: Historically, char was the smallest unit C handled, and "byte" was defined as 1 char (e.g. sizeof(char) == 1), even if CHAR_BIT happens to be something bizarre, like 9 or 16 or 32. Outside of C, "byte" has become synonymous with an 8-bit quantity, and is defined as an 8-bit unit in International standards (IEC 80000-13, IEEE 1541). The result of this is that the definition that C uses for "byte" is not an actual (International standard) byte, and it would be wrong to say that malloc() allocates (International standard) bytes, but correct to say that malloc() allocates chars or "non-standard things that were unfortunately called bytes by the C standard once upon a time".

Why are address are not consecutive when allocating single bytes?

I am dynamically allocating memory as follows:
char* heap_start1 = (char*) malloc(1);
char* heap_start2 = (char*) malloc(1);
When I do printf as follows surprisingly the addresses are not consecutives.
printf("%p, %p \n",heap_start1,heap_start2);
Result:
0x8246008, 0x8246018
As you can see there is a 15 bytes of extra memory that are left defragmented. It's definitely not because of word alignment. Any idea behind this peculiar alignment?
Thanks in advance!
I am using gcc in linux if that matters.

glibc's malloc, for small memory allocations less than 16 bytes, simply allocates the memory as 16 bytes. This is to prevent external fragmentation upon the freeing of this memory, where blocks of free memory are too small to be used in the general case to fulfill new malloc operations.
A block allocated by malloc must also be large enough to store the data required to track it in the data structure which stores free blocks.
This behaviour, while increasing internal fragmentation, decreases overall fragmentation throughout the system.
Source:
http://repo.or.cz/w/glibc.git/blob/HEAD:/malloc/malloc.c
(Read line 108 in particular)
/*
...
Minimum allocated size: 4-byte ptrs: 16 bytes (including 4 overhead)
...
*/
Furthermore, all addresses returned by the malloc call in glibc are aligned to: 2 * sizeof(size_t) bytes. Which is 64 bits for 32-bit systems (such as yours) and 128 bits for 64-bit systems.

At least three possible reasons:
malloc needs to produce memory that is suitably-aligned for all primitive types. Data for SSE instructions needs to be 128-bit aligned. (There may also be other 128-bit primitive types that your platform supports that don't occur to me at the moment.)
A typical implementation of malloc involves "over-allocation" in order to store bookkeeping information for a speedy free. Not sure if GCC on Linux does this.
It may be allocating guard bytes in order to allow detection of buffer overflows and so on.

malloc guarantees that returned memory is properly aligned for any basic type. Moreover, memory block could be padded with some guard bytes to check for memory corruption, it depends on settings.

If you want to allocate consecutive addresses you should allocate them on the same malloc
char *heap_start1, *heap_start2;
heap_start1 = (char*) malloc(2 * sizeof(char));
heap_start2 = heap_start1 + 1;

Aligned memory management?

I have a few related questions about managing aligned memory blocks. Cross-platform answers would be ideal. However, as I'm pretty sure a cross-platform solution does not exist, I'm mainly interested in Windows and Linux and to a (much) lesser extent Mac OS and FreeBSD.
What's the best way of getting a chunk of memory aligned on 16-byte boundaries? (I'm aware of the trivial method of using malloc(), allocating a little extra space and then bumping the pointer up to a properly aligned value. I'm hoping for something a little less kludge-y, though. Also, see below for additional issues.)
If I use plain old malloc(), allocate extra space, and then move the pointer up to where it would be correctly aligned, is it necessary to keep the pointer to the beginning of the block around for freeing? (Calling free() on pointers to the middle of the block seems to work in practice on Windows, but I'm wondering what the standard says and, even if the standard says you can't, whether it works in practice on all major OS's. I don't care about obscure DS9K-like OS's.)
This is the hard/interesting part. What's the best way to reallocate a memory block while preserving alignment? Ideally this would be something more intelligent than calling malloc(), copying, and then calling free() on the old block. I'd like to do it in place where possible.

If your implementation has a standard data type that needs 16-byte alignment (long long for example), malloc already guarantees that your returned blocks will be aligned correctly. Section 7.20.3 of C99 states The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object.
You have to pass back the exact same address into free as you were given by malloc. No exceptions. So yes, you need to keep the original copy.
See (1) above if you already have a 16-byte-alignment-required type.
Beyond that, you may well find that your malloc implementation gives you 16-byte-aligned addresses anyway for efficiency although it's not guaranteed by the standard. If you require it, you can always implement your own allocator.
Myself, I'd implement a malloc16 layer on top of malloc that would use the following structure:
some padding for alignment (0-15 bytes)
size of padding (1 byte)
16-byte-aligned area
Then have your malloc16() function call malloc to get a block 16 bytes larger than requested, figure out where the aligned area should be, put the padding length just before that and return the address of the aligned area.
For free16, you would simply look at the byte before the address given to get the padding length, work out the actual address of the malloc'ed block from that, and pass that to free.
This is untested but should be a good start:
void *malloc16 (size_t s) {
unsigned char *p;
unsigned char *porig = malloc (s + 0x10); // allocate extra
if (porig == NULL) return NULL; // catch out of memory
p = (porig + 16) & (~0xf); // insert padding
*(p-1) = p - porig; // store padding size
return p;
}
void free16(void *p) {
unsigned char *porig = p; // work out original
porig = porig - *(porig-1); // by subtracting padding
free (porig); // then free that
}
The magic line in the malloc16 is p = (porig + 16) & (~0xf); which adds 16 to the address then sets the lower 4 bits to 0, in effect bringing it back to the next lowest alignment point (the +16 guarantees it is past the actual start of the maloc'ed block).
Now, I don't claim that the code above is anything but kludgey. You would have to test it in the platforms of interest to see if it's workable. Its main advantage is that it abstracts away the ugly bit so that you never have to worry about it.

I'm not aware of any way of requesting malloc return memory with stricter alignment than usual. As for "usual" on Linux, from man posix_memalign (which you can use instead of malloc() to get more strictly aligned memory if you like):
GNU libc malloc() always returns 8-byte aligned memory addresses, so
these routines are only needed if you require larger alignment values.
You must free() memory using the same pointer returned by malloc(), posix_memalign() or realloc().
Use realloc() as usual, including sufficient extra space so if a new address is returned that isn't already aligned you can memmove() it slightly to align it. Nasty, but best I can think of.

You could write your own slab allocator to handle your objects, it could allocate pages at a time using mmap, maintain a cache of recently-freed addresses for fast allocations, handle all your alignment for you, and give you the flexibility to move/grow objects exactly as you need. malloc is quite good for general-purpose allocations, but if you know your data layout and allocation needs, you can design a system to hit those requirements exactly.

The trickiest requirement is obviously the third one, since any malloc() / realloc() based solution is hostage to realloc() moving the block to a different alignment.
On Linux, you could use anonymous mappings created with mmap() instead of malloc(). Addresses returned by mmap() are by necessity page-aligned, and the mapping can be extended with mremap().

Starting a C11, you have void *aligned_alloc( size_t alignment, size_t size ); primitives, where the parameters are:
alignment - specifies the alignment. Must be a valid alignment supported by the implementation.
size - number of bytes to allocate. An integral multiple of alignment
Return value
On success, returns the pointer to the beginning of newly allocated memory. The returned pointer must be deallocated with free() or realloc().
On failure, returns a null pointer.
Example:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int *p1 = malloc(10*sizeof *p1);
printf("default-aligned addr: %p\n", (void*)p1);
free(p1);
int *p2 = aligned_alloc(1024, 1024*sizeof *p2);
printf("1024-byte aligned addr: %p\n", (void*)p2);
free(p2);
}
Possible output:
default-aligned addr: 0x1e40c20
1024-byte aligned addr: 0x1e41000

Experiment on your system. On many systems (especially 64-bit ones), you get 16-byte aligned memory out of malloc() anyway. If not, you will have to allocate the extra space and move the pointer (by at most 8 bytes on almost every machine).
For example, 64-bit Linux on x86/64 has a 16-byte long double, which is 16-byte aligned - so all memory allocations are 16-byte aligned anyway. However, with a 32-bit program, sizeof(long double) is 8 and memory allocations are only 8-byte aligned.
Yes - you can only free() the pointer returned by malloc(). Anything else is a recipe for disaster.
If your system does 16-byte aligned allocations, there isn't a problem. If it doesn't, then you'll need your own reallocator, which does a 16-byte aligned allocation and then copies the data - or that uses the system realloc() and adjusts the realigned data when necessary.
Double check the manual page for your malloc(); there may be options and mechanisms to tweak it so it behaves as you want.
On MacOS X, there is posix_memalign() and valloc() (which gives a page-aligned allocation), and there is a whole series of 'zoned malloc' functions identified by man malloc_zoned_malloc and the header is <malloc/malloc.h>.

You might be able to jimmy (in Microsoft VC++ and maybe other compilers):
#pragma pack(16)
such that malloc( ) is forced to return a 16-byte-aligned pointer. Something along the lines of:
ptr_16byte = malloc( 10 * sizeof( my_16byte_aligned_struct ));
If it worked at all for malloc( ), I'd think it would work for realloc( ) just as well.
Just a thought.
-- pete

Difference between malloc and calloc?

What is the difference between doing:
ptr = malloc(MAXELEMS * sizeof(char *));
And:
ptr = calloc(MAXELEMS, sizeof(char*));
When is it a good idea to use calloc over malloc or vice versa?

A less known difference is that in operating systems with optimistic memory allocation, like Linux, the pointer returned by malloc isn't backed by real memory until the program actually touches it.
calloc does indeed touch the memory (it writes zeroes on it) and thus you'll be sure the OS is backing the allocation with actual RAM (or swap). This is also why it is slower than malloc (not only does it have to zero it, the OS must also find a suitable memory area by possibly swapping out other processes)
See for instance this SO question for further discussion about the behavior of malloc

One often-overlooked advantage of calloc is that (conformant implementations of) it will help protect you against integer overflow vulnerabilities. Compare:
size_t count = get_int32(file);
struct foo *bar = malloc(count * sizeof *bar);
vs.
size_t count = get_int32(file);
struct foo *bar = calloc(count, sizeof *bar);
The former could result in a tiny allocation and subsequent buffer overflows, if count is greater than SIZE_MAX/sizeof *bar. The latter will automatically fail in this case since an object that large cannot be created.
Of course you may have to be on the lookout for non-conformant implementations which simply ignore the possibility of overflow... If this is a concern on platforms you target, you'll have to do a manual test for overflow anyway.

The documentation makes the calloc look like malloc, which just does zero-initialize the memory; this is not the primary difference! The idea of calloc is to abstract copy-on-write semantics for memory allocation. When you allocate memory with calloc it all maps to same physical page which is initialized to zero. When any of the pages of the allocated memory is written into a physical page is allocated. This is often used to make HUGE hash tables, for example since the parts of hash which are empty aren't backed by any extra memory (pages); they happily point to the single zero-initialized page, which can be even shared between processes.
Any write to virtual address is mapped to a page, if that page is the zero-page, another physical page is allocated, the zero page is copied there and the control flow is returned to the client process. This works same way memory mapped files, virtual memory, etc. work.. it uses paging.
Here is one optimization story about the topic:
http://blogs.fau.de/hager/2007/05/08/benchmarking-fun-with-calloc-and-zero-pages/

from an article Benchmarking fun with calloc() and zero pages on Georg Hager's Blog
When allocating memory using calloc(), the amount of memory requested is not allocated right away. Instead, all pages that belong to the memory block are connected to a single page containing all zeroes by some MMU magic (links below). If such pages are only read (which was true for arrays b, c and d in the original version of the benchmark), the data is provided from the single zero page, which – of course – fits into cache. So much for memory-bound loop kernels. If a page gets written to (no matter how), a fault occurs, the “real” page is mapped and the zero page is copied to memory. This is called copy-on-write, a well-known optimization approach (that I even have taught multiple times in my C++ lectures). After that, the zero-read trick does not work any more for that page and this is why performance was so much lower after inserting the – supposedly redundant – init loop.

Number of blocks:
malloc() assigns single block of requested memory,
calloc() assigns multiple blocks of the requested memory
Initialization:
malloc() - doesn't clear and initialize the allocated memory.
calloc() - initializes the allocated memory by zero.
Speed:
malloc() is fast.
calloc() is slower than malloc().
Arguments & Syntax:
malloc() takes 1 argument:
bytes
The number of bytes to be allocated
calloc() takes 2 arguments:
length
the number of blocks of memory to be allocated
bytes
the number of bytes to be allocated at each block of memory
void *malloc(size_t bytes);
void *calloc(size_t length, size_t bytes);
Manner of memory Allocation:
The malloc function assigns memory of the desired 'size' from the available heap.
The calloc function assigns memory that is the size of what’s equal to ‘num *size’.
Meaning on name:
The name malloc means "memory allocation".
The name calloc means "contiguous allocation".

calloc is generally malloc+memset to 0
It is generally slightly better to use malloc+memset explicitly, especially when you are doing something like:
ptr=malloc(sizeof(Item));
memset(ptr, 0, sizeof(Item));
That is better because sizeof(Item) is know to the compiler at compile time and the compiler will in most cases replace it with the best possible instructions to zero memory. On the other hand if memset is happening in calloc, the parameter size of the allocation is not compiled in in the calloc code and real memset is often called, which would typically contain code to do byte-by-byte fill up until long boundary, than cycle to fill up memory in sizeof(long) chunks and finally byte-by-byte fill up of the remaining space. Even if the allocator is smart enough to call some aligned_memset it will still be a generic loop.
One notable exception would be when you are doing malloc/calloc of a very large chunk of memory (some power_of_two kilobytes) in which case allocation may be done directly from kernel. As OS kernels will typically zero out all memory they give away for security reasons, smart enough calloc might just return it withoud additional zeroing. Again - if you are just allocating something you know is small, you may be better off with malloc+memset performance-wise.

There are two differences.
First, is in the number of arguments. malloc() takes a single argument (memory required in bytes), while calloc() needs two arguments.
Secondly, malloc() does not initialize the memory allocated, while calloc() initializes the allocated memory to ZERO.
calloc() allocates a memory area, the length will be the product of its parameters. calloc fills the memory with ZERO's and returns a pointer to first byte. If it fails to locate enough space it returns a NULL pointer.
Syntax: ptr_var = calloc(no_of_blocks, size_of_each_block);
i.e. ptr_var = calloc(n, s);
malloc() allocates a single block of memory of REQUSTED SIZE and returns a pointer to first byte. If it fails to locate requsted amount of memory it returns a null pointer.
Syntax: ptr_var = malloc(Size_in_bytes);
The malloc() function take one argument, which is the number of bytes to allocate, while the calloc() function takes two arguments, one being the number of elements, and the other being the number of bytes to allocate for each of those elements. Also, calloc() initializes the allocated space to zeroes, while malloc() does not.

Difference 1:
malloc() usually allocates the memory block and it is initialized memory segment.
calloc() allocates the memory block and initialize all the memory block to 0.
Difference 2:
If you consider malloc() syntax, it will take only 1 argument. Consider the following example below:
data_type ptr = (cast_type *)malloc( sizeof(data_type)*no_of_blocks );
Ex: If you want to allocate 10 block of memory for int type,
int *ptr = (int *) malloc(sizeof(int) * 10 );
If you consider calloc() syntax, it will take 2 arguments. Consider the following example below:
data_type ptr = (cast_type *)calloc(no_of_blocks, (sizeof(data_type)));
Ex: if you want to allocate 10 blocks of memory for int type and Initialize all that to ZERO,
int *ptr = (int *) calloc(10, (sizeof(int)));
Similarity:
Both malloc() and calloc() will return void* by default if they are not type casted .!

The calloc() function that is declared in the <stdlib.h> header offers a couple of advantages over the malloc() function.
It allocates memory as a number of elements of a given size, and
It initializes the memory that is allocated so that all bits are
zero.

malloc() and calloc() are functions from the C standard library that allow dynamic memory allocation, meaning that they both allow memory allocation during runtime.
Their prototypes are as follows:
void *malloc( size_t n);
void *calloc( size_t n, size_t t)
There are mainly two differences between the two:
Behavior: malloc() allocates a memory block, without initializing it, and reading the contents from this block will result in garbage values. calloc(), on the other hand, allocates a memory block and initializes it to zeros, and obviously reading the content of this block will result in zeros.
Syntax: malloc() takes 1 argument (the size to be allocated), and calloc() takes two arguments (number of blocks to be allocated and size of each block).
The return value from both is a pointer to the allocated block of memory, if successful. Otherwise, NULL will be returned indicating the memory allocation failure.
Example:
int *arr;
// allocate memory for 10 integers with garbage values
arr = (int *)malloc(10 * sizeof(int));
// allocate memory for 10 integers and sets all of them to 0
arr = (int *)calloc(10, sizeof(int));
The same functionality as calloc() can be achieved using malloc() and memset():
// allocate memory for 10 integers with garbage values
arr= (int *)malloc(10 * sizeof(int));
// set all of them to 0
memset(arr, 0, 10 * sizeof(int));
Note that malloc() is preferably used over calloc() since it's faster. If zero-initializing the values is wanted, use calloc() instead.

A difference not yet mentioned: size limit
void *malloc(size_t size) can only allocate up to SIZE_MAX.
void *calloc(size_t nmemb, size_t size); can allocate up about SIZE_MAX*SIZE_MAX.
This ability is not often used in many platforms with linear addressing. Such systems limit calloc() with nmemb * size <= SIZE_MAX.
Consider a type of 512 bytes called disk_sector and code wants to use lots of sectors. Here, code can only use up to SIZE_MAX/sizeof disk_sector sectors.
size_t count = SIZE_MAX/sizeof disk_sector;
disk_sector *p = malloc(count * sizeof *p);
Consider the following which allows an even larger allocation.
size_t count = something_in_the_range(SIZE_MAX/sizeof disk_sector + 1, SIZE_MAX)
disk_sector *p = calloc(count, sizeof *p);
Now if such a system can supply such a large allocation is another matter. Most today will not. Yet it has occurred for many years when SIZE_MAX was 65535. Given Moore's law, suspect this will be occurring about 2030 with certain memory models with SIZE_MAX == 4294967295 and memory pools in the 100 of GBytes.

Both malloc and calloc allocate memory, but calloc initialises all the bits to zero whereas malloc doesn't.
Calloc could be said to be equivalent to malloc + memset with 0 (where memset sets the specified bits of memory to zero).
So if initialization to zero is not necessary, then using malloc could be faster.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight