Why does calloc require two parameters and malloc just one? - c

It's very bothersome for me to write calloc(1, sizeof(MyStruct)) all the time. I don't want to use an idea like wrapping this method and etc. I mean I want to know what two parameters gives me? If it gives something, why doesn't mallochave two parameters too?
By the way, I searched for an answer to this question but I didn't find a really good answer. Those answers was that calloc can allocate larger blocks than malloc can and etc.
I saw another answer that calloc allocates an array. With malloc I can multiply and I'll get an array and I can use it without 1, at the start.

Historical reasons.
At the time of when calloc was introduced, the malloc function didn't exist and the calloc function would provide the correct alignment for one element object.
When malloc was introduced afterwards, it was decided the memory returned would be properly aligned for any use (which costs more memory) and so only one parameter was necessary. The API for calloc was not changed but calloc now also returns memory properly aligned for any use.
EDIT:
See the discussion in the comments and the interesting input from #JimBalter.
My first statement regarding the introduction of malloc and calloc may be totally wrong.
Also the real reasons could also be well unrelated to alignment. C history has been changed a lot by compiler implementers. malloc and calloc could come from different groups / compilers implementers and this would explain the API difference. And I actually favor this explanation as the real reason.

The only reason I could come up with is that
int *foo = calloc(42, sizeof *foo);
is one character shorter than
int *foo = malloc(42 * sizeof *foo);
The real reason is apparently lost to the millennia centuries decades of C history and needs a programming language archaeologist to unearth, but might be related to the following fact:
In contrast to malloc() - which needs to return a memory block aligned in accordance to the full block size - when using calloc() as intended, the memory block would only need to be aligned in accordance to the size passed as second argument. However, the C standard forbids this optimization in conforming implementations.

it is just by design.
you could write your own calloc
void *mycalloc(size_t num, size_t size)
{
void *block = malloc(num * size);
if(block != NULL)
memset(block, 0, num * size);
return block;
}

You shouldn't allocate objects with calloc (or malloc or anything like that). Even though calloc zero-initializes it, the object is still hasn't been constructed as far as C++ is concerned. Use constructors for that:
class MyClass
{
private:
short m_a;
int m_b;
long m_c;
float m_d;
public:
MyClass() : m_a(0), m_b(0), m_c(0), m_d(0.0) {}
};
And then instantiate it with new (or on the stack if you can):
MyClass* mc = new MyClass();

Related

Can realloc fail when shrinking buffer? [duplicate]

If do the next:
int* array = malloc(10 * sizeof(int));
and them I use realloc:
array = realloc(array, 5 * sizeof(int));
On the second line (and only it), can it return NULL?
Yes, it can. There are no implementation guarantees on realloc(), and it can return a different pointer even when shrinking.
For example, if a particular implementation uses different pools for different object sizes, realloc() may actually allocate a new block in the pool for smaller objects and free the block in the pool for larger objects. Thus, if the pool for smaller objects is full, it will fail and return NULL.
Or it may simply decide it's better to move the block
I just used the following program to get size of actually allocated memory with glibc:
#include <stdlib.h>
#include <stdio.h>
int main()
{
int n;
for (n = 0; n <= 10; ++n)
{
void* array = malloc(n * sizeof(int));
size_t* a2 = (size_t*) array;
printf("%d -> %zu\n", n, a2[-1]);
}
}
and for n <= 6, it allocates 32 bytes, and for 7-10 it is 48.
So, if it shrank int[10] to int[5], the allocated size would shrink from 48 to 32, effectively giving 16 free bytes. Since (as it just has been noted) it won't allocate anything less than 32 bytes, those 16 bytes are lost.
If it moved the block elsewhere, the whole 48 bytes will be freed, and something could actually be put in there. Of course, that's just a science-fiction story and not a real implementation ;).
The most relevant quote from the C99 standard (7.20.3.4 The realloc function):
Returns
4 The realloc function returns a pointer to the new object (which may have the same value as a pointer to the old object), or a null pointer if the new object could not be allocated.
'May' is the key-word here. It doesn't mention any specific circumstances when that can happen, so you can't rely on any of them, even if they sound obvious at a first glance.
By the way, I think you could consider realloc() somewhat deprecated. If you'd take a look at C++, the newer memory allocation interfaces (new / delete and allocators) don't even support such a thing. They always expect you to allocate a new block. But that's just a loose comment.
The other answers have already nailed the question, but assuming you know the realloc call is a "trimming", you can wrap it with:
void *safe_trim(void *p, size_t n) {
void *p2 = realloc(p, n);
return p2 ? p2 : p;
}
and the return value will always point to an object of size n.
In any case, since the implementation of realloc knows the size of the object and can therefore determine that it's "trimming", it would be pathologically bad from a quality-of-implementation standpoint not to perform the above logic internally. But since realloc is not required to do this, you should do it yourself, either with the above wrapper or with analogous inline logic when you call realloc.
The language (and library) specification makes no such guarantee, just like it does not guarantee that a "trimming" realloc will preserve the pointer value.
An implementation might decide to implement realloc in the most "primitive" way: by doing an unconditional malloc for a new memory block, copying the data and free-ing the old block. Obviously, such implementation can fail in low-memory situations.
Don't count on it. The standard makes no such provision; it merely states "or a null pointer if the new object could not be allocated".
You'd be hard-pressed to find such an implementation, but according to the standard it would still be compliant.
I suspect there may be a theoretical possibility for failure in the scenario you describe.
Depending on the heap implementation, there may be no such a thing as trimming an existing allocation block. Instead a smaller block is allocated first, then the data is copied from the old one, and then it's freed.
For instance this may be the case with bucket-heap strategy (used by some popular heaps, such as tcmalloc).
A bit late, but there is at least one popular implementation which realloc() with a smaler size can fail: TCMalloc. (At least as far as i understand the code)
If you read the file tcmalloc.cc, in the function do_realloc_with_callback(), you will see that if you shrink enough (50% of alloced memory, otherwise it will be ignored), TCMalloc will alloc the new memory first (and possible fail) and then copy it and remove the old memory.
I do not copy the source code, because i am not sure if the copyrights (of TCMalloc and Stackoverflow) will allow that, but here is a link to the source (revision as at May 17, 2019).
realloc will not fails in shrinking the existing memory, so it will not return NULL. It can return NULL only if fails during expansion.
But shrinking can fail in some architecture, where realloc can be implemented in a different manner like allocating a smaller size memory separately and freeing the old memory to avoid fragmentation. In that case shrinking memory can return NULL. But its very rare implementation.
But its better to be in a safer side, to keep NULL checks after shrinking the memory also.

C malloc function's size parameter

I am reading in a book that the malloc function in C takes the number of 'chunks' of memory you wish to allocate as a parameter and determines how many bytes the chunks are based on what you cast the value returned by malloc to. For example on my system an int is 4 bytes:
int *pointer;
pointer = (int *)malloc(10);
Would allocate 40 bytes because the compiler knows that ints are 4 bytes.
This confuses me for two reasons:
I was reading up, and the size parameter is actually the number of bytes you want to allocate and is not related to the sizes of any types.
Malloc is a function that returns an address. How does it adjust the size of the memory it allocated based on an external cast of the address it returned from void to a different type? Is it just some compiler magic I am supposed to accept?
I feel like the book is wrong. Any help or clarification is greatly appreciated!
Here is what the book said:
char *string;
string = (char *)malloc(80);
The 80 sets aside 80 chunks of storage. The chunk size is set by the typecast, (char *), which means that malloc() is finding storage for 80 characters of text.
Yes the book is wrong and you are correct please throw away that book.
Also, do let everyone know of the name of the book so we can permanently put it in our never to recommend black list.
Good Read:
What is the Best Practice for malloc?
When using malloc(), use the sizeof operator and apply it to the object being allocated, not the type of it.
Not a good idea:
int *pointer = malloc (10 * sizeof (int)); /* Wrong way */
Better method:
int *pointer = malloc (10 * sizeof *pointer);
Rationale: If you change the data type that pointer points to, you don't need to change the malloc() call as well. Maintenance win.
Also, this method is less prone to errors during code development. You can check that it's correct without having to look at the declaration in cases where the malloc() call occurs apart from the variable declaration.
Regarding your question about casting malloc(), note that there is no need for a cast on malloc() in C today. Also, if the data type should change in a future revision, any cast there would have to be changed as well or be another error source. Also, always make sure you have <stdlib.h> included. Often people put in the cast to get rid of a warning that is a result from not having the include file. The other reason is that it is required in C++, but you typically don't write C++ code that uses malloc().
The exact details about how malloc() works internally is not defined in the spec, in effect it is a black box with a well-defined interface. To see how it works in an implementation, you'd want to look at an open source implementation for examples. But malloc() can (and does) vary wildly on various platforms.

Can realloc fail (return NULL) when trimming?

If do the next:
int* array = malloc(10 * sizeof(int));
and them I use realloc:
array = realloc(array, 5 * sizeof(int));
On the second line (and only it), can it return NULL?
Yes, it can. There are no implementation guarantees on realloc(), and it can return a different pointer even when shrinking.
For example, if a particular implementation uses different pools for different object sizes, realloc() may actually allocate a new block in the pool for smaller objects and free the block in the pool for larger objects. Thus, if the pool for smaller objects is full, it will fail and return NULL.
Or it may simply decide it's better to move the block
I just used the following program to get size of actually allocated memory with glibc:
#include <stdlib.h>
#include <stdio.h>
int main()
{
int n;
for (n = 0; n <= 10; ++n)
{
void* array = malloc(n * sizeof(int));
size_t* a2 = (size_t*) array;
printf("%d -> %zu\n", n, a2[-1]);
}
}
and for n <= 6, it allocates 32 bytes, and for 7-10 it is 48.
So, if it shrank int[10] to int[5], the allocated size would shrink from 48 to 32, effectively giving 16 free bytes. Since (as it just has been noted) it won't allocate anything less than 32 bytes, those 16 bytes are lost.
If it moved the block elsewhere, the whole 48 bytes will be freed, and something could actually be put in there. Of course, that's just a science-fiction story and not a real implementation ;).
The most relevant quote from the C99 standard (7.20.3.4 The realloc function):
Returns
4 The realloc function returns a pointer to the new object (which may have the same value as a pointer to the old object), or a null pointer if the new object could not be allocated.
'May' is the key-word here. It doesn't mention any specific circumstances when that can happen, so you can't rely on any of them, even if they sound obvious at a first glance.
By the way, I think you could consider realloc() somewhat deprecated. If you'd take a look at C++, the newer memory allocation interfaces (new / delete and allocators) don't even support such a thing. They always expect you to allocate a new block. But that's just a loose comment.
The other answers have already nailed the question, but assuming you know the realloc call is a "trimming", you can wrap it with:
void *safe_trim(void *p, size_t n) {
void *p2 = realloc(p, n);
return p2 ? p2 : p;
}
and the return value will always point to an object of size n.
In any case, since the implementation of realloc knows the size of the object and can therefore determine that it's "trimming", it would be pathologically bad from a quality-of-implementation standpoint not to perform the above logic internally. But since realloc is not required to do this, you should do it yourself, either with the above wrapper or with analogous inline logic when you call realloc.
The language (and library) specification makes no such guarantee, just like it does not guarantee that a "trimming" realloc will preserve the pointer value.
An implementation might decide to implement realloc in the most "primitive" way: by doing an unconditional malloc for a new memory block, copying the data and free-ing the old block. Obviously, such implementation can fail in low-memory situations.
Don't count on it. The standard makes no such provision; it merely states "or a null pointer if the new object could not be allocated".
You'd be hard-pressed to find such an implementation, but according to the standard it would still be compliant.
I suspect there may be a theoretical possibility for failure in the scenario you describe.
Depending on the heap implementation, there may be no such a thing as trimming an existing allocation block. Instead a smaller block is allocated first, then the data is copied from the old one, and then it's freed.
For instance this may be the case with bucket-heap strategy (used by some popular heaps, such as tcmalloc).
A bit late, but there is at least one popular implementation which realloc() with a smaler size can fail: TCMalloc. (At least as far as i understand the code)
If you read the file tcmalloc.cc, in the function do_realloc_with_callback(), you will see that if you shrink enough (50% of alloced memory, otherwise it will be ignored), TCMalloc will alloc the new memory first (and possible fail) and then copy it and remove the old memory.
I do not copy the source code, because i am not sure if the copyrights (of TCMalloc and Stackoverflow) will allow that, but here is a link to the source (revision as at May 17, 2019).
realloc will not fails in shrinking the existing memory, so it will not return NULL. It can return NULL only if fails during expansion.
But shrinking can fail in some architecture, where realloc can be implemented in a different manner like allocating a smaller size memory separately and freeing the old memory to avoid fragmentation. In that case shrinking memory can return NULL. But its very rare implementation.
But its better to be in a safer side, to keep NULL checks after shrinking the memory also.

How can we find MEMORY SIZE from given memory pointer?

void func( int *p)
{
// Add code to print MEMORY SIZE which is pointed by pointer P.
}
int main()
{
int *p = (int *) malloc (10);
func(p);
}
How can we find MEMORY SIZE from memory pointer P in func() ?
There is no legal way to do this in C (or even C++ I believe). Yes, somehow free knows how much was malloced but it does so in a way that is not visible or accessible to the programmer. To you, the programmer, it might as well have done it by magic!
If you do decide to try and decode what malloc and free does then it will lead you down the road to proprietary compiler implementations. Be warned that even on the same OS, different compilers (or even different versions of the same compiler, or even the same compiler but using a third party malloc implementation (yes such things exist)) are allowed to do it differently.
When developing applications, to know the memory size allocated to a pointer, we actually pay attention at the moment we allocate memory for it, which in your case is:
int *p = (int *) malloc(10);
and then we store this information somewhere if we need to use it in the future. Something like this:
void func(int *p, size_t size)
{
printf("Memory address 0x%x has %d bytes allocated for it!\n", p, size);
}
int main()
{
int my_bytes = 10;
int *p = malloc(my_bytes);
func(p, my_bytes);
return 0;
}
Many years ago, I programmed on a UNIX-like system that had a msize stdlib function that would return pretty much what you want. Unfortunately, it never became part of any standard.
msize called on a pointer returned from malloc or realloc would return the actual amount of memory the system had allocated for the user program at that address (which might be more than was requested, if it got rounded up for alignment reasons or whatever.)
If you program for Microsoft Windows, you can use the Windows API Heap* functions instead of the functions provided by your programming language (C in your case). You allocate memory with HeapAlloc, reallocate with HeapReAlloc, free memory with HeapFree, and, finally, obtain the size of a previously allocated block with the HeapSize function.
Another option, of course, is to write wrapper functions for malloc and friends, that store an index of allocated blocks and their sizes. This way you can work with your own functions for allocating, reallocating, freeing, and measuring memory blocks. Writing such wrapper functions should be trivial (although I do not know C, so I cannot do it for you...).
Firstly, as all the others have said, there is no portable way to know the size in allocation unless you keep this information.
All that matters is the implementation of the standard C library. Most libraries only keep the size of memory allocated to a pointer, which is usually larger than the size your program requests. Letting the library keep the requested size is a bad idea because this costs extra memory and at times we do not care about the requested size.
Strictly speaking, recording the requested size is NOT a feature of a compiler. It seems to be sometimes because a compiler may reimplement part of the standard library and override the system default. However, I would not use a library recording requested size because it is likely to have bigger memory footprint due to the reason I said above.

Does malloc() allocate a contiguous block of memory?

I have a piece of code written by a very old school programmer :-) . it goes something like this
typedef struct ts_request
{
ts_request_buffer_header_def header;
char package[1];
} ts_request_def;
ts_request_def* request_buffer =
malloc(sizeof(ts_request_def) + (2 * 1024 * 1024));
the programmer basically is working on a buffer overflow concept. I know the code looks dodgy. so my questions are:
Does malloc always allocate contiguous block of memory? because in this code if the blocks are not contiguous, the code will fail big time
Doing free(request_buffer) , will it free all the bytes allocated by malloc i.e sizeof(ts_request_def) + (2 * 1024 * 1024),
or only the bytes of the size of the structure sizeof(ts_request_def)
Do you see any evident problems with this approach, I need to discuss this with my boss and would like to point out any loopholes with this approach
To answer your numbered points.
Yes.
All the bytes. Malloc/free doesn't know or care about the type of the object, just the size.
It is strictly speaking undefined behaviour, but a common trick supported by many implementations. See below for other alternatives.
The latest C standard, ISO/IEC 9899:1999 (informally C99), allows flexible array members.
An example of this would be:
int main(void)
{
struct { size_t x; char a[]; } *p;
p = malloc(sizeof *p + 100);
if (p)
{
/* You can now access up to p->a[99] safely */
}
}
This now standardized feature allowed you to avoid using the common, but non-standard, implementation extension that you describe in your question. Strictly speaking, using a non-flexible array member and accessing beyond its bounds is undefined behaviour, but many implementations document and encourage it.
Furthermore, gcc allows zero-length arrays as an extension. Zero-length arrays are illegal in standard C, but gcc introduced this feature before C99 gave us flexible array members.
In a response to a comment, I will explain why the snippet below is technically undefined behaviour. Section numbers I quote refer to C99 (ISO/IEC 9899:1999)
struct {
char arr[1];
} *x;
x = malloc(sizeof *x + 1024);
x->arr[23] = 42;
Firstly, 6.5.2.1#2 shows a[i] is identical to (*((a)+(i))), so x->arr[23] is equivalent to (*((x->arr)+(23))). Now, 6.5.6#8 (on the addition of a pointer and an integer) says:
"If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined."
For this reason, because x->arr[23] is not within the array, the behaviour is undefined. You might still think that it's okay because the malloc() implies the array has now been extended, but this is not strictly the case. Informative Annex J.2 (which lists examples of undefined behaviour) provides further clarification with an example:
An array subscript is out of range, even if an object is apparently accessible with the
given subscript (as in the lvalue expression a[1][7] given the declaration int
a[4][5]) (6.5.6).
3 - That's a pretty common C trick to allocate a dynamic array at the end of a struct. The alternative would be to put a pointer into the struct and then allocate the array separately, and not forgetting to free it too. That the size is fixed to 2mb seems a bit unusual though.
This is a standard C trick, and isn't more dangerous that any other buffer.
If you are trying to show to your boss that you are smarter than "very old school programmer", this code isn't a case for you. Old school not necessarily bad. Seems the "old school" guy knows enough about memory management ;)
1) Yes it does, or malloc will fail if there isn't a large enough contiguous block available. (A failure with malloc will return a NULL pointer)
2) Yes it will. The internal memory allocation will keep track of the amount of memory allocated with that pointer value and free all of it.
3)It's a bit of a language hack, and a bit dubious about it's use. It's still subject to buffer overflows as well, just may take attackers slightly longer to find a payload that will cause it. The cost of the 'protection' is also pretty hefty (do you really need >2mb per request buffer?). It's also very ugly, although your boss may not appreciate that argument :)
I don't think the existing answers quite get to the essence of this issue. You say the old-school programmer is doing something like this;
typedef struct ts_request
{
ts_request_buffer_header_def header;
char package[1];
} ts_request_def;
ts_request_buffer_def* request_buffer =
malloc(sizeof(ts_request_def) + (2 * 1024 * 1024));
I think it's unlikely he's doing exactly that, because if that's what he wanted to do he could do it with simplified equivalent code that doesn't need any tricks;
typedef struct ts_request
{
ts_request_buffer_header_def header;
char package[2*1024*1024 + 1];
} ts_request_def;
ts_request_buffer_def* request_buffer =
malloc(sizeof(ts_request_def));
I'll bet that what he's really doing is something like this;
typedef struct ts_request
{
ts_request_buffer_header_def header;
char package[1]; // effectively package[x]
} ts_request_def;
ts_request_buffer_def* request_buffer =
malloc( sizeof(ts_request_def) + x );
What he wants to achieve is allocation of a request with a variable package size x. It is of course illegal to declare the array's size with a variable, so he is getting around this with a trick. It looks as if he knows what he's doing to me, the trick is well towards the respectable and practical end of the C trickery scale.
As for #3, without more code it's hard to answer. I don't see anything wrong with it, unless its happening a lot. I mean, you don't want to allocate 2mb chunks of memory all the time. You also don't want to do it needlessly, e.g. if you only ever use 2k.
The fact that you don't like it for some reason isn't sufficient to object to it, or justify completely re-writing it. I would look at the usage closely, try to understand what the original programmer was thinking, look closely for buffer overflows (as workmad3 pointed out) in the code that uses this memory.
There are lots of common mistakes that you may find. For example, does the code check to make sure malloc() succeeded?
The exploit (question 3) is really up to the interface towards this structure of yours. In context this allocation might make sense, and without further information it is impossible to say if it's secure or not.
But if you mean problems with allocating memory bigger than the structure, this is by no means a bad C design (I wouldn't even say it's THAT old school... ;) )
Just a final note here - the point with having a char[1] is that the terminating NULL will always be in the declared struct, meaning there can be 2 * 1024 * 1024 characters in the buffer, and you don't have to account for the NULL by a "+1". Might look like a small feat, but I just wanted to point out.
I've seen and used this pattern frequently.
Its benefit is to simplify memory management and thus avoid risk of memory leaks. All it takes is to free the malloc'ed block. With a secondary buffer, you'll need two free. However one should define and use a destructor function to encapsulate this operation so you can always change its behavior, like switching to secondary buffer or add additional operations to be performed when deleting the structure.
Access to array elements is also slightly more efficient but that is less and less significant with modern computers.
The code will also correctly work if memory alignment changes in the structure with different compilers as it is quite frequent.
The only potential problem I see is if the compiler permutes the order of storage of the member variables because this trick requires that the package field remains last in the storage. I don't know if the C standard prohibits permutation.
Note also that the size of the allocated buffer will most probably be bigger than required, at least by one byte with the additional padding bytes if any.
Yes. malloc returns only a single pointer - how could it possibly tell a requester that it had allocated multiple discontiguous blocks to satisfy a request?
Would like to add that not is it common but I might also called it a standard practice because Windows API is full of such use.
Check the very common BITMAP header structure for example.
http://msdn.microsoft.com/en-us/library/aa921550.aspx
The last RBG quad is an array of 1 size, which depends on exactly this technique.
This common C trick is also explained in this StackOverflow question (Can someone explain this definition of the dirent struct in solaris?).
In response to your third question.
free always releases all the memory allocated at a single shot.
int* i = (int*) malloc(1024*2);
free(i+1024); // gives error because the pointer 'i' is offset
free(i); // releases all the 2KB memory
The answer to question 1 and 2 is Yes
About ugliness (ie question 3) what is the programmer trying to do with that allocated memory?
the thing to realize here is that malloc does not see the calculation being made in this
malloc(sizeof(ts_request_def) + (2 * 1024 * 1024));
Its the same as
int sz = sizeof(ts_request_def) + (2 * 1024 * 1024);
malloc(sz);
YOu might think that its allocating 2 chunks of memory , and in yr mind they are "the struct", "some buffers". But malloc doesnt see that at all.

Resources