Given the following code snippet in C:
int* x;
x = (int *) malloc(40);
We know that this is an explicit heap dynamic allocation.
If I change the code to this though:
int* x = (int *) malloc(40);
is it still an explicit heap dynamic? My friend thinks it's a stack dynamic, but I think its an explicit heap dynamic because we're allocating memory from the heap.
Explicit heap dynamic is defined as variables that are allocated and deallocated by explicit run-time instructions written by the programmer. Wouldn't that imply that any malloc/calloc call would be explicit heap?
Edit: I spoke to my professor and she clarified some stuff for me.
When we declare something like
char * str = (char *) malloc(15);
We say that str is of data type char pointer, and has a stack dynamic storage binding. However, when we are referring to the object referenced by str, we say that it is explicit heap dynamic.
First off, if you are going to store the results of malloc in a variable, then declare it properly as a pointer, not an int.
The problem with storing the malloc results in an int, is that you could possibly truncate the pointer and blow up when you dereference it later. For instance if you ran that expression on a 64 bit system, the malloc would come back with a 8 byte pointer. But since you are assigning it to a 4 byte int, it gets truncated. Not good.
Your code should be like this:
int* x = (int*)malloc(bla);
Anyways, the int pointer x itself is stored on the stack. But don't get the two confused. X itself is a pointer on the stack, but it points to memory allocated on the heap.
Note:
32 bit applications (usually) have 4 byte pointers.
64 bit applications (usually) have 8 byte pointers.
Yes, malloc allocates requested memory on the heap. Heap memory is used for dynamic data structures that grow and shrink.
From standard 6.3.2.3
Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.
This claims why the type of the pointer should be int*. Also the casting is not needed - void* to int* conversion will be implicitly done over here.
Also why do you think declaring a variable with initializer or without it would affect the storage duration of the memory allocated by *alloc and it's friend functions. It is not.
Interesting thing x has automatic storage duration (usually this is realized using stack) but the memory it contains (yes type of x would be int*) - it will be of allocated storage duration. The thing is heap/stack are not something specified or mentioned by C standard. Most implementations usually realize allocated storage duration using heap.
Related
I read that:
sizeof operator returns the size of the data type, not the amount of memory allocated to the variable.
Isn't the amount of memory allocated depends on the size of the data type? I mean that sizeof will return 4 (architecture-dependent) when I pass int to it.
Am I missing something here?
sizeof returns the number of bytes that a variable or stack allocated array occupies.
Examples:
sizeof(char)=1 (in most configurations)
But sizeof(char*)=8 (depending on the platform)
If you dynamically allocate memory with malloc, you will receive a pointer to that block of memory. If use the sizeof on it, you will just get the size of the pointer.
However, sizeof() a stack allocated array like when you write int a[10] is the size of the allocated memory (so 4*10)
The size of the pointer doesn't depend on the size of the datatype it represents. (On 32 bit platforms, a pointer is 32bit)
The text you quote is technically incorrect. sizeof variable_name does return the size of memory that the variable called variable_name occupies.
The text makes a common mistake of conflating a pointer with the memory it points to. Those are two separate things. If a pointer points to an allocated block, then that block is not allocated to the pointer. (Nor are the contents of the block stored in the pointer -- another common mistake).
The allocation exists in its own right, the pointer variable exists elsewhere, and the pointer variable points to the allocation. The pointer variable could be changed to point elsewhere without disturbing the allocation.
sizeof returns the number of bytes
The sizeof operator yields the size (in bytes) of its operand, which may be an expression or the parenthesized name of a type.
but the size of each byte is not guaranteed to be 8. So you don't obtain directly the amount of memory allocated.
A byte is composed of a contiguous sequence of bits, the number of which is implementation-defined
anyway you can deduce the amount of memory allocated using the CHAR_BIT constant, which contains the number of bit is a byte.
"Memory allocation" in C typically refers to explicit allocation (i.e: on the heap - malloc() and friends), or implicit allocation (i.e: on the stack).
As you've defined, sizeof() returns the size of the data type:
sizeof(char) - a single char
sizeof(void *) - an void pointer
If you call malloc(sizeof(int)), you're requesting "enough memory to hold the data for an int", which may be 4 bytes on your system... you may find that more memory than you requested is allocated (though this will typically be hidden from you, see canaries).
Additionally, if you call int *x = malloc(1024), and sizeof(*x), you might get 4, because an int happens to be 4 bytes... even though the memory you've allocated is 1 KiB. If you were to incorrectly call sizeof(x), then you'll get the size of a pointer returned, not the size of the type it points to. Neither of these (sizeof(*x) or sizeof(x)) will return 1024.
malloc() function forms a single block of memory (say 20 bytes typecasted to int), so how it can be used as an array of int blocks like as calloc() function? Shouldn't it be used to store just one int value in whole 20 bytes (20*8 bits)?
(say 20 bytes typecasted to int)
No, the returned memory is given as a pointer to void, an incomplete type.
We assign the returned pointer to a variable of pointer to some type, and we can use that variable to access the memory.
Quoting C11, chapter §7.22.3, Memory management functions
[....] The
pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to
a pointer to any type of object with a fundamental alignment requirement and then used
to access such an object or an array of such objects in the space allocated (until the space
is explicitly deallocated). [...] The pointer returned points to the start (lowest byte address) of the
allocated space. [....]
Since the allocated memory is contiguous, pointer arithmetic works, just as in case of arrays, since in arrays also, elements are placed in contiguous memory.
One point to clarify, a pointer is not an array.
There's an abstract concept in C formally known as effective type, meaning the actual type of the data stored in memory. This is something the compiler keeps track of internally.
Most objects in C have such an effective type at the point when the variable is declared, for example if we type int a; then the effective type of what's stored in a is int.
Meaning it is legal to do evil things like this:
int a;
double* d = (double*)&a;
*(int*)d = 1;
This works because the effective type of the actual memory remains an int, even though we pointed at it with a wildly incompatible type. As long as we access it with the same type as the effective type, all is well. If we access the data using the wrong type, very bad things will happen, such as program crashes or dormant bugs.
But when we call malloc family of functions, we only tell them to reserve n number of bytes, with no type specified. This memory is guaranteed to be allocated in adjacent memory cells, but nothing else. The only difference between malloc and calloc is that the latter sets all values in this raw memory to zero. Neither function knows anything about types or arrays.
The returned chunk of raw memory has no effective type. Not until the point when we access it, then it gets the effective type which corresponds to the type used for the access.
So just as in the previous example, it doesn't matter which type of pointer we set to point at the data. It doesn't matter if we write int* i = malloc(n); or bananas_t* b = malloc(n);, because the pointed-at memory does not yet have a type. It does not get one until at the point where we access it for the first time.
There is nothing special about memory returned from malloc compared to memory returned from calloc, other that the fact that the bytes of the memory block returned by calloc are initialized to 0. Memory returned by malloc does not have to be used for a single object but may also be used for an array.
This means that the following are equivalent:
int *p1 = malloc(3 * sizeof(int));
p1[0] = 1;
p1[2] = 2;
p1[3] = 3;
...
int *p2 = calloc(3, sizeof(int));
p2[0] = 1;
p2[2] = 2;
p2[3] = 3;
Both will return 3 * sizeof(int) bytes of memory which can be used as an array of int of size 3.
What malloc returns back to you is just a pointer to the starting memory address where the contiguous block of memory was allocated.
The size of the contiguous block of memory that you allocated using malloc depends on the argument you passed into malloc function. http://www.cplusplus.com/reference/cstdlib/malloc/
If you want to store int variable then you will do it by defining the pointer type you use to be of an int type.
example:
int p*; //pointer of type integer
size_t size = 20;
p = (int *) malloc(size); //returns to pointer p the memory address
after this, using the pointer p the programmer can access int (4 byte precision) values.
calloc only difference against malloc is that calloc initiallizes all values at this memory block to zero.
So I'm a bit confused on how to make a function that will return a pointer to an array of ints in C. I understand that you cannot do:
int* myFunction() {
int myInt[aDefinedSize];
return myInt; }
because this is returning a pointer to a local variable.
So, I thought about this:
int* myFunction(){
int* myInt = (int) malloc(aDefinedSize * sizeof(int));
return myInt; }
This gives the error: warning cast from pointer to integer of different size
This implies to use this, which works:
int* myFunction(){
int* myInt = (int*) malloc(aDefinedSize * sizeof(int));
return myInt; }
What I'm confused by though is this:
the (int*) before the malloc was explained to me to do this: it tells the compiler what the datatype of the memory being allocated is. This is then used when, for example, you are stepping through the array and the compiler needs to know how many bytes to increment by.
So, if this explanation I was given is correct, isn't memory being allocated for aDefinedSize number of pointers to ints, not actually ints? Thus, isnt myInt a pointer to an array of pointers to ints?
Some help in understanding this would be wonderful. Thanks!!
So, if this explanation I was given is correct, isn't memory being allocated for aDefinedSize number of pointers to ints, not actually ints?
No, you asked malloc for aDefinedSize * sizeof(int) bytes, not
aDefinedSize * sizeof(int *) bytes. That's the size of memory you get, the type depends on the pointer used to access the memory.
Thus, isnt myInt a pointer to an array of pointers to ints?
No, since you defined it as a int *, a pointer-to-an-int.
Of course the pointer has no knowledge of how large the allocated memory are is, but only points at the first int that fits there. It's up to you as programmer to keep track of the size.
Note that you shouldn't use that explicit typecast. malloc returns a void *, that can be silently assigned to any pointer, as in here:
int* myInt = malloc(aDefinedSize * sizeof(int));
Arithmetic on the pointer works in strides of the pointed-to type, i.e. with int *p, p[3] is the same as *(p+3), which means roughly "go to p, go forward three times sizeof(int) in bytes, and access that location".
int **q would be a pointer-to-a-pointer-to-an-int, and might point to an array of pointers.
malloc allocates an array of bytes and returns void* pointing to the first byte. Or NULL if the allocation failed.
To treat this array as an array of a different data type, the pointer must be cast to that data type.
In C, void* implicitly casts to any data pointer type, so no explicit cast is required:
int* allocateIntArray(unsigned number_of_elements) {
int* int_array = malloc(number_of_elements * sizeof(int)); // <--- no cast is required here.
return int_array;
}
Arrays in C
In C, you want to remember that an array is just an address in memory, plus a length and an object type. When you pass it as an argument to a function or a return value from a function, the length gets forgotten and it’s treated interchangeably with the address of the first element. This has led to a lot of security bugs in programs that either read or write past the end of a buffer.
The name of an array automatically converts to the address of its first element in most contexts, so you can for example pass either arrays or pointers to memmove(), but there are a few exceptions where the fact it also has a length matters. The sizeof() operator on an array is the number of bytes in the array, but sizeof() a pointer is the size of a pointer variable. So if we declare int a[SIZE];, sizeof(a) is the same as sizeof(int)*(size_t)(SIZE), whereas sizeof(&a[0]) is the same as sizeof(int*). Another important one is that the compiler can often tell at compile time if an array access is out of bounds, whereas it does not know which accesses to a pointer are safe.
How to Return an Array
If you want to return a pointer to the same, static array, and it’s fine that you’ll get the same array each time you call the function, you can do this:
#define ARRAY_SIZE 32U
int* get_static_array(void)
{
static int the_array[ARRAY_SIZE];
return the_array;
}
You must not call free() on a static array.
If you want to create a dynamic array, you can do something like this, although it is a contrived example:
#include <stdlib.h>
int* make_dynamic_array(size_t n)
// Returns an array that you must free with free().
{
return calloc( n, sizeof(int) );
}
The dynamic array must be freed with free() when you no longer need it, or the program will leak memory.
Practical Advice
For anything that simple, you would actually write:
int * const p = calloc( n, sizeof(int) );
Unless for some reason the array pointer would change, such as:
int* p = calloc( n, sizeof(int) );
/* ... */
p = realloc( p, new_size );
I would recommend calloc() over malloc() as a general rule, because it initializes the block of memory to zeroes, and malloc() leaves the contents unspecified. That means, if you have a bug where you read uninitialized memory, using calloc() will always give you predictable, reproducible results, and using malloc() could give you different undefined behavior each time. In particular, if you allocate a pointer and then dereference it on an implementation where 0 is a trap value for pointers (like typical desktop CPUs), a pointer created by calloc() will always give you a segfault immediately, while a garbage pointer created by malloc() might appear to work, but corrupt any part of memory. That kind of bug is a lot harder to track down. It’s also easier to see in the debugger that memory is or is not zeroed out than whether an arbitrary value is valid or garbage.
Further Discussion
In the comments, one person objects to some of the terminology I used. In particular, C++ offers a few different kinds of ways to return a reference to an array that preserve more information about its type, for example:
#include <array>
#include <cstdlib>
using std::size_t;
constexpr size_t size = 16U;
using int_array = int[size];
int_array& get_static_array()
{
static int the_array[size];
return the_array;
}
std::array<int, size>& get_static_std_array()
{
static std::array<int, size> the_array;
return the_array;
}
So, one commenter (if I understand correctly) objects that the phrase “return an array” should only refer to this kind of function. I use the phrase more broadly than that, but I hope that clarifies what happens when you return the_array; in C. You get back a pointer. The relevance to you is that you lose the information about the size of the array, which makes it very easy to write security bugs in C that read or write past the block of memory allocated for an array.
There was also some kind of objection that I shouldn’t have told you that using calloc() instead of malloc() to dynamically allocate structures and arrays that contain pointers will make almost all modern CPUs segfault if you dereference those pointers before you initialize them. For the record: this is not true of absolutely all CPUs, so it’s not portable behavior. Some CPUs will not trap. Some old mainframes will trap on a special pointer value other than zero. However, it’s come in very handy when I’ve coded on a desktop or workstation. Even if you’re running on one of the exceptions, at least your pointers will have the same value each time, which should make the bug more reproducible, and when you debug and look at the pointer, it will be immediately obvious that it’s zero, whereas it will not be immediately obvious that a pointer is garbage.
I recently studied about malloc() in C with declaration as follows:
void *malloc(size_t size)
where size_t is unsigned int and size defines the no. of bytes to be reserved.
Question is that on my system float values occupy 4bytes of memory. So if i make memory pointer(of float type) using malloc of 2bytes,
float *p;
p = (float *)malloc(2);
then how come it does not give any error? Because what i think is that float data required 4 bytes so if i issue only 2 bytes to it then it may lead to some data loss.
or is it that i m understanding malloc() incorrrectly?
In the example you give, if you only allocate 2 bytes for a float * and then attempt to write to that location by dereferencing the pointer, you'll be writing to memory that hasn't been allocated. This results in undefined behavior. That means it might work, it might core dump, or it might behave in unpredictable ways.
If you want to allocate memory for one or more floats, you would do it like this:
// allocates space for an array of 5 floats
// don't cast the result of malloc
int arrayLen = 5;
float *f = malloc(sizeof(float) * arrayLen);
You're encountering an implementation-specific result of the requirements of the C Standard:
7.22.3 Memory management functions
The order and contiguity of storage allocated by successive calls to
the aligned_alloc , calloc , malloc , and realloc functions
is unspecified. The pointer returned if the allocation succeeds is
suitably aligned so that it may be assigned to a pointer to any type
of object with a fundamental alignment requirement and then used to
access such an object or an array of such objects in the space
allocated (until the space is explicitly deallocated).
In order to provide storage "suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement", an implementation has to return memory from malloc() et al at specific offsets that are multiples of the most restrictive alignment requirement for the system. That's usually something like 8 or 16 bytes.
Given that each and every block returned has to be aligned that way, most implementations internally create blocks of memory in multiples of the alignment requirement.
So if your system has an 8-byte alignment requirement, your malloc() implementation is likely to actually give you an 8-byte block of memory even though you requested two bytes. Likewise, ask for 19 bytes and you'll likely get something like 24 in reality.
It's still undefined behavior to go beyond what you asked for, though. And undefined behavior does unfortunately include "works just fine".
This can be problematic if you try to use that pointer -actually the pointer is fine, it's the allocated memory it points to that's too small-, the reason the compiler doesn't recognize it as an error is the fact that the pointer isn't "aware" of what it points to , actually pointers are variables that contain memory addresses, so basically they're just a number, and in most cases ( as user694733 pointed out) the size of the pointer is the same whether it points to a short or a float.
what the compiler sees is a cast from (void*) to (float*) and to the compiler it's a totally valid cast.
Your question has actually nothing to do with the malloc, but rather to the data casting. In this case you have casted the bytes located starting from the address returned by malloc to float. Thus, if you later say *p = 0.0f; you will actually write 4 bytes to the mentioned memory area, but only 2 bytes are ligal to you to use since you have allocated only 2 bytes. Therefore, your code will compile and run with memory corruption (this either will result in crash, or in an unexpected runtime behavior later)
Reading some literature, I was able to grasp that realloc takes in a void pointer and a size variable and re-allocates the memory of the block the void pointer points to.
What will happen if realloc is called on an integer pointer (int *)
with a size of character? And vice versa.
What can be a possible application of this? (An example would definitely help.)
The realloc() function is the all-in-one memory management system.
If called with a null pointer and a non-zero size, it allocates memory.
If called with a valid pointer and a zero size, it frees memory.
If called with a valid pointer and a non-zero size, it changes the size of the allocated memory.
If you call realloc() with an invalid pointer — one which was not obtained from malloc(), calloc() or realloc() — then you get undefined behaviour.
You could pass realloc() an integer pointer to an allocated space of sizeof(char) bytes (1 byte), but you'd be in danger of invoking undefined behaviour. The problem is not with realloc(); it is with the code that was given an unusable integer pointer. Since only 1 byte was allocated but sizeof(int) is greater than 1 (on essentially all systems; there could be exceptions, but not for someone asking this question), there is no safe way to use that pointer except by passing it to free() or realloc().
Given:
int *pointer = malloc(sizeof(char));
you cannot do *pointer = 0; because there isn't enough space allocated (formally) for it to write to. You cannot do int x = *pointer; because there isn't enough space allocated (formally) for it to read from. The word 'formally' is there because in practice, the memory allocators allocate a minimum size chunk, which is often 8 or 16 bytes, so there actually is space after the one byte. However, your are stepping outside the bounds of what the standard guarantees, and it is possible to conceive of memory allocators that would hand you exactly one byte. So, don't risk it. An integer pointer to a single byte of allocated memory is unusable except as an argument to the memory allocation functions.
The first argument to realloc() is a void *. Since you're going to have a prototype in scope (#include <stdlib.h>), the compiler will convert the int * to a void * (if there's anything to do for such a cast), and as long as the space pointed at was allocated, everything will be fine; realloc() will change the allocation size, possibly returning the same pointer or possibly returning a different pointer, or it will release the space if the new size is zero bytes.
There is a vitally important requirement to non-NULL pointers that you pass to realloc: they must themselves come from a call to malloc, calloc or realloc, otherwise the behavior is undefined.
If you allocate a chunk of memory sufficient to store an int and then realloc for a char, you will always get back the same pointer, because sizeof(char) is less than or equal to the sizeof(int):
int* intPtr = malloc(sizeof(int));
int* otherPtr = realloc(intPtr, sizeof(char));
// intPtr == otherPtr
If you try it the other way around, you will almost certainly get back the same pointer as well, because memory allocators rarely, if ever, parcel the memory to chunks smaller than sizeof(int). However, the result is implementation-dependent, so theoretically you may get back a different address.
As far as the utility of any of the above exercises goes, it is not useful: realloc has been designed with the intention to help you manage variable-sized arrays, simplifying the code for growing the size of such arrays, and potentially reducing the number of allocations and copying. I do not see a reason for realloc-ing a scalar.
The type of the first parameter of realloc is void *, as you said yourself. So the function argument that you pass is converted to a void pointer, which is an implicit and safe conversion.
It's the same as if you called a function with a long int parameter with an int argument, essentially.
Realloc takes the size in bytes. If you do
int* a= malloc(sizeof(int));
and then
a=realloc(a,1);
of course a will now not be big enough for an int type and writing an int in it will give you odd behavior.
As stated by dasblinkenlight, realloc doesn't make sense for a scalar.
In your example:
int * a = malloc(sizeof(int));
int * b = realloc(a,sizeof(char));
Would result in a == b, simply because sizeof(char) < sizeof(int) and there is no reason for moving the data to a new location.
The difference is, that after the realloc you are writing your int to unallocated space, as you decreased the allocated space using realloc. Here this is only of theoretical relevance. Because of alignment it is very unlikely, that the os is reusing the freed memory. But you shouldn't rely on that. This is undefined behavior.
It could become relevant, if you resize the space of a long int to an int. This depends on your architecture.
Writing to unallocated space is like laying a time bomb and not knowing when it will explode. You should only read from / write to allocated space.