I'm trying to implement a new malloc that stores the size at the front of the malloc'ed region, and then returns a pointer to the incremented location (what comes after the stored unsigned int).
void* malloc_new(unsigned size) {
void* result = malloc(size + sizeof(unsigned));
((unsigned*)result)[0] = size;
result += sizeof(unsigned);
return result;
}
I'm having doubts regarding whether the
result += sizeof(unsigned);
line is correct (does what I want).
Say the original address in the heap for the malloc is X, and the size of unsigned is 4, I want the 'result' pointer to point to X + 4, right? Meaning that the memory location in the stack that stores the 'result' pointer should contain (the original heap address location + 4).
result += sizeof(unsigned); should give you at least a warning (pointer arithmetic on void * leads to undefined behavior).
unsigned *result = malloc(size + sizeof size);
result[0] = size;
return result + 1;
should be the easier way.
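Please note that the returned memory is not suitably aligned for all possible data types. You will run into trouble if you use this memory for double or other 64-bit data types on platforms that require 8-byte alignment for them. You should use an 8-byte type such as uint64_t for storing the size; then the memory block after it is aligned for 8-byte types (types with stricter requirements, such as long double on some platforms, need a larger header).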
In addition to the problems noted in other answers with performing pointer arithmetic on void * pointers, you're also likely violating one of the restrictions the C standard places on memory returned from functions such as malloc().
7.22.3 Memory management functions, paragraph 1 of the C standard states:
The order and contiguity of storage allocated by successive calls to the aligned_alloc, calloc, malloc, and realloc functions is unspecified. The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object with a fundamental alignment requirement and then used to access such an object or an array of such objects in the space allocated (until the space is explicitly deallocated). The lifetime of an allocated object extends from the allocation until the deallocation. Each such allocation shall yield a pointer to an object disjoint from any other object. The pointer returned points to the start (lowest byte address) of the allocated space. If the space cannot be allocated, a null pointer is returned. If the size of the space requested is zero, the behavior is implementation-defined: either a null pointer is returned, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.
Note the second sentence, the one about alignment.
Unless your system has a fundamental alignment that's only four bytes (8 or 16 is much more typical), you are violating that restriction, and will invoke undefined behavior per 6.3.2.3 Pointers, paragraph 7 for any object type with a fundamental alignment requirement larger than four bytes:
... If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. ...
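One way to keep the returned pointer aligned is to make the header as large as the strictest fundamental alignment. The following is only a sketch under that assumption; the names hdr_malloc, hdr_size, hdr_free and HEADER_SIZE are made up for illustration and are not from the question.

```c
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header size: one max_align_t-aligned slot, so the address
   right after the header is still aligned for any fundamental type. */
#define HEADER_SIZE alignof(max_align_t)

_Static_assert(sizeof(size_t) <= alignof(max_align_t),
               "header must be large enough to hold a size_t");

/* Allocate size bytes and remember size in a hidden header. */
void *hdr_malloc(size_t size) {
    if (size > SIZE_MAX - HEADER_SIZE)
        return NULL;                       /* size + header would overflow */
    unsigned char *block = malloc(HEADER_SIZE + size);
    if (block == NULL)
        return NULL;
    memcpy(block, &size, sizeof size);     /* store the size at the front */
    return block + HEADER_SIZE;            /* byte-wise step past the header */
}

/* Read back the size stored by hdr_malloc. */
size_t hdr_size(const void *ptr) {
    size_t size;
    memcpy(&size, (const unsigned char *)ptr - HEADER_SIZE, sizeof size);
    return size;
}

/* Release a block obtained from hdr_malloc. */
void hdr_free(void *ptr) {
    if (ptr != NULL)
        free((unsigned char *)ptr - HEADER_SIZE);
}
```

The cost is a few wasted bytes per allocation (typically 8 of the 16 header bytes on x86-64), in exchange for a pointer that can safely hold a double or any other type with a fundamental alignment.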
How void pointer arithmetic is happening in GCC
C does not allow pointer arithmetic with void * pointer type.
GNU C allows it by considering the size of void is 1.
The result is void* so result += sizeof(unsigned); just happens to work on compatible compilers.
You can refactor your function into:
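void *malloc_new(unsigned size) {
    void *result = malloc(size + sizeof(unsigned));
    if (result == NULL)
        return NULL;                    /* allocation failed */
    ((unsigned *)result)[0] = size;     /* store the size at the front */
    result = (char *)result + sizeof(unsigned);  /* byte-wise step, valid ISO C */
    return result;
}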
Side note, before void type and void* generic pointer existed in the C language, programmers used char* to represent a generic pointer.
You can do void* arithmetic if you cast the type first to char*, for instance. And then cast back to void*. To get better alignment, use 64 bit type for the size, e.g. uint64_t.
#define W_REF_VOID_PTR(ptr,offset) \
((void*)((char*) (ptr) + (offset)))
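To make the difference between stepping in bytes and stepping in elements concrete, here is a small sketch. The function name skip_one_unsigned_agrees is hypothetical; the macro is the one shown above.

```c
#include <stddef.h>

/* Macro from the answer above: byte-wise offset via char *. */
#define W_REF_VOID_PTR(ptr, offset) ((void *)((char *)(ptr) + (offset)))

/* Returns 1 if three ways of skipping one unsigned yield the same address.
   base must point to at least one unsigned object. */
int skip_one_unsigned_agrees(unsigned *base) {
    void *by_bytes = (char *)base + sizeof(unsigned);         /* step in bytes */
    void *by_elems = base + 1;                                /* step in elements */
    void *by_macro = W_REF_VOID_PTR(base, sizeof(unsigned));  /* macro form */
    return by_bytes == by_elems && by_elems == by_macro;
}
```

Arithmetic on unsigned * advances by sizeof(unsigned) bytes per step, which is why result + 1 on an unsigned * and + sizeof(unsigned) on a char * land on the same address.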
The malloc() function allocates a single block of memory (say 20 bytes typecasted to int), so how can it be used as an array of int blocks, as with the calloc() function? Shouldn't it be used to store just one int value in the whole 20 bytes (20*8 bits)?
(say 20 bytes typecasted to int)
No, the returned memory is given as a pointer to void, an incomplete type.
We assign the returned pointer to a variable of pointer to some type, and we can use that variable to access the memory.
Quoting C11, chapter §7.22.3, Memory management functions
[....] The
pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to
a pointer to any type of object with a fundamental alignment requirement and then used
to access such an object or an array of such objects in the space allocated (until the space
is explicitly deallocated). [...] The pointer returned points to the start (lowest byte address) of the
allocated space. [....]
Since the allocated memory is contiguous, pointer arithmetic works, just as in case of arrays, since in arrays also, elements are placed in contiguous memory.
One point to clarify, a pointer is not an array.
There's an abstract concept in C formally known as effective type, meaning the actual type of the data stored in memory. This is something the compiler keeps track of internally.
Most objects in C have such an effective type at the point when the variable is declared, for example if we type int a; then the effective type of what's stored in a is int.
Meaning it is legal to do evil things like this:
int a;
double* d = (double*)&a;
*(int*)d = 1;
This works because the effective type of the actual memory remains an int, even though we pointed at it with a wildly incompatible type. As long as we access it with the same type as the effective type, all is well. If we access the data using the wrong type, very bad things will happen, such as program crashes or dormant bugs.
But when we call malloc family of functions, we only tell them to reserve n number of bytes, with no type specified. This memory is guaranteed to be allocated in adjacent memory cells, but nothing else. The only difference between malloc and calloc is that the latter sets all values in this raw memory to zero. Neither function knows anything about types or arrays.
The returned chunk of raw memory has no effective type. Not until the point when we access it, then it gets the effective type which corresponds to the type used for the access.
So just as in the previous example, it doesn't matter which type of pointer we set to point at the data. It doesn't matter if we write int* i = malloc(n); or bananas_t* b = malloc(n);, because the pointed-at memory does not yet have a type. It does not get one until at the point where we access it for the first time.
There is nothing special about memory returned from malloc compared to memory returned from calloc, other than the fact that the bytes of the memory block returned by calloc are initialized to 0. Memory returned by malloc does not have to be used for a single object but may also be used for an array.
This means that the following are equivalent:
int *p1 = malloc(3 * sizeof(int));
p1[0] = 1;
p1[1] = 2;
p1[2] = 3;
...
int *p2 = calloc(3, sizeof(int));
p2[0] = 1;
p2[1] = 2;
p2[2] = 3;
Both will return 3 * sizeof(int) bytes of memory which can be used as an array of int of size 3.
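As a rough illustration of that equivalence, the sketch below builds a zeroed int array both ways. The helper names zeroed_ints_malloc and zeroed_ints_calloc are invented for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Allocate n ints with malloc, then zero the bytes by hand.
   Note: unlike calloc, n * sizeof *p is not checked for overflow here. */
int *zeroed_ints_malloc(size_t n) {
    int *p = malloc(n * sizeof *p);
    if (p != NULL)
        memset(p, 0, n * sizeof *p);
    return p;
}

/* Allocate n ints with calloc, which zeroes the bytes itself. */
int *zeroed_ints_calloc(size_t n) {
    return calloc(n, sizeof(int));
}
```

Either pointer can then be indexed as p[0] through p[n - 1], and both blocks are released with free.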
What malloc returns back to you is just a pointer to the starting memory address where the contiguous block of memory was allocated.
The size of the contiguous block of memory that you allocated using malloc depends on the argument you passed into malloc function. http://www.cplusplus.com/reference/cstdlib/malloc/
If you want to store int variable then you will do it by defining the pointer type you use to be of an int type.
example:
int *p; // pointer to int
size_t size = 20;
p = (int *) malloc(size); //returns to pointer p the memory address
After this, the programmer can use the pointer p to access int values (typically 4 bytes each, so 20 bytes hold five of them).
The only difference between calloc and malloc is that calloc initializes all bytes of the memory block to zero.
I'm trying to use shared memory segments in POSIX and am having a lot of trouble figuring out if there is memory at a certain address.
I saw a solution that uses file_size = *(size_t *)ptr
Where ptr is the returned pointer from some call to mmap (.... )
I don't really understand how this works. What does the *(size_t *) typecast do? I assume (size_t)*var would cast the value at pointer var to a size_t type. But then, when I put another asterisk... this would give me a pointer again, wouldn't it?
There is no general way to determine the size of the allocated memory to which a given pointer points, or even whether it points to a valid object. There might be some system-specific ways to determine something similar, but they're likely to be unreliable -- and they can't determine that a pointer points to a valid object, but not to the one that it's supposed to point to.
You'll just have to keep careful track of this information yourself.
The method you describe:
file_size = *(size_t *)ptr;
can work if the memory happens to have been allocated by something that specifically stores the size at the beginning of the allocated region -- but only if you already know that ptr is valid.
ptr could be a pointer of any type (other than a function pointer). The cast (size_t *) converts the value of ptr so you can treat it as a pointer to a size_t object (size_t is an unsigned integer type used to represent sizes). Dereferencing that size_t* value with the * dereference operator gives you the value of the size_t object.
Here's an example of a hypothetical allocation function that might work this way:
void *allocate(size_t size) {
void *result = malloc(sizeof (size_t) + size);
if (result != NULL) {
*(size_t*)result = size;
}
return result;
}
and a function that gives you the currently allocated size:
size_t curr_size(void *ptr) {
return *(size_t*)ptr;
}
NOTE that this ignores alignment issues. If you're allocating memory for something that requires stricter alignment than size_t does, this can fail badly.
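To contrast the two expressions from the question, here is a small sketch. The function names read_as_size and first_byte_as_size are hypothetical, and both functions assume the pointer they receive is valid and suitably aligned.

```c
#include <stddef.h>

/* *(size_t *)ptr: first convert the pointer so that it points to a size_t,
   then dereference it, reading a whole size_t object. */
size_t read_as_size(const void *ptr) {
    return *(const size_t *)ptr;
}

/* (size_t)*var: first dereference var (reading a single byte here),
   then convert that one value to size_t. */
size_t first_byte_as_size(const unsigned char *var) {
    return (size_t)*var;
}
```

So the cast in *(size_t *)ptr applies to the pointer, and the leading asterisk yields the pointed-to value, not another pointer.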
char arr[512] = {0};
int *ptr = (int *)arr; // WRONG
// A bus error can be caused by unaligned memory access
printf("%d\n", *ptr);
On the other hand:
The block that malloc gives you is guaranteed to be aligned so that it
can hold any type of data.
char *arr= malloc(512);
int *ptr = (int *)arr; // OK, arr is properly aligned for ptr
memset(arr, 0, 512);
printf("%d\n", *ptr);
Is this assumption correct or am I missing something?
The C standard guarantees that malloc will return memory suitably aligned for the most stringent fundamental type (for example uint64_t). If you have more stringent requirements you have to use aligned_alloc or something like it.
7.22.3
The pointer returned if the allocation succeeds is suitably aligned so
that it may be assigned to a pointer to any type of object with a
fundamental alignment requirement and then used to access such an
object or an array of such objects in the space allocated (until the
space is explicitly deallocated)
About aligned_alloc:
void *aligned_alloc(size_t alignment, size_t size);
The aligned_alloc function allocates space for an object whose
alignment is specified by alignment, whose size is specified by size,
and whose value is indeterminate.
Your code is correct as far as alignment is concerned. I don't particularly like the pointer conversion (char * to int *) but I think it should work fine.
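For the first snippet in the question, where the char array may not be aligned for int, a common alternative is to copy the bytes instead of converting the pointer. This is a sketch only; load_int and store_int are made-up helper names.

```c
#include <string.h>

/* Copy an int's bytes into the buffer at any byte offset. */
void store_int(char *bytes, size_t offset, int value) {
    memcpy(bytes + offset, &value, sizeof value);
}

/* Read an int back from any byte offset without forming an int *,
   so no alignment requirement applies to bytes + offset. */
int load_int(const char *bytes, size_t offset) {
    int value;
    memcpy(&value, bytes + offset, sizeof value);
    return value;
}
```

The caller is still responsible for keeping offset + sizeof(int) within the buffer.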
I'm studying this malloc function and I could use some help:
static void *malloc(int size)
{
void *p;
if (size < 0)
error("Malloc error");
if (!malloc_ptr)
malloc_ptr = free_mem_ptr;
malloc_ptr = (malloc_ptr + 3) & ~3; /* Align */
p = (void *)malloc_ptr;
malloc_ptr += size;
if (free_mem_end_ptr && malloc_ptr >= free_mem_end_ptr)
error("Out of memory");
malloc_count++;
return p;
}
I know that the malloc function allocates memory space for any type, if there is enough memory, but the lines I don't understand are:
p = (void *)malloc_ptr;
malloc_ptr += size;
How can it point to any data type like that? I just can't understand that void pointer or its location.
NOTE: malloc_ptr is an unsigned long
The reason it returns a void pointer is that it has no idea what you are allocating space for in the malloc call. All it knows is the amount of space you requested. It is up to you or your compiler to decide what will fill the memory. General-purpose allocators typically keep their bookkeeping (which regions are free, and how large each block is) in a structure such as a linked list, which the free function updates when a block is released.
This is the implementation of malloc, so it is allowed to do things that would not be legitimate in a regular program. Specifically, it is making use of the implementation-defined conversion from unsigned long to void *. Program initialization sets malloc_ptr to the numeric address of a large block of unallocated memory. Then, when you ask for an allocation, malloc makes a pointer out of the current value of malloc_ptr and increases malloc_ptr by the number of bytes you asked for. That way, the next time you call malloc it will return a new pointer.
This is about the simplest possible implementation of malloc. Most notably, it appears not to ever reuse freed memory.
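The same idea can be sketched in portable C over a fixed buffer. This is not the code from the question: arena, arena_used and bump_alloc are names assumed for illustration, it tracks an offset instead of a numeric address, and it returns NULL instead of calling error().

```c
#include <stdalign.h>
#include <stddef.h>

/* Hypothetical backing store standing in for the free memory region. */
static alignas(max_align_t) unsigned char arena[1024];
static size_t arena_used = 0;   /* offset of the first unallocated byte */

/* Hand out size bytes by bumping an offset; memory is never reused. */
void *bump_alloc(size_t size) {
    size_t start = (arena_used + 3) & ~(size_t)3;   /* round up to a multiple of 4 */
    if (start > sizeof arena || size > sizeof arena - start)
        return NULL;                                /* out of memory */
    arena_used = start + size;                      /* next call starts after this block */
    return &arena[start];
}
```

Each call returns the current position and then advances it, so successive calls yield distinct, non-overlapping blocks, just as the malloc_ptr += size line does.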
Malloc is returning a pointer for a chunk of completely unstructured, flat memory. The (void *) pointer means that it has no idea what it's pointing to (no structure), merely that it points to some memory of size size.
Outside of your call to malloc, you can then tell your program that this pointer has some structure. I.e., if you have a structure some_struct you can say: struct some_struct *pStruct = (struct some_struct *) malloc(sizeof(struct some_struct)).
See how malloc only knows the size of what it is going to allocate, but does not actually know its structure? Your call to malloc is passing in no information about the structure, merely the size of how much memory to allocate.
This is C's way of being generic: malloc returns you a certain amount of memory and it's your job to cast it to the structured memory you need.
p = (void *)malloc_ptr;
malloc returns a void pointer, which indicates that it is a pointer to a region of unknown data type. The use of casting is only required in C++ due to the strong type system, whereas this is not the case in C. The lack of a specific pointer type returned from malloc is type-unsafe behaviour according to some programmers:
malloc allocates based on byte count but not on type.
malloc_ptr += size;
C implicitly casts from and to void*, so the cast will be done automatically. In C++ only conversion to void* would be done implicitly, for the other direction an explicit cast is required.
Wiki explanation about type casting: malloc function returns an untyped pointer type void *, which the calling code must cast to the appropriate pointer type. Older C specifications required an explicit cast to do so, therefore the code
(struct foo *) malloc(sizeof(struct foo))
became the accepted practice.
However, this practice is discouraged in ANSI C as it can mask a failure to include the header file in which malloc is defined, resulting in
downstream errors on machines where the int and pointer types are of different sizes,
such as the now-ubiquitous x86_64 architecture. A conflict arises in code that is
required to compile as C++, since the cast is necessary in that language.
Looking at these two lines:
p = (void *)malloc_ptr;
malloc_ptr += size;
Here malloc_ptr is of type unsigned long, so its value is cast to void * (a pointer to void) and stored in p.
Similarly, the second line means malloc_ptr = malloc_ptr + size;
Both are there for the developer's convenience: p is a void pointer because, when an application calls malloc, the function does not know which type of object the memory block will hold. It therefore always returns this generic void pointer, which the application can then convert to the type it needs.
Likewise, for the second line, consider what would happen with this condition if a negative size were passed in:
if (free_mem_end_ptr && malloc_ptr >= free_mem_end_ptr)
error("Out of memory");
Here is a little snippet of code from Wikipedia's article on malloc():
int *ptr;
ptr = malloc(10 * sizeof (*ptr)); // Without a cast
ptr = (int*)malloc(10 * sizeof (int)); // With a cast
I was wondering if someone could help me understand what is going on here. So, from what I know, it seems like this is what's happening:
1) initialize an integer pointer that points to NULL. It is a pointer so its size is 4-bytes. Dereferencing this pointer will return the value NULL.
2) Since C allows for this type of automatic casting, it is safe not to include a cast-to-int-pointer. I am having trouble deciphering what exactly is being fed into the malloc function though (and why). It seems like we are getting the size of the dereferenced value of ptr. But isn't this NULL? So the size of NULL is 0, right? And why are we multiplying by 10??
3) The last line is just the same thing as above, except that a cast is explicitly declared. (cast from void pointer to int pointer).
I'm assuming we're talking about C here. The answer is different for C++.
1) is entirely off. ptr is a pointer to an int, that's all. It's uninitialized, so it has no deterministic value. Dereferencing it is undefined behaviour -- you will most certainly not get 0 out! The pointer also will most likely not point to 0. The size of ptr is sizeof(ptr), or sizeof(int*); nothing else. (At best you know that this is no larger than sizeof(void*).)
2/3) In C, never cast the result of malloc: int * p = malloc(sizeof(int) * 10);. The code allocates enough memory for 10 integers, i.e. 10 times the size of a single integer; the return value of the call is a pointer to that memory.
The first line declares a pointer to an integer, but doesn't initialize it -- so it points at some random piece of memory, probably invalid. The size of ptr is whatever size pointers to int are, likely either 4 or 8 bytes. The size of what it points at, which you'd get by dereferencing it when it points somewhere valid, is whatever size an int has.
The second line allocates enough memory for 10 ints from the heap, then assigns it to ptr. No cast is used, but the void * returned by malloc() is automatically converted to whatever type of pointer is needed when assigned. The sizeof (*ptr) gives the size of the dereferenced ptr, i.e. the size of what ptr points to (an int). For sizeof, it doesn't matter whether ptr actually points to a valid memory, just what the type would be.
The third line is just like the second, but with two changes: It explicitly casts the void * return from malloc() to an int *, to match the type of ptr; and it uses sizeof with the type name int rather than an expression of that type, like *ptr. The explicit cast is not necessary, and some people strongly oppose its use, but in the end it comes down to preference.
After either of the malloc()s ptr should point to a valid location on the heap and can be dereferenced safely, as long as malloc was successful.
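Putting those points together, a minimal sketch of the no-cast form with a failure check might look like this. The function make_sequence is invented for the example.

```c
#include <stdlib.h>

/* Allocate count ints and fill them with 0, 1, 2, ...; NULL on failure. */
int *make_sequence(size_t count) {
    int *ptr;                            /* uninitialized: its value is indeterminate */
    ptr = malloc(count * sizeof *ptr);   /* sizeof only needs the type of *ptr;
                                            ptr is not actually dereferenced */
    if (ptr == NULL)
        return NULL;                     /* malloc failed; do not dereference */
    for (size_t i = 0; i < count; i++)
        ptr[i] = (int)i;                 /* now safe to access ptr[0..count-1] */
    return ptr;
}
```

The caller owns the returned block and should pass it to free when done.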
For line 2, malloc() is allocating enough memory to hold 10 ints (sizeof (*ptr) is the size of an int, not of a pointer).
malloc() is a general-purpose function that returns void *, so the result is converted (implicitly in C, with an explicit cast in C++) to whatever pointer type you actually want to use, in the above example pointer to int.