Difference between two malloc calls for a struct - c

I would like to know if there's a real difference between this:
c = (struct_t *) malloc(sizeof(struct_t));
and this
c = malloc(sizeof(struct_t *));
Besides avoiding the cast, does the compiler gain any advantage from the second form over the first? Or are the two completely equivalent, making this a purely aesthetic question?

The first allocates sizeof(struct_t) bytes, the second sizeof(struct_t*) bytes.
Apart from that, there is no difference in what malloc does, whether you cast the result or not. Casting the result makes the code acceptable to C++ compilers, but it can hide the mistake of not including stdlib.h, therefore it is widely preferred to not cast the result in C.

The two are totally different. The first allocates an instance of the struct, whereas the second allocates a pointer to the struct.
In general, they won't even allocate the same number of bytes.
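To make the size difference concrete, here is a minimal sketch. The question never shows struct_t, so the definition below is a made-up example, and bytes_for_object, bytes_for_pointer and struct_new are hypothetical helper names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical struct for illustration; the question does not define struct_t. */
typedef struct {
    char name[64];
    int  id;
} struct_t;

/* With this definition the object is far larger than a pointer to it. */
static_assert(sizeof(struct_t) > sizeof(struct_t *),
              "struct_t is expected to be larger than a pointer");

/* Bytes requested by malloc(sizeof(struct_t)): room for one whole object. */
size_t bytes_for_object(void)  { return sizeof(struct_t); }

/* Bytes requested by malloc(sizeof(struct_t *)): room for one pointer only. */
size_t bytes_for_pointer(void) { return sizeof(struct_t *); }

/* Allocates one struct_t with the type-independent idiom; caller frees. */
struct_t *struct_new(void)
{
    struct_t *c = malloc(sizeof *c);   /* sizeof *c is sizeof(struct_t) */
    return c;                          /* NULL if the allocation failed */
}
```

Writing sizeof *c ties the requested size to the pointed-to type, so the pointer-size mistake from the question cannot creep in.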

No, they are not the same. The latter allocates 4 or 8 bytes of space (depending on pointer size) for a pointer to the struct; the first allocates enough space for the struct itself.
When sizeof(struct_t) is small enough, and when malloc actually allocates more than requested, the user may not see the difference.

The two forms are different. They both allocate memory, but in different amounts.
The general rule is as follows:
when allocating type T, the result of malloc should be cast to T*.
void sample1()
{
struct pollfd *pfd = (struct pollfd*)malloc(sizeof(struct pollfd));
// pfd points to a block of memory the size of struct pollfd
...
free(pfd);
}
void sample2()
{
struct pollfd *pfd = (struct pollfd*)malloc(sizeof(*pfd));
// same as above, but uses variable type instead
free(pfd);
}
If you specify the wrong type in the malloc argument, that will generally lead to buffer overrun problems:
void sample3()
{
struct x *px = (struct x*)malloc(sizeof(struct x*));
px->field = 5; //<< error: only 4 or 8 bytes were allocated, depending on pointer size
}

Both are different.
Usually malloc returns (void*). So you want to typecast void* to (struct_t*).

Related

the difference between struct with flexible arrays members and struct with pointer members

I'm quite confused about the difference between flexible array members and pointers as struct members. Someone suggested that a struct with a pointer needs two malloc calls. However, consider the following code:
struct Vector {
size_t size;
double *data;
};
int len = 20;
struct Vector* newVector = malloc(sizeof *newVector + len * sizeof*newVector->data);
printf("%p\n",newVector->data);//print 0x0
newVector->data =(double*)((char*)newVector + sizeof*newVector);
// do sth
free(newVector);
One difference I found is that the address stored in the data member of Vector is not defined; the programmer needs to compute the exact address by hand. However, if Vector is defined as:
struct Vector {
size_t size;
double data[];
};
Then the address of data is defined.
I am wondering whether it is safe to malloc a struct with a pointer member like this, and what exactly is the reason programmers call malloc twice when using structs with pointers.
The difference is how the struct is stored. In the first example you over-allocate memory but that doesn't magically mean that the data pointer gets set to point at that memory. Its value after malloc is in fact indeterminate, so you can't reliably print it.
Sure, you can set that pointer to point beyond the part allocated by the struct itself, but that means potentially slower access since you need to go through the pointer each time. Also you allocate the pointer itself as extra space (and potentially extra padding because of it), whereas in a flexible array member sizeof doesn't count the flexible array member. Your first design is overall much more cumbersome than the flexible version, but other than that well-defined.
The reason why people malloc twice when using a struct with pointers could either be that they aren't aware of flexible array members or using C90, or alternatively that the code isn't performance-critical and they just don't care about the overhead caused by fragmented allocation.
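For comparison, a minimal sketch of the flexible array member version with a single allocation. vector_create and vector_at are hypothetical helper names, not something from the question.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Flexible array member version of the struct from the question. */
struct Vector {
    size_t size;
    double data[];   /* storage follows the header inside the same block */
};

/* Hypothetical helper: one malloc for header and elements; caller frees once. */
struct Vector *vector_create(size_t len)
{
    /* Reject lengths whose total byte count would overflow size_t. */
    if (len > (SIZE_MAX - sizeof(struct Vector)) / sizeof(double))
        return NULL;

    struct Vector *v = malloc(sizeof *v + len * sizeof v->data[0]);
    if (v == NULL)
        return NULL;

    v->size = len;
    for (size_t i = 0; i < len; i++)
        v->data[i] = 0.0;
    return v;
}

/* Hypothetical bounds-checked accessor. */
double *vector_at(struct Vector *v, size_t i)
{
    assert(v != NULL && i < v->size);
    return &v->data[i];
}
```

No pointer needs to be patched up after malloc here: data is already part of the block, and a single free(v) releases everything.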
I am wondering whether it is safe to malloc a struct with a pointer member like this, and what exactly is the reason programmers call malloc twice when using structs with pointers.
If you use the pointer method and malloc only once, there is one extra thing you need to take care of in the calculation: alignment.
Let's add one extra field to the structure:
struct Vector {
size_t size;
uint32_t extra;
double *data;
};
Let's assume that we are on a system where each field is 4 bytes, there is no trailing padding in the struct, and the total size is 12 bytes. Let's also assume that double is 8 bytes and requires 8-byte alignment.
Now there is a problem: the expression (char*)newVector + sizeof*newVector no longer gives an address that is divisible by 8. There needs to be 4 bytes of manual padding between the structure and the data. This complicates both the malloc size calculation and the data pointer offset calculation.
So the main reason you see the single-malloc pointer version less often is that it is harder to get right. With a pointer and two mallocs, or with a flexible array member, the compiler takes care of the necessary alignment calculation and padding so you don't have to.
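The manual padding described above could be sketched roughly as follows, assuming a C11 compiler for alignof. round_up and vector_create are hypothetical helper names.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <stdlib.h>

/* Pointer-member layout from this answer, including the extra field. */
struct Vector {
    size_t   size;
    uint32_t extra;
    double  *data;
};

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical helper: header and elements in one block; caller frees once. */
struct Vector *vector_create(size_t len)
{
    /* Offset of the first element: header size padded to double alignment. */
    size_t offset = round_up(sizeof(struct Vector), alignof(double));

    if (len > (SIZE_MAX - offset) / sizeof(double))
        return NULL;                       /* byte count would overflow */

    struct Vector *v = malloc(offset + len * sizeof(double));
    if (v == NULL)
        return NULL;

    v->size  = len;
    v->extra = 0;
    v->data  = (double *)((char *)v + offset);   /* aligned for double */
    assert((uintptr_t)v->data % alignof(double) == 0);
    return v;
}
```

With the 12-byte struct assumed above, round_up(12, 8) gives 16, which is exactly the 4 bytes of padding mentioned. On many real systems the struct size is already a multiple of 8 and the padding comes out as zero.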

Why memory allocation for a structure in C works with any value given to malloc?

I tried to find the proper way to dynamically allocate memory for a structure that looks like this:
typedef struct myThread {
unsigned int threadId;
char threadPriority;
unsigned int timeSlice;
sem_t threadSem;
} myThread;
I remember, but I'm not sure, that, in some school paper, I saw that the proper way to allocate memory for this case is this one:
myThread *node = (myThread *)malloc(sizeof(myThread *));
I tried that and it worked, but I didn't understand why. Sizeof pointer for my architecture is 8 bytes, so by writing the instruction above, I'm allocating 8 bytes of continuous memory, not enough to hold the information needed in my structure. So I tried to allocate 1 byte of memory, like this:
myThread *node = (myThread *)malloc(1);
And it's still working.
I tried to find the answer for this behavior but I didn't succeed. Why is this working? Besides that, I have few more questions:
Which is the right way to dynamically allocate memory for a structure?
Is that cast necessary?
How is the structure stored in memory? I know that (*node).threadId is equivalent to node->threadId and this confuses me a bit because by dereferencing the pointer to the structure, I get the whole structure, and then I have to access a specific field. I was expecting to access fields knowing the address of the structure in this way: *(node) it's the value for the first element, *(node + sizeof(firstElement)) it's the value for the second and so on. I thought that accessing structure fields it's similar to accessing array values.
Thank you
Later Edit: Thank you for your answers, but I realized that I didn't explain myself properly. By saying that it works, I mean that it worked to store values in those specific fields of the structure and use them later. I tested that by filling in the fields and printing them afterwards. I wonder why this works, and why I can fill and work with fields of a structure for which I allocated just one byte of memory.
Both lines below work in the sense that they allocate memory, but the wrong amount.
myThread *node = (myThread *)malloc(sizeof(myThread *)); // wrong size, should be sizeof(myThread)
myThread *node = (myThread *)malloc(1); // wrong size
Why is this working?
When code attempts to save data to that address, the wrong size may or may not become apparent. It is undefined behavior (UB).
C is coding without training wheels. When code has UB like not allocating enough memory and using it, it does not have to fail, it might fail, now or later or next Tuesday.
myThread *node = (myThread *)malloc(1); // too small
node->timeSlice = 42; // undefined behavior
Which is the right way to dynamically allocate memory for a structure? @M.M
The below is easy to code right, review and maintain.
p = malloc(sizeof *p); //no cast, no type involved.
// or
number_of_elements = 1;
p = malloc(sizeof *p * number_of_elements);
// Robust code does error checking looking for out-of-memory
if (p == NULL) {
Handle_error();
}
Is that cast necessary?
No. See "Do I cast the result of malloc?"
How is the structure stored in memory?
Each member followed by potential padding. It is implementation dependent.
unsigned int
maybe some padding
char
maybe some padding
unsigned int
maybe some padding
sem_t
maybe some padding
I wonder why is this working, why I can fill and work with fields of the structure for which I allocated just one byte of memory.
OP is looking for a reason why it works.
Perhaps memory allocation is done in chunks of 64 bytes, or something else exceeding sizeof *p, so allocating 1 had the same effect as allocating sizeof *p.
Perhaps the later memory area, now corrupted by the code's use of the scant allocation, will manifest itself later.
Perhaps the allocator is a malevolent beast toying with OP, only to wipe out the hard drive next April 1. (Nefarious code often takes advantage of UB to infect systems - this is not so far-fetched)
It's all UB. Anything may happen.
Since memory allocation in C is quite error prone, I always define macro functions NEW and NEW_ARRAY as in the example below. This makes memory allocation safer and more succinct.
#include <semaphore.h> /*POSIX*/
#include <stdio.h>
#include <stdlib.h>
#define NEW_ARRAY(ptr, n) \
do { \
(ptr) = malloc((sizeof (ptr)[0]) * (n)); \
if ((ptr) == NULL) { \
fprintf(stderr, "error: Memory exhausted\n"); \
exit(EXIT_FAILURE); \
} \
} while (0) /* do/while makes the macro behave as one statement */
#define NEW(ptr) NEW_ARRAY((ptr), 1)
typedef struct myThread {
unsigned int threadId;
char threadPriority;
unsigned int timeSlice;
sem_t threadSem;
} myThread;
int main(void)
{
myThread *node;
myThread **nodes;
int nodesLen = 100;
NEW(node);
NEW_ARRAY(nodes, nodesLen);
/*...*/
free(nodes);
free(node);
return 0;
}
malloc reserves memory for you to use.
When you attempt to use more memory than you requested, several results are possible, including:
Your program accesses memory it should not, but nothing breaks.
Your program accesses memory it should not, and this damages other data that your program needs, so your program fails.
Your program attempts to access memory that is not mapped in its virtual address space, and a trap is caused.
Optimization by the compiler transforms your program in an unexpected way, and strange errors occur.
Thus, it would not be surprising either that your program appears to work when you fail to allocate enough memory or that your program breaks when you fail to allocate enough memory.
Which is the right way to dynamically allocate memory for a structure?
Good code is myThread *node = malloc(sizeof *node);.
Is that cast necessary?
No, not in C.
How is the structure stored in memory? I know that (*node).threadId is equivalent to node->threadId and this confuses me a bit because by dereferencing the pointer to the structure, I get the whole structure, and then I have to access a specific field. I was expecting to access fields knowing the address of the structure in this way: *(node) it's the value for the first element, *(node + sizeof(firstElement)) it's the value for the second and so on. I thought that accessing structure fields it's similar to accessing array values.
The structure is stored in memory as a sequence of bytes, as all objects in C are. You do not need to do any byte or pointer calculations because the compiler does it for you. When you write node->timeSlice, for example, the compiler takes the pointer node, adds the offset to the member timeSlice, and uses the result to access the memory where the member timeSlice is stored.
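The offset arithmetic the compiler performs can be written out by hand with offsetof. This is only a sketch: sem_t is left out of the struct so that nothing beyond standard C is needed, and time_slice_by_offset is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified copy of the struct; the sem_t member is omitted on purpose. */
typedef struct myThread {
    unsigned int threadId;
    char         threadPriority;
    unsigned int timeSlice;
} myThread;

/* Hand-written equivalent of &node->timeSlice: base address plus member offset. */
unsigned int *time_slice_by_offset(myThread *node)
{
    assert(node != NULL);
    /* The arithmetic is done on char * so the offset is counted in bytes. */
    return (unsigned int *)((char *)node + offsetof(myThread, timeSlice));
}
```

Note that node + sizeof(firstElement) from the question would not do this. Arithmetic on a myThread * moves in steps of sizeof(myThread), and it ignores any padding between members, which is why the cast to char * and offsetof are needed.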
You do not allocate the right size with
myThread *node = (myThread *)malloc(sizeof(myThread *));
The right way is, for instance,
myThread *node = (myThread *)malloc(sizeof(myThread));
and since the cast is useless, this finally becomes
myThread *node = malloc(sizeof(myThread));
or, as said in the remarks to your question,
myThread *node = malloc(sizeof(*node));
The reason is that you are allocating a myThread, not a pointer to one, so the size to allocate is the size of myThread.
If you allocate sizeof(myThread *), that means you want a myThread ** rather than a myThread *.
I know that (*node).threadId is equivalent to node->threadId
Yes, -> dereferences the pointer while . does not.
Given myThread node; you access the field threadId with node.threadId, but given a pointer you need to dereference it one way or the other.
Later Edit: ...
If you do not allocate enough and then access memory outside the allocated block, the behavior is undefined. That means anything can happen, including nothing visibly bad happening immediately.

Dynamic memory with C

struct forcePin {
char _name[512];
};
struct forcePin *_forcePin[500000];
_forcePin[i] = (struct forcePin *) malloc (sizeof (struct forcePin));
May I know what the line shown below is doing?
_forcePin[i] = (struct forcePin *) malloc (sizeof (struct forcePin));
I am not familiar with C. Could you also tell me how to write this line in C++? Thanks.
Dynamic memory allocation is so important in C that you should really learn it properly.
The line in question allocates memory from the heap, namely sizeof(struct forcePin) bytes. The malloc function returns a generic pointer to this allocated memory, and that pointer is assigned to the pointer _forcePin[i].
One thing about that line: in C you should not cast the return value of the malloc function.
In C++ you use the new expression to allocate objects:
_forcePin[i] = new forcePin;
However, in C++ using raw pointers and manual heap allocations is discouraged. I would instead recommend using a std::vector of non-pointer structures:
struct forcePin {
std::string name;
};
std::vector<forcePin> forcePins; // the variable needs a name different from the type
forcePins.push_back(forcePin{});
It is calling the standard library function malloc(), to allocate sizeof (struct forcePin) bytes of dynamic ("heap") memory.
It is then pointlessly casting the returned pointer, and storing it in the variable _forcePin[i].
It's not the optimal way to write this code, in my opinion it should be:
_forcePin[i] = malloc(sizeof *_forcePin[i]);
Note that if the allocation is broken out of a loop (as the i implies), then that code looks like it's allocating 512 * 500,000 bytes, or around 244 MB of memory. Since it's done in half a million allocation calls, there will be considerable overhead, too.
If all the memory really is needed, it would be better to try for a single malloc() call and then split the allocated buffer into the 500,000 parts. Doing it that way would very likely be faster, since malloc() can be expensive and half a million calls is a lot, and it would certainly save memory, since the allocator's bookkeeping overhead would be paid once rather than 500,000 times.
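A rough sketch of that single-allocation approach, assuming a hypothetical helper named alloc_pins that fills the existing pointer table:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct forcePin {
    char _name[512];
};

/*
 * Hypothetical helper: allocates count structs in one zeroed block and points
 * table[i] at the i-th struct. Returns the block, or NULL on failure.
 * The caller releases everything with a single free() on the returned pointer.
 */
struct forcePin *alloc_pins(struct forcePin **table, size_t count)
{
    assert(table != NULL);

    /* calloc is expected to fail rather than wrap around if
       count * sizeof(struct forcePin) does not fit in size_t. */
    struct forcePin *block = calloc(count, sizeof *block);
    if (block == NULL)
        return NULL;

    for (size_t i = 0; i < count; i++)
        table[i] = &block[i];
    return block;
}
```

With the declarations from the question, the call would look like alloc_pins(_forcePin, 500000). The individual _forcePin[i] pointers must then never be passed to free(); only the returned block is.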
_forcePin[i] = (struct forcePin *) malloc (sizeof (struct forcePin));
Allocates a memory block the size of struct forcePin and casts the returned pointer from void * to struct forcePin *. In C++ you would do:
_forcePin[i] = new forcePin();
or better yet, you can have:
std::vector<forcePin> vec;
vec.push_back(forcePin());
This line is your generic C-style memory allocation:
allocate memory (malloc) for exactly the number of bytes needed for the structure forcePin (sizeof(struct forcePin)).
as the return value of malloc is a void pointer (void *) pointing to the newly allocated memory, cast it to a pointer to the forcePin structure ((struct forcePin *))
The C++ version would be something like:
_forcePin[i] = new forcePin;
(in C++, the struct keyword is unnecessary when referring to a struct type)
Don't forget to free the memory when it is no longer needed, with delete _forcePin[i]
Because it looks like you may want to create all 500000 forcePins, you can do it in one step (and release it later with delete[] _forcePins):
forcePin *_forcePins = new forcePin[500000];
The line you ask about allocates a block of memory (see malloc()) and stores its address in the _forcePin array, at index i.
In C++, you would use new, i.e.: _forcePin[i] = new forcePin

Increasing The Size of Memory Allocated to a Struct via Malloc

I just learned that it's possible to increase the size of the memory you'll allocate to a struct when using the malloc function. For example, you can have a struct like this:
struct test{
char a;
int v[1];
char b;
};
Which clearly has space for only 2 chars and 1 int (pointer to an int in reality, but anyway). But you could call malloc in such a way as to make the struct hold 2 chars and as many ints as you wanted (let's say 10):
int main(){
struct test *ptr;
ptr = malloc (sizeof(struct test)+sizeof(int)*9);
ptr->v[9]=50;
printf("%d\n",ptr->v[9]);
return 0;
}
The output here would be "50" printed on the screen, meaning that the array inside the struct was holding up to 10 ints.
My questions for the experienced C programmers out there:
What is happening behind the scenes here? Does the computer allocate 2+4 (2 chars + pointer to int) bytes for the standard "struct test", and then 4*9 more bytes of memory and let the pointer "ptr" put whatever kind of data it wants on those extra bytes?
Does this trick only works when there is an array inside the struct?
If the array is not the last member of the struct, how does the computer manage the memory block allocated?
...Which clearly has space for only 2 chars and 1 int (pointer to an
int in reality, but anyway)...
Already incorrect. Arrays are not pointers. Your struct holds space for 2 chars and 1 int. There's no pointer of any kind there. What you have declared is essentially equivalent to
struct test {
char a;
int v;
char b;
};
There's not much difference between an array of 1 element and an ordinary variable (there's conceptual difference only, i.e. syntactic sugar).
...But you could call malloc in such a way to make it hold 1 char and as
many ints as you wanted (let's say 10)...
Er... If you want it to hold 1 char, why did you declare your struct with 2 chars???
Anyway, in order to implement an array of flexible size as a member of a struct you have to place your array at the very end of the struct.
struct test {
char a;
char b;
int v[1];
};
Then you can allocate memory for your struct with some "extra" memory for the array at the end
struct test *ptr = malloc(offsetof(struct test, v) + sizeof(int) * 10);
(Note how offsetof is used to calculate the proper size).
That way it will work, giving you an array of size 10 and 2 chars in the struct (as declared). It is called "struct hack" and it depends critically on the array being the very last member of the struct.
The C99 version of the C language introduced dedicated support for the "struct hack" in the form of flexible array members. In C99 it can be done as
struct test {
char a;
char b;
int v[];
};
...
struct test *ptr = malloc(sizeof(struct test) + sizeof(int) * 10);
What is happening behind the scenes here? Does the computer allocate
2+4 (2 chars + pointer to int) bytes for the standard "struct test",
and then 4*9 more bytes of memory and let the pointer "ptr" put
whatever kind of data it wants on those extra bytes?
malloc allocates as much memory as you ask it to allocate. It is just a single flat block of raw memory. Nothing else happens "behind the scenes". There's no "pointer to int" of any kind in your struct, so any questions that involve "pointer to int" make no sense at all.
Does this trick only works when there is an array inside the struct?
Well, that's the whole point: to access the extra memory as if it belongs to an array declared as the last member of the struct.
If the array is not the last member of the struct, how does the computer manage the memory block allocated?
It doesn't manage anything. If the array is not the last member of the struct, then trying to work with the extra elements of the array will trash the members of the struct that are declared after the array. This is pretty useless, which is why the "flexible" array has to be the last member.
No, that does not work. You can't change the immutable size of a struct (which is a compile-time allocation, after all) by using malloc ( ) at run time. But you can allocate a memory block, or change its size, such that it holds more than one struct:
int main(){
struct test *ptr;
ptr = malloc (sizeof(struct test) * 9);
}
That's just about all you can do with malloc ( ) in this context.
In addition to what others have told you (summary: arrays are not pointers, pointers are not arrays, read section 6 of the comp.lang.c FAQ), attempting to access array elements past the last element invokes undefined behavior.
Let's look at an example that doesn't involve dynamic allocation:
struct foo {
int arr1[1];
int arr2[1000];
};
struct foo obj;
The language guarantees that obj.arr1 will be allocated starting at offset 0, and that the offset of obj.arr2 will be sizeof (int) or more (the compiler may insert padding between struct members and after the last member, but not before the first one). So we know that there's enough room in obj for multiple int objects immediately following obj.arr1. That means that if you write obj.arr1[5] = 42, and then later access obj.arr1[5], you'll probably get back the value 42 that you stored there (and you'll probably have clobbered obj.arr2[4]).
The C language doesn't require array bounds checking, but it makes the behavior of accessing an array outside its declared bounds undefined. Anything could happen -- including having the code quietly behave just the way you want it to. In fact, C permits array bounds checking; it just doesn't provide a way to handle errors, and most compilers don't implement it.
For an example like this, you're most likely to run into visible problems in the presence of optimization. A compiler (particularly an optimizing compiler) is permitted to assume that your program's behavior is well-defined, and to rearrange the generated code to take advantage of that assumption. If you write
int index = 5;
obj.arr1[index] = 42;
the compiler is permitted to assume that the index operation doesn't go outside the declared bounds of the array. As Henry Spencer wrote, "If you lie to the compiler, it will get its revenge".
Strictly speaking, the struct hack probably involves undefined behavior (which is why C99 added a well-defined version of it), but it's been so widely used that most or all compilers will support it. This is covered in question 2.6 of the comp.lang.c FAQ.

Setting the first two bytes of a block of memory as a pointer or NULL while still accessing the rest of the block

Suppose I have a block of memory as such:
void *block = malloc(sizeof(void *) + size);
How do I set a pointer to the beginning of the block while still being able to access the rest of the reserved space? For this reason, I do not want to simply assign 'block' to another pointer or NULL.
How do I set the first two bytes of the block as NULL or have it point somewhere?
This doesn't make any sense unless you're running on a 16-bit machine.
Based on the way that you're calling malloc(), you're planning to have the first N bytes be a pointer to something else (where N may be 2, 4, or 8 depending on whether you're running on a 16-, 32-, or 64-bit architecture). Is this what you really want to do?
If it is, then you can use a pointer-to-a-pointer approach (recognizing that you can't actually use a void* to change anything, but I don't want to confuse matters by introducing a real type):
void** ptr = block;
However, it would be far more elegant to define your block with a struct (this may contain syntax errors; I haven't run it through a compiler):
typedef struct {
void* ptr; /* replace void* with whatever your pointer type really is */
char data[1]; /* first byte of the extra space */
} MY_STRUCT;
MY_STRUCT* block = malloc(sizeof(MY_STRUCT) + additional);
block->ptr = /* something */
memset(block, 0, 2);
memset can be found in string.h
Setting the first two bytes of the allocated memory block to 0 is easy. There are many ways to do it, for example:
((char*)block)[0] = 0;
((char*)block)[1] = 0;
Now, the way the question is asked show some misunderstanding.
You can put anything in the first two bytes of your allocated block; it doesn't change how you access the following bytes. The only catch is that the C string functions use the convention that a string ends with a 0 byte. So if you do something like strcpy(target, (char*)block), it will stop copying immediately if the first byte is zero. But you can still do strcpy(target, (char*)block+2).
Now, if you want to store a pointer at the beginning of the block (and usually it's not 2 bytes), you can do the same thing as above but using void* instead of char.
((void**)block)[0] = your_pointer;
You access the rest of the block as you like; just get its address and go on. You could do it, for example, with:
void * pointer_to_rest = &((void**)block)[1];
PS: I do not recommend such pointer games. They are very error prone. Your best move would probably be to follow the struct method proposed by @Anon.
void *block = malloc(sizeof(void *) + size); // allocate block
void *ptr = NULL; // some pointer
memcpy(block, &ptr, sizeof(void *)); // copy pointer to start of block
I have a guess at what you're trying to ask, but your wording is so confusing that I could be totally wrong. I am assuming that you want a pointer that points to the "first 2 bytes" of the block you allocated, and then another pointer that points to the rest of the block.
Pointers carry no information about the size of the memory block that they point to, so you can do this:
void *block = malloc(sizeof(void *) + size);
void *first_two_bytes = block;
void *rest_of_block = ((char*)block)+2;
Now, first_two_bytes points to the beginning of the block that you allocated, and you should just treat it as if it pointed to a memory area 2 bytes long.
And rest_of_block points to the portion of the block starting at the third byte (offset 2), and you should treat it as if it pointed to a memory area 2 bytes smaller than what you allocated.
Note, however, that this is still only a single allocation, and you should free only the block pointer. If you free all three pointers, you will corrupt the heap: first_two_bytes is the same address as block, so that is a double free, and rest_of_block was never returned by malloc.
While implementing a map interface using a hash table I faced a similar issue, where each key-value pair (both of which are not statically sized, omitting the option of defining a compile-time struct) had to be stored in block of heap memory that also included a pointer to the next element in a linked list (should the blocks be chained in the event that more than one is hashed to the same index in the hash table array). Leaving space for the pointer at the beginning of the block, I found that the solution mentioned by kriss:
((void**)block)[0] = your_pointer;
where you cast the pointer to the block as an array, and then use the bracket syntax to handle pointer arithmetic and dereferencing, was the cleanest solution for copying a new value into this pointer "field" of the block.
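A minimal sketch of such a block with a "next" pointer stored in front of the payload. The helper names (block_create, block_next, block_set_next, block_payload) are made up for illustration, and memcpy is used here in place of the cast so that the header is read and written without depending on how the block is otherwise accessed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: [ next pointer | size payload bytes ]. */
void *block_create(size_t size, void *next)
{
    if (size > SIZE_MAX - sizeof(void *))
        return NULL;                          /* total size would overflow */
    void *block = malloc(sizeof(void *) + size);
    if (block != NULL)
        memcpy(block, &next, sizeof next);    /* store pointer in the header */
    return block;
}

/* Read the pointer stored at the start of the block. */
void *block_next(const void *block)
{
    void *next;
    assert(block != NULL);
    memcpy(&next, block, sizeof next);
    return next;
}

/* Overwrite the pointer stored at the start of the block. */
void block_set_next(void *block, void *next)
{
    assert(block != NULL);
    memcpy(block, &next, sizeof next);
}

/* Address of the first payload byte, just past the stored pointer. */
void *block_payload(void *block)
{
    assert(block != NULL);
    return (char *)block + sizeof(void *);
}
```

Writing to the payload never touches the header, and the whole block is still released with one free() on the pointer returned by block_create.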
