How to malloc a struct that contains a variable length array? - c

The CoreAudio framework uses a struct that is declared like this:
struct AudioBufferList
{
UInt32 mNumberBuffers;
AudioBuffer mBuffers[1]; // this is a variable length array of mNumberBuffers elements
};
typedef struct AudioBufferList AudioBufferList;
As far as I can tell, this is basically a variable length collection of AudioBuffer structs. What is the 'correct' way to malloc such a struct?
AudioBufferList *list = (AudioBufferList *)malloc(sizeof(AudioBufferList));
Would this work?
I've seen all kinds of examples around the internet, like
calloc(1, offsetof(AudioBufferList, mBuffers) +
(sizeof(AudioBuffer) * numBuffers))
or
malloc(sizeof(AudioBufferList) + sizeof(AudioBuffer) * (numBuffers - 1))

That's not a variable length array; it's a 'struct hack'. The standard (since C99) technique uses a 'flexible array member', which would look like this:
struct AudioBufferList
{
UInt32 mNumberBuffers;
AudioBuffer mBuffers[]; // flexible array member
};
One of the advantages of the FAM is that your questions are 'irrelevant'; the correct way to allocate the space for numBuffer elements in the mBuffers array is:
size_t n_bytes = sizeof(struct AudioBufferList) + numBuffer * sizeof(AudioBuffer);
struct AudioBufferList *bp = malloc(n_bytes);
To answer your question, in practice both the malloc() and the calloc() will allocate at least enough space for the job, but nothing in any C standard guarantees that the code will work. Having said that, compiler writers know that the idiom is used and usually won't go out of their way to break it.
Unless space is incredibly tight, it might be simplest to use the same expression as would be used with a FAM; at worst, you have a little more space allocated than you absolutely need allocated. It will continue to work when you upgrade the code to use a FAM. The expression used in the calloc() version would also work with a FAM member; the expression used in the malloc() version would suddenly be allocating too little space.
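To make the comparison of size expressions concrete, here is a minimal sketch. The types Buffer, HackList and FamList and the names LIST_BYTES and fam_list_create are hypothetical stand-ins, not the CoreAudio declarations. It uses the offsetof form from the calloc() example, which gives the exact size for both the struct hack and the flexible array member.

```c
#include <stddef.h>  /* offsetof, size_t */
#include <stdlib.h>  /* calloc, free */

/* Hypothetical stand-in for AudioBuffer, for illustration only. */
typedef struct { unsigned channels; unsigned byte_size; void *data; } Buffer;

struct HackList { unsigned count; Buffer items[1]; };  /* struct hack */
struct FamList  { unsigned count; Buffer items[];  };  /* flexible array member (C99) */

/* Exact byte count for n buffers; valid for either declaration above. */
#define LIST_BYTES(type, n) (offsetof(type, items) + (size_t)(n) * sizeof(Buffer))

/* Allocate a zeroed list with room for n buffers; returns NULL on failure. */
static struct FamList *fam_list_create(unsigned n)
{
    struct FamList *list = calloc(1, LIST_BYTES(struct FamList, n));
    if (list != NULL)
        list->count = n;
    return list;
}
```

A caller would write something like struct FamList *l = fam_list_create(3); and later free(l);. The sizeof(struct FamList) + n * sizeof(Buffer) form is never smaller than LIST_BYTES, so it is also safe, just possibly a few bytes larger.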

Work out how much memory you need and malloc that amount. For example if you want 9 more AudioBuffers then
list = malloc( sizeof *list + 9 * sizeof list->mBuffers[0] );
This entire construct is non-portable, by the way (the behaviour is undefined if you access beyond the bound of mBuffers, which is 1), but it used to be reasonably common.
Note that you should not cast the value returned by malloc. There's no benefit to be gained by doing so, but there is harm that can be done.
There is a similar standard construct: remove the 1 from the definition of AudioBufferList (and malloc one more unit). This is called a "flexible array member".

The preferred way to do this is to use the Apple-provided CAAudioBufferList::Create function from CAAudioBufferList.cpp in the Core Audio Utility Classes. You can download the sources here:
https://developer.apple.com/library/mac/samplecode/CoreAudioUtilityClasses/Introduction/Intro.html
Here is their implementation:
AudioBufferList* CAAudioBufferList::Create(UInt32 inNumberBuffers)
{
UInt32 theSize = CalculateByteSize(inNumberBuffers);
AudioBufferList* theAnswer = static_cast<AudioBufferList*>(calloc(1, theSize));
if(theAnswer != NULL)
{
theAnswer->mNumberBuffers = inNumberBuffers;
}
return theAnswer;
}
void CAAudioBufferList::Destroy(AudioBufferList* inBufferList)
{
free(inBufferList);
}
UInt32 CAAudioBufferList::CalculateByteSize(UInt32 inNumberBuffers)
{
UInt32 theSize = SizeOf32(AudioBufferList) - SizeOf32(AudioBuffer);
theSize += inNumberBuffers * SizeOf32(AudioBuffer);
return theSize;
}

Related

Multiple structures in a single malloc invoking undefined behaviour?

The page "Use the correct syntax when declaring a flexible array member" says that when malloc is used to allocate a header plus flexible data, with data[1] hacked into the struct,
This example has undefined behavior when accessing any element other
than the first element of the data array. (See the C Standard, 6.5.6.)
Consequently, the compiler can generate code that does not return the
expected value when accessing the second element of data.
I looked up the C Standard 6.5.6, and could not see how this would produce undefined behaviour. I've used a pattern that I'm comfortable with, where the header is implicitly followed by data, using the same sort of malloc,
#include <stdlib.h> /* EXIT malloc free */
#include <stdio.h> /* printf */
#include <string.h> /* strlen memcpy */
struct Array {
size_t length;
char *array;
}; /* +(length + 1) char */
static struct Array *Array(const char *const str) {
struct Array *a;
size_t length;
length = strlen(str);
if(!(a = malloc(sizeof *a + length + 1))) return 0;
a->length = length;
a->array = (char *)(a + 1); /* UB? */
memcpy(a->array, str, length + 1);
return a;
}
/* Take a char off the end just so that it's useful. */
static void Array_to_string(const struct Array *const a, char (*const s)[12]) {
const int n = a->length ? a->length > 9 ? 9 : (int)a->length - 1 : 0;
sprintf(*s, "<%.*s>", n, a->array);
}
int main(void) {
struct Array *a = 0, *b = 0;
int is_done = 0;
do { /* Try. */
char s[12], t[12];
if(!(a = Array("Foo!")) || !(b = Array("To be or not to be."))) break;
Array_to_string(a, &s);
Array_to_string(b, &t);
printf("%s %s\n", s, t);
is_done = 1;
} while(0); if(!is_done) {
perror(":(");
} {
free(a);
free(b);
}
return is_done ? EXIT_SUCCESS : EXIT_FAILURE;
}
Prints,
<Foo> <To be or >
The compliant solution uses C99 flexible array members. The page also says,
Failing to use the correct syntax when declaring a flexible array
member can result in undefined behavior, although the incorrect syntax
will work on most implementations.
Technically, does this C90 code produce undefined behaviour, too? And if not, what is the difference? (Or the Carnegie Mellon Wiki is incorrect?) What is the factor on the implementations this will not work on?
This should be well defined:
a->array = (char *)(a + 1);
Because you create a pointer to one element past the end of an array of size 1 but do not dereference it. And because a->array now points to bytes that do not yet have an effective type, you can use them safely.
This only works however because you're using the bytes that follow as an array of char. If you instead tried to create an array of some other type whose size is greater than 1, you could have alignment issues.
For example, if you compiled a program for ARM with 32 bit pointers and you had this:
struct Array {
int size;
uint64_t *a;
};
...
struct Array *a = malloc(sizeof *a + (length * sizeof(uint64_t)));
a->size = length;
a->a= (uint64_t *)(a + 1); // misaligned pointer
a->a[0] = 0x1111222233334444ULL; // misaligned write
Your program would crash due to a misaligned write. So in general you shouldn't depend on this. Best to stick with a flexible array member which the standard guarantees will work.
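If you do want to keep the pointer-plus-trailing-storage layout, one way to sidestep the alignment problem is to round the header size up to the alignment of the element type before placing the array. This is only a sketch under assumptions: the names Header, round_up and header_create are hypothetical, and it needs C11 for alignof.

```c
#include <stdalign.h>  /* alignof (C11) */
#include <stddef.h>    /* size_t */
#include <stdint.h>    /* uint64_t */
#include <stdlib.h>    /* malloc, free */

/* Hypothetical header type mirroring the example above. */
struct Header {
    int size;
    uint64_t *data;
};

/* Round n up to the next multiple of align (align must be a power of two). */
static size_t round_up(size_t n, size_t align)
{
    return (n + align - 1) & ~(align - 1);
}

/* Allocate the header and count elements in one block; NULL on failure. */
static struct Header *header_create(int count)
{
    if (count < 0)
        return NULL;
    /* Offset of the array: first suitably aligned byte after the header. */
    size_t offset = round_up(sizeof(struct Header), alignof(uint64_t));
    struct Header *h = malloc(offset + (size_t)count * sizeof(uint64_t));
    if (h == NULL)
        return NULL;
    h->size = count;
    h->data = (uint64_t *)((unsigned char *)h + offset);
    return h;
}
```

Because malloc returns storage aligned for any fundamental type, adding an offset that is a multiple of alignof(uint64_t) keeps h->data aligned. A flexible array member gets the same result without the arithmetic.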
As an adjunct to @dbush's good answer, a way to get around alignment woes is to use a union. This ensures &p[1] is properly aligned for uint64_t (see footnote 1). sizeof *p includes any needed padding, unlike sizeof *a.
union {
struct Array header;
uint64_t dummy;
} *p;
p = malloc(sizeof *p + length * sizeof *p->header.array);
struct Array *a = (struct Array *)&p[0]; // or = &(p->header);
a->length = length;
a->array = (uint64_t*) &p[1]; // or &p[1].dummy;
Or go with C99 and flexible array member.
1 As well as struct Array
Before the publication of C89, there were some implementations that would attempt to identify and trap upon out-of-bounds array accesses. Given something like:
struct foo {int a[4],b[4];} *p;
such implementations would squawk at an effort to access p->a[i] if i wasn't in the range 0 to 3. For programs that don't need to index the address of array-type lvalue p->a to access anything outside that array, being able to trap on such out-of-bounds accesses would be useful.
The authors of C89 were also almost certainly aware that it was common for programs to use the address of dummy-sized array at the end of a structure as a means of accessing storage beyond the structure. Using such techniques made it possible to do things that couldn't be done nearly as nicely otherwise, and part of the Spirit of C, according to the authors of the Standard, is "Don't prevent the programmer from doing what needs to be done".
Consequently, the authors of the Standard treated such accesses as something which implementations could support or not, at their leisure, presumably based upon what would be most useful for their customers. While it would often be helpful for implementations which would normally bounds-check accesses to structures in an array, to provide an option to omit such checks in cases where the last item of an indirectly-accessed structure is an array with one element (or, if they extend the language to waive a compile-time constraint, zero elements), people writing such implementations would presumably be capable of recognizing such things without the authors of the Standard having to tell them. The notion that "Undefined Behavior" was intended as some form of prohibition doesn't seem to have really taken hold until after the publication of C89's successor standard.
With regard to your example, having a pointer within a struct point to later storage in the same allocation should work, but with a couple of caveats:
If the allocation is passed to realloc, the pointer within it will become invalid.
The only real advantage of using a pointer versus a flexible array member is that it allows for the possibility of having it point somewhere else. That may be good if the only kind of "something else" will always be a constant object of static duration that never has to be freed, or perhaps if it is some other kind of object that won't have to be freed, but may be problematic if it could hold the only reference to something stored in a separate allocation.
Flexible array members have been available as an extension in some compilers before C89 was written, and were officially added in C99. Any decent compiler should support them.
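The realloc caveat can be handled by re-deriving the interior pointer every time the block may have moved. Here is a rough sketch. The names Text, text_create and text_resize are hypothetical, and overflow checks on the size arithmetic are omitted for brevity.

```c
#include <stdlib.h>  /* malloc, realloc, free */
#include <string.h>  /* strlen, memcpy */

/* Hypothetical header whose pointer refers to storage in the same allocation. */
struct Text {
    size_t length;
    char *chars;
};

/* Allocate header and string together; NULL on failure. */
static struct Text *text_create(const char *str)
{
    size_t length = strlen(str);
    struct Text *t = malloc(sizeof *t + length + 1);
    if (t == NULL)
        return NULL;
    t->length = length;
    t->chars = (char *)(t + 1);
    memcpy(t->chars, str, length + 1);
    return t;
}

/* Resize to hold new_length chars plus a terminator.
   Returns the (possibly moved) block, or NULL with the old block left intact. */
static struct Text *text_resize(struct Text *t, size_t new_length)
{
    struct Text *moved = realloc(t, sizeof *moved + new_length + 1);
    if (moved == NULL)
        return NULL;
    moved->chars = (char *)(moved + 1);  /* the old interior pointer is stale */
    if (new_length < moved->length)
        moved->length = new_length;      /* truncated when shrinking */
    moved->chars[moved->length] = '\0';
    return moved;
}
```

With a flexible array member there is no stored pointer to fix up, which is one less thing to get wrong.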
You can define struct Array as:
struct Array
{
size_t length;
char array[1];
}; /* +(length + 1) char */
then malloc( sizeof *a + length ). The "+1" for the terminating null character is already provided by the array[1] member. Fill the structure with:
a->length = length;
strcpy( a->array, str );

How to allocate memory for an array and a struct in one malloc call without breaking strict aliasing?

When allocating memory for a variable sized array, I often do something like this:
struct array {
long length;
int *mem;
};
struct array *alloc_array( long length)
{
struct array *arr = malloc( sizeof(struct array) + sizeof(int)*length);
arr->length = length;
arr->mem = (int *)(arr + 1); /* dubious pointer manipulation */
return arr;
}
I then use the array like this:
int main()
{
struct array *arr = alloc_array( 10);
for( int i = 0; i < 10; i++)
arr->mem[i] = i;
/* do something more meaningful */
free( arr);
return 0;
}
This works and compiles without warnings. Recently however, I read about strict aliasing. To my understanding, the code above is legal with regard to strict aliasing, because the memory being accessed through the int * is not the memory being accessed through the struct array *. Does the code in fact break strict aliasing rules? If so, how can it be modified not to break them?
I am aware that I could allocate the struct and array separately, but then I would need to free them separately too, presumably in some sort of free_array function. That would mean that I have to know the type of the memory I am freeing when I free it, which would complicate code. It would also likely be slower. That is not what I am looking for.
The proper way to declare a flexible array member in a struct is as follows:
struct array {
long length;
int mem[];
};
Then you can allocate the space as before without having to assign anything to mem:
struct array *alloc_array( long length)
{
struct array *arr = malloc( sizeof(struct array) + sizeof(int)*length);
arr->length = length;
return arr;
}
Modern C officially supports flexible array members. So you can define your structure as follows:
struct array {
long length;
int mem[];
};
And allocate it as you do now, without the added hassle of dubious pointer manipulation. It will work out of the box, all the access will be properly aligned and you won't have to worry about dark corners of the language. Though, naturally, it's only viable if you have a single such member you need to allocate.
As for what you have now, since allocated storage doesn't have a declared type (it's a blank slate), you aren't breaking strict aliasing, since you haven't given that memory an effective type. The only issue is with possible mess-up of alignment. Though that's unlikely with the types in your structure.
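For completeness, a slightly more defensive version of alloc_array with the flexible array member might look like the following. The negative-length, overflow and NULL checks are my additions, not part of the answers above.

```c
#include <stdint.h>  /* SIZE_MAX */
#include <stdlib.h>  /* malloc, free */

struct array {
    long length;
    int mem[];  /* flexible array member (C99) */
};

/* Allocate the header and length ints in one block; NULL on failure. */
static struct array *alloc_array(long length)
{
    if (length < 0)
        return NULL;
    /* Refuse requests whose byte count would overflow size_t. */
    if ((unsigned long)length > (SIZE_MAX - sizeof(struct array)) / sizeof(int))
        return NULL;
    struct array *arr = malloc(sizeof *arr + (size_t)length * sizeof arr->mem[0]);
    if (arr == NULL)
        return NULL;
    arr->length = length;
    return arr;
}
```

Usage is the same as in the question, except that the elements are reached through arr->mem[i] directly and a single free(arr) releases everything.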
I believe the code as written does violate strict aliasing rules when the standard is read in the strictest sense.
You are accessing an object of type int through a pointer to the unrelated type struct array. I believe that an easy way out would be to take the starting address of the struct, convert it to char*, and perform pointer arithmetic on it. Example:
void *alloc = malloc(...);
struct array *arr = alloc;
int *p_int = (int *)((char *)alloc + sizeof(struct array));

malloc'ing for field inside struct

I have, roughly speaking, a function prototype like this:
init_superstruct(const char *name, Superstruct **super, int num_substructs) {...
where superstruct looks like
typedef struct superstruct {
char *name;
Substruct **substructs;
int num_substructs;
} Superstruct;
The function is supposed to
1) allocate memory for (and initialize) super, by...
2) ...assigning the name field of super enough memory to hold the name argument, and...
3) ...assigning the substructs field enough memory to hold an array of pointers to Substructs (of size num_substructs).
My question: will the following code accomplish these goals?
*super = malloc(sizeof(*super));
*super->name = malloc(sizeof(strlen(name) + 1)));
*super->substructs = calloc(num_substructs, sizeof(Substruct));
This is literally my first foray into dynamic memory allocation. Any advice you have would be helpful for me!
First:
*super = malloc(sizeof(*super));
You want sizeof(**super). *super is a pointer, with type Superstruct *, so this won't allocate enough memory.
Really, you should probably allocate the structure normally, then assign it to the pointer separately. This will make your code much easier to write:
Superstruct *r = malloc(sizeof *r);
r->name = …
*super = r;
Second:
*super->name = malloc(sizeof(strlen(name) + 1)));
This is wrong. sizeof(strlen(name) + 1) is sizeof(size_t), the size of the expression's type, which is not what you want. Because sizeof does not evaluate its operand here, strlen() won't even be called! Remove the sizeof() from this expression to make it correct.
Third: to allocate a single array of Substruct objects, define the structure member as Substruct *substructs, and allocate it using the exact code you've got right now. You don't need a double pointer unless you want an array of pointers to the structures, which is more complicated than you need.
If you really think you do need an array of pointers here (you probably don't), you need to allocate the array using sizeof(Substruct *) as the size argument to calloc(), not sizeof(Substruct).
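Putting the three corrections together, a sketch of the whole function could look like this. It keeps the array of pointers from the question. The body of Substruct, the int return value and the error handling are assumptions on my part, since the question does not show them.

```c
#include <stdlib.h>  /* malloc, calloc, free */
#include <string.h>  /* strlen, strcpy */

/* Hypothetical placeholder; the real Substruct is not shown in the question. */
typedef struct substruct { int placeholder; } Substruct;

typedef struct superstruct {
    char *name;
    Substruct **substructs;
    int num_substructs;
} Superstruct;

/* Returns 0 on success; on failure returns -1 and sets *super to NULL. */
static int init_superstruct(const char *name, Superstruct **super, int num_substructs)
{
    *super = NULL;
    if (num_substructs < 0)
        return -1;
    Superstruct *r = malloc(sizeof *r);  /* size of the struct, not of a pointer */
    if (r == NULL)
        return -1;
    r->name = malloc(strlen(name) + 1);  /* no sizeof around strlen */
    r->substructs = calloc((size_t)num_substructs, sizeof *r->substructs);
    if (r->name == NULL || (r->substructs == NULL && num_substructs > 0)) {
        free(r->name);
        free(r->substructs);
        free(r);
        return -1;
    }
    strcpy(r->name, name);
    r->num_substructs = num_substructs;
    *super = r;
    return 0;
}
```

Working through the local r also avoids the precedence trap in the original: *super->name parses as *(super->name), so the direct form would have to be written (*super)->name.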

C - Dynamically sized array of struct pointers without using realloc?

I need help with a school assignment, specifically with resizing the amount of memory allocated for a pointer WITHOUT realloc.
I have the following declarations in my program.
struct GraphicElement
{
enum{ SIZE = 256 };
unsigned int numLines;
Line* pLines;
char name[SIZE];
};
typedef struct
{
unsigned int numGraphicElements;
GraphicElement* pElements;
}VectorGraphic;
VectorGraphic Image;
As the program runs I'll be adding more GraphicElements to pElements.
For example, after 5 iterations the memory for pElements should be something like this:
[GraphicElement 0][GraphicElement 1] ... [GraphicElement 4]
For the function AddGraphicElement(VectorGraphic* vg) I have this code (with some lines removed for easier reading):
vg->pElements = (GraphicElement*)realloc(vg->pElements, sizeof(GraphicElement)*(vg->numGraphicElements+1));
//Then I assign inputs from user into the members of the struct at vg->pElements[vg->numGraphicElements]
vg->numGraphicElements++;
This works, BUT according to the instructions given by my professor, I'm only allowed to use malloc and free- no realloc. Sadly the only way I've made this work is with realloc.
Can anyone point me in the right direction to implement this using only malloc?
Thanks!
If you are not allowed to use realloc, but malloc and free are allowed, you can replace the call with the following, less efficient, sequence:
void *newData = malloc(newSize);
memcpy(newData, oldData, oldSize);
free(oldData);
Internally, realloc does the same thing, but it does so more efficiently. Unlike a user program, realloc knows the actual size of the dynamic memory chunk, so it checks if newSize <= actualSize to avoid reallocation. When actualSize is insufficient, realloc does the same thing as above. realloc has additional logic to deal with situations where the size needs to shrink, but in your situation this does not apply.
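Wrapped up as a helper, the malloc/memcpy/free sequence might look like the sketch below. The name grow_block and its signature are made up for illustration, and unlike the three-line version it checks for allocation failure before touching the old block.

```c
#include <stdlib.h>  /* malloc, free */
#include <string.h>  /* memcpy */

/* Replacement for realloc using only malloc and free.
   Copies min(old_size, new_size) bytes into a new block and frees the old one.
   On failure returns NULL and leaves old_data untouched. */
static void *grow_block(void *old_data, size_t old_size, size_t new_size)
{
    void *new_data = malloc(new_size);
    if (new_data == NULL)
        return NULL;
    if (old_data != NULL && old_size > 0)
        memcpy(new_data, old_data, old_size < new_size ? old_size : new_size);
    free(old_data);
    return new_data;
}
```

In AddGraphicElement this would be called with sizeof(GraphicElement) * numGraphicElements as the old size and one element more as the new size, assigning the result to vg->pElements only if it is not NULL.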

Allocate Pointer and pointee at once

If I want to reduce the number of malloc() calls (especially if the data is small and allocated often), I would like to allocate the pointer and pointee at once.
If you assume something like the following:
struct entry {
size_t buf_len;
char *buf;
int something;
};
I would like to allocate memory in the following way (don't care about error checking here):
size_t buf_len = 4; // size of the buffer
struct entry *e = NULL;
e = malloc( sizeof(*e) + buf_len ); // allocate struct and buffer
e->buf_len = buf_len; // set buffer size
e->buf = e + 1; // the buffer lies behind the struct
This could even be extended so that a whole array is allocated at once.
How would you assess such a technique with regard to:
Portability
Maintainability / Extendability
Performance
Readability
Is this reasonable? If it is ok to use, are there any ideas on how to design a possible interface for that?
You could use a flexible array member instead of a pointer:
struct entry {
size_t buf_len;
int something;
char buf[];
};
// ...
struct entry *e = malloc(sizeof *e + buf_len);
e->buf_len = buf_len;
Portability and performance are fine. Readability: not perfect but good enough.
Extendability: you can't use this for more than one member at a time, you'd have to fall back to your explicit pointer version. Also, the explicit pointer version means that you have to muck around to ensure correct alignment if you use it with a type that doesn't have an alignment of 1.
If you are seriously thinking about this I'd consider revisiting your entire data structure's design to see if there is another way of doing it. (Maybe this way is actually the best way, but have a good think about it first).
As to portability, I am unaware of any issues, as long as the sizes are found via suitable calls to sizeof(), as in your code.
Regarding maintainability, extendability and readability, you should certainly wrap allocation and de-allocation in a well-commented function. Calls to...
entry *allocate_entry_with_buffer();
void deallocate_entry_with_buffer(entry **entry_with_buffer);
...do not need to know implementation details of how the memory actually gets handled. People use stranger things like custom allocators and memory pools quite frequently.
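As a rough idea of what such a pair of wrappers could look like, here is a sketch built on the flexible array member version of struct entry. The buf_len parameter, the zero-filling and the overflow check are assumptions of mine; the prototypes above do not specify them.

```c
#include <stdint.h>  /* SIZE_MAX */
#include <stdlib.h>  /* calloc, free */

struct entry {
    size_t buf_len;
    int something;
    char buf[];  /* flexible array member (C99) */
};

/* Allocate an entry with a zero-filled buffer of buf_len bytes; NULL on failure. */
static struct entry *allocate_entry_with_buffer(size_t buf_len)
{
    if (buf_len > SIZE_MAX - sizeof(struct entry))
        return NULL;  /* total size would overflow */
    struct entry *e = calloc(1, sizeof *e + buf_len);
    if (e != NULL)
        e->buf_len = buf_len;
    return e;
}

/* Free the entry and clear the caller's pointer so it cannot be reused. */
static void deallocate_entry_with_buffer(struct entry **entry_with_buffer)
{
    if (entry_with_buffer == NULL)
        return;
    free(*entry_with_buffer);
    *entry_with_buffer = NULL;
}
```

Callers only see the two functions, so the layout behind them can later change (separate allocations, a pool, and so on) without touching the call sites.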
As for speed, this is certainly faster than making lots of small allocations. I used to allocate whole 2D matrices with a similar strategy...
It should work, but in fact you are using a pointer for a useless indirection. Windows API (for example) uses another method for variable size structs : the variable size buffer is last in struct and is declared to be char buf[1].
Your struct would become :
struct entry {
size_t buf_len;
int something;
char buf[1];
};
The allocation is (still no error checking) :
size_t buf_len = 4; // size of the buffer
struct entry *e;
e = malloc( sizeof(*e) + buf_len - 1); // struct already has room for 1 char
e->buf_len = buf_len; // set buffer size
That's all. e->buf can now be used as a char array of size buf_len.
This approach ensures that even if the variable part were not a character array but an array of int, long, or anything else, the alignment would be correct, because the last member is an array of the proper type with size 1.
For starters, the line:
e->buf = e + sizeof(*e); // the buffer lies behind the struct
Should be:
e->buf = e + 1; // the buffer lies behind the struct
This is because pointer arithmetic is scaled by the size of the pointed-to type: e + 1 is the address just past the end of the structure, whereas e + sizeof(*e) would advance by sizeof(*e) whole structures, far past the end of the allocation.
And, yes, it's reasonable. However, I prefer this approach:
struct entry {
size_t buf_len;
int something;
char buf[1];
};
This way, you don't mess with the pointers. Just append as many bytes as needed, and they will grow the size of your buf array.
Note: I wrote a text editor using an approach similar to this but used a Microsoft c++ extension that allowed me to declare the last member as char buf[]. So it was an empty array that was exactly as long as the number of extra bytes I allocated.
seems fine to me - put comments in though
Or you could do this - which is quite common
struct entry {
size_t buf_len;
int something;
char buf;
};
ie make the struct itself variable length. and do
size_t buf_len = 4; // size of the buffer
struct entry *e = NULL;
// check that it packs right
e = malloc(sizeof(size_t) + sizeof(int) + buf_len ); // allocate struct and buffer
e->buf_len = buf_len; // set buffer size
...... later
printf("%s", &e->buf);

Resources