Undefined type of arrays in C

Is it possible to make arrays of undefined type in C, similar to Object arrays? If so, how? Something like this:
undefinedtype ArrayName[200];

Not really. In C, when you create an array, the system allocates memory for your array. It needs to know how much memory to allocate. Objects of different types require different amounts of memory, so if you don't know what kind of objects will be in your array, you won't know how much memory to allocate.
However, you can make an array of pointers by using void* instead of undefinedtype. Then you can make those pointers point to any kind of object you want later.

Either use just a vanilla void * array, or create a base Object struct of your own to contain metadata and a reference.
Using a void * array:
void * objArray[200];
int x;
char * s = "hello";
float f;
objArray[0] = &x;
objArray[1] = s;
objArray[2] = &f;
This works, and is easy, but requires great care to avoid getting the actual type of the "objects" mixed up.
Using a wrapper with useful meta-data:
// enum to list the types of objects you expect and know how to handle
typedef enum {
    INT_TYPE, FLOAT_TYPE, MY_TYPE /* etc, etc */
} ObjectType;
// structure containing the pointer & associated metadata (e.g. type and size)
typedef struct {
    ObjectType object_type;
    size_t object_size;
    void * object_ref;
} Object;
// Array of objects.
Object objArray[200];
// Store an object and some meta-data.
objArray[0].object_type = INT_TYPE;
objArray[0].object_size = sizeof(int);
objArray[0].object_ref = malloc(sizeof(int));
// Dereference the cast pointer to store the value (check the malloc result in real code).
*((int*)objArray[0].object_ref) = 100;
You'll see the latter construct in libraries that deal with JSON, XML, and various other non-native "types" / objects, as well as in the internal implementations of languages with richer type systems.
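To make the wrapper idea concrete, here is a minimal sketch of how the type tag might be used when reading values back. The helper names object_set_int, object_set_float and object_as_double are hypothetical and not part of the answer above; only the Object layout is taken from it (with MY_TYPE left out for brevity).

```c
#include <stdlib.h>

typedef enum { INT_TYPE, FLOAT_TYPE } ObjectType;

typedef struct {
    ObjectType object_type;
    size_t     object_size;
    void      *object_ref;
} Object;

/* Hypothetical helper: store a heap copy of an int in *slot.
   Returns 0 on success, -1 if the allocation fails. */
static int object_set_int(Object *slot, int value)
{
    int *p = malloc(sizeof *p);
    if (p == NULL) {
        return -1;
    }
    *p = value;
    slot->object_type = INT_TYPE;
    slot->object_size = sizeof *p;
    slot->object_ref  = p;
    return 0;
}

/* Hypothetical helper: same as above, for a float. */
static int object_set_float(Object *slot, float value)
{
    float *p = malloc(sizeof *p);
    if (p == NULL) {
        return -1;
    }
    *p = value;
    slot->object_type = FLOAT_TYPE;
    slot->object_size = sizeof *p;
    slot->object_ref  = p;
    return 0;
}

/* Hypothetical helper: read any supported object as a double.
   The tag decides which cast is applied to the void pointer. */
static double object_as_double(const Object *obj)
{
    switch (obj->object_type) {
    case INT_TYPE:
        return (double)*(const int *)obj->object_ref;
    case FLOAT_TYPE:
        return (double)*(const float *)obj->object_ref;
    }
    return 0.0; /* unknown tag */
}
```

Because every read goes through the tag, the cast applied to object_ref always matches what was stored, which is the main thing the bare void * array cannot guarantee. The caller still owns each object_ref and has to free it.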

No, you can't make such an array. You can declare a fixed-size array of pointers to void, and then dynamically allocate memory for data of whatever type you need.

Related

Is it valid to call malloc with a pointer of the type of the first member?

I am building a hash library. This library works with different structs, and all those structs have an unsigned type as their first member. An example:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct data {
    unsigned hash;
    void (*func)(struct data *);
};
struct another_data {unsigned hash; int value;};
static void *hash_insert(const char *text, size_t size)
{
    unsigned *hash;
    hash = malloc(size);
    // *hash = hash(text);
    *hash = (unsigned)strlen(text);
    return hash;
}
static void func(struct data *data)
{
    printf("%u\n", data->hash);
}
int main(void)
{
    struct data *data;
    data = hash_insert("Some text", sizeof *data);
    data->func = func;
    data->func(data);
    free(data);
    return 0;
}
Since the first member of the struct and the struct itself have the same alignment requirements, is it valid to call malloc with a pointer of the type of the first member in order to reserve space for the entire struct?
unsigned *hash = malloc(size); /* where size is the size of the struct */
EDIT:
In this related question provided by @MohitJain:
struct Msg
{
unsigned int a;
unsigned int b;
};
...
uint32_t* buff = malloc(sizeof(Msg));
// Alias that buffer through message
Msg* msg = (Msg*)(buff);
The strict aliasing rule makes this setup illegal.
But in my case I am returning a void * from the function. Can I use this (returned) pointer inside main without alignment issues? Is this assumption correct?
The rules about effective type and pointer aliasing say that it is fine to convert pointers between a struct (or any other "aggregate") and a pointer of the same type as the first appearing member in the struct. And another rule says that structs are not allowed to have padding in the very beginning.
So it is fine as far as the C standard goes... which doesn't really say much of the quality of the program.
The code does not make much sense. You clearly want to use the whole struct, so why not return a pointer of the struct type? Don't complicate things for the sake of it. You should always avoid using void* when there is no need for it, to increase type safety.
Overall, all these things would sort themselves out in a multi-file project with proper program design. If you had a separate file with the hash type and all functions using that type, there would be no doubt of how to write the program.
malloc is guaranteed to return memory aligned for any type.
Therefore it will work regardless of the alignment requirement of the different subtypes.
Since structure members are allocated in the order they are declared, using
unsigned *hash = malloc(size); /* where size is the size of the struct */
can work if your purpose is just to use the hash data.
It fails if you want to apply pointer arithmetic to it and reach the other members that way, so in this case using
hash++
and dereferencing the result is undefined behavior.
Yes, it is perfectly fine for a pointer to the first member's type to point to the memory returned by malloc (the memory is aligned for any data type). What you do with this memory later may cause issues if you are not careful.
[Extracts from Joachim Pileborg's answer to "Maintaining an array of pointers in C that points to two related types", with minor changes]
Having a structure inside other structures or a common data type in other structures is a common way of emulating inheritance in C. This common member should contain the minimal set of data common to all structures in the "inheritance" hierarchy, and it must always be the first member in the inheriting structures.
Possible issue with this scheme is:
This "inheritance" scheme will work on all modern PC-like systems and their compilers, and has done so for a long time, but there's no guarantee that it will work on all systems and all compilers (if you're planning on porting the code to some rare system with weird hardware and compiler you might want to look out, but the number of cases where the "inheritance" scheme will not work is very small, and most people will never come in contact with such a system in their entire lifetime). But once you point to the same memory with struct data *, you may fall victim to the strict aliasing rule, so you need to be careful there. Further reading: What is the strict aliasing rule?
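As an illustration of the "common first member" approach described above, here is a minimal sketch. The struct header type and the size check are assumptions added for the example; the placeholder hash (the string length) is the one used in the question.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed common part shared by every struct the library handles. */
struct header {
    unsigned hash;
};

/* An "inheriting" struct: the common part must be the first member. */
struct data {
    struct header head;
    int value;
};

/* Allocates size bytes and fills in the common part.
   size is the size of the full struct, which must begin with
   struct header. Returns NULL on failure. */
static void *hash_insert(const char *text, size_t size)
{
    struct header *head;

    if (size < sizeof *head) {
        return NULL; /* too small to hold even the common part */
    }
    head = malloc(size);
    if (head == NULL) {
        return NULL;
    }
    head->hash = (unsigned)strlen(text); /* placeholder hash */
    return head;
}
```

A caller would write struct data *d = hash_insert("Some text", sizeof *d); and then use d->head.hash and d->value. Going through a named header struct, rather than a bare unsigned *, keeps the conversion between the two pointer types explicit.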
unsigned *hash = malloc(size);
This allocates size bytes, which is room for size / sizeof(unsigned) unsigned integers. hash here is a pointer to unsigned.
Since the first member of the struct and the struct itself haves the same alignment requirements, is it valid to call malloc with a pointer of the type of the first member in order to reserve space for the entire struct?
The point is, that hash here is a separate variable which has nothing to do with the hash inside the structure. (It is in a separate namespace, if you want to look it up)
You can then cast hash to a pointer to the struct.
struct data *data;
unsigned *hash = malloc(sizeof(struct data));
data = (struct data *) hash;
But what is the point? You can just as well remove the unnecessary unsigned hash pointer and go with the traditional form:
struct data *data;
data = malloc(sizeof(struct data));

How to include a variable-sized array as struct member in C?

I must say, I have quite a conundrum in a seemingly elementary problem. I have a structure in which I would like to store an array as a field. I'd like to reuse this structure in different contexts, and sometimes I need a bigger array, sometimes a smaller one. C prohibits the use of a variable-sized buffer as a struct member. So the natural approach would be declaring a pointer to this array as a struct member:
struct my {
    struct other* array;
};
The problem with this approach however, is that I have to obey the rules of MISRA-C, which prohibits dynamic memory allocation. So then if I'd like to allocate memory and initialize the array, I'm forced to do:
var.array = malloc(n * sizeof(...));
which is forbidden by MISRA standards. How else can I do this?
Since you are following MISRA-C, I would guess that the software is somehow mission-critical, in which case all memory allocation must be deterministic. Heap allocation is banned by every safety standard out there, not just by MISRA-C but by the more general safety standards as well (IEC 61508, ISO 26262, DO-178 and so on).
In such systems, you must always design for the worst-case scenario, which will consume the most memory. You need to allocate exactly that much space, no more, no less. Everything else does not make sense in such a system.
Given those pre-requisites, you must allocate a static buffer of size LARGE_ENOUGH_FOR_WORST_CASE. Once you have realized this, you simply need to find a way to keep track of what kind of data you have stored in this buffer, by using an enum and maybe a "size used" counter.
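A minimal sketch of that idea, assuming a worst case of 16 elements. MAX_ITEMS, the ItemKind enum, the contents of struct other and the my_push helper are all hypothetical names chosen for the example, not something the question specifies.

```c
#include <stddef.h>

#define MAX_ITEMS 16u /* assumed worst-case element count */

/* Hypothetical tag describing what the buffer currently holds. */
typedef enum { KIND_NONE, KIND_SENSOR, KIND_ACTUATOR } ItemKind;

/* Placeholder element type; the real fields are not given. */
struct other {
    int value;
};

struct my {
    ItemKind     kind;             /* what kind of data is stored */
    size_t       used;             /* "size used" counter */
    struct other array[MAX_ITEMS]; /* fixed, worst-case storage */
};

/* Appends one element. Returns 0 on success, -1 when the
   fixed-size buffer is already full. */
static int my_push(struct my *m, struct other item)
{
    if (m->used >= MAX_ITEMS) {
        return -1;
    }
    m->array[m->used] = item;
    m->used++;
    return 0;
}
```

Every instance costs the full worst-case size, but memory use is known at link time and no allocation can fail at run time.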
Please note that not just malloc/calloc, but also VLAs and flexible array members are banned by MISRA-C:2012. And if you are using C90/MISRA-C:2004, there are no VLAs, nor is there any well-defined use of flexible array members: they invoked undefined behavior until C99.
Edit: This solution does not conform to MISRA-C rules.
You can kind of include VLAs in a struct definition, but only when it's inside a function. A way to get around this is to use a "flexible array member" at the end of your main struct, like so:
#include <stdio.h>
struct my {
    int len;
    int array[];
};
You can create functions that operate on this struct.
void print_my(struct my *my) {
    int i;
    for (i = 0; i < my->len; i++) {
        printf("%d\n", my->array[i]);
    }
}
Then, to create variable length versions of this struct, you can create a new type of struct in your function body, containing your my struct, but also defining a length for that buffer. This can be done with a varying size parameter. Then, for all the functions you call, you can just pass around a pointer to the contained struct my value, and they will work correctly.
void create_and_use_my(int nelements) {
    int i;
    // Declare the containing struct with variable number of elements.
    struct {
        struct my my;
        int array[nelements];
    } my_wrapper;
    // Initialize the values in the struct.
    my_wrapper.my.len = nelements;
    for (i = 0; i < nelements; i++) {
        my_wrapper.my.array[i] = i;
    }
    // Print the struct using the generic function above.
    print_my(&my_wrapper.my);
}
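You can call this function with any value of nelements and it will work fine with GCC. Note that this goes beyond standard C99: a variable length array as a struct member, and a struct with a flexible array member nested inside another struct, are both GCC extensions, so the code is not portable.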
Important: If you pass the struct my to another function, and not a pointer to it, I can pretty much guarantee you it will cause all sorts of errors, since it won't copy the variable length array with it.
Here's a thought that may be totally inappropriate for your situation, but given your constraints I'm not sure how else to deal with it.
Create a large static array and use this as your "heap":
static struct other heap[SOME_BIG_NUMBER];
You'll then "allocate" memory from this "heap" like so:
var.array = &heap[start_point];
You'll have to do some bookkeeping to keep track of what parts of your "heap" have been allocated. This assumes that you don't have any major constraints on the size of your executable.
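The simplest form of that bookkeeping is a single "next free" index. The sketch below assumes blocks are never returned to the pool; SOME_BIG_NUMBER is given an arbitrary value, and next_free and pool_alloc are hypothetical names.

```c
#include <stddef.h>

#define SOME_BIG_NUMBER 64u /* assumed pool size, in elements */

/* Placeholder element type; the real fields are not given. */
struct other {
    int value;
};

static struct other heap[SOME_BIG_NUMBER];
static size_t next_free = 0; /* index of the first unused element */

/* Hands out n consecutive elements from the static pool, or NULL
   if n is zero or not enough elements remain. Nothing is ever
   freed in this sketch. */
static struct other *pool_alloc(size_t n)
{
    struct other *block;

    if (n == 0 || n > SOME_BIG_NUMBER - next_free) {
        return NULL;
    }
    block = &heap[next_free];
    next_free += n;
    return block;
}
```

With this in place, var.array = pool_alloc(n); replaces the malloc call. Supporting release and reuse of blocks would need real bookkeeping (a free list, for example), which is where the approach starts to resemble a hand-written malloc.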

Why do we use zero length array instead of pointers?

It's said that a zero-length array is for variable-length structures, which I can understand. But what puzzles me is why we don't simply use a pointer; we can dereference it and allocate a different size structure in the same way.
EDIT - Added example from comments
Assuming:
struct p
{
char ch;
int *arr;
};
We can use this:
struct p *p = malloc(sizeof(*p) + (sizeof(int) * n));
p->arr = (int *)(p + 1);
To get a contiguous chunk of memory. However, I seemed to forget the space p->arr occupies and it seems to be a disparate thing from the zero size array method.
If you use a pointer, the structure would no longer be of variable length: it will have fixed length, but its data will be stored in a different place.
The idea behind zero-length arrays* is to store the data of the array "in line" with the rest of the data in the structure, so that the array's data follows the structure's data in memory. Pointer to a separately allocated region of memory does not let you do that.
* Such arrays are also known as flexible arrays; in C99 you declare them as element_type flexArray[] instead of element_type flexArray[0], i.e. you drop zero.
The pointer isn't really needed, so it costs space for no benefit. Also, it might imply another level of indirection, which also isn't really needed.
Compare these example declarations, for a dynamic integer array:
typedef struct {
size_t length;
int data[0];
} IntArray1;
and:
typedef struct {
size_t length;
int *data;
} IntArray2;
Basically, the pointer expresses "the first element of the array is at this address, which can be anything" which is more generic than is typically needed. The desired model is "the first element of the array is right here, but I don't know how large the array is".
Of course, the second form makes it possible to grow the array without risking that the "base" address (the address of the IntArray2 structure itself) changes, which can be really neat. You can't do that with IntArray1, since you need to allocate the base structure and the integer data elements together. Trade-offs, trade-offs ...
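A minimal sketch of the growth case for the pointer form, using the IntArray2 layout from above. The int_array2_resize helper is a hypothetical name, and zero-filling the new elements is a choice made for the example.

```c
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    size_t length;
    int   *data;
} IntArray2;

/* Resizes the element storage to new_length elements. The
   IntArray2 object itself never moves; only a->data may change.
   New elements are zero-filled. Returns 0 on success, -1 on
   failure (in which case the array is left untouched). */
static int int_array2_resize(IntArray2 *a, size_t new_length)
{
    int *tmp;

    if (new_length == 0) {
        free(a->data);
        a->data = NULL;
        a->length = 0;
        return 0;
    }
    if (new_length > SIZE_MAX / sizeof *tmp) {
        return -1; /* byte count would overflow */
    }
    tmp = realloc(a->data, new_length * sizeof *tmp);
    if (tmp == NULL) {
        return -1;
    }
    for (size_t i = a->length; i < new_length; i++) {
        tmp[i] = 0;
    }
    a->data = tmp;
    a->length = new_length;
    return 0;
}
```

Anything holding a pointer to the IntArray2 object stays valid across the resize. With IntArray1, the equivalent realloc may move the whole structure, so every such pointer would have to be updated.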
These are various forms of the so-called "struct hack", discussed in question 2.6 of the comp.lang.c FAQ.
Defining an array of size 0 is actually illegal in C, and has been at least since the 1989 ANSI standard. Some compilers permit it as an extension, but relying on that leads to non-portable code.
A more portable way to implement this is to use an array of length 1, for example:
struct foo {
size_t len;
char str[1];
};
You could allocate more than sizeof (struct foo) bytes, using len to keep track of the allocated size, and then access str[N] to get the Nth element of the array. Since C compilers typically don't do array bounds checking, this would generally "work". But, strictly speaking, the behavior is undefined.
The 1999 ISO standard added a feature called "flexible array members", intended to replace this usage:
struct foo {
size_t len;
char str[];
};
You can deal with these in the same way as the older struct hack, but the behavior is well defined. But you have to do all the bookkeeping yourself; sizeof (struct foo) still doesn't include the size of the array, for example.
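A minimal sketch of that bookkeeping for the flexible array member version of struct foo. The foo_new helper is a hypothetical name; it stores a copy of a string, including the terminating null character.

```c
#include <stdlib.h>
#include <string.h>

struct foo {
    size_t len;
    char   str[]; /* flexible array member (C99) */
};

/* Allocates a struct foo with room for a copy of text.
   sizeof *f covers only the fixed part, so the array bytes are
   added to the request by hand. Returns NULL on failure. */
static struct foo *foo_new(const char *text)
{
    size_t len = strlen(text);
    struct foo *f = malloc(sizeof *f + len + 1);

    if (f == NULL) {
        return NULL;
    }
    f->len = len;
    memcpy(f->str, text, len + 1); /* copies the '\0' as well */
    return f;
}
```

A single free(f) releases the header and the array together, since they live in one allocation.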
You can, of course, use a pointer instead:
struct bar {
size_t len;
char *ptr;
};
And this is a perfectly good approach, but it has different semantics. The main advantage of the "struct hack", or of flexible array members, is that the array is allocated contiguously with the rest of the structure, and you can copy the array along with the structure using memcpy (as long as the target has been properly allocated). With a pointer, the array is allocated separately -- which may or may not be exactly what you want.
This is because with a pointer you need a separate allocation and assignment.
struct WithPointer
{
int someOtherField;
...
int* array;
};
struct WithArray
{
int someOtherField;
...
int array[1];
};
To get an 'object' of WithPointer you need to do:
struct WithPointer* withPointer = malloc(sizeof(struct WithPointer));
withPointer->array = malloc(ARRAY_SIZE * sizeof(int));
To get an 'object' of WithArray:
struct WithArray* withArray = malloc(sizeof(struct WithArray) +
(ARRAY_SIZE - 1) * sizeof(int));
That's it.
In some cases it's also very handy, or even necessary, to have the array in consecutive memory; for example in network protocol packets.

safely resizing structs

I would like some advice on safe ways to deal with structs when the sizes of certain members are not known at coding time.
For example, I have a struct named "Channel". This struct has a member named "AudioSourceOBJ", which is a pointer to an array of another struct type named "AudioSource". I won't know how many AudioSources I will have per channel until the program is run. I deal with that like this.
channel object
struct channelobj {
AudioUnitSampleType *leftoutput;
AudioUnitSampleType *rightoutput;
AudioSourceOBJ *audioSource;
};
audiosource
struct audiosourceobj {
AudioUnitSampleType *leftoutput;
AudioUnitSampleType *rightoutput;
};
creation of variable sized structs
void createInputs(ChannelOBJ channel,int numAudioInputs)
{
    channel->audioSource=(AudioSourceOBJ *)malloc(numAudioInputs * sizeof(AudioSourceOBJ));
    for (int i=0;i<numAudioInputs;i++)
    {
        AudioSourceOBJ obj;
        obj=newAudioSourceOBJ();
        channel->audioSource[i]=obj;
    }
}
I think this is o.k?
The problem I am now facing is that even though I can assign memory for the correct number of audio objects in my channel struct, the leftoutput and rightoutput arrays in the audiosource struct will not be set until later in the program. They will be filled with an undetermined amount of data, and are likely to change in size and content throughout the lifetime of the application.
Will I have to completely re malloc the channel containing the audiosource every time I want to make changes to a single audio object?
What is a safe way to do this or is there a better approach?
"Will I have to completely re malloc the channel containing the audiosource every time I want to make changes to a single audio object?"
No. You could for example replace the left output of the ith audio source like this:
free(channel->audioSource[i].leftoutput);
channel->audioSource[i].leftoutput = malloc(newSize * sizeof(AudioUnitSampleType));
Or even:
AudioUnitSampleType *tmp = realloc(channel->audioSource[i].leftoutput,
newSize * sizeof(*tmp));
if (tmp == 0) { /* handle the error */ }
channel->audioSource[i].leftoutput = tmp;
By the way, if you don't post real code, it's possible that answers will contain errors due to errors in your examples.
There seems to be some confusion in your code between pointers and objects, for example the channel parameter is of type ChannelOBJ, then you use it as if it's a pointer. Is this an error, or is ChannelOBJ a typedef for struct channelobj*? It's generally better not to conceal that something is a pointer using a typedef.
If AudioUnitSampleType is likewise a pointer type, then my first code snippet above is incomplete, since it would then also be necessary to free the old objects pointed to by the elements of the array, and allocate new ones. The second one needs to free old ones or allocate new ones according to whether the size is being increased or decreased.
No, you won't have to resize the allocated block of AudioSourceOBJ structs. leftoutput and rightoutput are merely pointers of a fixed size (not variable-sized arrays) and can each be assigned an address by doing a separate malloc:
channel->audioSource[i].leftoutput = malloc(5 * sizeof(AudioUnitSampleType));

Dynamically create an array of TYPE in C

I've seen many posts for c++/java, but nothing for C. Is it possible to allocate memory for an array of type X dynamically during run time? For example, in pseudo,
switch(data_type)
case1:float, create a new array of floats to use in the rest of the program
case2:int, create new array of ints to use in the rest of the program
case3:unsigned, ....
// etc.
In my program I determine the data type from a text header file during run time, and then I need to create an appropriate array to store/manipulate data. Is there some kind of generic type in C?
EDIT: I need to dynamically create and DECIDE which array should be created.
Assuming you calculate the total size, in bytes, required from the array, you can just allocate that much memory and assign it to the correct pointer type.
Ex:
void * data_ptr = malloc( data_sz );
then you can assign it to a pointer for whatever type you want:
int *array1 = (int *)data_ptr;
or
float *array2 = (float *)data_ptr;
NOTE: malloc allocates memory on the heap, so it will not be automatically freed. Make sure you free the memory you allocate at some point.
UPDATE
enum {
    DATA_TYPE_INT,
    DATA_TYPE_FLOAT,
    ...
};
typedef struct {
    int data_type;
    union {
        float * float_ptr;
        int * int_ptr;
        ...
    } data_ptr;
} data;
While this lets you store the pointer and tell which type of pointer you should be using, it still leaves the problem of having to branch the behavior depending on the data type. That will be difficult because the compiler has to know the data type for assignments etc.
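To show what that branching looks like in practice, here is a minimal sketch restricted to int and float. The count field and the data_alloc, data_sum and data_free helpers are assumptions added for the example; they are not part of the struct above.

```c
#include <stdlib.h>

enum { DATA_TYPE_INT, DATA_TYPE_FLOAT };

typedef struct {
    int    data_type;
    size_t count; /* added for this sketch: number of elements */
    union {
        float *float_ptr;
        int   *int_ptr;
    } data_ptr;
} data;

/* Allocates count elements of the type chosen at run time.
   Returns 0 on success, -1 on failure or unknown type. */
static int data_alloc(data *d, int data_type, size_t count)
{
    d->data_type = data_type;
    d->count = count;
    switch (data_type) {
    case DATA_TYPE_INT:
        d->data_ptr.int_ptr = calloc(count, sizeof(int));
        return d->data_ptr.int_ptr != NULL ? 0 : -1;
    case DATA_TYPE_FLOAT:
        d->data_ptr.float_ptr = calloc(count, sizeof(float));
        return d->data_ptr.float_ptr != NULL ? 0 : -1;
    default:
        return -1;
    }
}

/* Every operation has to branch on the tag like this. */
static double data_sum(const data *d)
{
    double total = 0.0;
    size_t i;

    switch (d->data_type) {
    case DATA_TYPE_INT:
        for (i = 0; i < d->count; i++) total += d->data_ptr.int_ptr[i];
        break;
    case DATA_TYPE_FLOAT:
        for (i = 0; i < d->count; i++) total += d->data_ptr.float_ptr[i];
        break;
    }
    return total;
}

/* Frees the storage through the member that was allocated. */
static void data_free(data *d)
{
    if (d->data_type == DATA_TYPE_INT) {
        free(d->data_ptr.int_ptr);
    } else if (d->data_type == DATA_TYPE_FLOAT) {
        free(d->data_ptr.float_ptr);
    }
}
```

The switch in data_sum is the cost being described: each operation repeats one branch per supported type, and the compiler sees a concrete type inside every case.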
You're going to have a hard time doing this in C because C is statically typed and has no run-time type information. Every line of C code has to know exactly what type it is dealing with.
However, C comes with a nifty and much-abused macro preprocessor that lets you (among other things) define new functions that differ only in the static type. For example:
#define FOO_FUNCTION(t) t foo_function_##t(t a, t b) { return a + b; }
FOO_FUNCTION(int)
FOO_FUNCTION(float)
This gets you 2 functions, foo_function_int and foo_function_float, which are identical other than the name and type signature. If you're not familiar with the C preprocessor, be warned it has all sorts of fun gotchas, so read up on it before embarking on rewriting chunks of your program as macros.
Without knowing what your program looks like, I don't know how feasible this approach will be for you, but often the macro preprocessor can help you pretend that you're using a language that supports generic programming.
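As a sketch of how the same trick might apply to array code, the macro below stamps out one summing function per element type. DEFINE_ARRAY_SUM and the generated array_sum_int / array_sum_float names are hypothetical, chosen for the example.

```c
#include <stddef.h>

/* Generates a function array_sum_<t> that adds up count elements
   of type t. Token pasting means t must be a single identifier,
   so "unsigned int" would need a typedef first. */
#define DEFINE_ARRAY_SUM(t)                                   \
    static t array_sum_##t(const t *values, size_t count)     \
    {                                                         \
        t total = 0;                                          \
        for (size_t i = 0; i < count; i++) {                  \
            total += values[i];                               \
        }                                                     \
        return total;                                         \
    }

DEFINE_ARRAY_SUM(int)   /* defines array_sum_int */
DEFINE_ARRAY_SUM(float) /* defines array_sum_float */
```

The choice of which generated function to call still has to be made by a run-time branch on the type read from the header file; the macro only saves writing each body by hand.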
