C API allowing for both automatic and allocated storage

I'm writing an API that has structs such as
struct datast{
int a;
int *items;
size_t numitems;
};
I'm providing functions which free the contents of such structs (in a similar fashion to what C++ destructors do). Constructors are not provided because I'm mandating zero initialization for them (the .items field is required to be a NULL pointer on initialization, which makes it suitable for later realloc() and free()).
I am, however, providing an additem() function, which realloc()s .items and increases .numitems accordingly.
However, because these structs are small, I'd like to encourage the use of designated initializers and compound literals, so that users can conveniently create these objects with one-liners when possible, without having to manually invoke additem().
But then, if you initialize structs like these with designated initializers (or assign to them from a compound literal), the .items field will point to automatic storage rather than allocated storage. So if you later pass this struct to the "freeing" function/destructor, you'll be calling free() with an illegal pointer (one pointing to automatic storage).
Yes, I know the wording could be "don't call the destructor for objects for which you didn't call additem()"... but this looks really clumsy, and seems like bad design.
It feels as though I have to decide whether all these objects use automatic or allocated storage, without being able to offer the user both possibilities.
Have you ever been in a scenario like this? Is there any kind of design I could use that could provide a clean and elegant interface for both automatic and allocated storage?

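Add a boolean member items_allocated (bool requires <stdbool.h>). The zero initialisation that you mandate will make it false. Then additem() will set it to true; if .items was supplied by the caller, additem() has to copy those items into freshly allocated memory first, because realloc() must not be called on a pointer that did not come from the allocator: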
struct datast
{
int a;
int *items;
bool items_allocated ;
size_t numitems;
} ;
Then your destructor can have something like:
if( d->items_allocated )
{
free( d->items ) ;
d->items = NULL ;
}
d->numitems = 0 ;
...
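A minimal sketch of how the flag could be wired through both functions, assuming the struct layout from the answer. The name datast_free and the int return convention of additem() are illustrative choices, not part of the original API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct datast {
    int a;
    int *items;
    size_t numitems;
    bool items_allocated;   /* false after zero or designated initialization */
};

/* Append one item. If .items is borrowed (not obtained from malloc), it is
   copied to the heap first so realloc() is never given a non-heap pointer.
   Returns 0 on success, -1 on allocation failure (struct left unchanged). */
int additem(struct datast *d, int value)
{
    assert(d != NULL);
    int *p;
    if (d->items_allocated) {
        p = realloc(d->items, (d->numitems + 1) * sizeof *p);
        if (p == NULL)
            return -1;
    } else {
        p = malloc((d->numitems + 1) * sizeof *p);
        if (p == NULL)
            return -1;
        if (d->numitems > 0)
            memcpy(p, d->items, d->numitems * sizeof *p);
    }
    p[d->numitems] = value;
    d->items = p;
    d->numitems++;
    d->items_allocated = true;
    return 0;
}

/* Hypothetical destructor name: frees only what this API allocated. */
void datast_free(struct datast *d)
{
    assert(d != NULL);
    if (d->items_allocated)
        free(d->items);
    d->items = NULL;
    d->numitems = 0;
    d->items_allocated = false;
}
```

With this, a caller can write struct datast d = { .items = (int[]){10, 20}, .numitems = 2 }; and still call additem() and datast_free() on it safely.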

Related

Can I use this pointer to struct initialization?

Can I use this type of initialization for a pointer?
/* Globally scoped variables definitions -------------------------------------*/
some_struct_t* PLAYLISTS = &(some_struct_t){0};
some_struct_t* DEVICES = &(some_struct_t){0};
At the moment I'm using this function:
void init_fun()
{
...
PLAYLISTS = calloc(1, sizeof(*PLAYLISTS));
assert(PLAYLISTS && "Error allocating memory");
DEVICES = calloc(1, sizeof(*DEVICES));
assert(DEVICES && "Error allocating memory");
...
}
Context: I am creating a program targeting ESP8266/ESP32 SOC devices that interacts with the Spotify API.
Can I use this type of initialization for a pointer?
Sure, it is well-defined. The pointers are set to point at compound literals, and since it is done at file scope, those will exist throughout the execution of the program. This is guaranteed by C17 6.5.2.5/5:
If the compound literal occurs outside the body of a function, the object
has static storage duration; otherwise, it has automatic storage duration associated with the enclosing block.
However, the memory occupied by those compound literals cannot be reused once you reassign the pointers to point somewhere else. It would have been much more sensible to point at a zeroed-out struct in flash, since flash is less valuable than RAM.
However, since this is an embedded system (Why should I not use dynamic memory allocation in embedded systems?) and since it isn't a brilliant idea to make code needlessly slow, it would make far more sense to ditch the pointers, the compound literals and the calloc calls, and instead just memcpy a new value into PLAYLISTS or DEVICES. That way they don't even need to be pointers.
Applying the KISS principle, we instead end up with:
/* Globally scoped variables definitions -------------------------------------*/
static some_struct_t PLAYLISTS = {0};
static some_struct_t DEVICES = {0};
Faster, safer, less memory consuming, more readable.
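As a rough illustration of the "just memcpy a new value" idea, here is a sketch with a made-up layout for some_struct_t (the original post does not show its members). The helper names playlists_set, playlists_reset and playlists_get are likewise hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical layout; the real some_struct_t is not shown in the post. */
typedef struct {
    int  count;
    char name[16];
} some_struct_t;

static some_struct_t PLAYLISTS = {0};

/* Overwrite the whole object in place; no pointer or heap involved. */
void playlists_set(const some_struct_t *src)
{
    assert(src != NULL);
    memcpy(&PLAYLISTS, src, sizeof PLAYLISTS);
}

/* Return the object to its zero-initialized state. */
void playlists_reset(void)
{
    PLAYLISTS = (some_struct_t){0};
}

/* Read-only access for other translation units. */
const some_struct_t *playlists_get(void)
{
    return &PLAYLISTS;
}
```

Plain struct assignment (PLAYLISTS = *src;) would do the same job as the memcpy call.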

malloc'd pointer inside struct that is passed by value

I am putting together a project in C where I must pass around a variable length byte sequence, but I'm trying to limit malloc calls due to potentially limited heap.
Say I have a struct, my_struct, that contains the variable length byte sequence, ptr, and a function, my_func, that creates an instance of my_struct. In my_func, my_struct.ptr is malloc'd and my_struct is returned by value. my_struct will then be used by other functions being passed by value: another_func. Code below.
Is this "safe" to do against memory leaks provided somewhere on the original or any copy of my_struct when passed by value, I call my_struct_destroy or free the malloc'd pointer? Specifically, is there any way that when another_func returns, that inst.ptr is open to being rewritten or dangling?
Since stackoverflow doesn't like opinion-based questions, are there any good references that discuss this behavior? I'm not sure what to search for.
typedef struct {
char * ptr;
} my_struct;
// allocates n bytes to pointer in structure and initializes.
my_struct my_func(size_t n) {
my_struct out = {(char *) malloc(n)};
/* initialization of out.ptr */
return out;
}
void another_func(my_struct inst) {
/*
do something using the passed-by-value inst
are there problems with inst.ptr here or after this function returns?
*/
}
void my_struct_destroy(my_struct * ms_ptr) {
free(ms_ptr->ptr);
ms_ptr->ptr = NULL;
}
int main() {
my_struct inst = my_func(20);
another_func(inst);
my_struct_destroy(&inst);
}
It's safe to pass and return a struct containing a pointer by value as you did. Each copy of the struct contains a copy of ptr, so all copies point to the same allocation. Nothing is changed in the calling function. There would, of course, be a big problem if another_func freed ptr and the caller then tried to use it or free it again.
Keeping allocation and freeing close together is a best practice. Wherever possible, make the function that allocates an object also responsible for freeing it. Where that's not feasible, the malloc and free of the same object should be in the same source file. Where even that's not possible (think of a complex graph data structure with deletions), the collection of files that manage objects of a given type should be clearly identified and the conventions documented.
There's also a common technique for programs (like compilers) that work in stages, where much of the memory allocated in one stage should be freed before the next starts. Here, memory is malloc'd only in big blocks by a manager. From these, the manager allocates objects of any size, but it knows only one way to free: all at once, presumably at the end of a stage. GNU obstacks are one implementation of this idea. When allocation is more complex, bigger systems implement some kind of garbage collector. Beyond these ideas, there are as many ways to manage C storage as there are colors. Sorry I don't have any pointers to references (pun intended :)
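The "allocate from big blocks, free all at once" technique can be sketched as a tiny arena. This is not the GNU obstack API, just a simplified single-block illustration; the names arena, arena_init, arena_alloc and arena_free_all are made up for the example, and it assumes a C11 compiler for _Alignof and max_align_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    unsigned char *base;  /* one big block obtained from malloc */
    size_t cap;           /* total bytes in the block */
    size_t used;          /* bytes handed out so far */
} arena;

/* Returns 0 on success, -1 if the backing block could not be allocated. */
int arena_init(arena *a, size_t cap)
{
    assert(a != NULL);
    a->base = malloc(cap);
    a->cap = a->base ? cap : 0;
    a->used = 0;
    return a->base ? 0 : -1;
}

/* Hand out n bytes, rounded up so every object is suitably aligned.
   Returns NULL when the block is exhausted. */
void *arena_alloc(arena *a, size_t n)
{
    const size_t align = _Alignof(max_align_t);
    size_t rounded = (n + align - 1) / align * align;
    if (rounded < n || rounded > a->cap - a->used)
        return NULL;
    void *p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* The only way to free: everything at once, e.g. at the end of a stage. */
void arena_free_all(arena *a)
{
    free(a->base);
    a->base = NULL;
    a->cap = 0;
    a->used = 0;
}
```

A real stage allocator would chain further blocks when one fills up; this sketch simply reports exhaustion.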
If you only have one variable-length field and its size doesn't need to be dynamically updated, consider making the last field in the struct an array to hold it. This is the long-standing "struct hack": it works on common compilers, although only the C99 flexible array member form mentioned below is strictly sanctioned by the standard. Note that the function now returns a pointer to the struct:
typedef struct {
    /* ... other fields */
    char a[1]; // variable length
} my_struct;
my_struct *my_func(size_t n) { // assumes n >= 1
    my_struct *p = malloc(sizeof *p + (n - 1) * sizeof p->a[0]);
    /* ... check p for NULL, then initialize the fields of p */
    return p;
}
This avoids the need to separately free the variable-length field. Unfortunately it only works for one such field per struct.
If you're okay with gcc extensions, you can declare the array with size zero. In C99, you get the same effect with a flexible array member, a[]. Either avoids the - 1 in the size calculation.
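For comparison, a sketch of the C99 flexible array member form. The len field, the function names my_struct_create and my_struct_set, and the zero-filled payload are assumptions added for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    size_t len;   /* number of payload bytes that follow */
    char   a[];   /* flexible array member: must be the last field */
} my_struct;

/* One allocation holds both the header and n payload bytes.
   Returns NULL on overflow or allocation failure. */
my_struct *my_struct_create(size_t n)
{
    if (n > SIZE_MAX - sizeof(my_struct))
        return NULL;
    my_struct *p = malloc(sizeof *p + n * sizeof p->a[0]);
    if (p == NULL)
        return NULL;
    p->len = n;
    memset(p->a, 0, n);
    return p;
}

/* Bounds-checked write into the payload. */
void my_struct_set(my_struct *p, size_t i, char c)
{
    assert(p != NULL && i < p->len);
    p->a[i] = c;
}
```

A single free(p) releases the header and the payload together. Such a struct should be passed around by pointer, since copying it by value does not copy the payload.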

Allocation of variables inside dynamically allocated structs

Suppose to have a struct that contains a pointer to an array and its size, like this one:
typedef struct {
int * array;
int arr_size;
}IntArray;
and want to have this inside another struct, it can be done in two ways:
typedef struct{
IntArray ia;
//other variables
}Base1;
typedef struct{
IntArray * ia;
//other variables
}Base2;
What happens when I dynamically allocate Base1 and Base2 (e.g. Base1 *b1 = (Base1 *)malloc(sizeof(Base1));) and why should I choose one way instead of the other?
A nested struct's storage is part of its parent struct, so it doesn't need its own allocation (though it may still need its own initialization). A struct field that is a pointer, by contrast, must have its target allocated when the parent object is initialized and freed when the parent is destroyed (this is a common cause of memory leaks in C, because it does not have automatic object destructors like C++ does). With a pointer you could instead point at another array or object that lives on the stack (thus avoiding malloc/free), but then you might run into lifetime bugs if the pointed-to object goes out of scope before the struct that refers to it.
Nested structs exist in-place, so they cannot be shared by other instances. This may or may not be ideal (you could solve this with a template in C++, in C you'd have to settle for a hideous preprocessor macro).
Because dynamically allocated objects (such as your array and the object your Base2 type's ia member points to) live at separate locations in memory, your code cannot benefit from the spatial locality that CPU caches exploit, and each access to the array incurs a double pointer dereference. So your code will likely run slower.
Anyway: when in C, you should generally try to minimize pointer use.
Basically the question is the same as, should I allocate a struct or a pointer to a struct? That is:
IntArray myStruct;
or
IntArray *myStructPtr;
The fact that the variables in question are within a struct makes no difference, you can choose either.
You access them the same way you would if they were not inside another structure, after first going through the field of the outer structure.
Base1 contains the actual IntArray struct, so you would write:
Base1 *b1 = malloc(sizeof(*b1));
b1->ia.array = malloc(yourSizeHere);
Base2 contains a pointer to an IntArray struct, so you would need to point it to an existing IntArray struct or malloc() memory for it, and then access it through the pointer:
Base2 *b2 = malloc(sizeof(*b2));
b2->ia = malloc(sizeof(*(b2->ia)));
b2->ia->array = malloc(yourSizeHere);
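To make the difference in bookkeeping concrete, here is a sketch with matching create/destroy pairs for both layouts. The function names are hypothetical, and the "other variables" from the question are omitted.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int *array; int arr_size; } IntArray;
typedef struct { IntArray ia;  } Base1;   /* IntArray stored in place */
typedef struct { IntArray *ia; } Base2;   /* IntArray stored elsewhere */

/* Base1: two allocations (the struct, then the element array). */
Base1 *base1_create(int n)
{
    assert(n > 0);
    Base1 *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->ia.array = calloc((size_t)n, sizeof *b->ia.array);
    if (b->ia.array == NULL) {
        free(b);
        return NULL;
    }
    b->ia.arr_size = n;
    return b;
}

void base1_destroy(Base1 *b)
{
    if (b == NULL)
        return;
    free(b->ia.array);
    free(b);
}

/* Base2: three allocations (the struct, the IntArray, the element array). */
Base2 *base2_create(int n)
{
    assert(n > 0);
    Base2 *b = malloc(sizeof *b);
    if (b == NULL)
        return NULL;
    b->ia = malloc(sizeof *b->ia);
    if (b->ia == NULL) {
        free(b);
        return NULL;
    }
    b->ia->array = calloc((size_t)n, sizeof *b->ia->array);
    if (b->ia->array == NULL) {
        free(b->ia);
        free(b);
        return NULL;
    }
    b->ia->arr_size = n;
    return b;
}

/* Assumes b came from base2_create, so b->ia is never NULL here. */
void base2_destroy(Base2 *b)
{
    if (b == NULL)
        return;
    free(b->ia->array);   /* innermost allocation first */
    free(b->ia);
    free(b);
}
```

The extra level in Base2 costs one more allocation, one more failure path and one more free, which is the practical price of being able to share or swap the IntArray.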

Memory allocation of struct member variables

I am new to C. I have these two files set up in this way.
I do not fully understand how I am able to assign values in the Item array without dynamically allocating memory.
The line Collection c; places all fields on the stack, so is that why I can directly set array members?
//collection.c
typedef struct {
uint32 price;
uint32 itemId;
} Item;
typedef struct {
Item item[MAX_SIZE];
uint32 name;
} Collection;
void function(Collection * ptr)
{
int i;
uint32 id = 0;
for(i = 0; i < MAX_SIZE; i++)
{
ptr->item[i].price = 10;
ptr->item[i].itemId = id;
id++;
}
}
//collection_main.c
Collection c; //global struct variable
//calls function in collection.c
function(&c);
I do not fully understand how I am able to assign values in the Item array without dynamically allocating memory.
First, as you are new to C, be aware of a potential issue with passing pointers to C functions (which is a perfectly reasonable thing to do, BTW). Unless you can guarantee that your calling code will always pass a valid pointer, you need to check the pointer value in the function as best you can. That will typically amount to checking for a non-null pointer like this:
if ( ptr == NULL )
return <whatever to signal an error> ;
In this case you did allocate memory, because you created a Collection variable and that contains allocated space for the required fields.
The line Collection c; places all fields on the stack,
If it's in a function it will (typically) allocate space on that function's stack frame, which you should logically view as a separate area that the calling code cannot access. Make no assumptions about the layout of the stack. A very typical bug is to try and return a pointer to an item declared inside a function, and even supposedly experienced programmers have been known to do it.
Another potential bug in passing a pointer to a function is trying to access beyond the limits of the space allocated and pointed to. This can do things like corrupt other variables or even crash code. Your own code is correctly using the declared constant size of the array, so no problem.
If you do this outside of a function (which is possible), you would be using space reserved for this kind of variable when the program is loaded. That is typically not on the stack but elsewhere. The OS gets the required size from the compiled executable file.
so is that why I can directly set array members ?
C code (and the executable binary that's produced by the compiler) does not care or check whether the pointers you pass are valid or not. So it's possible to pass a bad pointer to a C function and cause chaos.
In this case you did allocate all the required valid memory when you declared the variable and you passed a pointer to that variable. So no problem.
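A compilable sketch of the same idea, assuming uint32 corresponds to uint32_t from <stdint.h> and picking an arbitrary MAX_SIZE of 4. The function is renamed fill_collection and given an int return value so the NULL check has something to report; both are illustrative changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SIZE 4   /* hypothetical value; the post does not give one */
static_assert(MAX_SIZE > 0, "the item array needs at least one element");

typedef struct {
    uint32_t price;
    uint32_t itemId;
} Item;

typedef struct {
    Item item[MAX_SIZE];   /* storage lives inside every Collection object */
    uint32_t name;
} Collection;

/* Returns 0 on success, -1 if ptr is NULL. */
int fill_collection(Collection *ptr)
{
    if (ptr == NULL)
        return -1;
    for (size_t i = 0; i < MAX_SIZE; i++) {
        ptr->item[i].price = 10;
        ptr->item[i].itemId = (uint32_t)i;
    }
    return 0;
}

/* File scope: static storage duration, exists for the whole program run. */
static Collection c;
```

Calling fill_collection(&c) works for the file-scope object, and it works equally for a Collection declared inside a function, because in both cases the declaration itself reserves room for all MAX_SIZE items.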
Dynamic memory allocation
It is more usual to consider explicit allocation using the malloc() family of functions as dynamic allocation. Allocations for local and global variables may be dynamic in the sense that they can happen at runtime but the allocation and deallocation are not the responsibility of the programmer to explicitly control so you do not generally need to think about these as part of dynamic memory allocation.
A minor point to close :
uint32 name ;
I'd consider this a bad choice of field name. Using "name" implies a string, whereas you probably mean a string id from e.g. an array. So try something like :
uint32 nameid ;
instead.
You'd be surprised how many coding problems crop up in a production environment simply because of a poor choice of variable name. Make them informative if possible and practical.
This is just a good coding habit to get into, IMO.

How to include a variable-sized array as struct member in C?

I must say, I have quite a conundrum in a seemingly elementary problem. I have a structure in which I would like to store an array as a field. I'd like to reuse this structure in different contexts, and sometimes I need a bigger array, sometimes a smaller one. C prohibits the use of a variable-sized buffer as a struct member. So the natural approach would be declaring a pointer to this array as a struct member:
struct my {
struct other* array;
};
The problem with this approach however, is that I have to obey the rules of MISRA-C, which prohibits dynamic memory allocation. So then if I'd like to allocate memory and initialize the array, I'm forced to do:
var.array = malloc(n * sizeof(...));
which is forbidden by MISRA standards. How else can I do this?
Since you are following MISRA-C, I would guess that the software is somehow mission-critical, in which case all memory allocation must be deterministic. Heap allocation is banned by every safety standard out there, not just by MISRA-C but by the more general safety standards as well (IEC 61508, ISO 26262, DO-178 and so on).
In such systems, you must always design for the worst-case scenario, which will consume the most memory. You need to allocate exactly that much space, no more, no less. Everything else does not make sense in such a system.
Given those pre-requisites, you must allocate a static buffer of size LARGE_ENOUGH_FOR_WORST_CASE. Once you have realized this, you simply need to find a way to keep track of what kind of data you have stored in this buffer, by using an enum and maybe a "size used" counter.
Please note that not just malloc/calloc, but also VLAs and flexible array members are banned by MISRA-C:2012. And if you are using C90/MISRA-C:2004, there are no VLAs, nor is there any well-defined use of flexible array members: they invoked undefined behavior until C99.
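The worst-case static buffer with an enum tag and a "size used" counter might look like the sketch below. WORST_CASE_LEN, the enum values, the body of struct other and my_push are all placeholders, and no claim is made that this exact code passes a MISRA checker.

```c
#include <assert.h>
#include <stddef.h>

#define WORST_CASE_LEN 4   /* hypothetical worst-case element count */

struct other { int value; };   /* placeholder for the question's type */

enum my_kind { MY_EMPTY, MY_SMALL_SET, MY_LARGE_SET };   /* hypothetical */

struct my {
    enum my_kind kind;                   /* what the buffer currently holds */
    size_t used;                         /* number of valid elements */
    struct other array[WORST_CASE_LEN];  /* fixed, worst-case sized */
};

/* Append into the fixed buffer. Returns 0 on success, -1 when full. */
int my_push(struct my *m, struct other item)
{
    assert(m != NULL);
    if (m->used >= WORST_CASE_LEN)
        return -1;
    m->array[m->used] = item;
    m->used++;
    return 0;
}
```

Every struct my occupies the same amount of memory regardless of how many elements are in use, which is what makes the memory footprint deterministic.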
Edit: This solution does not conform to MISRA-C rules.
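You can kind of include VLAs in a struct definition, but only when it's inside a function, and only as a compiler extension (GCC accepts it; standard C does not allow variably modified struct members). A way to get around this is to use a "flexible array member" at the end of your main struct, like so: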
#include <stdio.h>
struct my {
int len;
int array[];
};
You can create functions that operate on this struct.
void print_my(struct my *my) {
int i;
for (i = 0; i < my->len; i++) {
printf("%d\n", my->array[i]);
}
}
Then, to create variable length versions of this struct, you can create a new type of struct in your function body, containing your my struct, but also defining a length for that buffer. This can be done with a varying size parameter. Then, for all the functions you call, you can just pass around a pointer to the contained struct my value, and they will work correctly.
void create_and_use_my(int nelements) {
int i;
// Declare the containing struct with variable number of elements.
struct {
struct my my;
int array[nelements];
} my_wrapper;
// Initialize the values in the struct.
my_wrapper.my.len = nelements;
for (i = 0; i < nelements; i++) {
my_wrapper.my.array[i] = i;
}
// Print the struct using the generic function above.
print_my(&my_wrapper.my);
}
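You can call this function with any positive value of nelements. Note that this relies on GNU C extensions rather than plain C99: standard C allows neither a variable-length array as a struct member nor a struct with a flexible array member nested inside another struct. It also assumes there is no padding between my and array in the wrapper.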
Important: If you pass the struct my to another function, and not a pointer to it, I can pretty much guarantee you it will cause all sorts of errors, since it won't copy the variable length array with it.
Here's a thought that may be totally inappropriate for your situation, but given your constraints I'm not sure how else to deal with it.
Create a large static array and use this as your "heap":
static struct other heap[SOME_BIG_NUMBER];
You'll then "allocate" memory from this "heap" like so:
var.array = &heap[start_point];
You'll have to do some bookkeeping to keep track of what parts of your "heap" have been allocated. This assumes that you don't have any major constraints on the size of your executable.
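The simplest possible bookkeeping is a single "next free index" that only moves forward. The sketch below assumes that scheme; SOME_BIG_NUMBER is given an arbitrary value, struct other gets a placeholder body, and pool_take and pool_reset are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define SOME_BIG_NUMBER 16   /* hypothetical pool size */
static_assert(SOME_BIG_NUMBER > 0, "the pool needs at least one element");

struct other { int value; };   /* placeholder for the question's type */

static struct other heap[SOME_BIG_NUMBER];
static size_t heap_next;       /* index of the first unused element */

/* Reserve n consecutive elements. Returns NULL if n is 0 or the pool
   does not have n elements left. */
struct other *pool_take(size_t n)
{
    if (n == 0 || n > SOME_BIG_NUMBER - heap_next)
        return NULL;
    struct other *p = &heap[heap_next];
    heap_next += n;
    return p;
}

/* Release everything at once; individual regions cannot be returned. */
void pool_reset(void)
{
    heap_next = 0;
}
```

Usage would then be var.array = pool_take(n); followed by a NULL check. Supporting the release of individual regions would need a free list or similar, which is where the bookkeeping becomes more involved.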
