Public sizeof for privately defined struct - c

I have a little data-hiding module that looks like this:
/** mydata.h */
struct _mystruct_t;
typedef struct _mystruct_t mystruct;
mystruct *newMystruct();
void freeMystruct( mystruct** p );
/** mydata.c */
#include "mydata.h"
struct _mystruct_t {
int64_t data1;
int16_t data2;
int16_t data3;
};
// ... related definitions ... //
For the most part, this is what I want; although simple, the struct has strict consistency requirements and I really don't want to provide access to the data members.
The problem is that in client code I would like to include the struct in another struct which I would like to allocate on the stack. Right now I am jumping through hoops to free the mystruct*s in some client code. Since a) mystruct is pretty small and I really don't think it's going to get big anytime soon and b) it's not a problem that client code has to recompile if I ever change mystruct, I would like to make the size of mystruct public (i.e. in the header).
Two possibilities I've considered:
/** mydata.h */
typedef struct {
// SERIOUSLY DON'T ACCESS THESE MEMBERS
int64_t data1;
int16_t data2;
int16_t data3;
} mystruct;
I think the drawbacks here speak for themselves.
OR
/** mydata.h */
#define SIZEOF_MYSTRUCT (sizeof(int64_t)+sizeof(int16_t)+sizeof(int16_t))
// everything else same as before...
/** mydata.c */
// same as before...
_Static_assert (SIZEOF_MYSTRUCT == sizeof(mystruct), "SIZEOF_MYSTRUCT is incorrect");
Of course this seems non-ideal since I have to update this value manually and I don't know if/how alignment of the struct could actually cause this to be incorrect (I thought of the static assert while writing this question, it partially addresses this concern).
Is one of these preferred? Or even better, is there some clever trick to provide the actual struct definition in the header while later somehow hiding the ability to access the members?

You can distribute a different .h file to the end user that defines your secret structure as just a byte array (without crypto/checksumming you can't hide data any better than by saying "here are some bytes"):
typedef struct {
unsigned char data[12];
} your_struct;
You just have to make sure that both structures have the same size and alignment for all compilers and options, for example by using __declspec(align()) (for VC) in your library code:
// Library side
__declspec(align(32)) typedef struct {
int64_t data1;
int16_t data2;
int16_t data3;
} mystruct;
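Keep in mind that with default alignment this structure is usually 16 bytes long rather than the 12 bytes its member sizes add up to, and that align() can only increase the size. To actually get 12 bytes you would have to control packing instead, for example with the /Zp compiler option.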
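One way to make the byte-array idea more robust is to give the public type both enough bytes and suitable alignment, and to let the compiler check both in the library source. The following is only a sketch, assuming C11: MYSTRUCT_SIZE, mystruct_private, mystruct_init and mystruct_get_data2 are illustrative names that do not appear in the question, and the two "files" are shown in one translation unit.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdalign.h> /* alignof */
#include <stdint.h>
#include <string.h>

/* ---- public header (hypothetical mydata.h) ---- */
#define MYSTRUCT_SIZE 16 /* assumed size; verified by the asserts below */

typedef union {
    unsigned char bytes[MYSTRUCT_SIZE]; /* reserves the space */
    int64_t align_;                     /* forces suitable alignment */
} mystruct;

void mystruct_init(mystruct *m, int64_t d1, int16_t d2, int16_t d3);
int16_t mystruct_get_data2(const mystruct *m);

/* ---- library source (hypothetical mydata.c) ---- */
struct mystruct_private {
    int64_t data1;
    int16_t data2;
    int16_t data3;
};

static_assert(sizeof(struct mystruct_private) <= sizeof(mystruct),
              "MYSTRUCT_SIZE is too small");
static_assert(alignof(struct mystruct_private) <= alignof(mystruct),
              "mystruct is under-aligned");

void mystruct_init(mystruct *m, int64_t d1, int16_t d2, int16_t d3)
{
    struct mystruct_private p = { d1, d2, d3 };
    memcpy(m->bytes, &p, sizeof p); /* copy in; sidesteps aliasing questions */
}

int16_t mystruct_get_data2(const mystruct *m)
{
    struct mystruct_private p;
    memcpy(&p, m->bytes, sizeof p); /* copy out the private view */
    return p.data2;
}
```

Client code can then declare a mystruct on the stack or embed it in another struct. Copying through memcpy is one conservative choice here; casting the storage pointer to the private type is common in practice but raises effective-type questions.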

I would stay with a configure-time generated #define describing the size of mystruct, and possibly a typedef char opaque_mystruct[SIZEOF_MYSTRUCT] to simplify the creation of placeholders for mystruct.
The idea of configure-time actions probably deserves some explanation. The general idea is to:
1. place the definition of mystruct into a private, non-exported but nevertheless distributed header;
2. create a small test application that is built and executed before the library. The test application would #include the private header and print the actual sizeof(mystruct) for a given compiler and compile options;
3. create an appropriate script which would create a library config.h with #define SIZEOF_MYSTRUCT <calculated_number> and possibly the definition of opaque_mystruct.
It's convenient to automate these steps with a decent build system, for example cmake, GNU autotools, or any other with support for a configure stage. In fact, all the mentioned systems have built-in facilities that reduce the whole task to invoking a few predefined macros.
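The small test application from the steps above could look roughly like this. It is only a sketch: the struct is repeated inline where a real build would #include the private header, and format_size_define and emit_config are made-up helper names.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* In a real build this definition would come from the private header. */
struct mystruct_private {
    int64_t data1;
    int16_t data2;
    int16_t data3;
};

/* Writes the line destined for the generated config.h into buf.
   Returns what snprintf returns: the length the full line needs. */
int format_size_define(char *buf, size_t buflen)
{
    assert(buf != NULL);
    return snprintf(buf, buflen, "#define SIZEOF_MYSTRUCT %zu\n",
                    sizeof(struct mystruct_private));
}

/* What the generator's main() would call: print the line on stdout
   so that the build script can redirect it into config.h. */
int emit_config(void)
{
    char line[64];
    int n = format_size_define(line, sizeof line);
    if (n < 0 || (size_t)n >= sizeof line)
        return -1; /* formatting failed or the line was truncated */
    return fputs(line, stdout) < 0 ? -1 : 0;
}
```

Because the generator has to be executed, this approach assumes it is built with the same compiler and options as the library and can run on the build machine; cross-compiled builds need another way to obtain the number.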

I've been researching and thinking and took one of my potential answers and took it to the next level; I think it addresses all of my concerns. Please critique.
/** in mydata.h */
typedef const struct { const char data[12]; } mystruct;
mystruct createMystruct();
int16_t exampleMystructGetter( mystruct *p );
// other func decls operating on mystruct ...
/** in mydata.c */
typedef union {
mystruct public_block;
struct mystruct_data_s {
int64_t d1;
int16_t d2;
int16_t d3;
} data;
} mystruct_data;
// Optionally use '==' instead of '<=' to force minimal space usage
_Static_assert (sizeof(struct mystruct_data_s) <= sizeof(mystruct), "mystruct not big enough");
mystruct createMystruct(){
static mystruct_data mystruct_blank = { .data = { .d1 = 1, .d2 = 2, .d3 = 3 } };
return mystruct_blank.public_block;
}
int16_t exampleMystructGetter(mystruct *p) {
mystruct_data *a = (mystruct_data*)p;
return a->data.d2;
}
Under gcc 4.7.3 this compiles without warnings. A simple test program to create and access via the getter also compiles and works as expected.

Related

Is it possible to allocate structs on the stack with its definition hidden in a source file?

I have the following header file:
struct StackList_s;
typedef struct StackList_s StackList_t;
// From here I add in the method signatures
And the following source file:
struct StackList_s
{
integer_t count;
struct StackListNode_s *top; // Here begins the linked list
// Some other members that store information about the stack
integer_t version_id;
};
// From here I define StackListNode_s and implement the StackList_s functions
// Note that the user will never manipulate directly a StackListNode_s
// There are functions that will handle the free() of each node correctly
I hide the struct definition in the source file so that anyone using this stack can't modify directly its members, since changing them requires some input treatment or checking for certain invalid states.
Currently, to get a new stack you have to use the following:
// malloc(sizeof(StackList_t)) and set members to default
StackList_t *stack = stl_new(/* Some info parameters */);
But this only lets me allocate a StackList_t on the heap. What I want is to have the StackList_t itself allocated on the stack, while its nodes are allocated on the heap along with their data and pointers to other nodes. This way I can give the user a choice between using the struct locally and passing it around between functions as an allocated resource.
StackList_t stack;
stl_init(&stack, /* Info parameters */); // No malloc, only setting members to 0
But of course I can't do this because the definition of struct StackList_s is in the source file. So here are my questions:
Is it possible to, at the same time, not allow access to members of a struct and allocate that same struct in the stack?
Is there any way to tell the compiler the size of my struct?
You can do that with VLAs or alloca in Linux:
Library header:
struct StackList_s;
typedef struct StackList_s StackList_t;
extern const size_t StackList_size;
// If you're using VLAs
extern const size_t StackList_align;
StackList_t* stl_init_inline(char stack_source[], ...);
Library source:
#include "header.h"
struct StackList_s {
// ...
};
const size_t StackList_size = sizeof(StackList_t);
// If you're using VLAs
#include <stdalign.h>
#include <stdint.h>
const size_t StackList_align = alignof(StackList_t);
StackList_t* stl_init_inline(char stack_source[], ...) {
// align the address to the nearest multiple of StackList_align
uintptr_t address = (uintptr_t) ((void*) stack_source);
if (address % StackList_align != 0) {
address += StackList_align - address % StackList_align;
}
StackList_t* stack = (StackList_t*) ((void*) address);
stl_init(stack, ...);
return stack;
}
Main source:
#include "header.h"
#include <alloca.h>
// Option 1: alloca
StackList_t* stack = alloca(StackList_size);
stl_init(stack, ...);
// Option 2: a VLA with room for alignment
char stack_source[StackList_size + StackList_align - 1]; // Not compile time.
StackList_t* stack2 = stl_init_inline(stack_source, ...);
This would allocate it on the stack, and you won't need to free it, but it's slower and more verbose than just StackList_t stack_source;. (And alloca is non-standard: it is available on Linux and many other platforms, but not guaranteed everywhere.)
For the second question, you need the full definition of a struct to get its size. Common pitfalls include the fact that sizeof(struct { int a; }) == sizeof(struct { int a; }) can be false. It probably won't be though, so you can do #define StackList_size sizeof(struct { integer_t count; struct StackListNode_s *top; integer_t version_id; }) but that also leads to a lot of code duplication.
I personally would just put the struct definition in the header file, and just declare "don't mess with the members or my methods won't work" in a comment somewhere (Maybe making the names start with _ to give a hint that they are private)
You could do something similar to Artyer's answer without using VLAs by using a #define instead:
Header:
#define STACKLISTSIZE 32
typedef uint8_t stl_storage[STACKLISTSIZE];
typedef struct stacklist_s stacklist_t;
stacklist_t* stl_create_from_stack(stl_storage b); //user provides memory
stacklist_t* stl_allocate(void); //library allocates memory, user must free.
Source:
int myfunction()
{
stl_storage x;
stacklist_t* sp = stl_create_from_stack(x);
//do something with sp.
}
Make sure you have a compile-time assert that sizeof(struct stacklist_s) == STACKLISTSIZE in the implementation file.
Some implementations guarantee that calls between compilation units will be processed in a fashion consistent with the platform's Application Binary Interface (ABI), without regard for what a called function is going to do with storage whose address it receives, or what a caller will have done with storage whose address it supplies, or will do with such storage once the function returns. On such implementations, given something like:
// In header
typedef union FOO_PUBLIC_UNION {
uint64_t dat[4]; // Allocate space
double dummy_align1; // Force alignment
void *dummy_align2; // Force alignment
} FOO;
void act_on_foo(FOO*);
// In code
FOO x = {0};
act_on_foo(&x);
in one compilation unit, and something like the following in another:
struct FOO_PRIVATE {
int this; float that; double whatever;
};
typedef union FOO_PUBLIC_UNION { uint64_t dat[4]; struct FOO_PRIVATE priv; } FOOPP;
void act_on_foo(FOO *p)
{
FOOPP *pp = (FOOPP*)p;
pp->priv.whatever = 1234.567;
}
provided that the size of FOO and FOOPP match, the behavior of calling an external function from the first compilation unit would be defined as allocating sizeof(FOO) bytes, zeroing them, and passing their address to act_on_foo, whose behavior would then be defined as acting upon the bytes to which it receives an address, without regard for how they got their values or what the caller would do with them later.
Unfortunately, even though almost every implementation should be capable of producing behavior consistent with calling a function it knows nothing about, there is no standard way of indicating to a compiler that a particular function call should be viewed as "opaque". Implementations intended for purposes where that would be useful could and typically did support such semantics with "ordinary" function calls whether or not the Standard required that, and such semantics would offer little value on implementations intended only for purposes where they wouldn't be useful. Unfortunately, this has led to a Catch 22: there's no reason for the Standard to mandate things implementations would be free to do, with or without a mandate, in cases where they're useful, but some compiler writers treat the Standard's lack of a mandate as an encouragement to deny support.

Data encapsulation in C

I am currently working on an embedded system and I have a component on a board which appears two times. I would like to have one .c and one .h file for the component.
I have the following code:
typedef struct {
uint32_t pin_reset;
uint32_t pin_drdy;
uint32_t pin_start;
volatile avr32_spi_t *spi_module;
uint8_t cs_id;
} ads1248_options_t;
Those are all hardware settings. I create two instances of this struct (one for each part).
Now I need to keep an array of values in the background. E.g. I can read values from that device every second and I want to keep the last 100 values. I would like this data to be non-accessible from the "outside" of my component (only through special functions in my component).
I am unsure on how to proceed here. Do I really need to make the array part of my struct? What I thought of would be to do the following:
int32_t *adc_values; // <-- Add this to struct
int32_t *adc_value_buffer = malloc(sizeof(int32_t) * 100); // <-- Call in initialize function, this will never be freed on purpose
Yet, I will then be able to access my int32_t pointer from everywhere in my code (also from outside my component) which I do not like.
Is this the only way to do it? Do you know of a better way?
Thanks.
For the specific case of writing hardware drivers for a microcontroller, which this appears to be, please consider doing like this.
Otherwise, use opaque/incomplete type. You'd be surprised to learn how shockingly few C programmers there are who know how to actually implement 100% private encapsulation of custom types. This is why there's some persistent myth about C lacking the OO feature known as private encapsulation. This myth originates from lack of C knowledge and nothing else.
This is how it goes:
ads1248.h
typedef struct ads1248_options_t ads1248_options_t; // incomplete/opaque type
ads1248_options_t* ads1248_init (parameters); // a "constructor"
void ads1248_destroy (ads1248_options_t* ads); // a "destructor"
ads1248.c
#include "ads1248.h"
#include <stdint.h> // uint32_t, uint8_t
#include <stdlib.h> // malloc, free
struct ads1248_options_t {
uint32_t pin_reset;
uint32_t pin_drdy;
uint32_t pin_start;
volatile avr32_spi_t *spi_module;
uint8_t cs_id;
};
ads1248_options_t* ads1248_init (parameters)
{
ads1248_options_t* ads = malloc(sizeof(ads1248_options_t));
// do things with ads based on parameters
return ads;
}
void ads1248_destroy (ads1248_options_t* ads)
{
free(ads);
}
main.c
#include "ads1248.h"
int main()
{
ads1248_options_t* ads = ads1248_init(parameters);
...
ads1248_destroy(ads);
}
Now the code in main cannot access any of the struct members, all members are 100% private. It can only create a pointer to a struct object, not an instance of it. Works exactly like abstract base classes in C++, if you are familiar with that. The only difference is that you'll have to call the init/destroy functions manually, rather than using true constructors/destructors.
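The same pattern can also hide the buffer of the last 100 readings that the question asks about. The following is a minimal sketch, assuming a separate opaque history type; ads1248_history_t and all of its functions are hypothetical names, and the header and source parts are shown together in one file.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* ---- hypothetical addition to ads1248.h ---- */
typedef struct ads1248_history ads1248_history_t; /* opaque */
ads1248_history_t *ads1248_history_create(void);
void ads1248_history_destroy(ads1248_history_t *h);
void ads1248_history_push(ads1248_history_t *h, int32_t value);
size_t ads1248_history_count(const ads1248_history_t *h);
int ads1248_history_get(const ads1248_history_t *h, size_t age, int32_t *out);

/* ---- hypothetical addition to ads1248.c ---- */
#define ADS1248_HISTORY_LEN 100

struct ads1248_history {
    int32_t values[ADS1248_HISTORY_LEN];
    size_t next;  /* slot the next sample is written to */
    size_t count; /* number of valid samples, at most the length */
};

ads1248_history_t *ads1248_history_create(void)
{
    return calloc(1, sizeof(ads1248_history_t)); /* zeroed: empty history */
}

void ads1248_history_destroy(ads1248_history_t *h)
{
    free(h);
}

void ads1248_history_push(ads1248_history_t *h, int32_t value)
{
    assert(h != NULL);
    h->values[h->next] = value; /* overwrites the oldest once full */
    h->next = (h->next + 1) % ADS1248_HISTORY_LEN;
    if (h->count < ADS1248_HISTORY_LEN)
        h->count++;
}

size_t ads1248_history_count(const ads1248_history_t *h)
{
    assert(h != NULL);
    return h->count;
}

/* age 0 is the newest sample; returns 0 on success, -1 if out of range. */
int ads1248_history_get(const ads1248_history_t *h, size_t age, int32_t *out)
{
    assert(h != NULL && out != NULL);
    if (age >= h->count)
        return -1;
    *out = h->values[(h->next + ADS1248_HISTORY_LEN - 1 - age)
                     % ADS1248_HISTORY_LEN];
    return 0;
}
```

Code outside the .c file can only reach the samples through these functions, because it never sees the layout of struct ads1248_history.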
It's common that structures in C are defined completely in the header, although they're totally opaque (FILE, for example), or only have some of their fields specified in the documentation.
C lacks private to prevent accidental access, but I consider this a minor problem: If a field isn't mentioned in the spec, why should someone try to access it? Have you ever accidentally accessed a member of a FILE? (It's probably better not to do things like having a published member foo and a non-published fooo which can easily be accessed by a small typo.) Some use conventions like giving them "unusual" names, for example, having a trailing underscore on private members.
Another way is the PIMPL idiom: Forward-declare the structure as an incomplete type and provide the complete declaration in the implementation file only. This may complicate debugging, and may have performance penalties due to fewer opportunities for inlining and an additional indirection, though this may be solvable with link-time optimization. A combination of both is also possible, declaring the public fields in the header along with a pointer to an incomplete structure type holding the private fields.
I would like this data to be non-accessible from the "outside" of my component (only through special functions in my component).
You can do it in this way (a big malloc including the data):
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
typedef struct {
uint32_t pin_reset;
uint32_t pin_drdy;
uint32_t pin_start;
volatile avr32_spi_t *spi_module;
uint8_t cs_id;
} ads1248_options_t;
void fn(ads1248_options_t *x)
{
int32_t *values = (int32_t *)(x + 1);
/* values are not accessible via a member of the struct */
values[0] = 10;
printf("%ld\n", (long)values[0]);
}
int main(void)
{
ads1248_options_t *x = malloc(sizeof(*x) + (sizeof(int32_t) * 100));
fn(x);
free(x);
return 0;
}
You could make a portion of your structure private like this.
object.h
struct object_public {
uint32_t public_item1;
uint32_t public_item2;
};
object.c
struct object {
struct object_public public;
uint32_t private_item1;
uint32_t *private_ptr;
};
A pointer to an object can be cast to a pointer to object_public because object_public is the first item in struct object. So the code outside of object.c will reference the object through a pointer to object_public. While the code within object.c references the object through a pointer to object. Only the code within object.c will know about the private members.
The program should not define or allocate an instance object_public because that instance won't have the private stuff appended to it.
The technique of including a struct as the first item in another struct is really a way for implementing single inheritance in C. I don't recall ever using it like this for encapsulation. But I thought I would throw the idea out there.
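A minimal sketch of how the two views could be wired together, assuming the object is always created by the module itself. object_create, object_get_secret and object_destroy are hypothetical names, and both files are shown in one translation unit.

```c
#include <assert.h> /* static_assert (C11) */
#include <stddef.h> /* offsetof */
#include <stdint.h>
#include <stdlib.h>

/* ---- object.h ---- */
struct object_public {
    uint32_t public_item1;
    uint32_t public_item2;
};
struct object_public *object_create(uint32_t secret);
uint32_t object_get_secret(const struct object_public *pub);
void object_destroy(struct object_public *pub);

/* ---- object.c ---- */
struct object {
    struct object_public public; /* must stay the first member */
    uint32_t private_item1;
};

/* Guards against someone reordering the members later. */
static_assert(offsetof(struct object, public) == 0,
              "public part must be the first member");

struct object_public *object_create(uint32_t secret)
{
    struct object *obj = calloc(1, sizeof *obj);
    if (obj == NULL)
        return NULL;
    obj->private_item1 = secret;
    return &obj->public; /* same address as obj itself */
}

uint32_t object_get_secret(const struct object_public *pub)
{
    /* Only valid for pointers that came from object_create(). */
    const struct object *obj = (const struct object *)pub;
    return obj->private_item1;
}

void object_destroy(struct object_public *pub)
{
    free((struct object *)pub);
}
```

Callers read and write the public items directly through the returned pointer, while the private item stays reachable only inside object.c.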
You can:
Make your whole ads1248_options_t an opaque type (as already discussed in other answers)
Make just the adc_values member an opaque type, like:
// in the header(.h)
typedef struct adc_values adc_values_t;
// in the code (.c)
struct adc_values {
int32_t *values;
};
Have a static array of arrays of values "parallel" to your ads1248_options_t and provide functions to access them. Like:
// in the header (.h)
int32_t get_adc_value(int id, int value_idx);
// in the code (.c)
static int32_t values[MAX_ADS][MAX_VALUES];
// or
static int32_t *values[MAX_ADS]; // malloc()-ate members somewhere
int32_t get_adc_value(int id, int value_idx) {
return values[id][value_idx];
}
If the user doesn't know the index to use, keep an index (id) in your ads1248_options_t.
Instead of a static array, you may provide some other way of allocating the value arrays "in parallel", but, again, need a way to identify which array belongs to which ADC, where its id is the simplest solution.

memcpy Inheritance-like structs - is it safe?

I have two structs I'm working with, and they are defined nearly identical. These are defined in header files that I cannot modify.
typedef struct
{
uint32_t property1;
uint32_t property2;
} CarV1;
typedef struct
{
uint32_t property1;
uint32_t property2;
/* V2 specific properties */
uint32_t property3;
uint32_t property4;
} CarV2;
In my code, I initialize the V2 struct at the top of my file, to cover all my bases:
static const CarV2 my_car = {
.property1 = value,
.property2 = value,
/* V2 specific properties */
.property3 = value,
.property4 = value
};
Later, I want to retrieve the values I have initialized and copy them into the struct to be returned from a function via void pointer. I sometimes want V2 properties of the car, and sometimes V1. How can I memcpy safely without having duplicate definitions/initializations? I'm fairly new to C, and it's my understanding that this is ugly and that the engineers who look at this code after me will not approve. What's a clean way to do this?
int get_properties(void *returned_car){
int version = get_version();
switch (version){
case V1:
{
CarV1 *car = returned_car;
memcpy(car, &my_car, sizeof(CarV1)); // is this safe? What's a better way?
break; // without this, control falls through into the V2 copy
}
case V2:
{
CarV2 *car = returned_car;
memcpy(car, &my_car, sizeof(CarV2));
break;
}
default:
return -1; // unknown version
}
return 0;
}
Yes, it's definitely possible to do what you're asking.
You can use a base struct member to implement inheritance, like this:
typedef struct
{
uint32_t property1;
uint32_t property2;
} CarV1;
typedef struct
{
CarV1 base;
/* V2 specific properties */
uint32_t property3;
uint32_t property4;
} CarV2;
In this case, you're eliminating the duplicate definitions. Of course, on a variable of type CarV2*, you can't reference the fields of the base directly - you'll have to do a small redirection, like this:
cv2p->base.property1 = 0;
To upcast to CarV1*, do this:
CarV1* cv1p = &(cv2p->base);
cv1p->property1 = 0;
You've written memcpy(&car, &my_car, sizeof(CarV1)). This looks like a mistake, because it's copying the data of the pointer variable (that is, the address of your struct, instead of the struct itself). Since car is already a pointer (CarV1*) and I'm assuming that so is my_car, you probably wanted to do this instead:
memcpy(car, my_car, sizeof(CarV1));
If my_car is CarV2* and car is CarV1* (or vice versa), then the above code is guaranteed to work by the C standard, because the first member of a struct is always at a zero offset and, therefore, the memory layout of those two for the first sizeof(CarV1) bytes will be identical.
The compiler is not allowed to align/pad that part differently (which I assume is what you meant about optimizing), because you've explicitly declared the first part of CarV2 to be a CarV1.
Since in your case you are stuck with identically defined structs that you can't change, you may find useful that the C standard defines a macro/special form called offsetof.
To be absolutely sure about your memory layouts, I'd advise that you put a series of checks in the initialization phase of your program that verify whether offsetof(CarV1, property1) is equal to offsetof(CarV2, property1), etc., for all common properties:
#include <stddef.h> // offsetof
#include <stdlib.h> // exit
void validateAlignment(void)
{
if (offsetof(CarV1, property1) != offsetof(CarV2, property1)) exit(-1);
if (offsetof(CarV1, property2) != offsetof(CarV2, property2)) exit(-1);
// and so on
}
This will stop the program from going ahead in case the compiler has done anything creative with the padding.
It also won't slow down your program's initialization because offsetof is actually calculated at compile time. So, with all the optimizations in place, the void validateAlignment(void) function should be optimized out completely (because a static analysis would show that the exit conditions are always false).
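Since offsetof yields an integer constant expression, the same checks can also be made at compile time with C11 static_assert, so that a mismatch stops the build rather than the program. A minimal sketch, with the two structs repeated from the question and car_v2_to_v1 as a made-up helper name:

```c
#include <assert.h> /* static_assert (C11) */
#include <stddef.h> /* offsetof */
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t property1;
    uint32_t property2;
} CarV1;

typedef struct {
    uint32_t property1;
    uint32_t property2;
    /* V2 specific properties */
    uint32_t property3;
    uint32_t property4;
} CarV2;

/* The common prefix must be laid out identically in both structs. */
static_assert(offsetof(CarV1, property1) == offsetof(CarV2, property1),
              "property1 offset differs");
static_assert(offsetof(CarV1, property2) == offsetof(CarV2, property2),
              "property2 offset differs");
static_assert(sizeof(CarV1) <= sizeof(CarV2),
              "CarV1 must not be larger than CarV2");

/* Copies the part of a CarV2 that a CarV1 consumer understands. */
void car_v2_to_v1(CarV1 *dst, const CarV2 *src)
{
    memcpy(dst, src, sizeof *dst);
}
```

With the asserts in place, the prefix copy is only compiled when the layouts actually agree.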
What you wrote will almost work, except that instead of memcpy(&car, ... you should just have memcpy (car, ..., but there is no reason to use memcpy in such a case. Rather, you should just copy each of the fields in a separate statement.
car->property1 = my_car.property1;
(is my_car a pointer or not? it's impossible to tell from the code fragment)
For the second case, I think you can just assign the entire struct: *car = my_car
There is no perfect solution, but one way is to use a union:
typedef union car_union
{
CarV1 v1;
CarV2 v2;
} Car;
That way the size will not differ when you do a memcpy. If the version is V1, the V2-specific parts will simply not be initialized.
In C and Objective-C, this is fine in practice. (In theory, the compiler must see the declaration of a union containing both structs as members).
In C++ (and Objective-C++), the language very carefully describes when this is safe and when it isn't. For example, if you start with
typedef struct {
public:
...
then the compiler is free to re-arrange where struct members are. If the struct uses no C++ features then you are safe.

Declare a variable that is Defined in a struct

Consider the following struct defined in ModuleA:
typedef struct{
int A;
int B;
int C[4];
}myStructType;
myStructType MyStruct;
If I wanted to use this struct from ModuleB, then I would declare the struct in the ModuleA header like this:
extern myStructType MyStruct;
So far, so good. Other modules can read and write MyStruct by including the Module A header file.
Now the question:
How can I declare only part of the struct in the Module A header file? For example, if I wanted ModuleB to be able to read and write MyStruct.C (or, to make things a bit easier, perhaps MyStruct.A or MyStruct.B), but not necessarily know that it's in a struct or know about elements A and B.
Edit: I should probably also specify that this will go in an embedded system which does basically all of its memory allocation at compile time, so we can be extremely confident at compile time that we know where MyStruct is located (and it's not going to move around).
Edit2: I'll also clarify that I'm not necessarily trying to prevent other modules from accessing parts of the struct, but rather, I'm attempting to allow other modules to access individual elements without having to do MyStruct.Whatever because other modules probably only care about a single element and not the whole structure.
You would have to encapsulate it, i.e. make a private variable such as:
static myStructType the_struct;
in some C file, and then provide an API to get access to the parts:
int * getC(void)
{
return the_struct.C;
}
this would then let other C files get access to an integer array by calling
int *some_c = getC();
some_c[0] = 4711;
or whatever. It can be made "tighter" by being more explicit about the length of the returned array of course, I aimed for the minimal solution.
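A "tighter" variant might report the array length along with the pointer, or hide the array completely behind a bounds-checked setter. This is only a sketch: the len parameter and setC are additions for illustration and not part of the minimal solution above.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int A;
    int B;
    int C[4];
} myStructType;

static myStructType the_struct; /* private to this file */

/* Returns the array and reports its element count through len. */
int *getC(size_t *len)
{
    assert(len != NULL);
    *len = sizeof the_struct.C / sizeof the_struct.C[0];
    return the_struct.C;
}

/* Bounds-checked alternative: returns 0 on success, -1 on a bad index. */
int setC(size_t index, int value)
{
    if (index >= sizeof the_struct.C / sizeof the_struct.C[0])
        return -1;
    the_struct.C[index] = value;
    return 0;
}
```

Other modules then never need to know that C lives inside a struct, or how many elements it has at compile time.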
While in theory there might be some problems with the cleanliness of this solution (e.g. structure alignment), in practice it usually works if it compiles and if it doesn't compile, you can alter the structures to make it compile:
#include <stddef.h>
#define C_ASSERT(expr) extern char CAssertExtern[(expr)?1:-1]
// You keep this definition private to module A (e.g. in a .c file):
typedef struct
{
int A;
int B;
int C[4];
} PrivateStruct;
// You expose this definition to all modules (in an .h file):
typedef struct
{
char reserved[2*sizeof(int)];
int C[4];
} PublicStruct;
// You put these in module A (in a .c file):
C_ASSERT(sizeof(PrivateStruct) == sizeof(PublicStruct));
C_ASSERT(offsetof(PrivateStruct,C) == offsetof(PublicStruct,C));
int main(void)
{
return 0;
}
In the public .h file you can lie to the world about the global variable type:
extern PublicStruct MyStruct; // It's "PrivateStruct MyStruct;" in module A
If the two structure definitions go out of sync, you get a compile-time error.
You will need to manually define the size of the reserved part of PublicStruct, perhaps by trial and error.
You get the idea.
To make the long story short — you can't. To make it a bit longer, you can't reliably do it.
You could try to use a kind of getter:
in ModuleA:
typedef struct{
int A;
int B;
int C[4];
}myStructType;
myStructType MyStruct;
int getA()
{
return MyStruct.A;
}
and so on.
Instead switch over to c++
You can't do exactly what you've described, but it's common to have a struct used as a header that is contiguous with a buffer whose internals are known only to a particular module.
It's fairly obvious what this struct is the header for, but it still works as an example:
typedef struct _FILE_NOTIFY_INFORMATION {
ULONG NextEntryOffset;
ULONG Action;
ULONG NameLength;
ULONG Name[1];
} FILE_NOTIFY_INFORMATION, *PFILE_NOTIFY_INFORMATION;
This struct (from Microsoft's NativeSDK) is designed to be the header of a variable-length buffer. All modules can work out how long the buffer is by looking at NameLength, but you could use this method to store anything in the buffer that goes with it. The contents might be known only to one particular module, with the others just using the length to copy the buffer around, etc.
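In portable C99 and later, the same header-plus-buffer layout is usually written with a flexible array member instead of a one-element array. A small sketch under that assumption; record, record_create and record_clone are hypothetical names:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Public header part: every module may read these fields. */
typedef struct {
    size_t payload_len;      /* number of bytes that follow */
    unsigned char payload[]; /* C99 flexible array member */
} record;

/* Allocates the header and the payload as one contiguous block. */
record *record_create(const void *data, size_t len)
{
    record *r = malloc(sizeof *r + len);
    if (r == NULL)
        return NULL;
    r->payload_len = len;
    if (len > 0)
        memcpy(r->payload, data, len);
    return r;
}

/* A module that knows nothing about the payload can still copy it. */
record *record_clone(const record *src)
{
    assert(src != NULL);
    return record_create(src->payload, src->payload_len);
}
```

Only the module that produced the payload needs to know what the bytes mean; everyone else treats them as an opaque block of payload_len bytes.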
If it is not for hiding but for structuring, then do structure it. For example like so:
moduleA.h:
typedef struct{
int A;
}myStructModuleAType;
extern myStructModuleAType myStructModuleA;
moduleA.c:
myStructModuleAType myStructModuleA;
moduleB.h:
typedef struct{
int B;
}myStructModuleBType;
extern myStructModuleBType myStructModuleB;
moduleB.c:
myStructModuleBType myStructModuleB;
main.h:
#include "moduleA.h"
#include "moduleB.h"
typedef struct{
myStructModuleAType * pmyStructModuleA;
myStructModuleBType * pmyStructModuleB;
int C[4];
}myStructType;
extern myStructType myStruct;
main.c:
#include "main.h"
myStructType myStruct = {
.pmyStructModuleA = &myStructModuleA,
.pmyStructModuleB = &myStructModuleB
};

Static allocation of opaque data types

Very often malloc() is absolutely not allowed when programming for embedded systems. Most of the time I'm pretty able to deal with this, but one thing irritates me: it keeps me from using so called 'opaque types' to enable data hiding. Normally I'd do something like this:
// In file module.h
typedef struct handle_t handle_t;
handle_t *create_handle();
void operation_on_handle(handle_t *handle, int an_argument);
void another_operation_on_handle(handle_t *handle, char etcetera);
void close_handle(handle_t *handle);
// In file module.c
struct handle_t {
int foo;
void *something;
int another_implementation_detail;
};
handle_t *create_handle() {
handle_t *handle = malloc(sizeof(struct handle_t));
// other initialization
return handle;
}
There you go: create_handle() performs a malloc() to create an 'instance'. A construction often used to prevent having to malloc() is to change the prototype of create_handle() like this:
void create_handle(handle_t *handle);
And then the caller could create the handle this way:
// In file caller.c
void i_am_the_caller() {
handle_t a_handle; // Allocate a handle on the stack instead of malloc()
create_handle(&a_handle);
// ... a_handle is ready to go!
}
But unfortunately this code is obviously invalid, the size of handle_t isn't known!
I never really found a solution to solve this in a proper way. I'd very like to know if anyone has a proper way of doing this, or maybe a complete different approach to enable data hiding in C (not using static globals in the module.c of course, one must be able to create multiple instances).
You can use the _alloca function. It is not standard, but as far as I know, nearly all common compilers implement it (as _alloca or alloca). Because it allocates from the stack frame of the function that calls it, invoking it at the call site (for example through a macro) takes the memory from the caller's stack.
// Header
typedef struct something something; // incomplete (opaque) type; an empty struct is not standard C
size_t get_size(void);
something* create_something(void* mem);
// Usage
something* ptr = create_something(_alloca(get_size())); // or define a macro.
// Implementation
size_t get_size() {
return sizeof(real_handle_type);
}
something* create_something(void* mem) {
real_handle_type* ptr = (real_handle_type*)mem;
// Fill out real_type
return (something*)mem;
}
You could also use some kind of object pool semi-heap: if you have a maximum number of simultaneously live objects, you can allocate all the memory for them statically and use a bitmask to track which ones are currently in use.
#include <stdint.h>
#define MAX_OBJECTS 32
static real_type objects[MAX_OBJECTS];
static uint32_t in_use; // one bit per object, so it needs at least MAX_OBJECTS bits
something* create_something(void) {
for (int i = 0; i < MAX_OBJECTS; i++) {
if (!(in_use & (UINT32_C(1) << i))) { // bit i clear: slot i is free
in_use |= UINT32_C(1) << i;
return (something*)&objects[i];
}
}
return NULL; // pool exhausted
}
My bit-shifting is a little off, been a long time since I've done it, but I hope that you get the point.
One way would be to add something like
#define MODULE_HANDLE_SIZE (4711)
to the public module.h header. Since that creates a worrying requirement of keeping this in sync with the actual size, the line is of course best auto-generated by the build process.
The other option is of course to actually expose the structure, but document it as being opaque and forbidding access through any other means than through the defined API. This can be made more clear by doing something like:
#include "module_private.h"
typedef struct
{
handle_private_t private;
} handle_t;
Here, the actual declaration of the module's handle has been moved into a separate header, to make it less obviously visible. A type declared in that header is then simply wrapped in the desired typedef name, making sure to indicate that it is private.
Functions inside the module that take handle_t * can safely access private as a handle_private_t value, since it's the first member of the public struct.
Unfortunately, I think the typical way to deal with this problem is by simply having the programmer treat the object as opaque - the full structure implementation is in the header and available, it's just the responsibility of the programmer to not use the internals directly, only through the APIs defined for the object.
If this isn't good enough, a few options might be:
use C++ as a 'better C' and declare the internals of the structure as private.
run some sort of pre-processor on the headers so that the internals of the structure are declared, but with unusable names. The original header, with good names, will be available to the implementation of the APIs that manage the structure. I've never seen this technique used - it's just an idea off the top of my head that might be possible, but seems like far more trouble than it's worth.
have your code that uses opaque pointers declare the statically allocated objects as extern (i.e., globals). Then have a special module that has access to the full definition of the object actually define these objects. Since only the 'special' module has access to the full definition, the normal use of the opaque object remains opaque. However, now you have to rely on your programmers to not abuse the fact that these objects are global. You have also increased the chance of naming collisions, so those need to be managed (probably not a big problem, except that it might occur unintentionally - ouch!).
I think overall, just relying on your programmers to follow the rules for the use of these objects might be the best solution (though using a subset of C++ isn't bad either in my opinion). Depending on your programmers to follow the rules of not using the structure internals isn't perfect, but it's a workable solution that is in common use.
One solution is to create a static pool of struct handle_t objects, and provide them as necessary. There are many ways to achieve that, but a simple illustrative example follows:
// In file module.c
struct handle_t
{
int foo;
void* something;
int another_implementation_detail;
int in_use ;
} ;
static struct handle_t handle_pool[MAX_HANDLES] ;
handle_t* create_handle()
{
int h ;
handle_t* handle = 0 ;
for( h = 0; handle == 0 && h < MAX_HANDLES; h++ )
{
if( handle_pool[h].in_use == 0 )
{
handle = &handle_pool[h] ;
handle->in_use = 1 ; // mark the slot as taken so it is not handed out again
}
}
// other initialization (only if handle != 0)
return handle;
}
void release_handle( handle_t* handle )
{
handle->in_use = 0 ;
}
There are faster ways of finding an unused handle. You could, for example, keep a static index that increments each time a handle is allocated and wraps around when it reaches MAX_HANDLES; this would be faster for the typical situation where several handles are allocated before any one is released. For a small number of handles, however, this brute-force search is probably adequate.
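A possible sketch of the wrap-around index idea, under a few assumptions: MAX_HANDLES is set to 4 only for illustration, the struct is trimmed to two members, and the name next_index is hypothetical. The search starts where the previous allocation left off instead of always starting at slot 0:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HANDLES 4               /* hypothetical pool size */

typedef struct handle_t handle_t;   /* what module.h is assumed to declare */

struct handle_t {
    int foo;
    int in_use;
};

static struct handle_t handle_pool[MAX_HANDLES];
static int next_index = 0;          /* slot where the next search begins */

handle_t *create_handle(void)
{
    for (int n = 0; n < MAX_HANDLES; n++) {
        int h = (next_index + n) % MAX_HANDLES;   /* wrap around the pool */
        if (!handle_pool[h].in_use) {
            handle_pool[h].in_use = 1;
            next_index = (h + 1) % MAX_HANDLES;
            return &handle_pool[h];
        }
    }
    return NULL;                    /* every slot is taken */
}

void release_handle(handle_t *handle)
{
    assert(handle != NULL);
    handle->in_use = 0;
}
```

When handles are mostly allocated in a run, the first slot probed is usually free, so the loop exits on its first iteration.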
Of course the handle itself need no longer be a pointer but could be a simple index into the hidden pool. This would enhance data hiding and protection of the pool from external access.
So the header would have:
typedef int handle_t ;
and the code would change as follows:
// In file module.c
struct handle_s
{
int foo;
void* something;
int another_implementation_detail;
int in_use ;
} ;
static struct handle_s handle_pool[MAX_HANDLES] ;
handle_t create_handle()
{
int h ;
handle_t handle = -1 ;
for( h = 0; handle == -1 && h < MAX_HANDLES; h++ )
{
if( handle_pool[h].in_use == 0 )
{
handle = h ;
handle_pool[h].in_use = 1 ; // mark the slot as taken
}
}
// other initialization (only if handle != -1)
return handle;
}
void release_handle( handle_t handle )
{
handle_pool[handle].in_use = 0 ;
}
Because the handle returned is no longer a pointer to the internal data, an inquisitive or malicious user cannot gain access to it through the handle.
Note that you may need to add some thread-safety mechanisms if you are getting handles in multiple threads.
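One way such a mechanism could look for the index-based pool, assuming a C11 compiler with <stdatomic.h>. This compare-and-exchange variant is only an illustration, not necessarily what was intended above; MAX_HANDLES is set to 4 for the example and the struct is trimmed down:

```c
#include <assert.h>
#include <stdatomic.h>

#define MAX_HANDLES 4          /* hypothetical pool size */

typedef int handle_t;

struct handle_s {
    int foo;
    atomic_int in_use;         /* 0 = free, 1 = taken */
};

/* Static storage is zero-initialized, so every slot starts out free. */
static struct handle_s handle_pool[MAX_HANDLES];

handle_t create_handle(void)
{
    for (int h = 0; h < MAX_HANDLES; h++) {
        int expected = 0;
        /* Atomically flip in_use from 0 to 1; only one thread can win a slot. */
        if (atomic_compare_exchange_strong(&handle_pool[h].in_use, &expected, 1)) {
            return h;
        }
    }
    return -1;                 /* pool exhausted */
}

void release_handle(handle_t handle)
{
    assert(handle >= 0 && handle < MAX_HANDLES);
    atomic_store(&handle_pool[handle].in_use, 0);
}
```

This only protects the allocation itself; whatever the module does with a slot's other members still needs its own synchronization if handles are shared between threads.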
I faced a similar problem in implementing a data structure in which the header of the data structure, which is opaque, holds all the various data that needs to be carried over from operation to operation.
Since re-initialization might cause a memory leak, I wanted to make sure that the data structure implementation itself never actually overwrites a pointer to heap-allocated memory.
What I did is the following:
/**
* In order to allow the client to place the data structure header on the
* stack we need data structure header size. [1/4]
**/
#define CT_HEADER_SIZE ( (sizeof(void*) * 2) \
+ (sizeof(int) * 2) \
+ (sizeof(unsigned long) * 1) \
)
/**
* After the size has been produced, a type which is a size *alias* of the
* header can be created. [2/4]
**/
struct header { char h_sz[CT_HEADER_SIZE]; };
typedef struct header data_structure_header;
/* In all the public interfaces the size alias is used. [3/4] */
bool ds_init_new(data_structure_header *ds /* , ...*/);
In the implementation file:
struct imp_header {
void *ptr1,
*ptr2;
int i,
max;
unsigned long total;
};
/* implementation proper */
static bool imp_init_new(struct imp_header *head /* , ...*/)
{
return false;
}
/* public interface */
bool ds_init_new(data_structure_header *ds /* , ...*/)
{
int i;
/* only accept a zero init'ed header */
for(i = 0; i < CT_HEADER_SIZE; ++i) {
if(ds->h_sz[i] != 0) {
return false;
}
}
/* just in case we forgot something */
assert(sizeof(data_structure_header) == sizeof(struct imp_header));
/* Explicit conversion is used from the public interface to the
* implementation proper. [4/4]
*/
return imp_init_new( (struct imp_header *)ds /* , ...*/);
}
client side:
int foo()
{
data_structure_header ds = { 0 };
/* report failure to the caller instead of falling off the end */
return ds_init_new(&ds /*, ...*/) ? 0 : -1;
}
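The assert in ds_init_new matters because a sum of member sizes is not guaranteed to equal the size of the struct: the compiler may insert padding between or after members. A small illustration, using a hypothetical struct padded that is not part of the code above:

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

struct padded {       /* hypothetical example type */
    char c;
    int  i;           /* usually needs stricter alignment than char */
};

/* Size computed the same way as CT_HEADER_SIZE: by summing the members. */
#define SUMMED_SIZE (sizeof(char) + sizeof(int))

/* Padding can only add bytes, so the real size is never smaller than the sum. */
static_assert(sizeof(struct padded) >= SUMMED_SIZE,
              "a struct cannot be smaller than its members");

/* Number of padding bytes the compiler added (often 3 where int is 4 bytes). */
size_t padding_bytes(void)
{
    return sizeof(struct padded) - SUMMED_SIZE;
}
```

If the member list of the real header ever produces padding, the summed macro comes out too small, which is exactly the case the size check is there to catch.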
To expand on some old discussion in comments here, you can do this by providing an allocator function as part of the constructor call.
Given some opaque type typedef struct opaque opaque;, define a function type for an allocator function: typedef void* alloc_t (size_t bytes);. In this case I used the same signature as malloc/alloca for compatibility purposes.
The constructor implementation would look something like this:
struct opaque
{
int foo; // some private member
};
opaque* opaque_construct (alloc_t* alloc, int some_value)
{
opaque* obj = alloc(sizeof *obj);
if(obj == NULL) { return NULL; }
// initialize members
obj->foo = some_value;
return obj;
}
That is, the allocator gets provided the size of the opaque object from inside the constructor, where it is known.
For static storage allocation, as is commonly done in embedded systems, we can create a simple static memory pool like this:
#define MAX_SIZE 100
static uint8_t mempool [MAX_SIZE];
static size_t mempool_size=0;
void* static_alloc (size_t size)
{
uint8_t* result;
if(mempool_size + size > MAX_SIZE)
{
return NULL;
}
result = &mempool[mempool_size];
mempool_size += size;
return result;
}
(This might be allocated in .bss or in your own custom section, whatever is preferred.)
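Note that the pool above hands out consecutive bytes, so an object placed after an odd-sized allocation may end up misaligned. A possible variant that rounds every request up to the strictest fundamental alignment is sketched below, assuming C11 (alignas, alignof, max_align_t); the MAX_SIZE value of 128 is hypothetical:

```c
#include <assert.h>
#include <stdalign.h>  /* alignas, alignof (C11) */
#include <stddef.h>    /* max_align_t, size_t */
#include <stdint.h>

#define MAX_SIZE 128   /* hypothetical pool size in bytes */

/* The pool itself starts on a maximally aligned boundary. */
static alignas(max_align_t) uint8_t mempool[MAX_SIZE];
static size_t mempool_size = 0;

void *static_alloc(size_t size)
{
    const size_t align = alignof(max_align_t);
    /* Round the request up to a multiple of the alignment. */
    size_t rounded = (size + align - 1) / align * align;

    /* Reject requests that overflowed while rounding or that do not fit. */
    if (rounded < size || rounded > MAX_SIZE - mempool_size) {
        return NULL;
    }
    void *result = &mempool[mempool_size];
    mempool_size += rounded;
    return result;
}
```

The cost is some wasted bytes per allocation, which may or may not be acceptable on a small microcontroller.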
Now the caller can decide how each object is allocated and all objects in for example a resource-constrained microcontroller can share the same memory pool. Usage:
opaque* obj1 = opaque_construct(malloc, 123);
opaque* obj2 = opaque_construct(static_alloc, 123);
// Note: passing alloca here is not safe. Memory obtained from alloca is
// released when the function that called it (opaque_construct) returns.
This is useful for saving memory. If you have multiple drivers in a microcontroller application and it makes sense to hide each of them behind a HAL, they can now share the same memory pool without the driver implementer having to speculate about how many instances of each opaque type will be needed.
Say for example that we have a generic HAL for the UART, SPI and CAN hardware peripherals. Rather than each driver implementation providing its own memory pool, they can all share a centralized section. Otherwise I would normally solve that by having a constant such as UART_MEMPOOL_SIZE 5 exposed in uart.h, so that the user can change it according to how many UART objects they need (like the number of UART hardware peripherals present on some MCU, or the number of CAN bus message objects required for some CAN implementation, etc.). Using #define constants is an unfortunate design since we typically don't want application programmers to mess around with provided standardized HAL headers.
I'm a little confused why you say you can't use malloc(). Obviously on an embedded system you have limited memory and the usual solution is to have your own memory manager which mallocs a large memory pool and then allocates chunks of this out as needed. I've seen various different implementations of this idea in my time.
To answer your question though, why don't you simply statically allocate a fixed-size array of them in module.c, add an "in-use" flag, and then have create_handle() simply return the pointer to the first free element?
As an extension to this idea, the "handle" could then be an integer index rather than the actual pointer which avoids any chance of the user trying to abuse it by casting it to their own definition of the object.
The least grim solution I've seen to this has been to provide an opaque struct for the caller's use, which is large enough, plus maybe a bit, along with a mention of the types used in the real struct, to ensure that the opaque struct will be aligned well enough compared to the real one:
struct Thing {
union {
char data[16];
uint32_t b;
uint8_t a;
} opaque;
};
typedef struct Thing Thing;
Then functions take a pointer to one of those:
void InitThing(Thing *thing);
void DoThingy(Thing *thing,float whatever);
Internally, not exposed as part of the API, there is a struct that has the true internals:
struct RealThing {
uint32_t private1,private2,private3;
uint8_t private4;
};
typedef struct RealThing RealThing;
(This one just has uint32_t and uint8_t; that's the reason for the appearance of these two types in the union above.)
Plus probably a compile-time assert to make sure that RealThing's size doesn't exceed that of Thing:
typedef char CheckRealThingSize[sizeof(RealThing)<=sizeof(Thing)?1:-1];
Then each function in the library does a cast on its argument when it's going to use it:
void InitThing(Thing *thing) {
RealThing *t=(RealThing *)thing;
/* stuff with *t */
}
With this in place, the caller can create objects of the right size on the stack, and call functions against them, the struct is still opaque, and there's some checking that the opaque version is large enough.
One potential issue is that fields could be inserted into the real struct that mean it requires an alignment that the opaque struct doesn't, and this won't necessarily trip the size check. Many such changes will change the struct's size, so they'll get caught, but not all. I'm not sure of any solution to this.
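With a C11 compiler, one way to catch the alignment case is to check alignment the same way as size. This is a sketch that assumes static_assert and alignof are available from <assert.h> and <stdalign.h>; the helper thing_layout_ok is hypothetical and only there so the check can also be exercised at run time:

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignof (C11) */
#include <stdint.h>

struct Thing {
    union {
        char data[16];
        uint32_t b;
        uint8_t a;
    } opaque;
};
typedef struct Thing Thing;

struct RealThing {
    uint32_t private1, private2, private3;
    uint8_t private4;
};
typedef struct RealThing RealThing;

/* Both checks fail the build instead of misbehaving at run time. */
static_assert(sizeof(RealThing) <= sizeof(Thing), "Thing is too small");
static_assert(alignof(RealThing) <= alignof(Thing), "Thing is under-aligned");

/* Hypothetical helper mirroring the compile-time checks. */
int thing_layout_ok(void)
{
    return sizeof(RealThing) <= sizeof(Thing)
        && alignof(RealThing) <= alignof(Thing);
}
```

If someone later adds, say, a double to RealThing without adding it to the union, the second assertion is the one that would be expected to fire on platforms where double is more strictly aligned than uint32_t.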
Alternatively, if you have a special public-facing header(s) that the library never includes itself, then you can probably (subject to testing against the compilers you support...) just write your public prototypes with one type and your internal ones with the other. It would still be a good idea to structure the headers so that the library sees the public-facing Thing struct somehow, though, so that its size can be checked.
It is simple: put the structs in a privateTypes.h header file. The type will no longer be opaque, but it will still be private to the programmer, since it lives in a private file.
An example here:
Hiding members in a C struct
This is an old question, but since it's also biting me, I wanted to provide here a possible answer (which I'm using).
So here is an example :
// file.h
typedef struct { size_t space[3]; } publicType;
int doSomething(publicType* object);
// file.c
typedef struct { unsigned var1; int var2; size_t var3; } privateType;
int doSomething(publicType* object)
{
privateType* obPtr = (privateType*) object;
(...)
}
Advantages :
publicType can be allocated on stack.
Note that correct underlying type must be selected in order to ensure proper alignment (i.e. don't use char).
Note also that sizeof(publicType) >= sizeof(privateType).
I suggest a static assert to make sure this condition is always checked.
As a final note, if you believe your structure may evolve later on, don't hesitate to make the public type a bit bigger, to keep room for future expansions without breaking ABI.
Disadvantage :
The casting from public to private type can trigger strict aliasing warnings.
I discovered later on that this method has similarities with struct sockaddr within BSD socket, which meets basically the same problem with strict aliasing warnings.
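A sketch of the suggested static assert, assuming C11, together with one possible way around the aliasing warning: copying the bytes with memcpy instead of casting the pointer. The increment of var2 is a hypothetical stand-in for the elided body of doSomething:

```c
#include <assert.h>    /* static_assert (C11) */
#include <stdalign.h>  /* alignof (C11) */
#include <stddef.h>
#include <string.h>

/* file.h */
typedef struct { size_t space[3]; } publicType;
int doSomething(publicType *object);

/* file.c */
typedef struct { unsigned var1; int var2; size_t var3; } privateType;

static_assert(sizeof(publicType) >= sizeof(privateType),
              "publicType is too small");
static_assert(alignof(publicType) >= alignof(privateType),
              "publicType is under-aligned");

int doSomething(publicType *object)
{
    privateType priv;
    memcpy(&priv, object, sizeof priv);   /* copy in instead of casting */
    priv.var2 += 1;                       /* hypothetical work on private fields */
    memcpy(object, &priv, sizeof priv);   /* copy the result back */
    return priv.var2;
}
```

The copies trade a little speed for code that does not depend on how the compiler treats aliasing; for a struct this small the difference is likely to be negligible.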

Resources