Initializing a const array inside a struct - c

#define LENGTH 6
typedef char data_t[LENGTH];
struct foo {
const data_t data;
...
};
...
void bar(data_t data) {
printf("%.6s\n", data);
struct foo myfoo = {*data};
printf("%.6s\n", myfoo.data);
}
I'm trying to have this struct which holds directly the data I'm interested in, sizeof(foo) == 6+the rest, not sizeof(foo) == sizeof(void*)+the rest. However I can't find a way to initialize a struct of type foo with a data_t. I think maybe I could remove the const modifier from the field and use memcpy but I like the extra safety and clarity.
I don't get any compile errors but when I run the code I get
123456
1??
so the copy didn't work properly I think.
This is for an arduino (or similar device) so I'm trying to keep it to very portable code.
Is it just not possible?
EDIT: removing the const modifier on the data_t field doesn't seem to help.

It is possible to do this, for some cost >=0.
typedef struct
{
char c[LENGTH];
} data_t; // this struct is freely copyable
struct foo
{
const data_t data; // but this data member is not
int what;
};
void foo (char* x) {
data_t d; // declare freely copyable struct instance
memcpy(d.c, x, sizeof(d.c)); // memcpy it
struct foo foo = { d, 42 }; // initialise struct instance with const member
...
}
Some compilers (e.g. clang) are even able to optimise away the redundant copying (from x to d.c and then from d to foo.data ⇒ from x straight to foo.data). Others (gcc I'm looking at you) don't seem to be able to achieve this.
If you pass around pointers to data_t rather than straight char pointers, you won't need this additional memcpy step. OTOH in order to access the char array inside foo you need another level of member access (.data.c instead of just .data; this has no runtime cost though).

It's impossible to do it in a standard compliant way.
Due to its being const, const char data[6]; must be initialized to be usable, and it may only be initialized statically (static objects with no initializer get automatically zeroed), with a string literal, or with a brace-enclosed initializer list. You cannot initialize it with a pointer or another array.
If I were you, I would get rid of the const, document that .data shouldn't be changed post-initialization, and then use memcpy to initialize it.
(const on struct members doesn't work very well in my opinion. It effectively prevents you from being able to have initializer functions, and while C++ gets around the problem a little bit by having special language support for its constructor functions, the problem still remains if the const members are arrays).
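The memcpy-based initialization suggested in this answer could look roughly like the following. This is only a minimal sketch: the helper name foo_init and the int member what (standing in for "the rest" of the struct) are assumptions made for illustration, not names from the question.

```c
#include <assert.h>
#include <string.h>

#define LENGTH 6
typedef char data_t[LENGTH];

struct foo {
    data_t data;   /* not const: by convention, do not modify after foo_init() */
    int what;      /* hypothetical stand-in for the remaining members */
};

/* Hypothetical initializer: copies LENGTH bytes from src into the struct.
   An array argument decays to a pointer, so src must point to at least
   LENGTH readable bytes. */
static void foo_init(struct foo *f, const char *src, int what)
{
    assert(f != NULL && src != NULL);
    memcpy(f->data, src, sizeof f->data);
    f->what = what;
}
```

After initialization, handing the object around as a const struct foo * gives callers read-only access, which recovers some of the safety that the const member was meant to provide.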

Related

Is a struct copied when a stack-variable is initialized by a result of a function call?

Given I'll return a large struct in a function like here:
#include <stdio.h>
// this is a large struct
struct my_struct {
int x[64];
int y[64];
int z[64];
};
struct my_struct get_my_struct_from_file(const char *filename) {
int tmp1, tmp2; // some tmp. variables
struct my_struct u;
// ... load values from filename ...
return u;
}
int main() {
struct my_struct res = get_my_struct_from_file("tmp.txt"); // <-- here
printf("x[0] = %d\n", res.x[0]);
// ... print all values ...
}
At the place marked by here, do I have to assume that this large struct is copied or is it likely that the compiler does something to avoid this?
Thank you
… do I have to assume that this large struct is copied…
No, of course you do not have to make that assumption. Nobody requires you to make that assumption, and it would be unwise to adopt the statement as an assumption rather than deriving it from known information, such as compiler documentation or inspection of the generated assembly code.
In the specific code you show, it is likely good compilers will optimize so that the structure is not copied. (Testing with Apple Clang 11 confirms it does this optimization.) But that is likely overly simplified code. If a call to get_my_struct_from_file appears in a translation unit separate from its definition, the compiler will not know what get_my_struct_from_file is accessing. If the destination object, res in this example, has had its address previously passed to some other routine in some other translation unit, then the compiler cannot know that other routine did not stash the address somewhere and that get_my_struct_from_file is not using it. So the compiler would have to treat the structure returned by get_my_struct_from_file and the structure the return value is being assigned to as separate; it could not coalesce them to avoid the copy.
To ensure the compiler does what you want, simply tell it what you want it to do. Write the code so that the function puts the results directly in the structure you want to put it in:
void get_my_struct_from_file(struct my_struct *result, const char *filename)
{
…
}
...
get_my_struct_from_file(&res, "tmp.txt");
At the place marked by here, do I have to assume that this large struct is copied or is it likely that the compiler does something to avoid this?
Semantically, the structure is copied from the function's local variable to the caller's variable. These are distinct objects, and just like objects of other types, setting one structure equal to another requires copying from the representation of one to the representation of the other.
The only way to avoid a copy would be for the compiler to treat the local variable as an alias for the caller's structure, but that would be wrong in the general case. Such aliasing can easily produce observably different behavior than would occur without.
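To illustrate why such aliasing is not valid in general, here is a small sketch. It uses an assignment to an existing global object rather than an initialization, and the names pair, g, swapped and swap_global are all hypothetical. If the local u were simply an alias for g, the first write would clobber a value that is read on the next line.

```c
#include <assert.h>

struct pair { int a; int b; };

static struct pair g = { 1, 2 };

/* Builds its result in a local while reading g. If u were merely an alias
   for g, the write to u.a would overwrite g.a before it is read below. */
static struct pair swapped(void)
{
    struct pair u;
    u.a = g.b;
    u.b = g.a;
    return u;
}

/* Assigns the function result back to g and checks that the two members
   really were exchanged, which requires u and g to be distinct objects. */
static void swap_global(void)
{
    struct pair old = g;
    g = swapped();
    assert(g.a == old.b && g.b == old.a);
}
```

A compiler may still skip the copy whenever it can prove that the difference is not observable, which is the case in the simple example from the question.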
It is possible that in some specific cases, the compiler can indeed avoid the copy, but if you want to ensure that no copying happens then you should set up the wanted aliasing explicitly:
void get_my_struct_from_file(const char *filename, struct my_struct *u) {
int tmp1, tmp2; // some tmp. variables
// ... load values from filename into *u
}
int main() {
struct my_struct res = { 0 };
get_my_struct_from_file("tmp.txt", &res);
printf("x[0] = %d\n", res.x[0]);
// ... print all values ...
}

memcpy Inheritance-like structs - is it safe?

I have two structs I'm working with, and they are defined nearly identical. These are defined in header files that I cannot modify.
typedef struct
{
uint32_t property1;
uint32_t property2;
} CarV1;
typedef struct
{
uint32_t property1;
uint32_t property2;
/* V2 specific properties */
uint32_t property3;
uint32_t property4;
} CarV2;
In my code, I initialize the V2 struct at the top of my file, to cover all my bases:
static const CarV2 my_car = {
.property1 = value,
.property2 = value,
/* V2 specific properties */
.property3 = value,
.property4 = value
};
Later, I want to retrieve the values I have initialized and copy them into the struct to be returned from a function via void pointer. I sometimes want V2 properties of the car, and sometimes V1. How can I memcpy safely without having duplicate definitions/initializations? I'm fairly new to C, and its my understanding that this is ugly and engineers to follow me in looking at this code will not approve. What's a clean way to do this?
int get_properties(void *returned_car){
int version = get_version();
switch (version){
case V1:
{
CarV1 *car = returned_car;
memcpy(car, &my_car, sizeof(CarV1)); // is this safe? What's a better way?
break; // without this, control falls through into the V2 case
}
case V2:
{
CarV2 *car = returned_car;
memcpy(car, &my_car, sizeof(CarV2));
break;
}
}
}
Yes, it's definitely possible to do what you're asking.
You can use a base struct member to implement inheritance, like this:
typedef struct
{
uint32_t property1;
uint32_t property2;
} CarV1;
typedef struct
{
CarV1 base;
/* V2 specific properties */
uint32_t property3;
uint32_t property4;
} CarV2;
In this case, you're eliminating the duplicate definitions. Of course, on a variable of type CarV2*, you can't reference the fields of the base directly - you'll have to do a small redirection, like this:
cv2p->base.property1 = 0;
To upcast to CarV1*, do this:
CarV1* cv1p = &(cv2p->base);
cv1p->property1 = 0;
You've written memcpy(&car, &my_car, sizeof(CarV1)). This looks like a mistake, because it's copying the data of the pointer variable (that is, the address of your struct, instead of the struct itself). Since car is already a pointer (CarV1*) and I'm assuming that so is my_car, you probably wanted to do this instead:
memcpy(car, my_car, sizeof(CarV1));
If my_car is CarV2* and car is CarV1* (or vice versa), then the above code is guaranteed to work by the C standard, because the first member of a struct is always at a zero offset and, therefore, the memory layout of those two for the first sizeof(CarV1) bytes will be identical.
The compiler is not allowed to align/pad that part differently (which I assume is what you meant about optimizing), because you've explicitly declared the first part of CarV2 to be a CarV1.
Since in your case you are stuck with identically defined structs that you can't change, you may find it useful that the C standard defines a macro called offsetof (in <stddef.h>).
To be absolutely sure about your memory layouts, I'd advise that you put a series of checks in the initialization phase of your program that verify that offsetof(CarV1, property1) is equal to offsetof(CarV2, property1), and so on for all common properties:
void validateAlignment(void)
{
if (offsetof(CarV1, property1) != offsetof(CarV2, property1)) exit(-1);
if (offsetof(CarV1, property2) != offsetof(CarV2, property2)) exit(-1);
// and so on
}
This will stop the program from going ahead in case the compiler has done anything creative with the padding.
It also won't slow down your program's initialization because offsetof is actually calculated at compile time. So, with all the optimizations in place, the void validateAlignment(void) function should be optimized out completely (because a static analysis would show that the exit conditions are always false).
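If a C11 compiler is available, the same layout checks can be expressed at compile time with _Static_assert, so that a mismatch stops the build rather than the running program. The following is a sketch under that assumption; the typedefs mirror the ones in the question, and the helper copy_v1_part is a hypothetical name.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof */
#include <stdint.h>
#include <string.h>

typedef struct {
    uint32_t property1;
    uint32_t property2;
} CarV1;

typedef struct {
    uint32_t property1;
    uint32_t property2;
    /* V2 specific properties */
    uint32_t property3;
    uint32_t property4;
} CarV2;

/* Compile-time layout checks (C11): the build fails if any of these is false. */
_Static_assert(offsetof(CarV1, property1) == offsetof(CarV2, property1),
               "property1 offsets differ");
_Static_assert(offsetof(CarV1, property2) == offsetof(CarV2, property2),
               "property2 offsets differ");
_Static_assert(sizeof(CarV1) <= sizeof(CarV2), "CarV1 is larger than CarV2");

/* Hypothetical helper: copies the leading, V1-compatible part of a CarV2. */
static void copy_v1_part(CarV1 *dst, const CarV2 *src)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, sizeof *dst);
}
```

Because the checks are declarations rather than statements, they need no function to live in and cost nothing at run time.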
What you wrote will almost work, except that instead of memcpy(&car, ... you should just have memcpy (car, ..., but there is no reason to use memcpy in such a case. Rather, you should just copy each of the fields in a separate statement.
car->property1 = my_car.property1;
(is my_car a pointer or not? it's impossible to tell from the code fragment)
For the second case, I think you can just assign the entire struct: *car = my_car
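Put together, the field-by-field copy for V1 and the whole-struct assignment for V2 might look like the sketch below. Several things here are assumptions made for illustration: my_car is taken to be an object rather than a pointer, the version is passed in as a parameter instead of coming from get_version(), and the enum values, the placeholder property values and the return codes are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t property1;
    uint32_t property2;
} CarV1;

typedef struct {
    uint32_t property1;
    uint32_t property2;
    uint32_t property3;
    uint32_t property4;
} CarV2;

enum car_version { V1 = 1, V2 = 2 };   /* hypothetical version codes */

static const CarV2 my_car = { 10u, 20u, 30u, 40u };   /* placeholder values */

/* Returns 0 on success, -1 for an unknown version (assumed convention). */
static int get_properties(void *returned_car, enum car_version version)
{
    assert(returned_car != NULL);
    switch (version) {
    case V1: {
        CarV1 *car = returned_car;
        car->property1 = my_car.property1;   /* copy only the shared fields */
        car->property2 = my_car.property2;
        return 0;
    }
    case V2: {
        CarV2 *car = returned_car;
        *car = my_car;                       /* plain struct assignment */
        return 0;
    }
    }
    return -1;
}
```

Neither branch depends on the two structs sharing a memory layout, which is the main advantage over memcpy here.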
There is no perfect solution, but one way is to use a union:
typedef union car_union
{
CarV1 v1;
CarV2 v2;
} Car;
That way the size will not differ when you do a memcpy; for version V1, the V2-specific parts will simply not be initialized.
In C and Objective-C, this is fine in practice. (In theory, the compiler must see the declaration of a union containing both structs as members).
In C++ (and Objective-C++), the language very carefully describes when this is safe and when it isn't. For example, if you start with
typedef struct {
public:
...
then the compiler is free to re-arrange where struct members are. If the struct uses no C++ features then you are safe.

What does the code below mean, in regards to structs in C?

I'm really new to C programming and I'm still trying to understand the concept of using pointers and using typedef structs.
I have this code snippet below that I need to use in a program:
typedef struct
{
char* firstName;
char* lastName;
int id;
float mark;
}* pStudentRecord;
I'm not exactly sure what this does - to me it seems similar to using interfaces in Objective-C, but I don't think that's the case.
And then I have this line
pStudentRecord* g_ppRecords;
I basically need to add several pStudentRecord to g_ppRecords based on a number. I understand how to create and allocate memory for an object of type pStudentRecord, but I'm not sure how to actually add multiple objects to g_ppRecords.
This defines a pointer to the struct described within the curly braces. Here is a simpler example:
#include <stdio.h>
#include <stdlib.h>
typedef struct {
int x;
int y;
}Point,* pPoint;
int main(void) {
Point point = {4,5};
pPoint point_ptr = &point;
printf("%d - %d\n",point.x,point_ptr->x);
pPoint second_point_ptr = malloc(sizeof(Point));
if (second_point_ptr != NULL) { // malloc can fail
second_point_ptr->x = 5;
free(second_point_ptr);
}
}
The first declares an unnamed struct, and a type pStudentRecord that is a pointer to it. The second declares g_ppRecords to be a pointer to a pStudentRecord. In other words, a pointer to a pointer to a struct.
It's probably easier to think of the second as an "array of pointers". As such, g_ppRecords[0] may hold one pStudentRecord and g_ppRecords[1] another one (each of which, in turn, points to a record struct).
In order to add to it, you will need to know how it stores the pointers, that is, how one can tell how many pointers are stored in it. Either there is a size stored somewhere, which for size N means that at least N * sizeof(pStudentRecord) bytes of memory are allocated and g_ppRecords[0] through g_ppRecords[N-1] hold the N items. Or it is NULL-terminated, which for size N means that at least (N+1) * sizeof(pStudentRecord) bytes are allocated, g_ppRecords[0] through g_ppRecords[N-1] hold the N items, and g_ppRecords[N] holds NULL, marking the end of the array.
After this, it should be straightforward to create or add to a g_ppRecords.
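As a rough sketch of the "size stored somewhere" variant: the counter g_numRecords and the helpers create_records and free_records below are hypothetical names, and the question does not say which storage scheme the assignment actually expects.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct
{
    char* firstName;
    char* lastName;
    int id;
    float mark;
}* pStudentRecord;

static pStudentRecord* g_ppRecords;   /* array of pointers to records */
static size_t g_numRecords;           /* hypothetical: element count kept alongside */

/* Allocates count zero-filled records. Returns 0 on success, -1 on failure. */
static int create_records(size_t count)
{
    size_t i;
    assert(count > 0);
    g_ppRecords = malloc(count * sizeof *g_ppRecords);
    if (g_ppRecords == NULL)
        return -1;
    for (i = 0; i < count; i++) {
        /* sizeof **g_ppRecords is the size of the unnamed struct itself */
        g_ppRecords[i] = calloc(1, sizeof **g_ppRecords);
        if (g_ppRecords[i] == NULL) {
            while (i > 0)                 /* undo the allocations made so far */
                free(g_ppRecords[--i]);
            free(g_ppRecords);
            g_ppRecords = NULL;
            return -1;
        }
        g_ppRecords[i]->id = (int)i;
    }
    g_numRecords = count;
    return 0;
}

/* Releases every record and then the pointer array. */
static void free_records(void)
{
    size_t i;
    for (i = 0; i < g_numRecords; i++)
        free(g_ppRecords[i]);
    free(g_ppRecords);
    g_ppRecords = NULL;
    g_numRecords = 0;
}
```

Since the struct has no name of its own, sizeof **g_ppRecords is a convenient way to get its size without spelling out a type.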
A struct is a compound data type, meaning that it's a variable which contains other variables. You're familiar with Objective C, so you might think of it as being a tiny bit like a 'data only' class; that is, a class with no methods. It's a way to store related information together that you can pass around as a single unit.
Typedef is a way for you to give your own names to data types, as synonyms for existing types in C. It makes code more readable and documents your intent. Keep in mind that a typedef only creates an alias, not a distinct type, so the compiler treats the new name and the underlying type as interchangeable. The classic example is
typedef int BOOL;
(There's no built-in BOOL type in older ANSI C.)
This means you can now do things like:
BOOL state = 1;
and declare functions that take BOOL parameters, which tells the reader what kind of value is expected even though BOOLs are really just ints:
void flipSwitch(BOOL isOn); /* function declaration */
...
int value = 0;
BOOL boolValue = 1;
flipSwitch(value); /* Compiles without error: BOOL is only an alias for int */
flipSwitch(boolValue); /* Clearer about intent */
So your typedef above is creating a synonym for a student record struct, so you can pass around student records without having to call them struct StudentRecord every time. It makes for cleaner and more readable code. Except that there's more to it here, in your example. What I've just described is:
typedef struct {
char * firstName;
char * lastName;
int id;
float mark;
} StudentRecord;
You can now do things like:
StudentRecord aStudent = { "Angus\n", "Young\n", 1, 4.0 };
or
void writeToParents(StudentRecord student) {
...
}
But you've got a * after the typedef. That's because you want to typedef a data type which holds a pointer to a StudentRecord, not typedef the StudentRecord itself. Eh? Read on...
You need this pointer to StudentRecord because if you want to pass StudentRecords around and be able to modify their member variables, you need to pass around pointers to them, not the variables themselves. Above we made writeToParents, which just reads the contents of the StudentRecord. Say we want to change their grade; we can't set up a function with a simple StudentRecord parameter, because C passes structs by value and the function would only modify its own copy. So, we need a pointer:
void changeGrade(StudentRecord *student, float newGrade) {
student->mark = newGrade;
}
Easy to see that you might miss the *, so instead, typedef a pointer type for StudentRecord and the compiler will help:
typedef struct { /* as above */ } *PStudentRecord;
Now:
void changeGrade(PStudentRecord student, float newGrade) {
student->mark = newGrade;
}
It's more common to declare both at the same time:
typedef struct {
/* Members */
} StudentRecord, *PStudentRecord;
This gives you both the plain struct typedef and a pointer typedef too.
What's a pointer, then? A variable which holds the address in memory of another variable. Sounds simple; it is, on the face of it, but it gets very subtle and involved very quickly. Try this tutorial
This defines the name of a pointer to the structure but not a name for the structure itself.
Try changing to:
typedef struct
{
char* firstName;
char* lastName;
int id;
float mark;
} StudentRecord;
StudentRecord foo;
StudentRecord *pfoo = &foo;

How can I hide the declaration of a struct in C?

In the question Why should we typedef a struct so often in C?, unwind answered that:
In this latter case, you cannot return the Point by value, since its declaration is hidden from users of the header file. This is a technique used widely in GTK+, for instance.
How is declaration hiding accomplished? Why can't I return the Point by value?
ADD:
I understand why I can't return the struct by value, but it is still hard to see why I can't dereference this pointer in my function. That is, if my struct has a member named y, why can't I do this?
pointer_to_struct->y = some_value;
Why should I use methods to do it? (Like Gtk+)
Thanks guys, and sorry for my bad english again.
Have a look at this example of a library, using a public header file, a private header file and an implementation file.
In file public.h:
struct Point;
struct Point* getSomePoint();
In file private.h:
struct Point
{
int x;
int y;
};
In file private.c:
struct Point* getSomePoint()
{
/* ... */
}
If you compile these three files into a library, you only give public.h and the library object file to the consumer of the library.
getSomePoint has to return a pointer to Point, because public.h does not define the size of Point, only that it is a struct and that it exists. Consumers of the library can use pointers to Point, but cannot access the members or copy it around, because they do not know the size of the structure.
Regarding your further question:
You cannot dereference it because the program using the library only has the information from public.h, which does not contain the member declarations. It therefore cannot access the members of the Point structure.
You can see this as the encapsulation feature of C, just like you would declare the data members of a C++ class as private.
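Member access then goes through functions exported by the library, much like the accessor functions in GTK+. Here is a single-file sketch of that pattern; the comments mark which part would go into which file, and all of the point_* function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* --- would live in public.h --- */
struct Point;                                   /* incomplete type */
struct Point *point_new(int x, int y);          /* hypothetical constructor */
int point_get_y(const struct Point *p);         /* hypothetical accessors */
void point_set_y(struct Point *p, int y);
void point_free(struct Point *p);

/* --- would live in private.h and private.c --- */
struct Point
{
    int x;
    int y;
};

/* Returns a heap-allocated Point, or NULL if allocation fails. */
struct Point *point_new(int x, int y)
{
    struct Point *p = malloc(sizeof *p);
    if (p != NULL) {
        p->x = x;
        p->y = y;
    }
    return p;
}

int point_get_y(const struct Point *p)
{
    assert(p != NULL);
    return p->y;
}

void point_set_y(struct Point *p, int y)
{
    assert(p != NULL);
    p->y = y;
}

void point_free(struct Point *p)
{
    free(p);
}
```

Client code that includes only the public part can call point_set_y(p, 7), but writing p->y = 7 there would not compile, because the members are unknown at that point.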
What he means is that you cannot return the struct by-value in the header, because for that, the struct must be completely declared. But that happens in the C file (the declaration that makes X a complete type is "hidden" in the C file, and not exposed into the header), in his example. The following declares only an incomplete type, if that's the first declaration of the struct
struct X;
Then, you can declare the function
struct X f(void);
But you cannot define the function, because you cannot create a variable of that type, and much less so return it (its size is not known).
struct X f(void) { // <- error here
// ...
}
The error happens because struct X is still incomplete. Now, if you only include the header with the incomplete declaration in it, then you cannot call that function, because the function call expression would yield an incomplete type, which is not allowed.
If you were to provide a declaration of the complete type struct X in between, it would be valid
struct X;
struct X f(void);
// ...
struct X { int data; };
struct X f(void) { // valid now: struct X is a complete type
// ...
}
This would apply to the way using typedef too: They both name the same, (possibly incomplete) type. One time using an ordinary identifier X, and another time using a tag struct X.
In the header file:
typedef struct _point * Point;
After the compiler sees this it knows:
There is a struct called _point.
There is a pointer type Point that can refer to a struct _point.
The compiler does not know:
What the struct _point looks like.
What members struct _point contains.
How big struct _point is.
Not only does the compiler not know it - we as programmers don't know it either. This means we can't write code that depends on those properties of struct _point, which means that our code may be more portable.
Given the above code, you can write functions like:
Point f() {
....
}
because Point is a pointer and struct pointers are all the same size and the compiler doesn't need to know anything else about them. But you can't write a function that returns by value:
struct _point f() {
....
}
because the compiler does not know anything about struct _point, specifically its size, which it needs in order to construct the return value.
Thus, we can only refer to struct _point via the Point type, which is really a pointer. This is why Standard C has types like FILE, which can only be accessed via a pointer - you can't create a FILE structure instance in your code.
Old question, better answer:
In Header File:
typedef struct _Point Point;
In C File:
struct _Point
{
int X;
int Y;
};
What that post means is: If you see the header
typedef struct _Point Point;
Point * point_new(int x, int y);
then you don't know the implementation details of Point.
As an alternative to using opaque pointers (as others have mentioned), you can instead return an opaque bag of bytes if you want to avoid using heap memory:
// In public.h:
struct Point
{
uint8_t data[SIZEOF_POINT]; // make sure this size is correct!
};
void MakePoint(struct Point *p);
// In private.h:
struct Point
{
int x, y, z;
};
void MakePoint(struct Point *p);
// In private.c:
void MakePoint(struct Point *p)
{
p->x = 1;
p->y = 2;
p->z = 3;
}
Then, you can create instances of the struct on the stack in client code, but the client doesn't know what's in it -- all it knows is that it's a blob of bytes with a given size. Of course, it can still access the data if it can guess the offsets and data types of the members, but then again you have the same problem with opaque pointers (though clients don't know the object size in that case).
For example, the pthreads library uses structs of opaque bytes for types like pthread_t, pthread_cond_t, etc. -- you can still create instances of those on the stack (and you usually do), but you have no idea what's in them. Just take a peek into your /usr/include/pthread.h and the various files it includes.

What is forward reference in C?

What is forward reference in C with respect to pointers?
Can I get an example?
See this page on forward references. I don't see how forward referencing would be different with pointers and with other PoD types.
Note that you can forward declare types, and declare variables which are pointers to that type:
struct MyStruct;
struct MyStruct *ptr;
struct MyStruct var; // ILLEGAL
ptr->member; // ILLEGAL
struct MyStruct {
// ...
};
// Or:
typedef struct MyStruct MyStruct;
MyStruct *ptr;
MyStruct var; // ILLEGAL
ptr->member; // ILLEGAL
struct MyStruct {
// ...
};
I think this is what you're asking for when dealing with pointers and forward declaration.
I think "forward reference" with respect to pointers means something like this:
struct MyStruct *ptr; // this is a forward reference.
struct MyStruct
{
struct MyStruct *next; // another forward reference - this is much more useful
// some data members
};
The pointer is declared before the structure it points to is defined.
The compiler can get away with this because the pointer stores an address, and you don't need to know what is at that address to reserve the memory for the pointer.
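A small sketch of this in a linked list follows. The payload member value and the helpers list_push and list_length are hypothetical; the point is that a function taking a struct Node pointer can be declared before the struct is defined.

```c
#include <assert.h>
#include <stddef.h>

struct Node;                          /* only the tag is known at this point */

/* Legal before the definition: only pointers to struct Node are involved. */
static size_t list_length(const struct Node *n);

struct Node
{
    struct Node *next;                /* refers to the type being defined */
    int value;                        /* hypothetical payload */
};

/* Pushes n onto the front of the list and returns the new head. */
static struct Node *list_push(struct Node *head, struct Node *n)
{
    assert(n != NULL);
    n->next = head;
    return n;
}

/* Counts the nodes by following next pointers until NULL. */
static size_t list_length(const struct Node *n)
{
    size_t len = 0;
    while (n != NULL) {
        len++;
        n = n->next;
    }
    return len;
}
```

Dereferencing, as in n->next, is only possible after the full definition, which is why both function bodies come after the struct.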
Forward reference is when you declare a type but do not define it.
It allows you to use the type by pointer (or reference for C++) but you cannot declare a variable.
It is a way to tell the compiler that something exists.
Say that you have a Plop structure defined in Plop.h:
struct Plop
{
int n;
float f;
};
Now you want to add some utility functions that work with that struct. You create another file PlopUtils.h (let's say you can't change Plop.h):
struct Plop; // Instead of including Plop.h, just use a forward declaration to speed up compile time
void doSomething(struct Plop* plop);
void doNothing(struct Plop* plop);
Now when you implement those functions, you will need the structure definition, so you need to include the Plop.h file in your PlopUtils.cpp:
#include "PlopUtils.h"
#include "Plop.h" // now we need to include the header in order to work with the type
void doSomething(struct Plop* plop)
{
plop->n ...
}
void doNothing(struct Plop* plop)
{
plop->f ...
}
I think the C compiler originally had a pass in which it did symbol table building and semantic analysis together. So for example:
....
... foo(a,b) + 1 ... // assumes foo returns int
....
double foo(double x, double y){ ... } // violates earlier assumption
To prevent this, you say:
double foo(double x, double y); // this is the forward declaration
....
... foo(a,b) + 1 ... // correct assumptions made
....
double foo(double x, double y){ ... } // this is the actual definition
Pascal had the same concept.
Adding to previous answers. The typical situation in which forward reference is mandatory is when a struct foo contains a pointer to a struct bar, and bar contains a pointer to foo (a circular dependency between declarations). The only way to express this situation in C is to use a forward declaration, i.e.:
struct foo;
struct bar
{
struct foo *f;
};
struct foo
{
struct bar *b;
};
Forward declarations allow a C compiler to make fewer passes, which reduces compilation time. This was probably important some 20 years ago, when computers were much slower and compilers less efficient.

Resources