Using struct with bitfields inside a struct with bitfields? - c

Let's take a look at the following structs:
struct child {
int a:1;
int b:2;
int c:2;
} __attribute__((packed));
struct parent1 {
int x:3;
struct child y;
} __attribute__((packed));
struct parent2 {
int p:1;
int q:5;
int r:5;
struct child s;
} __attribute__((packed));
These are the sizes I am getting:
sizeof(int) 4
sizeof(struct child) 1
sizeof(struct parent1) 2
sizeof(struct parent2) 3
I've heard that padding is added before structs for performance reasons.
But forgetting about performance for a moment,
is there a way so that I can get the following sizes?
sizeof(struct parent1) 1
sizeof(struct parent2) 2
Only that much memory is actually required.
EDIT
Is there any way of doing it with gcc on linux?

No, it is not possible to pack these structures more tightly than your compiler already does.
Every struct object must start on a byte boundary, so the members s and y cannot use the bits left over from the preceding members of their enclosing struct.
Also note that __attribute__((packed)) is a compiler extension (GCC and Clang support it), so it is not portable to every compiler.

If all else fails, you can always write functions and macros to bit-shift and extract what you need.
Using unions gets close to what you want. It may be possible to use C++ to clean up the syntax, but with C, this is the closest (non-bitwise) solution I could come up with:
(Note that parent1 is nearly perfect.)
#pragma pack(1)
typedef struct {
char x:3;
char a:3;
char b:3;
char c:3;
} child;
typedef union {
char x:3;
child y;
} parent1;
typedef struct {
short p:1;
short q:5;
short r:5;
short s:5;
} par2;
typedef struct {
char pad;
child s;
} padchild;
typedef union {
par2 parent2;
padchild s;
} parent2;
#pragma pack() /* restore the default packing */
Technically, unions are for either-or use and compilers can pad however they want, but if you force the bit counts to be the same, the easiest way for the compiler to implement this happens to be what you want.

Related

How does __attribute__((packed)) for a field affect struct which contains this field?

If I have a field in my struct which is packed, why does my whole structure become packed?
Example:
#include <stdio.h>
struct foo {
int a;
} __attribute__((packed));
struct bar {
char b;
struct foo bla;
char a;
};
int main() {
printf("%zu\n", sizeof(struct bar));
return 0;
}
https://ideone.com/bjoZHB
The size of struct bar is 6, but I expected 12, because the int member should be aligned.
__attribute__((packed)) tells the compiler to use the minimum memory for the structure, which also drops its alignment requirement to 1 byte, so no padding is needed around it when it is a member of another structure. Consider the following structure:
struct bar {
char b;
__attribute__((packed)) int bla;
char a;
};
The size of this structure is also 6, because the packed member needs no alignment padding between it and its two neighbors (a and b here). But this structure:
struct bar {
char b;
__attribute__((packed)) int bla;
char a;
int c;
};
has a size of 12, because c is aligned on a 4-byte boundary. In your case, if you also use the aligned attribute, it works as you expect:
struct bar {
char b;
__attribute__((aligned (4), packed)) int bla;
char a;
};
This structure size is 12.
Update:
The only related statement I found is in the aligned section of GCC's attribute documentation. I think it is related to what I mentioned here:
The aligned attribute can only increase the alignment; but you can
decrease it by specifying packed as well.
Just remember that if you want to keep the child structure packed but the main structure aligned, you need to use the two attributes in two different declarations. For example, the following structure has a size of 12:
struct foo {
char b;
int a;
} __attribute__((packed));
struct bar {
char b;
__attribute__((aligned(4))) struct foo bla;
char a;
};
but if you use aligned() in the declaration of foo, as __attribute__((aligned (4), packed)), the size will be 16. This happens because foo itself gets aligned too, which defeats the purpose of packing it.

How do I safely put two variable sized datatypes (structs) in a single struct?

I'm trying to construct some C structs that themselves need to hold multiple structs. It looks something like this:
typedef struct hdr_t {
uint16_t a;
uint16_t b;
uint8_t c;
uint8_t d[3];
uint64_t e;
uint8_t f[];
} hdr_t;
typedef struct {
uint64_t data;
} pyld_t;
typedef struct {
hdr_t hdr;
pyld_t pyld;
} msg_t;
When I compile this, depending on the compiler and settings, I get warnings.
./file.h:55:24: warning: field 'hdr' with variable sized type 'hdr_t'
(aka 'struct hdr_t') not at the end of a struct or class is a GNU extension
[-Wgnu-variable-sized-type-not-at-end]
hdr_t hdr;
For this example, I'm using clang 6.1.0:
$ clang --version
Apple LLVM version 6.1.0 (clang-602.0.53) (based on LLVM 3.6.0svn)
Target: x86_64-apple-darwin14.5.0
Thread model: posix
The warning complains that what I'm doing is a non-portable GNU extension, which I'd rather avoid. What can I do to solve this? Is there no safe way to put multiple structs in a struct? Surely that's not the case.
A C structure can be variable-size only in the sense that it may contain a "flexible array member" as its last member. C forbids such a structure type from being the type of any structure member or array element, though GCC permits that as an extension.
Even if GCC (or clang) accepted your declaration, I doubt it would mean what you think it means. Every member of a structure has a fixed offset relative to the beginning of the structure, determined statically at compile time. As a result, your msg_t cannot provide sufficient space for arbitrary hdr.f, and quite possibly it provides no space at all, especially if you enable structure packing. Thus, accessing hdr.f of a msg_t could easily access the message data, which I suppose is not what you expect.
I guess the whole point is to map a structure to a byte buffer, but if the underlying data format has variable-length elements in the middle then you just can't directly map a single C structure to it. You could, however, create and use an index structure:
typedef struct {
hdr_t *hdr;
pyld_t *pyld;
} msg_index_t;
That would make it easier to handle mapping a pair of structures to your buffer.
You could use a union:
typedef struct {
..stuff
}type_f1;
typedef struct {
..stuff
}type_f2;
typedef struct {
..stuff
}type_f3;
typedef union {
type_f1 f1;
type_f2 f2;
type_f3 f3;
uint8_t rawdata[MAX_RAWDATA];
}type_f;
typedef struct hdr_t {
uint16_t a;
uint16_t b;
uint8_t c;
uint8_t d[3];
uint64_t e;
type_f f;
} hdr_t;
typedef struct {
uint64_t data;
} pyld_t;
typedef struct {
hdr_t hdr;
pyld_t pyld;
} msg_t;
There are lots of solutions that would fix the issue. For my specific problem, it was enough to change the declaration of hdr_t from this:
typedef struct hdr_t {
uint16_t a;
uint16_t b;
uint8_t c;
uint8_t d[3];
uint64_t e;
uint8_t f[];
} hdr_t;
to this:
typedef struct hdr_t {
uint16_t a;
uint16_t b;
uint8_t c;
uint8_t d[3];
uint64_t e;
uint8_t *f; //Use pointer instead of variable sized array
} hdr_t;
I'm not sure why the original code was using the array, but it wasn't necessary.

struct similarity in C

Consider the two structs below:
struct A {
double x[3];
double y[3];
int z[3];
struct A *a;
int b;
struct A *c;
unsigned d[10];
};
struct B {
double x[3];
double y[3];
int z[3];
};
Notice that struct B is a strict subset of struct A. Now, I want to copy the members .x, .y and .z from an instance of struct A to an instance of struct B. My question is: according to the standards, is it valid to do:
struct A s_a = ...;
struct B s_b;
memcpy(&s_b, &s_a, sizeof s_b);
I.e. is it guaranteed that the paddings for the members, in their sequence of appearance, will be the same, so that I can "partially" memcpy struct A to struct B?
It is not guaranteed that struct A's layout starts off the same as struct B's layout.
However, if and only if they were both members of a union:
union X
{
struct A a;
struct B b;
};
then it is guaranteed that the common initial sequence has the same layout.
I've never heard of any compiler that would lay out a struct differently if it detected that the struct were a member of a union, so in practice you should be safe!
How about using struct B as an anonymous struct member of struct A? This requires -fms-extensions for gcc, however (there should be a similar extension for VC, as the name implies):
struct B {
double x[3];
double y[3];
int z[3];
};
struct A {
struct B;
struct A *a;
int b;
struct A *c;
unsigned d[10];
};
This allows the fields to be used through struct A like:
struct A as;
as.x[2] = as.y[0];
etc. This guarantees identical layout (the standard allows no padding at the beginning of a struct, so the inner struct is guaranteed to start at the same address as the outer one) and makes struct A cast-compatible with struct B.
Also:
struct A as;
struct B bs;
memcpy(&as, &bs, sizeof(bs));
I do not think the Standard would prohibit an implementation from including so much more padding in s_b than s_a that the former is actually larger even though its members are a subset of s_a's. Such behavior would be very weird, and I can't think of any reason why a compiler would do such a thing, but I don't think it would be prohibited.
If the number of bytes copied is the lesser of sizeof s_a and sizeof s_b, then the memcpy operation will be guaranteed to copy all of the common fields, but would not necessarily leave the later fields of s_b undisturbed. On a typical machine, if the declarations had been:
struct A { uint32_t x; char y; };
struct B { uint32_t x; char y,p; uint16_t q; };
the first structure would contain five bytes of data and three bytes of padding, while the second would contain eight bytes of data with no padding. Using memcpy as shown in your code would copy the padding from s_a over the data in s_b.
If you need to copy the initial structure members while leaving the balance of the structure undisturbed, you should add the offset and size of the last member of interest, and use that sum as the number of bytes to copy. In the example I give above, the offset of y would be 4 and its size would be 1, so the memcpy would copy 5 bytes and thus ignore the parts of the structure that are used as padding in A but might hold data in B.

When are anonymous structs and unions useful in C11?

C11 adds, among other things, 'Anonymous Structs and Unions'.
I poked around but could not find a clear explanation of when anonymous structs and unions would be useful. I ask because I don't completely understand what they are. I get that they are structs or unions without the name afterwards, but I have always (had to?) treat that as an error so I can only conceive a use for named structs.
Anonymous unions inside structures are very useful in practice. Suppose you want to implement a discriminated sum type (or tagged union): an aggregate with a boolean and either a float or a char* (i.e. a string), depending on the boolean flag. With C11 you should be able to write
typedef struct {
bool is_float;
union {
float f;
char* s;
};
} mychoice_t;
double as_float(mychoice_t* ch)
{
if (ch->is_float) return ch->f;
else return atof(ch->s);
}
With C99, you have to name the union and write ch->u.f and ch->u.s, which is less readable and more verbose.
Another way to implement some tagged union type is to use casts. The OCaml runtime gives a lot of examples.
The SBCL implementation of Common Lisp does use some union to implement tagged union types. And GNU make also uses them.
A typical real-world use of anonymous structs and unions is to provide an alternative view of the data. For example, when implementing a 3D point type:
typedef struct {
union{
struct{
double x;
double y;
double z;
};
double raw[3];
};
}vec3d_t;
vec3d_t v;
v.x = 4.0;
v.raw[1] = 3.0; // Equivalent to v.y = 3.0
v.z = 2.0;
This is useful if you interface with code that expects a 3D vector as a pointer to three doubles. Instead of writing f(&v.x), which is ugly, you can write f(v.raw), which makes your intent clear.
struct bla {
struct { int a; int b; };
int c;
};
Here the type struct bla has a member of a C11 anonymous structure type.
struct { int a; int b; } has no tag and the object has no name: it is an anonymous structure type.
You can access the members of the anonymous structure this way:
struct bla myobject;
myobject.a = 1; // a is a member of the anonymous structure inside struct bla
myobject.b = 2; // same for b
myobject.c = 3; // c is a member of the structure struct bla
Another useful application is handling RGBA colors, since you might want to access each channel on its own or all of them as a single int.
typedef struct {
union{
struct {uint8_t a, b, g, r;};
uint32_t val;
};
}Color;
Now you can access the individual RGBA values or the entire value. On a little-endian machine, r is the highest byte of val, i.e.:
int main(void)
{
Color x;
x.r = 0x11;
x.g = 0xAA;
x.b = 0xCC;
x.a = 0xFF;
printf("%X\n", x.val);
return 0;
}
Prints 11AACCFF on a little-endian machine.
I'm not sure why C11 allows anonymous structures inside structures. But Linux uses it with a certain language extension:
/**
* struct blk_mq_ctx - State for a software queue facing the submitting CPUs
*/
struct blk_mq_ctx {
struct {
spinlock_t lock;
struct list_head rq_lists[HCTX_MAX_TYPES];
} ____cacheline_aligned_in_smp;
/* ... other fields without explicit alignment annotations ... */
} ____cacheline_aligned_in_smp;
I'm not sure whether that example is strictly necessary, except to make the intent clear.
EDIT: I found another similar pattern which is more clear-cut. The anonymous struct feature is used with this attribute:
#if defined(RANDSTRUCT_PLUGIN) && !defined(__CHECKER__)
#define __randomize_layout __attribute__((randomize_layout))
#define __no_randomize_layout __attribute__((no_randomize_layout))
/* This anon struct can add padding, so only enable it under randstruct. */
#define randomized_struct_fields_start struct {
#define randomized_struct_fields_end } __randomize_layout;
#endif
I.e. a language extension / compiler plugin to randomize field order (ASLR-style exploit "hardening"):
struct kiocb {
struct file *ki_filp;
/* The 'ki_filp' pointer is shared in a union for aio */
randomized_struct_fields_start
loff_t ki_pos;
void (*ki_complete)(struct kiocb *iocb, long ret, long ret2);
void *private;
int ki_flags;
u16 ki_hint;
u16 ki_ioprio; /* See linux/ioprio.h */
unsigned int ki_cookie; /* for ->iopoll */
randomized_struct_fields_end
};
Well, if you declare variables from that struct only once in your code, why does it need a name?
struct {
int a;
struct {
int b;
int c;
} d;
} e,f;
And you can now write things like e.a, f.d.b, etc.
(I added the inner struct because I think this is one of the most common uses of anonymous structs.)

Typedef struct question

Why would I want to do this?
typedef struct Frame_s
{
int x;
int y;
int z;
} Frame_t;
Also, if I want to create an object, which do I use: Frame_s or Frame_t?
You would use Frame_t.
With typedef you are saying that Frame_t and struct Frame_s are the exact same type.
So these are equivalent declarations:
// 1
Frame_t f;
// 2
struct Frame_s f;
I would use:
typedef struct
{
int x;
int y;
int z;
} Frame_t;
And always declare my vars like this:
Frame_t f1, f2, f3;
Confusion usually comes from seeing that declaration in a piece of C++ code. If you use C++ with that typedef, you can use either:
// 1
Frame_t f;
// 2
Frame_s f;
But if you use a plain C compiler, then //2 is invalid.
Either you use struct Frame_s, or you use Frame_t.
Usually you do such a typedef so that you can use the typedefed name, Frame_t, and don't have to write struct whenever you refer to the type.
Aside from Frame_t being shorter than struct Frame_s there is no real difference.
Another aspect of typedef that has not yet been mentioned in the other replies is that it reserves the identifier and thus may avoid confusion. If you write a forward declaration like this
typedef struct Frame Frame;
you prevent other code from using the same name Frame for something else, e.g. a variable or function.
One very bad traditional example that comes to mind is "sys/stat.h" in POSIX: it defines both a struct stat and a function stat:
int stat(const char *path, struct stat *buf);
To declare a value of the struct, you could use either struct Frame_s foo or Frame_t foo (the latter is more normal, since that's the whole point of typedefing). My guess is that Frame_s is meant to indicate the struct type itself, while Frame_t is the plain type that's normally used for Frame values.
typedef is used as a short form. When a function returns a pointer to a structure of this type, you would normally write:
struct Frame_s *function_name()
The code starts to get obfuscated, function definitions become long, and so on. With this typedef you get:
Frame_t *function_name()
Clean code! Makes a big difference in maintenance...
So these two declarations are equivalent:
struct Frame_s f;
Frame_t f;
In fact, you could now leave Frame_s out of the declaration as it isn't needed.
typedef struct
{
int x;
int y;
int z;
} Frame_t;

Resources