Declaration and definition of union - c

Why does this code work:
#include<stdio.h>
int main(){
union var{
int a, b, c;
};
printf("%ld", sizeof(union var));
return 0;
}
My doubt: isn't union var only a declaration, and isn't it true that no memory is allocated for a declaration? So why does this code print 4?

It prints 4 because that's how big a union var variable is (on a platform where int is 4 bytes). The fact that there are no union var variables in your program is completely irrelevant. If you created one, for example by writing union var myUnionVar;, it would use 4 bytes of memory.
You can also do this with structs:
struct list_node {
struct list_node *next;
int key;
int value;
};
// note that sizeof returns a size_t which should be printed with %zu
printf("%zu", sizeof(struct list_node)); // prints 12 or 16, probably

sizeof is an operator that yields the size in bytes of its operand. The operand can be either a parenthesized type name or an expression, such as a variable. Except when the operand is a variable-length array, the result is a constant that is evaluated at compile time.
union var is a type, so with sizeof(union var) you are asking "how much memory would a variable of type union var occupy?" The answer is 4 bytes.

Related

Cast 0 to struct pointer

So, I have this code here:
struct mystruct {
char a;
union {
char a[8];
char b[16];
} u;
};
void fuu(void)
{
struct mystruct s;
printf("%ld %ld\n", sizeof(s), &((struct mystruct *)0)->u.b);
}
The snippet that confuses me is:
&((struct mystruct *)0)->u.b
As far as I can understand it, first a pointer to struct is cast to an lvalue int of 0 (what?).
Then u.b is taken out of this pointer (which should be a pointer to the start of char array b).
Then the address of this pointer is taken and printed to the screen.
The most confusing part of all this is the cast of 0.
Can someone explain in detail what is happening in this snippet?
Let's break it down a bit. Assume that we have a variable x that is a pointer to a mystruct:
struct mystruct *x;
And replace that in the expression in question:
&(x->u.b)
This takes the address of the member u.b in the struct that x points to. We can guess that this address should be slightly higher than x itself, because u is not the first member in the struct.
Then, add the fact that x is zero:
struct mystruct *x = (struct mystruct *)0;
Then, the value of the expression above will be slightly higher than 0. Or in other words, it will be the offset of u.b within the memory layout of the struct.
In fact, at least on my machine it's 1, because the only member in the struct before u is char a, which takes up 1 byte.*
Another way to do this is to use the offsetof macro defined in stddef.h:
#include <stddef.h>
...
printf("%zu %zu\n", sizeof(s), offsetof(struct mystruct, u.b)); // both values are size_t, hence %zu
It has the same effect, but it might be easier to understand what the code is trying to do.
* And none of the members of the union have greater alignment requirements. Try changing either of the members of the union from char to int - what happens?
The offset is printed as 4 instead of 1, because on typical platforms an int needs to be stored at an address divisible by 4. So the struct will contain one byte for char a, three unused bytes for alignment, and then the union.

Why is initializing C union using "designated initializer" giving random values?

I had a "bug" which I spent quite a while chasing:
typedef union {
struct {
uint8_t mode: 1;
uint8_t texture: 4;
uint8_t blend_mode: 2;
};
uint8_t key;
} RenderKey;
Later this union would be initialized (on stack):
Buffers buffers[128]; // initialized somewhere else
void Foo(int a, int b)
{
//C99 style initialization (all the other values should be 0)
RenderKey rkey = {.blend_mode = 1};
//rkey.key would sometimes be >= 128 thus would write out of array bounds
DoStuffWithBuffer(&buffers[rkey.key]);
}
This seemed to indicate that the last bit of the union bitfield wouldn't be initialized. So I fixed it with adding the unused bit:
typedef union {
struct {
uint8_t mode: 1;
uint8_t texture: 4;
uint8_t blend_mode: 2;
uint8_t unused: 1;
};
uint8_t key;
} RenderKey;
This works, but I don't understand WHY exactly.
That random 1 bit comes from the random garbage on stack before, but why isn't the C99 style initialization working here? Because of the union and the anonymous struct?
This happens on Clang 3.5 and tcc, but not on gcc 4.9.2.
In C11 it is stated at §6.7.9 that
The initialization shall occur in initializer list order, each initializer provided for a particular subobject overriding any previously listed initializer for the same subobject; all subobjects that are not initialized explicitly shall be initialized implicitly the same as objects that have static storage duration.
But the hidden padding bit is not a subobject, so that rule does not apply to it. From the anonymous struct's point of view the bit doesn't exist, and the compiler is not required to initialize something that is not a member of the struct, which isn't that strange after all.
A similar example would be to have something like
#include <stdio.h>
typedef struct {
unsigned char foo;
float value;
} Test;
int main(void) {
Test test = { .foo = 'a', .value = 1.2f};
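printf("We expect 8 bytes: %zu\n", sizeof(Test));
// arithmetic on char * is standard C (on void * it is a GNU extension); the difference is a ptrdiff_t, printed with %td
printf("We expect 0: %td\n", (char*)&test.foo - (char*)&test);
printf("We expect 4: %td\n", (char*)&test.value - (char*)&test);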
unsigned char* test_ptr = (unsigned char*) &test;
printf("value of 3rd byte: %d\n", test_ptr[2]);
}
What would you expect test_ptr[2] to be? There are 3 bytes of padding between the two members of the struct, and they are not part of any subobject. Initializing them would be a waste of time, since in a normal scenario you can't access them.

Casting struct * to int * to be able to write into first field

I've recently found this page:
Making PyObject_HEAD conform to standard C
and I'm curious about this paragraph:
Standard C has one specific exception to its aliasing rules precisely designed to support the case of Python: a value of a struct type may also be accessed through a pointer to the first field. E.g. if a struct starts with an int , the struct * may also be cast to an int * , allowing to write int values into the first field.
So I wrote this code to check with my compilers:
struct with_int {
int a;
char b;
};
int main(void)
{
struct with_int *i = malloc(sizeof(struct with_int));
i->a = 5;
((int *)&i)->a = 8;
}
but I'm getting error: request for member 'a' in something not a struct or union.
Did I get the above paragraph right? If no, what am I doing wrong?
Also, if someone knows where C standard is referring to this rule, please point it out here. Thanks.
Your interpretation [1] is correct, but the code isn't.
The pointer i already points to the object, and thus to the first element, so you only need to cast it to the correct type:
int* n = ( int* )i;
then you simply dereference it:
*n = 345;
Or in one step:
*( int* )i = 345;
[1] (Quoted from ISO/IEC 9899:201x, 6.7.2.1 Structure and union specifiers, paragraph 15)
Within a structure object, the non-bit-field members and the units in which bit-fields
reside have addresses that increase in the order in which they are declared. A pointer to a
structure object, suitably converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa. There may be unnamed
padding within a structure object, but not at its beginning.
You have a few issues, but this works for me:
#include <stdlib.h> // malloc is declared here; <malloc.h> is non-standard
#include <stdio.h>
struct with_int {
int a;
char b;
};
int main(void)
{
struct with_int *i = (struct with_int *)malloc(sizeof(struct with_int));
i->a = 5;
*(int *)i = 8;
printf("%d\n", i->a);
}
Output is:
8
Like other answers have pointed out, I think you meant:
// Interpret (struct with_int *) as (int *), then
// dereference it to assign the value 8.
*((int *) i) = 8;
and not:
((int *) &i)->a = 8;
However, none of the answers explain specifically why that error makes sense.
Let me explain what ((int *) &i)->a means:
i is a variable that holds the address of a struct with_int. &i is the address of i itself, which lives in main()'s stack space. So &i is an address that contains the address of a struct with_int; in other words, &i is a pointer to a pointer to struct with_int. The cast (int *) then tells the compiler to interpret this stack address as an int pointer, that is, the address of an int. Finally, with ->a, you are asking the compiler to fetch the struct member a through this int pointer and assign the value 8 to it. It doesn't make sense to fetch a struct member from an int pointer. Hence you get error: request for member 'a' in something not a struct or union.
Hope this helps.

Generate padding bytes for a structure by nesting it in a union

I was going through this question to reaffirm my understanding of structure padding. I have a doubt now. When I do something like this:
#include <stdio.h>
#define ALIGNTHIS 16 //16,Not necessarily
union myunion
{
struct mystruct
{
char a;
int b;
} myst;
char DS4Alignment[ALIGNTHIS];
};
//Main Routine
int main(void)
{
union myunion WannaPad;
printf("Union's size: %zu\n\
Struct's size: %zu\n", sizeof(WannaPad),
sizeof(WannaPad.myst));
return 0;
}
Output:
Union's size: 16
Struct's size: 8
should I not expect the struct to have been padded by 8 bytes? If I explicitly pad eight bytes to the structure, the whole purpose of nesting it inside an union like this is nullified.
I feel that declaring a union containing the struct and a character array of the size the struct ought to have (but doesn't) makes for neater code.
Is there a work-around for making it work as I would like it?
Think about it logically.
Imagine I had a union with some basic types in it:
union my_union{
int i;
long l;
double d;
float f;
};
would you expect sizeof(int) == sizeof(double)?
The inner types will always be their size, but the union will always be large enough to hold any of its inner types.
should I not expect the struct to have been padded by 8 bytes?
No, because struct mystruct is laid out on its own. The char is followed by 3 (that is, sizeof (int) - 1) padding bytes so that the int is properly aligned. This does not change, even if someone, somewhere, at some point decides to use this very struct mystruct inside another type.
By default the struct is padded so that the int b field is aligned on a sizeof(int) boundary. There are several workarounds for this:
explicitly use fillers where needed: char a; char _a[sizeof(int)-1]; int b;
use compiler-dependent pragma to pack struct on byte boundary
use command-line switch etc.

Explain the result of sizeof operator for a union containing structures

#include<stdio.h>
struct mystruct
{
char cc;
float abc;
};
union sample
{
int a;
float b;
char c;
double d;
struct mystruct s1;
};
int main()
{
union sample u1;
int k;
u1.s1.abc=5.5;
u1.s1.cc='a';
printf("\n%c %f\n",u1.s1.cc,u1.s1.abc);
k=sizeof(union sample);
printf("%d\n\n",k);
return 0;
}
The sizeof operator is returning 8. I am still able to access the structure elements, more than one at a time, and still the sizeof operator is returning the max size of the primitive data types, I assume. Why is this behavior? Is the size actually allocated different, with sizeof returning a wrong value? Or is the actual allocated size really 8? Then how is the structure accommodated? If we allocate an array of unions using malloc and sizeof, will it allocate enough space in that case? Please elaborate.
Typically, the size of the union is the size of its biggest member. The biggest member is [likely] your struct member as well as the double member. Both have size 8. So, as sizeof correctly told you, the size of the union is indeed 8.
Why do you find it strange? Why do you call 8 "wrong value"?
struct mystruct
{
char cc; //1 -byte
//3 bytes Added here for Padding
float abc; //size of float is 4-bytes
};
So 1 + 3 + 4 = 8 bytes. Memory is allocated for the largest member of the union. In our case sizeof(double) and sizeof(struct mystruct) are both 8.
A union is used to place multiple members at the same memory location - you can't use more than one member at a time. All of them overlap, so the size of the union is the same as the size of the largest member.
Union types are special structures which allow access to the same memory using different type descriptions. One could, for example, describe a union of data types which would allow reading the same data as an integer, a float or a user-declared type:
union
{
int i;
float f;
struct
{
unsigned int u;
double d;
} s;
} u;
In the above example the total size of u is the size of u.s (which is at least the sum of the sizes of u.s.u and u.s.d, plus any padding the compiler inserts between them), since s is larger than both i and f. When assigning something to u.i, some parts of u.f may be preserved if u.i is smaller than u.f.

Resources