access union member in c - c

I have a question about unions in C.
For example:
typedef struct {
int a;
float c;
}Type1;
typedef struct {
int b;
char d;
}Type2;
union Select {
Type1 type1;
Type2 type2;
};
int main(void) {
union Select *select;
// Can we access type1 first and then access type2 immediately? Like this:
select->type1.a;
select->type2.b;
// After accessing type1 and then type2 immediately, can we get the value of b in type2?
// I modified the original post a little, because it was meaningless at first.
}

This is guaranteed to work by ISO/IEC 9899:1999 (see the draft here), 6.5.2.3, paragraph 5:
One special guarantee is made in order to simplify the use of unions: if a union contains
several structures that share a common initial sequence (see below), and if the union
object currently contains one of these structures, it is permitted to inspect the common
initial part of any of them anywhere that a declaration of the complete type of the union is
visible. Two structures share a common initial sequence if corresponding members have
compatible types (and, for bit-fields, the same widths) for a sequence of one or more
initial members.

Yes, that is correct. In your example (ignoring the uninitialised pointer) the value of type1.a and type2.b will always be the same for any given instance of Select.
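A minimal sketch of that guarantee, reusing the Type1 and Type2 definitions from the question. The helper name read_b_after_writing_a is made up for illustration, and the union is a real object rather than an uninitialised pointer. Both structures start with an int, so reading type2.b after writing type1.a falls under the common-initial-sequence rule:

```c
#include <assert.h>

typedef struct { int a; float c; } Type1;
typedef struct { int b; char d; } Type2;

union Select {
    Type1 type1;
    Type2 type2;
};

/* Hypothetical helper: store through type1, read back through type2. */
int read_b_after_writing_a(int value)
{
    union Select select;             /* an actual object, so the access is valid */
    select.type1.a = value;          /* the union now holds a Type1 */
    assert(select.type2.b == value); /* common initial member: the same int */
    return select.type2.b;
}
```

Only the leading int is covered here. The second members (float c and char d) have different types, so they are not part of the common initial sequence.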

In your example it will work, since both members are of type int.
Normally you need a discriminator to know which union member is in use at a given time.
The union is at least as large as its largest member, and you set and check the type each time to know which member to access:
struct myStruct {
int type;
union { // anonymous union (C11), so type1 and type2 are accessed directly
Type1 type1;
Type2 type2;
};
};
You would do a check before accessing to know how to use the union:
struct myStruct* aStruct;
//init pointer
if(aStruct->type == TYPE1)//TYPE1 is defined to indicate the corresponding type
{
//access fields of type1
aStruct->type1.a = 1; //use it
}
Before that, you should already have set aStruct->type = TYPE1.

Yeah, you can access both of them whenever you want. Basically, select->type1 and select->type2 occupy the same location in memory. It's the programmer's job to know what sits in that location, usually by means of a flag:
#include <stdbool.h>
union Select {
Type1 type1;
Type2 type2;
};
struct SelectType {
bool isType1; // flag: which member of select is currently valid
union Select select;
};
int getValue (struct SelectType s){
if (s.isType1){
return s.select.type1.a;
} else {
return s.select.type2.b;
}
}
int main(void) {
struct SelectType select;
int value;
select.select.type1.a = 5;
select.isType1 = true;
select.select.type2.b = 5;
select.isType1 = false;
value = getValue (select);
return 0;
}

Yes, we can. Its value does not change in this case. It's not wrong, but it is meaningless.
By the way, did you forget to allocate memory for the pointer 'select'?
I really want to help, but my English is not very good, and this is my first post. So if I have said something wrong, tell me.

Yes, you can, because main has the scope for your union declaration.
Quote from the final version of the C99 standard.
The following is not a valid fragment (because the union type is not visible within function f)
struct t1 { int m; };
struct t2 { int m; };
int f(struct t1 *p1, struct t2 *p2)
{
if (p1->m < 0)
p2->m = -p2->m;
return p1->m;
}
int g()
{
union {
struct t1 s1;
struct t2 s2;
} u;
/* ... */
return f(&u.s1, &u.s2);
}

All members of a union reside in the same memory location.
A union is usually used to access the same block of memory in more than one way.
For example, you can define:
union
{
char a[100];
int b[50];
}x;
The union is as large as its largest member. With a 4-byte int that is b, so the size is 200 bytes rather than 100, and reading b[0] reinterprets the bytes a[0] through a[3] together.
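The size and overlap can be checked rather than assumed. The sketch below gives the union a tag (Buffer, a name chosen here; the original is untagged) and uses sizeof(int) instead of a fixed byte count, since the width of int is platform dependent:

```c
#include <assert.h>
#include <string.h>

union Buffer {              /* hypothetical tag for the union shown above */
    char a[100];
    int  b[50];
};

/* Returns 1 if b[0] is built from exactly the bytes a[0] .. a[sizeof(int)-1]. */
int b0_overlays_leading_chars(void)
{
    union Buffer x;
    int expected = 0x01020304;

    /* A union must be able to hold either member in full. */
    assert(sizeof x >= sizeof x.a && sizeof x >= sizeof x.b);

    memset(&x, 0, sizeof x);
    memcpy(x.a, &expected, sizeof expected);   /* write through the char view */
    return x.b[0] == expected;                 /* read through the int view */
}
```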

Related

C - what advantage do anonymous structures offer? [closed]

I would like to know why and when one would explicitly choose to use an anonymous structure like so:
typedef struct my_struct_t my_struct_t;
struct my_struct_t
{
int a;
int b;
};
int main()
{
my_struct_t obj1 =
{
.a = 33,
.b = 44
};
return 0;
}
rather than doing:
typedef struct my_struct_t my_struct_t;
struct my_struct_t
{
int a;
int b;
};
int main()
{
my_struct_t obj2;
obj2.a = 55;
obj2.b = 66;
return 0;
}
What advantages does the former offer over the latter and/or vice versa?
Thanks.
An anonymous struct is useful when you want to nest structures/unions without assigning a particular "meaning" to the inner structures. It lets you access the members of an inner structure as if they were a direct member of the enclosing struct. See the following example taken from cppreference.com:
Similar to union, an unnamed member of a struct whose type is a struct
without name is known as anonymous struct. Every member of an
anonymous struct is considered to be a member of the enclosing struct
or union. This applies recursively if the enclosing struct or union is
also anonymous.
struct v {
union { // anonymous union
struct { int i, j; }; // anonymous structure
struct { long k, l; } w;
};
int m;
} v1;
v1.i = 2; // valid
v1.k = 3; // invalid: inner structure is not anonymous
v1.w.k = 5; // valid
The code you showed actually has nothing to do with anonymous structs. It is rather an example of using an initializer list with designated members.
Hope it helps :-)
Here's one use. When working with unions, it's implementation defined what you'll get when reading a union member that wasn't last written to. But, in the case of structures, you can inspect the common initial sequence of fields. Here's an example to illustrate:
#include <assert.h>
struct S1 {
int type;
char value;
};
struct S2 {
int type;
float value;
};
union U {
struct S1 s1;
struct S2 s2;
};
int main() {
union U un;
un.s1.type = 1;
un.s1.value = 'c';
assert(un.s2.type == 1);
}
As you can see above, it is guaranteed by the C standard that whatever I write to un.s1.type I can then read from un.s2.type. So far so good. But there's a problem if I try to do something like this instead:
union U {
struct S1 s1;
int type;
};
Now there is no guarantee. We can't read un.s1.type from un.type under the protection of the standard. But hope is not lost, we can just make it a field of a structure again, an anonymous structure, like so:
union U {
struct S1 s1;
struct {
int type;
};
};
The fields of an anonymous structure are "injected" into the enclosing structure or union, so we may refer to type by accessing un.type. And now we are back in the warm embrace of the standard, since we again have two structures with a common initial sequence of fields.
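Put together as something that compiles, the idea might look like the sketch below. It needs a C11 compiler for the anonymous structure, and the helper name type_via_anonymous_struct is invented for this illustration:

```c
#include <assert.h>

struct S1 {
    int type;
    char value;
};

union U {
    struct S1 s1;
    struct {            /* anonymous structure (C11): type becomes a member of U */
        int type;
    };
};

/* Hypothetical helper: write through s1, read the injected member. */
int type_via_anonymous_struct(int t)
{
    union U un;
    un.s1.type = t;
    un.s1.value = 'c';
    assert(un.type == un.s1.type);  /* both name the shared leading int */
    return un.type;
}
```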
Your example has nothing to do with anonymous structs, but only with initialization vs. assignment. It matters mainly when objects are declared const:
const my_struct_t obj1 =
{
.a = 33,
.b = 44
};
is correct while this is not:
const my_struct_t obj2;
obj2.a = 55; // error: try to assign to const object
Anonymous structures and unions allow members of a sub struct/union to be used as if they were members of the containing struct/union.
Draft n1570 for C11 says at 6.7.2.1 Structure and union specifiers §13:
An unnamed member whose type specifier is a structure specifier with no tag is called an
anonymous structure; an unnamed member whose type specifier is a union specifier with
no tag is called an anonymous union. The members of an anonymous structure or union
are considered to be members of the containing structure or union. This applies
recursively if the containing structure or union is also anonymous.
and even gives a (non normative) example:
struct v {
union { // anonymous union
struct { int i, j; }; // anonymous structure
struct { long k, l; } w;
};
int m;
} v1;
v1.i = 2; // valid
v1.k = 3; // invalid: inner structure is not anonymous
v1.w.k = 5; // valid
That's not an anonymous struct, that is simply using designated initializers, and you use them particularly when you don't need to provide values for all the members of a complicated struct.
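A short sketch of that last point. The struct options type and its members are hypothetical; what matters is that members not named in a designated initializer are initialised as if they had static storage duration, which means zero or a null pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct with more members than we want to spell out. */
struct options {
    int         verbose;
    int         retries;
    double      timeout;
    const char *log_path;
};

struct options default_options(void)
{
    /* Only retries is named; the other members become 0, 0.0 and NULL. */
    struct options o = { .retries = 3 };
    assert(o.verbose == 0 && o.log_path == NULL);
    return o;
}
```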

C: casting to structure with different size

Quick simple question;
Does this
typedef struct {int a; int b;} S1;
typedef struct {int a;} S2;
((S2*)(POINTER_TO_AN_S1))->a=1;
Always return (and assign) the member a of the structure? Or is it undefined behavior?
In a conforming compiler, if both structure types appear within the complete definition of a union type which is visible where the structure is accessed, and if the target of the pointer happened to be an instance of that union type, behavior would be defined. Note that the Standard does not require that the compiler have any way of knowing that the target of the pointer is actually an object of that union type--merely that the declaration of the complete union type be visible.
Note, however, that gcc does not abide by the Standard here, unless the -fno-strict-aliasing flag is used. Even in cases where the complete union type is visible, and a compiler can see that it is in fact working with objects of the union type, gcc ignores the aliasing. For example, given:
struct s1 {int x;};
struct s2 {int x;};
union u { struct s1 s1; struct s2 s2;};
int read_s1_x(struct s1 *p) { return p->x; }
int read_s2_x(struct s2 *p) { return p->x; }
void write_s1_x(struct s1 *p, int value) { p->x = value; }
void write_s2_x(struct s2 *p, int value) { p->x = value; }
int test(union u *u1, union u *u2)
{
write_s2_x(&u2->s2, 0);
if (!read_s1_x(&u1->s1))
write_s2_x(&u2->s2, 1);
return read_s1_x(&u1->s1);
}
a compiler will decide that it doesn't need to re-read the value of
u1->s1.x after it writes u2->s2.x, even though the complete union type
is visible and even though a compiler can see that both u1 and u2 are
pointers to objects of the union type. I'm not quite sure what the
authors of gcc think the address-of operator is supposed to mean when
applied to a union type if the resulting pointer can't even be used to
immediately access an object of that member type.
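One way to sidestep the problem, sketched under the assumption that the code can be handed the union pointer itself instead of pointers to its members: GCC's documentation for -fstrict-aliasing states that type punning is allowed provided the memory is accessed through the union type. The names test_via_union and demo_same_object are made up here; the logic is the same as test above with the member-pointer helpers removed.

```c
#include <assert.h>

struct s1 { int x; };
struct s2 { int x; };
union u { struct s1 s1; struct s2 s2; };

/* Every access names the union object, so the compiler sees the overlap. */
int test_via_union(union u *u1, union u *u2)
{
    u2->s2.x = 0;
    if (!u1->s1.x)
        u2->s2.x = 1;
    return u1->s1.x;
}

/* Hypothetical driver: both pointers refer to the same union object. */
int demo_same_object(void)
{
    union u obj;
    int result;

    obj.s1.x = 5;
    result = test_via_union(&obj, &obj);
    assert(result == 1);   /* the second write must be observed */
    return result;
}
```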

Assigning the value of an union containing structures to NULL in c programming

I have a union and 2 structures in the following format and I need to set one of them to NULL.
For example m = NULL or t = NULL;
typedef struct
{
char *population;
char *area;
} metropolitan;
typedef struct
{
char *airport;
char *type;
} tourist;
typedef union
{
tourist t;
metropolitan m;
} ex;
First of all, the members of your union are not pointers, so it doesn't make all that much sense setting the value to NULL.
Second, the way a union works is that if you set one of the members, you set the others as well. More precisely, all members of a union have the same address, but different types. I.e. the different types give you a way to interpret the same area in memory as multiple types at once.
To differentiate from a struct:
struct a {
unsigned b;
char *c;
};
In this case, the appropriate number of bytes is allocated for each of the fields b and c, one after the other.
union a {
unsigned b;
char *c;
};
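Here, the values of b and c are stored at the same address. I.e. if you set a.b to 0, a read from a.c would typically give NULL (numeric 0x0), at least on platforms where unsigned and char * have the same size and a null pointer is all zero bits.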
Since tourist and metropolitan are structs, you cannot assign NULL to them.
But, if you declare like
typedef union
{
tourist* t;
metropolitan* m;
} ex;
you can do
ex e;
e.t=NULL; //which makes e.m=NULL as well
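As a compilable sketch of the pointer variant: the helper name both_null_after_clearing_t is invented for illustration. The check leans on the rule that all pointers to structure types share the same representation, so clearing one member should be visible through the other on any mainstream implementation:

```c
#include <assert.h>
#include <stddef.h>

typedef struct { char *population; char *area; } metropolitan;
typedef struct { char *airport;    char *type; } tourist;

typedef union {
    tourist      *t;
    metropolitan *m;
} ex;

/* Hypothetical helper: returns 1 if clearing t also leaves m null. */
int both_null_after_clearing_t(void)
{
    ex e;
    e.t = NULL;            /* one store ... */
    assert(e.t == NULL);
    return e.m == NULL;    /* ... read back through the other pointer member */
}
```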
With unions you need to tell them apart.
So define a structure thus:
typedef enum { MET, TOURIST} Chap;
typedef struct {
Chap human;
union {
tourist t;
metropolitan m;
};
} Plonker;
Then you know what is what and can set the appropriate values,
i.e.
Plonker p;
p.human = MET;
p.m.population = NULL;
... etc

Are C-structs with the same members types guaranteed to have the same layout in memory?

Essentially, if I have
typedef struct {
int x;
int y;
} A;
typedef struct {
int h;
int k;
} B;
and I have A a, does the C standard guarantee that ((B*)&a)->k is the same as a.y?
Are C-structs with the same members types guaranteed to have the same layout in memory?
Almost yes. Close enough for me.
From n1516, Section 6.5.2.3, paragraph 6:
... if a union contains several structures that share a common initial sequence ..., and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.
This means that if you have the following code:
struct a {
int x;
int y;
};
struct b {
int h;
int k;
};
union {
struct a a;
struct b b;
} u;
If you assign to u.a, the standard says that you can read the corresponding values from u.b. It stretches the bounds of plausibility to suggest that struct a and struct b can have different layout, given this requirement. Such a system would be pathological in the extreme.
Remember that the standard also guarantees that:
Structures are never trap representations.
Addresses of fields in a structure increase (a.x is always before a.y).
The offset of the first field is always zero.
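The last two guarantees can be checked with offsetof. A small sketch, using the A and B types from the question; the function names are made up. The final function tests something the standard does not spell out for unrelated struct types but that is expected to hold on real compilers:

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x; int y; } A;
typedef struct { int h; int k; } B;

/* Guaranteed: first member at offset 0, later members at higher offsets. */
int layout_guarantees_hold(void)
{
    int first_at_zero = offsetof(A, x) == 0 && offsetof(B, h) == 0;
    int increasing    = offsetof(A, x) < offsetof(A, y)
                     && offsetof(B, h) < offsetof(B, k);
    assert(first_at_zero);
    return first_at_zero && increasing;
}

/* Expected in practice, not promised directly: second members line up. */
int second_members_line_up(void)
{
    return offsetof(A, y) == offsetof(B, k) && sizeof(A) == sizeof(B);
}
```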
However, and this is important!
You rephrased the question,
does the C standard guarantee that ((B*)&a)->k is the same as a.y?
No! And it very explicitly states that they are not the same!
struct a { int x; };
struct b { int x; };
int test(int value)
{
struct a a;
a.x = value;
return ((struct b *) &a)->x;
}
This is an aliasing violation.
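If the goal is only to get the value across, copying the bytes avoids the cast entirely. The following is a sketch, not the only option: test_with_memcpy is a hypothetical rewrite of test above, and it assumes the two structs have the same size, which it checks at run time.

```c
#include <assert.h>
#include <string.h>

struct a { int x; };
struct b { int x; };

/* Hypothetical variant of test(): copy the object representation
   instead of accessing a struct a through a struct b pointer. */
int test_with_memcpy(int value)
{
    struct a a;
    struct b b;

    a.x = value;
    assert(sizeof a == sizeof b);   /* assumption made explicit */
    memcpy(&b, &a, sizeof b);       /* byte copy, no aliasing violation */
    return b.x;
}
```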
Piggybacking on the other replies with a warning about section 6.5.2.3. Apparently there is some debate about the exact wording of anywhere that a declaration of the completed type of the union is visible, and at least GCC doesn't implement it as written. There are a few tangential C WG defect reports here and here with follow-up comments from the committee.
Recently I tried to find out how other compilers (specifically GCC 4.8.2, ICC 14, and clang 3.4) interpret this using the following code from the standard:
// Undefined, result could (realistically) be either -1 or 1
struct t1 { int m; } s1;
struct t2 { int m; } s2;
int f(struct t1 *p1, struct t2 *p2) {
if (p1->m < 0)
p2->m = -p2->m;
return p1->m;
}
int g() {
union {
struct t1 s1;
struct t2 s2;
} u;
u.s1.m = -1;
return f(&u.s1,&u.s2);
}
GCC: -1, clang: -1, ICC: 1 and warns about the aliasing violation
// Global union declaration, result should be 1 according to a literal reading of 6.5.2.3/6
struct t1 { int m; } s1;
struct t2 { int m; } s2;
union u {
struct t1 s1;
struct t2 s2;
};
int f(struct t1 *p1, struct t2 *p2) {
if (p1->m < 0)
p2->m = -p2->m;
return p1->m;
}
int g() {
union u u;
u.s1.m = -1;
return f(&u.s1,&u.s2);
}
GCC: -1, clang: -1, ICC: 1 but warns about aliasing violation
// Global union definition, result should be 1 as well.
struct t1 { int m; } s1;
struct t2 { int m; } s2;
union u {
struct t1 s1;
struct t2 s2;
} u;
int f(struct t1 *p1, struct t2 *p2) {
if (p1->m < 0)
p2->m = -p2->m;
return p1->m;
}
int g() {
u.s1.m = -1;
return f(&u.s1,&u.s2);
}
GCC: -1, clang: -1, ICC: 1, no warning
Of course, without strict aliasing optimizations all three compilers return the expected result every time. Since clang and gcc don't have distinguished results in any of the cases, the only real information comes from ICC's lack of a diagnostic on the last one. This also aligns with the example given by the standards committee in the first defect report mentioned above.
In other words, this aspect of C is a real minefield, and you'll have to be wary that your compiler is doing the right thing even if you follow the standard to the letter. All the worse since it's intuitive that such a pair of structs ought to be compatible in memory.
This sort of aliasing specifically requires a union type. C11 §6.5.2.3/6:
One special guarantee is made in order to simplify the use of unions: if a union contains several structures that share a common initial sequence (see below), and if the union object currently contains one of these structures, it is permitted to inspect the common initial part of any of them anywhere that a declaration of the completed type of the union is visible. Two structures share a common initial sequence if corresponding members have compatible types (and, for bit-fields, the same widths) for a sequence of one or more initial members.
This example follows:
The following is not a valid fragment (because the union type is not
visible within function f):
struct t1 { int m; };
struct t2 { int m; };
int f(struct t1 *p1, struct t2 *p2)
{
if (p1->m < 0)
p2->m = -p2->m;
return p1->m;
}
int g() {
union {
struct t1 s1;
struct t2 s2;
} u;
/* ... */
return f(&u.s1, &u.s2);
}
The requirements appear to be that 1. the object being aliased is stored inside a union and 2. that the definition of that union type is in scope.
For what it's worth, the corresponding initial-subsequence relationship in C++ does not require a union. And in general, such union dependence would be an extremely pathological behavior for a compiler. If there's some way the existence of a union type could affect a concrete memory model, it's probably better not to try to picture it.
I suppose the intent is that a memory access verifier (think Valgrind on steroids) can check a potential aliasing error against these "strict" rules.
I want to expand on @Dietrich Epp's answer. Here is a quote from C99:
6.7.2.1 point 14
... A pointer to a union object, suitably converted, points to each of its members ... and vice versa.
Which means we can copy the memory from a struct to a union containing it:
#include <string.h>
struct a
{
int foo;
char bar;
};
struct b
{
int foo;
char bar;
};
union ab
{
struct a a;
struct b b;
};
void test(struct a *aa)
{
union ab ab;
memcpy(&ab, aa, sizeof *aa);
// ...
}
C99 also says:
6.5.2.3 point 5
One special guarantee is made in order to simplify the use of unions: if a union contains
several structures that share a common initial sequence ..., and if the union
object currently contains one of these structures, it is permitted to inspect the common
initial part of any of them anywhere that a declaration of the complete type of the union is
visible. Two structures share a common initial sequence if corresponding members have
compatible types .... for a sequence of one or more initial members.
Which means the following will also be legal after the memcpy:
ab.a.bar;
ab.b.bar;
The struct could be initialized in a separate translation unit and the copying is done in the standard library (out of the control of the compiler).
Thus, memcpy will copy byte-by-byte the value of the object of type struct a and the compiler has to ensure the result is valid for both structs.
The compiler cannot do anything other than generate instructions that read from the corresponding memory offset for both of those lines, thus the address needs to be the same.
Even though it is not stated explicitly, I would say the standard implies that C-structs with the same member types have the same layout in memory.

Why can union be used this way?

Isn't it true that members in union are exclusive, you can't refer to the other if you already refer to one of them?
union Ptrlist
{
Ptrlist *next;
State *s;
};
void
patch(Ptrlist *l, State *s)
{
Ptrlist *next;
for(; l; l=next){
next = l->next;
l->s = s;
}
}
But the above is referring to both next and s at the same time, anyone can explain this?
A union only defines that
&l->next == &l->s
that's all. There is no language restriction on which member is accessed first.
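To see both points in action, here is a sketch that runs patch on a two-element list. The State definition is a stand-in (the real one is not shown in the question), the typedefs are assumed, and patch_fills_every_slot is a made-up driver. The member addresses are compared as void * because the two pointers have different types:

```c
#include <assert.h>
#include <stddef.h>

typedef struct State { int id; } State;   /* stand-in definition */
typedef union Ptrlist Ptrlist;

union Ptrlist {
    Ptrlist *next;
    State   *s;
};

void patch(Ptrlist *l, State *s)
{
    Ptrlist *next;
    for (; l; l = next) {
        next = l->next;   /* read the link first ... */
        l->s = s;         /* ... then reuse the same storage for s */
    }
}

/* Hypothetical driver: returns 1 if every slot now holds the State pointer. */
int patch_fills_every_slot(void)
{
    State target = { 1 };
    Ptrlist first, second;

    first.next = &second;
    second.next = NULL;

    /* Both members start at the same address. */
    assert((void *)&first.next == (void *)&first.s);

    patch(&first, &target);
    return first.s == &target && second.s == &target;
}
```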
As others have already pointed out, all members of a union are active at all times. The only thing to consider is whether the members are each in a valid state.
If you ever do want some level of exclusivity, you would instead require a tagged union. The basic idea is to wrap the union in a struct, and the struct has a member identifying which element in the union should be used. Take this example:
enum Tag {
FIRST,
SECOND
};
struct {
enum Tag tag;
union {
int First;
double Second;
};
} taggedUnion;
Now taggedUnion could be used like:
if(taggedUnion.tag == FIRST)
// use taggedUnion.First;
else
// use taggedUnion.Second
Yes, it's supposed to be like that. Both l->s and l->next occupy the same memory location, and you can use both; they are not exclusive.
You are performing an assignment to next from l->next. Then, you "overwrite" l->s through the assignment l->s = s.
When you assign to l->s, it overwrites the memory held in l->next. If next and s are the same "size", then both likely could be "active" at the same time.
No, it's not true. You can use any member of a union at any time, although the results if you read a member that wasn't the one most recently written to are a little bit complicated. But that isn't even happening in your code sample and there's absolutely nothing wrong with it. For each item in the list its next member is read and then its s member is written, overwriting its next.
See http://publications.gbdirect.co.uk/c_book/chapter6/unions.html for an introductory discussion on unions.
Basically, it's an easy way to do type casting in advance. So, instead of having
int query_my_data(void *data, int data_len) {
switch(data_len) {
case sizeof(my_data_t): return ((my_data_t *)data)->value;
case sizeof(my_other_data_t): return ((my_other_data_t *)data)->other_val;
default: return -1;
}
}
You could simplify it by doing
typedef struct {
int data_type;
union {
my_data_t my_data;
my_other_data_t other_data;
} union_data;
} my_union_data_t;
int query_my_data(my_union_data_t *data) {
switch(data->data_type) {
case TYPE_MY_DATA: return data->union_data.my_data.value;
case TYPE_MY_OTHER_DATA: return data->union_data.other_data.other_val;
default: return -1;
}
}
Where my_data and other_data would have the same starting address in memory.
A union is similar to a struct, but it only allocates enough memory for its largest member, and all members share that storage. The size of the union will be at least the size of the largest type stored in it. For example:
union A {
unsigned char c;
unsigned short s;
};
int sizeofA = sizeof(union A); // = 2 bytes (with a 2-byte short)
union B {
unsigned char c[4];
unsigned short s[2];
unsigned int i;
};
int sizeofB = sizeof(union B); // = 4 bytes (with a 2-byte short and a 4-byte int)
In the second example, on a little-endian machine, s[0] == (((c[1] << 8) & 0xff00) | c[0]). The variables c, s and i overlap.
union B b;
// This assignment
b.s[0] = 0;
// is similar to:
b.c[0] = 0;
b.c[1] = 0;
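The byte order matters for that relationship. A sketch with fixed-width types from stdint.h (an assumption made here so the sizes do not depend on the platform); the function names are invented for illustration:

```c
#include <assert.h>
#include <stdint.h>

union B {
    uint8_t  c[4];
    uint16_t s[2];
    uint32_t i;
};

/* Returns s[0] after the two lowest-address bytes are set to lo and hi. */
uint16_t s0_from_bytes(uint8_t lo, uint8_t hi)
{
    union B b;
    assert(sizeof b.s[0] == 2 * sizeof b.c[0]);
    b.i = 0;
    b.c[0] = lo;
    b.c[1] = hi;
    return b.s[0];
}

/* Returns 1 if the least significant byte is stored at the lowest address. */
int is_little_endian(void)
{
    union B b;
    b.i = 1;
    return b.c[0] == 1;
}
```

On a little-endian machine s0_from_bytes(0x34, 0x12) gives 0x1234; on a big-endian one the same call gives 0x3412.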
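In C, a union may hold any complete object type, including structs and pointers. In C++ before C++11, you could not store classes with non-trivial constructors or destructors in a union. Most other rules remain basically the same as for a structure, such as public access and stack allocation.
Keep in mind that in your example writing s overwrites next, so if you need both values at the same time you must use a struct instead of a union.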

Resources