Why can union be used this way? - c

Isn't it true that members in union are exclusive, you can't refer to the other if you already refer to one of them?
union Ptrlist
{
Ptrlist *next;
State *s;
};
void
patch(Ptrlist *l, State *s)
{
Ptrlist *next;
for(; l; l=next){
next = l->next;
l->s = s;
}
}
But the above is referring to both next and s at the same time, anyone can explain this?

A union only guarantees that
(void *)&l->next == (void *)&l->s
and nothing more. The language places no restriction on which member you access first.
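The address claim above can be checked directly. This is a minimal sketch, assuming an opaque `State` type and a hypothetical helper named `members_share_address`; neither comes from the original code.

```c
#include <assert.h>
#include <stddef.h>

typedef struct State State;   /* opaque here; only pointers to it are stored */

union Ptrlist {
    union Ptrlist *next;
    State *s;
};

/* Hypothetical helper: returns 1 when both members start at the same
   address as the union object itself. */
int members_share_address(const union Ptrlist *l)
{
    assert(l != NULL);
    return (const void *)&l->next == (const void *)&l->s
        && (const void *)&l->next == (const void *)l;
}
```

Calling `members_share_address` on any `union Ptrlist` object should return 1, because every union member sits at offset 0.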

As others have already pointed out, all members of a union are active at all times. The only thing to consider is whether the members are each in a valid state.
If you ever do want some level of exclusivity, you would instead require a tagged union. The basic idea is to wrap the union in a struct, and the struct has a member identifying which element in the union should be used. Take this example:
enum Tag {
FIRST,
SECOND
};
struct {
enum Tag tag;
union {
int First;
double Second;
};
} taggedUnion;
Now taggedUnion could be used like:
if(taggedUnion.tag == FIRST)
// use taggedUnion.First;
else
// use taggedUnion.Second

Yes, it's supposed to be like that. Both s and next occupy the same memory location, and you can use either of them at any time; they are not exclusive.

You are performing an assignment to next from l->next. Then, you "overwrite" l->s through the assignment l->s = s.
When you assign to l->s, it overwrites the memory held in l->next. If next and s are the same "size", then both likely could be "active" at the same time.

No, it's not true. You can use any member of a union at any time, although the results if you read a member that wasn't the one most recently written to are a little bit complicated. But that isn't even happening in your code sample and there's absolutely nothing wrong with it. For each item in the list its next member is read and then its s member is written, overwriting its next.
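To see the read-then-overwrite order in action, here is a small sketch built around the question's `patch` loop. The `State` definition and the `link_slots` helper are assumptions added for illustration only.

```c
#include <assert.h>
#include <stddef.h>

typedef struct State { int id; } State;   /* hypothetical stand-in type */
typedef union Ptrlist Ptrlist;

union Ptrlist {
    Ptrlist *next;
    State *s;
};

/* Same loop as in the question: the link is saved before it is overwritten. */
void patch(Ptrlist *l, State *s)
{
    Ptrlist *next;
    for (; l; l = next) {
        next = l->next;   /* read the link while it is still valid */
        l->s = s;         /* reuse the same storage for the State pointer */
    }
}

/* Hypothetical helper: chains n slots into a list and returns its head. */
Ptrlist *link_slots(Ptrlist *slots, size_t n)
{
    assert(slots != NULL && n > 0);
    for (size_t i = 0; i + 1 < n; i++)
        slots[i].next = &slots[i + 1];
    slots[n - 1].next = NULL;
    return &slots[0];
}
```

After `patch(link_slots(slots, 3), &target)`, every slot holds `&target` in its `s` member and the list links are gone, which is exactly the intended effect.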

See http://publications.gbdirect.co.uk/c_book/chapter6/unions.html for an introductory discussion on unions.
Basically, it's an easy way to do type casting in advance. So, instead of having
int query_my_data(void *data, int data_len) {
switch(data_len) {
case sizeof(my_data_t): return ((my_data_t *)data)->value;
case sizeof(my_other_data_t): return ((my_other_data_t *)data)->other_val;
default: return -1;
}
}
You could simplify it by doing
typedef struct {
int data_type;
union {
my_data_t my_data;
my_other_data_t other_data;
} union_data;
} my_union_data_t;
int query_my_data(my_union_data_t *data) {
switch(data->data_type) {
case TYPE_MY_DATA: return data->union_data.my_data.value;
case TYPE_MY_OTHER_DATA: return data->union_data.other_data.other_val;
default: return -1;
}
}
Where my_data and other_data would have the same starting address in memory.

A union is similar to a struct, but it only allocates enough memory for its largest member, and all members share that storage. The size of the union is at least the size of the largest type stored in it (padding may be added for alignment). For example:
union A {
unsigned char c;
unsigned short s;
};
int sizeofA = sizeof(union A); // typically 2 bytes
union B {
unsigned char c[4];
unsigned short s[2];
unsigned int i;
};
int sizeofB = sizeof(union B); // typically 4 bytes
In the second example, on a little-endian machine with a 2-byte short, s[0] == (((c[1] << 8) & 0xff00) | c[0]). The variables c, s and i overlap.
union B b;
// This assignment
b.s[0] = 0;
// is similar to:
b.c[0] = 0;
b.c[1] = 0;
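The overlap can be verified without relying on byte order. This is a sketch only; the function name `short_write_clears_bytes` is made up for illustration, and it assumes `unsigned short` has no padding bits, as on common platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

union B {
    unsigned char c[4];
    unsigned short s[2];
    unsigned int i;
};

/* Fills the union with 0xFF bytes, stores 0 through s[0], and returns 1
   when every byte of c that overlaps s[0] now reads as 0. */
int short_write_clears_bytes(void)
{
    union B b;
    memset(&b, 0xFF, sizeof b);
    b.s[0] = 0;
    assert(sizeof b.s[0] <= sizeof b.c);   /* assumption: short fits in c */
    for (size_t k = 0; k < sizeof b.s[0]; k++) {
        if (b.c[k] != 0)
            return 0;
    }
    return 1;
}
```

Inspecting the bytes through `unsigned char` is always permitted, so this check does not depend on endianness.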
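A union is not restricted to primitive types and pointers: in C its members may also be arrays or structs. In C++ before C++11, a union cannot hold a class with a non-trivial constructor, destructor or copy assignment operator. Most other rules are the same as for a structure, such as public access by default and the usual storage rules.
The union in your example holds only two pointers, so it is valid as written.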

Related

Is it possible to choose one of 2 structures dynamically which are under UNION?

This question is regarding C language concepts.
Existing structure:
struct parent{
char c[4];
float f;
double d;
int flag;
struct child_old
{
int i;
float l;
}c1;
};
I want to add a new structure under parent (lets call it - child_new).
I use only one of the child structures at a time based on scenario. Not both at a time.
So I can put them under UNION.
Modified structure:
struct parent{
char c[4];
float f;
double d;
int flag;
union{
struct child_old
{
int i;
float j;
}c1;
struct child_new
{
int i;
float j;
char c[32];
double d;
}c2;
}UU;
};
Here my requirement is, based on struct member "flag" value (0/1), I need to decide which child structure I need to use.
This is because:
There is huge data stored in my file system of type parent structure. There should not be any problem while reading them.
while using child_old, I dont want to consume extra space needed by child_new.
Is it possible in C?
Or is there any work around solution?
What you've written is fine, and you would consume it with something like:
switch(p.flag) {
case CHILD_OLD:
// work with p.c1
break;
case CHILD_NEW:
// work with p.c2
break;
}
However, your full struct will always be big enough for the largest member of the union. So when you use c1, you still have enough space for c2 allocated. But, at least you're not allocating sizeof(c1) + sizeof(c2) each time.
If you really want to allocate more or less space depending on which variant each record uses, you'll need to put a pointer in the struct, and dynamically allocate a separate record for the child elements.
All of this does mean that if you're reading byte arrays from disk then casting them to a struct:
struct parent *p = (struct parent *) addressOfSomeDataReadFromAFile;
(Not a great idea, but not unusual in the wild)
... then expanding the parent struct using the union technique will not generally work. Your existing files will represent a record as fewer bytes than the new struct.
Given your requirement of
while using child_old, I dont want to consume extra space needed by child_new.
you can't use a union.
Per 6.7.2.1 Structure and union specifiers, paragraph 16 of the C Standard:
The size of a union is sufficient to contain the largest of its
members.
Thus the size of the union would be that of the largest member.
Note also, as pointed out in the comments, that changing the union may also impact the padding/alignment of other elements of any structure containing that union.
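A quick way to see the size effect is to compare the two layouts with `sizeof`. This sketch copies the question's members; the names `parent_old`, `parent_new` and `record_growth` are assumptions introduced for the comparison.

```c
#include <assert.h>
#include <stddef.h>

struct child_old { int i; float j; };
struct child_new { int i; float j; char c[32]; double d; };

/* Layout of the records already stored on disk. */
struct parent_old {
    char c[4];
    float f;
    double d;
    int flag;
    struct child_old c1;
};

/* Layout after wrapping both children in a union. */
struct parent_new {
    char c[4];
    float f;
    double d;
    int flag;
    union {
        struct child_old c1;
        struct child_new c2;
    } UU;
};

/* Number of bytes each record grows by once the union is introduced. */
size_t record_growth(void)
{
    return sizeof(struct parent_new) - sizeof(struct parent_old);
}
```

The exact growth depends on the platform's padding rules, but it is never zero here, which is why old files cannot simply be read into the new struct.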
It is possible, but it is neither pretty nor effective. It is considered bad practice to use unions for storing different kinds of unrelated data. And as already mentioned, the size will be that of the biggest member of the union, so it is not memory-efficient either.
Instead, here is a more sensible solution:
struct parent {
char c[4];
float f;
double d;
int flag;
void* data;
};
...
struct parent x;
x.data = malloc(sizeof(struct child_old));
struct child_old* co_ptr = x.data;
co_ptr->i = ...;
Here the void* points at the actual data which is allocated elsewhere. You will also need some means to keep track of which kind of data that is stored there.
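One way to keep track of what the pointer refers to is to let `flag` choose the allocation size. This is a sketch under assumptions: the helper names and the meaning of the flag values (0 for `child_old`, 1 for `child_new`) are not from the original post.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct child_old { int i; float j; };
struct child_new { int i; float j; char c[32]; double d; };

struct parent {
    char c[4];
    float f;
    double d;
    int flag;      /* assumed: 0 means data is a child_old, 1 a child_new */
    void *data;
};

/* Bytes needed for the child selected by flag, or 0 for an unknown flag. */
size_t child_size(int flag)
{
    switch (flag) {
    case 0:  return sizeof(struct child_old);
    case 1:  return sizeof(struct child_new);
    default: return 0;
    }
}

/* Allocates a zeroed child of the right size. Returns 0 on success, -1 on failure. */
int parent_alloc_child(struct parent *p, int flag)
{
    size_t n = child_size(flag);
    void *mem;
    if (p == NULL || n == 0)
        return -1;
    mem = calloc(1, n);
    if (mem == NULL)
        return -1;
    p->flag = flag;
    p->data = mem;
    return 0;
}

/* Releases the child and clears the pointer. */
void parent_free_child(struct parent *p)
{
    assert(p != NULL);
    free(p->data);
    p->data = NULL;
}
```

With this layout a record using `child_old` only pays for `sizeof(struct child_old)` plus one pointer, at the cost of an extra allocation per record.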
It's ugly but it works.
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h> /* read() */
enum Type {
A,
B,
};
struct Base {
enum Type type;
};
struct A {
char foo[4];
};
struct B {
char bar[8];
};
struct BaseA {
struct Base base;
struct A a;
};
struct BaseB {
struct Base base;
struct B b;
};
int main(void) {
int fd = open("foo.bar", O_RDONLY);
if (fd == -1) {
return 1;
}
union {
struct Base base;
struct BaseA base_a;
struct BaseB base_b;
} buffer;
if (read(fd, &buffer.base, sizeof buffer.base) != sizeof buffer.base) {
return 1;
}
if (buffer.base.type == A) {
if (read(fd, &buffer.base_a.a, sizeof buffer.base_a.a) !=
sizeof buffer.base_a.a) {
return 1;
}
} else if (buffer.base.type == B) {
if (read(fd, &buffer.base_b.b, sizeof buffer.base_b.b) !=
sizeof buffer.base_b.b) {
return 1;
}
} else {
return 1;
}
}
You can do whatever you need inside each branch of the if statement, for example adding every BaseA to one array and every BaseB to another.
If you don't want padding use something like #pragma pack(n).

Assigning the value of an union containing structures to NULL in c programming

I have a union and 2 structures in the following format and I need to set one of them to NULL.
For example m = NULL or t = NULL;
typedef struct
{
char *population;
char *area;
} metropolitan;
typedef struct
{
char *airport;
char *type;
} tourist;
typedef union
{
tourist t;
metropolitan m;
} ex;
First of all, the members of your union are not pointers, so it doesn't make all that much sense setting the value to NULL.
Second, the way a union works is that if you set one of the members, you set the others as well. More precisely, all members of a union have the same address, but different types. That is, the different types give you a way to interpret the same area in memory as multiple types at once.
To contrast this with a struct:
struct a {
unsigned b;
char *c;
};
In this case, the appropriate number of bytes is allocated for each of the fields b and c, one after the other.
union a {
unsigned b;
char *c;
};
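Here, the values of b and c are stored at the same address. That is, provided that unsigned and char * have the same size and a null pointer is represented by all-zero bits (true on many platforms, but not guaranteed), setting a.b to 0 means a readout from a.c would give NULL (numeric 0x0).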
Since tourist and metropolitan are structs, you cannot assign NULL to them.
But, if you declare like
typedef union
{
tourist* t;
metropolitan* m;
} ex;
you can do
ex e;
e.t=NULL; //which makes e.m=NULL as well
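The pointer variant can be tried out as follows. This is a sketch; the function name `clearing_t_clears_m` is made up, and it relies on the C rule that all pointers to structure types share the same representation.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { char *population; char *area; } metropolitan;
typedef struct { char *airport; char *type; } tourist;

typedef union {
    tourist *t;
    metropolitan *m;
} ex;

/* Clears the union through t and reports whether m reads back as NULL. */
int clearing_t_clears_m(void)
{
    ex e;
    e.t = NULL;
    return e.m == NULL;
}
```

The function should return 1: both members are struct pointers stored in the same bytes, so a null pointer written through one reads back as null through the other.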
With unions you need to tell them apart.
So define a structure thus:
typedef enum { MET, TOURIST} Chap;
typedef struct {
Chap human;
union {
tourist t;
metropolitan m;
};
} Plonker;
Then you know which is which and can set the appropriate values,
i.e.
Plonker p;
p.human = MET;
p.m.population = NULL;
... etc

How to check what type is currently used in union?

let's say we have a union:
typedef union someunion {
int a;
double b;
} myunion;
Is it possible to check what type is in union after I set e.g. a=123?
My approach is to add this union to some structure and set uniontype to 1 when it's int and 2 when it's double.
typedef struct somestruct {
int uniontype;
myunion numbers;
} mystruct;
Is there any better solution?
Is there any better solution?
No, the solution that you showed is the best (and the only) one. unions are pretty simplistic - they do not "track" what you've assigned to what. All they do is let you reuse the same memory range for all their members. They do not provide anything else beyond that, so enclosing them in a struct and using a "type" field for tracking is precisely the correct thing to do.
C does not automatically keep track of which field in a union is currently in use. (In fact, reading from the "wrong" field reinterprets the stored bytes as the other type, which can yield an unspecified value or even a trap representation.) As such, it is up to your code to keep track of which one is currently used / filled out.
Your approach to keeping a separate 'uniontype' variable is a very common approach to this, and should work well.
There is no way to directly query the type currently stored in a union.
The only ways to know the type stored in a union are to have an explicit flag (as in your mystruct example), or to ensure that control only flows to certain parts of the code when the union has a known active element.
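The explicit-flag approach is easiest to keep consistent when every write goes through a small setter that updates the tag as well. The following is a sketch, not the only way to do it; the enum constants and the helper names `set_int`, `set_double` and `as_double` are assumptions (the question itself used the plain values 1 and 2).

```c
#include <assert.h>
#include <stddef.h>

typedef union someunion {
    int a;
    double b;
} myunion;

/* Hypothetical tag names standing in for the question's 1 and 2. */
enum uniontype { HOLDS_NOTHING = 0, HOLDS_INT = 1, HOLDS_DOUBLE = 2 };

typedef struct somestruct {
    enum uniontype uniontype;
    myunion numbers;
} mystruct;

/* Each setter writes the value and the tag together. */
void set_int(mystruct *m, int value)
{
    assert(m != NULL);
    m->numbers.a = value;
    m->uniontype = HOLDS_INT;
}

void set_double(mystruct *m, double value)
{
    assert(m != NULL);
    m->numbers.b = value;
    m->uniontype = HOLDS_DOUBLE;
}

/* Reads whichever member the tag names; returns fallback if nothing is stored. */
double as_double(const mystruct *m, double fallback)
{
    assert(m != NULL);
    switch (m->uniontype) {
    case HOLDS_INT:    return (double)m->numbers.a;
    case HOLDS_DOUBLE: return m->numbers.b;
    default:           return fallback;
    }
}
```

Because callers never touch `numbers` directly, the tag cannot drift out of sync with the stored member.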
Depending on the application, if it is a short lived object you may be able to encode the type in the control flow, ie. have separate blocks/functions for both cases
struct value {
const char *name;
myunion u;
};
void throwBall(Ball* ball)
{
...
struct value v;
v.name = "Ball"; v.u.b = 1.2;
process_value_double(&v); //double
struct value v2;
v2.name = "Age";
v2.u.a = 19;
check_if_can_drive(&v2); //int
...
}
void countOranges()
{
struct value v;
v.name = "counter";
v.u.a = ORANGE;
count_objects(&v); //int
}
Warning: the following is just for learning purpose:
You could use some ugly tricks to do so (as long as the data types in your union have different sizes, which is the present case):
#include <stdio.h>
typedef union someunion {
int a;
double b;
} myunion;
typedef struct somestruct {
int uniontype;
myunion numbers;
} mystruct;
#define UPDATE_CONTENT(container, value) if ( \
((sizeof(value) == sizeof(double)) \
? (container.uniontype = ((container.numbers.b = value), 2)) \
: (container.uniontype = ((container.numbers.a = value), 1))))
int main()
{
mystruct my_container;
UPDATE_CONTENT(my_container, 42);
printf("%d\n", my_container.uniontype);
UPDATE_CONTENT(my_container, 37.1);
printf("%d\n", my_container.uniontype);
return (0);
}
But I advise you never do this.
Maybe my variant helps:
struct Table
{
char mas[10];
int width;
int high;
union stat
{
int st;
char v;
} un;
};
struct Table tble[2];
strcpy(tble[0].mas, "box");
tble[0].high = 12;
tble[0].width = 14;
tble[0].un.v = 'S';
strcpy(tble[1].mas, "bag");
tble[1].high = 12;
tble[1].width = 14;
tble[1].un.st = 40;
//struct Table *ptbl = &tble[0];
//ptbl++;
for (int i = 0; i < 2; i++)
{
void *pt = &tble[i].un;
if(*((char*)pt) == 'S')
sort(put_on_bag_line);
else
sort(put_on_box_line);
}

How to check if a union in C has not been "initialized"?

Can I check if union is null in c? For example:
union { char *name; struct Something *something; ... } someUnion;
Is there a way to check if no element has been initialized, without doing an element-wise check?
Thanks.
No, not without adding a specific flag for that purpose. For example:
struct someStruct {
int initialized;
union {
char *name;
struct Something *something;
};
};
You could even store a flag instead of initialized that indicates which kind of data the union contains. This is commonly called a Tagged union.
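A tag with a dedicated "nothing stored" value covers both needs at once. This is a sketch under assumptions: the `Kind` enum, the member name `u`, and the helpers `is_initialized` and `set_name` are illustrative names, and `struct Something` is a placeholder definition.

```c
#include <assert.h>
#include <stddef.h>

struct Something { int value; };   /* hypothetical payload type */

/* KIND_NONE is 0 so that zero-initialization means "nothing stored yet". */
enum Kind { KIND_NONE = 0, KIND_NAME, KIND_SOMETHING };

struct someStruct {
    enum Kind kind;
    union {
        char *name;
        struct Something *something;
    } u;
};

/* Returns nonzero once any member has been stored through a setter. */
int is_initialized(const struct someStruct *s)
{
    assert(s != NULL);
    return s->kind != KIND_NONE;
}

/* Stores a name and records which member is now in use. */
void set_name(struct someStruct *s, char *name)
{
    assert(s != NULL);
    s->u.name = name;
    s->kind = KIND_NAME;
}
```

Declaring the object as `struct someStruct s = {0};` leaves it in the `KIND_NONE` state, so no separate initialization step is needed before the first check.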
Yes, provided that all members of the union are of pointer type or an integral type of the same size, that the union starts out zeroed, and that by "initialized" you mean that a value other than NULL has been assigned, it is sufficient to check one element for NULL.
union {
char * name;
struct Something * something; } someUnion;
if (someUnion.name != 0) {
// here you know that someUnion.something is not NULL too.
// You don't know whether it has been initialized as char*
// or as struct Something* though. Presumably, since
// it is a union, both interpretations make some sense.
}

access union member in c

I have a question about union in c language
for example:
typedef struct {
int a;
float c;
}Type1;
typedef struct {
int b;
char d;
}Type2;
union Select {
Type1 type1;
Type2 type2;
};
void main() {
union Select *select;
//can we access type1 first and then access type2 immediately? like this:
select->type1.a;
select->type2.b;
//after accessing type1 and then accessing type2 immediately, can we get the value b of type2?
//I modified the first post a little bit, because it was meaningless at the beginning.
}
This is guaranteed to work by ISO/IEC 9899:1999, 6.5.2.3, paragraph 5:
One special guarantee is made in order to simplify the use of unions: if a union contains
several structures that share a common initial sequence (see below), and if the union
object currently contains one of these structures, it is permitted to inspect the common
initial part of any of them anywhere that a declaration of the complete type of the union is
visible. Two structures share a common initial sequence if corresponding members have
compatible types (and, for bit-fields, the same widths) for a sequence of one or more
initial members.
Yes, that is correct. In your example (ignoring the uninitialised pointer) the value of type1.a and type2.b will always be the same for any given instance of Select.
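The common-initial-sequence guarantee can be exercised with the question's own types. This is a minimal sketch; the function name `read_b_after_writing_a` is made up for illustration.

```c
#include <assert.h>

typedef struct {
    int a;
    float c;
} Type1;

typedef struct {
    int b;
    char d;
} Type2;

union Select {
    Type1 type1;
    Type2 type2;
};

/* Stores v through type1.a and reads it back through type2.b.
   Both structs begin with an int, so they share a common initial sequence. */
int read_b_after_writing_a(int v)
{
    union Select sel;
    sel.type1.a = v;
    return sel.type2.b;
}
```

The guarantee covers only the leading `int`; reading `type2.d` after writing `type1.c` is outside the common initial sequence and is ordinary type punning.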
In your example it will work, since both are of type int.
Normally you need a discriminator to know which union member is in use at a given time.
The union has the size of the largest data type (if I recall correctly) and you set/check the type each time to know which data type to access:
struct myStruct {
int type;
union { // anonymous union (C11): members are accessed directly as type1 and type2
Type1 type1;
Type2 type2;
};
};
You would do a check before accessing to know how to use the union:
struct myStruct *aStruct;
//init pointer
if(aStruct->type == TYPE1)//TYPE1 is defined to indicate the corresponding type
{
//access fields of type1
aStruct->type1.a = 0; //use it
}
Before that, you should have set the tag: aStruct->type = TYPE1
Yes, you can access both of them whenever you want. Basically, select->type1 and select->type2 refer to the same location in memory. It's the programmer's job to know what sits in that location in memory, usually by means of a flag:
#include <stdbool.h>
union Select {
Type1 type1;
Type2 type2;
};
struct SelectType {
bool isType1; // true when select currently holds a Type1
union Select select;
};
int getValue (struct SelectType s){
if (s.isType1){
return s.select.type1.a;
} else {
return s.select.type2.b;
}
}
int main(void) {
struct SelectType select;
int value;
select.select.type1.a = 5;
select.isType1 = true;
select.select.type2.b = 5;
select.isType1 = false;
value = getValue (select);
return value == 5 ? 0 : 1;
}
Yes, we can. Its value does not change in this case. It's not wrong, but it is meaningless.
By the way, did you forget to allocate memory for the pointer 'select'?
I really want to help, but my english is not very good. And this is my first post. So if I have said something wrong, tell me.
Yes, you can, because main has the scope for your union declaration.
Quote from the final version of the C99 standard.
The following is not a valid fragment (because the union type is not visible within function f)
struct t1 { int m; };
struct t2 { int m; };
int f(struct t1 *p1, struct t2 *p2)
{
if (p1->m < 0)
p2->m = -p2->m;
return p1->m;
}
int g()
{
union {
struct t1 s1;
struct t2 s2;
} u;
/* ... */
return f(&u.s1, &u.s2);
}
All members of a union reside at the same memory location.
A union is usually used to access the same block of memory in more than one way.
For example, you can define:
union
{
char a[100];
int b[50];
}x;
With a typical 4-byte int, the union size will be 200 bytes (the size of b, its largest member), and reading b[0] is like reading a[0] through a[3] together.

Resources