Equality of structure using pragma pack in C

The usual explanation for why structures can't be compared for equality in C is the presence of slack (padding) bytes, which make the comparison unreliable.
But if I use #pragma pack(1), which removes the slack bytes, the comparison should go smoothly. It still gives an error when the structs are compared.
Example Code
#include<stdio.h>
#pragma pack(1)
struct person
{
int uid;
char nameStart;
};
struct personDupe
{
int uid;
char nameStart;
};
int main()
{
struct person var;
struct personDupe varDupe;
printf("\nSize of person : %3zu\n",sizeof(var)); // sizeof yields size_t, so use %zu
printf("\nSize of personDupe : %3zu\n",sizeof(varDupe));
var.uid = 12;
var.nameStart = 'a';
varDupe.uid = 12;
varDupe.nameStart = 'a';
if(var == varDupe) //Error is introduced
printf("\nStructures are equal\n");
return 0;
}

Your code doesn't compile because two structs can't be compared directly with ==.
You should use something like memcmp, which returns 0 when the bytes are identical:
memcmp(&var, &varDupe, sizeof(var));
This doesn't solve the padding problem, which can be solved by ensuring that a struct is properly initialized to a known value even on padding bytes (which can be obtained by memset prior to initialization of fields).
But the approach of packing a struct to remove padding just to check if they are equal seems a fragile solution. If the compiler wants padding then it has a good reason for it, possibly performance related.

You can also write a function that tells the compiler how to decide that two values are the same (bool needs <stdbool.h>):
bool same_person(struct person* p, struct personDupe* dupe)
{ return p->uid == dupe->uid && p->nameStart == dupe->nameStart; }
And then you can do
if(same_person(&var, &varDupe))
printf("\nStructures are equal\n");

Related

Casting a void pointer (that is part of a struct) into another pointer data type

I'm trying to figure out how to parse S-expressions in C on my own, in order to store data and code for my own rudimentary Lisp (written as a learning exercise, not for production).
Before explaining my code and my reasoning, I should explain that all I know about S-expressions is the introductory section of the Wikipedia article on it, and the occasional glance at Common Lisp code, so the naming of my structs and variables may be a bit off.
My language of implementation is C, and before I defined any functions I created the following structs:
typedef enum {
string,
letter,
integer,
} atom_type;
typedef struct {
void* blob;
atom_type type;
} atom;
typedef struct expr {
atom* current;
struct expr* next;
} expr;
Each atom is stored in a struct atom, which contains an enum value (? I'm not sure of the correct jargon for this) and a void pointer pointing to the data to be stored. Each S-expression "node" consists of a pointer to an atom and a pointer to the next S-expression node.
I've written a rudimentary function that accepts a string and parses it into an atom, like the following:
atom* parse_term(char* str) {
size_t len = strlen(str);
atom* current = malloc(sizeof(atom));
if(str[0] == '\'') {
current->blob = (char*) &str[1];
current->type = letter;
} else if(str[0] == '\"') {
char temp[256];
int pos = 1;
while(str[pos] != '\"') {
temp[pos] = str[pos];
pos++;
}
current->blob = malloc(256 * sizeof(char));
current->blob = (char*) &temp;
current->type = string;
} else if(isdigit(str[0])){
char temp[256];
int pos = 0;
while(str[pos] != ' ') {
temp[pos] = str[pos];
pos++;
}
int tmp = atoi(temp);
current->blob = (int*) &tmp;
current->type = integer;
}
return current;
}
The function seems to be working correctly; at least, when I print out the data type it shows it correctly. But apart from this I can't figure out how to print out the actual 'blob': I've tried using the %p formatting code, as well as a switch statement:
void print_atom(atom* current) {
switch(current->type) {
case string:
printf("atom%s\ttype:%d", current->blob, current->type);
case letter:
printf("atom%c\ttype:%d", current->blob, current->type);
case integer:
printf("atom%c\ttype:%d", current->blob, current->type);
}
}
But this doesn't work. In the case of a string, it returns garbled text and in the case of everything else, it just doesn't print anything where the atom's information is supposed to be.
I imagine this is a product of my use of a void* within a struct; how could I remedy this? I think I did cast properly (though I could very well be wrong, please tell me). The only other option I could conceive of is storing a hardcoded variable for every supported data type in the 'atom' struct, but this seems wasteful of resources.
Don't use void*. Use a union. That's what unions are for.
In this example, I use an "anonymous union", which means that I can just refer to its fields as though they were directly inside the Atom struct. (I changed the spelling of names according to my prejudices, so that types are Capitalised and constants are ALLCAPS. I also separated the typedef and struct declarations for Atom, in case Atom turns out to be self-referential.)
typedef enum {
STRING,
LETTER,
INTEGER
} AtomType;
typedef struct Atom Atom;
struct Atom {
union {
char* str;
char let;
int num;
};
AtomType type;
};
void print_atom(Atom* current) {
switch(current->type) {
case STRING:
printf("atom %s\ttype:%d", current->str, current->type);
break; // without break, control falls through into the next case
case LETTER:
printf("atom %c\ttype:%d", current->let, current->type);
break;
case INTEGER:
printf("atom %d\ttype:%d", current->num, current->type);
break;
}
}
As someone says in a comment, that's not actually how Lisp objects look. The usual implementation is to combine cons cells and atoms in a single type, something like this (with a CellType enum instead of AtomType). You'll also need to add CELL to your enum.
typedef struct Cell Cell;
struct Cell {
union {
char* str;
char let;
int num;
struct {
Cell* hd; // Historic name: car
Cell* tl; // Historic name: cdr
};
};
CellType type;
};
Here there's an anonymous struct inside an anonymous union. Some people say this is confusing. Others (me, anyway) say it's less syntactic noise. Use your own judgement.
The use of Cell* inside the definition of Cell is the motivation for typedef struct Cell Cell.
You can play not-entirely-portable-but-usually-ok games to reduce the memory consumption of Cell, and most real implementations do. I didn't, because this is a learning experience.
Also note that real Lisps (and many toy ones) effectively avoid most parsing tasks; the language includes character macros which effectively do what parsing is needed (which isn't much); for the most part, they can be implemented in Lisp itself (although you need some way to bootstrap).

If only using the first element, do I have to allocate mem for the whole struct?

I have a structure where the first element is tested and dependent on its value the rest of the structure will or will not be read. In the cases where the first element's value dictates that the rest of the structure will not be read, do I have to allocate enough memory for the entire structure or just the first element?
struct element
{
int x;
int y;
};
int foo(struct element* e)
{
if(e->x > 3)
return e->y;
return e->x;
}
in main:
int i = 0;
int z = foo((struct element*)&i);
I assume that if only allocating for the first element is valid, then I will have to be wary of anything that may attempt to copy the structure. i.e. passing the struct to a function.
Don't force your information into structs where it's not needed: don't use the struct as the parameter of your function.
Either pass the member of your struct to the function or use inheritance:
typedef struct {
int foo;
} BaseA;
typedef struct {
int bar;
} BaseB;
typedef struct {
BaseA a;
BaseB b;
} Derived;
void foo(BaseB* info) { ... }
...
Derived d;
foo(&d.b);
BaseB b;
foo(&b);
If you're just curious (and seriously, don't use this): you may.
typedef struct {
int foo, goo, hoo, joo;
} A;
typedef struct {
int unused, goo;
} B;
int foo(A* a) { return a->goo; }
...
B b;
int goo = foo((A*)&b);
In general you'll have to allocate a block of memory of at least as many bytes as are required to fully read the accessed member with the largest offset in your structure. In addition, when writing to this block you have to make sure to use the same member offsets as in the original structure.
The point being, a structure is only a block of memory with different areas assigned different interpretations (int, char, other structs etc...) and accessing a member of a struct (after reordering and alignment) boils down to simply reading from or writing to a bit of memory.
I do not think the code as given is legitimate. To understand why, consider:
struct CHAR_AND_INT { unsigned char c; int i; };
struct CHAR_AND_INT *p;
A compiler would be entitled to assume that p->c will be word-aligned and have whatever padding would be necessary for p->i to also be word-aligned. On some processors, writing a byte may be slower than writing a word. For example, a byte-store instruction may require the processor to read a word from memory, update one byte within it, and write the whole thing back, while a word-store instruction could simply store the new data without having to read anything first. A compiler that knew that p->c would be word-aligned and padded could implement p->c = 12; by using a word store to write the value 12. Such behavior wouldn't yield desired results, however, if the byte following p->c wasn't padding but instead held useful data.
While I would not expect a compiler to impose "special" alignment or padding requirements on any part of the structure shown in the original question (beyond those which apply to int) I don't think anything in the standard would forbid a compiler from doing so.
You need to only check that the structure itself is allocated; not the members (in that case at least)
int foo(struct element* e)
{
if ( e != 0) // check that the e pointer is valid
{
if(e->x != 0) // here you only check to see if x is different than zero (values, not pointers)
return e->y;
}
return 0;
}
In your edited change, I think this is poor coding:
int i = 0;
int z = foo((struct element*)&i);
In that case, i will be allocated on the stack, so its address is valid and will remain valid in foo. But since you cast it into something different, the members will be garbage (at best).
Why do you want to cast an int into a structure?
What is your intent?

Kind of polymorphism in C

I'm writing a C program in which I define two types:
typedef struct {
uint8_t array[32];
/* struct A's members */
...
} A;
typedef struct {
uint8_t array[32];
/* struct B's members, different from A's */
...
} B;
Now I would like to build a data structure which is capable of managing both types without having to write one for type A and one for type B, assuming that both have a uint8_t [32] as their first member.
I read how to implement a sort of polymorphism in C here and I also read here that the order of struct members is guaranteed to be kept by the compiler as written by the programmer.
I came up with the following idea, what if I define the following structure:
typedef struct {
uint8_t array[32];
} Element;
and define a data structure which only deals with data that have type Element? Would it be safe to do something like:
void f(Element * e){
int i;
for(i = 0; i < 32; i++) do_something(e->array[i]);
}
...
A a;
B b;
...
f(((Element *)&a));
...
f(((Element *)&b));
At a first glance it looks unclean, but I was wondering whether there are any guarantees that it will not break?
If array is always the first member of your struct, you can simply access it by casting pointers. There is no need for a struct Element. Your data structure can store void pointers.
typedef struct {
char array[32];
} A;
typedef struct {
void* elements;
size_t elementSize;
size_t num;
} Vector;
char* getArrayPtr(Vector* v, int i) {
return (char*)(v->elements) + v->elementSize*i;
}
int main()
{
A* pa = malloc(10*sizeof(A));
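pa[3].array[0] = 's';
pa[3].array[1] = '\0'; // malloc'd memory is uninitialized, so terminate the string before printing with %s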
Vector v;
v.elements = pa;
v.num = 10;
v.elementSize = sizeof(A);
printf("%s\n", getArrayPtr(&v, 3));
}
but why not have a function that works with the array directly
void f(uint8_t array[32]){
int i;
for(i = 0; i < 32; i++) do_something(array[i]);
}
and call it like this
f(a.array);
f(b.array);
Polymorphism makes sense when you want to keep a and b in a container of some sort and iterate over them without caring that they are different types.
This should work fine if you, you know, don't make any mistakes. A pointer to the A struct can be cast to a pointer to the element struct, and so long as they have a common prefix, access to the common members will work just fine.
A pointer to the A struct, which is then cast to a pointer to the element struct can also be cast back to a pointer to the A struct without any problems. If element struct was not originally an A struct, then casting the pointer back to A will be undefined behavior. And this you will need to manage manually.
One gotcha (that I've run into) is, gcc will also allow you to cast the struct back and forth (not just pointer to struct) and this is not supported by the C standard. It will appear to work fine until your (my) friend tries to port the code to a different compiler (suncc) at which point it will break. Or rather, it won't even compile.

Generate version ID of struct definition?

Basically, what I want is some kind of compile-time generated version that is associated with the exact definition of a struct. If the definition of the struct changes in any way (field added, moved, maybe renamed), I want that version to change, too.
Such a version constant would be useful when reading in a previously serialized struct, to make sure that it's still compatible. The alternative would be manually keeping track of a manually specified constant, which has potentially confusing effects if incrementing it is forgotten (deserializing produces garbage), and also raises the question when exactly to increment it (during development and testing, or only during some kind of release).
This could be achieved by using an external tool to generate a hash over the struct definition, but I'm wondering if it is possible with the C compiler (and/or maybe its preprocessor) itself.
This is actually some form of introspection and so I suspect that this may not be possible at all in ANSI C, but I would be happy with a solution that works with gcc and clang.
The Windows API used to (still does?) have a size member as one of the first members of a struct, so that it knew what version of the struct it was being passed (see WNDCLASSEX as an example):
struct Foo
{
size_t size;
char *bar;
char *baz;
/* Other fields */
};
And before calling you set the size using sizeof:
struct Foo f;
f.size = sizeof(struct Foo);
f.bar = strdup("hi");
f.baz = strdup("there");
somefunc(&f);
Then somefunc would know, based on the size member, which version of the struct it was dealing with. Because sizeof is evaluated at compile time instead of run-time, this allows for backwards ABI compatibility.
There is nothing that would do it automatically, but you can build something that works reasonably reliably: you can use sizeof and offsetof, and combine them in such a way that the order in which you combine them mattered. Here is an example:
#include <stdio.h>
#include <stddef.h>
#define COMBINE2(a,b) ((a)*31+(b)*11)
#define COMBINE3(a,b,c) COMBINE2(COMBINE2(a,b),c)
#define COMBINE4(a,b,c,d) COMBINE2(COMBINE3(a,b,c),d)
typedef struct A {
int a1;
char a2;
float a3;
} A;
typedef struct B {
int b1;
char b2;
double b3;
} B;
typedef struct C {
char c2;
int c1;
float c3;
} C;
typedef struct D {
int d1;
char d2;
float d3;
int forgotten[2];
} D;
int main(void) {
size_t aSign = COMBINE4(sizeof(A), offsetof(A,a1), offsetof(A,a2), offsetof(A,a3));
size_t bSign = COMBINE4(sizeof(B), offsetof(B,b1), offsetof(B,b2), offsetof(B,b3));
size_t cSign = COMBINE4(sizeof(C), offsetof(C,c1), offsetof(C,c2), offsetof(C,c3));
size_t dSign = COMBINE4(sizeof(D), offsetof(D,d1), offsetof(D,d2), offsetof(D,d3));
printf("%zu %zu %zu %zu", aSign, bSign, cSign, dSign);
return 0;
}
This code prints
358944 478108 399864 597272
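As you can see, this code produces a constant for each structure that reacts to re-ordering of fields of different lengths and to changing fields' types. Because sizeof and offsetof are constant expressions, the compiler can evaluate each value at compile time. It also reacts to adding fields even if you forget to update the list of fields on which you base your computation, which provides a safety net of sorts. The exact numbers depend on the platform's type sizes and alignment.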

How to copy the contents of an unpacked struct to a __packed__ struct?

I read about __packed__ from here and, I understood that when __packed__ is used in a struct or union, it means that the member variables are placed in such a way to minimize the memory required to store the struct or union.
Now, consider the structures in the following code. They contain same elements (same type, same variable names and placed in the same order). The difference is, one is __packed__ and the other is not.
#include <stdio.h>
int main(void)
{
typedef struct unpacked_struct {
char c;
int i;
float f;
double d;
}ups;
typedef struct __attribute__ ((__packed__)) packed_struct {
char c;
int i;
float f;
double d;
}ps;
printf("sizeof(my_unpacked_struct) : %zu \n", sizeof(ups));
printf("sizeof(my_packed_struct) : %zu \n", sizeof(ps));
ups ups1 = init_ups();
ps ps1;
return 0;
}
Is there a way where we can copy unpacked structure ups1 into packed structure ps1 without doing a member-variable-wise-copy? Is there something like memcpy() that is applicable here?
I'm afraid you've just gotta write it out. Nothing in standard C (or any standard I know of) will do this for you. Write it once and never think about it again.
ps ups_to_ps(ups ups) {
return (ps) {
.c = ups.c,
.i = ups.i,
.f = ups.f,
.d = ups.d,
};
}
Without detailed knowledge of the differences in the memory layout of the two structures: No.
