What does (std::size_t)&((Structure *)0)->last do? - c

I recently came across this code and thought it was a little overkill.
struct Structure {
unsigned int first;
unsigned int last;
};
(std::size_t)&((Structure *)0)->last; // 4
So I'm wondering if I can safely do:
sizeof(unsigned int); // 4
instead of:
(std::size_t)&((Structure *)0)->last; // 4
or what that code is doing anyway if it's so much better.
EDIT
That code is more or less equivalent to offsetof, as pointed out in https://stackoverflow.com/a/1379370/1001563
If you know what you're searching for you'll find the answer without having to ask. Thanks to #VladfromMoscow

Programs change over time, so it is better to use a general approach. For example, the type of data member first can be changed, or another data member can be added before data member last.
Take into account that there is already a similar macro in C, defined in <stddef.h> (or <cstddef> in C++):
offsetof(type, member-designator)

Related

C bit field/array

I have to do a project for school. But I got stuck right at the beginning.
I have to define type for bitfield. It isn't a problem, it would look like this:
typedef struct {
unsigned flag1 : 1;
unsigned flag2 : 1;
}BitArray;
But next task is to do set of macros for working with bitfield. One of them is:
create(field_name,size) /* defines and initializes bitfield */
The question is how can I typedef bitfield so I could change number of its members later?
The second method that came to my mind is to use a bool array. But again, how can I typedef a bool array? At first I tried:
typedef bool BitArray[]; //new identifier BitArray for bool array
BitArray Array[5]; //new BitArray variable Array - this line would be in the macro mentioned above
It didn't take me a long time to realise that it won't work. However I can do:
typedef bool BitArray;
BitArray Array[5];
But it just renames the identifier for bool.
I hope my post makes sense, and thank you for any advice you can give.
The question is how can I typedef bitfield so I could change number of its members later?
That's what the typedef struct ... BitArray does. No need for additional types.
A bool array will most likely not compile into bit-wise code, it will likely compile into an array of bytes, which is not what you want.
In addition, it is a very bad idea to hide arrays or pointers behind typedefs, so that they no longer look like arrays or pointers.
Some recommendations:
I have to define type for bitfield
I would not recommend using bit fields for any purpose whatsoever. You should question your teacher why they are teaching you to use dangerous and poorly specified parts of the C language.
But next task is to do set of macros for working with bitfield
You should ask your teacher why they teach you to use function-like macros and not proper functions. Using a function-like macro might be the worst thing you can ever do in C: they are dangerous, they are unreadable, they are hard to debug, they are hard to maintain.
Combining function-like macros mixed with bit-fields seems like a really stupid idea, but of course that is just my personal opinion. The safe and 100% portable way is to use bit-wise operators with masks on byte-level variables, such as:
uint8_t my_var=0;
my_var |= 0x80; // set msb, bit 7
my_var &= ~0x80; // clear msb, bit 7
In C you can declare a typedef only for fixed-size array, like this:
typedef bool bitset8[8]; // 8 is constant expression
bitset8 bs8;
bs8[0] = true;
I don't quite understand how exactly create macro from your post is going to be used, but if you need to dynamically change number of fields you have to use malloc'ed objects ANYWAY, so the declaration of BitArray struct should contain a pointer to let's say unsigned char (that is a pointer to byte array essentially). The content of the array should be managed by separate functions, that may be called from macros (though there is no real need in them).

What's the benefit of encapsulating only one basic field into a struct in C?

I saw some C code like this:
// A:
typedef uint32_t in_addr_t;
struct in_addr { in_addr_t s_addr; };
And I always prefer like this:
// B:
typedef uint32_t in_addr;
So my question is: what's the difference / benefit of doing it in A from B?
It's a layer to introduce type safety, and it can be helpful 'for future expansion'.
One problem with the former is that it's easy to 'convert' a value of a type represented by a typedefed builtin to any of several other types or typedefed builtins.
consider:
typedef int t_millisecond;
typedef int t_second;
typedef int t_degrees;
versus:
// field notation could vary greatly here:
struct t_millisecond { int ms; };
struct t_second { int s; };
struct t_degrees { int f; };
In some cases, it makes it a little clearer to use a notation, and the compiler will also forbid erroneous conversions. Consider:
int a = millsecond * second - degree;
this is a suspicious program. using typedefed ints, that's a valid program. Using structs, it's ill-formed -- compiler errors will require your corrections, and you can make your intent explicit.
Using typedefs, arbitrary arithmetic and conversions may be applied, and values may be assigned to each other without warning, which can become a burden to maintain.
Consider also:
t_second s = millisecond;
that would also be a fatal conversion.
It's just another tool in the toolbox -- use at your discretion.
Justin's answer is essentially correct, but I think some expansion is needed:
EDIT: Justin expanded his answer significantly, which makes this one somewhat redundant.
Type safety - you want to provide your users with API functions which manipulate the data, not let it just treat it as an integer. Hiding the field in a structure makes it harder to use it the wrong way, and pushes the user towards the proper API.
For future expansion - perhaps a future implementation would like to change things. Maybe add a field, or break the existing field into 4 chars. With a struct, this can be done without changing APIs.
What's your benefit? That your code won't break if implementation changes.

Access struct members as if they are a single array?

I have two structures, with values that should compute a pondered average, like this simplified version:
typedef struct
{
int v_move, v_read, v_suck, v_flush, v_nop, v_call;
} values;
typedef struct
{
int qtt_move, qtt_read, qtt_suck, qtd_flush, qtd_nop, qtt_call;
} quantities;
And then I use them to calculate:
average = v_move*qtt_move + v_read*qtt_read + v_suck*qtt_suck + v_flush*qtd_flush + v_nop*qtd_nop + v_call*qtt_call;
Every now and then I need to include another variable. Now, for instance, I need to include v_clean and qtt_clean. I can't change the structures to arrays:
typedef struct
{
int v[6];
} values;
typedef struct
{
int qtt[6];
} quantities;
That would simplify my work a lot, but the structures are part of an API that needs the variable names to be clear.
So, I'm looking for a way to access the members of that structures, maybe using sizeof(), so I can treat them as an array, but still keep the API unchangeable. It is guaranteed that all values are int, but I can't guarantee the size of an int.
Writing the question came to my mind... Can a union do the job? Is there another clever way to automatize the task of adding another member?
Thanks,
Beco
What you are trying to do is not possible to do in any elegant way. It is not possible to reliably access consecutive struct members as an array. The currently accepted answer is a hack, not a solution.
The proper solution would be to switch to an array, regardless of how much work it is going to require. If you use enum constants for array indexing (as #digEmAll suggested in his now-deleted answer), the names and the code will be as clear as what you have now.
If you still don't want to or can't switch to an array, the only more-or-less acceptable way to do what you are trying to do is to create an "index-array" or "map-array" (see below). C++ has a dedicated language feature that helps one to implement it elegantly - pointers-to-members. In C you are forced to emulate that C++ feature using offsetof macro
static const size_t values_offsets[] = {
offsetof(values, v_move),
offsetof(values, v_read),
offsetof(values, v_suck),
/* and so on */
};
static const size_t quantities_offsets[] = {
offsetof(quantities, qtt_move),
offsetof(quantities, qtt_read),
offsetof(quantities, qtt_suck),
/* and so on */
};
And if now you are given
values v;
quantities q;
and index
int i;
you can generate the pointers to individual fields as
int *pvalue = (int *) ((char *) &v + values_offsets[i]);
int *pquantity = (int *) ((char *) &q + quantities_offsets[i]);
*pvalue += *pquantity;
Of course, you can now iterate over i in any way you want. This is also far from being elegant, but at least it bears some degree of reliability and validity, as opposed to any ugly hack. The whole thing can be made to look more elegantly by wrapping the repetitive pieces into appropriately named functions/macros.
If all members are guaranteed to be of type int you can use a pointer to int and increment it:
values v; /* instances of the two structs, filled in elsewhere */
quantities q;
int *value = &v.v_move;
int *quantity = &q.qtt_move;
int i;
average = 0;
// although it should work, a good practice IMHO is often to add a null as the last member of the struct and change the condition to quantity[i] != null.
for (i = 0; i < sizeof(quantities) / sizeof(*quantity); i++)
average += value[i] * quantity[i];
(Since the order of members in a struct is guaranteed to be as declared)
Writing the question came to my mind... Can a union do the job? Is there another clever way to automatize the task of adding another member?
Yes, a union can certainly do the job:
union
{
values v; /* As defined by OP */
int array[6];
} u;
You can use a pointer to u.v in your API, and work with u.array in your code.
Personally, I think that all the other answers break the rule of least surprise. When I see a plain struct definition, I assume that the structure will be accessed using normal access methods. With a union, it's clear that the application will access it in special ways, which prompts me to pay extra attention to the code.
It really sounds as if this should have been an array from the beginning, with accessor methods or macros enabling you to still use pretty names like move, read, etc. However, as you mentioned, this isn't feasible due to API breakage.
The two solutions that come to my mind are:
Use a compiler specific directive to ensure that your struct is packed (and thus, that casting it to an array is safe)
Evil macro black magic.
How about using __attribute__((packed)) if you are using gcc?
So you could declare your structures as:
typedef struct
{
int v_move, v_read, v_suck, v_flush, v_nop, v_call;
} __attribute__((packed)) values;
typedef struct
{
int qtt_move, qtt_read, qtt_suck, qtd_flush, qtd_nop, qtt_call;
} __attribute__((packed)) quantities;
According to the gcc manual, your structures will then use the minimum amount of memory possible for storing the structure, omitting any padding that might have normally been there. The only issue would then be to determine the sizeof(int) on your platform which could be done through either some compiler macros or using <stdint.h>.
One more thing is that there will be a performance penalty for unpacking and re-packing the structure when it needs to be accessed and then stored back into memory. But at least you can be assured then that the layout is consistent, and it could be accessed like an array using a cast to a pointer type like you were wanting (i.e., you won't have to worry about padding messing up the pointer offsets).
Thanks,
Jason
This problem is common, and has been solved in many ways in the past. None of them is completely safe or clean. It depends on your particular application. Here's a list of possible solutions:
1) You can redefine your structures so fields become array elements, and use macros to map each particular element as if it was a structure field. E.g:
struct values { int varray[6]; };
#define v_read varray[1]
The disadvantage of this approach is that most debuggers don't understand macros. Another problem is that in theory a compiler could choose a different alignment for the original structure and the redefined one, so binary compatibility is not guaranteed.
2) Count on the compiler's behaviour and treat all the fields as if they were array fields (oops, while I was writing this, someone else wrote the same - +1 for him)
3) create a static array of element offsets (initialized at startup) and use them to "map" the elements. It's quite tricky, and not so fast, but has the advantage that it's independent of the actual disposition of the field in the structure. Example (incomplete, just for clarification):
int position[10];
position[0] = ((char *)(&((values*)NULL)->v_move)-(char *)NULL);
position[1] = ((char *)(&((values*)NULL)->v_read)-(char *)NULL);
//...
values *v = ...;
int vread;
vread = *(int *)(((char *)v)+position[1]);
Ok, not at all simple. Macros like "offsetof" may help in this case.

Portable way to "unpoint" a pointer typedef?

This is unfortunately defined in some external library: cannot touch!
// library.h
typedef struct {
long foo;
char *bar;
/* ... (long & complex stuff omitted) */
} *pointer_to_complex_struct_t;
Now The Question: how to declare an complex_struct_t variable?
Ideal solution but not allowed! (cannot change external library):
// library.h
/* ... (long & complex stuff omitted) */
} complex_struct_t, *pointer_to_complex_struct_t;
// my.h
extern complex_struct_t my_variable;
Non-portable solution (gcc):
// my.h
extern typeof( * (type_placeholder)0 ) my_variable; // Thanks caf!
Other? Better? Thanks!
Bonus question: same question for a function pointer (in case there is any difference; I doubt it).
ADDED bonus: below is the exact same question but with functions instead of structs. This should not make any difference to the short answer ("No."), the only answer I was initially interested in. I did not expect some people to die trying to know and get my job done with creative workarounds, which is why I simplified the question from functions to structs (function pointers have special implicit conversion rules for convenience and confusion). But hey, why not? Let's get the copy-paste workaround competition started. Some workarounds are probably better than others.
///// library.h //////
// Signature has been simplified
typedef double (*ptr_to_callback_t)(long, int, char *);
// Too bad this is not provided: typedef double callback_t(long, int, char *);
///// my.h /////
// This avoids copy-paste but is not portable
typedef typeof( * (ptr_to_callback_t)0 ) callback_t;
extern callback_t callback_1;
extern callback_t callback_2;
extern callback_t callback_3;
// etc.
Short answer = no, there is currently no portable alternative to typeof
A basic copy-paste workaround works OK for functions but not for structs. The compiler will match the duplicated function types, but will not relate the duplicated struct types: a cast is required and the duplicated struct types will diverge without compilation warning.
No, unfortunately you cannot do it with standard C. With C++ a simple metafunction would do the trick though.
However you could just copy-paste the definition of the struct thus leaving the original untouched
typedef struct {
///same struct
} complex_struct_t;
The downside of this solution is that the expression &complex_struct_t won't be of type pointer_to_complex_struct_t, instead it will be of type pointer to unnamed struct {//your members};
You'll need reinterpret_casting, if you need that feature...
As written, the answer to your question is "no"; if all you have is a type definition of
typedef struct {...} *ptr_to_struct;
then there's no (standard, portable) way to extract the struct type. If you have to create an instance of the struct, the best you will be able to do is
ptr_to_struct s = malloc(sizeof *s);
and then refer to the fields in the struct using the -> component selection operator (or by dereferencing s and using the . operator, but you don't want to do that).
You asked if the same thing applied to function pointers; you really need to state exactly what you mean. If you have a situation like
typedef struct {...} *ptr_to_struct;
ptr_to_struct foo() {...}
then the situation is exactly like the above; you don't have a way to declare a variable of that type.
Make a local copy of the header file and include it instead of the original. Now you can do anything you want. If this library could change (update or anything else), you could write a little script to automate these steps and call it from your makefile whenever you compile. Just make sure to not blindly paste into the header, search for the specific line (} *pointer_to_complex_struct_t;) and throw an error if it is no longer found.
Maybe you have to be a bit careful with the search paths for includes if this header uses other headers of this library. Plus, with the order of includes if itself is included by other headers.
EDIT (for your real goal mentioned in a comment): You can't do this with function pointers. Just write the function you want with the signature of the typedef, and it will be compatible to the pointer and can be called by it.
How about:
pointer_to_complex_struct_t newStruct ( void ) {
pointer_to_complex_struct_t ptr = malloc ( sizeof *ptr );
return ptr;
}
You'd have to reference your newly created struct through ptr-> but you could create new ones.
Of course, this may or may not work, depending on how the struct is actually used. The example that comes to mind is: what if the struct ends with
char data[0];
and data is used to point into the memory following the structure.

union versus void pointer

What would be the differences between using simply a void* as opposed to a union? Example:
struct my_struct {
short datatype;
void *data;
};
struct my_struct {
short datatype;
union {
char* c;
int* i;
long* l;
};
};
Both of those can be used to accomplish the exact same thing, is it better to use the union or the void* though?
I had exactly this case in our library. We had a generic string mapping module that could use different sizes for the index, 8, 16 or 32 bit (for historic reasons). So the code was full of code like this:
if(map->idxSiz == 1)
return ((BYTE *)map->idx)[Pos] = ...whatever
else
if(map->idxSiz == 2)
return ((WORD *)map->idx)[Pos] = ...whatever
else
return ((LONG *)map->idx)[Pos] = ...whatever
There were 100 lines like that. As a first step, I changed it to a union and I found it to be more readable.
switch(map->idxSiz) {
case 1: return map->idx.u8[Pos] = ...whatever
case 2: return map->idx.u16[Pos] = ...whatever
case 4: return map->idx.u32[Pos] = ...whatever
}
This allowed me to see more clearly what was going on. I could then decide to completely remove the idxSiz variants using only 32-bit indexes. But this was only possible once the code got more readable.
PS: That was only a minor part of our project which is about several 100’000 lines of code written by people who do not exist any more. The changes to the code have to be gradual, in order not to break the applications.
Conclusion: Even if people are less used to the union variant, I prefer it because it can make the code much lighter to read. On big projects, readability is extremely important, even if it is just you yourself, who will read the code later.
Edit: Added the comment, as comments do not format code:
The change to switch came before (this is now the real code as it was)
switch(this->IdxSiz) {
case 2: ((uint16_t*)this->iSort)[Pos-1] = (uint16_t)this->header.nUz; break;
case 4: ((uint32_t*)this->iSort)[Pos-1] = this->header.nUz; break;
}
was changed to
switch(this->IdxSiz) {
case 2: this->iSort.u16[Pos-1] = this->header.nUz; break;
case 4: this->iSort.u32[Pos-1] = this->header.nUz; break;
}
I shouldn't have combined all the beautification I did in the code and only show that step. But I posted my answer from home where I had no access to the code.
In my opinion, the void pointer and explicit casting is the better way, because it is obvious for every seasoned C programmer what the intent is.
Edit to clarify: If I see the said union in a program, I would ask myself if the author wanted to restrict the types of the stored data. Perhaps some sanity checks are performed which make sense only on integral number types.
But if I see a void pointer, I directly know that the author designed the data structure to hold arbitrary data. Thus I can use it for newly introduced structure types, too.
Note that it could be that I cannot change the original code, e.g. if it is part of a 3rd party library.
It's more common to use a union to hold actual objects rather than pointers.
I think most C developers that I respect would not bother to union different pointers together; if a general-purpose pointer is needed, just using void * certainly is "the C way". The language sacrifices a lot of safety in order to allow you to deliberately alias the types of things; considering what we have paid for this feature we might as well use it when it simplifies the code. That's why the escapes from strict typing have always been there.
The union approach requires that you know a priori all the types that might be used. The void * approach allows storing data types that might not even exist when the code in question is written (though doing much with such an unknown data type can be tricky, such as requiring passing a pointer to a function to be invoked on that data instead of being able to process it directly).
Edit: Since there seems to be some misunderstanding about how to use an unknown data type: in most cases, you provide some sort of "registration" function. In a typical case, you pass in pointers to functions that can carry out all the operations you need on an item being stored. It generates and returns a new index to be used for the value that identifies the type. Then when you want to store an object of that type, you set its identifier to the value you got back from the registration, and when the code that works with the objects needs to do something with that object, it invokes the appropriate function via the pointer you passed in. In a typical case, those pointers to functions will be in a struct, and it'll simply store (pointers to) those structs in an array. The identifier value it returns from registration is just the index into the array of those structs where it has stored this particular one.
Although unions are not common nowadays, a union suits your scenario well because it spells out which types are expected. In the first code sample, the content of data cannot be understood from the declaration.
My preference would be to go the union route. The cast from void* is a blunt instrument and accessing the datum through a properly typed pointer gives a bit of extra safety.
Toss a coin. Union is more commonly used with non-pointer types, so it looks a bit odd here. However the explicit type specification it provides is decent implicit documentation. void* would be fine so long as you always know you're only going to access pointers. Don't start putting integers in there and relying on sizeof(void*) == sizeof (int).
I don't feel like either way has any advantage over the other in the end.
It's a bit obscured in your example, because you're using pointers and hence indirection. But union certainly does have its advantages.
Imagine:
struct my_struct {
short datatype;
union {
char c;
int i;
long l;
};
};
Now you don't have to worry about where the allocation for the value part comes from. No separate malloc() or anything like that. And you might find that accesses to ->c, ->i, and ->l are a bit faster. (Though this might only make a difference if there are lots of these accesses.)
It really depends on the problem you're trying to solve. Without that context it's really impossible to evaluate which would be better.
For example, if you're trying to build a generic container like a list or a queue that can handle arbitrary data types, then the void pointer approach is preferable. OTOH, if you're limiting yourself to a small set of primitive data types, then the union approach can save you some time and effort.
If you build your code with -fstrict-aliasing (gcc) or similar options on other compilers, then you have to be very careful with how you do your casting. You can cast a pointer as much as you want, but when you dereference it, the pointer type that you use for the dereference must match the original type (with some exceptions). You can't for example do something like:
void foo(void * p)
{
short * pSubSetOfInt = (short *)p ;
*pSubSetOfInt = 0xFFFF ;
}
void goo()
{
int intValue = 0 ;
foo( &intValue ) ;
printf( "0x%X\n", intValue ) ;
}
Don't be surprised if this prints 0 (say) instead of 0xFFFF or 0xFFFF0000 as you may expect when building with optimization. One way to make this code work is to do the same thing using a union, and the code will probably be easier to understand too.
The union reserves enough space for its largest member, and the members don't have to be the same size. A void* has a fixed size, whereas a union can be as large as its members require.
#include <stdio.h>
#include <stdlib.h>
struct m1 {
union {
char c[100];
};
};
struct m2 {
void * c;
};
int
main()
{
printf("sizeof m1 is %zu ",sizeof(struct m1));
printf("sizeof m2 is %zu",sizeof(struct m2));
exit(EXIT_SUCCESS);
}
Output:
sizeof m1 is 100 sizeof m2 is 4
EDIT: assuming you only use pointers of the same size as void* , I think the union is better, as you will gain a bit of error detection when trying to set .c with an integer pointer, etc'.
void* , unless you're creating your own allocator, is definitely quick and dirty, for better or for worse.

Resources