As most C programmers know, you can't directly compare two structures.
Consider:
void isequal(MY_STRUCT a, MY_STRUCT b)
{
    if (a == b)
    {
        puts("equal");
    }
    else
    {
        puts("not equal");
    }
}
The a==b comparison will AFAIK throw a compile error on any sensible C compiler, because the C standard doesn't allow for built-in structure comparison. Workarounds using memcmp are of course a bad idea due to alignment, packing, bitfields etc., so we end up writing element by element comparison functions.
On the other hand it DOES allow for structure assignment e.g. a = b is entirely legal.
Clearly the compiler can cope with that fairly trivially, so why not comparison?
The only idea I had was that structure assignment is probably fairly close to memcpy(), as the gaps due to alignment etc. don't matter. On the other hand, a comparison might be more complicated. Or is this something I'm missing?
Obviously, I'm aware that doing a simple element by element comparison isn't necessarily enough, e.g. if the structure contains a pointer to a string, but there are circumstances where it would be useful.
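For reference, the element-by-element comparison mentioned above usually looks something like the sketch below. The question never shows what MY_STRUCT contains, so the three members here (tag, count, ratio) and the helper name my_struct_equal are purely hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical layout: the question does not show MY_STRUCT's members. */
typedef struct {
    char   tag;
    int    count;
    double ratio;
} MY_STRUCT;

/* Compare member by member, so padding bytes are never inspected. */
static bool my_struct_equal(const MY_STRUCT *a, const MY_STRUCT *b)
{
    assert(a != NULL && b != NULL);
    return a->tag == b->tag
        && a->count == b->count
        && a->ratio == b->ratio;   /* note: a NaN never compares equal */
}
```

Passing pointers rather than structures by value avoids two copies per call; that is a design choice of this sketch, not something the language forces.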
As others have mentioned, here's an extract from C: A Reference Manual by Harbison and Steele:
Structures and unions cannot be compared for equality, even though assignment for these types is allowed. The gaps in structures and unions caused by alignment restrictions could contain arbitrary values, and compensating for this would impose an unacceptable overhead on the equality comparison or on all operations that modified structure and union types.
Comparison is unsupported for the same reason memcmp fails.
Due to padding fields the comparison would fail in unpredictable ways which would be unacceptable for most programmers. Assignment changes the invisible padding fields, but these are invisible anyway, so nothing unexpected there.
Obviously, you may ask: so why doesn't it just zero-fill all the padding fields? Sure, that would work, but it would also make all programs pay for something they might not need.
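A minimal sketch of the padding problem, assuming a typical ABI where int is more strictly aligned than char (the struct and helper names are made up for illustration). Two objects are built with different filler bytes and identical members; the member-wise check always agrees, while the byte-wise check is at the mercy of whatever sits in the hole. Note that the standard leaves padding bytes unspecified after a member store, so the byte-wise result is not something to rely on either way.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct padded {
    char tag;     /* usually followed by padding before 'value' */
    int  value;
};

/* Fill every byte (padding included) with 'fill', then set the members. */
static void padded_init(struct padded *p, unsigned char fill, char tag, int value)
{
    assert(p != NULL);
    memset(p, fill, sizeof *p);
    p->tag = tag;
    p->value = value;
}

/* Member-wise equality: padding is never looked at. */
static int padded_fields_equal(const struct padded *a, const struct padded *b)
{
    return a->tag == b->tag && a->value == b->value;
}

/* Byte-wise equality: padding bytes take part in the comparison. */
static int padded_bytes_equal(const struct padded *a, const struct padded *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}

/* Number of bytes in the struct that belong to no member. */
static size_t padded_hole_bytes(void)
{
    return sizeof(struct padded) - sizeof(char) - sizeof(int);
}
```

On common desktop compilers padded_hole_bytes() is 3, and two objects initialised with different fill bytes compare unequal under memcmp even though every member matches.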
EDIT
Oli Charlesworth notes in the comments that you may be asking: "why doesn't the compiler generate code for member-by-member comparison". If that is the case, I must confess: I don't know :-). The compiler would have all the needed information if it would only allow comparing complete types.
I found this in the C rationale (C99 rationale V5.10), 6.5.9:
The C89 Committee considered, on more than one occasion, permitting comparison of structures
for equality. Such proposals foundered on the problem of holes in structures. A byte-wise
comparison of two structures would require that the holes assuredly be set to zero so that all
holes would compare equal, a difficult task for automatic or dynamically allocated variables.
The possibility of union-type elements in a structure raises insuperable problems with this
approach. Without the assurance that all holes were set to zero, the implementation would have
to be prepared to break a structure comparison into an arbitrary number of member comparisons;
a seemingly simple expression could thus expand into a substantial stretch of code, which is
contrary to the spirit of C
In plain English: since structs/unions may contain padding bytes, and the committee did not require these to hold particular values, the feature was not adopted. Requiring all padding bytes to be zero would impose additional run-time overhead, and the alternative of expanding one comparison into many member comparisons was judged contrary to the spirit of C.
Auto-generating a comparison operator is a bad idea. Imagine how comparison would work for this structure:
struct s1 {
int len;
char str[100];
};
This is a Pascal-like string with a maximum length of 100.
Another case:
struct s2 {
char a[100];
};
How can the compiler know how to compare a field? If this is a NUL-terminated string, the compiler must use strcmp or strncmp. If it is a plain char array, the compiler must use memcmp.
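To make the point concrete, here is a sketch of the comparison a human would probably write for s1, assuming len counts the valid bytes in str and everything past len is garbage. The function name s1_equal is hypothetical. A compiler-generated member-wise compare would look at all 100 bytes and get this wrong.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

struct s1 {
    int  len;        /* assumed: number of valid bytes in str, 0..100 */
    char str[100];
};

/* Equal when the lengths match and the first 'len' bytes match;
 * bytes beyond 'len' are deliberately ignored. */
static bool s1_equal(const struct s1 *a, const struct s1 *b)
{
    assert(a->len >= 0 && a->len <= (int)sizeof a->str);
    assert(b->len >= 0 && b->len <= (int)sizeof b->str);
    if (a->len != b->len)
        return false;
    return memcmp(a->str, b->str, (size_t)a->len) == 0;
}
```

Only the programmer knows that the tail of str is meaningless, which is exactly the information the compiler lacks.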
To add to the existing good answers:
struct foo {
union {
uint32_t i;
float f;
} u;
} a, b;
a.u.f = -0.0;
b.u.f = 0.0;
if (a==b) // what to do?!
The problem arises inherently from unions not being able to store/track which member is current.
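The usual way around this is to track the current member yourself. The sketch below adds a tag to the struct (the enum foo_kind and the names tagged_foo and tagged_foo_equal are my own additions, not part of the example above) so that the comparison can pick the right semantics for each case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum foo_kind { FOO_INT, FOO_FLOAT };   /* hypothetical tag */

struct tagged_foo {
    enum foo_kind kind;                  /* records which member is current */
    union {
        uint32_t i;
        float    f;
    } u;
};

/* Compare according to the member that is actually in use. */
static bool tagged_foo_equal(const struct tagged_foo *a, const struct tagged_foo *b)
{
    assert(a != NULL && b != NULL);
    if (a->kind != b->kind)
        return false;
    switch (a->kind) {
    case FOO_INT:
        return a->u.i == b->u.i;
    case FOO_FLOAT:
        return a->u.f == b->u.f;         /* -0.0f == 0.0f is true here */
    }
    return false;                        /* unknown tag */
}
```

With the tag in place, -0.0f and 0.0f compare equal as floats, whereas comparing the same storage as uint32_t would typically report a difference.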
Enumerations in languages such as Swift or Rust support a kind of hybrid "choice plus data" mechanism, such that I could define a type like:
enum SomeOption {
None,
Index(int),
Key(string),
Callback(fn),
}
Now if I were to implement this in C, my understanding is that something like this would not be valid:
typedef enum {
is_callback_or_none,
is_string,
is_number
} my_choice;
typedef struct {
my_choice value_type;
void* value;
} my_option;
my_option x = {
.value_type = is_number,
.value = (void*)42
};
if (x.value_type == is_number) {
int n = (int)x.value;
// … use n…
}
I'm not sure what exactly I risk in doing this, but according to e.g. Can pointers store values and what is the use of void pointers? the only things I should store in a void* are actual addresses and NULL. [Aside: please turn a blind eye to the separate question of storing callback function pointers in a void* which I forgot was problematic when I made up this example.]
I suppose a more proper way to do this would be to use a union, e.g.:
typedef struct {
my_choice value_type;
union {
int number_value;
char* string_value;
void* pointer_value;
};
} my_option;
…which is probably nicer all around anyway. But I'm wondering specifically about the invalidity of the void* value version. What if I were (instead of the union solution) to simply substitute uintptr_t in place of the void*?
typedef struct {
my_choice value_type;
uintptr_t value;
} my_option;
Would storing either a pointer to a string/callback/null or a number within the uintptr_t value field of this struct be valid and [at least POSIX-]portable code? And if so, why is that okay, but not the seemingly equivalent void* value version?
The problem is that I don't understand if the rules are different re. what I can do with a uintptr_t/intptr_t vs. a void*, and if so, why they would be different?
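For comparison, this is roughly how the union variant reads in use. It is only a sketch: I have named the union member as (the question uses an anonymous union, which needs C11), given the callback its own function-pointer member instead of void*, and invented the helpers option_number, option_string and option_get_number.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    is_callback_or_none,
    is_string,
    is_number
} my_choice;

typedef struct {
    my_choice value_type;
    union {
        int         number_value;
        const char *string_value;
        void      (*callback_value)(void);   /* assumed callback signature */
    } as;                                    /* named so it also builds as C99 */
} my_option;

static my_option option_number(int n)
{
    my_option o;
    o.value_type = is_number;
    o.as.number_value = n;
    return o;
}

static my_option option_string(const char *s)
{
    my_option o;
    o.value_type = is_string;
    o.as.string_value = s;
    return o;
}

/* Return the stored number, or 'fallback' when the option holds something else. */
static int option_get_number(const my_option *o, int fallback)
{
    assert(o != NULL);
    return o->value_type == is_number ? o->as.number_value : fallback;
}
```

No casts are needed anywhere, which is the main practical advantage over both the void* and the uintptr_t versions.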
The rules are different because they're out at the edge, within a boundary (or a grey area) between what machines can actually do, and what people want to do, and what a language standard says they can do.
Now, yes, on a "conventional" architecture, pointers and ints are both just binary integers of some size, so clearly it's possible to mix'n'match between the two.
And, again yes, this is clearly something that some people find themselves wanting to do. You've got a big, heterogeneous array of things, and some of them are plain numbers, and some of them are data pointers, and maybe some of them are function pointers, and you've got some way of knowing which is which, and sometimes it really does seem tidy to store them in one big heterogeneous array. Or you've got a function with a parameter that sometimes wants to be an integer and sometimes wants to be a pointer, and you're fine with that, too. (Well, except for all the warnings you get from your compilers, and the lectures from language lawyers and SO regulars.)
But then there's the C Standard, which goes to some pains to distinguish between integers, and data pointers, and function pointers. There are architectures where these truly aren't interchangeable, and where it's a bad idea to try. And the C Standard has always tried to accommodate those architectures.
(If you don't believe me, ask, because examples do exist.)
The C Standard could say that all pointers and all integers are more freely interchangeable. It sounds like that's what Swift and Rust have done. But it also sounds like Swift and Rust would not be implementable on those hypothetical "exotic" architectures.
These discussions get tricky because they're also at the intersection between language standards and programming practices. If you know you're always going to be using machines where integers and pointers are interchangeable, if you don't care about portability to the other, "exotic" architectures, you could just say so, and ignore the warnings, and move on -- unless your house style guide says that casts are forbidden, or your software development plan says that your code must be strictly conforming and compile without warnings. Then you might find yourself arguing with, or trying to change, the C Standard, just to get around your own SDP. (My advice: make sure that your style guide and your SDP have waiver mechanisms.)
Some day the C standard will probably get less "protective" (enabling?) of those exotic architectures. For example, I've heard it's been debated to drop the accommodation of one's complement and sign/magnitude machines, and to build the definition of two's complement into the next revision of the C Standard. But if that happens, or if other guarantees/accommodations change, it won't mean that C compilers and C programs for the exotic machines can't be written any more -- it will just mean that programmers for those machines will have to apply their own rules (like, "don't assign between int and void * and void (*)()") that aren't actually in the standard any more. (Or, equivalently, it means that strictly conforming code written for "normal" architectures won't automatically be portable to the exotic ones. Also, I guess, that the vendors of compilers for the exotic architectures won't be able to claim standards compliance any more.)
Even if void * values are represented as numbers, that does not mean the compiler handles them as it does numbers.
A uintptr_t is a number; C 2018 7.20.1.4 1 says it designates an unsigned integer type. So it behaves like other unsigned integer types: You can put any number within its range and get the same number back (and you can do arithmetic with it). The paragraph further says any valid void * can be converted to uintptr_t and that converting it back will produce the original pointer (or something equivalent, such as a pointer to the same place but with a different representation). So you can store pointers in uintptr_t objects.
However, the C standard does not say there is a range of numbers you can put into void * and get them back. 6.3.2.3 5 says that when an integer is converted to a pointer type, the result is implementation-defined (except that converting a constant zero to void * yields a null pointer, per 6.3.2.3 3). 6.3.2.3 6 says when you convert a pointer to an integer, the result is implementation-defined. (7.20.1.4 overrides this when the number is a uintptr_t that came from a pointer originally.)
So, if you store a number in a void *, how do you know it will work? The C standard does not guarantee to you that it will work. You would need some documentation for the compiler that says it will work.
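The direction that is guaranteed can be sketched as follows: a pointer goes to void *, then into a uintptr_t, and comes back comparing equal to the original, while a plain number stored in the same field is just an unsigned integer. The helper names are hypothetical, and uintptr_t itself is an optional type, so this assumes an implementation that provides it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum {
    is_callback_or_none,
    is_string,
    is_number
} my_choice;

typedef struct {
    my_choice value_type;
    uintptr_t value;        /* holds either a number or a converted pointer */
} my_option;

static my_option option_from_number(uintptr_t n)
{
    my_option o;
    o.value_type = is_number;
    o.value = n;            /* ordinary unsigned integer storage */
    return o;
}

static my_option option_from_string(char *s)
{
    my_option o;
    o.value_type = is_string;
    o.value = (uintptr_t)(void *)s;   /* pointer -> void * -> uintptr_t */
    return o;
}

/* Recover the pointer; only meaningful when the tag says is_string. */
static char *option_as_string(const my_option *o)
{
    assert(o != NULL && o->value_type == is_string);
    return (char *)(void *)o->value;  /* uintptr_t -> void * -> char * */
}
```

Function pointers are a separate matter: the standard makes no such round-trip promise for them, so a callback would still need its own member.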
I'm not trying to replicate the usual question about C not being able to return arrays but to dig a bit more deeply into it.
We cannot do this:
char f(void)[8] {
char ret;
// ...fill...
return ret;
}
int main(int argc, char ** argv) {
char obj_a[10];
obj_a = f();
}
But we can do:
struct s { char arr[10]; };
struct s f(void) {
struct s ret;
// ...fill...
return ret;
}
int main(int argc, char ** argv) {
struct s obj_a;
obj_a = f();
}
So, I was skimming the ASM code generated by gcc -S, and it seems to be working with the stack, addressing -x(%rbp) as with any other C function return.
What is it with returning arrays directly? I mean, not in terms of optimization or computational complexity but in terms of the actual capability of doing so without the struct layer.
Extra data: I am using Linux and gcc on an x64 Intel.
First of all, yes, you can encapsulate an array in a structure, and then do anything you want with that structure (assign it, return it from a function, etc.).
Second of all, as you've discovered, the compiler has little difficulty emitting code to return (or assign) structures. So that's not the reason you can't return arrays, either.
The fundamental reason you cannot do this is that, bluntly stated, arrays are second-class data structures in C. All other data structures are first-class. What are the definitions of "first-class" and "second-class" in this sense? Simply that second-class types cannot be assigned.
(Your next question might be, "Other than arrays, are there any other second-class data types?", and I think the answer is "Not really, unless you count functions".)
Intimately tied up with the fact that you can't return (or assign) arrays is that there are no values of array type, either. There are objects (variables) of array type, but whenever you try to take the value of one, you immediately get a pointer to the array's first element. [Footnote: more formally, there are no rvalues of array type, although an object of array type can be thought of as an lvalue, albeit a non-assignable one.]
So, quite aside from the fact that you can't assign to an array, you can't even generate a value to try to assign. If you say
char a[10], b[10];
a = b;
it's as if you had written
a = &b[0];
So we've got an array on the left, but a pointer on the right, and we'd have a massive type mismatch even if arrays somehow were assignable. Similarly (from your example) if we try to write
a = f();
and somewhere inside the definition of function f() we have
char ret[10];
/* ... fill ... */
return ret;
it's as if that last line said
return &ret[0];
and, again, we have no array value to return and assign to a, merely a pointer.
(In the function call example, we've also got the very significant issue that ret is a local array, perilous to try to return in C. More on this point later.)
Now, part of your question is probably "Why is it this way?", and also "If you can't assign arrays, why can you assign structures containing arrays?"
What follows is my interpretation and my opinion, but it's consistent with what Dennis Ritchie describes in his paper The Development of the C Language.
The non-assignability of arrays arises from three facts:
C is intended to be syntactically and semantically close to the machine hardware. An elementary operation in C should compile down to one or a handful of machine instructions taking one or a handful of processor cycles.
Arrays have always been special, especially in the way they relate to pointers; this special relationship evolved from and was heavily influenced by the treatment of arrays in C's predecessor language B.
Structures weren't initially in C.
Due to point 2, it's impossible to assign arrays, and due to point 1, it shouldn't be possible anyway, because a single assignment operator = shouldn't expand to code that might take N thousand cycles to copy an N thousand element array.
And then we get to point 3, which really ends up leading to a contradiction.
When C got structures, they initially weren't fully first-class either, in that you couldn't assign or return them. But the reason you couldn't was simply that the first compiler wasn't smart enough, at first, to generate the code. There was no syntactic or semantic roadblock, as there was for arrays.
And the goal all along was for structures to be first-class, and this was achieved relatively early on. The compiler caught up, and learned how to emit code to assign and return structures, right around the time that the first edition of K&R was going to print.
But the question remains, if an elementary operation is supposed to compile down to a small number of instructions and cycles, why doesn't that argument disallow structure assignment? And the answer is, yes, it's a contradiction.
I believe (though this is more speculation on my part) that the thinking was something like this: "First-class types are good, second-class types are unfortunate. We're stuck with second-class status for arrays, but we can do better with structs. The no-expensive-code rule isn't really a rule, it's more of a guideline. Arrays will often be large, but structs will usually be small, tens or hundreds of bytes, so assigning them won't usually be too expensive."
So a consistent application of the no-expensive-code rule fell by the wayside. C has never been perfectly regular or consistent, anyway. (Nor, for that matter, are the vast majority of successful languages, human as well as artificial.)
With all of this said, it may be worth asking, "What if C did support assigning and returning arrays? How might that work?" And the answer will have to involve some way of turning off the default behavior of arrays in expressions, namely that they tend to turn into pointers to their first element.
Sometime back in the '90's, IIRC, there was a fairly well-thought-out proposal to do exactly this. I think it involved enclosing an array expression in [ ] or [[ ]] or something. Today I can't seem to find any mention of that proposal (though I'd be grateful if someone can provide a reference). At any rate, I believe we could extend C to allow array assignment by taking the following three steps:
Remove the prohibition of using an array on the left-hand side of an assignment operator.
Remove the prohibition of declaring array-valued functions. Going back to the original question, make char f(void)[8] { ... } legal.
(This is the biggie.) Have a way of mentioning an array in an expression and ending up with a true, assignable value (an rvalue) of array type. For the sake of argument I'll posit a new operator or pseudofunction called arrayval( ... ).
[Side note: Today we have a "key definition" of array/pointer correspondence, namely that:
A reference to an object of array type which appears in an expression decays (with three exceptions) into a pointer to its first element.
The three exceptions are when the array is the operand of a sizeof operator, or a & operator, or is a string literal initializer for a character array. Under the hypothetical modifications I'm discussing here, there would be a fourth exception, namely when the array was an operand of this new arrayval operator.]
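The three existing exceptions are easy to observe with sizeof. The sketch below uses made-up function names and only checks sizes, so nothing in it depends on array contents.

```c
#include <assert.h>
#include <stddef.h>

/* Exception 1: as the operand of sizeof, the array stays an array. */
static size_t size_of_array_itself(void)
{
    char a[10] = {0};
    return sizeof a;              /* size of all 10 elements */
}

/* No exception applies: in a + 0 the array decays to char *. */
static size_t size_after_decay(void)
{
    char a[10] = {0};
    return sizeof (a + 0);        /* size of a pointer, not of the array */
}

/* Exception 2: &a yields a pointer to the whole array, type char (*)[10]. */
static size_t size_via_address_of(void)
{
    char a[10] = {0};
    char (*pa)[10] = &a;
    return sizeof *pa;            /* the pointed-to object is the array */
}

/* Exception 3: a string literal initializer fills the array, it does not decay. */
static size_t size_of_string_initialized(void)
{
    char s[] = "Hello";
    return sizeof s;              /* five letters plus the terminating NUL */
}

/* Ties the four cases together. */
static void check_decay_rules(void)
{
    assert(size_of_array_itself() == 10);
    assert(size_after_decay() == sizeof(char *));
    assert(size_via_address_of() == 10);
    assert(size_of_string_initialized() == 6);
}
```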
Anyway, with these modifications in place, we could write things like
char a[8], b[8] = "Hello";
a = arrayval(b);
(Obviously we would also have to decide what to do if a and b were not the same size.)
Given the function prototype
char f(void)[8];
we could also do
a = f();
Let's look at f's hypothetical definition. We might have something like
char f(void)[8] {
char ret[8];
/* ... fill ... */
return arrayval(ret);
}
Note that (with the exception of the hypothetical new arrayval() operator) this is just about what Dario Rodriguez originally posted. Also note that — in the hypothetical world where array assignment was legal, and something like arrayval() existed — this would actually work! In particular, it would not suffer the problem of returning a soon-to-be-invalid pointer to the local array ret. It would return a copy of the array, so there would be no problem at all — it would be just about perfectly analogous to the obviously-legal
int g(void) {
int ret;
/* ... compute ... */
return ret;
}
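In today's C the same copy-out behaviour is available only through the struct wrapper. A rough equivalent of the hypothetical array-returning f, with a wrapper type and function names of my own choosing, might be:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct arr8 { char v[8]; };          /* hypothetical wrapper around char[8] */

/* Returns the array by value: the caller receives a copy, not a pointer
 * into this function's stack frame. */
static struct arr8 make_hello(void)
{
    struct arr8 ret = { "Hello" };   /* remaining bytes are zero-initialised */
    return ret;
}

/* Unwrap into a plain array supplied by the caller. */
static void copy_out(char dst[8])
{
    struct arr8 tmp = make_hello();
    assert(dst != NULL);
    memcpy(dst, tmp.v, sizeof tmp.v);
}
```

The struct assignment does the job that a = arrayval(b) would do; the final memcpy is only needed when the destination has to be a bare array.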
Finally, returning to the side question of "Are there any other second-class types?", I think it's more than a coincidence that functions, like arrays, automatically have their address taken when they are not being used as themselves (that is, as functions or arrays), and that there are similarly no rvalues of function type. But this is mostly an idle musing, because I don't think I've ever heard functions referred to as "second-class" types in C. (Perhaps they have, and I've forgotten.)
Footnote: Because the compiler is willing to assign structures, and typically knows how to emit efficient code for doing so, it used to be a somewhat popular trick to co-opt the compiler's struct-copying machinery in order to copy arbitrary bytes from point a to point b. In particular, you could write this somewhat strange-looking macro:
#define MEMCPY(b, a, n) (*(struct foo { char x[n]; } *)(b) = \
*(struct foo *)(a))
that behaved more or less exactly like an optimized in-line version of memcpy(). (And in fact, this trick still compiles and works under modern compilers today.)
What is it with returning arrays directly? I mean, not in terms of optimization or computational complexity but in terms of the actual capability of doing so without the struct layer.
It has nothing to do with capability per se. Other languages do provide the ability to return arrays, and you already know that in C you can return a struct with an array member. On the other hand, yet other languages have the same limitation that C does, and even more so. Java, for instance, cannot return arrays, nor indeed objects of any type, from methods. It can return only primitives and references to objects.
No, it is simply a question of language design. As with most other things to do with arrays, the design points here revolve around C's provision that expressions of array type are automatically converted to pointers in almost all contexts. The value provided in a return statement is no exception, so C has no way of even expressing the return of an array itself. A different choice could have been made, but it simply wasn't.
For arrays to be first-class objects, you would expect at least to be able to assign them. But that requires knowledge of the size, and the C type system is not powerful enough to attach sizes to any types. C++ could do it, but doesn't due to legacy concerns—it has references to arrays of particular size (typedef char (&some_chars)[32]), but plain arrays are still implicitly converted to pointers as in C. C++ has std::array instead, which is basically the aforementioned array-within-struct plus some syntactic sugar.
Bounty hunting.
The authors of C did not aspire to be language or type system designers. They were tool designers. C was a tool to make system programming easier.
ref: B. Kernighan on Pascal; Ritchie on C
There was no compelling case for C to do anything unexpected; especially as UNIX and C were ushering in the era of least surprise. Copying arrays, and making complex syntax to do so when it was the metaphorical equivalent of having a setting to burn the toast did not fit the C model.
Everything in C, the language, is effectively constant time, constant size. C, the standard, seems preoccupied with doing away with this core feature which made C so popular; so expect the, uh, standard C/2023.feb07 to feature a punctuation nightmare that enables arrays as r-values.
The decision of the C authors makes eminent sense if you view the programming world pragmatically. If you view it as a pulpit for treasured beliefs, then get onboard for C/2023.feb07 before C/2023.feb08 nullifies it.
I'm afraid in my mind it's not so much a debate of first or second class objects, it's a religious discussion of good practice and applicable practice for deep embedded applications.
Returning a structure either means a root structure being changed by stealth in the depths of the call sequence, or a duplication of data and the passing of large chunks of duplicated data. The main applications of C are still largely concentrated around the deep embedded applications. In these domains you have small processors that don't need to be passing large blocks of data. You also have engineering practice that necessitates the need to be able to operate without dynamic RAM allocation, and with minimal stack and often no heap. It could be argued the return of the structure is the same as modification via pointer, but abstracted in syntax... I'm afraid I'd argue that's not in the C philosophy of "what you see is what you get" in the same way a pointer to a type is.
Personally, I would argue you have found a loophole, whether standard-approved or not. C is designed in such a way that allocation is explicit. As a matter of good practice you pass address-bus-sized objects, normally in an aspirational one cycle, referring to memory that has been allocated explicitly at a controlled time within the developer's ken. This makes sense in terms of code efficiency and cycle efficiency, and offers the most control and clarity of purpose. I'm afraid that in code inspection I'd throw out a function returning a structure as bad practice. C does not enforce many rules; in many ways it's a language for professional engineers, as it relies upon users enforcing their own discipline. Just because you can doesn't mean you should... It does offer some pretty bulletproof ways to handle data of very complex size and type, utilising compile-time rigour and minimising dynamic variation of the footprint at runtime.
I'm reading paragraph 7 of 6.5 in ISO/IEC 9899:TC2.
It condones lvalue access to an object through:
an aggregate or union type that includes one of the aforementioned
types among its members (including, recursively, a member of a
subaggregate or contained union),
Please refer to the document for what 'aforementioned' types are but they certainly include the effective type of the object.
It is in a section noted as:
The intent of this list is to specify those circumstances in which an
object may or may not be aliased.
I read this as saying (for example) that the following is well defined:
#include <stdlib.h>
#include <stdio.h>
typedef struct {
unsigned int x;
} s;
int main(void){
unsigned int array[3] = {73,74,75};
s* sp=(s*)&array;
sp->x=80;
printf("%u\n",array[0]);
return EXIT_SUCCESS;
}
This program should output 80.
I'm not advocating this as a good (or very useful) idea and concede I'm in part interpreting it that way because I can't think what else that means and can't believe it's a meaningless sentence!
That said, I can't see a very good reason to forbid it. What we do know is that the alignment and memory contents at that location are compatible with sp->x so why not?
It seems to go so far as to say if I add (say) a double y; to the end of the struct I can still access array[0] through sp->x in this way.
However even if the array is larger than sizeof(s) any attempt to access sp->y is 'all bets off' undefined behaviour.
Might I politely ask for people to say what that sentence condones rather than go into a flat spin shouting 'strict aliasing UB strict aliasing UB' as seems to be all too often the way of these things.
The answer to this question is covered in the proposal Fixing the rules for type-based aliasing (N1520). Unfortunately, as the Hedquist, Batavia, November 2010 minutes show, the matter was not resolved when the proposal was made in 2010. C11 therefore does not contain a resolution to N1520, so this remains an open issue:
There does not seem to be any way that this matter will be
resolved at this meeting. Each thread of proposed approaches
leads to more questions. 11:48 am, Thurs, Nov 4, 2010.
ACTION – Clark do more work in
N1520 opens saying (emphasis mine going forward):
Richard Hansen pointed out a problem in the type-based aliasing
rules, as follows:
My question concerns the phrasing of bullet 5 of 6.5p7 (aliasing as it applies to unions/aggregates). Unless my understanding of
effective type is incorrect, it seems like the union/aggregate
condition should apply to the effective type, not the lvalue type.
Here are some more details:
Take the following code snippet as an example:
union {int a; double b;} u;
u.a = 5;
From my understanding of the definition of effective type (6.5p6), the effective type of the object at location &u is union {int a;
double b;}. The type of the lvalue expression that is accessing the
object at &u (in the second line) is int.
From my understanding of the definition of compatible type (6.2.7), int is not compatible with union {int a; double b;}, so
bullets 1 and 2 of 6.5p7 do not apply. int is not the signed or
unsigned type of the union type, so bullets 3 and 4 do not apply. int
is not a character type, so bullet 6 does not apply.
That leaves bullet 5. However, int is not an aggregate or union
type, so that bullet also does not apply. That means that the above
code violates the aliasing rule, which it obviously should not.
I believe that bullet 5 should be rephrased to indicate that if the
effective type (not the lvalue type) is an aggregate or union type
that contains a member with type compatible with the lvalue type, then
the object may be accessed.
Effectively, what he points out is that the rules are asymmetrical
with respect to struct/union membership. I have been aware of this
situation, and considered it a (non-urgent) problem, for quite some
time. A series of examples will better illustrate the problem. (These
examples were originally presented at the Santa Cruz meeting.)
In my experience with questions about whether aliasing is valid based
on type constraints, the question is invariably phrased in terms of
loop invariance. Such examples bring the problem into extremely sharp
focus.
And the relevant example that applies to this situation would be 3 which is as follows:
struct S { int a, b; };
void f3(int *pi, struct S *ps1, struct S const *ps2)
{
for (*pi = 0; *pi < 10; ++*pi) {
*ps1++ = *ps2;
}
}
The question here is whether the object *ps2 may be accessed (and
especially modified) by assigning to the lvalue *pi — and if so,
whether the standard actually says so. It could be argued that this is
not covered by the fifth bullet of 6.5p7, since *pi does not have
aggregate type at all.
Perhaps the intention is that the question should be turned around: is
it allowed to access the value of the object *pi by the lvalue *ps2.
Obviously, this case would be covered by the fifth bullet.
All I can say about this interpretation is that it never occurred to
me as a possibility until the Santa Cruz meeting, even though I've
thought about these rules in considerable depth over the course of
many years. Even if this case might be considered to be covered by the
existing wording, I'd suggest that it might be worth looking for a
less opaque formulation.
The discussion and proposed solutions that follow are very long and hard to summarize, but they seem to end with removing the aforementioned bullet five and resolving the issue through adjustments to other parts of 6.5. As noted above, though, the issues involved were not resolvable, and I don't see a follow-up proposal.
So it would seem that the standard's wording permits the scenario the OP demonstrates, although my understanding is that this was unintentional. I would therefore avoid it, since later standards could change the wording and make it non-conforming.
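If the goal is simply to update the array through the struct's layout, one way to sidestep the whole question is to go through memcpy, which reads and writes the bytes without ever forming an lvalue of the struct type over the array. This is a sketch with a made-up helper name, and it assumes the array is at least as large as the struct.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    unsigned int x;
} s;

/* Load the leading bytes of 'array' into a real struct object, update the
 * member, and store the bytes back. 'n' is the element count of 'array'. */
static void set_x_via_copy(unsigned int *array, size_t n, unsigned int value)
{
    s tmp;
    assert(array != NULL && n * sizeof array[0] >= sizeof tmp);
    memcpy(&tmp, array, sizeof tmp);
    tmp.x = value;
    memcpy(array, &tmp, sizeof tmp);
}
```

Compilers generally turn small fixed-size memcpy calls like these into plain loads and stores, so the cost is usually nil.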
I think this text does not apply:
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),
sp->x has type unsigned int which is not an aggregate or union type.
In your code there is no strict aliasing violation: it is OK to read unsigned int as unsigned int.
The struct might have different alignment requirements to the array but other than that there is no problem.
Accessing via "an aggregate or union type" would be:
s t = *sp;
I confess that the idea that I can lay a struct over a locally defined array in this way is frankly exotic.
I still maintain that C99 and all subsequent standards permit it.
In fact, it's very arguable that, members being objects in themselves, the first bullet point in 6.5p7 allows it:
a type compatible with the effective type of the object
I think that's M.M's point.
Looking at the problem the other way, let's notice that it's absolutely legitimate (in a strictly conforming environment) to alias the member sp->x as an object in its own right.
In the context of the code in my OP consider a function with prototype void doit(int* ip,s* sp); the following call is expected to behave logically:
doit(&(sp->x),sp);
NB: Program logic may (of course) not behave as desired. For example, if doit increments sp->x until it exceeds *ip then there's a problem! However, what is not allowed in a conformant compiler is for the outcome to be corrupted by artifacts due to the optimizer ignoring aliasing potential.
I maintain that C would be all the weaker if the language required me to code:
int temp=sp->x;
doit(&temp,sp);
sp->x=temp;
Imagine all the cases where any call to any function has to be policed for the potential aliasing access to any part of the structures being passed. Such a language would probably be unusable.
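To illustrate what the compiler has to honour, here is a hypothetical body for doit. I have declared ip as unsigned int * so that it matches the type of sp->x; the body is invented for the illustration. Because ip may point at sp->x, the final read cannot be folded to the constant 1.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned int x;
} s;

/* 'ip' may or may not point at sp->x; both cases are legitimate. */
static unsigned int doit(unsigned int *ip, s *sp)
{
    assert(ip != NULL && sp != NULL);
    sp->x = 1;
    *ip = 2;          /* modifies sp->x when ip aliases the member */
    return sp->x;     /* must be re-read after the store through ip */
}
```

Called as doit(&sp->x, sp) it yields 2, and called with a pointer to an unrelated unsigned int it yields 1.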
Obviously a hard optimizing (i.e. non-compliant) compiler might make a complete hash of doit() if it doesn't recognize that ip might be an alias of member in the middle of sp.
That's irrelevant to this discussion.
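A minimal sketch of the aliasing case described above. The struct layout and the body of doit_sketch are assumptions for illustration only (the original doit is given just as a prototype), and x is declared unsigned int to match the member type discussed earlier. Because ip is allowed to point at sp->x, a conforming compiler has to re-read sp->x after the store through ip:

```c
/* Hypothetical struct; only the member x matters for the illustration. */
typedef struct {
    unsigned int x;
    unsigned int y;
} s;

/* Hypothetical body: store through both pointers, then read sp->x.
   ip has the same type as the member x, so it may legitimately alias it. */
unsigned int doit_sketch(unsigned int *ip, s *sp)
{
    sp->x = 1u;
    *ip = 2u;        /* may or may not modify sp->x */
    return sp->x;    /* 2 if ip == &sp->x, otherwise 1 */
}
```

Calling doit_sketch(&(sp->x), sp) should return 2, while passing the address of an unrelated unsigned int should return 1. An optimizer that cached the first store and returned 1 in both cases would be ignoring a permitted alias.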
Setting out when a compiler can (and cannot) make such assumptions is the reason the standard needs very precise parameters around aliasing: they give the optimizer some conditions to discount. In a low level language such as C it could be reasonable (even desirable) to say that a suitably aligned pointer to an accessible valid bit pattern can be used to access a value.
It is absolutely established that sp->x in my OP is pointing to a properly aligned location holding a valid unsigned int.
The intelligent concerns are whether the compiler/optimizer agree that's then a legitimate way to access that location or ignorable as undefined behavior.
As the doit() example shows it's absolutely established that a structure can be broken down and treated as individual objects which merely happen to have a special relationship.
This question appears to be about the circumstances when a set of members that happen to have that special relationship can have a structure 'laid over them'.
I think most people will agree that the program at the bottom of this answer performs valid, worthwhile functionality that if associated with some I/O library could 'abstract' a great deal of the work required to read and write structures.
You might think there's a better way of doing it, but I'm not expecting many people to think it's an unreasonable approach.
It operates by exactly that means - it builds a structure member by member then accesses it through that structure.
I suspect some of the people who object to the code in the OP are more relaxed about this.
Firstly, it operates on memory allocated from the free-store as 'un-typed' universally aligned storage.
Secondly, it builds a whole structure. In the OP I'm pointing out that the rules permit (or at least appear to permit) lining up bits of a structure, and that so long as you only dereference those bits everything is OK.
I somewhat share that attitude. I think the OP is slightly perverse and language stretching in a poorly written corner of the standard. Not something to put your shirt on.
However, I absolutely think it would be a mistake to forbid the techniques below as they rule out a logically very valid technique that recognizes structures can be built up from objects just as much as broken down into them.
However, I will say that something like this is the only thing I could come up with where this sort of approach seems worthwhile. On the other hand, if you can't pull data apart AND/OR put it together then you quickly start to break the notion that C structures are POD: the possibly padded sum of their parts, nothing more, nothing less.
#include <stddef.h>
#include <stdlib.h>
#include <stdio.h>
typedef enum {
is_int, is_double //NB:TODO: support more types but this is a toy.
} type_of;
//This function allocates and 'builds' an array based on a provided set of types, offsets and sizes.
//It's a stand-in for some function that (say) reads structures from a file and builds them according to a provided
//recipe.
int buildarray(void**array,const type_of* types,const size_t* offsets,size_t mems,size_t sz,size_t count){
const size_t asize=count*sz;
char*const data=malloc(asize==0?1:asize);
if(data==NULL){
return 1;//Allocation failure.
}
int input=1;//Dummy...
const char*end=data+asize;//One past end. Make const for safety!
for(char*curr=data;curr<end;curr+=sz){
for(size_t i=0;i<mems;++i){
char*mem=curr+offsets[i];
switch(types[i]){
case is_int:
*((int*)mem)=input++;//Dummy...Populate from file...
break;
case is_double:
*((double*)mem)=((double)input)+((double)input)/10.0;//Dummy...Populate from file...
++input;
break;
default:
free(data);//Better than returning an incomplete array. Should not leak even on error conditions.
return 2;//Invalid type!
}
}
}
if(array!=NULL){
*array=data;
}else{
free(data);//Just for fun apparently...
}
return 0;
}
typedef struct {
int a;
int b;
double c;
} S;
int main(void) {
const type_of types[]={is_int,is_int,is_double};
const size_t offsets[]={offsetof(S,a),offsetof(S,b),offsetof(S,c)};
S* array=NULL;
const size_t size=4;
int err=buildarray((void **)&array,types,offsets,3,sizeof(S),size);
if(err!=0){
return EXIT_FAILURE;
}
for(size_t i=0;i<size;++i){
printf("%zu: %d %d %f\n",i,array[i].a,array[i].b,array[i].c);
}
free(array);
return EXIT_SUCCESS;
}
I think it's an interesting tension.
C is intended to be that low level high level language and give the programmer almost direct access to machine operations and memory.
That means the programmer can fulfill the arbitrary demands of hardware devices and write highly efficient code.
However if the programmer is given absolute control such as my point about an 'if it fits it's OK' approach to aliasing then the optimizer gets its game spoilt.
So weirdly it's worth holding a little bit of performance back to return a dividend from the optimizer.
Section 6.5 of the C99 standard tries (and doesn't entirely succeed) to set that boundary out.
I have defined a union as follows:
union {
uintptr_t refcount;
struct slab_header *page;
} u;
The page pointer is guaranteed to be aligned on a page boundary (most probably 4096), and is never going to be NULL. This implies that the lowest possible address is going to be 4096.
refcount will be within 0 .. 4095.
Upon the creation of the enclosing struct, I can either have u.refcount = 0 or u.page = mmap(...).
The code around this union is going to be something like this:
if (u.refcount < 4096) {
/* work with refcount, possibly increment it */
} else {
/* work with page, possibly dereference it */
}
Is this always guaranteed to work on a fully POSIX-compliant implementation? Is it ever possible that uintptr_t and struct slab_header * have different representations, so that, for example, when u.page == 8192, u.refcount < 4096 yields true?
I don't think that it's "always guaranteed to work", because:
uintptr_t is optional (7.18.1.4).
A void * can be converted to uintptr_t and back (7.18.1.4). It's not guaranteed that it is the case with struct slab_header*. A void * has the same representation and alignment requirements as a pointer to a character type. Pointers to structures needn't have the same representation or alignment (6.2.5 27). Even if this was not the case, nothing guarantees sizeof(uintptr_t) == sizeof(void *), it could obviously be larger and still satisfy the requirement of being convertible to void * in the typical case of homogeneous pointers.
Finally, even if they have the same size and are convertible, it's possible the representation of the pointer values differs in a strange way from that of unsigned integers. The representation of unsigned integers is relatively constrained (6.2.6.2 1), but no such constraints exist on pointers.
Therefore, I'd conclude the best way would be to have a common initial element that tells the state.
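One way to read "a common initial element that tells the state" is a tagged struct, sketched below. All names here (slot_state, struct slot, the slot_* helpers) are hypothetical, and refcount is declared as size_t on the assumption that it no longer needs to share a representation with the pointer:

```c
#include <stddef.h>

struct slab_header;   /* opaque here; defined elsewhere in the real code */

/* Hypothetical tag recording which union member is currently live. */
enum slot_state { SLOT_REFCOUNT, SLOT_PAGE };

struct slot {
    enum slot_state state;      /* the common initial element */
    union {
        size_t refcount;
        struct slab_header *page;
    } u;
};

static void slot_set_refcount(struct slot *s, size_t n)
{
    s->state = SLOT_REFCOUNT;
    s->u.refcount = n;
}

static void slot_set_page(struct slot *s, struct slab_header *p)
{
    s->state = SLOT_PAGE;
    s->u.page = p;
}

/* Decide by the tag, never by inspecting the other member's bits. */
static int slot_has_page(const struct slot *s)
{
    return s->state == SLOT_PAGE;
}
```

The price is the extra tag (plus any padding) per slot, in exchange for never reading a member other than the one last written.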
I'm going to answer a different question -- "is this a good idea?" The more significant worry I see from your code is aliasing issues. I would be unsurprised (in fact, I would mildly expect it) if it were possible to write a piece of code that had the effect of
Write something to u.refcount
Write something to u.page
Read u.refcount
Discover the value you read is the same as what you first wrote
You may scoff, but I've seen this happen -- and if you don't appreciate the danger, it will take a very long time to debug.
You may be safe with this particular union; I'll leave it to someone with more experience with this sort of thing (and/or a copy of the C standard handy) to make that judgment. But I want to emphasize that this is something important to worry about!!!
There is another cost. Using the union in combination with a "magic" test to discover what is stored in it -- especially one using system-specific facts -- is going to make your code harder to read, debug, and maintain. You can take steps to mitigate that fact, but it will always be an issue.
And, of course, the cost of your code simply not working when someone tries to use it on a weird machine.
The right solution is probably to structure your code in a way so that the only code that cares about the data layout is a couple tiny inlined access routines, so that you can easily swap how to store things. Organize your code to use a compile-time flag to choose which one!
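A rough sketch of what such access routines might look like. The flag SLOT_USE_PACKED_UNION, the type refslot, the rs_* functions and the placeholder definition of struct slab_header are all made up for illustration. The packed variant keeps the original trick and its platform assumptions; the default variant spends an extra word and treats page == NULL as "refcount is live":

```c
#include <stddef.h>
#include <stdint.h>

struct slab_header { int placeholder; };   /* hypothetical stand-in */

#ifdef SLOT_USE_PACKED_UNION   /* hypothetical opt-in flag */
/* Assumes a page pointer reads back through the union as an integer
   >= 4096; that is a property of the platform, not a guarantee. */
typedef union {
    uintptr_t refcount;
    struct slab_header *page;
} refslot;

static inline int rs_is_page(const refslot *s) { return s->refcount >= 4096; }
static inline void rs_put_count(refslot *s, size_t n) { s->refcount = n; }
static inline void rs_put_page(refslot *s, struct slab_header *p) { s->page = p; }
static inline size_t rs_count(const refslot *s) { return (size_t)s->refcount; }
static inline struct slab_header *rs_page(const refslot *s) { return s->page; }
#else
/* Portable default: two plain fields, no overlapping storage. */
typedef struct {
    size_t refcount;
    struct slab_header *page;   /* NULL while the refcount is in use */
} refslot;

static inline int rs_is_page(const refslot *s) { return s->page != NULL; }
static inline void rs_put_count(refslot *s, size_t n) { s->refcount = n; s->page = NULL; }
static inline void rs_put_page(refslot *s, struct slab_header *p) { s->page = p; }
static inline size_t rs_count(const refslot *s) { return s->refcount; }
static inline struct slab_header *rs_page(const refslot *s) { return s->page; }
#endif
```

Callers only ever use the rs_* functions, so switching layouts is a matter of defining or not defining the flag at build time.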
Now that I've said all that, the question I really want to ask (and want to get you in the habit of considering): "is it worth it?" You're making sacrifices in your time, code readability, ease of programming, and portability. What are you gaining?
A lot of people forget to ask this question, and they write complex buggy code when simplistic, easy to write and read code is just as good.
If you've discovered this change is going to double the performance of a time-intensive routine, then it's probably worth dealing with this hack. But if you're investigating this simply because it's a clever trick to save 8 bytes, then you should consider it only as an intellectual exercise.
I think that <stdint.h> and the C99 standard guarantee that uintptr_t and void* have the same size and can be cast without loss.
Being page aligned for a pointer is an implementation detail.
There shouldn't be any problem with that code. But just in case, you can add a check to your program's initialization:
if (sizeof(uintptr_t) != sizeof(struct slab_header *))
{
fprintf(stderr, "uintptr_t and struct slab_header * differ in size\n");
}
But something seems off (or unclear). Do you want refcount to be 0 whenever you have a page, and nonzero when the page is NULL?
Is this always guaranteed to work?
No. You can't read a member of a union other than the last one written. And optimizers do take that into account, so that isn't a theoretical problem.
Is it ever possible that uintptr_t and struct slab_header * have different representations,
so that, for example, when u.page == 8192, u.refcount < 4096 yields true?
It is also theoretically possible. I can't think of a current implementation where it occurs.
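If the integer value of the pointer is really needed, an explicit conversion avoids reading one union member through the other. The sketch below uses hypothetical helper names and a placeholder struct definition, and it assumes uintptr_t exists on the target (it is optional). Only the round trip through void * is specified to give back the original pointer; the numeric value itself is implementation-defined, so a test such as "< 4096" remains a platform assumption:

```c
#include <stdint.h>

struct slab_header { int placeholder; };   /* hypothetical stand-in */

/* Convert explicitly, via void *, instead of reinterpreting the bytes. */
static uintptr_t page_to_bits(struct slab_header *p)
{
    return (uintptr_t)(void *)p;
}

/* Reverse conversion; meaningful only for values from page_to_bits. */
static struct slab_header *bits_to_page(uintptr_t bits)
{
    return (struct slab_header *)(void *)bits;
}
```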
Even though I am a long time C programmer, I only recently learned that one can directly assign structure variables to one another instead of using memcpy:
struct MyStruct a,b;
...
a = b; /* implicit memcpy */
Though this feels a bit "high-level" for C, it is definitely useful. But why can't I do equality and inequality comparison:
if (a == b) ...
if (a != b) ...
Is there any good reason for the standard to exclude this? Or is this an inconsistency in the - otherwise very elegant - standard?
I don't see why I can replace my memcpy calls with clean assignments, but have to keep those ugly memcmp calls in place.
Per the comp.lang.c FAQ:
There is no good way for a compiler to implement structure comparison (i.e. to support the == operator for structures) which is consistent with C's low-level flavor. A simple byte-by-byte comparison could founder on random bits present in unused "holes" in the structure (such padding is used to keep the alignment of later fields correct). A field-by-field comparison might require unacceptable amounts of repetitive code for large structures. Any compiler-generated comparison could not be expected to compare pointer fields appropriately in all cases: for example, it's often appropriate to compare char * fields with strcmp rather than ==.
If you need to compare two structures, you'll have to write your own function to do so, field by field.
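As a small illustration of the FAQ's advice, here is a hand-written comparison for a made-up struct. The struct record and the function record_equal are hypothetical; the point is that padding between tag and id is never inspected, and the char * field is compared by content rather than by address:

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical struct: most ABIs insert padding between tag and id. */
struct record {
    char tag;
    int id;
    const char *name;   /* compared by content, not by address */
};

static bool record_equal(const struct record *a, const struct record *b)
{
    if (a->tag != b->tag || a->id != b->id) {
        return false;
    }
    /* Treat two NULL names as equal; NULL never equals a real string. */
    if (a->name == NULL || b->name == NULL) {
        return a->name == b->name;
    }
    return strcmp(a->name, b->name) == 0;
}
```

Whether two NULL names should count as equal is a design decision that a compiler-generated == could not make for you, which is part of the FAQ's argument.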