I read this on a blog:
Violating Type Rules: It is undefined behavior to cast an int* to a float* and dereference it (accessing the "int" as if it were a "float"). C requires that these sorts of type conversions happen through memcpy: using pointer casts is not correct and undefined behavior results. The rules for this are quite nuanced and I don't want to go into the details here (there is an exception for char*, vectors have special properties, unions change things, etc). This behavior enables an analysis known as "Type-Based Alias Analysis" (TBAA) which is used by a broad range of memory access optimizations in the compiler, and can significantly improve performance of the generated code. For example, this rule allows clang to optimize this function:
How can you use the memcpy function for type coercion? And what about the exception to char*?
I don't understand how the memcpy function can be used for type coercion.
Suppose you have the float value 1.25. And suppose you want to confirm that its actual IEEE-754 representation in hexadecimal is 3fa00000. There are at least four different ways you might try to do this:
(1) Take a float pointer and cast it to an integer pointer, and indirect on it:
float f = 1.25;
printf("%08x\n", *(uint32_t *)&f);
(This fragment quietly assumes 32-bit int. For better portability, you could use printf("%08" PRIx32 "\n", *(uint32_t *)&f);.)
(2) Use a union:
union {float f; uint32_t i;} u;
u.f = f;
printf("%08x\n", u.i);
(3) Use a char pointer, and iterate/index:
unsigned char *p = (unsigned char *)&f;
for(int i = 3; i >= 0; i--) printf("%02x", p[i]);
(Note that this code fragment assumes little-endian.)
(4) Use memcpy:
uint32_t x;
memcpy(&x, &f, 4);
printf("%08x\n", x);
Now, the take-home lesson is that not all of these methods work reliably any more, because of the strict aliasing rule.
In particular, method (1) is flatly illegal. It's a textbook example of what the strict aliasing rule disallows.
I think you're still allowed to use a union as in method 2, but you may have to put on a language lawyer hat to convince yourself of it. (See also the comments on this answer below.)
Methods (3) and (4), however, continue to work, because they take advantage of an explicit exception to the strict aliasing rule: you are allowed to access the bits of an object through a punned pointer of the "wrong" type, as long as that "wrong" type is specifically a character type.
So I think this is clear, but in answer to your specific questions:
How can you use the memcpy function for type coercion?
As in method (4).
And what about the exception to char *?
That's the explicit exception in the strict aliasing rule that allows method (3) to work.
The rules, by the way, are significantly different here in C than in C++. Strictly speaking, I believe, in C++ not even method (3) is legal, and the only way you're allowed to do this sort of thing any more is with method (4) and an implicit call to memcpy. (However, I'm told that optimizing compilers tend to treat calls to memcpy very specially these days, not only replacing explicit function calls with inline register moves, but sometimes even optimizing out the copy altogether, and doing something like method 1 or 2 internally, if they know they can get away with it.)
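For completeness, here is a minimal self-contained sketch combining methods (3) and (4), the two that remain well-defined in C. It assumes an IEEE-754 float, sizeof(float) == 4, and (for the byte-by-byte printout) a little-endian host:
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
#include <string.h>

int main(void) {
    float f = 1.25f;

    /* Method (4): memcpy the representation into an integer object */
    uint32_t x;
    memcpy(&x, &f, sizeof x);
    printf("memcpy:   %08" PRIx32 "\n", x);   /* expect 3fa00000 */

    /* Method (3): walk the bytes through an unsigned char pointer;
       printing from the highest index down assumes little-endian */
    const unsigned char *p = (const unsigned char *)&f;
    printf("char ptr: ");
    for (int i = (int)sizeof f - 1; i >= 0; i--)
        printf("%02x", p[i]);
    putchar('\n');

    return 0;
}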
Related
For example, can this
unsigned f(float x) {
    unsigned u = *(unsigned *)&x;
    return u;
}
cause unpredictable results on a platform where:
unsigned and float are both 32-bit
a pointer has a fixed size for all types
unsigned and float can be stored to and loaded from the same part of memory.
I know about the strict aliasing rules, but most examples showing problematic cases of violating strict aliasing look like the following.
static int g(int *i, float *f) {
    *i = 1;
    *f = 0;
    return *i;
}

int h() {
    int n;
    return g(&n, (float *)&n);
}
In my understanding, the compiler is free to assume that i and f are implicitly restrict. The return value of h could be 1 if the compiler assumes that *f = 0; cannot affect *i (because i and f can't alias), or it could be 0 if it takes into account that i and f hold the same address. This is undefined behaviour, so technically anything else can happen.
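One way to picture that assumption (an illustrative sketch only, not what the standard literally says): under strict aliasing the optimizer may treat g as if its parameters were restrict-qualified, because an int* and a float* cannot legally refer to the same object.
/* Sketch only: g as the optimizer is allowed to see it. Because int and
   float may not alias, the store through f cannot change *i, so the
   function can be folded to "store 1, store 0.0f, return 1". */
static int g_as_if(int *restrict i, float *restrict f) {
    *i = 1;
    *f = 0;
    return *i;   /* may be constant-folded to 1 */
}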
However, the first example is a bit different.
unsigned f(float x) {
    unsigned u = *(unsigned *)&x;
    return u;
}
Sorry for my unclear wording, but everything is done "in-place". I can't think of any other way the compiler might interpret the line unsigned u = *(unsigned *)&x;, other than "copy the bits of x to u".
In practice, all compilers for various architectures I tested in https://godbolt.org/ with full optimization produce the same result for the first example, and varying results (either 0 or 1) for the second example.
I know it's technically possible that unsigned and float have different sizes and alignment requirements, or should be stored in different memory segments. In that case even the first code won't make sense. But on most modern platforms where the following holds, is the first example still undefined behaviour (can it produce unpredictable results)?
unsigned and float are both 32-bit
a pointer has a fixed size for all types
unsigned and float can be stored to and loaded from the same part of memory.
In real code, I do write
unsigned f(float x) {
    unsigned u;
    memcpy(&u, &x, sizeof(x));
    return u;
}
The compiled result is the same as with the pointer cast, after optimization. This question is about the interpretation of the standard's strict aliasing rules for code such as the first example.
Is it always undefined behaviour to copy the bits of a variable through an incompatible pointer?
Yes.
The rule is https://port70.net/~nsz/c/c11/n1570.html#6.5p7 :
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
a type compatible with the effective type of the object,
a qualified version of a type compatible with the effective type of the object,
a type that is the signed or unsigned type corresponding to the effective type of the object,
a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
a character type.
The effective type of the object x is float - it is defined with that type.
unsigned is not compatible with float,
unsigned is not a qualified version of float,
unsigned is not the signed or unsigned type corresponding to float,
unsigned is not the signed or unsigned type corresponding to a qualified version of float,
unsigned is not an aggregate or union type
and unsigned is not a character type.
The "shall" is violated, it is undefined behavior (see https://port70.net/~nsz/c/c11/n1570.html#4p2 ). There is no other interpretation.
We also have https://port70.net/~nsz/c/c11/n1570.html#J.2 :
The behavior is undefined in the following circumstances:
An object has its stored value accessed other than by an lvalue of an allowable type (6.5).
As Kamil explains, it's UB. Even int and long (or long and long long) aren't alias-compatible even when they're the same size. (But interestingly, unsigned int is allowed to alias int.)
It's nothing to do with being the same size, or using the same register set as suggested in a comment; it's mainly a way to let compilers assume that pointers to different types don't point to overlapping memory when optimizing. They still have to support C99 union type-punning, not just memcpy. So for example a dst[i] = src[i] loop doesn't need to check for possible overlap when unrolling or vectorizing, if dst and src have different types.1
If you're accessing the same integer data, the standard requires that you use the exact same type, modulo only things like signed vs. unsigned and const. Or that you use (unsigned) char*, which is like GNU C __attribute__((may_alias)).
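If you're on GCC or clang, one way to spell out that may_alias idea explicitly is the attribute mentioned above. A minimal sketch (GNU C extension, not ISO C; the typedef name is my own):
#include <stdio.h>
#include <stdint.h>

/* A 32-bit integer type that is allowed to alias anything, like char */
typedef uint32_t __attribute__((may_alias)) aliasing_u32;

int main(void) {
    float f = 1.25f;
    aliasing_u32 *p = (aliasing_u32 *)&f;   /* may_alias disables TBAA for this access */
    printf("%08x\n", (unsigned)*p);         /* expect 3fa00000 on IEEE-754 targets */
    return 0;
}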
The other part of your question seems to be why it appears to work in practice, despite the UB.
Your godbolt link forgot to link the actual compilers you tried.
https://godbolt.org/z/rvj3d4e4o shows GCC 4.1, from before GCC went out of its way to support "obvious" local compile-time-visible cases like this, i.e. before it started trying not to break people's buggy code that uses such non-portable idioms.
It loads garbage from stack memory, unless you use -fno-strict-aliasing to make it store the float (movss) to that location first. (Store/reload instead of movd %xmm0, %eax is a missed-optimization bug that's been fixed in later GCC versions for most cases.)
f:                          # GCC4.1 -O3
    movl    -4(%rsp), %eax
    ret

f:                          # GCC4.1 -O3 -fno-strict-aliasing
    movss   %xmm0, -4(%rsp)
    movl    -4(%rsp), %eax
    ret
Even that old GCC version emits warning: dereferencing type-punned pointer will break strict-aliasing rules, which should make it obvious that GCC notices this and does not consider it well-defined. Later GCC versions that do choose to support this code still warn.
It's debatable whether it's better for such code to sometimes work in simple cases but break at other times, versus always failing. But given that GCC -Wall does still warn about it, this is probably a reasonable tradeoff for the convenience of people dealing with legacy code or porting from MSVC. Another option would be to always break it unless people use -fno-strict-aliasing, which they should when dealing with codebases that depend on this behaviour.
Being UB doesn't mean required-to-fail
Just the opposite; it would take tons of extra work to actually trap on every signed overflow in the C abstract machine, for example, especially when optimizing stuff like 2 + c - 3 into c - 1. That's what gcc -fsanitize=undefined tries to do, adding x86 jo instructions after additions (except it still does constant-propagation so it's just adding -1, not detecting temporary overflow on INT_MAX. https://godbolt.org/z/WM9jGT3ac). And it seems strict-aliasing is not one of the kinds of UB it tries to detect at run time.
See also the clang blog article: What Every C Programmer Should Know About Undefined Behavior
An implementation is free to define behaviour the ISO C standard leaves undefined
For example, MSVC always defines this aliasing behaviour, like GCC/clang/ICC do with -fno-strict-aliasing. Of course, that doesn't change the fact that pure ISO C leaves it undefined.
It just means that on those specific C implementations, the code is guaranteed to work the way you want, rather than happening to do so by chance or by de-facto compiler behaviour if it's simple enough for modern GCC to recognize and do the more "friendly" thing.
Just like gcc -fwrapv for signed-integer overflows.
Footnote 1: example of strict-aliasing helping code-gen
#define QUALIFIER // restrict
void convert(float *QUALIFIER pf, const int *pi) {
for(int i=0 ; i<10240 ; i++){
pf[i] = pi[i];
}
}
Godbolt shows that with the -O3 defaults for GCC11.2 for x86-64, we get just a SIMD loop with movdqu / cvtdq2ps / movups and loop overhead. With -O3 -fno-strict-aliasing, we get two versions of the loop, and an overlap check to see if we can run the scalar or the SIMD version.
Are there actual cases where strict aliasing helps better code generation, in ways that cannot be achieved with restrict?
You might well have a pointer that might point into either of two int arrays, but definitely not at any float variable, so you can't use restrict on it. Strict-aliasing will let the compiler still avoid spill/reload of float objects around stores through the pointer, even if the float objects are global vars or otherwise aren't provably local to the function. (Escape analysis.)
Or a struct node * that definitely isn't the same type as the payload in a tree.
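As a sketch of the first case (a hypothetical example of my own): the compiler may keep a float global cached in a register across a store through an int pointer, because under strict aliasing the store cannot touch it.
/* Hypothetical example: under strict aliasing, the store through ip cannot
   modify `total` (a float), so the compiler may keep `total` in a register
   across the store instead of spilling and reloading it. */
float total;

void bump(int *ip, float val) {
    total += val;   /* load total */
    *ip = 0;        /* int store: may not alias a float object */
    total += val;   /* no reload of total needed under strict aliasing */
}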
Also, most code doesn't use restrict all over the place. It could get quite cumbersome. Not just in loops, but in every function that deals with pointers to structs. And if you get it wrong and promise something that's not true, your code's broken.
The Standard was never intended to fully, accurately, and unambiguously partition programs into those that have defined behavior and those that don't(*), but instead relies upon compiler writers to exercise a certain amount of common sense.
(*) If it was intended for that purpose, it fails miserably, as evidenced by the amount of confusion stemming from it.
Consider the following two code snippets:
/* Assume suitable declarations of u are available everywhere */
union test { uint32_t ww[4]; float ff[4]; } u;
/* Snippet #1 */
uint32_t proc1(int i, int j)
{
    u.ww[i] = 1;
    u.ff[j] = 2.0f;
    return u.ww[i];
}

/* Snippet #2, part 1, in one compilation unit */
uint32_t proc2a(uint32_t *p1, float *p2)
{
    *p1 = 1;
    *p2 = 2.0f;
    return *p1;
}

/* Snippet #2, part 2, in another compilation unit */
uint32_t proc2(int i, int j)
{
    return proc2a(u.ww+i, u.ff+j);
}
It is clear that the authors of the Standard intended that the first version of the code be processed meaningfully on platforms where that would make sense, but it's also clear that at least some of the authors of C99 and later versions did not intend to require that the second version be processed likewise (some of the authors of C89 may have intended that the "strict aliasing rule" only apply to situations where a directly named object would be accessed via pointer of another type, as shown in the example given in the published Rationale; nothing in the Rationale suggests a desire to apply it more broadly).
On the other hand, the Standard defines the [] operator in such a fashion that proc1 is semantically equivalent to:
uint32_t proc3(int i, int j)
{
    *(u.ww+i) = 1;
    *(u.ff+j) = 2.0f;
    return *(u.ww+i);
}
and there's nothing in the Standard that would imply that proc2() shouldn't have the same semantics. What gcc and clang seem to do is special-case the [] operator as having a different meaning from pointer dereferencing, but nothing in the Standard makes such a distinction. The only way to consistently interpret the Standard is to recognize that the form with [] falls into the category of actions which the Standard doesn't require implementations to process meaningfully, but relies upon them to handle anyway.
Constructs such as your example of using a directly cast pointer to access storage associated with an object of the original pointer's type fall into a similar category of constructs which at least some authors of the Standard likely expected (and would have demanded, had they not expected it) that compilers would handle reliably, with or without a mandate, since there was no imaginable reason why a quality compiler would do otherwise. Since then, however, clang and gcc have evolved to defy such expectations. Even if clang and gcc would normally generate useful machine code for a function, they seek to perform aggressive inter-procedural optimizations that make it impossible to predict which constructs will be 100% reliable. Unlike some compilers, which refrain from applying potential optimizing transforms unless they can prove that they are sound, clang and gcc seek to perform any transform that can't be proven to affect program behavior.
I've been reading some articles about the Strict Aliasing Rules for a couple of days. Here is my understanding:
An object's effective type is the type of its declaration. If the object is allocated memory, it does not have one until a value is stored into it; the type of the lvalue used for the store then becomes the object's effective type.
An access to the value of an object must be through a type compatible with its effective type.
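As a sketch of the first point applied to allocated storage (a minimal example of my own, following C11 6.5p6):
#include <stdlib.h>

void effective_type_demo(void) {
    void *buf = malloc(sizeof(double));
    if (buf == NULL)
        return;

    double *d = buf;
    *d = 3.5;   /* this store gives the storage an effective type of double (6.5p6) */

    /* Reading those bytes through, say, a long long lvalue would now violate
       6.5p7; reading them through an unsigned char lvalue is still fine. */

    free(buf);
}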
After I thought I got this, I wanted to do a simple experiment to see if my compiler really warns about this when I deliberately break the rule. Here's my code:
#include <stdio.h>

int main(void) {
    unsigned char c[5] = {0x1, 0x2, 0x3, 0x4, 0x5};
    int i = *(int*)c;
    printf("%x\n", i);
    return 0;
}
To me this seems to be against the rule, because c has an effective type of unsigned char and we are trying to access it through an int pointer.
But my gcc compiles just fine! Even with the highest constraining level (-Wstrict-aliasing) it does not give a single warning. But strangely, replacing the int with a float gives the expected response:
#include <stdio.h>

int main(void) {
    unsigned char c[5] = {0x1, 0x2, 0x3, 0x4, 0x5};
    float i = *(float*)c;
    printf("%f\n", i);
    return 0;
}
The compiler gives a warning for this code. (dereferencing type-punned pointer will break strict-aliasing rules [-Wstrict-aliasing])
Does the first code really break the rules? I know casting any pointer into a char* type is fine, but is it true the other way around? Or is it just something gcc does not care so much for?
According to the published Rationale for the C Standard, the purpose of the constraint referred to as the "strict aliasing rule" was to avoid requiring that a compiler given something like:
int x;
int test(double *p)
{
    x = 1;
    *p = 2.0;
    return x;
}
allow for the possibility that the store to *p might affect the value of x, even though there is nothing in the code that would suggest that such a thing could happen. Because the Standard explicitly specifies that implementations may process constructs where it imposes no requirements "in a documented manner characteristic of the environment", the authors saw no need to fully enumerate all of the constructs they expected implementations to handle consistently. Consequently, in order to avoid rendering the language useless, every compiler must process meaningfully some constructs that violate the "aliasing" constraints as written.
From what I can tell, one of the ways that gcc does that is by applying the rules bidirectionally in some cases beyond those provided for by the constraint. The rules as written wouldn't require that an implementation allow for the possibility that an object of character-array type might be accessed using an lvalue of type int, but they also wouldn't require that implementations allow for the possibility that a structure object containing an array of integers might be accessed by dereferencing an int* such as that formed by the decay of the array in an expression like myStruct.intArray[2]. In cases where the authors of gcc recognize that treating a construct as a violation of the "aliasing" constraints would be silly, they will treat the construct as though it were not a constraint violation, and thus will not warn about it.
Using this answer is good because it is very portable, correct and passes compilers set to be strict, but it is less efficient than I want and than it could be, because it doesn't use the x86 bswap instruction. (Maybe other instruction sets have similar efficient instructions.)
If the system I'm running on supports ntohl() then I would expect ntohl() to use the bswap instruction and gets me close. ntohl() does exactly the right thing, but it only works on uint32_t, not on a float. Casting between a uint32_t and a float is type punning and not allowed by strict compilers. Making a union with a float and a uint32_t runs into undefined compiler behavior (per previous posts here).
My understanding from previous posts is that casting from any pointer type to a char * or vice versa is explicitly allowed. So what is wrong with this solution? I haven't seen it mentioned in any answers yet.
char NetworkOrderFloat[4]; // Assume it contains network-order float bytes
uint32_t HostOrderInt = ntohl(*(uint32_t *)NetworkOrderFloat);
char *Pointer = (char *)&HostOrderInt;
float HostOrderFloat = *(float *)Pointer;
The ideal solution here seems to be more environments supporting ntohf(), but that doesn't seem to have happened yet.
Your proposal breaks the strict aliasing rules the first time it does *(uint32_t *)NetworkOrderFloat. The expression (uint32_t *)NetworkOrderFloat is still the address of an array of chars, and accessing it with an lvalue of type uint32_t is against these rules. Details and more examples can be found in this article.
Using a union to convert a float's representation to uint32_t, on the other hand, is not forbidden by the C standard as far as I know. But you can always use memcpy if you worry that it is.
float NetworkOrderFloat = ...;
float HostOrderFloat;
uint32_t tmp;
_Static_assert(sizeof(uint32_t) == sizeof(float), "unsupported arch");
memcpy(&tmp, &NetworkOrderFloat, sizeof(float));
tmp = ntohl(tmp);
memcpy(&HostOrderFloat, &tmp, sizeof(float));
A decent modern compiler should compile the memcpy calls to nothing and ntohl to bswap.
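Wrapped up as a self-contained helper (the function name and signature are my own; it assumes a 4-byte IEEE-754 float and a POSIX ntohl):
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>   /* ntohl on POSIX systems */

/* Convert four network-order bytes holding a float's representation
   into a host float, using only memcpy for the type punning. */
float ntoh_float(const unsigned char bytes[4]) {
    uint32_t tmp;
    float result;
    memcpy(&tmp, bytes, sizeof tmp);     /* bytes -> uint32_t: no aliasing issue */
    tmp = ntohl(tmp);                    /* byte swap on little-endian hosts */
    memcpy(&result, &tmp, sizeof result);
    return result;
}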
Are there limits on what I can do to allocated memory (standard-wise)?
For example
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct str {
    long long a;
    long b;
};

int main(void)
{
    long *x = calloc(4, sizeof(long));
    x[0] = 2;
    x[3] = 7;
    // is anything beyond here legal (if you exclude possible illegal operations)?
    long long *y = x;
    printf("%lld\n", y[0]);
    y[0] = 2;
    memset(x, 0, 16);
    struct str *bar = x;
    bar->b = 4;
    printf("%lld\n", bar->a);
    return 0;
}
To summarize:
Can I recast the pointer to other datatypes and structs, as long as the size fits?
Can I read before I write, then?
If not can I read after I wrote?
Can I use it with a struct smaller than the allocated memory?
Reading from y[0] violates the strict aliasing rule. You use an lvalue of type long long to read objects of effective type long.
Assuming you omit that line, the next troublesome part is memset(x,0,16);. This answer argues that memset does not update the effective type. The standard is not clear.
Assuming that memset leaves the effective type unchanged; the next issue is the read of bar->a.
The C Standard is unclear on this too. Some people say that bar->a implies (*bar).a, and that this is a strict aliasing violation because we never wrote a struct str object to that location first.
Others (including me) say that it is fine: the only lvalue used for access is bar->a; that is an lvalue of type long long, and it accesses an object of effective type long long (the one written by y[0] = 2;).
There is a C2X working group that is working on improving the specification of strict aliasing to clarify these issues.
Can I recast the pointer to other datatypes, as long as the size fits?
You can recast1 to any data type that is at most as large as the memory you allocated. You must, however, write a value to change the effective type of the allocated object, according to 6.5p6.
Can I read before I write, then?
If not can I read after I wrote?
No. Except when otherwise specified (calloc is the otherwise)2, the value in the memory is indeterminate. It may contain trap values. Reading it through a pointer cast to another type in order to reinterpret the value is UB, a violation of strict aliasing (6.5p7).
Can I use it with a struct smaller than the allocated memory?
Yes, but that's a waste.
1 You'll need a cast (or to go through void * first). Otherwise you'd get a rightful complaint from the compiler about incompatible pointer types.
2 Even then some types may trap on a completely 0 bit pattern, so it depends.
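A sketch of how the effective type of allocated storage can be "repainted" by writing a new value (my own example, assuming typical sizes; not taken from the answer above):
#include <stdlib.h>
#include <string.h>

void repaint_effective_type(void) {
    void *mem = malloc(sizeof(long long));
    if (mem == NULL)
        return;

    long *lp = mem;
    lp[0] = 2;                   /* effective type of those bytes: long */

    long long v = 7;
    memcpy(mem, &v, sizeof v);   /* copying from a long long object makes
                                    the effective type long long (6.5p6) */
    long long *llp = mem;
    long long r = *llp;          /* OK: the read matches the effective type */
    (void)r;

    /* Reading lp[0] again here would be a strict-aliasing violation,
       since long is no longer the effective type. */

    free(mem);
}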
Most compilers offer a mode where reads and writes of pointers will act upon the underlying storage, in the sequence they are performed, regardless of the data types involved. The Standard does not require compilers to offer such a mode, but as far as I can tell all quality compilers do so.
According to their published rationale, the authors of the Standard added aliasing restrictions to the language with the stated purpose of avoiding a requirement that compilers make pessimistic aliasing assumptions when given code like:
float f;
float test(int *p)
{
    f = 1.0f;
    *p = 2;
    return f;
}
Note that in the example given in the rationale [very much like the above], even if it were legal to modify the storage used by f via pointer p, a reasonable person looking at the code would have no reason to think it likely that such a thing would ever happen. On the other hand, many compiler writers recognized that if given something like:
float f;
float test(float *p)
{
    f = 1.0f;
    *(int*)p = 2;
    return f;
}
one would have to be deliberately obtuse to think that the code would be unlikely to modify the storage used by a float, and there was consequently no reason why a quality compiler should not regard the write to *(int*)p as a potential write to a float.
Unfortunately, in the intervening years, compiler writers have become increasingly aggressive with type-based aliasing "optimizations", sometimes in ways that go clearly and undeniably beyond what the Standard would allow. Unless a program will never need to access any storage as different types at different times, I'd suggest using -fno-strict-aliasing option on compilers that support it. Otherwise one may have code that complies with the Standard and works today, but fails in a future version of the compiler which has become even more aggressive with its "optimizations".
PS--Disabling type-based aliasing may impact the performance of code in some situations, but proper use of restrict-qualified variables and parameters should avoid the costs of pessimistic aliasing assumptions. With a little care, use of those qualifiers will enable the same optimizations as aggressive aliasing could have done, but much more safely.
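As a sketch of that PS (a minimal example of my own): restrict-qualified parameters recover the no-overlap assumption even when type-based aliasing assumptions are disabled.
/* With restrict, the compiler may vectorize without emitting an overlap
   check, even when building with -fno-strict-aliasing. */
void scale(float *restrict dst, const float *restrict src, int n) {
    for (int i = 0; i < n; i++)
        dst[i] = 2.0f * src[i];
}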
static int* p= (int*)(&foo);
I just know that p points to memory in the code segment.
But I don't know exactly what happens on this line.
I thought maybe it's a pointer to a function, but the syntax for a function pointer is:
returnType (*pointerName) (params,...);
pointerName = &someFunc; // or pointerName=someFunc;
You take the address of foo and cast it to a pointer to int.
If foo and p are of different types, the compiler might issue a warning about the type mismatch. The cast is there to suppress that warning.
For example, consider the following code, which causes a warning from the compiler (initialization from incompatible pointer type):
float foo = 42;
int *p = &foo;
Here foo is a float, while p points to an int. Clearly - different types.
A typecast makes the compiler treat one variable as if it were of a different type. You typecast by putting the new type name in parentheses. Here we make the pointer to float be treated like a pointer to int, and the warning is gone:
float foo = 5;
int *p = (int*)(&foo);
You could omit one pair of parentheses as well and it would mean the same:
float foo = 5;
int *p = (int*)&foo;
The issue is the same if foo is a function. We have a pointer to a function on the right side of the assignment and a pointer to int on the left side, so a cast is added to make the pointer to the function be treated as the address of an int.
A pointer of a type which points to an object (i.e. not void* and not a pointer to a function) cannot be stored to a pointer to any other kind of object without a cast, except in a few cases where the types are identical except for qualifiers. Conforming compilers are required to issue a diagnostic if that rule is broken.
Beyond that, the Standard allows compilers to interpret code that casts pointers in nonsensical fashion unless the code abides by some restrictions which, as written, make such casts essentially useless, for the nominal purpose of promoting optimization. When the rules were written, most compilers would probably do about half of the optimizations that would be allowed under the rules, but would still process pointer casts sensibly, since doing so would cost maybe 5% of the theoretically-possible optimizations. Today, however, it is more fashionable for compiler writers to seek out all cases where an optimization would be allowed by the Standard, without regard for whether they make sense.
Compilers like gcc have an option, -fno-strict-aliasing, that blocks this kind of optimization, both in cases where it would offer big benefits and little risk, and in cases where it would almost certainly break code and be unlikely to offer any real benefit. It would be helpful if they had an option to block only the latter, but I'm unaware of one. Thus, unless one wants to program in a very limited subset of Dennis Ritchie's language, I'd suggest targeting the -fno-strict-aliasing dialect.