Unfortunately, I haven't found anything like std-discussion for the ISO C standard, so I'll ask here.
Before answering the question, make sure you are familiar with the idea of pointer provenance (see DR260 and/or http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2263.htm).
6.3.2.3 (Pointers) paragraph 7 says:
When a pointer to an object is converted to a pointer to a character type,
the result points to the lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining bytes of the object.
The questions are:
What does "a pointer to an object" mean? Does it mean that if we want to get the pointer to the lowest addressed byte of an object of type T, we shall convert a pointer of type cvT* to a pointer to a character type, and converting a pointer to void, obtained from the pointer of type T*, won't give us desired result? Or "a pointer to an object" is the value of a pointer which follows from the pointer provenance and cast to void* does not change the value (analogous to how it was recently formalized in C++17)?
Why does the paragraph explicitly mention increments? Does it mean that adding a value greater than 1 is undefined? Does it mean that decrementing (after we have incremented the result several times, so that we won't go beyond the lower object boundary) is undefined? In short: is the sequence of bytes composing an object an array?
The general description of pointer addition suggests that for any pointer value ptr and signed integers x and y where ((ptr+x)+y) and (x+y) are both defined by the Standard, (ptr+(x+y)) would behave equivalently to ((ptr+x)+y). While it might be possible to argue that the Standard doesn't explicitly say that incrementing a pointer five times would be equivalent to adding 5, there is nothing in the Standard to suggest that quality implementations should not be expected to behave in that fashion.
Note that the authors of the Standard didn't try to make it "language-lawyer-proof". Further, they didn't think anyone would care about whether or not an obviously-inferior implementation was "conforming". An implementation that only works reliably if bytes of an object are accessed sequentially would be less versatile than one which supported reliable indexing, while offering no plausible advantage. Consequently, there should be no need for the Standard to mandate support for indexing, because anyone trying to produce a quality implementation would support it whether the Standard mandated it or not.
Of course, there are some constructs which programmers in the 1990s--or even the authors of the Standard themselves--expected quality compilers to handle reliably, but which some of today's "clever" compilers don't. Whether that means such expectations were unreasonable, or whether they remain accurate when applied to quality compilers, is a matter of opinion. In this particular case, I think the implication that positive indexing should behave like repeated incrementing is strong enough that I wouldn't expect compiler writers to argue otherwise, but I'm not 100% certain that no compiler would ever be "clever"/obtuse enough to look at something like:
int test(unsigned char foo[5][5], int x)
{
    foo[1][0] = 1;
    // Following should yield a pointer that can be used to access the entire
    // array 'foo', but an obtuse compiler writer could perhaps argue that the
    // code is converting the address of foo[0] into a pointer to the first
    // element of that sub-array, and that the resulting pointer is thus only
    // usable to access items within that sub-array.
    unsigned char *p = (unsigned char*)foo;
    // Following should be able to access any element of the object [i.e. foo]
    // whose address was taken
    p[x] = 2;
    return foo[1][0];
}
and decide that it could skip the second read of foo[1][0] since p[x] couldn't possibly access any element of foo beyond the first row. I would, however, say that programmers should not try to code around the possibility of vandals writing a compiler that would behave that way. No program can be made bullet-proof against vandals writing obtuse-but-"conforming" compilers, and the fact that a program can be undermined by such vandals should hardly be viewed as a defect.
Take a non-char C object and create a pointer to it, i.e.
int obj;
int *objPtr = &obj;
convert the pointer to object to pointer to char:
char *charPtr = (char *)objPtr;
now, charPtr points to the lowest addressed byte of the int obj.
increment it:
charPtr++;
now it points to the next byte of the object, and so on until you reach the size of the object. To visit every byte from the start:
charPtr = (char *)objPtr; /* reset to the lowest addressed byte */
int i;
for (i = 0; i < sizeof(obj); i++)
    printf("%d", *charPtr++);
Related
I noticed this warning from Clang:
warning: performing pointer arithmetic on a null pointer
has undefined behavior [-Wnull-pointer-arithmetic]
In details, it is this code which triggers this warning:
int *start = ((int*)0);
int *end = ((int*)0) + count;
The constant literal zero converted to any pointer type yields a null pointer, which does not point to any contiguous area of memory but still has the type "pointer to type" needed to do pointer arithmetic.
Why would arithmetic on a null pointer be forbidden when doing the same on a non-null pointer obtained from an integer different from zero does not trigger any warning?
And more importantly, does the C standard explicitly forbid null pointer arithmetic?
Also, this code will not trigger the warning, but only because the pointer value is not known at compile time:
int *start = ((int*)0);
int *end = start + count;
But a good way of avoiding the undefined behavior is to explicitly cast an integer value to the pointer:
int *end = (int *)(sizeof(int) * count);
The C standard does not allow it.
6.5.6 Additive operators (emphasis mine)
8 When an expression that has integer type is added to or
subtracted from a pointer, the result has the type of the pointer
operand. If the pointer operand points to an element of an array
object, and the array is large enough, the result points to an element
offset from the original element such that the difference of the
subscripts of the resulting and original array elements equals the
integer expression. In other words, if the expression P points to the
i-th element of an array object, the expressions (P)+N (equivalently,
N+(P)) and (P)-N (where N has the value n) point to, respectively, the
i+n-th and i-n-th elements of the array object, provided they exist.
Moreover, if the expression P points to the last element of an array
object, the expression (P)+1 points one past the last element of the
array object, and if the expression Q points one past the last element
of an array object, the expression (Q)-1 points to the last element of
the array object. If both the pointer operand and the result point to
elements of the same array object, or one past the last element of the
array object, the evaluation shall not produce an overflow; otherwise,
the behavior is undefined. If the result points one past the last
element of the array object, it shall not be used as the operand of a
unary * operator that is evaluated.
For the purposes of the above, a pointer to a single object is considered as pointing into an array of 1 element.
Now, ((uint8_t*)0) does not point at an element of an array object, simply because a pointer holding a null pointer value does not point at any object. This is stated in:
6.3.2.3 Pointers
3 If a null pointer constant is converted to a pointer type, the
resulting pointer, called a null pointer, is guaranteed to compare
unequal to a pointer to any object or function.
So you can't do arithmetic on it. The warning is justified, because as the second highlighted sentence mentions, we are in the case of undefined behavior.
Don't be fooled by the fact that the offsetof macro is possibly implemented like that. The standard library is not bound by the constraints placed on user programs. It can employ deeper knowledge. But doing this in our code is not well defined.
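For reference, the traditional definition alluded to here looks roughly like this (a sketch only; the struct is made up, and user code may not portably define offsetof this way precisely because of the null pointer arithmetic involved):

#include <stddef.h>

/* classic implementation-style definition: arithmetic on a null pointer */
#define MY_OFFSETOF(type, member) ((size_t)&((type *)0)->member)

struct s { char c; double d; };
/* MY_OFFSETOF(struct s, d) evaluates to the byte offset of d within struct s */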
When the C Standard was written, the vast majority of C implementations would, for any non-void* pointer value p, uphold the invariants that p+0 and p-0 both yield p, and p-p will yield zero. More generally, operations like a size-zero memcpy or fwrite that operate on a buffer of size N would ignore the buffer address when N was zero. Such behavior would allow programmers to avoid having to write code to handle corner cases. For example, code to output a packet with an optional payload passed via address and length arguments would naturally process (NULL,0) as an empty payload.
Nothing in the published Rationale for the C Standard suggests that implementations whose target platforms would naturally behave in such fashion shouldn't continue to work as they always had. There were, however, a few platforms where it may have been expensive to uphold such behavioral guarantees in cases where p is null.
As with most situations where the vast majority of C implementations would process a construct identically, but implementations might exist where such treatment would be impractical, the Standard characterizes the addition of zero to a null pointer as Undefined Behavior. The Standard allows implementations to, as a form of "conforming language extension", define the behavior of constructs in cases where it imposes no requirements, and it allow conforming (but not strictly conforming) programs to make use of them. According to the published Rationale, the stated intention was that support for such "popular extensions" be regarded as a "quality of implementation" issue to be decided by the marketplace. Implementations that could support them at essentially zero cost would do so, but implementations where such support would be expensive would be free to support such constructs or not based upon their customers' needs.
If one is using a compiler that targets commonplace platforms, and is designed to process the widest range of useful programs reasonably efficiently, then the extended semantics surrounding pointer arithmetic may allow one to write code more efficiently than would otherwise be possible. If one is targeting a compiler that does not value compatibility with quality compilers, however, one should recognize that it may treat the Standard's allowance for quirky hardware as an invitation to behave nonsensically even on commonplace hardware. Of course, one should also be aware that such compilers may behave nonsensically in corner cases where adherence with the Standard would require them to forego optimizations that are unsound but would "usually" be safe.
I was playing around with some arrays and pointers in c and started wondering whether doing this would be undefined behavior.
int (*arr)[5] = malloc(sizeof(int[5][5]));
// Is this undefined behavior?
int val0 = arr[0][5];
// Rephrased, is it guaranteed it'll always have the same effect as this line?
int val1 = arr[1][0];
Thank you for any insights.
In C, what you're doing is undefined behavior.
The expression arr[0] has type int [5]. So the expression arr[0][5] dereferences one element past the end of the array arr[0], and dereferencing past the end of an array is undefined behavior.
Section 6.5.2.1p2 of the C standard regarding Array Subscripting states:
The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))).
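Applied to the code in the question, that identity expands the access like this (illustration only):

int val0 = *(*(arr + 0) + 5);  /* arr[0] has type int[5]; subscript 5 forms a
                                  pointer one past its end and dereferences it */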
And section 6.5.6p8 of the C standard regarding Additive Operators states:
When an expression that has integer type is added to or
subtracted from a pointer, the result has the type of the pointer
operand. If the pointer operand points to an element of an array
object, and the array is large enough, the result points to an element
offset from the original element such that the difference of the
subscripts of the resulting and original array elements equals the
integer expression. In other words, if the expression P points to
the i-th element of an array object, the expressions (P)+N
(equivalently, N+(P)) and (P)-N (where N has the value n)
point to, respectively, the i+n-th and i-n-th elements of the
array object, provided they exist. Moreover, if the
expression P points to the last element of an array object, the
expression (P)+1 points one past the last element of the array
object, and if the expression Q points one past the last
element of an array object, the expression (Q)-1 points to the
last element of the array object. If both the pointer operand
and the result point to elements of the same array object,
or one past the last element of the array object, the evaluation
shall not produce an overflow; otherwise, the behavior is undefined.
If the result points one past the last element of the array object, it
shall not be used as the operand of a unary * operator that is
evaluated.
The part in bold specifies that the addition implicit in an array subscript may not result in a pointer more than one element past the end of an array, and that a pointer one element past the end of an array may not be dereferenced.
The fact that the array in question is itself an element of an array, meaning the elements of each subarray are contiguous in memory, doesn't change this. Aggressive optimization settings in the compiler may note that it is undefined behavior to access past the end of the array and make optimizations based on that fact.
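If the goal is a flat view of the contiguous allocation, one commonly suggested way to avoid the out-of-bounds subscript is to do the arithmetic on a character type and copy the value out; a sketch reusing the names from the question (a fragment, as in the question; whether this fully sidesteps the provenance debate is itself argued about):

#include <stdlib.h>
#include <string.h>

int (*arr)[5] = malloc(sizeof(int[5][5]));
int val0;
/* character-type access to the bytes of the allocation, so no int-pointer
   arithmetic ever crosses a row boundary */
memcpy(&val0, (char *)arr + 5 * sizeof(int), sizeof val0);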
The Standard is clearly intended to avoid requiring that a compiler given something like:
int foo[5][10];

int test(int i)
{
    foo[1][0] = 1;
    foo[0][i] = 2;
    return foo[1][0];
}
must reload the value of foo[1][0] to accommodate the possibility that the write to foo[0][i] might affect foo[1][0]. On the other hand, before the Standard was written, it would have been idiomatic to write something like:
void dump_array(int *p, int rows, int cols)
{
    int i,j;
    for (i=0; i<rows; i++)
    {
        for (j=0; j<cols; j++)
            printf("%6d", *p++);
        printf("\n");
    }
}
int foo[5][10];
...
dump_array(foo[0], 5, 10);
and nothing in the published Rationale suggests that the authors had any intention of forbidding such constructs nor breaking code that used them. Indeed, the primary benefit of requiring that rows of an array be placed consecutively, even when adding padding would improve efficiency, is to allow such code to function.
At the time the Standard was written, when generating code for a function that received a pointer, compilers would treat the pointer as though it might identify some arbitrary part of some arbitrary larger object, without making any effort to know or care about what that enclosing object might be. They would thus, as a very popular form of "conforming language extension", support constructs like dump_array without regard for whether the Standard required them to do so, and consequently the authors of the Standard saw no reason to worry about when the Standard mandated such support. Instead, they left such matters as a Quality of Implementation issue over which the Standard could waive jurisdiction.
Unfortunately, because the authors of the Standard expected that compilers would treat the act of passing a pointer to a function as implicitly "laundering" it, the authors of the Standard saw no need to define any explicit method for laundering information about a pointer's enclosing objects in cases where it would be necessary for a function to treat a pointer identifying "raw" storage. Such distinctions didn't matter given the state of compiler technology in the 1980s, but may be quite relevant if e.g. code does something like:
int matrix[10][10];

void test2(int c)
{
    matrix[4][0] = 1;
    dump_array(matrix[0], 1, c);
    matrix[4][0] = 2;
}

or

void test3(int r)
{
    matrix[4][0] = 1;
    dump_array((int*)matrix, r, 10);
    matrix[4][0] = 2;
}
Depending upon what the function is intended to do, having a compiler optimize out the first write to matrix[4][0] in one or both may improve efficiency, or it may cause the generated code to behave uselessly. Treating explicit pointer conversions as erasing type information, but treating array-to-pointer decay as retaining it, would allow programmers to achieve required semantics if they write code as in the second example, while allowing compilers to perform the relevant optimizations when source code is written as in the first example. Unfortunately, the Standard makes no distinctions, and maintainers of free compilers are loath to forego any "optimizations" they view the Standard as giving them, leaving the language with nothing but "hope for the best" semantics except on implementations that either refrain from cross-procedural optimizations or document what needs to be done to block them.
Which of the following ways of treating and trying to recover a C pointer, is guaranteed to be valid?
1) Cast to void pointer and back
int f(int *a) {
    void *b = a;
    a = b;
    return *a;
}
2) Cast to appropriately sized integer and back
int f(int *a) {
    uintptr_t b = a;
    a = (int *)b;
    return *a;
}
3) A couple of trivial integer operations
int f(int *a) {
    uintptr_t b = a;
    b += 99;
    b -= 99;
    a = (int *)b;
    return *a;
}
4) Integer operations nontrivial enough to obscure provenance, but which will nonetheless leave the value unchanged
int f(int *a) {
    uintptr_t b = a;
    char s[32];
    // assume %lu is suitable
    sprintf(s, "%lu", b);
    b = strtoul(s, NULL, 10);
    a = (int *)b;
    return *a;
}
5) More indirect integer operations that will leave the value unchanged
int f(int *a) {
    uintptr_t b = a;
    for (uintptr_t i = 0;; i++)
        if (i == b) {
            a = (int *)i;
            return *a;
        }
}
Obviously case 1 is valid, and case 2 surely must be also. On the other hand, I came across a post by Chris Lattner - which I unfortunately can't find now - saying something similar to case 5 is not valid, that the standard licenses the compiler to just compile it to an infinite loop. Yet each case looks like an unobjectionable extension of the previous one.
Where is the line drawn between a valid case and an invalid one?
Added based on discussion in comments: while I still can't find the post that inspired case 5, I don't remember what type of pointer was involved; in particular, it might have been a function pointer, which might be why that case demonstrated invalid code whereas my case 5 is valid code.
Second addition: okay, here's another source that says there is a problem, and this one I do have a link to. https://www.cl.cam.ac.uk/~pes20/cerberus/notes30.pdf - the discussion about pointer provenance - says, and backs up with evidence, that no, if the compiler loses track of where a pointer came from, it's undefined behavior.
According to the C11 draft standard:
Example 1
Valid, by §6.5.16.1, even without an explicit cast.
Example 2
The intptr_t and uintptr_t types are optional. Assigning a pointer to an integer requires an explicit cast (§6.5.16.1), although gcc and clang will only warn you if you don’t have one. With those caveats, the round-trip conversion is valid by §7.20.1.4. ETA: John Bellinger brings up that the behavior is only specified when you do an intermediate cast to void* both ways. However, both gcc and clang allow the direct conversion as a documented extension.
Example 3
Safe, but only because you’re using unsigned arithmetic, which cannot overflow, and are therefore guaranteed to get the same object representation back. An intptr_t could overflow! If you want to do pointer arithmetic safely, you can convert any kind of pointer to char* and then add or subtract offsets within the same structure or array. Remember, sizeof(char) is always 1. ETA: The standard guarantees that the two pointers compare equal, but your link to Chisnall et al. gives examples where compilers nevertheless assume the two pointers do not alias each other.
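A small sketch of the char* technique mentioned above (the struct and function are invented for illustration):

#include <stddef.h>
#include <string.h>

struct packet { int header; int payload; };

int read_payload(const struct packet *p)
{
    const char *base = (const char *)p;   /* byte view of the same object */
    int value;
    /* the offset stays within the same structure, so the arithmetic is defined */
    memcpy(&value, base + offsetof(struct packet, payload), sizeof value);
    return value;
}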
Example 4
Always, always, always check for buffer overflows whenever you read from and especially whenever you write to a buffer! If you can mathematically prove that overflow cannot happen by static analysis? Then write out the assumptions that justify that, explicitly, and assert() or static_assert() that they haven't changed. Use snprintf(), not the unsafe sprintf()! If you remember nothing else from this answer, remember that!
To be absolutely pedantic, the portable way to do this would be to use the format specifiers in <inttypes.h> and define the buffer length in terms of the maximum value of any pointer representation. In the real world, you would print pointers out with the %p format.
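A hedged sketch of that pedantically portable round trip (the buffer size is deliberately generous; strtoumax and PRIuPTR come from <inttypes.h>):

#include <inttypes.h>
#include <limits.h>
#include <stdio.h>

int f(int *a)
{
    uintptr_t b = (uintptr_t)(void *)a;
    char s[sizeof(uintptr_t) * CHAR_BIT + 1]; /* more than enough digits */
    snprintf(s, sizeof s, "%" PRIuPTR, b);
    b = (uintptr_t)strtoumax(s, NULL, 10);
    a = (int *)(void *)b;
    return *a;
}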
The answer to the question you intended to ask is yes, though: all that matters is that you get back the same object representation. Here’s a less contrived example:
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    int i = 1;
    const uintptr_t u = (uintptr_t)(void*)&i;
    uintptr_t v;
    memcpy( &v, &u, sizeof(v) );
    int* const p = (int*)(void*)v;
    assert(p == &i);
    *p = 2;
    printf( "%d = %d.\n", i, *p );
    return EXIT_SUCCESS;
}
All that matters is the bits in your object representation. This code also follows the strict aliasing rules in §6.5. It compiles and runs fine on the compilers that gave Chisnall et al trouble.
Example 5
This works, same as above.
One extremely pedantic footnote that will never ever be relevant to your coding: some obsolete esoteric hardware has one’s-complement or sign-and-magnitude representation of signed integers, and on these, there might be a distinct value of negative zero that might or might not trap. On some CPUs, this might be a valid pointer or null pointer representation distinct from positive zero. And on some CPUs, positive and negative zero might compare equal.
PS
The standard says:
Two pointers compare equal if and only if both are null pointers, both are pointers to the same object (including a pointer to an object and a subobject at its beginning) or function, both are pointers to one past the last element of the same array object, or one is a pointer to one past the end of one array object and the other is a pointer to the start of a different array object that happens to immediately follow the first array object in the address space.
Furthermore, if the two array objects are consecutive rows of the same multidimensional array, one past the end of the first row is a valid pointer to the start of the next row. Therefore, even a pathological implementation that deliberately sets out to cause as many bugs as the standard allows could only do so if your manipulated pointer compares equal to the address of an array object, in which case the implementation might in theory decide to interpret it as one-past-the-end of some other array object instead.
The intended behavior is clearly that the pointer comparing equal both to &array1+1 and to &array2 is equivalent to both: it means to let you compare it to addresses within array1 or dereference it to get array2[0]. However, the Standard does not actually say that.
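To make the consecutive-rows case concrete (the assert merely restates the guaranteed comparison):

#include <assert.h>

int a[2][3];

void demo(void)
{
    int *one_past_row0 = a[0] + 3;  /* one past the end of the first row */
    int *row1 = a[1];               /* start of the next row */
    assert(one_past_row0 == row1);  /* guaranteed equal by the quoted rule */
}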
PPS
The standards committee has addressed some of these issues and proposes that the C standard explicitly add language about pointer provenance. This would nail down whether a conforming implementation is allowed to assume that a pointer created by bit manipulation does not alias another pointer.
Specifically, the proposed corrigendum would introduce pointer provenance and allow pointers with different provenance not to compare equal. It would also introduce a -fno-provenance option, which would guarantee that any two pointers compare equal if and only if they have the same numeric address. (As discussed above, two object pointers that compare equal alias each other.)
1) Cast to void pointer and back
This yields a valid pointer equal to the original. Paragraph 6.3.2.3/1 of the standard is clear on this:
A pointer to void may be converted to or from a pointer to any object type. A pointer to any object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.
2) Cast to appropriately sized integer and back
3) A couple of trivial integer operations
4) Integer operations nontrivial enough to obscure provenance, but which will nonetheless leave the value unchanged
5) More indirect integer operations that will leave the value unchanged
[...] Obviously case 1 is valid, and case 2 surely must be also. On the other hand, I came across a post by Chris Lattner - which I unfortunately can't find now - saying case 5 is not valid, that the standard licenses the compiler to just compile it to an infinite loop.
C does require a cast when converting either way between pointers and integers, and you have omitted some of those in your example code. In that sense your examples (2) - (5) are all non-conforming, but for the rest of this answer I'll pretend the needed casts are there.
Still, being very pedantic, all of these examples have implementation-defined behavior, so they are not strictly conforming. On the other hand, "implementation-defined" behavior is still defined behavior; whether that means your code is "valid" or not depends on what you mean by that term. In any event, what code the compiler might emit for any of the examples is a separate matter.
These are the relevant provisions of the standard from section 6.3.2.3 (emphasis added):
An integer may be converted to any pointer type. Except as previously specified, the result is implementation-defined, might not be correctly aligned, might not point to an entity of the referenced type, and might be a trap representation.
Any pointer type may be converted to an integer type. Except as previously specified, the result is implementation-defined. If the result cannot be represented in the integer type, the behavior is undefined. The result need not be in the range of values of any integer type.
The definition of uintptr_t is also relevant to your particular example code. The standard describes it this way (C2011, 7.20.1.4/1; emphasis added):
an unsigned integer type with the property that any valid pointer to void can be converted to this type, then converted back to pointer to void, and the result will compare equal to the original pointer.
You are converting back and forth between int * and uintptr_t. int * is not void *, so 7.20.1.4/1 does not apply to these conversions, and the behavior is implementation-defined per section 6.3.2.3.
Suppose, however, that you convert back and forth via an intermediate void *:
uintptr_t b = (uintptr_t)(void *)a;
a = (int *)(void *)b;
On an implementation that provides uintptr_t (which is optional), that would make your examples (2 - 5) all strictly conforming. In that case, the result of the integer-to-pointer conversions depends only on the value of the uintptr_t object, not on how that value was obtained.
As for the claims you attribute to Chris Lattner, they are substantially incorrect. If you have represented them accurately, then perhaps they reflect a confusion between implementation-defined behavior and undefined behavior. If the code exhibited undefined behavior then the claim might hold some water, but that is not, in fact, the case.
Regardless of how its value was obtained, b has a definite value of type uintptr_t, and the loop must eventually increment i to that value, at which point the if block will run. In principle, the implementation-defined behavior of a conversion from uintptr_t directly to int * could be something crazy such as skipping the next statement (thus causing an infinite loop), but such behavior is entirely implausible. Every implementation you ever meet will either fail at that point, or store some value in variable a, and then, if it didn't crash, it will execute the return statement.
Because different application fields require the ability to manipulate pointers in different ways, and because the best implementations for some purposes may be totally unsuitable for some others, the C Standard treats the support (or lack thereof) for various kinds of manipulations as a Quality of Implementation issue. In general, people writing an implementation for a particular application field should be more familiar than the authors of the Standard with what features would be useful to programmers in that field, and people making a bona fide effort to produce quality implementations suitable for writing applications in that field will support such features whether the Standard requires them to or not.
In the pre-Standard language invented by Dennis Ritchie, all pointers of a particular type that identified the same address were equivalent. If any sequence of operations on a pointer would end up producing another pointer of the same type that identified the same address, that pointer would--essentially by definition--be equivalent to the first. The C Standard specifies some situations, however, where pointers can identify the same location in storage and be indistinguishable from each other without being equivalent. For example, given:
int foo[2][4] = {0};
int *p = foo[0]+4, *q=foo[1];
both p and q will compare equal to each other, and to foo[0]+4, and to foo[1]. On the other hand, despite the fact that evaluation of p[-1] and q[0] would have defined behavior, evaluation of p[0] or q[-1] would invoke UB. Unfortunately, while the Standard makes clear that p and q would not be equivalent, it does nothing to clarify whether performing various sequences of operations on e.g. p will yield a pointer that is usable in all cases where p was usable, a pointer that's usable in all cases where either p or q would be usable, a pointer that's only usable in cases where q would be usable, or a pointer that's only usable in cases where both p and q would be usable.
Quality implementations intended for low-level programming should generally process pointer manipulations, other than those involving restrict pointers, in a way that yields a pointer usable in any case where a pointer that compares equal to it would be usable. Unfortunately, the Standard provides no means by which a program can determine whether it's being processed by a quality implementation that is suitable for low-level programming and refuse to run if it isn't, so most forms of systems programming must rely upon quality implementations processing certain actions in documented manner characteristic of the environment even when the Standard would impose no requirements.
Incidentally, even if the normal constructs for manipulating pointers wouldn't have any way of creating pointers where the principles of equivalence shouldn't apply, some platforms may define ways of creating "interesting" pointers. For example, if an implementation that would normally trap operations on null pointers were run on in an environment where it might sometimes be necessary to access an object at address zero, it might define a special syntax for creating a pointer that could be used to access any address, including zero, within the context where it was created. A "legitimate pointer to address zero" would likely compare equal to a null pointer (even though they're not equivalent), but performing a round-trip conversion to another type and back would likely convert what had been a legitimate pointer to address zero into a null pointer. If the Standard had mandated that a round-trip conversion of any pointer must yield one that's usable in the same way as the original, that would require that the compiler omit null traps on any pointers that could have been produced in such a fashion, even if they would far more likely have been produced by round-tripping a null pointer.
Incidentally, from a practical perspective, "modern" compilers, even in -fno-strict-aliasing, will sometimes attempt to track provenance of pointers through pointer-integer-pointer conversions in such a way that pointers produced by casting equal integers may sometimes be assumed incapable of aliasing.
For example, given:
#include <stdint.h>

extern int x[],y[];

int test(void)
{
    if (!x[0]) return 999;
    uintptr_t upx = (uintptr_t)x;
    uintptr_t upy = (uintptr_t)(y+1);
    // Consider code with and without the following line
    if (upx == upy) upy = upx;
    if ((upx ^ ~upy)+1) // Will return if upx != upy
        return 123;
    int *py = (int*)upy;
    *py += 1;
    return x[0];
}
In the absence of the marked line, gcc, icc, and clang will all assume, even when using -fno-strict-aliasing, that an operation on *py can't affect x[0], even though the only way that code could be reached would be if upx and upy held the same value (meaning that py was formed from the same address as x). Adding the marked line causes icc and clang to recognize that py could identify an element of x, but gcc assumes that the assignment can be optimized out, even though it should mean that py would be derived from x--a situation a quality compiler should have no trouble recognizing as implying possible aliasing.
I'm not sure what realistic benefit compiler writers expect from their efforts to track provenance of uintptr_t values, given that I don't see much purpose for doing such conversions outside of cases where the results of conversions may be used in "interesting" ways. Given compiler behavior, however, I'm not sure I see any nice way to guarantee that conversions between integers and pointers will behave in a fashion consistent with the values involved.
I've been trying to understand a particular aspect of strict aliasing recently, and I think I have made the smallest possible interesting piece of code. (Interesting for me, that is!)
Update: Based on the answers so far, it's clear I need to clarify the question. The first listing here is "obviously" defined behaviour, from a certain point of view. The real issue is to follow this logic through to custom allocators and custom memory pools. If I malloc a large block of memory at the start, and then write my own my_malloc and my_free that uses that single large block, then is it UB on the grounds that it doesn't use the official free?
I'll stick with C, somewhat arbitrarily. I get the impression it is easier to talk about, that the C standard is a bit clearer.
#include <stdint.h>
#include <stdlib.h>

int main() {
    uint32_t *p32 = malloc(4);
    *p32 = 0;
    free(p32);
    uint16_t *p16 = malloc(4);
    p16[0] = 7;
    p16[1] = 7;
    free(p16);
}
It is possible that the second malloc will return the same address as the first malloc (because it was freed in between). That means that it is accessing the same memory with two different types, which violates strict aliasing. So surely the above is undefined behaviour (UB)?
(For simplicity, let's assume the malloc always succeeds. I could add in checks for the return value of malloc, but that would just clutter the question)
If it's not UB, why? Is there an explicit exception in the standard, which says that malloc and free (and calloc/realloc/...) are allowed to "delete" the type associated with a particular address, allowing further accesses to "imprint" a new type on the address?
If malloc/free are special, then does that mean I cannot legally write my own allocator which clones the behaviour of malloc? I'm sure there are plenty of projects out there with custom allocators - are they all UB?
Custom allocators
If we decide, therefore, that such custom allocators must be defined behaviour, then it means the strict aliasing rule is essentially "incorrect". I would update it to say that it is possible to write (not read) through a pointer of a different ('new') type as long as you don't use pointers of the old type any more. This wording could be quietly-ish changed if it was confirmed that all compilers have essentially obeyed this new rule anyway.
I get the impression that gcc and clang essentially respect my (aggressive) reinterpretation. If so, perhaps the standards should be edited accordingly? My 'evidence' regarding gcc and clang is difficult to describe: it uses memmove with an identical source and destination (which is therefore optimized out) in such a way that it blocks any undesirable optimizations, because it tells the compiler that future reads through the destination pointer will alias the bit pattern that was previously written through the source pointer. I was able to block the undesirable interpretations accordingly. But I guess this isn't really 'evidence'; maybe I was just lucky. UB clearly means that the compiler is also allowed to give me misleading results!
( ... unless, of course, there is another rule that makes memcpy and memmove special in the same way that malloc may be special, namely that they are allowed to change the effective type to the type of the destination pointer. That would be consistent with my 'evidence'. )
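For concreteness, the idiom described above looks roughly like this (whether the self-memmove really re-imprints the effective type is exactly the open question):

#include <stdint.h>
#include <string.h>

/* Attempt to reuse storage previously written as uint32_t for uint16_t
   values. The copy itself is a no-op, but under the reinterpretation above
   it tells the compiler the bytes now belong to the destination type. */
static uint16_t *imprint_u16(uint32_t *old, size_t bytes)
{
    memmove(old, old, bytes);   /* source == destination */
    return (uint16_t *)(void *)old;
}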
Anyway, I'm rambling. I guess a very short answer would be: "Yes, malloc (and friends) are special. Custom allocators are not special and are therefore UB, unless they maintain separate memory pools for each type. And, further, see example X for an extreme piece of code where compiler Y does undesirable stuff precisely because compiler Y is very strict in this regard and is contradicting this reinterpretation."
Follow up: what about non-malloced memory? Does the same thing apply? (Local variables, static variables, ...)
Here are the C99 strict aliasing rules in (what I hope is) their entirety:
6.5
(6) The effective type of an object for an access to its stored value is the declared type of the object, if any. If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.
(7) An object shall have its stored value accessed only by an lvalue expression that has one of
the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
These two clauses together prohibit one specific case, storing a value via an lvalue of type X and then subsequently retrieving a value via an lvalue of type Y incompatible with X.
So, as I read the standard, even this usage is perfectly OK (assuming 4 bytes are enough to store either a uint32_t or two uint16_t).
#include <stdint.h>
#include <stdlib.h>

int main() {
    uint32_t *p32 = malloc(4);
    *p32 = 0;
    /* do not do this: free(p32); */
    /* do not do this: uint16_t *p16 = malloc(4); */
    /* do this instead: */
    uint16_t *p16 = (uint16_t *)p32;
    p16[0] = 7;
    p16[1] = 7;
    free(p16);
}
There's no rule that prohibits storing a uint32_t and then subsequently storing a uint16_t at the same address, so we're perfectly OK.
Thus there's nothing that would prohibit writing a fully compliant pool allocator.
Your code is correct C and does not invoke undefined behaviour (except that you do not test the malloc return value), because:
you allocate a block of memory, use it and free it;
you allocate another block of memory, use it and free it.
What is undefined is whether p16 will receive the same value p32 had at a different time.
What would be undefined behaviour, even if the value were the same, would be to access *p32 after the block has been freed. Example:
#include <stdint.h>
#include <stdlib.h>

int main() {
    uint32_t *p32 = malloc(4);
    *p32 = 0;
    free(p32);
    uint16_t *p16 = malloc(4);
    p16[0] = 7;
    p16[1] = 7;
    if ((void *)p16 == (void *)p32) { // whether p16 and p32 are equal is undefined
        uint32_t x = *p32; // accessing *p32 is explicitly UB
    }
    free(p16);
}
It is UB because you try to access a memory block after it has been freed. And even if it did point to a live memory block, that block has been initialized as an array of uint16_t, so using it through a pointer to another type is formally undefined behaviour.
Custom allocation (assuming a C99 conformant compiler):
So you have a big chunk of memory and want to write custom free and malloc functions without UB. It is possible. Here I will not go too far into the hard part, the management of allocated and free blocks, and will just give hints.
You will need to know the strictest alignment for the implementation. stdlib malloc knows it, because 7.20.3 §1 of the C99 language specification (draft n1256) says: The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object. It is generally 4 on 32-bit systems and 8 on 64-bit systems, but might be greater or lesser ...
Your memory pool must be a char array, because 6.3.2.3 §7 says: A pointer to an object or incomplete type may be converted to a pointer to a different object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer. When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object. That means that, provided you can deal with the alignment, a character array of the correct size can be converted to a pointer to an arbitrary type (and this is the basis of malloc implementations).
You must make your memory pool start at an address compatible with the system alignment:

intptr_t orig_addr = (intptr_t)chunk;
intptr_t delta = orig_addr % alignment;
char *pool = chunk + alignment - delta; /* pool is now aligned */
You now only have to return, from your own pool, addresses of blocks obtained as pool + n * alignment and converted to void *: 6.3.2.3 §1 says: A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.
It would be cleaner with C11, because C11 explicitly added the _Alignas and _Alignof keywords to deal with alignment, which would be better than the current hack. But it should work nonetheless.
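Putting those hints together with the C11 keywords, a minimal bump-allocator sketch (no free-list management; POOL_SIZE is an arbitrary choice):

#include <stddef.h>

#define POOL_SIZE 4096

static _Alignas(max_align_t) char pool[POOL_SIZE];
static size_t next_free;   /* offset of the first unallocated byte */

void *my_malloc(size_t size)
{
    size_t align = _Alignof(max_align_t);
    size_t rounded = (size + align - 1) / align * align; /* keep blocks aligned */
    if (rounded < size || rounded > POOL_SIZE - next_free)
        return NULL;   /* arithmetic overflow or pool exhausted */
    void *p = pool + next_free;
    next_free += rounded;
    return p;
}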
Limits:
I must admit that my interpretation of 6.3.2.3 §7, namely that a pointer to a correctly aligned char array can be converted to a pointer of another type, is not really neat and clear. Some may argue that all it says is that if the pointer originally pointed to the other type, it can be used as a char pointer; as I start from a char pointer, the conversion is not explicitly allowed. That's true, but it is the best that can be done; it is not explicitly marked as undefined behaviour ... and it is what malloc does under the hood.
As alignment is explicitly implementation dependent, you cannot create a general library usable on any implementation.
The actual rules regarding aliasing are laid out in standard section 6.5, paragraph 7. Note the wording:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
(emphasis mine)
Aliasing includes the notion of objects, not just general memory. For malloc to have returned the same address on a second use requires the original object to have been deallocated. Even if it has the same address, it is not considered the same object. Any attempts to access the first object through dangling pointers leftover after free are UB for completely different reasons, so there's no aliasing because any continued use of the first pointer p32 is invalid anyway.
I'm redefining sizeof as:
#undef sizeof
#define sizeof(type) ((char*)((type*)(0) + 1) - (char*)((type*)(0)))
For this to work, the 2 '0' in the definition need to be the same entity in memory, or in other words, need to have the same address. Is this always guaranteed, or is it compiler/architecture/run-time dependent?
The 0 here is not an object – it is an address. So the question you ask is something of a non-sequitur.
You are thinking that the zeros are discrete pieces of data that need to be stored somewhere. They aren't: they are being cast as pointers to memory location zero.
When you increment a pointer to a type, it is actually incremented by the size of the type it points to. This is how C array arithmetic works.
In practice, a null pointer of a certain type always refers to the same location in memory (especially when constructed the same way, as you do above), simply because any other implementation would be senseless.
However, the standard actually does not guarantee a lot about this:
"[...] is guaranteed to compare unequal to a pointer to any object or function." 6.3.2.3§3
"[...] Any two null pointers shall compare equal." 6.3.2.3§4
This leaves a lot of leeway. Assume a memory model with two distinct regions. Each region could have a sub-region of null pointers (say the first 128 bytes). It is easy to see that even in that weird case, the basic assumptions about null pointers can indeed hold! Well, given a proper compiler that emits suitably weird null tests...
So, what else do we know about pointers in general...
What you are trying to do is first, increment a pointer
"one operand shall be a pointer to a complete object type and the other shall have integer type. (Incrementing is equivalent to adding 1.)" [6.5.6§2]
and then a pointer difference
"both operands are pointers to qualified or unqualified versions of compatible complete object types" [6.5.6§3]
OK, they are (well, assuming type is a complete object type). But what about semantics?
"For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type." [6.5.6§7]
This is actually a bit of a problem: The null pointer need not point to an actual object! (Otherwise you could dereference it safely...) Therefore, incrementing it or subtracting it from another pointer is UB!
To conclude: 0 does not point to an object, and therefore the answer to your question is No.
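For contrast, the same pointer-difference trick applied to an actual object stays within the quoted rules, because forming a pointer one past the end of a single object is explicitly allowed:

#include <stddef.h>

/* well defined: &(x) + 1 is a valid one-past-the-end pointer for the
   single object x, so the subtraction satisfies 6.5.6 */
#define SIZEOF_OBJ(x) ((size_t)((char *)(&(x) + 1) - (char *)&(x)))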
A strictly standards-conforming compiler could reject this, or return some nonsense. On "typical" machines, integers and pointers have the same size, and casting an integer to a pointer just takes that bit pattern and looks at it as a pointer. There are machines where words contain extra data (type bits, perhaps, or permission bits). Some addresses might be forbidden for certain objects (e.g., nothing can have address 0), and so on. While it is guaranteed that sizeof(char) == 1, on e.g. Crays a character is actually 32 bits.
Besides, the C standard guarantees that the expression in sizeof(expression) is not evaluated at all (except for variable length arrays); just its type is taken. I.e., sizeof(x++) doesn't increment x.