I've been trying to understand a particular aspect of strict aliasing recently, and I think I have made the smallest possible interesting piece of code. (Interesting for me, that is!)
Update: Based on the answers so far, it's clear I need to clarify the question. The first listing here is "obviously" defined behaviour, from a certain point of view. The real issue is to follow this logic through to custom allocators and custom memory pools. If I malloc a large block of memory at the start, and then write my own my_malloc and my_free that uses that single large block, then is it UB on the grounds that it doesn't use the official free?
I'll stick with C, somewhat arbitrarily. I get the impression it is easier to talk about, that the C standard is a bit clearer.
int main() {
uint32_t *p32 = malloc(4);
*p32 = 0;
free(p32);
uint16_t *p16 = malloc(4);
p16[0] = 7;
p16[1] = 7;
free(p16);
}
It is possible that the second malloc will return the same address as the first malloc (because it was freed in between). That means that it is accessing the same memory with two different types, which violates strict aliasing. So surely the above is undefined behaviour (UB)?
(For simplicity, let's assume the malloc always succeeds. I could add in checks for the return value of malloc, but that would just clutter the question)
If it's not UB, why? Is there an explicit exception in the standard, which says that malloc and free (and calloc/realloc/...) are allowed to "delete" the type associated with a particular address, allowing further accesses to "imprint" a new type on the address?
If malloc/free are special, then does that mean I cannot legally write my own allocator which clones the behaviour of malloc? I'm sure there are plenty of projects out there with custom allocators - are they all UB?
Custom allocators
If we decide, therefore, that such custom allocators must be defined behaviour, then it means the strict aliasing rule is essentially "incorrect". I would update it to say that it is possible to write (not read) through a pointer of a different ('new') type as long as you don't use pointers of the old type any more. This wording could be quietly-ish changed if it was confirmed that all compilers have essentially obeyed this new rule anyway.
I get the impression that gcc and clang essentially respect my (aggressive) reinterpretation. If so, perhaps the standards should be edited accordingly? My 'evidence' regarding gcc and clang is difficult to describe, it uses memmove with an identical source and destination (which is therefore optimized out) in such a way that it blocks any undesirable optimizations because it tells the compiler that future reads through the destination pointer will alias the bit pattern that was previously written through the source pointer. I was able to block the undesirable interpretations accordingly. But I guess this isn't really 'evidence', maybe I was just lucky. UB clearly means that the compiler is also allowed to give me misleading results!
( ... unless, of course, there is another rule that makes memcpy and memmove special in the same way that malloc may be special. That they are allowed to change the type to the type of the destination pointer. That would be consistent with my 'evidence'. )
Anyway, I'm rambling. I guess a very short answer would be: "Yes, malloc (and friends) are special. Custom allocators are not special and are therefore UB, unless they maintain separate memory pools for each type. And, further, see example X for an extreme piece of code where compiler Y does undesirable stuff precisely because compiler Y is very strict in this regard and is contradicting this reinterpretation."
Follow up: what about non-malloced memory? Does the same thing apply. (Local variables, static variables, ...)
Here are the C99 strict aliasing rules in (what I hope is) their entirety:
6.5
(6) The effective type of an object for an access to its stored value is the declared type of the object, if any. If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using
memcpy or memmove, or is copied as an array of character type, then the effective type
of the modified object for that access and for subsequent accesses that do not modify the
value is the effective type of the object from which the value is copied, if it has one. For
all other accesses to an object having no declared type, the effective type of the object is
simply the type of the lvalue used for the access.
(7) An object shall have its stored value accessed only by an lvalue expression that has one of
the following types:
— a type compatible with the effective type of the object,
— a qualified version of a type compatible with the effective type of the object,
— a type that is the signed or unsigned type corresponding to the effective type of the object,
— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
— a character type.
These two clauses together prohibit one specific case, storing a value via an lvalue of type X and then subsequently retrieving a value via an lvalue of type Y incompatible with X.
So, as I read the standard, even this usage is perfectly OK (assuming 4 bytes are enough to store either an uint32_t or two uint16_t).
int main() {
uint32_t *p32 = malloc(4);
*p32 = 0;
/* do not do this: free(p32); */
/* do not do this: uint16_t *p16 = malloc(4); */
/* do this instead: */
uint16_t *p16 = (uint16_t *)p32;
p16[0] = 7;
p16[1] = 7;
free(p16);
}
There's no rule that prohibits storing an uint32_t and then subsequently storing an uint16_t at the same address, so we're perfectly OK.
Thus there's nothing that would prohibit writing a fully compliant pool allocator.
Your code is correct C and does not invoke undefined behaviour (except that you do not test malloc return value) because :
you allocate a bloc of memory, use it and free it
you allocate another bloc of memory, use it and free it.
What is undefined is whether p16 will receive same value as p32 had at a different time
What would be undefined behaviour, even if value was the same would be to access p32 after it has been freed. Examples :
int main() {
uint32_t *p32 = malloc(4);
*p32 = 0;
free(p32);
uint16_t *p16 = malloc(4);
p16[0] = 7;
p16[1] = 7;
if (p16 == p32) { // whether p16 and p32 are equal is undefined
uint32_t x = *p32; // accessing *p32 is explicitely UB
}
free(p16);
}
It is UB because you try to access a memory block after it has been freed. And even when it does point to a memory block, that memory block has been initialized as an array of uint16_t, using it as a pointer to another type is formally undefined behaviour.
Custom allocation (assuming a C99 conformant compiler) :
So you have a big chunk of memory and want to write custom free and malloc functions without UB. It is possible. Here I will not go to far into the hard part of management of allocated and free blocs, and just give hints.
you will need to know what it the strictest alignement for the implementation. stdlib malloc knows it because 7.20.3 §1 of C99 language specification (draft n1256) says : The pointer returned if the allocation
succeeds is suitably aligned so that it may be assigned to a pointer to any type of object. It is generally 4 on 32 bits systems and 8 on 64 bits systems, but might be greater or lesser ...
you memory pool must be a char array because 6.3.2.3 §7 says : A pointer to an object or incomplete type may be converted to a pointer to a different
object or incomplete type. If the resulting pointer is not correctly aligned for the pointed-to type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer. When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of
the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object. : that means that provided you can deal with the alignement, a character array of correct size can be converted to a pointer to an arbitrary type (and is the base of malloc implementation)
You must make your memory pool start at an address compatible with the system alignement :
intptr_t orig_addr = chunk;
int delta = orig_addr % alignment;
char *pool = chunk + alignement - delta; /* pool in now aligned */
You now only have to return from your own pool addresses of blocs got as pool + n * alignement and converted to void * : 6.3.2.3 §1 says : A pointer to void may be converted to or from a pointer to any incomplete or object type. A pointer to any incomplete or object type may be converted to a pointer to void and back again; the result shall compare equal to the original pointer.
It would be cleaner with C11, because C11 explicitely added _Alignas and alignof keywords to explictely deal with it and it would be better than the current hack. But it should work nonetheless
Limits :
I must admit that my interpretation of 6.3.2.3 §7 is that a pointer to a correctly aligned char array can be converted to a pointer of another type is not really neat and clear. Some may argue that what is said is just that if it originally pointed to the other type, it can be used as a char pointer. But as I start from a char pointer it is not explicitely allowed. That's true, but it is the best that can be done, it is not explicely marked as undefined behaviour ... and it is what malloc does under the hood.
As alignement is explicitely implementation dependant, you cannot create a general library usable on any implementation.
The actual rules regarding aliasing are laid out in standard section 6.5, paragraph 7. Note the wording:
An object shall have its stored value accessed only by an lvalue expression that has one of the following types:
(emphasis mine)
Aliasing includes the notion of objects, not just general memory. For malloc to have returned the same address on a second use requires the original object to have been deallocated. Even if it has the same address, it is not considered the same object. Any attempts to access the first object through dangling pointers leftover after free are UB for completely different reasons, so there's no aliasing because any continued use of the first pointer p32 is invalid anyway.
Related
Are explicit casts from char * to other pointer types fully defined behavior according to ANSI C89 if the pointer is guaranteed to meet the alignment requirements of the type you're casting to? Here's an example of what I mean:
/* process.c */
void *process(size_t elem_size, size_t cap) {
void *arr;
assert(cap > 5);
arr = malloc(elem_size * cap);
/* set id of element 5 to 0xffffff */
*(long *)((char *)arr + elem_size*5) = 0xffffff;
/* rest of the code omitted */
return arr;
}
/* main.c */
struct some_struct { long id; /* other members omitted */ };
struct other_struct { long id; /* other members omitted */ };
int main(int argc, char **argv) {
struct some_struct *s = process(sizeof(struct some_struct), 40);
printf("%lx\n", s[5].id);
return 0;
}
This code compiles without warnings and works as expected on my machine but I'm not fully sure if these kinds of casts are defined behavior.
C89 draft, section 4.10.3 (Memory management functions):
The pointer returned if the allocation succeeds is suitably aligned so that it may be assigned to a pointer to any type of object and then used to access such an object in the space allocated (until the space is explicitly freed or reallocated).
C89 draft, section 3.3.4 (Cast operators):
A pointer to an object or incomplete type may be converted to a pointer to a different object type or a different incomplete type. The resulting pointer might not be valid if it is improperly aligned for the type pointed to. It is guaranteed, however, that a pointer to an object of a given alignment may be converted to a pointer to an object of the same alignment or a less strict alignment and back again; the result shall compare equal to the original pointer.
This clearly specifies what happens if you cast from struct some_struct * to char * and back to struct some_struct * but in my case the code responsible for allocation doesn't have access to the full struct definition so it can't initially specify the pointer type to be struct some_struct * so I'm not sure if the rule still applies.
If the code I posted is technically UB, is there another standards-compliant way to modify array elements without knowing their full type? Are there real-world impementations where you would expect it to do something else than ((struct some_struct *)arr)[5].id = 0xffffff;?
This code compiles without warnings and works as expected on my machine but I'm not fully sure if these kinds of casts are defined behavior.
In general, the casts have defined behavior, but that behavior can be that the result is not a valid pointer. Thus dereferencing the result of the cast may produce UB.
Considering only function process(), then, it is possible that the result of its evaluation of (long *)((char *)arr + elem_size*5) is an invalid pointer. If and only if it is invalid, the attempt to use it to assign a value to the object it hypothetically points to produces UB.
However, main()s particular usage of that function is fine:
In process(), the pointer returned by malloc and stored in arr is suitably aligned for a struct some_struct (and for any other type).
The compiler must choose a size and layout for struct some_struct such that every element and every member of every element of an array of such structures is properly aligned.
Arrays are composed of contiguous objects of the array's element type, without gaps (though the element types themselves may contain padding if they are structures or unions).
Therefore, (char *)arr + n * sizeof(struct some_struct) must be suitably aligned for a struct some_struct, for any integer n such that the result points within or just past the end of the allocated region. This computation is closely related to the computations involved in accessing an array of struct some_struct via the indexing operator.
struct some_struct has a long as its first member, and that member must appear at offset 0 from the beginning of the struct. Therefore, every pointer that is suitably aligned for a struct some_struct must also be suitably aligned for a long.
Determining whether an operation upholds or violates "type aliasing" constraints requires being able to answer two questions:
For purposes of the aliasing rules, when does a region of storage hold an "object" of a particular type.
For purposes of the aliasing rules, is a particular access performed "by" an lvalue expression of a particular type.
For your question, the second issue above is the most the relevant: if a pointer or lvalue of type T1 is used to derive a value of type T2* which is dereferenced to access storage, the access may sometimes need to be regarded for purposes of aliasing as though performed by an lvalue of type T1, and sometimes as though performed by one of type T2, but the Standard fails to offer any guidance as to which interpretation should apply when. Constructs like yours would be processed predictably by implementation that didn't abuse the Standard as an excuse to behave nonsensically, but could be processed nonsensically by conforming but obtuse implementations that do abuse the Standard in such fashion.
The authors of C89 didn't expect anyone to care about the precise boundaries between constructs whose behavior was defined by the Standard, versus those which all implementations were expected to process identically but which the Standard didn't actually define, and thus saw no need to define the terms "object" and "by" with sufficient precision to unambiguously answer the above questions in ways that would yield defined program behavior in all sensible cases.
Let's say I want to move a void* pointer by 4 bytes. Are the following equivalent:
A:
void* new_address(void* in_ptr) {
intptr_t tmp = (intptr_t)in_ptr;
intptr_t new_address = tmp + 4;
return (void*)new_address;
}
B:
void* new_address(void* in_ptr) {
char* tmp = (char*)in_ptr;
char* new_address = tmp + 4;
return (void*)new_address;
}
Are both defined behavior? Is one more popular/accepted convention? Any other reason to use one over the other?.
Let's only consider 64bit systems. If intptr_t is not available we can use int64_t instead.
The context is a custom memory allocator which needs to move the pointer before allocating new block of memory to a specific address (for alignment purposes). We don't know what object the resulting pointer is going to point to yet but we know we need to move it to a specific location which in the examples above is 4 bytes.
Michael Kerrisk says on page 1415 that,
The C standards make one exception to the rule that pointers of
different types need not have the same representation: pointers of the
types char * and void * are required to have the same internal
representation.
All the C standard guarantees (7.18.1.4) is that you can convert void* values to intptr_t (or uintptr_t) and back again and end up with an equal value for the pointer.
The nuance is here that we cannot apply mathematical operations (including ==) if void* is in use.
Is casting a pointer to intptr_t [...] defined behavior?
Converting a pointer to any integer type is defined and the result is implementation defined, except when result can't be represented in integer type, then it's undefined behavior. See C11 6.3.2.3p6. But intptr_t has to be able to represent void* - the behavior is defined.
, doing arithmetic on it and then casting back, defined behavior?
Any integer may be converted to any pointer type. The resulting pointer is implementation defined - there is no guarantee that adding 4 to intptr_t will increment the pointer value by 4. See C11 6.3.2.3p5.
Are both defined behavior?
Yes, however the result is implementation defined.
Is one more popular/accepted convention?
Subjective: I say using uintptr_t is more popular then intptr_t. Converting a pointer to uintptr_t or to char* to do some arithmetic happens in some code, I can't say which is more popular.
Any other reason to use one over the other?.
Not really, but I think go with char*.
When it comes to actually accessing the data behind the resulting pointer - it depends. If the resulting pointer points within the same object then you're fine (remember, conversion is implementation defined). If the resulting pointer does not point to the same object, I believe the best interpretation would be from reading c2263 Clarifying Pointer Provenance v4 2.2.3Q5 and I think that's: the current C11 standard does not clearly specify that, which would make the behavior not defined.
Because you tagged gcc, both code snippets should compile to equivalent code - I believe on all architectures pointers are converted 1:1 to (u)intptr_t on gcc. Gcc docs implementation defined behavior 4.7 arrays and pointers states casting from pointer to integer and back again, the resulting pointer must reference the same object as the original pointer, otherwise the behavior is undefined - so you're safe as long as the resulting pointer points to the same object.
The context is a custom memory allocator
See implementations of container_of and offsetof macros. Do not hardcode + 4 in your code, and if you do, do not depend on alignment requirements on accessing the resulting pointers - remember to use memcpy to safely copy the context or handle alignment properly. Do not reinvent the wheel - when in doubt see other implementations like glibc malloc.c or newlib malloc.c - they both calculate on char* in mem2chunk macro, but also happen to do calculations on uintptr_t integers.
No 'strictly conforming program uses A. Using the result may be Undefined Behaviour as there is no requirement for addition against intptr_t to be reflected in a pointer value if that intptr_ is converted back to a pointer.
It is both unspecified behaviour and implementation-defined.
If the optional type intptr_t is defined all you are guaranteed is that you can convert void * to intptr_t and then convert that value back to void * and the two values will compare equal (==).
The strictly conforming way to perform pointer arithmetic is B. B is guaranteed to work if and only if the pointer int_ptr is valid and for the largest enclosing object there are 3 or more bytes in that object beyond that value. It's 3 because it's valid to point to (but not dereference) to the address that is (logically) one byte beyond the end of an object.
Object includes a declared object (including array) or block of memory such as returned by malloc().
All good practice is to prefer to write 'strictly conforming' programs where possible. So all good practice is to prefer B over A.
According to the standard the use of the pointer (as a pointer) may result in Undefined Behaviour because it may be (implementation defined) to be a trap representation.
A strictly conforming program is defined as "A strictly conforming program shall use only those features of the language and library specified in this International Standard.3) It shall not produce output dependent on any unspecified, undefined, or implementation-defined behavior, and shall not exceed any minimum implementation limit.
There's some disagreement about whether the code offered for A is unspecified or implementation defined. The standard says both because implementation-defined behaviour is a sub-category of unspecified. However because the implementation may document it as a trap representation using the value may result in Undefined Behaviour.
But I hope that is swept aside by the fact that 'strictly conforming programs' don't depend on unspecified, undefined or implementation defined behaviour.
So good practice here is certainly B.
Consider a secure environment that encrypts pointer values to deliberately confound the de-referencing of arbitrary pointer values. In principle it could provide intptr_t and be conformant.
Though I still maintain that if A doesn't work then intptr_t being an optional type it would be better to not provide it. Whether it is defined is unspecified and implementation dependent. That's because no 'strictly conforming program' uses it and it has no practical use other than to manipulate a pointer as an arithmetic type in a way not supported by pointer arithmetic on a compatible pointer type char *. The snippet in A falls into that category.
To store a void * declare a void * or char[sizeof(void*)] or malloc() or similar. To overlay a void * over an arithmetic type, declare a union and benefit that the union will be aligned for a void *.
But according to the specification it is unspecified, implementation-defined no 'strictly conforming program' can rely on it and may result in Undefined Behaviour.
A very long winded way of saying the answer, here, is B.
Unfortunately, I haven't found anything like std-discussion for the ISO C standard, so I'll ask here.
Before answering the question, make sure you are familiar with the idea of pointer provenance (see DR260 and/or http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2263.htm).
6.3.2.3(Pointers) paragraph 7 says:
When a pointer to an object is converted to a pointer to a character type,
the result points to the lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining bytes of the object.
The questions are:
What does "a pointer to an object" mean? Does it mean that if we want to get the pointer to the lowest addressed byte of an object of type T, we shall convert a pointer of type cvT* to a pointer to a character type, and converting a pointer to void, obtained from the pointer of type T*, won't give us desired result? Or "a pointer to an object" is the value of a pointer which follows from the pointer provenance and cast to void* does not change the value (analogous to how it was recently formalized in C++17)?
Why the paragraph explicitly mentions increments? Does it mean that adding value greater than 1 is undefined? Does it mean that decrementing (after we have incremented the result several times so that we won't go beyond the lower object boundary) is undefined? In short: is the sequences of bytes composing an object an array?
The general description of pointer addition suggests that for any values/types of pointer p and signed integers x and y where ((ptr+x)+y) and (x+y) are both defined by the Standard, (ptr+(x+y)) would behave equivalent to ((ptr+x)+y). While it might be possible to argue that the Standard doesn't explicitly say that incrementing a pointer five times would be equivalent to adding 5, there is nothing in the Standard that would suggest that quality implementations should not be expected to behave in that fashion.
Note that the authors of the Standard didn't try to make it "language-lawyer-proof". Further, they didn't think anyone would care about whether or not an obviously-inferior implementation was "conforming". An implementation that only works reliably if bytes of an object are accessed sequentially would be less versatile than one which supported reliable indexing, while offering no plausible advantage. Consequently, there should be no need for the Standard to mandate support for indexing, because anyone trying to produce a quality implementation would support it whether the Standard mandated it or not.
Of course, there are some constructs which programmers in the 1990s--or even the authors of the Standard themselves--expected quality compilers to handle reliably, but which some of today's "clever" compilers don't. Whether that means such expectations were unreasonable, or whether they remain accurate when applied to quality compilers, is a matter of opinion. In this particular case, I think the implication that positive indexing should behave like repeated incrementing is strong enough that I wouldn't expect compiler writers to argue otherwise, but I'm not 100% certain that no compiler would ever be "clever"/obtuse enough to look at something like:
int test(unsigned char foo[5][5], int x)
{
foo[1][0] = 1;
// Following should yield a pointer that can be used to access the entire
// array 'foo', but an obtuse compiler writer could perhaps argue that the
// code is converting the address of foo[0] into a pointer to the first
// element of that sub-array, and that the resulting pointer is thus only
// usable to access items within that sub-array.
unsigned char *p = (unsigned char*)foo;
// Following should be able to access any element of the object [i.e. foo]
// whose address was taken
p[x] = 2;
return foo[1][0];
}
and decide that it could skip the second read of foo[1][0] since p[x] wouldn't possibly access any element of foo beyond the first row. I would, however, say that programmers should not try to code around the possibility of vandals writing a compiler that would behave that way. No program can be made bullet-proof against vandals writing obtuse-but-"conforming" compilers, and the fact that a program can be undermined by such vandals should hardly be viewed as a defect.
Take a non-char c object and create a pointer to it, i.e.
int obj;
int *objPtr = &obj;
convert the pointer to object to pointer to char:
char *charPtr = (char *)objPtr;
now, charPtr points to the lowest byte or the int obj.
increment it:
charPtr++;
now it points to the next byte of the object. and so on till you reach the size of the object:
int i;
for (i = 0; i < sizeof(obj); i++)
printf("%d", *charPtr++);
AFAIK, there are three situations where aliasing is ok
Types that only differ by qualifier or sign can alias each other.
struct or union types can alias types contained inside them.
casting T* to char* is ok. (the opposite is not allowed)
These makes sense when reading simple examples from John Regehrs blog posts but I'm not sure how to reason about aliasing-correctness for larger examples, such as malloc-like memory arrangements.
I'm reading Per Vognsens re-implementation of Sean Barrets stretchy buffers. It uses a malloc-like schema where a buffer has associated metadata just before it.
typedef struct BufHdr {
size_t len;
size_t cap;
char buf[];
} BufHdr;
The metadata is accessed by subtracting an offset from a pointer b:
#define buf__hdr(b) ((BufHdr *)((char *)(b) - offsetof(BufHdr, buf)))
Here's a somewhat simplified version of the original buf__grow function that extends the buffer and returns the buf as a void*.
void *buf__grow(const void *buf, size_t new_size) {
// ...
BufHdr *new_hdr; // (1)
if (buf) {
new_hdr = xrealloc(buf__hdr(buf), new_size);
} else {
new_hdr = xmalloc(new_size);
new_hdr->len = 0;
}
new_hdr->cap = new_cap;
return new_hdr->buf;
}
Usage example (the buf__grow is hidden behind macros but here it's in the open for clarity):
int *ip = NULL;
ip = buf__grow(ip, 16);
ip = buf__grow(ip, 32);
After these calls, we have 32 + sizeof(BufHdr) bytes large memory area on the heap. We have ip pointing into that area and we have new_hdr and buf__hdr pointing into it at various points in the execution.
Questions
Is there a strict-aliasing violation here? AFAICT, ip and some variable of type BufHdr shouldn't be allowed to point to the same memory.
Or is it so that the fact that buf__hdr not creating an lvalue means it's not aliasing the same memory as ip? And the fact that new_hdr is contained within buf__grow where ip isn't "live" means that those aren't aliasing either?
If new_hdr were in global scope, would that change things?
Do the C compiler track the type of storage or only the types of variables? If there is storage, such as the memory area allocated in buf__grow that doesn't have any variable pointing to it, then what is the type of that storage? Are we free to reinterpret that storage as long as there is no variable associated with that memory?
Is there a strict-aliasing violation here? AFAICT, ip and some variable of type BufHdr shouldn't be allowed to point to the same memory.
What's important to remember is that a strict aliasing violation only occurs when you do a value access of a memory location, and the compiler believes that what's stored at that memory location is of a different type. So it is not so important to speak of the types of the pointers, as to speak of the effective type of whatever they point at.
An allocated chunk of memory has no declared type. What applies is C11 6.5/6:
The effective type of an object for an access to its stored value is the declared type of the
object, if any. 87)
Where note 87 clarifies that allocated objects have no declared type. That is the case here, so we continue to read the definition of effective type:
If a value is stored into an object having no declared type through an
lvalue having a type that is not a character type, then the type of the lvalue becomes the
effective type of the object for that access and for subsequent accesses that do not modify
the stored value.
This means that as soon as we do an access to the chunk of allocated memory, the effective type of whatever is stored there, becomes the type of whatever we stored there.
The first time access happens in your case, is the lines new_hdr->len = 0; and new_hdr->cap = new_cap;, making the effective type of the data at those addresses size_t.
buf remains inaccessed, so that part of the memory does not yet have an effective type. You return new_hdr->buf and set an int* to point there.
The next thing that will happen, I assume is buf__hdr(ip). In that macro, the pointer is cast to (char *), then some pointer subtraction occurs:
(b) - offsetof(BufHdr, buf) // undefined behavior
Here we formally get undefined behavior, but for entirely different reasons than strict aliasing. b is not a pointer pointing to the same array as whatever is stored before b. The relevant part is the specification of the additive operators 6.5.6:
For subtraction, one of the following shall hold:
— both operands have arithmetic type;
— both operands are pointers to qualified or unqualified versions of compatible complete object types; or
— the left operand is a pointer to a complete object type and the right operand has integer type.
The first two clearly don't apply. In the third case, we don't point to a complete object type, as buf has not yet gotten an effective type. As I understand it, this means we have a constraint violation, I'm not entirely sure here. I am however very sure that the following is violated, 6.5.6/9:
When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements. The size of the result is implementation-defined,
and its type (a signed integer type) is ptrdiff_t defined in the <stddef.h> header.
If the result is not representable in an object of that type, the behavior is undefined
So that's definitely a bug.
If we ignore that part, the actual access (BufHdr *) is fine, since BufHdr is a struct ("aggregate") containing the effective type of the object accessed (2x size_t). And here the memory of buf is accessed for the first time, getting the effective type char[] (flexible array member).
There is no strict aliasing violation unless you would after invoking the above macro go and access ip as an int.
If new_hdr were in global scope, would that change things?
No, the pointer type does not matter, only the effective type of the pointed-at object.
Do the C compiler track the type of storage or only the types of variables?
It needs to track the effective type of the object if it wishes to do optimizations like gcc, assuming strict aliasing violations never occur.
Are we free to reinterpret that storage as long as there is no variable associated with that memory?
Yes you can point at it with any kind of pointer - since it is allocated memory, it doesn't get an effective type until you do a value access.
The Standard does not define any means by which an lvalue of one type can be used to derive an lvalue of a second type that can be used to access the storage, unless the latter has a character type. Even something as basic as:
union foo { int x; float y;} u = {0};
u.x = 1;
invokes UB because it uses an lvalue of type int to access the storage associated with an object of type union foo and float. On the other hand, the authors of the Standard probably figured that since no compiler writer would be so obtuse as to use the lvalue-type rules as justification for not processing the above in useful fashion, there was no need to try to craft explicit rules mandating that they do so.
If a compiler guarantees not to "enforce" the rule except in cases where:
an object is modified during a particular execution of a function or loop;
lvalues of two or more different types are used to access storage during such execution; and
neither lvalue has been visibly and actively derived from the other within such execution
such a guarantee would be sufficient to allow a malloc() implementation that would be free of "aliasing"-related problems. While I suspect the authors of the Standard probably expected compiler writers to naturally uphold such a guarantee whether or not it was mandated, neither gcc nor clang will do so unless the -fno-strict-aliasing flag is used.
Unfortunately, when asked in Defect Report #028 to clarify what the C89 rules meant, the Committee responded by suggesting that an lvalue formed by dereferencing a pointer to a unions member will mostly behave like an lvalue formed directly with the member-access operator, except that actions which would invoke Implementation-Defined Behavior if done directly on a union member should invoke UB if done on a pointer. When writing C99, the Committee decided to "clarify" things by codifying that principle into C99's "Effective Type" rules, rather than recognizing any cases where an lvalue of a derived type may be used to access the parent object [an omission which the Effective Type rules do nothing to correct!].
The C99 standard states that:
When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object
Consider the following code:
struct test {
int x[5];
char something;
short y[5];
};
...
struct test s = { ... };
char *p = (char *) s.x;
char *q = (char *) s.y;
printf("%td\n", q - p);
This obviously breaks the above rule, since the p and q pointers are pointing to different "array objects", and, according to the rule, the q - p difference is undefined.
But in practice, why should such a thing ever result in undefined behaviour? After all, the struct members are laid out sequentially (just as array elements are), with any potential padding between the members. True, the amount of padding will vary across implementations and that would affect the outcome of the calculations, but why should that result be "undefined"?
My question is, can we suppose that the standard is just "ignorant" of this issue, or is there a good reason for not broadening this rule? Couldn't the above rule be rephrased to "both shall point to elements of the same array object or members of the same struct"?
My only suspicion are segmented memory architectures where the members might end up in different segments. Is that the case?
I also suspect that this is the reason why GCC defines its own __builtin_offsetof, in order to have a "standards compliant" definition of the offsetof macro.
EDIT:
As already pointed out, arithmetic on void pointers is not allowed by the standard. It is a GNU extension that fires a warning only when GCC is passed -std=c99 -pedantic. I'm replacing the void * pointers with char * pointers.
Subtraction and relational operators (on type char*) between addresses of member of the same struct are well defined.
Any object can be treated as an array of unsigned char.
Quoting N1570 6.2.6.1 paragraph 4:
Values stored in non-bit-field objects of any other object type
consist of n × CHAR_BIT bits, where n is the size of an object of that
type, in bytes. The value may be copied into an object of type
unsigned char [ n ] (e.g., by memcpy); the resulting set of bytes is
called the object representation of the value.
...
My only suspicion are segmented memory architectures where the members
might end up in different segments. Is that the case?
No. For a system with a segmented memory architecture, normally the compiler will impose a restriction that each object must fit into a single segment. Or it can permit objects that occupy multiple segments, but it still has to ensure that pointer arithmetic and comparisons work correctly.
Pointer arithmetic requires that the two pointers being added or subtracted to be part of the same object because it doesn't make sense otherwise.
The quoted section of standard specifically refers to two unrelated objects such as int a[b]; and int b[5]. The pointer arithmetic requires to know the type of the object that the pointers pointing to (I am sure you are aware of this already).
i.e.
int a[5];
int *p = &a[1]+1;
Here p is calculated by knowing the that the &a[1] refers to an int object and hence incremented to 4 bytes (assuming sizeof(int) is 4).
Coming to the struct example, I don't think it can possibly be defined in a way to make pointer arithmetic between struct members legal.
Let's take the example,
struct test {
int x[5];
char something;
short y[5];
};
Pointer arithmatic is not allowed with void pointers by C standard (Compiling with gcc -Wall -pedantic test.c would catch that). I think you are using gcc which assumes void* is similar to char* and allows it.
So,
printf("%zu\n", q - p);
is equivalent to
printf("%zu", (char*)q - (char*)p);
as pointer arithmetic is well defined if the pointers point to within the same object and are character pointers (char* or unsigned char*).
Using correct types, it would be:
struct test s = { ... };
int *p = s.x;
short *q = s.y;
printf("%td\n", q - p);
Now, how can q-p be performed? based on sizeof(int) or sizeof(short) ? How can the size of char something; that's in the middle of these two arrays be calculated?
That should explain it's not possible to perform pointer arithmetic on objects of different types.
Even if all members are of same type (thus no type issue as stated above), then it's better to use the standard macro offsetof (from <stddef.h>) to get the difference between struct members which has the similar effect as pointer arithmetic between members:
printf("%zu\n", offsetof(struct test, y) - offsetof(struct test, x));
So I see no necessity to define pointer arithmetic between struct members by the C standard.
Yes, you are allowed to perform pointer arithmetric on structure bytes:
N1570 - 6.3.2.3 Pointers p7:
... When a pointer to an object is converted to a pointer to a character type,
the result points to the lowest addressed byte of the object. Successive increments of the
result, up to the size of the object, yield pointers to the remaining bytes of the object.
This means that for the programmer, bytes of the stucture shall be seen as a continuous area, regardless how it may have been implemented in the hardware.
Not with void* pointers though, that is non-standard compiler extension. As mentioned on paragraph from the standard, it applies only to character type pointers.
Edit:
As mafso pointed out in comments, above is only true as long as type of substraction result ptrdiff_t, has enough range for the result. Since range of size_t can be larger than ptrdiff_t, and if structure is big enough, it's possible that addresses are too far apart.
Because of this it's preferable to use offsetof macro on structure members and calculate result from those.
I believe the answer to this question is simpler than it appears, the OP asks:
but why should that result be "undefined"?
Well, let's see that the definition of undefined behavior is in the draft C99 standard section 3.4.3:
behavior, upon use of a nonportable or erroneous program construct or
of erroneous data, for which this International Standard imposes no
requirements
it is simply behavior for which the standard does not impose a requirement, which perfectly fits this situation, the results are going to vary depending on the architecture and attempting to specify the results would have probably been difficult if not impossible in a portable manner. This leaves the question, why would they choose undefined behavior as opposed to let's say implementation of unspecified behavior?
Most likely it was made undefined behavior to limit the number of ways an invalid pointer could be created, this is consistent with the fact that we are provided with offsetof to remove the one potential need for pointer subtraction of unrelated objects.
Although the standard does not really define the term invalid pointer, we get a good description in Rationale for International Standard—Programming Languages—C which in section 6.3.2.3 Pointers says (emphasis mine):
Implicit in the Standard is the notion of invalid pointers. In
discussing pointers, the Standard typically refers to “a pointer to an
object” or “a pointer to a function” or “a null pointer.” A special
case in address arithmetic allows for a pointer to just past the end
of an array. Any other pointer is invalid.
The C99 rationale further adds:
Regardless how an invalid pointer is created, any use of it yields
undefined behavior. Even assignment, comparison with a null pointer
constant, or comparison with itself, might on some systems result in
an exception.
This strongly suggests to us that a pointer to padding would be an invalid pointer, although it is difficult to prove that padding is not an object, the definition of object says:
region of data storage in the execution environment, the contents of
which can represent values
and notes:
When referenced, an object may be interpreted as having a particular
type; see 6.3.2.1.
I don't see how we can reason about the type or the value of padding between elements of a struct and therefore they are not objects or at least is strongly indicates padding is not meant to be considered an object.
I should point out the following:
from the C99 standard, section 6.7.2.1:
Within a structure object, the non-bit-field members and the units in which bit-fields
reside have addresses that increase in the order in which they are declared. A pointer to a
structure object, suitably converted, points to its initial member (or if that member is a
bit-field, then to the unit in which it resides), and vice versa. There may be unnamed
padding within a structure object, but not at its beginning.
It isn't so much that the result of pointer subtraction between members is undefined so much as it is unreliable (i.e. not guaranteed to be the same between different instances of the same struct type when the same arithmetic is applied).