How to implement memmove in standard C without an intermediate copy? - c

From the man page on my system:
void *memmove(void *dst, const void *src, size_t len);
DESCRIPTION
The memmove() function copies len bytes from string src to string dst.
The two strings may overlap; the copy is always done in a non-destructive
manner.
From the C99 standard:
6.5.8.5 When two pointers are compared, the result depends on the
relative locations in the address
space of the objects pointed to. If
two pointers to object or incomplete
types both point to the same object,
or both point one past the last
element of the same array object,
theycompare equal. If the objects
pointed to are members of the same
aggregate object, pointers to
structure members declared later
compare greater than pointers to
members declared earlier in the
structure, and pointers to array
elements with larger subscript values
compare greater than pointers to
elements of the same array with lower
subscript values. All pointers to
members of the same union object
compare equal. If the expression P
points to an element of an array
object and the expression Q points to
the last element of the same array
object, the pointer expression Q+1
compares greater than P. In all
other cases, the behavior is
undefined.
The emphasis is mine.
The arguments dst and src can be converted to pointers to char so as to alleviate strict aliasing problems, but is it possible to compare two pointers that may point inside different blocks, so as to do the copy in the correct order in case they point inside the same block?
The obvious solution is if (src < dst), but that is undefined if src and dst point to different blocks. "Undefined" means you should not even assume that the condition returns 0 or 1 (this would have been called "unspecified" in the standard's vocabulary).
An alternative is if ((uintptr_t)src < (uintptr_t)dst), which is at least unspecified, but I am not sure that the standard guarantees that when src < dst is defined, it is equivalent to (uintptr_t)src < (uintptr_t)dst). Pointer comparison is defined from pointer arithmetic. For instance, when I read section 6.5.6 on addition, it seems to me that pointer arithmetic could go in the direction opposite to uintptr_t arithmetic, that is, that a compliant compiler might have, when p is of type char*:
((uintptr_t)p)+1==((uintptr_t)(p-1)
This is only an example. Generally speaking very little seems to be guaranteed when converting pointers to integers.
This is a purely academic question, because memmove is provided together with the compiler. In practice, the compiler authors can simply promote undefined pointer comparison to unspecified behavior, or use the relevant pragma to force their compiler to compile their memmove correctly. For instance, this implementation has this snippet:
if ((uintptr_t)dst < (uintptr_t)src) {
/*
* As author/maintainer of libc, take advantage of the
* fact that we know memcpy copies forwards.
*/
return memcpy(dst, src, len);
}
I would still like to use this example as proof that the standard goes too far with undefined behaviors, if it is true that memmove cannot be implemented efficiently in standard C. For instance, no-one ticked when answering this SO question.

I think you're right, it's not possible to implement memmove efficiently in standard C.
The only truly portable way to test whether the regions overlap, I think, is something like this:
for (size_t l = 0; l < len; ++l) {
if (src + l == dst) || (src + l == dst + len - 1) {
// they overlap, so now we can use comparison,
// and copy forwards or backwards as appropriate.
...
return dst;
}
}
// No overlap, doesn't matter which direction we copy
return memcpy(dst, src, len);
You can't implement either memcpy or memmove all that efficiently in portable code, because the platform-specific implementation is likely to kick your butt whatever you do. But a portable memcpy at least looks plausible.
C++ introduced a pointer specialization of std::less, which is defined to work for any two pointers of the same type. It might in theory be slower than <, but obviously on a non-segmented architecture it isn't.
C has no such thing, so in a sense, the C++ standard agrees with you that C doesn't have enough defined behaviour. But then, C++ needs it for std::map and so on. It's much more likely that you'd want to implement std::map (or something like it) without knowledge of the implementation than that you'd want to implement memmove (or something like it) without knowledge of the implementation.

For two memory areas to be valid and overlapping, I believe you would need to be in one of the defined situations of 6.5.8.5. That is, two areas of an array, union, struct, etc.
The reason other situations are undefined are because two different objects might not even be in the same kind of memory, with the same kind of pointer. On PC architectures, addresses are usually just 32-bit address into virtual memory, but C supports all kinds of bizarre architectures, where memory is nothing like that.
The reason that C leaves things undefined is to give leeway to the compiler writers when the situation doesn't need to be defined. The way to read 6.5.8.5 is a paragraph carefully describing architectures that C wants to support where pointer comparison doesn't make sense unless it's inside the same object.
Also, the reason memmove and memcpy are provided by the compiler is that they are sometimes written in tuned assembly for the target CPU, using a specialized instruction. They are not meant to be able to be implemented in C with the same efficiency.

For starters, the C standard is notorious for having problems in the details like this. Part of the problem is because C is used on multiple platforms and the standard attempts to be abstract enough to cover all current and future platforms (which might use some convoluted memory layout that's beyond anything we've ever seen). There is a lot of undefined or implementation-specific behavior in order for compiler writers to "do the right thing" for the target platform. Including details for every platform would be impractical (and constantly out-of-date); instead, the C standard leaves it up to the compiler writer to document what happens in these cases. "Unspecified" behavior only means that the C standard doesn't specify what happens, not necessarily that the outcome cannot be predicted. The outcome is usually still predictable if you read the documentation for your target platform and your compiler.
Since determining if two pointers point to the same block, memory segment, or address space depends on how the memory for that platform is laid out, the spec does not define a way to make that determination. It assumes that the compiler knows how to make this determination. The part of the spec you quoted said that result of pointer comparison depends on the pointers' "relative location in the address space". Notice that "address space" is singular here. This section is only referring to pointers that are in the same address space; that is, pointers that are directly comparable. If the pointers are in different address spaces, then the result is undefined by the C standard and is instead defined by the requirements of the target platform.
In the case of memmove, the implementor generally determines first if the addresses are directly comparable. If not, then the rest of the function is platform-specific. Most of the time, being in different memory spaces is enough to ensure that the regions don't overlap and the function turns into a memcpy. If the addresses are directly comparable, then it's just a simple byte copy process starting from the first byte and going forward or from the last byte and going backwards (whichever one will safely copy the data without clobbering anything).
All in all, the C standard leaves a lot intentionally unspecified where it can't write a simple rule that works on any target platform. However, the standard writers could have done a better job explaining why some things are not defined and used more descriptive terms like "architecture-dependent".

Here's another idea, but I don't know if it's correct. To avoid the O(len) loop in Steve's answer, one could put it in the #else clause of an #ifdef UINTPTR_MAX with the cast-to-uintptr_t implementation. Provided that cast of unsigned char * to uintptr_t commutes with adding integer offsets whenever the offset is valid with the pointer, this makes the pointer comparison well-defined.
I'm not sure whether this commutativity is defined by the standard, but it would make sense, as it works even if only the lower bits of a pointer are an actual numeric address and the upper bits are some sort of black box.

I would still like to use this example as proof that the standard goes too far with undefined behaviors, if it is true that memmove cannot be implemented efficiently in standard C
But it's not proof. There's absolutely no way to guarantee that you can compare two arbitrary pointers on an arbitrary machine architecture. The behaviour of such a pointer comparison cannot be legislated by the C standard or even a compiler. I could imagine a machine with a segmented architecture that might produce a different result depending on how the segments are organised in RAM or might even choose to throw an exception when pointers into different segments are compared. This is why the behaviour is "undefined". The exact same program on the exact same machine might give different results from run to run.
The oft given "solution" of memmove() using the relationship of the two pointers to choose whether to copy from the beginning to the end or from the end to the beginning only works if all memory blocks are allocated from the same address space. Fortunately, this is usually the case although it wasn't in the days of 16 bit x86 code.

Related

How does pointer comparison work in C? Is it ok to compare pointers that don't point to the same array?

In K&R (The C Programming Language 2nd Edition) chapter 5 I read the following:
First, pointers may be compared under certain circumstances.
If p and q point to members of the same array, then relations like ==, !=, <, >=, etc. work properly.
Which seems to imply that only pointers pointing to the same array can be compared.
However when I tried this code
char t = 't';
char *pt = &t;
char x = 'x';
char *px = &x;
printf("%d\n", pt > px);
1 is printed to the screen.
First of all, I thought I would get undefined or some type or error, because pt and px aren't pointing to the same array (at least in my understanding).
Also is pt > px because both pointers are pointing to variables stored on the stack, and the stack grows down, so the memory address of t is greater than that of x? Which is why pt > px is true?
I get more confused when malloc is brought in. Also in K&R in chapter 8.7 the following is written:
There is still one assumption, however, that pointers to different blocks returned by sbrk can be meaningfully compared. This is not guaranteed by the standard which permits pointer comparisons only within an array. Thus this version of malloc is portable only among machines for which the general pointer comparison is meaningful.
I had no issue comparing pointers that pointed to space malloced on the heap to pointers that pointed to stack variables.
For example, the following code worked fine, with 1 being printed:
char t = 't';
char *pt = &t;
char *px = malloc(10);
strcpy(px, pt);
printf("%d\n", pt > px);
Based on my experiments with my compiler, I'm being led to think that any pointer can be compared with any other pointer, regardless of where they individually point. Moreover, I think pointer arithmetic between two pointers is fine, no matter where they individually point because the arithmetic is just using the memory addresses the pointers store.
Still, I am confused by what I am reading in K&R.
The reason I'm asking is because my prof. actually made it an exam question. He gave the following code:
struct A {
char *p0;
char *p1;
};
int main(int argc, char **argv) {
char a = 0;
char *b = "W";
char c[] = [ 'L', 'O', 'L', 0 ];
struct A p[3];
p[0].p0 = &a;
p[1].p0 = b;
p[2].p0 = c;
for(int i = 0; i < 3; i++) {
p[i].p1 = malloc(10);
strcpy(p[i].p1, p[i].p0);
}
}
What do these evaluate to:
p[0].p0 < p[0].p1
p[1].p0 < p[1].p1
p[2].p0 < p[2].p1
The answer is 0, 1, and 0.
(My professor does include the disclaimer on the exam that the questions are for a Ubuntu Linux 16.04, 64-bit version programming environment)
(editor's note: if SO allowed more tags, that last part would warrant x86-64, linux, and maybe assembly. If the point of the question / class was specifically low-level OS implementation details, rather than portable C.)
According to the C11 standard, the relational operators <, <=, >, and >= may only be used on pointers to elements of the same array or struct object. This is spelled out in section 6.5.8p5:
When two pointers are compared, the result depends on the
relative locations in the address space of the objects pointed to.
If two pointers to object types both point to the same object, or
both point one past the last element of the same array
object, they compare equal. If the objects pointed to are
members of the same aggregate object,pointers to structure
members declared later compare greater than pointers to
members declared earlier in the structure, and pointers to
array elements with larger subscript values compare greater than
pointers to elements of the same array with lower subscript values.
All pointers to members of the same union object compare
equal. If the expression P points to an element of an array
object and the expression Q points to the last element of the
same array object, the pointer expression Q+1 compares greater than P.
In all other cases, the behavior is undefined.
Note that any comparisons that do not satisfy this requirement invoke undefined behavior, meaning (among other things) that you can't depend on the results to be repeatable.
In your particular case, for both the comparison between the addresses of two local variables and between the address of a local and a dynamic address, the operation appeared to "work", however the result could change by making a seemingly unrelated change to your code or even compiling the same code with different optimization settings. With undefined behavior, just because the code could crash or generate an error doesn't mean it will.
As an example, an x86 processor running in 8086 real mode has a segmented memory model using a 16-bit segment and a 16-bit offset to build a 20-bit address. So in this case an address doesn't convert exactly to an integer.
The equality operators == and != however do not have this restriction. They can be used between any two pointers to compatible types or NULL pointers. So using == or != in both of your examples would produce valid C code.
However, even with == and != you could get some unexpected yet still well-defined results. See Can an equality comparison of unrelated pointers evaluate to true? for more details on this.
Regarding the exam question given by your professor, it makes a number of flawed assumptions:
A flat memory model exists where there is a 1-to-1 correspondence between an address and an integer value.
That the converted pointer values fit inside an integer type.
That the implementation simply treats pointers as integers when performing comparisons without exploiting the freedom given by undefined behavior.
That a stack is used and that local variables are stored there.
That a heap is used to pull allocated memory from.
That the stack (and therefore local variables) appears at a higher address than the heap (and therefore allocated objects).
That string constants appear at a lower address then the heap.
If you were to run this code on an architecture and/or with a compiler that does not satisfy these assumptions then you could get very different results.
Also, both examples also exhibit undefined behavior when they call strcpy, since the right operand (in some cases) points to a single character and not a null terminated string, resulting in the function reading past the bounds of the given variable.
The primary issue with comparing pointers to two distinct arrays of the same type is that the arrays themselves need not be placed in a particular relative positioning--one could end up before and after the other.
First of all, I thought I would get undefined or some type or error, because pt an px aren't pointing to the same array (at least in my understanding).
No, the result is dependent on implementation and other unpredictable factors.
Also is pt>px because both pointers are pointing to variables stored on the stack, and the stack grows down, so the memory address of t is greater than that of x? Which is why pt>px is true?
There isn't necessarily a stack. When it exists, it need not to grow down. It could grow up. It could be non-contiguous in some bizarre way.
Moreover, I think pointer arithmetic between two pointers is fine, no matter where they individually point because the arithmetic is just using the memory addresses the pointers store.
Let's look at the C specification, §6.5.8 on page 85 which discusses relational operators (i.e. the comparison operators you're using). Note that this does not apply to direct != or == comparison.
When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. ... If the objects pointed to are members of the same aggregate object, ... pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values.
In all other cases, the behavior is undefined.
The last sentence is important. While I cut down some unrelated cases to save space, there's one case that's important to us: two arrays, not part of the same struct/aggregate object1, and we're comparing pointers to those two arrays. This is undefined behavior.
While your compiler just inserted some sort of CMP (compare) machine instruction which numerically compares the pointers, and you got lucky here, UB is a pretty dangerous beast. Literally anything can happen--your compiler could optimize out the whole function including visible side effects. It could spawn nasal demons.
1Pointers into two different arrays that are part of the same struct can be compared, since this falls under the clause where the two arrays are part of the same aggregate object (the struct).
Then asked what
p[0].p0 < p[0].p1
p[1].p0 < p[1].p1
p[2].p0 < p[2].p1
Evaluate to. The answer is 0, 1, and 0.
These questions reduce to:
Is the heap above or below the stack.
Is the heap above or below the string literal section of the program.
same as [1].
And the answer to all three is "implementation defined". Your prof's questions are bogus; they have based it in traditional unix layout:
<empty>
text
rodata
rwdata
bss
< empty, used for heap >
...
stack
kernel
but several modern unices (and alternative systems) do not conform to those traditions. Unless they prefaced the question with " as of 1992 "; make sure to give a -1 on the eval.
On almost any remotely-modern platform, pointers and integers have an isomorphic ordering relation, and pointers to disjoint objects are not interleaved. Most compilers expose this ordering to programmers when optimizations are disabled, but the Standard makes no distinction between platforms that have such an ordering and those that don't and does not require that any implementations expose such an ordering to the programmer even on platforms that would define it. Consequently, some compiler writers perform various kinds of optimizations and "optimizations" based upon an assumption that code will never compare use relational operators on pointers to different objects.
According to the published Rationale, the authors of the Standard intended that implementations extend the language by specifying how they will behave in situations the Standard characterizes as "Undefined Behavior" (i.e. where the Standard imposes no requirements) when doing so would be useful and practical, but some compiler writers would rather assume programs will never try to benefit from anything beyond what the Standard mandates, than allow programs to usefully exploit behaviors the platforms could support at no extra cost.
I'm not aware of any commercially-designed compilers that do anything weird with pointer comparisons, but as compilers move to the non-commercial LLVM for their back end, they're increasingly likely to process nonsensically code whose behavior had been specified by earlier compilers for their platforms. Such behavior isn't limited to relational operators, but can even affect equality/inequality. For example, even though the Standard specifies that a comparison between a pointer to one object and a "just past" pointer to an immediately-preceding object will compare equal, gcc and LLVM-based compilers are prone to generate nonsensical code if programs perform such comparisons.
As an example of a situation where even equality comparison behaves nonsensically in gcc and clang, consider:
extern int x[],y[];
int test(int i)
{
int *p = y+i;
y[0] = 4;
if (p == x+10)
*p = 1;
return y[0];
}
Both clang and gcc will generate code that will always return 4 even if x is ten elements, y immediately follows it, and i is zero resulting in the comparison being true and p[0] being written with the value 1. I think what happens is that one pass of optimization rewrites the function as though *p = 1; were replaced with x[10] = 1;. The latter code would be equivalent if the compiler interpreted *(x+10) as equivalent to *(y+i), but unfortunately a downstream optimization stage recognizes that an access to x[10] would only defined if x had at least 11 elements, which would make it impossible for that access to affect y.
If compilers can get that "creative" with pointer equality scenario which is described by the Standard, I would not trust them to refrain from getting even more creative in cases where the Standard doesn't impose requirements.
It's simple: Comparing pointers does not make sense as memory locations for objects are never guaranteed to be in the same order as you declared them.
The exception is arrays. &array[0] is lower than &array[1]. Thats what K&R points out. In practice struct member addresses are also in the order you declare them in my experience. No guarantees on that....
Another exception is if you compare a pointer for equal. When one pointer is equal to another you know it's pointing to the same object. Whatever it is.
Bad exam question if you ask me. Depending on Ubuntu Linux 16.04, 64-bit version programming environment for an exam question ? Really ?
Pointers are just integers, like everything else in a computer. You absolutely can compare them with < and > and produce results without causing a program to crash. That said, the standard does not guarantee that those results have any meaning outside of array comparisons.
In your example of stack allocated variables, the compiler is free to allocate those variables to registers or stack memory addresses, and in any order it so choose. Comparisons such as < and > therefore won't be consistent across compilers or architectures. However, == and != aren't so restricted, comparing pointer equality is a valid and useful operation.
What A Provocative Question!
Even cursory scanning of the responses and comments in this thread will reveal how emotive your seemingly simple and straight forward query turns out to be.
It should not be surprising.
Inarguably, misunderstandings around the concept and use of pointers represents a predominant cause of serious failures in programming in general.
Recognition of this reality is readily evident in the ubiquity of languages designed specifically to address, and preferably to avoid the challenges pointers introduce altogether. Think C++ and other derivatives of C, Java and its relations, Python and other scripts -- merely as the more prominent and prevalent ones, and more or less ordered in severity of dealing with the issue.
Developing a deeper understanding of the principles underlying, therefore must be pertinent to every individual that aspires to excellence in programming -- especially at the systems level.
I imagine this is precisely what your teacher means to demonstrate.
And the nature of C makes it a convenient vehicle for this exploration. Less clearly than assembly -- though perhaps more readily comprehensible -- and still far more explicitly than languages based on deeper abstraction of the execution environment.
Designed to facilitate deterministic translation of the programmer’s intent into instructions that machines can comprehend, C is a system level language. While classified as high-level, it really belongs in a ‘medium’ category; but since none such exists, the ‘system’ designation has to suffice.
This characteristic is largely responsible for making it a language of choice for device drivers, operating system code, and embedded implementations. Furthermore, a deservedly favoured alternative in applications where optimal efficiency is paramount; where that means the difference between survival and extinction, and therefore is a necessity as opposed to a luxury. In such instances, the attractive convenience of portability loses all its allure, and opting for the lack-lustre performance of the least common denominator becomes an unthinkably detrimental option.
What makes C -- and some of its derivatives -- quite special, is that it allows its users complete control -- when that is what they desire -- without imposing the related responsibilities upon them when they do not. Nevertheless, it never offers more than the thinnest of insulations from the machine, wherefore proper use demands exacting comprehension of the concept of pointers.
In essence, the answer to your question is sublimely simple and satisfyingly sweet -- in confirmation of your suspicions. Provided, however, that one attaches the requisite significance to every concept in this statement:
The acts of examining, comparing and manipulating pointers are always and necessarily valid, while the conclusions derived from the result depends on the validity of the values contained, and thus need not be.
The former is both invariably safe and potentially proper, while the latter can only ever be proper when it has been established as safe. Surprisingly -- to some -- so establishing the validity of the latter depends on and demands the former.
Of course, part of the confusion arises from the effect of the recursion inherently present within the principle of a pointer -- and the challenges posed in differentiating content from address.
You have quite correctly surmised,
I'm being led to think that any pointer can be compared with any other pointer, regardless of where they individually point. Moreover, I think pointer arithmetic between two pointers is fine, no matter where they individually point because the arithmetic is just using the memory addresses the pointers store.
And several contributors have affirmed: pointers are just numbers. Sometimes something closer to complex numbers, but still no more than numbers.
The amusing acrimony in which this contention has been received here reveals more about human nature than programming, but remains worthy of note and elaboration. Perhaps we will do so later...
As one comment begins to hint; all this confusion and consternation derives from the need to discern what is valid from what is safe, but that is an oversimplification. We must also distinguish what is functional and what is reliable, what is practical and what may be proper, and further still: what is proper in a particular circumstance from what may be proper in a more general sense. Not to mention; the difference between conformity and propriety.
Toward that end, we first need to appreciate precisely what a pointer is.
You have demonstrated a firm grip on the concept, and like some others may find these illustrations patronizingly simplistic, but the level of confusion evident here demands such simplicity in clarification.
As several have pointed out: the term pointer is merely a special name for what is simply an index, and thus nothing more than any other number.
This should already be self-evident in consideration of the fact that all contemporary mainstream computers are binary machines that necessarily work exclusively with and on numbers. Quantum computing may change that, but that is highly unlikely, and it has not come of age.
Technically, as you have noted, pointers are more accurately addresses; an obvious insight that naturally introduces the rewarding analogy of correlating them with the ‘addresses’ of houses, or plots on a street.
In a flat memory model: the entire system memory is organized in a single, linear sequence: all houses in the city lie on the same road, and every house is uniquely identified by its number alone. Delightfully simple.
In segmented schemes: a hierarchical organization of numbered roads is introduced above that of numbered houses so that composite addresses are required.
Some implementations are still more convoluted, and the totality of distinct ‘roads’ need not sum to a contiguous sequence, but none of that changes anything about the underlying.
We are necessarily able to decompose every such hierarchical link back into a flat organization. The more complex the organization, the more hoops we will have to hop through in order to do so, but it must be possible. Indeed, this also applies to ‘real mode’ on x86.
Otherwise the mapping of links to locations would not be bijective, as reliable execution -- at the system level -- demands that it MUST be.
multiple addresses must not map to singular memory locations, and
singular addresses must never map to multiple memory locations.
Bringing us to the further twist that turns the conundrum into such a fascinatingly complicated tangle. Above, it was expedient to suggest that pointers are addresses, for the sake of simplicity and clarity. Of course, this is not correct. A pointer is not an address; a pointer is a reference to an address, it contains an address. Like the envelope sports a reference to the house. Contemplating this may lead you to glimpse what was meant with the suggestion of recursion contained in the concept. Still; we have only so many words, and talking about the addresses of references to addresses and such, soon stalls most brains at an invalid op-code exception. And for the most part, intent is readily garnered from context, so let us return to the street.
Postal workers in this imaginary city of ours are much like the ones we find in the ‘real’ world. No one is likely to suffer a stroke when you talk or enquire about an invalid address, but every last one will balk when you ask them to act on that information.
Suppose there are only 20 houses on our singular street. Further pretend that some misguided, or dyslexic soul has directed a letter, a very important one, to number 71. Now, we can ask our carrier Frank, whether there is such an address, and he will simply and calmly report: no. We can even expect him to estimate how far outside the street this location would lie if it did exist: roughly 2.5 times further than the end. None of this will cause him any exasperation. However, if we were to ask him to deliver this letter, or to pick up an item from that place, he is likely to be quite frank about his displeasure, and refusal to comply.
Pointers are just addresses, and addresses are just numbers.
Verify the output of the following:
void foo( void *p ) {
printf(“%p\t%zu\t%d\n”, p, (size_t)p, p == (size_t)p);
}
Call it on as many pointers as you like, valid or not. Please do post your findings if it fails on your platform, or your (contemporary) compiler complains.
Now, because pointers are simply numbers, it is inevitably valid to compare them. In one sense this is precisely what your teacher is demonstrating. All of the following statements are perfectly valid -- and proper! -- C, and when compiled will run without encountering problems, even though neither pointer need be initialized and the values they contain therefore may be undefined:
We are only calculating result explicitly for the sake of clarity, and printing it to force the compiler to compute what would otherwise be redundant, dead code.
void foo( size_t *a, size_t *b ) {
size_t result;
result = (size_t)a;
printf(“%zu\n”, result);
result = a == b;
printf(“%zu\n”, result);
result = a < b;
printf(“%zu\n”, result);
result = a - b;
printf(“%zu\n”, result);
}
Of course, the program is ill-formed when either a or b is undefined (read: not properly initialized) at the point of testing, but that is utterly irrelevant to this part of our discussion. These snippets, as too the following statements, are guaranteed -- by the ‘standard’ -- to compile and run flawlessly, notwithstanding the IN-validity of any pointer involved.
Problems only arise when an invalid pointer is dereferenced. When we ask Frank to pick up or deliver at the invalid, non-existent address.
Given any arbitrary pointer:
int *p;
While this statement must compile and run:
printf(“%p”, p);
... as must this:
size_t foo( int *p ) { return (size_t)p; }
... the following two, in stark contrast, will still readily compile, but fail in execution unless the pointer is valid -- by which we here merely mean that it references an address to which the present application has been granted access:
printf(“%p”, *p);
size_t foo( int *p ) { return *p; }
How subtle the change? The distinction lies in the difference between the value of the pointer -- which is the address, and the value of the contents: of the house at that number. No problem arises until the pointer is dereferenced; until an attempt is made to access the address it links to. In trying to deliver or pick up the package beyond the stretch of the road...
By extension, the same principle necessarily applies to more complex examples, including the aforementioned need to establish the requisite validity:
int* validate( int *p, int *head, int *tail ) {
return p >= head && p <= tail ? p : NULL;
}
Relational comparison and arithmetic offer identical utility to testing equivalence, and are equivalently valid -- in principle. However, what the results of such computation would signify, is a different matter entirely -- and precisely the issue addressed by the quotations you included.
In C, an array is a contiguous buffer, an uninterrupted linear series of memory locations. Comparison and arithmetic applied to pointers that reference locations within such a singular series are naturally, and obviously meaningful in relation both to each other, and to this ‘array’ (which is simply identified by the base). Precisely the same applies to every block allocated through malloc, or sbrk. Because these relationships are implicit, the compiler is able to establish valid relationships between them, and therefore can be confident that calculations will provide the answers anticipated.
Performing similar gymnastics on pointers that reference distinct blocks or arrays do not offer any such inherent, and apparent utility. The more so since whatever relation exists at one moment may be invalidated by a reallocation that follows, wherein that is highly likely to change, even be inverted. In such instances the compiler is unable to obtain the necessary information to establish the confidence it had in the previous situation.
You, however, as the programmer, may have such knowledge! And in some instances are obliged to exploit that.
There ARE, therefore, circumstances in which EVEN THIS is entirely VALID and perfectly PROPER.
In fact, that is exactly what malloc itself has to do internally when time comes to try merging reclaimed blocks -- on the vast majority of architectures. The same is true for the operating system allocator, like that behind sbrk; if more obviously, frequently, on more disparate entities, more critically -- and relevant also on platforms where this malloc may not be. And how many of those are not written in C?
The validity, security and success of an action is inevitably the consequence of the level of insight upon which it is premised and applied.
In the quotes you have offered, Kernighan and Ritchie are addressing a closely related, but nonetheless separate issue. They are defining the limitations of the language, and explaining how you may exploit the capabilities of the compiler to protect you by at least detecting potentially erroneous constructs. They are describing the lengths the mechanism is able -- is designed -- to go to in order to assist you in your programming task. The compiler is your servant, you are the master. A wise master, however, is one that is intimately familiar with the capabilities of his various servants.
Within this context, undefined behaviour serves to indicate potential danger and the possibility of harm; not to imply imminent, irreversible doom, or the end of the world as we know it. It simply means that we -- ‘meaning the compiler’ -- are not able to make any conjecture about what this thing may be, or represent and for this reason we choose to wash our hands of the matter. We will not be held accountable for any misadventure that may result from the use, or mis-use of this facility.
In effect, it simply says: ‘Beyond this point, cowboy: you are on your own...’
Your professor is seeking to demonstrate the finer nuances to you.
Notice what great care they have taken in crafting their example; and how brittle it still is. By taking the address of a, in
p[0].p0 = &a;
the compiler is coerced into allocating actual storage for the variable, rather than placing it in a register. It being an automatic variable, however, the programmer has no control over where that is assigned, and so unable to make any valid conjecture about what would follow it. Which is why a must be set equal to zero for the code to work as expected.
Merely changing this line:
char a = 0;
to this:
char a = 1; // or ANY other value than 0
causes the behaviour of the program to become undefined. At minimum, the first answer will now be 1; but the problem is far more sinister.
Now the code is inviting of disaster.
While still perfectly valid and even conforming to the standard, it now is ill-formed and although sure to compile, may fail in execution on various grounds. For now there are multiple problems -- none of which the compiler is able to recognize.
strcpy will start at the address of a, and proceed beyond this to consume -- and transfer -- byte after byte, until it encounters a null.
The p1 pointer has been initialized to a block of exactly 10 bytes.
If a happens to be placed at the end of a block and the process has no access to what follows, the very next read -- of p0[1] -- will elicit a segfault. This scenario is unlikely on the x86 architecture, but possible.
If the area beyond the address of a is accessible, no read error will occur, but the program still is not saved from misfortune.
If a zero byte happens to occur within the ten starting at the address of a, it may still survive, for then strcpy will stop and at least we will not suffer a write violation.
If it is not faulted for reading amiss, but no zero byte occurs in this span of 10, strcpy will continue and attempt to write beyond the block allocated by malloc.
If this area is not owned by the process, the segfault should immediately be triggered.
The still more disastrous -- and subtle --- situation arises when the following block is owned by the process, for then the error cannot be detected, no signal can be raised, and so it may ‘appear’ still to ‘work’, while it actually will be overwriting other data, your allocator’s management structures, or even code (in certain operating environments).
This is why pointer related bugs can be so hard to track. Imagine these lines buried deep within thousands of lines of intricately related code, that someone else has written, and you are directed to delve through.
Nevertheless, the program must still compile, for it remains perfectly valid and standard conformant C.
These kinds of errors, no standard and no compiler can protect the unwary against. I imagine that is exactly what they are intending to teach you.
Paranoid people constantly seek to change the nature of C to dispose of these problematic possibilities and so save us from ourselves; but that is disingenuous. This is the responsibility we are obliged to accept when we choose to pursue the power and obtain the liberty that more direct and comprehensive control of the machine offers us. Promoters and pursuers of perfection in performance will never accept anything less.
Portability and the generality it represents is a fundamentally separate consideration and all that the standard seeks to address:
This document specifies the form and establishes the interpretation of programs expressed in the programming language C. Its purpose is to promote portability, reliability, maintainability, and efficient execution of C language programs on a variety of computing systems.
Which is why it is perfectly proper to keep it distinct from the definition and technical specification of the language itself. Contrary to what many seem to believe generality is antithetical to exceptional and exemplary.
To conclude:
Examining and manipulating pointers themselves is invariably valid and often fruitful. Interpretation of the results, may, or may not be meaningful, but calamity is never invited until the pointer is dereferenced; until an attempt is made to access the address linked to.
Were this not true, programming as we know it -- and love it -- would not have been possible.

Is it ok to compare pointers in C if they are members of a struct? [duplicate]

In K&R (The C Programming Language 2nd Edition) chapter 5 I read the following:
First, pointers may be compared under certain circumstances.
If p and q point to members of the same array, then relations like ==, !=, <, >=, etc. work properly.
Which seems to imply that only pointers pointing to the same array can be compared.
However when I tried this code
char t = 't';
char *pt = &t;
char x = 'x';
char *px = &x;
printf("%d\n", pt > px);
1 is printed to the screen.
First of all, I thought I would get undefined or some type or error, because pt and px aren't pointing to the same array (at least in my understanding).
Also is pt > px because both pointers are pointing to variables stored on the stack, and the stack grows down, so the memory address of t is greater than that of x? Which is why pt > px is true?
I get more confused when malloc is brought in. Also in K&R in chapter 8.7 the following is written:
There is still one assumption, however, that pointers to different blocks returned by sbrk can be meaningfully compared. This is not guaranteed by the standard which permits pointer comparisons only within an array. Thus this version of malloc is portable only among machines for which the general pointer comparison is meaningful.
I had no issue comparing pointers that pointed to space malloced on the heap to pointers that pointed to stack variables.
For example, the following code worked fine, with 1 being printed:
char t = 't';
char *pt = &t;
char *px = malloc(10);
strcpy(px, pt);
printf("%d\n", pt > px);
Based on my experiments with my compiler, I'm being led to think that any pointer can be compared with any other pointer, regardless of where they individually point. Moreover, I think pointer arithmetic between two pointers is fine, no matter where they individually point because the arithmetic is just using the memory addresses the pointers store.
Still, I am confused by what I am reading in K&R.
The reason I'm asking is because my prof. actually made it an exam question. He gave the following code:
struct A {
char *p0;
char *p1;
};
int main(int argc, char **argv) {
char a = 0;
char *b = "W";
char c[] = [ 'L', 'O', 'L', 0 ];
struct A p[3];
p[0].p0 = &a;
p[1].p0 = b;
p[2].p0 = c;
for(int i = 0; i < 3; i++) {
p[i].p1 = malloc(10);
strcpy(p[i].p1, p[i].p0);
}
}
What do these evaluate to:
p[0].p0 < p[0].p1
p[1].p0 < p[1].p1
p[2].p0 < p[2].p1
The answer is 0, 1, and 0.
(My professor does include the disclaimer on the exam that the questions are for a Ubuntu Linux 16.04, 64-bit version programming environment)
(editor's note: if SO allowed more tags, that last part would warrant x86-64, linux, and maybe assembly. If the point of the question / class was specifically low-level OS implementation details, rather than portable C.)
According to the C11 standard, the relational operators <, <=, >, and >= may only be used on pointers to elements of the same array or struct object. This is spelled out in section 6.5.8p5:
When two pointers are compared, the result depends on the
relative locations in the address space of the objects pointed to.
If two pointers to object types both point to the same object, or
both point one past the last element of the same array
object, they compare equal. If the objects pointed to are
members of the same aggregate object,pointers to structure
members declared later compare greater than pointers to
members declared earlier in the structure, and pointers to
array elements with larger subscript values compare greater than
pointers to elements of the same array with lower subscript values.
All pointers to members of the same union object compare
equal. If the expression P points to an element of an array
object and the expression Q points to the last element of the
same array object, the pointer expression Q+1 compares greater than P.
In all other cases, the behavior is undefined.
Note that any comparisons that do not satisfy this requirement invoke undefined behavior, meaning (among other things) that you can't depend on the results to be repeatable.
In your particular case, for both the comparison between the addresses of two local variables and between the address of a local and a dynamic address, the operation appeared to "work", however the result could change by making a seemingly unrelated change to your code or even compiling the same code with different optimization settings. With undefined behavior, just because the code could crash or generate an error doesn't mean it will.
As an example, an x86 processor running in 8086 real mode has a segmented memory model using a 16-bit segment and a 16-bit offset to build a 20-bit address. So in this case an address doesn't convert exactly to an integer.
The equality operators == and != however do not have this restriction. They can be used between any two pointers to compatible types or NULL pointers. So using == or != in both of your examples would produce valid C code.
However, even with == and != you could get some unexpected yet still well-defined results. See Can an equality comparison of unrelated pointers evaluate to true? for more details on this.
Regarding the exam question given by your professor, it makes a number of flawed assumptions:
A flat memory model exists where there is a 1-to-1 correspondence between an address and an integer value.
That the converted pointer values fit inside an integer type.
That the implementation simply treats pointers as integers when performing comparisons without exploiting the freedom given by undefined behavior.
That a stack is used and that local variables are stored there.
That a heap is used to pull allocated memory from.
That the stack (and therefore local variables) appears at a higher address than the heap (and therefore allocated objects).
That string constants appear at a lower address then the heap.
If you were to run this code on an architecture and/or with a compiler that does not satisfy these assumptions then you could get very different results.
Also, both examples also exhibit undefined behavior when they call strcpy, since the right operand (in some cases) points to a single character and not a null terminated string, resulting in the function reading past the bounds of the given variable.
The primary issue with comparing pointers to two distinct arrays of the same type is that the arrays themselves need not be placed in a particular relative positioning--one could end up before and after the other.
First of all, I thought I would get undefined or some type or error, because pt an px aren't pointing to the same array (at least in my understanding).
No, the result is dependent on implementation and other unpredictable factors.
Also is pt>px because both pointers are pointing to variables stored on the stack, and the stack grows down, so the memory address of t is greater than that of x? Which is why pt>px is true?
There isn't necessarily a stack. When it exists, it need not to grow down. It could grow up. It could be non-contiguous in some bizarre way.
Moreover, I think pointer arithmetic between two pointers is fine, no matter where they individually point because the arithmetic is just using the memory addresses the pointers store.
Let's look at the C specification, §6.5.8 on page 85 which discusses relational operators (i.e. the comparison operators you're using). Note that this does not apply to direct != or == comparison.
When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. ... If the objects pointed to are members of the same aggregate object, ... pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values.
In all other cases, the behavior is undefined.
The last sentence is important. While I cut down some unrelated cases to save space, there's one case that's important to us: two arrays, not part of the same struct/aggregate object1, and we're comparing pointers to those two arrays. This is undefined behavior.
While your compiler just inserted some sort of CMP (compare) machine instruction which numerically compares the pointers, and you got lucky here, UB is a pretty dangerous beast. Literally anything can happen--your compiler could optimize out the whole function including visible side effects. It could spawn nasal demons.
1Pointers into two different arrays that are part of the same struct can be compared, since this falls under the clause where the two arrays are part of the same aggregate object (the struct).
Then asked what
p[0].p0 < p[0].p1
p[1].p0 < p[1].p1
p[2].p0 < p[2].p1
Evaluate to. The answer is 0, 1, and 0.
These questions reduce to:
Is the heap above or below the stack.
Is the heap above or below the string literal section of the program.
same as [1].
And the answer to all three is "implementation defined". Your prof's questions are bogus; they have based it in traditional unix layout:
<empty>
text
rodata
rwdata
bss
< empty, used for heap >
...
stack
kernel
but several modern unices (and alternative systems) do not conform to those traditions. Unless they prefaced the question with " as of 1992 "; make sure to give a -1 on the eval.
On almost any remotely-modern platform, pointers and integers have an isomorphic ordering relation, and pointers to disjoint objects are not interleaved. Most compilers expose this ordering to programmers when optimizations are disabled, but the Standard makes no distinction between platforms that have such an ordering and those that don't and does not require that any implementations expose such an ordering to the programmer even on platforms that would define it. Consequently, some compiler writers perform various kinds of optimizations and "optimizations" based upon an assumption that code will never compare use relational operators on pointers to different objects.
According to the published Rationale, the authors of the Standard intended that implementations extend the language by specifying how they will behave in situations the Standard characterizes as "Undefined Behavior" (i.e. where the Standard imposes no requirements) when doing so would be useful and practical, but some compiler writers would rather assume programs will never try to benefit from anything beyond what the Standard mandates, than allow programs to usefully exploit behaviors the platforms could support at no extra cost.
I'm not aware of any commercially-designed compilers that do anything weird with pointer comparisons, but as compilers move to the non-commercial LLVM for their back end, they're increasingly likely to process nonsensically code whose behavior had been specified by earlier compilers for their platforms. Such behavior isn't limited to relational operators, but can even affect equality/inequality. For example, even though the Standard specifies that a comparison between a pointer to one object and a "just past" pointer to an immediately-preceding object will compare equal, gcc and LLVM-based compilers are prone to generate nonsensical code if programs perform such comparisons.
As an example of a situation where even equality comparison behaves nonsensically in gcc and clang, consider:
extern int x[],y[];
int test(int i)
{
int *p = y+i;
y[0] = 4;
if (p == x+10)
*p = 1;
return y[0];
}
Both clang and gcc will generate code that will always return 4 even if x is ten elements, y immediately follows it, and i is zero resulting in the comparison being true and p[0] being written with the value 1. I think what happens is that one pass of optimization rewrites the function as though *p = 1; were replaced with x[10] = 1;. The latter code would be equivalent if the compiler interpreted *(x+10) as equivalent to *(y+i), but unfortunately a downstream optimization stage recognizes that an access to x[10] would only defined if x had at least 11 elements, which would make it impossible for that access to affect y.
If compilers can get that "creative" with pointer equality scenario which is described by the Standard, I would not trust them to refrain from getting even more creative in cases where the Standard doesn't impose requirements.
It's simple: Comparing pointers does not make sense as memory locations for objects are never guaranteed to be in the same order as you declared them.
The exception is arrays. &array[0] is lower than &array[1]. Thats what K&R points out. In practice struct member addresses are also in the order you declare them in my experience. No guarantees on that....
Another exception is if you compare a pointer for equal. When one pointer is equal to another you know it's pointing to the same object. Whatever it is.
Bad exam question if you ask me. Depending on Ubuntu Linux 16.04, 64-bit version programming environment for an exam question ? Really ?
Pointers are just integers, like everything else in a computer. You absolutely can compare them with < and > and produce results without causing a program to crash. That said, the standard does not guarantee that those results have any meaning outside of array comparisons.
In your example of stack allocated variables, the compiler is free to allocate those variables to registers or stack memory addresses, and in any order it so choose. Comparisons such as < and > therefore won't be consistent across compilers or architectures. However, == and != aren't so restricted, comparing pointer equality is a valid and useful operation.
What A Provocative Question!
Even cursory scanning of the responses and comments in this thread will reveal how emotive your seemingly simple and straight forward query turns out to be.
It should not be surprising.
Inarguably, misunderstandings around the concept and use of pointers represents a predominant cause of serious failures in programming in general.
Recognition of this reality is readily evident in the ubiquity of languages designed specifically to address, and preferably to avoid the challenges pointers introduce altogether. Think C++ and other derivatives of C, Java and its relations, Python and other scripts -- merely as the more prominent and prevalent ones, and more or less ordered in severity of dealing with the issue.
Developing a deeper understanding of the principles underlying, therefore must be pertinent to every individual that aspires to excellence in programming -- especially at the systems level.
I imagine this is precisely what your teacher means to demonstrate.
And the nature of C makes it a convenient vehicle for this exploration. Less clearly than assembly -- though perhaps more readily comprehensible -- and still far more explicitly than languages based on deeper abstraction of the execution environment.
Designed to facilitate deterministic translation of the programmer’s intent into instructions that machines can comprehend, C is a system level language. While classified as high-level, it really belongs in a ‘medium’ category; but since none such exists, the ‘system’ designation has to suffice.
This characteristic is largely responsible for making it a language of choice for device drivers, operating system code, and embedded implementations. Furthermore, a deservedly favoured alternative in applications where optimal efficiency is paramount; where that means the difference between survival and extinction, and therefore is a necessity as opposed to a luxury. In such instances, the attractive convenience of portability loses all its allure, and opting for the lack-lustre performance of the least common denominator becomes an unthinkably detrimental option.
What makes C -- and some of its derivatives -- quite special, is that it allows its users complete control -- when that is what they desire -- without imposing the related responsibilities upon them when they do not. Nevertheless, it never offers more than the thinnest of insulations from the machine, wherefore proper use demands exacting comprehension of the concept of pointers.
In essence, the answer to your question is sublimely simple and satisfyingly sweet -- in confirmation of your suspicions. Provided, however, that one attaches the requisite significance to every concept in this statement:
The acts of examining, comparing and manipulating pointers are always and necessarily valid, while the conclusions derived from the result depends on the validity of the values contained, and thus need not be.
The former is both invariably safe and potentially proper, while the latter can only ever be proper when it has been established as safe. Surprisingly -- to some -- so establishing the validity of the latter depends on and demands the former.
Of course, part of the confusion arises from the effect of the recursion inherently present within the principle of a pointer -- and the challenges posed in differentiating content from address.
You have quite correctly surmised,
I'm being led to think that any pointer can be compared with any other pointer, regardless of where they individually point. Moreover, I think pointer arithmetic between two pointers is fine, no matter where they individually point because the arithmetic is just using the memory addresses the pointers store.
And several contributors have affirmed: pointers are just numbers. Sometimes something closer to complex numbers, but still no more than numbers.
The amusing acrimony in which this contention has been received here reveals more about human nature than programming, but remains worthy of note and elaboration. Perhaps we will do so later...
As one comment begins to hint; all this confusion and consternation derives from the need to discern what is valid from what is safe, but that is an oversimplification. We must also distinguish what is functional and what is reliable, what is practical and what may be proper, and further still: what is proper in a particular circumstance from what may be proper in a more general sense. Not to mention; the difference between conformity and propriety.
Toward that end, we first need to appreciate precisely what a pointer is.
You have demonstrated a firm grip on the concept, and like some others may find these illustrations patronizingly simplistic, but the level of confusion evident here demands such simplicity in clarification.
As several have pointed out: the term pointer is merely a special name for what is simply an index, and thus nothing more than any other number.
This should already be self-evident in consideration of the fact that all contemporary mainstream computers are binary machines that necessarily work exclusively with and on numbers. Quantum computing may change that, but that is highly unlikely, and it has not come of age.
Technically, as you have noted, pointers are more accurately addresses; an obvious insight that naturally introduces the rewarding analogy of correlating them with the ‘addresses’ of houses, or plots on a street.
In a flat memory model: the entire system memory is organized in a single, linear sequence: all houses in the city lie on the same road, and every house is uniquely identified by its number alone. Delightfully simple.
In segmented schemes: a hierarchical organization of numbered roads is introduced above that of numbered houses so that composite addresses are required.
Some implementations are still more convoluted, and the totality of distinct ‘roads’ need not sum to a contiguous sequence, but none of that changes anything about the underlying.
We are necessarily able to decompose every such hierarchical link back into a flat organization. The more complex the organization, the more hoops we will have to hop through in order to do so, but it must be possible. Indeed, this also applies to ‘real mode’ on x86.
Otherwise the mapping of links to locations would not be bijective, as reliable execution -- at the system level -- demands that it MUST be.
multiple addresses must not map to singular memory locations, and
singular addresses must never map to multiple memory locations.
Bringing us to the further twist that turns the conundrum into such a fascinatingly complicated tangle. Above, it was expedient to suggest that pointers are addresses, for the sake of simplicity and clarity. Of course, this is not correct. A pointer is not an address; a pointer is a reference to an address, it contains an address. Like the envelope sports a reference to the house. Contemplating this may lead you to glimpse what was meant with the suggestion of recursion contained in the concept. Still; we have only so many words, and talking about the addresses of references to addresses and such, soon stalls most brains at an invalid op-code exception. And for the most part, intent is readily garnered from context, so let us return to the street.
Postal workers in this imaginary city of ours are much like the ones we find in the ‘real’ world. No one is likely to suffer a stroke when you talk or enquire about an invalid address, but every last one will balk when you ask them to act on that information.
Suppose there are only 20 houses on our singular street. Further pretend that some misguided, or dyslexic soul has directed a letter, a very important one, to number 71. Now, we can ask our carrier Frank, whether there is such an address, and he will simply and calmly report: no. We can even expect him to estimate how far outside the street this location would lie if it did exist: roughly 2.5 times further than the end. None of this will cause him any exasperation. However, if we were to ask him to deliver this letter, or to pick up an item from that place, he is likely to be quite frank about his displeasure, and refusal to comply.
Pointers are just addresses, and addresses are just numbers.
Verify the output of the following:
void foo( void *p ) {
printf(“%p\t%zu\t%d\n”, p, (size_t)p, p == (size_t)p);
}
Call it on as many pointers as you like, valid or not. Please do post your findings if it fails on your platform, or your (contemporary) compiler complains.
Now, because pointers are simply numbers, it is inevitably valid to compare them. In one sense this is precisely what your teacher is demonstrating. All of the following statements are perfectly valid -- and proper! -- C, and when compiled will run without encountering problems, even though neither pointer need be initialized and the values they contain therefore may be undefined:
We are only calculating result explicitly for the sake of clarity, and printing it to force the compiler to compute what would otherwise be redundant, dead code.
void foo( size_t *a, size_t *b ) {
size_t result;
result = (size_t)a;
printf(“%zu\n”, result);
result = a == b;
printf(“%zu\n”, result);
result = a < b;
printf(“%zu\n”, result);
result = a - b;
printf(“%zu\n”, result);
}
Of course, the program is ill-formed when either a or b is undefined (read: not properly initialized) at the point of testing, but that is utterly irrelevant to this part of our discussion. These snippets, as too the following statements, are guaranteed -- by the ‘standard’ -- to compile and run flawlessly, notwithstanding the IN-validity of any pointer involved.
Problems only arise when an invalid pointer is dereferenced. When we ask Frank to pick up or deliver at the invalid, non-existent address.
Given any arbitrary pointer:
int *p;
While this statement must compile and run:
printf(“%p”, p);
... as must this:
size_t foo( int *p ) { return (size_t)p; }
... the following two, in stark contrast, will still readily compile, but fail in execution unless the pointer is valid -- by which we here merely mean that it references an address to which the present application has been granted access:
printf(“%p”, *p);
size_t foo( int *p ) { return *p; }
How subtle the change? The distinction lies in the difference between the value of the pointer -- which is the address, and the value of the contents: of the house at that number. No problem arises until the pointer is dereferenced; until an attempt is made to access the address it links to. In trying to deliver or pick up the package beyond the stretch of the road...
By extension, the same principle necessarily applies to more complex examples, including the aforementioned need to establish the requisite validity:
int* validate( int *p, int *head, int *tail ) {
return p >= head && p <= tail ? p : NULL;
}
Relational comparison and arithmetic offer identical utility to testing equivalence, and are equivalently valid -- in principle. However, what the results of such computation would signify, is a different matter entirely -- and precisely the issue addressed by the quotations you included.
In C, an array is a contiguous buffer, an uninterrupted linear series of memory locations. Comparison and arithmetic applied to pointers that reference locations within such a singular series are naturally, and obviously meaningful in relation both to each other, and to this ‘array’ (which is simply identified by the base). Precisely the same applies to every block allocated through malloc, or sbrk. Because these relationships are implicit, the compiler is able to establish valid relationships between them, and therefore can be confident that calculations will provide the answers anticipated.
Performing similar gymnastics on pointers that reference distinct blocks or arrays do not offer any such inherent, and apparent utility. The more so since whatever relation exists at one moment may be invalidated by a reallocation that follows, wherein that is highly likely to change, even be inverted. In such instances the compiler is unable to obtain the necessary information to establish the confidence it had in the previous situation.
You, however, as the programmer, may have such knowledge! And in some instances are obliged to exploit that.
There ARE, therefore, circumstances in which EVEN THIS is entirely VALID and perfectly PROPER.
In fact, that is exactly what malloc itself has to do internally when time comes to try merging reclaimed blocks -- on the vast majority of architectures. The same is true for the operating system allocator, like that behind sbrk; if more obviously, frequently, on more disparate entities, more critically -- and relevant also on platforms where this malloc may not be. And how many of those are not written in C?
The validity, security and success of an action is inevitably the consequence of the level of insight upon which it is premised and applied.
In the quotes you have offered, Kernighan and Ritchie are addressing a closely related, but nonetheless separate issue. They are defining the limitations of the language, and explaining how you may exploit the capabilities of the compiler to protect you by at least detecting potentially erroneous constructs. They are describing the lengths the mechanism is able -- is designed -- to go to in order to assist you in your programming task. The compiler is your servant, you are the master. A wise master, however, is one that is intimately familiar with the capabilities of his various servants.
Within this context, undefined behaviour serves to indicate potential danger and the possibility of harm; not to imply imminent, irreversible doom, or the end of the world as we know it. It simply means that we -- ‘meaning the compiler’ -- are not able to make any conjecture about what this thing may be, or represent and for this reason we choose to wash our hands of the matter. We will not be held accountable for any misadventure that may result from the use, or mis-use of this facility.
In effect, it simply says: ‘Beyond this point, cowboy: you are on your own...’
Your professor is seeking to demonstrate the finer nuances to you.
Notice what great care they have taken in crafting their example; and how brittle it still is. By taking the address of a, in
p[0].p0 = &a;
the compiler is coerced into allocating actual storage for the variable, rather than placing it in a register. It being an automatic variable, however, the programmer has no control over where that is assigned, and so unable to make any valid conjecture about what would follow it. Which is why a must be set equal to zero for the code to work as expected.
Merely changing this line:
char a = 0;
to this:
char a = 1; // or ANY other value than 0
causes the behaviour of the program to become undefined. At minimum, the first answer will now be 1; but the problem is far more sinister.
Now the code is inviting of disaster.
While still perfectly valid and even conforming to the standard, it now is ill-formed and although sure to compile, may fail in execution on various grounds. For now there are multiple problems -- none of which the compiler is able to recognize.
strcpy will start at the address of a, and proceed beyond this to consume -- and transfer -- byte after byte, until it encounters a null.
The p1 pointer has been initialized to a block of exactly 10 bytes.
If a happens to be placed at the end of a block and the process has no access to what follows, the very next read -- of p0[1] -- will elicit a segfault. This scenario is unlikely on the x86 architecture, but possible.
If the area beyond the address of a is accessible, no read error will occur, but the program still is not saved from misfortune.
If a zero byte happens to occur within the ten starting at the address of a, it may still survive, for then strcpy will stop and at least we will not suffer a write violation.
If it is not faulted for reading amiss, but no zero byte occurs in this span of 10, strcpy will continue and attempt to write beyond the block allocated by malloc.
If this area is not owned by the process, the segfault should immediately be triggered.
The still more disastrous -- and subtle --- situation arises when the following block is owned by the process, for then the error cannot be detected, no signal can be raised, and so it may ‘appear’ still to ‘work’, while it actually will be overwriting other data, your allocator’s management structures, or even code (in certain operating environments).
This is why pointer related bugs can be so hard to track. Imagine these lines buried deep within thousands of lines of intricately related code, that someone else has written, and you are directed to delve through.
Nevertheless, the program must still compile, for it remains perfectly valid and standard conformant C.
These kinds of errors, no standard and no compiler can protect the unwary against. I imagine that is exactly what they are intending to teach you.
Paranoid people constantly seek to change the nature of C to dispose of these problematic possibilities and so save us from ourselves; but that is disingenuous. This is the responsibility we are obliged to accept when we choose to pursue the power and obtain the liberty that more direct and comprehensive control of the machine offers us. Promoters and pursuers of perfection in performance will never accept anything less.
Portability and the generality it represents is a fundamentally separate consideration and all that the standard seeks to address:
This document specifies the form and establishes the interpretation of programs expressed in the programming language C. Its purpose is to promote portability, reliability, maintainability, and efficient execution of C language programs on a variety of computing systems.
Which is why it is perfectly proper to keep it distinct from the definition and technical specification of the language itself. Contrary to what many seem to believe generality is antithetical to exceptional and exemplary.
To conclude:
Examining and manipulating pointers themselves is invariably valid and often fruitful. Interpretation of the results, may, or may not be meaningful, but calamity is never invited until the pointer is dereferenced; until an attempt is made to access the address linked to.
Were this not true, programming as we know it -- and love it -- would not have been possible.

What makes it possible for glibc malloc to compare pointers from different "objects"?

Comparing pointers with a relational operator (e.g. <, <=, >= or >) is only defined by the C standard when the pointers both point to within the same aggregate object (struct, array or union). This in practise means that a comparison in the shape of
if (start_object <= my_pointer && my_pointer < end_object+1) {
can be turned into
if (1) {
by an optimising compiler. Despite this, in K&R, section 8.7 "Example—A Storage Allocator", the authors make comparisons similar to the one above. They excuse this by saying
There is still one assumption, however, that pointers to different blocks returned by sbrk can be meaningfully compared. This is not guaranteed by the standard, which permits pointer comparisons only within an array. Thus this version of malloc is portable only among machines for which general pointer comparison is meaningful.
Furthermore, it appears the implementation of malloc used in glibc does the same thing!
What's worse is – the reason I stumbled across this to begin with is – for a school assignment I'm supposed to implement a rudimentary malloc like function, and the instructions for the assignment requires us to use the K&R code, but we have to replace the sbrk call with a call to mmap!
While comparing pointers from different sbrk calls is probably undefined, it is also only slightly dubious, since you have some sort of mental intuition that the returned pointers should come from sort of the same region of memory. Pointers returned by different mmap calls have, as far as I understand, no guarantee to even be remotely similar to each other, and consolidating/merging memory blocks across mmap calls should be highly illegal (and it appears glibc avoids this, resorting to only merging memory returned by sbrk or internally inside mmap pages, not across them), yet the assignment requires this.
Question: could someone shine some light on
Whether or not comparing pointers from different calls to sbrk may be optimised away, and
If so, what glibc does that lets them get away with it.
The language lawyer answer is (I believe) to be found in §6.5.8.5 of the C99 standard (or more precisely from ISO/IEC 9899:TC3 Committee Draft — Septermber 7, 2007 WG14/N1256 which is nearly identical but I don't have the original to hand) which has the following with regard to relational operators (i.e. <, <=, >, >=):
When two pointers are compared, the result depends on the relative locations in the address space of the objects pointed to. If two pointers to object or incomplete types both point to the same object, or both point one past the last element of the same array object, they compare equal. If the objects pointed to are members of the same aggregate object, pointers to structure members declared later compare greater than pointers to members declared earlier in the structure, and pointers to array elements with larger subscript values compare greater than pointers to elements of the same array with lower subscript values. All pointers to members of the same union object compare equal. If the expression P points to an element of an array object and the expression Q points to the last element of the same array object, the pointer expression Q+1 compares greater than P. In all other cases, the behavior is undefined.
(the C11 text is identical or near identical)
This at first seems unhelpful, or at least suggests the that the implementations each exploit undefined behaviour. I think, however, you can either rationalise the behaviour or use a work around.
The C pointers specified are either going to be NULL, or derived from taking the address of an object with &, or by pointer arithmetic, or by the result of some function. In the case concerned, they are derived by the result of the sbrk or mmap system calls. What do these systems calls really return? At a register level, they return an integer with the size uintptr_t (or intptr_t). It is in fact the system call interface which is casting them to a pointer. As we know casts between pointers and uintptr_t (or intptr_t) are by definition of the type bijective, we know we could cast the pointers to uintptr_t (for instance) and compare them, which will impose a well order relation on the pointers. The wikipedia link gives more information, but this will in essence ensure that every comparison is well defined as well as other useful properties such as a<b and b<c implies a<c. (I also can't choose an entirely arbitrary order as it would need to satisfy the other requirements of C99 §6.5.8.5 which pretty much leaves me with intptr_t and uintptr_t as candidates.)
We can exploit this to and write the (arguably better):
if ((uintptr_t)start_object <= (uintptr_t)my_pointer && (uintptr_t)my_pointer < (uintptr_t)(end_object+1)) {
There is a nit here. You'll note I casted to uintptr_t and not intptr_t. Why was that the right choice? In fact why did I not choose a rather bizarre ordering such as reversing the bits and comparing? The assumption here is that I'm chosing the same ordering as the kernel, specifically that my definition of < (given by the ordering) is such that the start and end of any allocated memory block will always be such that start < end. On all modern platforms I know, there is no 'wrap around' (e.g. the kernel will not allocate 32 bit memory starting at 0xffff8000 and ending at 0x00007ffff) - though note that similar wrap around has been exploited in the past.
The C standard specifies that pointer comparisons give undefined results under many circumstances. However, here you are building your own pointers out of integers returned by system calls. You can therefore either compare the integers, or compare the the pointers by casting them back to integers (exploiting the bijective nature of the cast). If you merely compare the pointers, you rely on the C compiler's implementation of pointer comparison being sane, which it almost certainly is, but is not guaranteed.
Are the possibilities I mention so obscure that they can be discounted? Nope, let's find a platform example where they might be important: 8086. It's possible to imagine an 8086 compilation model where every pointer is a 'far' pointer (i.e. contains a segment register). Pointer comparison could do a < or > on the segment register and only if they are equal do a < or > onto the offset. This would be entirely legitimate so long as all the structures in C99 §6.5.8.5 are in the same segment. But it won't work as one might expect between segments as 1000:1234 (which is equal to 1010:1134 in memory address) will appear smaller than 1010:0123. mmap here might well return results in different segments. Similarly one could think of another memory model where the segment register is actually a selector, and a pointer comparison uses a processor comparison instruction was used to compare memory addresses which aborts if an invalid selector or an offset outside a segment is used.
You ask two specific questions:
Whether or not comparing pointers from different calls to sbrk may be optimised away, and
If so, what glibc does that lets them get away with it.
In the formulation given above where start_object etc. are actually void *, then the calculation may be optimized away (i.e. might do what you want) but is not guaranteed to do so as the behaviour is undefined. A cast would guarantee that it does so provided the kernel uses the same well ordering as implied by the cast.
In answer to the second question, glibc is relying on a behaviour of the C compiler which is technically not required, but is very likely (per the above).
Note also (at least in the K&R in front of me) that the line you quote doesn't exist in the code. The caveat is in relation to the comparison of header * pointers with < (as I far as I can see comparison of void * pointers with < is always UB) which may derive from separate sbrk() calls.
The answer is simple enough. The C library implementation is written with some knowledge of (or perhaps expectations for) how the C compiler will handle certain code that has undefined behaviour according to the official specification.
There are many examples I could give; but that pointers actually refer to an address in the process' address space and can be compared freely is relied on by the C library implementation (at least by Glibc) and also by many "real world" programs. While it is not guaranteed by the standard for strictly conforming programs, it is true for the vast majority of real-world architectures/compilers. Also, note footnote 67, regarding conversion of pointers to integers and back:
The mapping functions for converting a pointer to an integer or an
integer to a pointer are intended to be consistent with the addressing
structure of the execution environment.
While this doesn't strictly give license to compare arbitrary pointers, it helps to understand how the rules are supposed to work: as a set of specific behaviours that are certain to be consistent across all platforms, rather than as a limit to what is permissible when the representation of a pointer is fully known and understood.
You've remarked that:
if (start_object <= my_pointer && my_pointer < end_object+1) {
Can be turned into:
if (1) {
With the assumption (which you didn't state) that my_pointer is derived in some way from the value of start_object or the address of the object which it delimits - then this is strictly true, but it's not an optimisation that compilers make in practice except in the case of static/automatic storage duration objects (i.e. objects which the compiler knows weren't dynamically allocated).
Consider the fact that calls to sbrk are defined to increment, or decrement the number of bytes allocated in some region (the heap), for some process by the given incr parameter according to some brk address. This is really just a wrapper around brk, which allows you to adjust the current top of the heap. When you call brk(addr), you're telling the kernel to allocate space for your process all the way from the bottom of the addr (or possibly to free space between the current previous higher-address top of the heap down to the new address). sbrk(incr) would be exactly equivalent if incr == new_top - original_top. Thus to answer your question:
Because sbrk just adjusts the size of the heap (i.e. some contiguous region of memory) by incr number of bytes, comparing the values of sbrk is just a comparison of points in some contiguous region of memory. That is exactly equivalent to comparing points in an array, and so it is a well defined operation according to the the C-standard. Therefore, pointer comparison calls around sbrk can be optimized away.
glibc doesn't do anything special to "get away with it" - they just assume that the assumption mentioned above holds true (which it does). In fact, if they're checking the state of a chunk for memory that was allocated with mmap, it explicitly verifies that the mmap'd memory is outside the range allocated with sbrk.
Edit: Something I want to make clearer about my answer: The key here is that there is no undefined behavior! sbrk is defined to allocate bytes in some contiguous region of memory, which is itself an 'object' as specified by the C-standard. Therefore, comparison of pointers within that 'object' is a completely sane and well defined operation. The assumption here is not that glibc is taking advantage of undefined pointer comparison, it's that it's assuming that sbrk grows / shrinks memory in some contiguous region.
The authors of the C Standard recognized that there are some segmented-memory hardware platforms where an attempt to perform a relational comparison between objects in different segments might behave strangely. Rather than say that such platforms could not efficiently accommodate efficient C implementations, the authors of the Standard allow such implementations to do anything they see fit if an attempt is made to compare pointers to objects that might be in different segments.
For the authors of the Standard to have said that comparisons between disjoint objects should only exhibit strange behavior on such segmented-memory systems that can't efficiently yield consistent behavior would have been seen as implying that such systems were inferior to platforms where relational comparisons between arbitrary pointers will yield a consistent ranking, and the authors of the Standard went out of their way to avoid such implications. Instead, they figured that since there was no reason for implementations targeting commonplace platforms to do anything weird with such comparisons, such implementations would handle them sensibly whether the Standard mandated them or not.
Unfortunately, some people who are more interested in making a compiler that conforms to the Standard than in making one that's useful have decided that any code which isn't written to accommodate the limitations of hardware that has been obsolete for decades should be considered "broken". They claim that their "optimizations" allow programs to be more efficient than would otherwise be possible, but in many cases the "efficiency" gains are only significant in cases where a compiler omits code which is actually necessary. If a programmer works around the compiler's limitations, the resulting code may end up being less efficient than if the compiler hadn't bothered with the "optimization" in the first place.

Rationale for pointer comparisons outside an array to be UB

So, the standard (referring to N1570) says the following about comparing pointers:
C99 6.5.8/5 Relational operators
When two pointers are compared, the result depends on the relative
locations in the address space of the objects pointed to.
... [snip obvious definitions of comparison within aggregates] ...
In all other cases,
the behavior is undefined.
What is the rationale for this instance of UB, as opposed to specifying (for instance) conversion to intptr_t and comparison of that?
Is there some machine architecture where a sensible total ordering on pointers is hard to construct? Is there some class of optimization or analysis that unrestricted pointer comparisons would impede?
A deleted answer to this question mentions that this piece of UB allows for skipping comparison of segment registers and only comparing offsets. Is that particularly valuable to preserve?
(That same deleted answer, as well as one here, note that in C++, std::less and the like are required to implement a total order on pointers, whether the normal comparison operator does or not.)
Various comments in the ub mailing list discussion Justification for < not being a total order on pointers? strongly allude to segmented architectures being the reason. Including the follow comments, 1:
Separately, I believe that the Core Language should simply recognize the fact that all machines these days have a flat memory model.
and 2:
Then we maybe need an new type that guarantees a total order when
converted from a pointer (e.g. in segmented architectures, conversion
would require taking the address of the segment register and adding the
offset stored in the pointer).
and 3:
Pointers, while historically not totally ordered, are practically so
for all systems in existence today, with the exception of the ivory tower
minds of the committee, so the point is moot.
and 4:
But, even if segmented architectures, unlikely though it is, do come
back, the ordering problem still has to be addressed, as std::less
is required to totally order pointers. I just want operator< to be an
alternate spelling for that property.
Why should everyone else pretend to suffer (and I do mean pretend,
because outside of a small contingent of the committee, people already
assume that pointers are totally ordered with respect to operator<) to
meet the theoretical needs of some currently non-existent
architecture?
Counter to the trend of comments from the ub mailing list, FUZxxl points out that supporting DOS is a reason not to support totally ordered pointers.
Update
This is also supported by the Annotated C++ Reference Manual(ARM) which says this was due to burden of supporting this on segmented architectures:
The expression may not evaluate to false on segmented architectures
[...] This explains why addition, subtraction and comparison of
pointers are defined only for pointers into an array and one element
beyond the end. [...] Users of machines with a nonsegmented address
space developed idioms, however, that referred to the elements beyond
the end of the array [...] was not portable to segmented architectures
unless special effort was taken [...] Allowing [...] would be costly
and serve few useful purposes.
The 8086 is a processor with 16 bit registers and a 20 bit address space. To cope with the lack of bits in its registers, a set of segment registers exists. On memory access, the dereferenced address is computed like this:
address = 16 * segment + register
Notice that among other things, an address has generally multiple ways to be represented. Comparing two arbitrary addresses is tedious as the compiler has to first normalize both addresses and then compare the normalized addresses.
Many compilers specify (in the memory models where this is possible) that when doing pointer arithmetic, the segment part is to be left untouched. This has several consequences:
objects can have a size of at most 64 kB
all addresses in an object have the same segment part
comparing addresses in an object can be done just by comparing the register part; that can be done in a single instruction
This fast comparison of course only works when the pointers are derived from the same base-address, which is one of the reasons why the C standard defines pointer comparisons only for when both pointers point into the same object.
If you want a well-ordered comparison for all pointers, consider converting the pointers to uintptr_t values first.
I believe it's undefined so that C can be run on architectures where, in effect, "smart pointers" are implemented in hardware, with various checks to ensure that pointers never accidentally point outside of the memory regions they're defined to refer to. I've never personally used such a machine, but the way to think about them is that computing an invalid pointer is precisely as forbidden as dividing by 0; you're likely to get a run-time exception that terminates your program. Furthermore, what's forbidden is computing the pointer, you don't even have to dereference it to get the exception.
Yes, I believe the definition also ended up permitting more-efficient comparisons of offset registers in old 8086 code, but that was not the only reason.
Yes, a compiler for one of these protected pointer architectures could theoretically implement the "forbidden" comparisons by converting to unsigned or the equivalent, but (a) it would likely be significantly less efficient to do so and (b) that would be a wantonly deliberate circumvention of the architecture's intended protection, protection which at least some of the architecture's C programmers would presumably want to have enabled (not disabled).
Historically, saying that action invoked Undefined Behavior meant that any program which made use of such actions could be expected to correctly only on those implementations which defined, for that action, behavior meeting their requirements. Specifying that an action invoked Undefined Behavior didn't mean that programs using such action should be considered "illegitimate", but was rather intended to allow C to be used to run programs that didn't require such actions, on platforms which could not efficiently support them.
Generally, the expectation was that a compiler would either output the sequence of instructions which would most efficiently perform the indicated action in the cases required by the standard, and do whatever that sequence of instructions happened to do in other cases, or would output a sequence of instructions whose behavior in such cases was deemed to be in some fashion more "useful" than the natural sequence. In cases where an action might trigger a hardware trap, or where triggering an OS trap might plausibly in some cases be considered preferable to executing the "natural" sequence of instructions, and where a trap might cause behaviors outside the control of the C compiler, the Standard imposes no requirements. Such cases are thus labeled as "Undefined Behavior".
As others have noted, there are some platforms where p1 < p2, for unrelated pointers p1 and p2, could be guaranteed to yield 0 or 1, but where the most efficient means of comparing p1 and p2 that would work in the cases defined by the Standard might not uphold the usual expectation that p1 < p2 || p2 > p2 || p1 != p2. If a program written for such a platform knows that it will never deliberately compare unrelated pointers (implying that any such comparison would represent a program bug) it may be helpful to have stress-testing or troubleshooting builds generate code which traps on any such comparisons. The only way for the Standard to allow such implementations is to make such comparisons Undefined Behavior.
Until recently, the fact that a particular action would invoke behavior that was not defined by the Standard would generally only pose difficulties for people trying to write code on platforms where the action would have undesirable consequences. Further, on platforms where an action could only have undesirable consequences if a compiler went out of its way to make it do so, it was generally accepted practice for programmers to rely upon such an action behaving sensibly.
If one accepts the notions that:
The authors of the Standard expected that comparisons between unrelated pointers would work usefully on those platforms, and only those platforms, where the most natural means of comparing related pointers would also work with unrelated ones, and
There exist platforms where comparing unrelated pointers would be problematic
Then it makes complete sense for the Standard to regard unrelated-pointer comparisons as Undefined Behavior. Had they anticipated that even compilers for platforms which define a disjoint global ranking for all pointers might make unrelated-pointer comparisons negate the laws of time and causality (e.g. given:
int needle_in_haystack(char const *hs_base, int hs_size, char *needle)
{ return needle >= hs_base && needle < hs_base+hs_size; }
a compiler may infer that the program will never receive any input which would cause needle_in_haystack to be given unrelated pointers, and any code which would only be relevant when the program receives such input may be eliminated) I think they would have specified things differently. Compiler writers would probably argue that the proper way to write needle_in_haystack would be:
int needle_in_haystack(char const *hs_base, int hs_size, char *needle)
{
for (int i=0; i<size; i++)
if (hs_base+i == needle) return 1;
return 0;
}
since their compilers would recognize what the loop is doing and also recognize that it's running on a platform where unrelated pointer comparisons work, and thus generate the same machine code as older compilers would have generated for the earlier-stated formulation. As to whether it would be better to require compilers provide a means of specifying that code resembling the former version should either sensibly on platforms that will support it or refuse compilation on those that won't, or better to require that programmers intending the former semantics should write the latter and hope that optimizers turn it into something useful, I leave that to the reader's judgment.

What C compilers have pointer subtraction underflows?

So, as I learned from Michael Burr's comments to this answer, the C standard doesn't support integer subtraction from pointers past the first element in an array (which I suppose includes any allocated memory).
From section 6.5.6 of the combined C99 + TC1 + TC2 (pdf):
If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined.
I love pointer arithmetic, but this has never been something I've worried about before. I've always assumed that given:
int a[1];
int * b = a - 3;
int * c = b + 3;
That c == a.
So while I believe I've done that sort of thing before, and not gotten bitten, it must have been due to the kindness of the various compilers I've worked with - that they've gone above and beyond what the standards require to make pointer arithmetic work the way I thought it did.
So my question is, how common is that? Are there commonly used compilers that don't do that kindness for me? Is proper pointer arithmetic beyond the bounds of an array a defacto standard?
MSDOS FAR pointers had problems like this, which were usually covered over by "clever" use of the overlap of the segment register with the offset register in real-mode. The effect there was that the 16-bit segment was a shifted left 4 bits, and added to the 16-bit offset which gave a 20-bit physical address that could address 1MB, which was plenty because everyone knew that noone would ever need as much as 640KB of RAM. ;-)
In protected mode, the segment register was actually an index into a table of memory descriptors. A typical DOS extending runtime would usually arrange things so that many segments could be treated just like they would have been in real mode, which made porting code from real mode easy. But it had some defects. Primarily, the segment before an allocation was not part of the allocation, and so its descriptor might not even be valid.
On the 80286 in protected mode, just loading a segment register with a value that would cause an invalid descriptor to load would cause an exception, whether or not the descriptor was actually used to refer to memory.
A similar issue potentially occurs at one byte past the allocation. The last ++ on the pointer might have carried over to the segment register, causing it to load a new descriptor. In this case, it is reasonable to expect that the memory allocator could arrange for one safe descriptor past the end of the allocated range, but it would be unreasonable to expect it to arrange for any more than that.
This is not "implementation defined" by the Standard, this is "undefined" by the Standard. Which means that you can't count on a compiler supporting it, you can't say, "well, this code is safe on compiler X". By invoking undefined behavior, your program is undefined.
The practical answer isn't "how (where, when, on what compiler) can I get away with this"; the practical answer is "don't do this".
Another reason is that there are optional conservative garbage collectors (like the boehm-weiser GC) that assume a pointer is always inside the allocated range and if not they are allowed to free the memory at any time.
There is one popular commercial quality and used library that does break this assumption and it is the Judy Trees Library from HP which uses pointer algorithms to implement a very complex hash structure.
ZETA-C for the TI Explorer; pointers are implemented as arrays and indexes or displaced arrays, IIRC, so your example probably wouldn't work. Start from zcprim>pointer-subtract in zcprim.lisp to figure out what the behavior would be. No idea whether this was correct per the standard, but I get the impression that it was.

Resources