C standard compliant way to access null pointer address?

In C, dereferencing the null pointer is undefined behavior. However, the null pointer value has a bit representation that on some architectures makes it point to a valid address (e.g. the address 0).
Let's call this address the null pointer address, for the sake of clarity.
Suppose I want to write a piece of software in C, in an environment with unrestrained access to memory. Suppose further I want to write some data at the null pointer address: how would I achieve that in a standard compliant way?
Example case (IA32e):
#include <stdint.h>
int main()
{
    uintptr_t zero = 0;
    char* p = (char*)zero;
    return *p;
}
This code when compiled with gcc with -O3 for IA32e gets transformed into
movzx eax, BYTE PTR [0]
ud2
due to UB (0 is the bit representation of the null pointer).
Since C is close to low level programming, I believe there must be a way to access the null pointer address and avoid UB.
Just to be clear
I'm asking about what the standard has to say about this, NOT how to achieve this in an implementation-defined way.
I know the answer for the latter.

I read (part of) the C99 standard to clear my mind. I found the sections that are of interest for my own question and I'm writing this as a reference.
DISCLAIMER
I'm an absolute beginner, 90% or more of what I have written is wrong, makes no sense, or may break your toaster. I also try to make a rationale out of the standard, often with disastrous and naive results (as stated in the comments).
Don't read.
Consult @Olaf for a formal and professional answer.
In the following, the term architectural address designates a memory address as seen by the processor (logical, virtual, linear, physical or bus address). In other words, the addresses that you would use in assembly.
In section 6.3.2.3. it reads
An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant.
If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal
to a pointer to any object or function.
and regarding integer to pointer conversion
An integer may be converted to any pointer type. Except as previously specified [i.e. for the case of null pointer constant], the
result is implementation-defined, might not be correctly aligned, might not point to an
entity of the referenced type, and might be a trap representation†.
These imply that the compiler, to be compliant, need only implement a function int2ptr from integers to pointers such that:
int2ptr(0) is, by definition, the null pointer.
Note that int2ptr(0) is not mandated to be 0. It can be any bit representation.
int2ptr(n != 0) has no constraints.
Note that this means that int2ptr need not be the identity function, nor a function that returns valid pointers!
Given the code below
char* p = (char*)241;
The standard makes absolutely no guarantee that the expression *p = 56; will write to the architectural address 241.
And so it gives no direct way to access any other architectural address (including int2ptr(0), the address designated by a null pointer, if valid).
Simply put, the standard does not deal with architectural addresses, but with pointers, their comparisons, conversions and operations‡.
When we write code like char* p = (char*)K we are not telling the compiler to make p point to the architectural address K; we are telling it to make a pointer out of the integer K, or in other words to make p point to the (C abstract) address K.
The null pointer and the (architectural) address 0x0 are not the same (cit.), and the same is true for any other pointer made from the integer K and the (architectural) address K.
For some reason, a childhood heritage, I thought that integer literals in C could be used to express architectural addresses. I was wrong: that only happens to be (sort of) correct in the compilers I was using.
The answer to my own question is simply: There is no standard way because there are no (architectural) addresses in the C standard document. This is true for every (architectural) address, not only the int2ptr(0) one¹.
Note about return *(volatile char*)0;
The standard says that
If an
invalid value [a null pointer value is an invalid value] has been assigned to the pointer, the behavior of the unary * operator is undefined.
and that
Therefore any expression referring
to such an [volatile] object shall be evaluated strictly according to the rules of the abstract machine.
The abstract machine says that * is undefined for null pointer values, so that code shouldn't differ from this one
return *(char*)0;
which is also undefined.
Indeed they don't differ, at least with GCC 4.9, both compile to the instructions stated in my question.
The implementation defined way to access the 0 architectural address is, for GCC, the use of the -fno-isolate-erroneous-paths-dereference flag which produces the "expected" assembly code.
†The mapping functions for converting a pointer to an integer or an integer to a pointer are intended to
be consistent with the addressing structure of the execution environment.
‡Unfortunately it says that the & operator yields the address of its operand. I believe this is a bit improper; I would say that it yields a pointer to its operand. Consider a variable a that is known to reside at address 0xf1 in a 16-bit address space, and consider a compiler that implements int2ptr(n) = 0x8000 | n. &a would yield a pointer whose bit representation is 0x80f1, which is not the address of a.
¹Which was special to me because it was the only one, in my implementations, that couldn't be accessed.

As OP has correctly concluded in her answer to her own question:
There is no standard way because there are no (architectural) address in the C standard document. This is true for every (architectural) address, not only the int2ptr(0) one.
However, a situation where one would want to access memory directly is likely one where a custom linker script is employed. (I.e. some kind of embedded systems stuff.) So I would say, the standard compliant way of doing what OP asks would be to export a symbol for the (architectural) address in the linker script, and not bother with the exact address in the C code itself.
A variation of that scheme would be to define a symbol at address zero and simply use that to derive any other required address. To do that add something like the following to the SECTIONS portion of the linker script (assuming GNU ld syntax):
_memory = 0;
And then in your C code:
extern char _memory[];
Now it is possible to e.g. create a pointer to the zero address using for example char *p = &_memory[0]; (or simply char *p = _memory;), without ever converting an int to a pointer. Similarly, int addr = ...; char *p_addr = &_memory[addr]; will create a pointer to the address addr without technically casting an int to a pointer.
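The indexing idea can be tried on a hosted system by letting an ordinary array stand in for the linker-provided symbol. This is only a sketch: memory_base and ptr_from_offset are hypothetical names, and in a real build memory_base would be declared extern and placed by the linker script rather than defined in C.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the linker-defined symbol (hypothetical name). In a real
   embedded build this would be `extern char memory_base[];`, with the
   symbol placed at address zero by the linker script. */
char memory_base[16];

/* Derive a pointer from an offset by indexing, without any
   integer-to-pointer cast. */
char *ptr_from_offset(size_t offset)
{
    assert(offset < sizeof memory_base); /* only meaningful for the stand-in */
    return &memory_base[offset];
}
```

With the real linker symbol the bounds check would go away, since the array then has no known size in C.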
(This of course avoids the original question, because the linker is independent from the C standard and C compiler, and every linker might have a different syntax for its linker script. Also, the generated code might be less efficient, because the compiler is not aware of the address being accessed. But I think this still adds an interesting perspective to the question, so please forgive the slightly off-topic answer.)

Any solution is necessarily going to be implementation-dependent. ISO C does not describe the environment a C program runs on; rather, it describes what a conforming C program looks like across a variety of environments («data-processing systems»). The Standard indeed cannot guarantee what you would get by accessing an address that is not part of an array of objects, i.e. something you visibly allocated, rather than something belonging to the environment.
Therefore, I would use something the standard leaves as implementation-defined (and even as conditionally-supported) rather than undefined behavior*: Inline assembly. For GCC/clang:
asm volatile("movzbl 0, %%eax" : : : "eax"); // load the byte at address 0 (AT&T syntax, x86)
It is also worth mentioning freestanding environments, the one you seem to be in. The standard says about this execution model (emphasis mine):
§ 5.1.2
Two execution environments are defined: freestanding and hosted. [...]
§ 5.1.2.1, paragraph 1
In a freestanding environment (in which C program execution may take place without any benefit of an operating system), the name and type of the function called at program startup are implementation-defined. Any library facilities available to a freestanding program, other than the minimal set required by clause 4, are implementation-defined. [...]
Notice it doesn't say you can access any address at will.
Whatever that could mean. Things are a bit different when you are the implementation the standard delegates control to.
All quotes are from draft N1570.

The C Standard does not require that implementations have addresses that resemble integers in any way shape or form; all it requires is that if types uintptr_t and intptr_t exist, the act of converting a pointer to uintptr_t or intptr_t will yield a number, and converting that number directly back to the same type as the original pointer will yield a pointer equal to the original.
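A minimal sketch of that round-trip guarantee, assuming the implementation provides uintptr_t (round_trips is a made-up helper name). Note that nothing here inspects the integer value itself, which stays implementation-defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if converting p to uintptr_t and back gives a pointer that
   compares equal to p. The standard promises this for valid pointers to
   void whenever uintptr_t exists. */
int round_trips(void *p)
{
    uintptr_t n = (uintptr_t)p;  /* pointer -> integer, value is implementation-defined */
    void *q = (void *)n;         /* integer -> pointer */
    return q == p;
}
```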
While it is recommended that platforms which use addresses that resemble integers should define conversions between integers and addresses in a fashion that would be unsurprising to someone familiar with such mapping, that is not a requirement, and code relying upon such a recommendation would not be strictly conforming.
Nonetheless, I would suggest that if a quality implementation specifies that it performs integer-to-pointer conversion by a simple bitwise mapping, and if there may be plausible reasons why code would want to access address zero, it should regard statements like:
*((uint32_t volatile*)0) = 0x12345678;
*((uint32_t volatile*)x) = 0x12345678;
as a request to write to address zero and address x, in that order even if
x happens to be zero, and even if the implementation would normally trap on
null pointer accesses. Such behavior isn't "standard", insofar as the
Standard says nothing about the mapping between pointers and integers, but
a good quality implementation should nonetheless behave sensibly.
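As a rough sketch of how such accesses are often wrapped, assuming an implementation that documents a flat integer-to-pointer mapping (write32_at is a hypothetical helper, not something the Standard or any library defines):

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: store `value` at the location whose integer form
   is `addr`. Relies on the implementation mapping integers to pointers as
   plain addresses, which the Standard does not require. */
void write32_at(uintptr_t addr, uint32_t value)
{
    /* Alignment check; only meaningful under the flat-mapping assumption. */
    assert(addr % _Alignof(uint32_t) == 0);
    uint32_t volatile *p = (uint32_t volatile *)(void *)addr;
    *p = value;  /* volatile: the store may not be elided or merged */
}
```

On a hosted system it can be exercised with the address of an ordinary object converted to uintptr_t; whether write32_at(0, ...) does anything sensible is entirely up to the implementation.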

I'm assuming the question you are asking is:
How do I access memory such that a pointer to that memory has the same representation as the null pointer?
According to a literal reading of the Standard, this is not possible. 6.3.2.3/3 says that any pointer to an object must compare unequal to the null pointer.
Therefore this pointer we are talking about must not point to an object. But the dereference operator *, applied to an object pointer, only specifies the behaviour in the case that it points to an object.
Having said that, the object model in C has never been specified rigorously, so I would not put too much weight into the above interpretation. Nevertheless, it seems to me that whatever solution you come up with is going to have to rely on non-standard behaviour from whichever compiler is in use.
We see an example of this in the other answers in which gcc's optimizer detects an all-bits-zero pointer at a late stage of processing and flags it as UB.

Related

Is NULL data-pointer the same as NULL function-pointer?

POSIX requires (I think) that function pointers can be stored in a variable of type void* and/or passed to functions expecting a void* argument, even though this is strictly non-standard.
My question is this: if I test such a variable/argument for NULL-ness, if (!(variable or argument)) say, and the result is true, does that necessarily mean that the function-pointer is NULL? Could the bit pattern for a NULL void* data-pointer ever equate to a non-NULL function-pointer value? Would any sane implementation do this? Do any common implementations do this?
EDIT: this answer (to a different question admittedly) made me wonder if I had to cast the void* intermediate back to the original function pointer type before I could test NULL-ness, otherwise it's UB... is that true? Can those who posted answers weigh in on this question?
C 2018 6.3.2.3 4 says:
Conversion of a null pointer to another pointer type yields a null pointer of that type. Any two null pointers shall compare equal.
This paragraph, unlike paragraph 7, does not limit the conversions to pointers to object types or pointers to function types. Therefore, if a null pointer of some pointer-to-function type is converted to void *, the result is a null pointer, and then applying ! to it yields 1.
Establishing the converse, that if applying ! to a pointer yields 1, it necessarily arose from a null pointer, is more difficult. We could imagine some non-null function pointer that, when converted to void *, yields a null pointer. Considering the intent of POSIX to allow function pointers to be temporarily stored in void *, we can conclude that converting a pointer to a function to void * should never result in a null pointer.
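A small sketch of both directions, assuming a POSIX-like implementation that accepts the function-pointer-to-void * cast (it is an extension, and a pedantic compiler may warn). The names callback_fn, sample_callback and stored_is_null are made up for the illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*callback_fn)(void);  /* hypothetical callback type */

void sample_callback(void) { }

/* Store a function pointer in a void * (a common extension, expected by
   POSIX but not guaranteed by ISO C) and report whether the stored value
   tests as null. */
int stored_is_null(callback_fn fn)
{
    void *slot = (void *)fn;
    return !slot;
}
```

The null case follows from 6.3.2.3 paragraph 4; the non-null case is what one would expect on a POSIX system rather than something ISO C promises.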
Could the bit pattern for a NULL void* data-pointer ever equate to a non-NULL function-pointer value?
The C standard does not discuss the bit patterns used to represent pointers. The semantics are established in terms of the values.
Would any sane implementation do this?
Certainly bare-metal boot code on some hardware might put executable instructions at address zero and call a function there for some reason, and might also use address zero as a null pointer. It will simply be designed so that nothing depends on testing a pointer to that function for being a null pointer.
Outside of such special situations, i.e., for all practical purposes, this is not done. If some software decides it needs a special representation for a null pointer, it will set aside some address for that and not use that address for any ordinary function or object.
This C11 Draft Standard suggests that, while not part of the core standard, casting function pointers to object pointers (for purposes of testing, as in your NULL-check) falls under the category of "Common Extensions" (Annex J.5):
J.5.7 Function pointer casts
1     A pointer to an object or to void may be cast to
a pointer to a function, allowing data to be invoked as a function
(6.5.4).
2     A pointer to a function may be cast to a pointer
to an object or to void, allowing a function to be inspected or
modified (for example, by a debugger) (6.5.4)
If an implementation were to target a platform where the common idiomatic bit pattern for a null code pointer was different from the common idiomatic bit pattern for a null data pointer, it would be free to either expose the fact that the pointers are different to the programmer, or wrap pointer operations in an abstraction that would make them behave identically. The authors of the Standard made no attempt to guess which approach would be more useful, since people who were actually writing code for the platform in question would be better placed than the Committee to judge the pros and cons of each approach.
In the vastly more common situation where a target platform uses the same representation for both kinds of null pointers, the authors of the Standard would have expected that implementations would store both kinds of pointers in that fashion, but would have also regarded the idea that such implementations should behave that way as sufficiently obvious that there was no need to expend ink recommending such behavior.
The Standard's failure to mandate the common behavior does not imply any judgment that implementations should ever do anything else, but merely an acknowledgment that implementations might exist where some other behavior might sometimes be more useful than the commonplace one.

Can I assign any integer value to a pointer variable directly?

Since addresses are numbers and can be assigned to a pointer variable, can I assign any integer value to a pointer variable directly, like this:
int *pPtr = 60000;
You can, but unless you're developing for an embedded device with known memory addresses with a compiler that explicitly allows it, attempting to dereference such a pointer will invoke undefined behavior.
You should only assign the address of a variable or the result of a memory allocation function such as malloc, or NULL.
Yes you can.
According to the pointer conversion rules, e.g. as described in this online C standard draft, any integer may be converted to a pointer value:
6.3.2.3 Pointers
(5) An integer may be converted to any pointer type. Except as
previously specified, the result is implementation-defined, might not
be correctly aligned, might not point to an entity of the referenced
type, and might be a trap representation.
Allowing the conversion does, however, not mean that you are allowed to dereference the pointer then.
You can, but there are a lot of considerations.
1) What does that mean?
The only really useful abstraction when this actually gets used is that you need to access a specific memory location because something is mapped to a specific point, generally hardware control registers (less often: a specific area in flash or from the linker table). The fact that you are assigning 60000 (a decimal number rather than a hexadecimal address or a symbolic mnemonic) makes me quite worried.
2) Do you have "odd" pointers?
Some microcontrollers have pointers with strange semantics (near vs far, tied to a specific memory page, etc.) You may have to do odd things to make the pointer make sense. In addition, some pointers can do strange things depending upon where they point. For example, the PIC32 series can point to the exact same data but with different upper bits that will retrieve a cached copy or an uncached copy.
3) Is that value the correct size for the pointer?
Different architectures need different sizes. The newer data types like intptr_t are meant to paper over this.

Consequences of warning “dereferencing type-punned pointer will break strict-aliasing rules”

I have gone through some queries on the similar topic and some material related to it.
But my query is mainly to understand the warning for the below code. I do not want a fix !!
I understand there are two ways, a union or using memcpy.
uint32 localval;
void * DataPtr;
localval = something;
(*(float32*)(DataPtr))= (*(const float32*)((const void*)(&localval)));
please note the below significant points
1. both the types involved in the cast here are 32 bit. (or am i wrong ?)
2. Both are local variables.
Compiler specific points:
1. The code is supposed to be platform independent, this is a requirement!!
2. I compiled on GCC and it just worked as expected. (I could reinterpret the int as a float) , which is why i ignored the warning.
My questions
1. What optimizations could the compiler perform in this aliasing case ?
2. As both would occupy the same size (correct me if not) what could be the side effects of such a compiler optimization ?
3. Can I safely ignore the warning or turn off aliasing ?
4. If the compiler hasn't performed an optimization and my program is not broken after my first compilation ? Can i safely assume that every time the compiler would behave the same way (does not do optimizations) ?
5. Does the aliasing apply to a void * typecast too ? or is it applicable only for the standard typecasts (int,float etc...) ?
6. what are the effects if I disable the aliasing rules ?
Edited
1. based on R's and Matt McNabb's corrections
2. added a new questions
Language standards try to strike a balance between the sometimes competing interests of programmers that will use the language and compiler writers that want to use a broad set of optimizations to generate reasonably fast code. Keeping variables in registers is one such optimization. For variables that are "live" in a section of a program the compiler tries to allocate them in registers. Storing at the address in a pointer could store anywhere in the program's address space - which would invalidate every single variable in a register. Sometimes the compiler could analyze a program and figure out where a pointer could or could not be pointing, but the C (and C++) language standards consider this an undue burden, and for "system" type of programs often an impossible task. So the language standards relax the constraints by specifying that certain constructs lead to "undefined behavior" so the compiler writer can assume they don't happen and generate better code under that assumption. In the case of strict aliasing the compromise reached is that if you store to memory using one pointer type, then variables of a different type are assumed to be unchanged, and thus can be kept in registers, or stores and loads to these other types can be reordered with respect to the pointer store.
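A tiny sketch of the kind of assumption this permits (store_both is a made-up function for illustration, not taken from any of the code discussed here):

```c
#include <assert.h>

/* Because *i and *f have incompatible types, the compiler may assume the
   store through f cannot modify *i, so it is free to return the constant
   1 without reloading *i from memory. */
int store_both(int *i, float *f)
{
    *i = 1;
    *f = 0.0f;
    return *i;
}
```

Called with pointers to two distinct objects this is well defined and returns 1. If a caller forces both pointers to refer to the same storage through a cast, the behavior is undefined, and an optimized build may still return 1 even though the memory now holds the bits of 0.0f.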
There are many examples of these kinds of optimizations in the paper "Undefined Behavior: What Happened to My Code?"
http://pdos.csail.mit.edu/papers/ub:apsys12.pdf
There is an example there of a violation of the strict-aliasing rule in the Linux kernel. Apparently the kernel avoids the problem by telling the compiler not to make use of the strict-aliasing rule for optimizations: "The Linux kernel uses -fno-strict-aliasing to disable optimizations based on strict aliasing."
struct iw_event {
    uint16_t len; /* Real length of this stuff */
    ...
};
static inline char * iwe_stream_add_event(
    char * stream, /* Stream of events */
    char * ends, /* End of stream */
    struct iw_event *iwe, /* Payload */
    int event_len ) /* Size of payload */
{
    /* Check if it's possible */
    if (likely((stream + event_len) < ends)) {
        iwe->len = event_len;
        memcpy(stream, (char *) iwe, event_len);
        stream += event_len;
    }
    return stream;
}
Figure 7: A strict aliasing violation, in include/net/iw_handler.h of the
Linux kernel, which uses GCC’s -fno-strict-aliasing to prevent possible
reordering.
2.6 Type-Punned Pointer Dereference
C gives programmers the freedom to cast pointers of one type
to another. Pointer casts are often abused to reinterpret a given
object with a different type, a trick known as type-punning. By
doing so, the programmer expects that two pointers of different
types point to the same memory location (i.e., aliasing).
However, the C standard has strict rules for aliasing. In
particular, with only a few exceptions, two pointers of different
types do not alias [19, 6.5]. Violating strict aliasing leads to
undefined behavior.
Figure 7 shows an example from the Linux kernel. The
function first updates iwe->len, and then copies the content of
iwe, which contains the updated iwe->len, to a buffer stream
using memcpy. Note that the Linux kernel provides its own optimized memcpy implementation. In this case, when event_len
is a constant 8 on 32-bit systems, the code expands as follows.
iwe->len = 8;
*(int *)stream = *(int *)((char *)iwe);
*((int *)stream + 1) = *((int *)((char *)iwe) + 1);
The expanded code first writes 8 to iwe->len, which is of
type uint16_t, and then reads iwe, which points to the same
memory location of iwe->len, using a different type int. According to the strict aliasing rule, GCC concludes that the read
and the write do not happen at the same memory location,
because they use different pointer types, and reorders the two
operations. The generated code thus copies a stale iwe->len
value. The Linux kernel uses -fno-strict-aliasing to disable optimizations based on strict aliasing.
Answers
1) What optimizations could the compiler perform in this aliasing case ?
The language standard is very specific about the semantics (behavior) of a strictly conforming program - the burden is on the compiler writer or language implementor to get it right. Once the programmer crosses the line and invokes undefined behavior then the standard is clear that the burden of proof that this will work as intended falls on the programmer, not on the compiler writer - the compiler in this case has been nice enough to warn that undefined behavior has been invoked although it is under no obligation to even do that. Sometimes annoyingly people will tell you that at this point "anything can happen" usually followed by some joke/exaggeration. In the case of your program the compiler could generate code that is "typical for the platform" and store to localval the value of something and then load from localval and store at DataPtr, like you intended, but understand that it is under no obligation to do so. It sees the store to localval as a store to something of uint32 type and it sees the dereference of the load from (*(const float32*)((const void*)(&localval))) as a load from a float32 type and concludes these aren't to the same location so localval can be in a register containing something while it loads from an uninitialized location on the stack reserved for localval should it decide it needs to "spill" that register back to its reserved "automatic" storage (stack). It may or may not store localval to memory before dereferencing the pointer and loading from memory. Depending on what follows in your code it may decide that localval isn't used and the assignment of something has no side-effect, so it may decide that assignment is "dead code" and not even do the assignment to a register.
2) As both would occupy the same size (correct me if not) what could be the side effects of such a compiler optimization ?
The effect could be that an undefined value is stored at the address pointed to by DataPtr.
3) Can I safely ignore the warning or turn off aliasing ?
That is specific to the compiler you are using - if the compiler documents a way to turn off the strict aliasing optimizations then yes, with whatever caveats the compiler makes.
4) If the compiler hasn't performed an optimization and my program is not broken after my first compilation ? Can i safely assume that every time the compiler would behave the same way (does not do optimizations) ?
Maybe, sometimes very small changes in another part of your program could change what the compiler does to this code, think for a moment if the function is "inlined" it could be thrown in the mix of some other part of your code, see this SO question.
5) Does the aliasing apply to a void * typecast too ? or is it applicable only for the standard typecasts (int,float etc...) ?
You cannot dereference a void * so the compiler just cares about the type of your final cast (and in C++ it would gripe if you convert a const to non-const and vice-versa).
6) what are the effects if I disable the aliasing rules ?
See your compiler's documentation - in general you will get slower code, if you do this (like the Linux kernel chose to do in the example from the paper above) then limit this to a small compilation unit, with only the functions where this is necessary.
Conclusion
I understand your questions are for curiosity and trying to better understand how this works (or might not work). You mentioned it is a requirement that the code be portable; by implication it is then a requirement that the program be compliant and not invoke undefined behavior (remember, the burden is on you if you do). In this case, as you pointed out in the question, one solution is to use memcpy. As it turns out, not only does that make your code compliant and therefore portable, it also does what you intend in the most efficient way possible: on current gcc with optimization level -O3, the compiler converts the memcpy into a single instruction storing the value of localval at the address pointed to by DataPtr. See it live in coliru here and look for the movl %esi, (%rdi) instruction.
You have an incomplete example (as written, it exhibits UB since localval is uninitialized) so let me complete it:
uint32 localval;
void * DataPtr;
DataPtr = something;
localval = 42;
(*(float32*)(DataPtr))= (*(const float32*)((const void*)(&localval)));
Now, since localval has type uint32 and *(const float32*)((const void*)(&localval)) has type float32, they cannot alias, so the compiler is free to reorder the last two statements with respect to each other. This would obviously result in behavior different from what you want.
The correct way to write this is:
memcpy(DataPtr, &localval, sizeof localval);
The const makes no difference. To check if the types are the same size, you can compare sizeof (uint32) to sizeof (float32). It's also possible that the two types have differing alignment requirements.
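For illustration, here is the memcpy approach wrapped in a function with the size check done at compile time. It uses the standard uint32_t and float in place of the question's uint32 and float32 typedefs, and float_from_bits is a made-up name; the expected values mentioned afterwards additionally assume IEEE 754 single precision.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Refuse to compile if the two types differ in size. */
static_assert(sizeof(float) == sizeof(uint32_t), "float must be 32 bits wide");

/* Reinterpret the bits of a 32-bit unsigned integer as a float without
   violating the aliasing rules: memcpy copies bytes, so no object is
   accessed through an lvalue of the wrong type. */
float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

On an IEEE 754 platform, float_from_bits(0x3F800000u) gives 1.0f.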
Those things aside; the behaviour is undefined to read the memory of localval as if it had a float stored in it, that's what the strict aliasing rules say.
6.5#6:
The effective type of an object for an access to its stored value is the declared type of the
object, if any.
6.5#7:
An object shall have its stored value accessed only by an lvalue expression that has one of
the following types
localval has effective type uint32 , and the list of "the following types" doesn't include float32 so this is a violation of the aliasing rules.
If you were aliasing in dynamically allocated memory, then it is different. There's no "declared type", so the "effective type" is whatever was last stored in the object. You could malloc(sizeof (uint32)), and then store a float32 in it and read it back.
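A minimal sketch of that allocated-storage case, again with uint32_t and float standing in for the question's typedefs (reuse_allocation is a hypothetical name):

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocated storage has no declared type, so each store sets the
   effective type for the reads that follow. Returns the float read back,
   or -1.0f if the allocation fails. */
float reuse_allocation(void)
{
    size_t size = sizeof(uint32_t) > sizeof(float) ? sizeof(uint32_t)
                                                   : sizeof(float);
    void *mem = malloc(size);
    if (mem == NULL)
        return -1.0f;

    *(uint32_t *)mem = 42u;        /* effective type is now uint32_t */
    *(float *)mem = 3.5f;          /* this store changes it to float */
    float result = *(float *)mem;  /* reading it as float is allowed */

    free(mem);
    return result;
}
```

Reading the storage as uint32_t after the float store would again be an aliasing violation.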
To sum up, you seem to be asking "I know this is undefined, but can I rely on my compiler successfully doing it?" To answer that question you will have to specify what your compiler is, and what switches you are invoking it with, at least.
Of course there is also the option of adjusting your code so it does not violate the strict-aliasing rules, but you haven't provided enough background info to proceed down this track.

How to store an integer in a location pointed to by a char*

This might be a very basic question that is already asked but I was not quite sure if the answer here Casting an int pointer to a char ptr and vice versa is applicable in my case.
So essentially I have something as follows:
void* head = sbrk(1024); //allocate 1024 bytes in heap
*((int*)(head+size)) = value; //value and size are int with values between 1 and 1023
I would like to know if for an arbitrary value of size the above does not work then what are the restrictions on the value of size? Does it have to be divisible by 4?
First of all, you can't do pointer arithmetic on void pointers. That code should not even compile.
For the sake of discussion, let us assume that you have a char pointer instead. Then formally, such casts followed by an access is undefined behavior. In the real world however, your code will always work if you can manually ensure alignment. You will have to ensure that the address where you write is at an aligned memory position, or there are no guarantees that the code will work.
EDIT with relevant quotes from the ISO 9899:2011 standard why pointer arithmetic on a void pointer is undefined behavior:
6.3.2.2 void
The (nonexistent) value of a void expression (an expression that has
type void) shall not be used in any way, and implicit or explicit
conversions (except to void) shall not be applied to such an
expression.
.
6.5.6 Additive operators
/--/
For addition, either both operands shall have arithmetic type, or one
operand shall be a pointer to a complete object type and the other
shall have integer type. (Incrementing is equivalent to adding 1.)
.
4 Conformance
If a ‘‘shall’’ or ‘‘shall not’’ requirement that appears outside of a
constraint or runtime-constraint is violated, the behavior is
undefined. Undefined behavior is otherwise indicated in this
International Standard by the words ‘‘undefined behavior’’ or by the
omission of any explicit definition of behavior. There is no
difference in emphasis among these three; they all describe ‘‘behavior
that is undefined’’.
Whether code violating normative text in the standard "should compile" or not can certainly be debated, but I don't think that discussion is of benefit to the OP. Simply don't write code relying on undefined behavior, ever.
Use memcpy():
memcpy((char*)head + size, &value, sizeof(value));
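A slightly fuller sketch of the same idea, with a matching read. store_int and load_int are hypothetical helper names; the caller is assumed to guarantee that offset + sizeof(int) stays within the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy an int into a byte buffer at any offset. No alignment is needed
   because memcpy works on bytes. */
void store_int(void *base, size_t offset, int value)
{
    memcpy((char *)base + offset, &value, sizeof value);
}

/* Read an int back from any offset in the buffer. */
int load_int(const void *base, size_t offset)
{
    int value;
    memcpy(&value, (const char *)base + offset, sizeof value);
    return value;
}
```

For example, store_int(buf, 3, 12345) followed by load_int(buf, 3) returns 12345 even though offset 3 is not a multiple of four.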
On many systems, in this circumstance, it is required that size be a multiple of four (subject to additional conditions detailed below, including that the size of int be four bytes on your system). On systems that do not require this, it is usually preferred.
First, the type of head is void *, and the C standard does not define what happens when you do pointer arithmetic with void *.
Some compilers, notably GCC and its heirs, will treat this arithmetic as if the type were char *. I will proceed on this basis.
Second, I am not aware of a guarantee that sbrk returns an address with any particular alignment.
Let us suppose that sbrk does return a well-aligned address, and that your C implementation does the plain thing to evaluate * (int *) (head + size) = value, which is to issue a store instruction to write the value of value (converted to an int) to the address head + size.
Then your question becomes: What does my computing platform do with an int store to this address?
As long as head + size is an address suitably aligned for int on your platform, the store will execute as expected. On most platforms, four-byte integers prefer four-byte alignment, and eight-byte integers prefer eight-byte alignment. As long as head is aligned to a multiple of this preference and size is a multiple of this preference, then the store will execute normally.
Otherwise, what happens depends on your platform. On some platforms, the hardware executes the store but may do it more slowly than normal store instructions, because it breaks it into two separate writes to memory. (This also means that other processes sharing the same memory might be able to read memory while one part of the value has been stored but the other part has not. Again, this depends on the characteristics of your computing platform.)
On some platforms, the hardware signals an exception that interrupts program execution and transfers control to the operating system. Some operating systems fix up misaligned stores by analyzing the failing instruction and executing alternate instructions that perform the intended store (or the operating system relays the exception to special code in your program, possibly in automatically included libraries, that do this fix-up work). On these platforms, misaligned stores will be very slow; they can hugely degrade the performance of a program.
On some platforms, the hardware signals an exception, and the operating system does not fix up the misaligned store. Instead, the operating system either terminates your process or sends it a signal about the problem, which often results in your process terminating. (Other possibilities include triggering a debugger or entering special code you have included in your program to handle signals.)
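The alignment condition described above can be checked before the store. This is only a sketch: the function name is hypothetical, it assumes C11 (_Alignof), and it assumes that converting a pointer to uintptr_t yields its numeric address, which the standard leaves implementation-defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if head + size is suitably aligned for int.
 * Assumption: the uintptr_t value of a pointer is its address. This holds
 * on common flat-memory platforms but is not guaranteed by the standard. */
static int is_int_aligned(const void *head, size_t size)
{
    uintptr_t addr = (uintptr_t)((const char *)head + size);
    return addr % _Alignof(int) == 0;
}
```

If the check fails, copying the bytes with memcpy() instead of storing through an int * avoids the misaligned access altogether.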

How to implement memmove in standard C without an intermediate copy?

From the man page on my system:
void *memmove(void *dst, const void *src, size_t len);
DESCRIPTION
The memmove() function copies len bytes from string src to string dst.
The two strings may overlap; the copy is always done in a non-destructive
manner.
From the C99 standard:
6.5.8.5 When two pointers are compared, the result depends on the
relative locations in the address
space of the objects pointed to. If
two pointers to object or incomplete
types both point to the same object,
or both point one past the last
element of the same array object,
they compare equal. If the objects
pointed to are members of the same
aggregate object, pointers to
structure members declared later
compare greater than pointers to
members declared earlier in the
structure, and pointers to array
elements with larger subscript values
compare greater than pointers to
elements of the same array with lower
subscript values. All pointers to
members of the same union object
compare equal. If the expression P
points to an element of an array
object and the expression Q points to
the last element of the same array
object, the pointer expression Q+1
compares greater than P. In all
other cases, the behavior is
undefined.
The emphasis is mine.
The arguments dst and src can be converted to pointers to char so as to alleviate strict aliasing problems, but is it possible to compare two pointers that may point inside different blocks, so as to do the copy in the correct order in case they point inside the same block?
The obvious solution is if (src < dst), but that is undefined if src and dst point to different blocks. "Undefined" means you should not even assume that the condition returns 0 or 1 (this would have been called "unspecified" in the standard's vocabulary).
An alternative is if ((uintptr_t)src < (uintptr_t)dst), which is at least unspecified, but I am not sure that the standard guarantees that when src < dst is defined, it is equivalent to (uintptr_t)src < (uintptr_t)dst. Pointer comparison is defined from pointer arithmetic. For instance, when I read section 6.5.6 on addition, it seems to me that pointer arithmetic could go in the direction opposite to uintptr_t arithmetic, that is, that a compliant compiler might have, when p is of type char*:
((uintptr_t)p)+1==((uintptr_t)(p-1))
This is only an example. Generally speaking very little seems to be guaranteed when converting pointers to integers.
This is a purely academic question, because memmove is provided together with the compiler. In practice, the compiler authors can simply promote undefined pointer comparison to unspecified behavior, or use the relevant pragma to force their compiler to compile their memmove correctly. For instance, this implementation has this snippet:
if ((uintptr_t)dst < (uintptr_t)src) {
/*
* As author/maintainer of libc, take advantage of the
* fact that we know memcpy copies forwards.
*/
return memcpy(dst, src, len);
}
I would still like to use this example as proof that the standard goes too far with undefined behaviors, if it is true that memmove cannot be implemented efficiently in standard C. For instance, no-one ticked when answering this SO question.
I think you're right, it's not possible to implement memmove efficiently in standard C.
The only truly portable way to test whether the regions overlap, I think, is something like this:
for (size_t l = 0; l < len; ++l) {
if ((src + l == dst) || (src + l == dst + len - 1)) {
// they overlap, so now we can use comparison,
// and copy forwards or backwards as appropriate.
...
return dst;
}
}
// No overlap, doesn't matter which direction we copy
return memcpy(dst, src, len);
You can't implement either memcpy or memmove all that efficiently in portable code, because the platform-specific implementation is likely to kick your butt whatever you do. But a portable memcpy at least looks plausible.
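A fuller sketch of the equality-scan idea is given here as a variation on the loop above, not as that answer's exact code. The name portable_memmove is hypothetical. It relies only on ==, which is defined for any two pointers, and it checks both directions of overlap: whether dst falls inside the source region, or src falls inside the destination region.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical name. Chooses the copy direction using only equality
 * comparisons; costs an extra O(len) scan before the copy. */
void *portable_memmove(void *dst_v, const void *src_v, size_t len)
{
    unsigned char *dst = dst_v;
    const unsigned char *src = src_v;

    if (len == 0 || dst == src)
        return dst_v;

    for (size_t l = 1; l < len; ++l) {
        if (src + l == dst) {
            /* dst starts inside the source: copy from the end backwards. */
            for (size_t i = len; i > 0; --i)
                dst[i - 1] = src[i - 1];
            return dst_v;
        }
        if (dst + l == src) {
            /* src starts inside the destination: copy forwards. */
            for (size_t i = 0; i < len; ++i)
                dst[i] = src[i];
            return dst_v;
        }
    }

    /* No overlap found, so the direction does not matter. */
    return memcpy(dst_v, src_v, len);
}
```

Every pointer compared points at an actual element rather than one past the end, so an equal result can only mean that both regions belong to the same object.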
C++ introduced a pointer specialization of std::less, which is defined to work for any two pointers of the same type. It might in theory be slower than <, but obviously on a non-segmented architecture it isn't.
C has no such thing, so in a sense, the C++ standard agrees with you that C doesn't have enough defined behaviour. But then, C++ needs it for std::map and so on. It's much more likely that you'd want to implement std::map (or something like it) without knowledge of the implementation than that you'd want to implement memmove (or something like it) without knowledge of the implementation.
For two memory areas to be valid and overlapping, I believe you would need to be in one of the defined situations of 6.5.8.5. That is, two areas of an array, union, struct, etc.
The reason other situations are undefined is that two different objects might not even be in the same kind of memory, with the same kind of pointer. On PC architectures, addresses are usually just 32-bit addresses into virtual memory, but C supports all kinds of bizarre architectures, where memory is nothing like that.
The reason that C leaves things undefined is to give leeway to the compiler writers when the situation doesn't need to be defined. The way to read 6.5.8.5 is a paragraph carefully describing architectures that C wants to support where pointer comparison doesn't make sense unless it's inside the same object.
Also, the reason memmove and memcpy are provided by the compiler is that they are sometimes written in tuned assembly for the target CPU, using a specialized instruction. They are not meant to be able to be implemented in C with the same efficiency.
For starters, the C standard is notorious for having problems in the details like this. Part of the problem is because C is used on multiple platforms and the standard attempts to be abstract enough to cover all current and future platforms (which might use some convoluted memory layout that's beyond anything we've ever seen). There is a lot of undefined or implementation-specific behavior in order for compiler writers to "do the right thing" for the target platform. Including details for every platform would be impractical (and constantly out-of-date); instead, the C standard leaves it up to the compiler writer to document what happens in these cases. "Unspecified" behavior only means that the C standard doesn't specify what happens, not necessarily that the outcome cannot be predicted. The outcome is usually still predictable if you read the documentation for your target platform and your compiler.
Since determining whether two pointers point to the same block, memory segment, or address space depends on how the memory for that platform is laid out, the spec does not define a way to make that determination. It assumes that the compiler knows how to make this determination. The part of the spec you quoted says that the result of a pointer comparison depends on the pointers' "relative locations in the address space". Notice that "address space" is singular here. This section is only referring to pointers that are in the same address space; that is, pointers that are directly comparable. If the pointers are in different address spaces, then the result is undefined by the C standard and is instead defined by the requirements of the target platform.
In the case of memmove, the implementor generally determines first if the addresses are directly comparable. If not, then the rest of the function is platform-specific. Most of the time, being in different memory spaces is enough to ensure that the regions don't overlap and the function turns into a memcpy. If the addresses are directly comparable, then it's just a simple byte copy process starting from the first byte and going forward or from the last byte and going backwards (whichever one will safely copy the data without clobbering anything).
All in all, the C standard leaves a lot intentionally unspecified where it can't write a simple rule that works on any target platform. However, the standard writers could have done a better job explaining why some things are not defined and used more descriptive terms like "architecture-dependent".
Here's another idea, but I don't know if it's correct. To avoid the O(len) loop in Steve's answer, one could put that loop in the #else clause of an #ifdef UINTPTR_MAX and use the cast-to-uintptr_t implementation in the main branch. Provided that the cast of unsigned char * to uintptr_t commutes with adding integer offsets whenever the offset is valid for the pointer, this makes the pointer comparison well-defined.
I'm not sure whether this commutativity is defined by the standard, but it would make sense, as it works even if only the lower bits of a pointer are an actual numeric address and the upper bits are some sort of black box.
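The commutativity property can at least be probed at run time on a given implementation. The sketch below uses a hypothetical function name and assumes uintptr_t exists (it is an optional type). A passing result says something about one buffer on one platform only, not about what the standard guarantees.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical check: returns 1 if (uintptr_t)(p + i) == (uintptr_t)p + i
 * for every offset i in [0, len], and 0 otherwise. The pointer p must
 * point to the start of an object of at least len bytes, so that p + len
 * is at most one past the end. */
int cast_commutes_with_offset(const unsigned char *p, size_t len)
{
    uintptr_t base = (uintptr_t)p;

    for (size_t i = 0; i <= len; ++i) {
        if ((uintptr_t)(p + i) != base + i)
            return 0;
    }
    return 1;
}
```

On flat-memory implementations this is expected to return 1, while an implementation with segmented or tagged pointers could legitimately return 0.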
I would still like to use this example as proof that the standard goes too far with undefined behaviors, if it is true that memmove cannot be implemented efficiently in standard C
But it's not proof. There's absolutely no way to guarantee that you can compare two arbitrary pointers on an arbitrary machine architecture. The behaviour of such a pointer comparison cannot be legislated by the C standard or even a compiler. I could imagine a machine with a segmented architecture that might produce a different result depending on how the segments are organised in RAM or might even choose to throw an exception when pointers into different segments are compared. This is why the behaviour is "undefined". The exact same program on the exact same machine might give different results from run to run.
The oft-given "solution" of memmove() using the relationship of the two pointers to choose whether to copy from the beginning to the end or from the end to the beginning only works if all memory blocks are allocated from the same address space. Fortunately, this is usually the case, although it wasn't in the days of 16-bit x86 code.
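For reference, that direction-based approach looks roughly like this. It is a sketch under a loudly stated assumption: a single flat address space in which converting pointers to uintptr_t preserves their order. The name flat_memmove is hypothetical, and the code is not portable to the segmented machines described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name. Assumption: one flat address space, where the
 * uintptr_t values of two pointers order the same way as their addresses. */
void *flat_memmove(void *dst_v, const void *src_v, size_t len)
{
    unsigned char *dst = dst_v;
    const unsigned char *src = src_v;

    if ((uintptr_t)dst < (uintptr_t)src) {
        /* Destination is lower: copy from the beginning to the end. */
        for (size_t i = 0; i < len; ++i)
            dst[i] = src[i];
    } else {
        /* Destination is higher (or equal): copy from the end backwards. */
        for (size_t i = len; i > 0; --i)
            dst[i - 1] = src[i - 1];
    }
    return dst_v;
}
```

When the regions do not overlap, either branch gives the right result, so the comparison only matters in the overlapping case, where both pointers refer to the same object.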

Resources