If dot_product is declared as
float dot_product(const float* restrict a, const float* restrict b, unsigned n);
would calling it with
dot_product(x, x, x_len)
be "undefined", according to the C99 standard?
Edit
x is a pointer, of course, pointing to sizeof(float) * x_len bytes of memory, x_len is unsigned. This question is about aliasing.
I do not have the original C99 (that is, ISO9899:1999) text; I only have a copy of ISO9899:2007:TC3. I expect this text, taken from page 111 of that document, is very similar to the text in the C99 standard.
6.7.3.1 Formal definition of restrict
...
10. EXAMPLE 3
The function parameter declarations
void h(int n, int * restrict p, int * restrict q, int * restrict r)
{
int i;
for (i = 0; i < n; i++)
p[i] = q[i] + r[i];
}
illustrate how an unmodified object can be aliased through two restricted
pointers. In particular, if a and b are disjoint arrays, a call of the form
h(100, a, b, b) has defined behavior, because array b is not modified within
function h.
This seems to clearly call out functions of the form you asked about as having defined behavior, provided the aliased pointers are used for read-only access. Writing through either of the aliased pointers would invoke undefined behavior.
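As a minimal sketch of the read-only case, assuming a straightforward loop body for dot_product (the question only gives the declaration, so the definition and the squared_norm helper below are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical definition matching the declaration in the question.
   Both arrays are only read; nothing is modified through a or b. */
float dot_product(const float *restrict a, const float *restrict b, unsigned n)
{
    float sum = 0.0f;
    for (unsigned i = 0; i < n; i++)
        sum += a[i] * b[i];
    return sum;
}

/* Hypothetical helper: passes the same array for both restricted parameters.
   Because the array is never modified inside dot_product, the restrict
   requirements of 6.7.3.1 are not triggered. */
float squared_norm(const float *x, unsigned n)
{
    assert(x != NULL || n == 0);
    return dot_product(x, x, n);
}
```

If dot_product ever wrote through one of its parameters, the same call would no longer be safe.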
First, I don't think the call itself is UB. UB can only occur inside the function, if the pointers passed as parameters are used in a way that conflicts with the specification of restrict. (UB would not make much sense for the call: if that (w/sh)ould be forbidden, it should have been made a constraint violation and not UB.)
Then, UB related to restrict can only appear, if the pointed to object is modified "in any way". So as long as your vectors aren't modified, everything is fine. Inside your function this shouldn't happen, because of the const qualification. And if something outside (say a different thread or a signal handler) modifies your vectors, you are screwed, anyhow.
Yes. It will invoke undefined behavior.
If the restrict keyword is used and the function is declared as :
float dot_product(const float* restrict a, const float* restrict b, unsigned n);
then the compiler is allowed to assume that a and b point to different locations and updating one pointer will not affect the other pointers. The programmer, not the compiler, is responsible for ensuring that the pointers do not point to identical locations.
Since your function call is
dot_product(x, x, x_len)
which passes the same pointer x for both parameters, writing through either a or b would affect the other, causing undefined behavior.
Note, I would not write code like this. I'm just curious, and it would help me writing a better answer for another question. But let's say that we have this function:
void foo(int a, int *b)
{
*b = 2*a;
}
And call it like this:
int x=42;
foo(x, &x);
Apart from the fact that it is a very strong code smell, can this cause any real problems? Is it UB or does it violate any rules in the C standard?
This
int x=42;
foo(x, &x);
is well-formed code. The order of evaluation of the arguments is not important in this case.
In fact the function call is equivalent to
foo( 42, &x);
because the first argument is passed by value.
The code is OK, but it is worth digging a bit deeper to see which special cases exist, which of them are fine, and which could potentially cause problems:
The order of evaluation of function arguments is unspecified and unsequenced, but it does not matter in this specific case.
Function arguments are evaluated before being passed to a function, and there is a sequence point after the evaluation. This means that all calculations and side effects in the arguments occur before the function is called (but in an unspecified order in relation to each other).
So even (artificial crap) code such as this is actually well-defined:
void foo(int a, int *b)
{
printf("%d\n", a); // prints 42
*b = 2 * *b; // gives 2 * 43 = 86
}
...
int x=42;
foo(x++, &x);
The x++ vs &x is fine since &x is not a value computation of the object. And thanks to the sequence points, the x++ occurs before the calculation inside the function, so that part is also well-defined.
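A small sketch of that sequencing argument, written so the values can be checked rather than printed. The global seen_a and the demo function are hypothetical additions used only to observe what foo receives:

```c
#include <assert.h>

int seen_a; /* hypothetical: records the by-value argument for inspection */

void foo(int a, int *b)
{
    seen_a = a;      /* a is a copy made before x was incremented */
    *b = 2 * *b;     /* the increment has completed before this runs */
}

/* Hypothetical driver: returns the final value of x. */
int demo(void)
{
    int x = 42;
    foo(x++, &x);    /* &x does not read the value of x */
    assert(seen_a == 42);
    return x;        /* 2 * 43 */
}
```

Calling demo() yields 86, matching the reasoning above: the side effect of x++ is complete at the sequence point before the call.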
The parameter a inside the function is of course a local copy of caller-side x, so the function can do what it pleases with that one.
Had it been two pointers pointing at the same object, then they would "overlap" and that's undefined behavior in some cases, depending on what the function does. For example memcpy(&x, &x, sizeof x); is undefined behavior.
There is a sequence point at each ; and also at the end of the function.
For variables declared at file scope, the function must assume that pointer parameters modifying the pointed-at value might modify the file scope variable. So in case of this code:
void foo(int a, int *b)
{
extern int x;
x = 2*a;
printf("%d\n", *b);
}
the compiler must fetch the value of *b after the assignment to x, because it can't assume that b doesn't point at x - the pointer could be an alias of &x. And here's where the various rules of pointer aliasing come in.
Similarly, two pointer parameters of the same type might point at the same object in the caller code and the compiler isn't allowed to assume that they don't, unless we manually add a restrict qualifier to them.
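To illustrate with a sketch (the function name store_then_load is made up for this example): without restrict, the result below depends on whether the caller passes the same address twice, so the compiler has to reload *p after the store through q.

```c
#include <assert.h>

/* Hypothetical example: p and q have the same type and no restrict,
   so they may legally refer to the same int. */
int store_then_load(int *p, int *q)
{
    assert(p && q);
    *p = 1;
    *q = 2;
    return *p;  /* 2 if p == q, otherwise 1 */
}
```

Both call patterns have defined behavior here; adding restrict to p and q would make the aliased call undefined.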
The question is already well answered, but the above would work even without being in a separate function due to evaluation order.
Even the following code would work fine.
void foo(int *a, int *b) {
*b = 2*(*a);
}
int x = 42;
foo(&x, &x);
and x would contain 84 after it executes. The value of the right-hand side of the assignment is computed before the store to the left-hand side takes place (that's why we can do x = x+1; and drive mathematicians crazy :) )
If inside a function I have the following variables:
{
int a;
};
{
int b;
};
Will then &a be equal to &b?
The ISO-C-standard answer is that we cannot tell. Since a and b are not in the same scope, we cannot evaluate an expression which contains both &a and &b. Moreover, the following trick to get around that is undefined behavior:
int *pa, *pb;
{
int a;
pa = &a;
}
{
int b;
pb = &b;
}
// these pointer values are indeterminate; using them is
// undefined behavior.
pa == pb;
But yes, compilers can and do reduce the storage for local variables in that way. It can be important if some of those variables are large-ish arrays. Though formally undefined behavior, the pa == pb comparison will de facto work in just about any compiler, making it possible to investigate the issue, though the best way to be sure is to obtain an assembly language listing of the generated code and read that.
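One way to investigate without touching an indeterminate pointer is to convert each address to an integer while the variable is still alive and compare the integers afterwards. This is only a sketch: it assumes the implementation provides uintptr_t (an optional type), the function name is made up, and the result is implementation-dependent and may change with optimization level.

```c
#include <stdint.h>

/* Hypothetical probe: returns 1 if the two block-scoped variables were
   given the same address, 0 otherwise. Only integer values are used
   after each variable's lifetime has ended. */
int blocks_share_storage(void)
{
    uintptr_t ua, ub;
    {
        int a = 1;
        ua = (uintptr_t)(void *)&a;
        (void)a;
    }
    {
        int b = 2;
        ub = (uintptr_t)(void *)&b;
        (void)b;
    }
    return ua == ub;
}
```

Either result is permitted; reading the generated assembly remains the more reliable check.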
Suppose you have some DEBUG_PRINT macro which expands to a block of code that declares a local char buf[512] array. If that is used numerous times in a function, it would be poor to have that many repetitions of the buffer reserved in the stack frame. The same remarks apply to inline functions.
Suppose we have a function declaration for which we do not have access to its definition:
void f(int * restrict p, int * restrict q, int * restrict r);
Since we do not know how the pointers will be accessed, we cannot know if a call will trigger undefined behavior or not -- even if we are passing the same pointer, like the example at 6.7.3.1.10 explains:
The function parameter declarations:
void h(int n, int * restrict p, int * restrict q, int * restrict r)
{
int i;
for (i = 0; i < n; i++)
p[i] = q[i] + r[i];
}
illustrate how an unmodified object can be aliased through two restricted pointers. In particular, if a and b are disjoint arrays, a call of the form h(100, a, b, b) has defined behavior, because array b is not modified within function h.
Therefore, is restrict superfluous in these cases, except as a hint/annotation for callers, unless we know something more about the function?
For instance, let's take sprintf (7.21.6.6) from the standard library:
Synopsis
#include <stdio.h>
int sprintf(char * restrict s,
const char * restrict format, ...);
Description
The sprintf function is equivalent to fprintf, except that the output is written into an array (specified by the argument s) rather than to a stream. (...)
From the synopsis and the first sentence of the description, we know that s will be written to and that s is a restricted pointer. Therefore, can we already assume (without reading further) that a call like:
char s[4];
sprintf(s, "%s", s);
will trigger undefined behavior?
If yes, then: is the last sentence of sprintf's description superfluous (even if clarifying)?
If copying takes place between objects that overlap, the behavior is undefined.
If not, then, the other way around: is the restrict qualifier superfluous since the description is the one that is actually letting us know what will be undefined behavior?
If yes, then: is the last sentence of sprintf's description superfluous (even if clarifying)?
If copying takes place between objects that overlap, the behavior is undefined.
int sprintf(char * restrict s, const char * restrict format, ...);
The restrict on s means that reads and writes only depend on what sprintf() does. The following code does that, reading and writing data pointed to by p1 as the char * restrict s argument. Reads and writes happen only through direct sprintf() code and not as a side effect.
char p[100] = "abc";
char *p1 = p;
char *p2 = p;
sprintf(p1, "<%s>", p2);
Yet when sprintf() accesses the data pointed to by p2, there is no restrict. The “If copying takes place between objects that overlap, the behavior is undefined” sentence applies to p2, to say that p2's data must not change due to some side effect.
If not, then, the other way around: is the restrict qualifier superfluous since the description is the one that is actually letting us know what will be undefined behavior?
restrict here is for the compiler to implement the restrict access. We do not need to see it given the “If copying takes place...” spec.
Consider the simpler strcpy() which has the same “If copying takes place between objects that overlap, the behavior is undefined.”. It is redundant for us, the readers, here, as careful understanding of restrict (new in C99), would not need that.
char *strcpy(char * restrict s1, const char * restrict s2);
C89 (pre-restrict days) also has this wording for strcpy(), sprintf(), ... and so may be simply a left-over over-spec in C99 for strcpy().
The most challenging aspect of type * restrict p, I find, is that it refers to what will not happen to its data (p's data will not change unexpectedly - only through p). Yet writing to p's data is allowed to mess up others - unless they have a restrict.
restrict was introduced in C99.
Since we do not know how the pointers will be accessed, we cannot know if a call will trigger undefined behavior
Yes. But this is a question of trust. A function declaration is a contract between the programmer who wrote the function definition and the programmer who uses the function. Remember, once in C we would just write void f(); - here f is a function that takes an unspecified number of parameters. If you don't trust the programmer who wrote that function, then don't use the function. In C we pass the address of the first array element, so seeing a function declared like this, I would assume one of two things: either the programmer who wrote that function gives some description of how these pointers are used, or function f uses them as pointers to a single element, not as arrays.
( In times like this I like to use C99 array declarators with qualifiers in function declarations to document how long the arrays my function expects are: void f(int p[restrict 5], int q[restrict 10], int r[restrict 15]);. Such a function declaration is exactly equivalent to yours, but it gives you some idea of which memory can't overlap. )
char s[4]; sprintf(s, "%s", s); will trigger undefined behavior?
Yes. Copying takes place between objects that overlap, and the restricted location is accessed through two pointers.
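A common way to sidestep the overlap is to format into a separate temporary buffer and copy the result back. The sketch below assumes a fixed 128-byte scratch buffer and a made-up helper name, wrap_in_brackets; neither comes from the question.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: replaces the string in buf with "<string>".
   cap is the capacity of buf in bytes. Returns the new length,
   or -1 if the result does not fit. */
int wrap_in_brackets(char *buf, size_t cap)
{
    char tmp[128];  /* assumed scratch size, distinct from buf */
    int len = snprintf(tmp, sizeof tmp, "<%s>", buf);

    if (len < 0 || (size_t)len >= sizeof tmp || (size_t)len >= cap)
        return -1;

    memcpy(buf, tmp, (size_t)len + 1);  /* tmp and buf never overlap */
    return len;
}
```

Here the destination of snprintf and its string argument are different objects, so neither the restrict qualifiers nor the overlap sentence is violated.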
Given a function signature like:
void copySomeInts(int * restrict dest, int * restrict src, int n);
someone who wanted the function to yield defined behavior in the case where the source and destination overlap [or even where they are equal] would need to expend some extra effort to do so. It would be possible, e.g.
void copySomeInts(int * restrict dest, int const * restrict src, int n)
{
for (int i=0; i<n; i++)
{
if (dest+i == src)
{
int delta = src-dest;
for (i=0; i<n; i++)
dest[i] = dest[delta+i];
return;
}
if (src+i == dest)
{
int delta = src-dest;
for (i=n-1; i>=0; i--)
dest[i] = dest[delta+i];
return;
}
}
/* No overlap--safe to copy in normal fashion */
for (int i=0; i<n; i++)
dest[i] = src[i];
}
and thus there's no way a compiler generating code to call copySomeInts would be able to make any inferences about its behavior unless it could actually see the definition of copySomeInts and not just the signature.
While the restrict qualifiers would not imply that the function couldn't handle the case where source and destination overlap, they would suggest that such handling would likely be more complicated than would be necessary without the qualifier. This would in turn suggest that unless there is explicit documentation promising to handle that case, the function should not be expected to handle it in defined fashion.
Note that in the case where the source and destination overlap, no storage is actually addressed using the src pointer nor anything derived from it. If src+i==dest or dest+i==src, that would mean both src and dest identify elements of the same array, and thus src-dest would simply represent the difference in their indices. The const and restrict qualifiers on src mean that nothing which is accessed with a pointer derived from src can be modified in any way during the execution of the function, but that restriction only applies to things that are accessed with pointers derived from src. If nothing is actually accessed with such pointers, the restriction is vacuous.
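For comparison, a sketch of the simpler route when overlap must be supported: drop restrict from the interface and let memmove deal with the direction of the copy. The function name is hypothetical, and this is an alternative to the approach above, not a restatement of it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical alternative: no restrict, so callers may pass
   overlapping or identical ranges. memmove is specified to behave
   as if the source were first copied to a temporary array. */
void copy_ints_may_overlap(int *dest, const int *src, size_t n)
{
    memmove(dest, src, n * sizeof *dest);
}
```

The trade-off is that the compiler can no longer assume the two ranges are disjoint when optimizing callers or the function itself.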
Does test_func in the following snippet trigger undefined behavior under the strict aliasing rules when the two arguments partially overlap?
That is the second argument is a member of the first:
#include <stdio.h>
typedef struct
{
//... Other fields
int x;
//... Other fields
} A;
int test_func(A *a, int *x)
{
a->x = 0;
*x = 1;
return a->x;
}
int main()
{
A a = {0};
printf("%d\n", test_func(&a, &a.x));
return 0;
}
Is the compiler allowed to think test_func will just return 0, based on the assumption that A* and int* will not alias, so that *x cannot overwrite the member?
Strict aliasing refers to when a pointer is converted to another pointer type, after which the contents are accessed. Strict aliasing means that the involved pointed-at types must be compatible. That does not apply here.
There is however the term pointer aliasing, meaning that two pointers can refer to the same memory. The compiler is not allowed to assume that this is not the case here. If it wants to do optimizations like those you describe, it would perhaps have to add machine code that compares the pointers with each other, to determine if they are the same or not, which in itself would make the function slightly slower.
To help the compiler optimize such code, you can declare the pointers as restrict, which tells the compiler that the programmer guarantees that the pointers are not pointing at the same memory.
Your function compiled with gcc -O3 results in this machine code:
0x00402D09 mov $0x1,%edx
Which basically means that the whole function was replaced (inlined) with "set a.x to 1".
But if I rewrite your function as
int test_func(A* restrict a, int* restrict x)
{
a->x = 0;
*x = 1;
return a->x;
}
and compile with gcc -O3, it does return 0, because I have now told the compiler that a->x and x do not point at the same memory. It can therefore assume that *x = 1; does not affect the result, and skip the line *x = 1; or sequence it before the line a->x = 0;.
The optimized machine code of the restrict version actually skips the whole function call, since it knows that the value is already 0 as per your initialization.
This is of course a bug, but the programmer is to blame for it, for careless use of restrict.
This is not a violation of strict aliasing. The strict aliasing rule says (simplified) that you can access the value of an object only using an lvalue expression of a compatible type. In this case, the object you're accessing is the member x of main's a variable. This member has type int. And the expression you use to access it (*x) also has type int. So there's no problem.
You may be confusing strict aliasing with restrict. If you had used the restrict keyword in the declaration of one of the pointer parameters, the code would be invalid because restrict prevents you from using different pointers to access the same object - but this is a different issue than strict aliasing.
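A sketch of the unqualified version, with the elided "other fields" replaced by two placeholder members (before and after are assumptions for illustration). Since the access through x has type int and the member has type int, the call with &a.x is well-defined and must observe the store:

```c
#include <assert.h>

typedef struct
{
    int before;  /* placeholder for the elided fields */
    int x;
    int after;   /* placeholder for the elided fields */
} A;

/* No restrict: x may point at a->x, so a->x is reloaded after *x = 1. */
int test_func(A *a, int *x)
{
    assert(a && x);
    a->x = 0;
    *x = 1;
    return a->x;
}
```

With the member passed as the second argument the function returns 1; with an unrelated int it returns 0.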
I went through all answers but I can't get the answer about whether the following example is undefined behavior.
It's the same example as in 6.7.3.1 of C99 spec.
EXAMPLE 3 The function parameter declarations
void h(int n, int * restrict p, int * restrict q, int * restrict r)
{
int i;
for (i = 0; i < n; i++)
p[i] = q[i] + r[i];
}
illustrate how an unmodified object can be aliased through two restricted pointers. In particular, if a and b are disjoint arrays, a call of the form h(100, a, b, b) has defined behavior, because array b is not modified within function h.
In short, it is explicitly stated that the behavior is defined if b is not modified within function h. However, is a call of the form h(100, a, a, b) undefined behavior?
Some more background on why I want to make this clear: there are some basic functions which we want to use in both an in-place and an out-of-place manner. To reduce effort, it would be desirable not to have to provide both h(int n, int * restrict p, int * restrict q, int * restrict r) and h_inplace(int n, int * restrict p, int * restrict q). From current observation, it seems gcc, clang, icc, and msvc give the correct result even if we call it in the form h(100, a, a, b). However, we definitely don't want to take the risk if it is undefined behavior (which means the result may be wrong with other compilers or future versions of gcc, clang, icc, msvc). What do you think?
h(100, a, a, b) obviously causes UB because p and q were promised not to alias each other, and the code writes through one of them. See C11 6.7.3.1/4:
If L is used to access the value of the object X that it designates, and X is also modified (by any means), then the following requirements apply: [...] Every other lvalue used to access the value of X shall also have its address based on P
Writing through p accesses the value of an object (called X here), and modifies it. Therefore every lvalue used to access the object within the function must be generated from p. However q is not generated from p.
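A conservative sketch of the two-function layout mentioned in the question, which keeps every call within the restrict rules. The const qualifiers and the exact h_inplace body are assumptions added for illustration:

```c
#include <assert.h>

/* Out-of-place: p must not overlap q or r. q and r may alias each
   other because neither is modified. */
void h(int n, int *restrict p, const int *restrict q, const int *restrict r)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        p[i] = q[i] + r[i];
}

/* In-place: every access to the modified array goes through p. */
void h_inplace(int n, int *restrict p, const int *restrict r)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        p[i] += r[i];
}
```

A caller that wants a = a + b uses h_inplace(n, a, b) rather than h(n, a, a, b).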
It is a hint to the code optimizer. Using restrict assures it that it can keep a value accessed through the pointer in a CPU register, without having to flush an update of that value to memory so that an alias sees it as well.
Whether or not it takes advantage of it depends heavily on implementation details of the optimizer and the CPU. Code optimizers already are heavily invested in detecting non-aliasing since it is such an important optimization.
However, if the restrict keyword is used and the function is declared as
void updatePtrs(size_t *restrict ptrA, size_t *restrict ptrB, size_t *restrict val);
then the compiler is allowed to assume that ptrA, ptrB, and val point to different locations and updating one pointer will not affect the other pointers. The programmer, not the compiler, is responsible for ensuring that the pointers do not point to identical locations.
If, as the programmer, you do not ensure that ptrA and ptrB point to different locations, then you are breaking the rule, which may result in undefined behavior.
It is fairly clear, as M.M's first answer states, that h(100, a, a, b) is undefined behavior, but it is possible to create a low-overhead version of h that does have defined behavior (IMO: I'm currently having an argument with Richard about this.)
void h2(int n, int * restrict p, int * restrict q_opt, int * restrict r)
{
int i;
int * restrict q = (p == q_opt) ? p : q_opt;
for (i = 0; i < n; i++)
p[i] = q[i] + r[i];
}
An even simpler version (but which requires the caller to know the magic value) has the new line:
int * restrict q = q_opt ? q_opt : p;
and the caller gets the in-place behavior by calling h2(100, a, 0, b).
In both cases the q pointer is either distinct from p according to the rules of restrict or derived from p (and not q), and so no conflict arises.
In particular, the discussion and rationale in the standard are actually pretty good, I think. The reason for the restriction is that faster code can sometimes be made by first copying any of the data addressed by a restrict pointer into another memory space (per argument), operating on it there, and then copying the results back. This is particularly true when the extra memory copy is into registers, and doubly true when the compiler is allowed to unroll loops, all of which are good go-fast tools.
The strategy here clearly works fine when a source at q_opt is copied to another location before being used (before use, outside the loop), whether it refers to p or to a different array. It would clearly fail if the user called h2(100, a+1, a, r), because that would still be a violation of the rules of restrict.