A very simple program in C:
#include <stdio.h>
#include <stdlib.h>
void process(int array[static 5]){
int i;
for(i=0; i<5; i++)
printf("%d ", array[i]);
printf("\n");
}
int main(){
process((int[]){1,2,3});
process(NULL);
return 0;
}
I compile it: gcc -std=c99 -Wall -o demo demo.c
It does compile and when I run it, it crashes (quite predictable).
Why? What is the purpose of the static keyword in an array parameter (and what is the name of this construct, by the way)?
The static there is an indication (a hint — but not more than a hint) to the optimizer that it may assume there is a minimum of the appropriate number (in the example, 5) elements in the array (and therefore that the array pointer is not null too). It is also a directive to the programmer using the function that they must pass a big enough array to the function to avoid undefined behaviour.
ISO/IEC 9899:2011
§6.7.6.2 Array declarators
Constraints
¶1 In addition to optional type qualifiers and the keyword static, the [ and ] may delimit
an expression or *. If they delimit an expression (which specifies the size of an array), the
expression shall have an integer type. If the expression is a constant expression, it shall
have a value greater than zero. The element type shall not be an incomplete or function
type. The optional type qualifiers and the keyword static shall appear only in a
declaration of a function parameter with an array type, and then only in the outermost
array type derivation.
§6.7.6.3 Function declarators (including prototypes)
¶7 A declaration of a parameter as "array of type" shall be adjusted to "qualified pointer to
type", where the type qualifiers (if any) are those specified within the [ and ] of the
array type derivation. If the keyword static also appears within the [ and ] of the
array type derivation, then for each call to the function, the value of the corresponding
actual argument shall provide access to the first element of an array with at least as many
elements as specified by the size expression.
Your code crashes because you pass a null pointer to a function that expects the start of an array of at least 5 elements. You are invoking undefined behaviour, and a crash is an eminently sensible way of dealing with your mistake.
It is more subtle when you pass an array of 3 integers to a function that's guaranteed an array of 5 integers; again, you invoke undefined behaviour and the results are unpredictable. A crash is relatively unlikely; spurious results are very probable.
In effect, the static in this context has two separate jobs — it defines two separate contracts:
It tells the user of the function that they must provide an array of at least 5 elements (and if they do not, they will invoke undefined behaviour).
It tells the optimizer that it may assume a non-null pointer to an array of at least 5 elements and it may optimize accordingly.
If the user of the function violates the requirements of the function, all hell may break loose ('nasal demons' etc; generally, undefined behaviour).
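A minimal sketch of a call that honours the contract described above. The names sum_first5 and sum_example are illustrative and not part of the original question; the non-conforming calls are left commented out because running them would be undefined behaviour.

```c
#include <stddef.h>

/* [static 5]: every caller must pass a pointer to the first element of
   an array with at least 5 elements (hypothetical helper). */
int sum_first5(const int array[static 5])
{
    int total = 0;
    for (size_t i = 0; i < 5; i++)
        total += array[i];
    return total;
}

/* A conforming caller: the array has exactly 5 elements. */
int sum_example(void)
{
    int values[5] = {1, 2, 3, 4, 5};
    return sum_first5(values);
}

/* Non-conforming calls (undefined behaviour; a compiler may or may not
   diagnose them):
       sum_first5((int[]){1, 2, 3});
       sum_first5(NULL);
*/
```

Passing a longer array is also fine, since the contract only sets a minimum.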
The declaration in your code is valid (and actually recommended); see C99 (N1124/N1256), clause 6.7.5.3 paragraph 7 (Jonathan's answer quotes the full text):
If the keyword static also appears within the [ and ] of the array
type derivation, then for each call to the function, the value of the
corresponding actual argument shall provide access to the first
element of an array with at least as many elements as specified by the
size expression.
The error is in your array definition: you allocate it to hold 3 elements, but then call a function that requires at least five elements (via the [static 5]), which is undefined behaviour and can trigger a crash.
I have the following code:
#include <stdio.h>
int main(void) {
int array[0];
printf("%p", (void *)array);
return 0;
}
As we know, an array decays to a pointer to its first item. There are no items in this example, yet the code prints some memory address. What does it point to?
An array of size 0 is considered a constraint violation. So having such an array and attempting to use it triggers undefined behavior.
Section 6.7.6.2p1 of the C standard regarding constraints on Array Declarators states:
In addition to optional type qualifiers and the keyword static, the [ and ] may delimit an expression or *. If they delimit an expression (which specifies the size of an array), the expression shall have an integer type. If the expression is a constant expression, it shall have a value greater than zero. The element type shall not be an incomplete or function type. The optional type qualifiers and the keyword static shall appear only in a declaration of a function parameter with an array type, and then only in the outermost array type derivation
GCC allows a zero-length array as an extension, intended mainly for the last member of a struct (its documentation discourages other uses). This is an alternate way of specifying a flexible array member, which the C standard allows if the array size is omitted.
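For comparison, here is a minimal sketch of the standard flexible array member (size omitted rather than written as 0). The struct and function names are made up for illustration.

```c
#include <stdlib.h>

/* Standard C99 flexible array member: the last member has no size. */
struct int_buffer {
    size_t len;
    int data[];
};

/* Allocate a buffer with room for n ints (hypothetical helper).
   Returns NULL on allocation failure; the caller frees the result. */
struct int_buffer *int_buffer_new(size_t n)
{
    struct int_buffer *buf = malloc(sizeof *buf + n * sizeof buf->data[0]);
    if (buf == NULL)
        return NULL;
    buf->len = n;
    for (size_t i = 0; i < n; i++)
        buf->data[i] = 0;
    return buf;
}
```

The storage for data comes from the extra bytes requested from malloc, so the array has real elements to point to, unlike the zero-size local array in the question.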
Why does this give a compiler error if I don't specify the size of the array beforehand and try to initialize it?
int main()
{
int ar[] = {};
int n = 5 ;
for(int i = 0; i < n ; i++)
ar[i] = i+1;
}
From C11, chapter 6.7.6.2,
In addition to optional type qualifiers and the keyword static, the [ and ] may delimit an expression or *. If they delimit an expression (which specifies the size of an array), the expression shall have an integer type. If the expression is a constant expression, it shall have a value greater than zero. The element type shall not be an incomplete or function type. The optional type qualifiers and the keyword static shall appear only in a declaration of a function parameter with an array type, and then only in the outermost array type derivation
So, this syntax
int ar[] = {};
is a constraint violation. If you enable pedantic compiler diagnostics, you should see something like
In function ‘main’:
error: ISO C forbids empty initializer braces [-Werror=pedantic]
int ar[] = {};
^
error: zero or negative size array ‘ar’
int ar[] = {};
To answer why there is a compiler error, maybe it's useful to think about the structure of the memory used by the program.
When the program is executing the function referenced in the question it adds a frame to the stack.
"Adding a frame" just means the information for the current function is added onto the end of the stack and when the program runs the code it needs to also create space for the array.
To me the given syntax could only mean a dynamically sized array, meaning items could be added or removed, arbitrarily resizing the structure.
The problem with this is that local arrays are typically stored on the stack, and since there are usually things added after the function's variables in its frame, there is no space to add more items to the array!
So C would create many issues by allowing this use. Specifying a size for your array lets the program reserve that amount of space on the stack, which avoids memory issues with the stack at runtime.
You are declaring and initializing a zero-length array (GNU C extension) in that line:
int ar[] = {};
The uses of zero-length arrays are rare (e.g. tail padding structs, for alignment purposes), and your code does not look like it's meant to be using it, so you are probably doing so by accident.
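If the intent was an array of five elements, one possible fix is to state the size explicitly. This is only a sketch under that assumption; fill_sequence and last_of_five are made-up names.

```c
#include <stddef.h>

/* Store 1, 2, ..., n in ar; the caller guarantees ar has n elements. */
void fill_sequence(int *ar, size_t n)
{
    for (size_t i = 0; i < n; i++)
        ar[i] = (int)i + 1;
}

/* The question's loop, with a correctly sized array. */
int last_of_five(void)
{
    int ar[5];              /* size stated up front */
    fill_sequence(ar, 5);
    return ar[4];           /* the fifth element */
}
```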
As mentioned here, here and here, a function (in C99 or newer) defined this way
void func(int ptr[static 1]){
//do something with ptr, knowing that ptr != NULL
}
has one parameter (ptr) of type pointer to int, and the compiler can assume that the function will never be called with a null pointer as its argument (e.g. the compiler can optimize null pointer checks away or warn if func is called with a null pointer; and yes, I know that the compiler is not required to do any of that).
C17 section 6.7.6.3 Function declarators (including prototypes) paragraph 7 says:
A declaration of a parameter as “array of type” shall be adjusted to “qualified pointer to type”, where
the type qualifiers (if any) are those specified within the [ and ] of the array type derivation. If the
keyword static also appears within the [ and ] of the array type derivation, then for each call to
the function, the value of the corresponding actual argument shall provide access to the first element
of an array with at least as many elements as specified by the size expression.
In case of the definition above the value of ptr has to provide access to the first element of an array with at least 1 element. It is therefore clear that the argument can never be null.
What I'm wondering is whether it is valid to call such a function with the address of an int that is not part of an array. E.g. is this (given the definition of func above) technically valid, or is it undefined behavior:
int var = 5;
func(&var);
I am aware that this will practically never be an issue, because no compiler I know of differentiates between a pointer to a member of an int array and a pointer to a local int variable. But given that a pointer in C (at least from the perspective of the standard) can be much more than just some integer with a special compile-time type, I wondered whether there is some section in the standard that makes this valid.
I do suspect, that it is actually not valid, as section 6.5.6 Additive operators paragraph 8 contains:
[...] If both the pointer operand and the result point
to elements of the same array object, or one past the last element of the array object, the evaluation
shall not produce an overflow; otherwise, the behavior is undefined. [...]
To me that sounds as if for any pointer that points to an array element adding 1 is a valid operation while it would be UB to add 1 to a pointer that points to a regular variable. That would mean, that there is indeed a difference between a pointer to an array element and a pointer to a normal variable, which would make the snippet above UB...
Section 6.5.6 Additive operators paragraph 7 contains:
For the purposes of these operators, a pointer to an object that is not an element of an array behaves
the same as a pointer to the first element of an array of length one with the type of the object as its
element type.
As the paragraph begins with "for the purposes of these operators" I suspect that there can be a difference in other contexts?
tl;dr;
Is there some section of the standard that specifies that there is no difference between a pointer to a regular variable of type T and a pointer to the element of an array of length one (array of type T[1])?
At face value, I think you have a point. We aren't really passing a pointer to the first element of an array. This may be UB if we consider the standard in a vacuum.
Other than the paragraph you quote in 6.5.6, there is no passage in the standard equating a single object to an array of one element. And there shouldn't be, since the two things are different. An array (of even one element) is implicitly converted to a pointer when appearing in most expressions. That's obviously not a property most object types possess.
The definition of the static keyword in [] says that the pointer being passed must be to the initial element of an array that contains at least a certain number of elements. There is another problem with the wording you cited: what about
int a[2];
func(a + 1);
Clearly the pointer being passed is not to the first element of an array. That is UB too if we take a literal interpretation of 6.7.6.3p7.
Putting the static keyword aside, when a function accepts a pointer to an object, whether the object is a member of an array (of any size) or not matters in only one context: pointer arithmetic.
In the absence of pointer arithmetic, there is no distinguishable difference in behavior when using a pointer to access an element of an array, or a standalone object.
I would argue that the intent behind 6.7.6.3p7 has pointer arithmetic in mind. And so the semantic being mentioned comes hand in hand with trying to do pointer arithmetic on the pointer being passed into the function.
The use of static 1 simply emerged naturally as a useful idiom, and maybe wasn't the intent from the get-go. While the normative text could do with a slight correction, I think the intent behind it is clear. It isn't meant to be undefined behavior by the standard.
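The two calls discussed above can be sketched as follows. This assumes the reading argued for in this answer, under which both calls are intended to be valid; a strictly literal reading of 6.7.6.3p7 might disagree. The function names are illustrative.

```c
/* [static 1]: callers promise access to at least one int, so never null. */
void increment(int ptr[static 1])
{
    ++*ptr;   /* no pointer arithmetic, only access to the first element */
}

int increment_lone_object(void)
{
    int var = 5;
    increment(&var);     /* lone object, the case the question asks about */
    return var;
}

int increment_second_element(void)
{
    int a[2] = {10, 20};
    increment(a + 1);    /* pointer that is not to the first element of a */
    return a[1];
}
```

Since increment never does pointer arithmetic, nothing in its body can tell the two callers apart.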
The authors of the Standard almost certainly intended that quality implementations would treat the value of a pointer to a non-array object in the same way as it would treat the value of a pointer to the first element of an array object of length 1. Had it merely said that a pointer to a non-array object was equivalent to a pointer to an array, however, that might have been misinterpreted as applying to all expressions that yield pointer values. This could cause problems given e.g. char a[1],*p=a;, because the expressions a and p both yield pointers of type char* with the same value, but sizeof p and sizeof a would likely yield different values.
The language was in wide use before the Standard was written, and it was hardly uncommon for programs to rely upon such behavior. Implementations that make a bona fide effort to behave in a fashion consistent with the Standard Committee's intentions as documented in the published Rationale document should thus be expected to process such code meaningfully without regard for whether a pedantic reading of the Standard would require it. Implementations that do not make such efforts, however, should not be trusted to process such code meaningfully.
While browsing some source code I came across a function like this:
void someFunction(char someArray[static 100])
{
// do something cool here
}
With some experimentation it appears other qualifiers may appear there too:
void someFunction(char someArray[const])
{
// do something cool here
}
It appears that qualifiers are only allowed inside the [ ] when the array is declared as a parameter of a function. What do these do? Why is it different for function parameters?
The first declaration tells the compiler that someArray is at least 100 elements long. This can be used for optimizations. For example, it also means that someArray is never NULL.
Note that the C Standard does not require the compiler to diagnose when a call to the function does not meet these requirements (i.e., it is silent undefined behaviour).
The second declaration simply declares someArray (not someArray's elements!) as const, i.e., you can not write someArray=someOtherArray. It is the same as if the parameter were char * const someArray.
This syntax is only usable within the [] closest to the parameter name (the outermost array type derivation, in the Standard's words) of an array declarator in a function parameter list; it would not make sense in other contexts.
The Standard text, which covers both of the above cases, is in C11 6.7.6.3/7 (was 6.7.5.3/7 in C99):
A declaration of a parameter as “array of type” shall be adjusted to “qualified pointer to type”, where the type qualifiers (if any) are those specified within the [ and ] of the array type derivation. If the keyword static also appears within the [ and ] of the array type derivation, then for each call to the function, the value of the corresponding actual argument shall provide access to the first element of an array with at least as many elements as specified by the size expression.
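To illustrate the const case, here is a small sketch in which the same function is declared with the array spelling and defined with the pointer spelling; the two are compatible because the parameter is adjusted to char * const. The function name fill is hypothetical.

```c
#include <stddef.h>

/* Declared with the array spelling... */
void fill(char buf[const], size_t n, char c);

/* ...and defined with the equivalent pointer spelling. */
void fill(char *const buf, size_t n, char c)
{
    for (size_t i = 0; i < n; i++)
        buf[i] = c;        /* the elements are writable */
    /* buf = NULL; */      /* would not compile: buf itself is const */
}
```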
The following compiles and prints "string" as an output.
#include <stdio.h>
struct S { int x; char c[7]; };
struct S bar() {
struct S s = {42, "string"};
return s;
}
int main()
{
printf("%s", bar().c);
}
Apparently this invokes undefined behavior according to
C99 6.5.2.2/5 If an attempt is made to modify the result of a function
call or to access it after the next sequence point, the behavior is
undefined.
I don't understand the part about the "next sequence point". What's going on here?
You've run into a subtle corner of the language.
An expression of array type is, in most contexts, implicitly converted to a pointer to the first element of the array object. The exceptions, none of which apply here, are:
When the array expression is the operand of a unary & operator (which yields the address of the entire array);
When it's the operand of a unary sizeof or (as of C11) _Alignof operator (sizeof arr yields the size of the array, not the size of a pointer); and
When it's a string literal in an initializer used to initialize an array object (char str[6] = "hello"; doesn't convert "hello" to a char*.)
(The N1570 draft incorrectly adds _Alignof to the list of exceptions. In fact, for reasons that are not clear, _Alignof can only be applied to a type name, not to an expression.)
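The conversion rule and its exceptions can be observed with a few small functions. This is only a sketch; the array arr and the function names are invented for the example.

```c
#include <stddef.h>

static int arr[4];

/* sizeof sees the whole array: no conversion to pointer. */
size_t whole_array_size(void) { return sizeof arr; }

/* In an ordinary expression, arr converts to int *. */
size_t decayed_size(void) { return sizeof (arr + 0); }

/* &arr has type int (*)[4], yet holds the same address as &arr[0]. */
int same_address(void)
{
    int (*whole)[4] = &arr;
    int *first = arr;
    return (void *)whole == (void *)first;
}

/* A string literal initializing an array is copied, not converted. */
size_t literal_array_size(void)
{
    char str[6] = "hello";
    return sizeof str;
}
```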
Note that there's an implicit assumption: that the array expression refers to an array object in the first place. In most cases, it does (the simplest case is when the array expression is the name of a declared array object) -- but in this one case, there is no array object.
If a function returns a struct, the struct result is returned by value. In this case, the struct contains an array, giving us an array value with no corresponding array object, at least logically. So the array expression bar().c decays to a pointer to the first element of ... er, um, ... an array object that doesn't exist.
The 2011 ISO C standard addresses this by introducing "temporary lifetime", which applies only to "A non-lvalue expression with structure or union type, where the structure or union contains a member with array type" (N1570 6.2.4p8). Such an object may not be modified, and its lifetime ends at the end of the containing full expression or full declarator.
So as of C2011, your program's behavior is well defined. The printf call gets a pointer to the first element of an array that's part of a struct object with temporary lifetime; that object continues to exist until the printf call finishes.
But as of C99, the behavior is undefined -- not necessarily because of the clause you quote (as far as I can tell, there is no intervening sequence point), but because C99 doesn't define the array object that would be necessary for the printf to work.
If your goal is to get this program to work, rather than to understand why it might fail, you can store the result of the function call in an explicit object:
const struct S result = bar();
printf("%s", result.c);
Now you have an ordinary struct object with automatic storage duration, rather than one with temporary lifetime, so it exists during and after the execution of the printf call.
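Put together as a compilable sketch, with the struct and bar taken from the question and name_length added as a hypothetical caller:

```c
#include <string.h>

struct S { int x; char c[7]; };

struct S bar(void)
{
    struct S s = {42, "string"};
    return s;
}

/* Copy the returned struct into a named object before touching the
   array member, so the array belongs to an ordinary local object. */
size_t name_length(void)
{
    const struct S result = bar();
    return strlen(result.c);
}
```

This form does not rely on C11's temporary lifetime rule, so it behaves the same under C99.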
The sequence point occurs at the end of the full expression, i.e., when printf returns in this example. There are other cases where sequence points occur.
Effectively, this rule states that function temporaries do not live beyond the next sequence point, which in this case occurs well after its use, so your program has quite well-defined behaviour.
Here's a simple example of not well-defined behaviour:
char* c = bar().c; *c = 5; // UB
Here, the sequence point is met after c is created, and the memory it points to is destroyed, but we then attempt to access c, resulting in UB.
In C99 there is a sequence point at the call to a function, after the arguments have been evaluated (C99 6.5.2.2/10).
So, when bar().c is evaluated, it results in a pointer to the first element in the char c[7] array in the struct returned by bar(). However, that pointer gets copied into an argument (a nameless argument as it happens) to printf(), and by the time the call is actually made to the printf() function the sequence point mentioned above has occurred, so the member that the pointer was pointing to may no longer be alive.
As Keith Thompson mentions, C11 (and C++) make stronger guarantees about the lifetime of temporaries, so the behavior under those standards would not be undefined.