Why can arrays be assigned directly? - c

Consider this code snippet:
void foo(int a[], int b[]){
    static_assert(sizeof(a) == sizeof(int*));
    static_assert(sizeof(b) == sizeof(int*));
    b = a;
    printf("%d", b[1]);
    assert(a == b); // This also works!
}
int a[3] = {[1] = 2}, b[1];
foo(a, b);
Output (no compilation error):
2
I can't get the point why b = a is valid. Even though arrays may decay to pointers, shouldn't they decay to const pointers (T * const)?

They can't.
Arrays cannot be assigned to. There are no arrays in the foo function. The syntax int a[] in a function parameter list means to declare that a has type "pointer to int". The behaviour is exactly the same as if the code were void foo(int *a, int *b). (C11 6.7.6.3/7)
It is valid to assign one pointer to another. The result is that both pointers point to the same location.
Even though arrays may decay to pointers, shouldn't they decay to const pointers (T * const)?
The pointer that results from array "decay" is an rvalue. The const qualifier is only meaningful for lvalues (C11 6.7.3/4). (The term "decay" refers to conversion of the argument, not the adjustment of the parameter).

Quoting C11, chapter §6.7.6.3, Function declarators (including prototypes)
A declaration of a parameter as ‘‘array of type’’ shall be adjusted to ‘‘qualified pointer to
type’’, where the type qualifiers (if any) are those specified within the [ and ] of the
array type derivation. [...]
So, a and b are actually pointers, not arrays.
There's no assignment to any array type happening here, hence there's no problem with the code.

Yes, it would have made sense for array parameters declared with [] to be adjusted to const-qualified pointers. However, const did not exist when this behavior was established.
When the C language was being developed, it made sense to pass an array by passing its address, or, more specifically, the address of the first element. You certainly did not want to copy the entire array to pass it. Passing the address was an easy way to make the array known to the called function. (The semantics for the reference types we see in C++ had not been invented yet.) To make that easy for programmers, so that they could write foo(ArrayA, ArrayB) instead of foo(&ArrayA[0], &ArrayB[0]), the mechanism of converting an array to a pointer to its first element was invented. (Per M.M. and The Development of the C Language by Dennis M. Ritchie, this notation for parameters already existed in C’s predecessor language, B.)
That is fine, you have hidden the conversion. But that is only where the function is called. In the called routine, the programmer who is thinking about passing an array is going to write void foo(int ArrayA[], int ArrayB[]). But since we are actually passing pointers, not arrays, these need to be changed to int *ArrayA and int *ArrayB. So the notion that parameters declared as arrays are automatically adjusted to pointers was created.
As you observe, this leaves the programmer able to assign values to the parameters, which changes the apparent base address of the array. It would have made sense for a parameter declared as int ArrayA[] to be adjusted to int * const ArrayA, so that the value of the parameter ArrayA could not be changed. Then it would act more like an array, whose address also cannot be changed, so this better fits the goal of pretending to pass arrays even though we are passing addresses.
However, at the time, const did not exist, so this was not possible, and nobody thought of inventing const at that time (or at least did not work on it enough to get it adopted into the language).
Now there is a large amount of source code in the world that works with the non-const adjustment. Changing the specification of the C language now would cause problems with the existing code.

Related

Is using array arguments in C considered bad practice?

When declaring a function that accesses several consecutive values in memory, I usually use array arguments like
f(int a[4]);
It works fine for my purposes. However, I recently read the opinion of Linus Torvalds.
So I wonder if the array arguments are today considered obsolete? More particularly,
is there any case where the compiler can utilize this information (array size) to check out-of-bound access, or
is there any case where this technique brings some optimization opportunities?
In any case, what about pointers to arrays?
void f(int (*a)[4]);
Note that this form is not prone to "sizeof" mistakes. But what about efficiency in this case? I know that GCC generates the same code (link). Is that always so? And what about further optimization opportunities in this case?
If you write
void f(int a[4]);
that has exactly the same meaning to the compiler as if you wrote
void f(int *a);
This is why Linus has the opinion that he does. The [4] looks like it defines the expected size of the array, but it doesn't. Mismatches between what the code looks like it means and what it actually means are very bad when you're trying to maintain a large and complicated program.
(In general I advise people not to assume that Linus' opinions are correct. In this case I agree with him, but I wouldn't have put it so angrily.)
Since C99, there is a variation that does mean what it looks like it means:
void f(int a[static 4]);
That is, all callers of f are required to supply a pointer to an array of at least four ints; if they don't, the program has undefined behavior. This can help the optimizer, at least in principle (e.g. maybe it means the loop over a[i] inside f can be vectorized).
Your alternative construct
void f(int (*a)[4]);
gives the parameter a a different type ('pointer to array of 4 int' rather than 'pointer to int'). The array-notation equivalent of this type is
void f(int a[][4]);
Written that way, it should be immediately clear that that declaration is appropriate when the argument to f is a two-dimensional array whose inner size is 4, but not otherwise.
sizeof issues are another can of worms; my recommendation is to avoid needing to use sizeof on function arguments at almost any cost. Do not contort the parameter list of a function to make sizeof come out "right" inside the function; that makes it harder to call the function correctly, and you probably call the function a lot more times than you implement it.
Unless it is the operand of the sizeof or unary & operators, or is a string literal used to initialize a character array in a declaration, an expression of type "N-element array of T" will be converted ("decay") to an expression of type "pointer to T", and the value of the expression will be the address of the first element in the array.
When you pass an array expression as an argument to a function:
int arr[100];
...
foo( arr );
what the function actually receives is a pointer to the first element of the array, not a copy of the array. The behavior is exactly the same as if you had written
foo( &arr[0] );
There's a rule that function parameters of type T a[N] or T a[] are "adjusted" to T *a, so if your function declaration is
void foo( int a[100] )
it will be interpreted as though you wrote
void foo( int *a )
There are a couple of significant consequences of this:
Arrays are implicitly passed "by reference" to functions, so changes to the array contents in the function are reflected in the caller (unlike literally every other type);
You can't use sizeof to determine how many elements are in the passed array because there's no way to get that information from a pointer. If your function needs to know the physical size of the array in order to use it properly, then you must pass that length as a separate parameter [1].
In my own code, I do not use array-style declarations in function parameter lists - what the function receives is a pointer, so I use pointer-style declarations. I can see the argument for using array-style declarations, mostly as a matter of documentation (this function is expecting an array of this size), but I think it's valuable to reinforce the pointer-ness of the parameter.
Note that you have the same problem with pointers to arrays - if I call
foo( &arr );
then the prototype for foo needs to be
void foo( int (*a)[100] );
But that's also the same prototype as if I had called it as
int bar[10][100];
foo( bar );
Just like you cannot know whether the parameter a points to a single int or the first in a sequence of ints, you can't know whether a pointer-to-array parameter points to a single 100-element array, or to the first in a sequence of 100-element arrays.
[1] This is why the gets function was deprecated after C99 and removed from the standard library in C2011 - there's no way to tell it the size of the target buffer, so it will happily write input past the end of the array and clobber whatever follows. That's why it was such a popular malware exploit.

What is the difference between declaring a char pointer and an array of chars in a function parameter list?

What is the difference between
foo(char* grid){}
And
foo(char grid[]){}
I have a program that I tested on both styles of the function parameters. It seemed to work, but why does it work? What is the difference? Which one is more efficient, and is the first one passing by reference?
In the case of a function parameter (and only in that case), they mean the same thing.
Note that C99 dropped the "implicit int" rule, so your examples should be something like:
void foo(char* grid){}
and
void foo(char grid[]){}
A parameter defined with an array type is "adjusted" to become a parameter of pointer type, pointing to the element type of the array.
Reference: N1570 6.7.6.3 paragraph 7. (This is a freely available draft of the 2011 ISO C standard, PDF, 1.7 MB.)
(In all other contexts, a pointer declaration and an array declaration are different. See section 6 of the comp.lang.c FAQ.)
All parameters in C are passed by value, not by reference. In this case, the value being passed happens to be a pointer value, which is the address of a char object. For example, if you write:
char arr[10];
func(arr);
then the value being passed is &arr[0] (there's a separate language rule that says that array expressions are converted/adjusted to become pointer expressions in most but not all contexts). Note that no information about the length of the array is passed; if you want to keep track of that, you'll have to do so explicitly.
C doesn't have pass-by-reference as a language feature. Passing a pointer value is a way to emulate pass-by-reference.

When are array names constants or pointers?

In a document by Nick Parlante, it says, Array names are constant i.e array base address behaves like a const pointer.
e.g
{
    int array[100], i;
    array = NULL; //cannot change base address
    array = &i; // not possible
}
But at the same time why is this valid:
void foo(int arrayparm[]){
    arrayparm = NULL; // says it changes the local pointer
}
Function parameter declarations differ from ordinary declarations in C. In a function declaration:
void foo(int arrayparm[])
             ^^^^^^^^^ is pointer not array
arrayparm is a pointer, not an array; its type is int*. This is equivalent to:
void foo(int *arrayparm)
In function foo you can modify arrayparm.
Whereas in an ordinary declaration (inside a function), e.g.
int array[100];
array is not a pointer but behaves like a constant: its type is int[100] and it is not a modifiable lvalue.
Arrays decay into pointers when passed to functions. The array name itself is a non-modifiable lvalue. What this means is that you can do this:
int x=10,y=20;
int *p = &x; // <---- p Now points to x
p = &y; // <---- p Now points to y
But not this:
int arr[10], x=10;
arr = &x; // <----- Error - Array name is a non-modifiable lvalue.
Since arrays decay immediately into pointers, an array is never actually passed to a function. As a convenience, any parameter declarations which "look like" arrays, e.g.
f(a)
char a[];
are treated by the compiler as if they were pointers, since that is what the function will receive if an array is passed:
f(a)
char *a;
This conversion holds only within function formal parameter declarations, nowhere else. If this conversion bothers you, avoid it; many people have concluded that the confusion it causes outweighs the small advantage of having the declaration "look like" the call and/or the uses within the function.
References: K&R I Sec. 5.3 p. 95, Sec. A10.1 p. 205; K&R II Sec. 5.3 p. 100, Sec. A8.6.3 p. 218, Sec. A10.1 p. 226;
When an array name is passed as a function argument, it "decays" to a pointer, so inside the function you can treat the parameter like a normal pointer.
Reference: C FAQ what is meant by the ``equivalence of pointers and arrays'' in C?
Array types are not assignable in C. It was just a design decision. (It is possible to make assignment of array types copy one array over another, like assignment of structs. They just chose not to do this.)
The C99 standard, 6.5.16 paragraph 2:
An assignment operator shall have a modifiable lvalue as its left
operand.
C99 standard, 6.3.2.1 paragraph 1:
... A modifiable lvalue is an lvalue that does not have array type,
...
Plus, even if array types were assignable, NULL is not a value of type int[100].

Can I use arrays as a function parameter in C99?

The C99 standard says the following in 6.7.5.3/7:
A declaration of a parameter as ‘‘array of type’’ shall be adjusted to ‘‘qualified pointer to
type’’, where the type qualifiers (if any) are those specified within the [ and ] of the
array type derivation.
Which I understand as:
void foo(int * arr) {} // valid
void foo(int arr[]) {} // invalid
However, gcc 4.7.3 will happily accept both function definitions, even when compiled with gcc -Wall -Werror -std=c99 -pedantic-errors. Since I am not a C expert, I am unsure if maybe I misinterpreted what the standard is saying.
I also noticed that
size_t foo(int arr[]) { return sizeof(arr); }
will always return sizeof(int *) instead of the array size, which confirms my belief that int arr[] is handled as int * and gcc is just trying to make me feel more comfortable.
Can someone shed some light on this issue? Just for reference, this question arose from this comment.
Some context:
First of all, remember that when an expression of type "N-element array of T" appears in a context where it isn't the operand of the sizeof or unary & operator, or isn't a string literal being used to initialize another array in a declaration, it will be converted to an expression of type "pointer to T" and its value will be the address of the first element in the array.
That means when you pass an array argument to a function, the function will receive a pointer value as a parameter; the array expression is converted to a pointer type before the function is called.
That's all well and good, but why is arr[] allowed as a pointer declaration? I can't say that this is the reason for sure, but I suspect it's a holdover from the B language, from which C was derived. In fact, pretty much everything hinky or unintuitive about arrays in C is a holdover from B.
B was a "typeless" language; you didn't have different types for floats, integers, text, whatever. Everything was stored as fixed-size words, or "cells", and memory was treated as a linear array of cells. When you declared an array in B, as in
auto arr[10];
the compiler would set aside 10 cells for the array, and then set aside an additional 11th cell that would store an offset to the first element of the array, and that additional cell would be bound to the variable arr. As in C, array indexing in B was computed as *(arr + i); you'd take the value stored in arr, add an offset i, and dereference the result. Ritchie retained most of these semantics, with the huge exception of no longer setting aside storage for the pointer to the first element of the array; instead, that pointer value would be computed from the array expression itself when the code was translated. This is why array expressions are converted to pointer types, why &arr and arr give the same value, if different types (the address of the array and the address of the first element of the array are the same) and why an array expression cannot be the target of an assignment (there's nothing to assign to; no storage has been set aside for a variable independent of the array elements).
Now here's the fun bit; in B, you'd declare a "pointer" as
auto ptr[];
This had the effect of allocating the cell to store the offset to the first element of the array and binding it to ptr, but ptr didn't point anywhere in particular; you could assign it to point to various locations. I suspect that notation was held over for a couple of reasons:
Most of the guys who worked on the initial version of C were familiar with it;
It sort of emphasizes that the parameter represents an array in the caller;
Personally, I would have preferred that Ritchie had used * to designate pointers everywhere, but he didn't (or, alternately, use [] to designate a pointer in all contexts, not just a function parameter declaration). I will normally recommend that everyone use * notation for function parameters instead of [], simply because it more accurately conveys the type of the parameter, but I can understand why people would prefer the second notation.
Both your valid and invalid declarations are internally equivalent, i.e., the compiler converts the latter to the former.
What your function sees is the pointer to the first element of the array.
PS. The alternative would be to push the whole array on the stack, which would be grossly inefficient from both time and space viewpoints.

C programming: arrays and pointers [duplicate]

Possible Duplicate:
Is array name a pointer in C?
If I define:
int tab[4];
tab is a pointer, because if I display tab:
printf("%d", tab);
the code above will display the address to the first element in memory.
That's why I was wondering why we don't define an array like the following:
int *tab[4];
as tab is a pointer.
Thank you for any help!
tab is a pointer
No, tab is an array. An int[4] to be specific. But when you pass it as an argument to a function (and in many other contexts) the array is converted to a pointer to its first element. You can see the difference between arrays and pointers for example when you call sizeof array vs. sizeof pointer, when you try to assign to an array (that won't compile), and more.
int *tab[4];
declares an array of four pointers to int. I don't see how that is related to the confusion between arrays and pointers.
tab is not a pointer; it's an array of 4 integers. When passed to a function, it decays into a pointer to the first element:
int tab[4];
And this is another array but it holds 4 integer pointers:
int *tab[4];
Finally, for the sake of completeness, this is a pointer to an array of 4 integers, if you dereference this you get an array of 4 integers:
int (*tab)[4];
Your statement is wrong, but you are not far from the truth.
Arrays and pointers in C share the same arithmetic. The main difference is that an array is a container, while a pointer is like any other scalar variable: its purpose is to store a memory address and to carry information about the type of the value it points to.
I suggest reading something about pointer arithmetic:
Pointer Arithmetic
http://www.learncpp.com/cpp-tutorial/68-pointers-arrays-and-pointer-arithmetic/
Considering the Steve Jessop comment I would like to add a snippet that can introduce you to the simple and effective world of the pointer arithmetic:
#include <stdio.h>
int main()
{
    int arr[10] = {10,11,12,13,14,15,16,17,18,19};
    int pos = 3;
    printf("Arithmetic part 1 %d\n",arr[pos]);
    printf("Arithmetic part 2 %d\n",pos[arr]);
    return(0);
}
Arrays can behave like pointers, and in your case even look like pointers; you can apply exactly the same kind of arithmetic to them, but they are not pointers.
int *tab[4];
This definition means that the tab array contains pointers to int, not ints.
From C standard
Coding Guidelines
The implicit conversion of array objects to a
pointer to their first element is a great inconvenience in trying to
formulate stronger type checking for arrays in C. Inexperienced, in
the C language, developers sometimes equate arrays and pointers much
more closely than permitted by this requirement (which applies to uses
in expressions, not declarations). For instance, in:
file_1.c
extern int *a;
file_2.c
extern int a[10];
the two declarations of a are sometimes incorrectly assumed by
developers to be compatible. It is difficult to see what guideline
recommendation would overcome incorrect developer assumptions (or poor
training). If the guideline recommendation specifying a single point
of declaration is followed, this problem will not occur. Unlike the
function designator usage, developers are familiar with the fact that
objects having an array type are implicitly converted to
a pointer to their first element. Whether applying a unary & operator
to an operand having an array type provides readers with a helpful
visual cue or causes them to wonder about the intent of the author
(“what is that redundant operator doing there?”) is not known.
Example
static double a[5];
void f(double b[5])
{
    double (*p)[5] = &a;
    double **q = &b; /* This looks suspicious, */
    p = &b; /* and so does this. */
    q = &a;
}
If the array object has register storage class, the behavior is undefined.
Under most circumstances, an expression of array type will be converted ("decay") to an expression of pointer type, and the value of the expression will be the address of the first element in the array. The exceptions to this rule are when the array expression is an operand of the sizeof, _Alignof, or unary & operators, or is a string literal being used to initialize another array in a declaration.
int tab[4];
defines tab as a 4-element array of int. In the statement
printf("%d", tab); // which *should* be printf("%p", (void*) tab);
the expression tab is converted from type "4-element array of int" to "pointer to int".

Resources