I heard from a friend that two dimensional arrays in C are only supported syntactically.
He told me that I should use float arr[M * N] instead of float arr[M][N], because C compilers such as gcc can't guarantee that the data is laid out contiguously in memory on every system/platform.
I want to use this as an argument in my master's thesis, but I don't have a reference.
So first question:
Is what he's saying correct?
Second question:
Do you know of a book or an article where this statement can be found?
Thanks + Regards
No, he's wrong.
Look at the C standard. Some relevant bits (bold emphasis mine):
6.2.5 Types ¶20
An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type.
6.7.6.2 Array declarators ¶3 (note 142)
When several "array of" specifications are adjacent, a multidimensional array is declared.
6.5.2.1 Array subscripting ¶3
Successive subscript operators designate an element of a multidimensional array object. ... It follows from this that arrays are stored in row-major order (last subscript varies fastest).
And perhaps most explicitly, the example in 6.5.2.1 Array subscripting ¶4:
EXAMPLE Consider the array object defined by the declaration
int x[3][5];
Here x is a 3 × 5 array of ints; more precisely, x is an array of three element objects, each of which is an array of five ints. In the expression x[i], which is equivalent to (*((x)+(i))), x is first converted to a pointer to the initial array of five ints. Then i is adjusted according to the type of x, which conceptually entails multiplying i by the size of the object to which the pointer points, namely an array of five int objects. The results are added and indirection is applied to yield an array of five ints. When used in the expression x[i][j], that array is in turn converted to a pointer to the first of the ints, so x[i][j] yields an int.
Multidimensional arrays in C are just "arrays of arrays". They work fine and are 100% defined by the standard.
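As a rough illustration of the quoted paragraphs, the following sketch checks that a 3 × 5 array occupies exactly 15 ints and that each element sits at the row-major byte offset. The helper names offset_in_x and layout_is_row_major are made up for this example; offsets are measured through unsigned char pointers, which may always be used to inspect an object.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte offset of (*x)[i][j] from the start of the
   whole array object, measured through unsigned char pointers. */
size_t offset_in_x(int (*x)[3][5], size_t i, size_t j)
{
    assert(i < 3 && j < 5);
    return (size_t)((unsigned char *)&(*x)[i][j] - (unsigned char *)x);
}

/* Returns 1 if the array has no padding and is laid out in row-major order. */
int layout_is_row_major(void)
{
    int x[3][5];

    /* No padding: the whole array is exactly 3 * 5 ints. */
    if (sizeof x != 3 * 5 * sizeof(int))
        return 0;

    for (size_t i = 0; i < 3; i++)
        for (size_t j = 0; j < 5; j++)
            if (offset_in_x(&x, i, j) != (i * 5 + j) * sizeof(int))
                return 0;
    return 1;
}
```

On a conforming implementation layout_is_row_major() is expected to return 1.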
You may also find it helpful to read Section 6, Arrays and Pointers in the comp.lang.c FAQ.
The issue is a bit more subtle than the other answers make it sound:
While multi-dimensional arrays are (semantically, possibly not physically) contiguous, pointer arithmetic is only defined if you stay within the bounds of the array your pointer originally referenced (actually, you can go one element past the upper bound, but only if you don't dereference).
This means that language semantics forbid walking through a multi-dimensional array from start to end, and a bounds-checking implementation of the C language (which is possible in principle but rarely seen in the wild for performance reasons) could raise a segfault, print a diagnostic or make demons fly from your nose whenever you cross a sub-array's boundary.
I'm not sure if compilers use this information for optimization purposes, but in principle, they could. For example, if you have
float *p = &arr[2][3];
float *q = &arr[5][9];
then p + x and q + y should never alias, regardless of the values of x and y.
Section 6.2.5 ¶20 requires that arrays be contiguously allocated. This applies as much to an array of arrays as it does to a one-dimensional array.
Your friend is simply wrong.
Built-in multi-dimensional arrays in C are implemented through index translation. This means that, for example, a 3D array T a[M][N][K] is implemented as a 1D array T a_impl[M * N * K], with multi-dimensional access a[i][j][k] being implicitly translated into the single-dimensional access a_impl[((i * N) + j) * K + k]. The language specification does not explicitly describe this implementation, however the requirements mandate it pretty much directly.
Taking this into account, it is not clear why your friend would tell you to use float arr[M * N] explicitly instead of relying on the implicit implementation of the same thing by the compiler.
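The index translation described above can be checked with a small sketch. The sizes M, N, K and the function names flat_index and translation_matches are assumptions chosen for illustration; the check compares byte offsets rather than indexing across sub-array bounds.

```c
#include <assert.h>
#include <stddef.h>

enum { M = 2, N = 3, K = 4 };   /* illustrative sizes, not from the question */

/* Flat index that the built-in access a[i][j][k] corresponds to. */
size_t flat_index(size_t i, size_t j, size_t k)
{
    assert(i < M && j < N && k < K);
    return ((i * N) + j) * K + k;
}

/* Returns 1 if every &a[i][j][k] lies flat_index(i, j, k) elements
   past the start of the array. */
int translation_matches(void)
{
    static float a[M][N][K];

    for (size_t i = 0; i < M; i++)
        for (size_t j = 0; j < N; j++)
            for (size_t k = 0; k < K; k++) {
                size_t bytes = (size_t)((unsigned char *)&a[i][j][k]
                                        - (unsigned char *)&a);
                if (bytes != flat_index(i, j, k) * sizeof(float))
                    return 0;
            }
    return 1;
}
```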
The situation that might make you consider the float arr[M * N] approach is when both M and N are run-time values and your compiler does not support variable-length arrays (or you, for some reason, do not want to use them). In such cases the built-in support for multidimensional arrays is no longer applicable, since it relies on all sizes (except the first one) being compile-time constants. Maybe this is what your friend had in mind.
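For that run-time-sized case, a minimal sketch of the flat approach could look like this. The struct matrix type and the matrix_* functions are hypothetical names introduced here; the sketch does not guard against rows * cols overflowing.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper type: a rows x cols matrix stored in one flat block. */
struct matrix {
    size_t rows, cols;
    float *data;        /* rows * cols elements, row-major */
};

/* Allocates a zero-initialized matrix; returns 0 on success, -1 on failure. */
int matrix_init(struct matrix *m, size_t rows, size_t cols)
{
    m->rows = rows;
    m->cols = cols;
    m->data = calloc(rows * cols, sizeof *m->data);
    return m->data ? 0 : -1;
}

/* Pointer to element (r, c), using the usual row-major formula. */
float *matrix_at(struct matrix *m, size_t r, size_t c)
{
    assert(r < m->rows && c < m->cols);
    return &m->data[r * m->cols + c];
}

/* Releases the storage and leaves the matrix empty. */
void matrix_free(struct matrix *m)
{
    free(m->data);
    m->data = NULL;
}
```

Because data is a single allocation, every index computed by matrix_at stays inside one array object, so the bounds question discussed elsewhere for true multidimensional arrays does not arise.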
Related
I imagine we all agree that it is considered idiomatic C to access a true multidimensional array by dereferencing a (possibly offset) pointer to its first element in a one-dimensional fashion, e.g.:
void clearBottomRightElement(int *array, int M, int N)
{
array[M*N-1] = 0; // Pretend the array is one-dimensional
}
int mtx[5][3];
...
clearBottomRightElement(&mtx[0][0], 5, 3);
However, the language-lawyer in me needs convincing that this is actually well-defined C! In particular:
Does the standard guarantee that the compiler won't put padding in-between e.g. mtx[0][2] and mtx[1][0]?
Normally, indexing off the end of an array (other than one-past the end) is undefined (C99, 6.5.6/8). So the following is clearly undefined:
struct {
int row[3]; // The object in question is an int[3]
int other[10];
} foo;
int *p = &foo.row[7]; // ERROR: A crude attempt to get &foo.other[4];
So by the same rule, one would expect the following to be undefined:
int mtx[5][3];
int (*row)[3] = &mtx[0]; // The object in question is still an int[3]
int *p = &(*row)[7]; // Why is this any better?
So why should this be defined?
int mtx[5][3];
int *p = &(&mtx[0][0])[7];
So what part of the C standard explicitly permits this? (Let's assume c99 for the sake of discussion.)
EDIT
Note that I have no doubt that this works fine in all compilers. What I'm querying is whether this is explicitly permitted by the standard.
All arrays (including multidimensional ones) are padding-free. Even if it's never explicitly mentioned, it can be inferred from sizeof rules.
Now, array subscripting is a special case of pointer arithmetic, and C99 section 6.5.6, §8 states clearly that behaviour is only defined if the pointer operand and the resulting pointer lie in the same array (or one element past), which makes bounds-checking implementations of the C language possible.
This means that your example is, in fact, undefined behaviour. However, as most C implementations do not check bounds, it will work as expected - most compilers treat undefined pointer expressions like
mtx[0] + 5
identically to well-defined counterparts like
(int *)((char *)mtx + 5 * sizeof (int))
which is well-defined because any object (including the whole two-dimensional array) can always be treated as a one-dimensional array of type char.
On further meditation on the wording of section 6.5.6, splitting out-of-bounds access into seemingly well-defined subexpressions like
(mtx[0] + 3) + 2
reasoning that mtx[0] + 3 is a pointer to one element past the end of mtx[0] (making the first addition well-defined) as well as a pointer to the first element of mtx[1] (making the second addition well-defined) is incorrect:
Even though mtx[0] + 3 and mtx[1] + 0 are guaranteed to compare equal (see section 6.5.9, §6), they are semantically different. For example, the former can't be dereferenced and thus does not point to an element of mtx[1].
The only obstacle to the kind of access you want to do is that objects of type int [5][3] and int [15] are not allowed to alias one another. Thus if the compiler is aware that a pointer of type int * points into one of the int [3] arrays of the former, it could impose array bounds restrictions that would prevent accessing anything outside that int [3] array.
You might be able to get around this issue by putting everything inside a union that contains both the int [5][3] array and the int [15] array, but I'm really unclear on whether the union hacks people use for type-punning are actually well-defined. This case might be slightly less problematic since you would not be type-punning individual cells, only the array logic, but I'm still not sure.
One special case that should be noted: if your type were unsigned char (or any char type), accessing the multi-dimensional array as a one-dimensional array would be perfectly well-defined. This is because the one-dimensional array of unsigned char that overlaps it is explicitly defined by the standard as the "representation" of the object, and is inherently allowed to alias it.
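A sketch of the character-type route: the whole int [5][3] object is walked through an unsigned char pointer, crossing row boundaries at the byte level. The function names zero_bytes and clear_and_sum are invented for this example, and it relies on all-bits-zero representing the value 0 for int.

```c
#include <assert.h>
#include <stddef.h>

/* Zeroes an object byte by byte through unsigned char. */
void zero_bytes(void *obj, size_t size)
{
    unsigned char *bytes = obj;

    assert(obj != NULL);
    for (size_t i = 0; i < size; i++)
        bytes[i] = 0;
}

/* Fills a 5 x 3 array with nonzero values, clears it through its byte
   representation, and returns the sum of all elements afterwards. */
int clear_and_sum(void)
{
    int mtx[5][3];
    int sum = 0;

    for (int r = 0; r < 5; r++)
        for (int c = 0; c < 3; c++)
            mtx[r][c] = r * 3 + c + 1;

    zero_bytes(mtx, sizeof mtx);

    for (int r = 0; r < 5; r++)
        for (int c = 0; c < 3; c++)
            sum += mtx[r][c];
    return sum;
}
```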
There is certainly no padding between the elements of an array.
There are provisions for doing address computation in a smaller size than the full address space. This could be used, for instance, in the huge mode of the 8086, so that the segment part would not always be updated if the compiler knew that you couldn't cross a segment boundary. (It's too long ago for me to remember whether the compilers I used took advantage of that or not.)
With my internal model -- I'm not sure it is perfectly the same as the standard one and it is too painful to check, the information being distributed everywhere --
what you are doing in clearBottomRightElement is valid.
int *p = &foo.row[7]; is undefined
int i = mtx[0][5]; is undefined
int *p = &row[7]; doesn't compile (gcc agrees with me)
int *p = &(&mtx[0][0])[7]; is in the gray zone (the last time I checked something like this in detail, I ended up considering it invalid in C90 and valid in C99; it could be the case here, or I could have missed something).
My understanding of the C99 standard is that there is no requirement that multidimensional arrays be laid out contiguously in memory. This follows from the only relevant information I found in the standard (each dimension is guaranteed to be contiguous).
If you want to use the x[COLS*r + c] access, I suggest you stick to single dimension arrays.
Array subscripting
Successive subscript operators designate an element of a multidimensional array object.
If E is an n-dimensional array (n ≥ 2) with dimensions i × j × . . . × k, then E (used as
other than an lvalue) is converted to a pointer to an (n − 1)-dimensional array with
dimensions j × . . . × k. If the unary * operator is applied to this pointer explicitly, or
implicitly as a result of subscripting, the result is the pointed-to (n − 1)-dimensional array,
which itself is converted into a pointer if used as other than an lvalue. It follows from this
that arrays are stored in row-major order (last subscript varies fastest).
Array type
— An array type describes a contiguously allocated nonempty set of objects with a
particular member object type, called the element type.36) Array types are
characterized by their element type and by the number of elements in the array. An
array type is said to be derived from its element type, and if its element type is T , the
array type is sometimes called ‘‘array of T ’’. The construction of an array type from
an element type is called ‘‘array type derivation’’.
Is this always the case, I mean, that an array name is always a pointer to the first element of the array? Why is that so? Is it an implementation detail or a language feature?
An array name is not itself a pointer, but decays into a pointer to the first element of the array in most contexts. It's that way because the language defines it that way.
From C11 6.3.2.1 Lvalues, arrays, and function designators, paragraph 3:
Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type "array of type" is converted to an expression with type "pointer to type" that points to the initial element of the array object and is not an lvalue.
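The conversion and two of its exceptions can be observed directly. This is only a sketch; decay_observations is a name made up for the example.

```c
#include <assert.h>

/* Returns 1 if the observations noted in the comments hold for int a[3]. */
int decay_observations(void)
{
    int a[3] = {1, 2, 3};
    int *p = a;             /* a is converted to &a[0] */
    int (*whole)[3] = &a;   /* operand of &: no conversion, pointer to the array */

    assert(p == &a[0]);
    return sizeof a == 3 * sizeof(int)      /* operand of sizeof: no conversion */
        && sizeof *whole == sizeof a
        && (void *)whole == (void *)p;      /* same address, different type */
}
```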
You can learn more about this topic (and lots about the subtle behaviour involved) from the Arrays and Pointers section of the comp.lang.c FAQ.
Editorial aside: The same kind of behaviour takes place in C++, though the language specifies it a bit differently. For reference, from a C++11 draft I have here, 4.2 Array-to-pointer conversion, paragraph 1:
An lvalue or rvalue of type "array of N T" or "array of unknown bound of T" can be converted to an rvalue of type "pointer to T". The result is a pointer to the first element of the array.
The historical reason for this behavior can be found here.
C was derived from an earlier language named B (go figure). B was a typeless language, and memory was treated as a linear array of "cells", basically unsigned integers.
In B, when you declared an N-element array, as in
auto a[10];
N cells were allocated for the array, and another cell was set aside to store the address of the first element, which was bound to the variable a. As in C, array indexing was done through pointer arithmetic:
a[j] == *(a+j)
This worked pretty well until Ritchie started adding struct types to C. The example he gives in the paper is a hypothetical file system entry, which is a node id followed by a name:
struct {
int inumber;
char name[14];
};
He wanted the contents of the struct type to match the data on the disk; 2 bytes for an integer immediately followed by 14 bytes for the name. There was no good place to stash the pointer to the first element of the array.
So he got rid of it. Instead of setting aside storage for the pointer, he designed the language so that the pointer value would be computed from the array expression itself.
This, incidentally, is why an array expression cannot be the target of an assignment; it's effectively the same thing as writing 3 = 4; - you'd be trying to assign a value to another value.
Carl Norum has given the language-lawyer answer on the question (and got my upvote on it), here comes the implementation detail answer:
To the computer, any object in memory is just a range of bytes, and, as far as memory handling is concerned, uniquely identified by the address of the first byte and a size in bytes. Even when you have an int in memory, its address is nothing more or less than the address of its first byte. The size is almost always implicit: If you pass a pointer to an int, the compiler knows its size because it knows that the bytes at that address are to be interpreted as an int. The same goes for structures: their address is the address of their first byte, their size is implicit.
Now, the language designers could have implemented a similar semantic with arrays as they did with structures, but they didn't for a good reason: Copying was then even more inefficient than now compared to just passing a pointer, structures were already passed around using pointers most of the time, and arrays are usually meant to be large. Prohibitively large to force value semantics on them by language.
Thus, arrays were just forced to be memory objects at all times by specifying that the name of an array would be virtually equivalent to a pointer. In order not to break the similarity of arrays to other memory objects, the size was again said to be implicit (to the language implementation, not the programmer!): The compiler could just forget about the size of an array when it was passed somewhere else and rely on the programmer to know how many objects were inside the array.
This had the benefit that array accesses are excruciatingly simple; they decay to a matter of pointer arithmetic, of multiplying the index by the size of the object in the array and adding that offset to the pointer. It's the reason why a[5] is exactly the same as 5[a]: it's a shorthand for *(a + 5).
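The equivalence is easy to try out. In this small sketch, read_three_ways is a hypothetical function name.

```c
#include <assert.h>

/* Reads element i in three equivalent ways and returns it. */
int read_three_ways(const int *a, int i)
{
    int x = a[i];
    int y = i[a];       /* same as *(i + a) */
    int z = *(a + i);

    assert(x == y && y == z);
    return x;
}
```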
Another performance-related aspect is that it is excruciatingly simple to make a subarray from an array: only the start address needs to be calculated. There is nothing that would force us to copy the data into a new array, we just have to remember to use the correct size...
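As a sketch of that subarray idea, a slice is nothing more than an offset pointer plus a length the caller keeps track of. The names sum_ints and sum_slice are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Sums n ints starting at a. */
long sum_ints(const int *a, size_t n)
{
    long total = 0;

    for (size_t i = 0; i < n; i++)
        total += a[i];
    return total;
}

/* Sums a[first] .. a[first + count - 1] without copying anything:
   the subarray is just the start address a + first and a length. */
long sum_slice(const int *a, size_t len, size_t first, size_t count)
{
    assert(first <= len && count <= len - first);
    return sum_ints(a + first, count);
}
```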
So, yes, it has profound reasons in terms of implementation simplicity and performance that array names decay to pointers the way they do, and we should be glad for it.
I have the following code:
int *pa;
int a[3] = {1, 2, 3};
Why pa = a is ok, but a = pa is not allowed?
The main difference is that the type of a is still an array; it merely decays into a pointer when you do pa = a;. pa will then point to the first element of the array, not to the entire array itself. When you do a = pa, it does not make sense, as you are trying to make an object that holds 3 integers refer to something that can point only to a single integer.
Note: This is purely conceptual, this is not the actual reason why this happens.
I like to think of pointer assignment like OOP & Inheritance.
Imagine int * is a generic object. Now, think of int [] as an object that inherits from int *.
As you can see, you can cast down from int [] to int *, but you cannot cast upwards.
Well, the simple answer is that the language definition simply doesn't allow it - it's a design choice.
Chapter and verse:
6.5.16 Assignment operators
...
Constraints
2 An assignment operator shall have a modifiable lvalue as its left operand.
And what's a modifiable lvalue?
6.3.2.1 Lvalues, arrays, and function designators
1 An lvalue is an expression with an object type or an incomplete type other than void;53)
if an lvalue does not designate an object when it is evaluated, the behavior is undefined.
When an object is said to have a particular type, the type is specified by the lvalue used to
designate the object. A modifiable lvalue is an lvalue that does not have array type, does
not have an incomplete type, does not have a const-qualified type, and if it is a structure
or union, does not have any member (including, recursively, any member or element of
all contained aggregates or unions) with a const-qualified type.
...
53) The name ‘‘lvalue’’ comes originally from the assignment expression E1 = E2, in which the left
operand E1 is required to be a (modifiable) lvalue. It is perhaps better considered as representing an
object ‘‘locator value’’. What is sometimes called ‘‘rvalue’’ is in this International Standard described
as the ‘‘value of an expression’’.
Emphasis added.
Array expressions in C are treated differently than most other expressions. The reason for this is explained in an article Dennis Ritchie wrote about the development of the C language:
NB existed so briefly that no full description of it was written. It supplied the types int and char, arrays of them, and pointers to them, declared in a style typified by
int i, j;
char c, d;
int iarray[10];
int ipointer[];
char carray[10];
char cpointer[];
The semantics of arrays remained exactly as in B and BCPL: the declarations of iarray and carray create cells dynamically initialized with a value pointing to the first of a sequence of 10 integers and characters respectively. The declarations for ipointer and cpointer omit the size, to assert that no storage should be allocated automatically. Within procedures, the language's interpretation of the pointers was identical to that of the array variables: a pointer declaration created a cell differing from an array declaration only in that the programmer was expected to assign a referent, instead of letting the compiler allocate the space and initialize the cell.
Values stored in the cells bound to array and pointer names were the machine addresses, measured in bytes, of the corresponding storage area. Therefore, indirection through a pointer implied no run-time overhead to scale the pointer from word to byte offset. On the other hand, the machine code for array subscripting and pointer arithmetic now depended on the type of the array or the pointer: to compute iarray[i] or ipointer+i implied scaling the addend i by the size of the object referred to.
These semantics represented an easy transition from B, and I experimented with them for some months. Problems became evident when I tried to extend the type notation, especially to add structured (record) types. Structures, it seemed, should map in an intuitive way onto memory in the machine, but in a structure containing an array, there was no good place to stash the pointer containing the base of the array, nor any convenient way to arrange that it be initialized. For example, the directory entries of early Unix systems might be described in C as
struct {
int inumber;
char name[14];
};
I wanted the structure not merely to characterize an abstract object but also to describe a collection of bits that might be read from a directory. Where could the compiler hide the pointer to name that the semantics demanded? Even if structures were thought of more abstractly, and the space for pointers could be hidden somehow, how could I handle the technical problem of properly initializing these pointers when allocating a complicated object, perhaps one that specified structures containing arrays containing structures to arbitrary depth?
The solution constituted the crucial jump in the evolutionary chain between typeless BCPL and typed C. It eliminated the materialization of the pointer in storage, and instead caused the creation of the pointer when the array name is mentioned in an expression. The rule, which survives in today's C, is that values of array type are converted, when they appear in expressions, into pointers to the first of the objects making up the array.
This invention enabled most existing B code to continue to work, despite the underlying shift in the language's semantics. The few programs that assigned new values to an array name to adjust its origin—possible in B and BCPL, meaningless in C—were easily repaired. More important, the new language retained a coherent and workable (if unusual) explanation of the semantics of arrays, while opening the way to a more comprehensive type structure.
It's a good article, and well worth reading if you're interested in the "whys" of C.
int a[10];
int b[10];
a = b; // illegal
typedef struct {
int real;
int imag;
} complex;
complex c,d;
c = d; //legal
[I realize that a and b are addresses in the 1st case, but symbols in the 2nd case]
For historical info, this may be interesting: http://cm.bell-labs.com/who/dmr/chist.html
In B, declaring an array would set aside memory for the array, just as C does, but the name supplied for the variable was used to define a pointer to the array. Ritchie changed this in C, so that the name "is" the array but can decay to a pointer when used:
The rule, which survives in today's C, is that values of array type
are converted, when they appear in expressions, into pointers to the
first of the objects making up the array.
This invention enabled most existing B code to continue to work,
despite the underlying shift in the language's semantics. The few
programs that assigned new values to an array name to adjust its
origin—possible in B and BCPL, meaningless in C—were easily repaired.
If at that very early stage, Ritchie had defined a = b to copy the array, then the code he was trying to port from B to C would not have been as easily repaired. As he defined it, that code would give an error, and he could fix it. If he'd made C copy the array, then he would have silently changed the meaning of the code to copy the array rather than reseating the name being used to access an array.
There's still the question, "why hasn't this feature been added in the 40 years since", but I think that's why it wasn't there to start with. It would have been effort to implement, and that effort would actually have made that early version of C worse, in the sense of being slightly harder to port B and BCPL code to C. So of course Ritchie didn't do it.
Because C says you can't: it says "A modifiable lvalue is an lvalue that does not have array type, does not have an incomplete type", so an array can't be assigned to.
Moreover,
when you use the name of an array in a value context like a = b;, both the names a and b mean &a[0] and &b[0]. This is often referred to as an array "decaying" to a pointer.
However, arrays are not pointers, so trying to assign an array by using pointers wouldn't make sense.
The main reason is of course the Standard. On the assignment operator constraint it says:
(C99, 6.5.16p2) "An assignment operator shall have a modifiable lvalue as its left operand"
where it defines a modifiable lvalue as
(C99, 6.3.2.1p1) "A modifiable lvalue is an lvalue that does not have array type, [...]".
So assigning to arrays is not permitted.
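Since a = b is rejected, the usual way to copy the contents is memcpy from <string.h>. A minimal sketch, with copy_ints as a made-up wrapper name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies n ints from src to dst; the two arrays must not overlap. */
void copy_ints(int *dst, const int *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, n * sizeof *dst);
}
```

For two arrays of the same type declared in the same scope, memcpy(a, b, sizeof a) has the same effect.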
But the main reasons are historical: at the time, array copy was considered not appropriate for the hardware (the old PDP systems). Note also that in the first versions of C, the assignment of objects of structure type was not allowed either. It was later added to the language, but for arrays too many parts of the language would have needed to be changed to allow assignment.
The first thing to understand is that arrays are not pointers. Read section 6 of the comp.lang.c FAQ. I'll wait.
...
Ok, done? No, go back and read the whole thing.
...
Great, thanks.
Generally speaking, arrays in C are second class citizens. There are array types, and array objects, and even array values, but arrays are almost always manipulated via pointers to their elements.
This requires a bit more work for the programmer (as you've seen, you can't just assign arrays), but it also gives you more flexibility. Programs that deal with arrays usually need to deal with arrays of different sizes, even sizes that can't be determined until execution time. Even if array assignment were permitted, an array of 10 ints and an array of 20 ints are of different and incompatible types. If you have a fixed-size array, as in the code in your question, it's common for only some of the elements to be currently relevant; you might have a 10-element array, but you're only currently using the first 5 elements. Processing such an array element-by-element makes it easier to process only the elements that are currently active (something you have to keep track of yourself).
For a struct, on the other hand, the number and types of the members are determined when you define the type. You can't traverse the members of a struct by advancing a pointer, as you would for an array, since the members are typically of different types. Arrays and structures are different things, and they have different sets of operations that make sense for them.
There are a couple of rules in the language that try to make it easier to do this, namely:
An array expression, in most but not all contexts, is implicitly converted to a pointer to the array's first element. The exceptions are:
When the array expression is the operand of the & (address) operator;
When it's the operand of sizeof; and
When it's a string literal in an initializer, used to initialize an array object.
A declared array parameter, as in int func(char s[]); is adjusted to a pointer parameter: int func(char *s);.
(One could argue that these rules cause more confusion than they prevent, but that's the way the language is defined.)
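The parameter adjustment rule can be seen by declaring the same function both ways; a compiler accepts the two prototypes below as compatible. The name count_chars is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Both prototypes declare the same function: char s[] is adjusted to char *s. */
size_t count_chars(char s[]);
size_t count_chars(char *s);

/* Counts characters up to the terminating '\0'. Inside the function,
   s is an ordinary pointer; the caller's array size is not available. */
size_t count_chars(char s[])
{
    size_t n = 0;

    assert(s != NULL);
    while (s[n] != '\0')
        n++;
    return n;
}
```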
Now I suppose the language could have been defined, or could be redefined, so that array assignment is permitted in cases where it makes sense:
int a[10];
int b[10];
/* ... */
a = b; /* Why not? */
Perhaps such a change could even be made without breaking existing code. But that would require another special case for the array-to-pointer conversion rule. And it would only be useful in the case of fixed-size arrays like a and b, which though they're quite common in introductory programming exercises, are not as common in production code.
An array name is a const pointer so you can't change what it is pointing to.
Assuming that you meant c = d on the last line is legal, it's simply copying a non-const variable to another non-const variable, which is perfectly legal.
a is effectively a "pointer" to the first element of the array, and a constant one,
so you are trying to assign to something that is not a modifiable lvalue.
you can achieve what are you trying to do by :
struct arraystruct
{
int t[10];
};
struct arraystruct a,b;
a=b;
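EDIT: well, I forgot to mention that there are a few exceptions where an array should not be considered as a pointer:
- sizeof(array) yields the size of the whole array, whereas sizeof(pointer) yields only the size of the pointer
- a string literal used to initialize an array
- &a yields the same address as a, but as a pointer to the whole array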
That is because the kind of array you are using is a so-called static array, i.e. the memory for it is typically on the stack.
If you used dynamic arrays (with pointers), your assignment would be legal (but a memory leak would be possible). This would be a shallow copy.
See also Static array vs. dynamic array in C++