Wrong array indexing does not cause error - c

Consider the following program:
#include <stdio.h>
int main(void)
{
int a[] = {1, 2, 3};
for (size_t i = 0; i < 3; i++)
printf ("%i\n", a[0, i]);
return 0;
}
Obviously, the one-dimensional array a is accessed like a two-dimensional array in, for example, Python. However, this code does compile, with an unused-value warning. I expected it to produce an error, because I always thought this form of multi-indexing is simply wrong in C (see K&R page 112). To my surprise, the above code indeed prints out the array elements.
If you change a[0, i] on line six to a[i, 0], the first array element is printed three times. If you use a[i, 1], the second element is printed three times.
How is a syntactically wrong multi-index on a one-dimensional array translated to pointer arithmetic, and which value in a[i, 0] is reported as unused?
And, yes, I know how to multi-index in C.

0, i is a valid expression in C. The comma is an operator that evaluates both operands and discards the result of the left operand. When used in a[0, i], it is equivalent to a[i]. And a[i, 0] is equivalent to a[0].
(Note that in function calls such as f(a, b, c), the comma is an argument separator. This is a different part of the C grammar, and the comma is not an operator in this context.)

The comma here is the comma operator. It is not multi-indexing (which would have been written as [0][i] or [i][0]).
Quoting C11, chapter §6.5.17 (emphasis mine)
The left operand of a comma operator is evaluated as a void expression; there is a
sequence point between its evaluation and that of the right operand. Then the right
operand is evaluated; the result has its type and value.
So, in your case,
a[0, i]
is the same as
a[i]
and
a[i, 0]
is the same as
a[0]

Related

Is *(arr -1) the same as arr[-1]?

I'm trying to solve this question that has a vector and there is a pointer that leads to a negative index. I'd like to know whether C ignores the sign and treats it as a positive number, or whether it goes backwards in the vector, with -1 being the last element of the array.
Yes, you can have negative indexes. This follows from the definition of pointer arithmetic in C, namely that p[i] is exactly equivalent to *(p + i), and it's perfectly legal for i to be a negative number.
So, no, the sign is not ignored. Also, if you're dealing directly with an array, there's no rule that says that a negative index is computed with respect to the end of the array. If you're dealing directly with an array, a negative index ends up trying to access memory "off to the left" of the beginning of the array, and is quite against the rules, leading to undefined behavior.
You might wish to try this illustrative program:
#include <stdio.h>
int main()
{
int arr[] = {1, 2, 3, 4, 5, 6, 7, 8, 9};
int *p = &arr[4];
int *endp = &arr[8];
printf("%d %d %d\n", arr[0], arr[4], arr[8]); /* prints "1 5 9" */
printf("%d %d %d\n", p[-1], p[0], p[1]); /* prints "4 5 6" */
printf("%d %d %d\n", endp[-2], endp[-1], endp[0]); /* prints "7 8 9" */
printf("x %d %d x\n", arr[-1], endp[1]); /* WRONG, undefined */
}
The last line is, as the comment indicates, wrong. It attempts to access memory outside of the array arr. It will print "random" garbage values, or theoretically it might crash.
Is *(arr -1) the same as arr[-1]?
Yes. Strictly speaking, the array subscripting [] operator is defined as arr[n] being equivalent to *( (arr) + (n) ). In this case: *( (arr) + (-1) ).
or it goes backwards in the vector
Well it goes backwards until it reaches array item 0, from there on you access the array out of bounds. It doesn't "wrap around" the index or anything like that. Example:
int arr[3] = ...;
int* ptr=arr+1;
printf("%d\n", ptr[-1]); // ok, prints item 0
ptr = arr; // reuse the pointer; declaring ptr a second time would not compile
printf("%d\n", ptr[-1]); // not ok, access out-of-bounds.
According to the C Standard (6.5.2.1 Array subscripting)
2 A postfix expression followed by an expression in square brackets []
is a subscripted designation of an element of an array object. The
definition of the subscript operator [] is that E1[E2] is identical to
(*((E1)+(E2))). Because of the conversion rules that apply to the
binary + operator, if E1 is an array object (equivalently, a pointer
to the initial element of an array object) and E2 is an integer,
E1[E2] designates the E2-th element of E1 (counting from zero).
So the expression arr[-1] is evaluated as *( arr + -1 ) that is the same as *( arr - 1).
However, note that arr[-1] is not the same as -1[arr], because a postfix expression must appear before the square brackets
postfix-expression [ expression ]
and the unary minus operator has lower precedence than a postfix expression, so -1[arr] is parsed as -(1[arr]).
But you may write ( -1 )[arr] that is the same as arr[-1].

Why Pointer assignment shows lvalue error when assignments look appropriate?

I am given a piece of code for which we have to guess output.
My Output: 60
#include <stdio.h>
int main()
{
int d[] = {20,30,40,50,60};
int *u,k;
u = d;
k = *((++u)++);
k += k;
(++u) += k;
printf("%d",*(++u));
return 0;
}
Expected:
k = *((++u)++) will be equal to 30 as it will iterate once(++u) and then will be iterated but not assigned. So we are in d[1].
(++u) += k here u will iterate to next position, add k to it and then assign the result to further next element of u.
Actual result:
main.c: In function ‘main’:
main.c:16:16: error: lvalue required as increment operand
k = *((++u)++);
^
main.c:18:11: error: lvalue required as left operand of assignment
(++u) += k;
And this has confused me further in concepts of pointers. Please help.
As the compiler has told you, the program is not valid C.
In C, pre-increment results in an rvalue expression, which you may not assign to or increment.
It's not a logical problem; it's a language problem. You should split that complex formula into multiple code statements.
That's all there is to it.
(In C++ it's an lvalue though and you can do both those things.)
In C, ++a is not an l-value.
Informally this means that you can't have it on the left hand side of an assignment.
It also means that you can't increment it.
So (++a)++ is invalid code.
(Note that it is valid C++).

Achieve the output in one statement

I was given this question by my school teacher. I was supposed to add in one statement in the C code and achieve this desired output.
I have tried but I am stuck. I think the main idea of this question is to establish the relationship between the int x[] and the y[] string as i increases from 0 to 6.
The code is below:
#include <stdio.h>
int main(){
int i, x[] = {-5,10,-10,-2,23,-20};
char y[20] = "goodbye";
char * p = y;
for (i=0;i<6;i++){
*(p + i) = //Fill in the one line statement here
}
y[6] = '\0';
printf("%s\n",p); //should print out "byebye"
}
As you can see, the ASCII value of the character b is 5 less than that of g, and similarly y is 10 greater than o, and so on for the remaining characters. So a solution that uses the values of x is:
*(p+i) = (char)(*(p+i)+x[i]);
Yes, one thing that rici mentioned is very important: *(p+i) is nothing other than p[i]. In fact p[i] is much cleaner to use, and underneath it is still calculated as *(p+i).
From standard 6.5.2.1p2 C11 N1570
A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero).
That being said, it would be as simple as
p[i]+=x[i];
Other thoughts that came to my mind while solving.
The very first thing that came to mind (this establishes no relation between x and y):
*(p + i) = "byebye"[i];
String literals are arrays; here the literal decays into a pointer to its first element, and then we do *(decayed pointer + i). This assigns the characters of "byebye" to the char array y.
Or something like this (too many hardcoded values, and it does not use x either):
*(p+i) = *(y+4+i%3);
Using the modulus operation, you can make the loop assign byebye to the 6 char values that p points to.
This works because you are starting from y[4] which is 'b'.
The 6 in the for loop is your next hint. You need to iterate through bye twice. bye has 3 characters.
This gives you:
*(p + i) = y[4+(i%3)];

Unclear behavior of "," operator in C

In a given code I found following sequence,
data = POC_P_Status, TE_OK;
I don't understand what that does mean.
Does the data element receive the first or the second element or something else?
Update:
I read somewhere that this behavior is like this,
if I wrote:
if(data = POC_P_Status, TE_OK) { ... }
then the if clause will be true if TE_OK is true.
What do you mean?
It stores POC_P_Status into data.
i = a, b; // stores a into i.
This is equivalent to
(i = a), b;
because the comma operator has lower precedence than assignment.
It's equivalent to the following code:
data = POC_P_Status;
TE_OK;
In other words, it assigns POC_P_Status to data and evaluates to TE_OK.
In your first case, the expression stands alone, so TE_OK is meaningful only if it's a macro with side effects. In the second case, the expression is actually part of an if statement, so it always evaluates to the value of TE_OK. The statement could be rewritten as:
data = POC_P_Status;
if (TE_OK) { ... }
From the C99 draft N1124 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1124.pdf):
The left operand of a comma operator is evaluated as a void
expression; there is a sequence point after its evaluation. Then the
right operand is evaluated; the result has its type and value. If
an attempt is made to modify the result of a comma operator or to
access it after the next sequence point, the behavior is undefined.
That means that in the expression:
a, b
The a is evaluated and thrown away, and then b is evaluated. The value of the whole expression is equal to b:
(a, b) == b
Comma operator is often used in places where multiple assignments are necessary but only one expression is allowed, such as for loops:
for (int i=0, z=length; i < z; i++, z--) {
// do things
}
Comma in other contexts, such as function calls and declarations, is not a comma operator:
int func(int a, int b) {...}
              ^
              |
              Not a comma operator
int a, b;
     ^
     |
     Not a comma operator

Expression x[--i] = y[++i] = z[i++], which is evaluated first?

When the evaluation of l-value precedes the evaluation of r-value and the assignment also returns a value, which of the following is evaluated first?
int i = 2;
int x[] = {1, 2, 3};
int y[] = {4, 5, 6};
int z[] = {7, 8, 9};
x[--i] = y[++i] = z[i++]; // Out of bound exception or not?
NOTE: generic C-like language with l-value evaluation coming first. From my textbook:
In some languages, for example C,
assignment is considered to be an
operator whose evaluation, in addition
to producing a side effect, also
returns the r-value thus computed.
Thus, if we write in C:
x = 2;
the evaluation of such a command, in
addition to assigning the value 2 to x,
returns the value 2. Therefore, in C,
we can also write:
y = x = 2;
which should be interpreted as:
(y = (x = 2));
I'm quite certain that the behaviour in this case is undefined, because you are modifying and reading the value of the variable i multiple times between consecutive sequence points.
Also, in C, arrays are declared by placing the [] after the variable name, not after the type:
int x[] = {1, 2, 3};
Edit:
Remove the arrays from your example, because they are [for the most part] irrelevant. Consider now the following code:
int main(void)
{
int i = 2;
int x = --i + ++i + i++;
return x;
}
This code demonstrates the operations that are performed on the variable i in your original code but without the arrays. You can see more clearly that the variable i is being modified more than once in this statement. When you rely on the state of a variable that is modified between consecutive sequence points, the behaviour is undefined. Different compilers will (and do, GCC returns 6, Clang returns 5) give different results, and the same compiler can give different results with different optimization options, or for no apparent reason at all.
If this statement has no defined behaviour because i is modified several times between consecutive sequence points, then the same can be said for your original code. The assignment operator does not introduce a new sequence point.
General
In C, the order of operations between two sequence points should not be depended on. I do not remember the exact wording from the standard, but it is for this reason that
i = i++;
is undefined behaviour. The standard defines a list of things that makes up sequence points, from memory this is
the semicolon after a statement
the comma operator
evaluation of all function arguments before the call to the function
the && and || operators
Looking up the page on Wikipedia, the list there is more complete and more detailed. Sequence points are an extremely important concept in C, and if you do not already know what they mean, do learn about them immediately.
Specific
No matter how well defined the order of evaluation and assignment of the x, y and z variables are, for
x[--i] = y[++i] = z[i++];
this statement cannot be anything but undefined behaviour because of the --i, ++i and i++.
On the other hand
x[i] = y[i] = z[i];
is well defined, but I am not sure about the order of evaluation here. If the order is important, I would prefer to split this into two statements, along with a comment such as "It is important that ... is assigned/initialized before ... because ...".
I think it's the same as
x[3] = y[4] = z[2];
i = 3;

Resources