Numbers without assignment or statement in C - c

What happens to values or text that appear in code without an assignment? How does the compiler handle them? Do they affect the performance of the code (what does the CPU do with them)?
Here is the example code. I just wrote 3, 5 and "abcd", and the code doesn't throw any error.
int main()
{
// Here I declared the variables
int a, b;
// Here there are some numbers
3;
5;
"abcd";
return 0;
}

The compiler will tell you:
<source>:9:5: warning: expression result unused [-Wunused-value]
If the expression result is not used, the compiler will not generate any code for it. It is very likely that the string literal will not be present in the object and executable files.
As @EricPostpischil stated, if the expression has a side effect, it will be evaluated.
examples:
p++;
1 + foo(p);
1 + p++;
foo(p);
Also, if you declare objects as volatile, so that merely accessing them counts as a side effect, expressions of this kind will be evaluated.
Example:
volatile int a, b;
a; //this will generate the code
b; //this will generate the code
https://godbolt.org/z/fa4sa51nq

C 2018 6.8.3 says:
The expression in an expression statement is evaluated as a void expression for its side effects.
Thus no use is made of the value. Only side effects are useful. In a statement such as printf("Hello, world.\n");, a so-called side effect is to send the string to standard output.
Statements such as 3; and 5; have no side effects, and the main value is ignored, so they have no effects.
All expression statements you think of as “doing something”, such as:
printf("x = %d.\n", x);
b = 4;
FindMatchingThings(a, b, c);
++y;
actually do those things by way of their side effects: Sending data to a stream, updating the value of an object, or calling a function which has its own side effects.

Related

Return and ++ in C

The output of:
#include <stdio.h>
int Fun1(int *X)
{
return (*X)++;
}
int main(void)
{
int a = 0, b = 5;
a = Fun1(&b);
printf("a = %d. b = %d.\n", a, b);
}
is:
a = 5. b = 6.
When we return (*X)++, should not the function execution stop when we return (*X), so the ++ is not executed?
The return statement causes its operand to be evaluated. Evaluation of an expression includes both computation of its value and of its side effects.
The C standard specifies the behavior of postfix ++ in 6.5.2.4 2:
The result of the postfix ++ operator is the value of the operand. As a side effect, the value of the operand object is incremented (that is, the value 1 of the appropriate type is added to it)…
Similarly, if we wrote a = b++;, the assignment would not just assign the value and be done. The side effect occurs too.
The side effects are part of what an expression does, per 6.5 1:
An expression is a sequence of operators and operands that specifies computation of a value, or that designates an object or a function, or that generates side effects, or that performs a combination thereof…
This:
*X = (*X)++ + ++(*X)
is undefined behavior, so you cannot really predict the result: Why are these constructs using pre and post-increment undefined behavior?
Also, if you want to know what undefined behavior is: Undefined, unspecified and implementation-defined behavior
This:
return (*X)++;
is equivalent to this (provided that *X is an int):
int tmp = (*X)++;
return tmp;
return does not do anything magical here. I refer to @Eric's answer for a more in-depth explanation of the postfix operator.
When we return (*X)++, should not the function execution stop when we return (*X), so the ++ is not executed?
No

What is this C behavior called

I'm learning C as Javascript developer and a common mistake that I make is when I'm supposed to define multiple variables in C like this
int a, b, c;
a = b = c = 0;
I accidentally do it the Javascript way
a,b,c = 0;
I'm wondering what the above is called and when I should define variables like this.
I'm wondering what the above is called....
Two things about this statement
a,b,c = 0;
First, this is same as
a;
b;
c = 0;
The results of the expressions a; and b; are unused.
The compiler should warn about this statement. When I compiled with clang, I got the following warnings:
p.c:6:2: warning: expression result unused [-Wunused-value]
a, b, c = 9;
^
p.c:6:5: warning: expression result unused [-Wunused-value]
a, b, c = 9;
^
Second, the , in the statement is the , (comma) operator.
Precisely stated, the meaning of the comma operator in the general expression
e1 , e2
is: evaluate the subexpression e1 and discard the result, then evaluate e2; the value of the expression is the value of e2.
So the value of the expression a,b,c = 0 is the value of c = 0. The variables a and b remain uninitialised.
Maybe you can try this and check the variable values after this statement:
a = 99, b = 5, c = 0;
Since you are learning C, let me tell you one more thing: the , (comma) also acts as a separator in function calls and definitions, variable declarations, enum declarations, and similar constructs. To begin with, check this.
and when I should define variables like this.
The statement a,b,c = 0; is not a definition of the variables a, b and c. You defined the variables a, b and c here:
int a, b, c;
Note that in this statement, the , acts as a separator.
Whether you use the comma operator is entirely up to you, as long as you understand it well. One very common use of the , (comma) operator is in a for loop, where it can initialise multiple variables and/or increment or decrement the loop counter together with other variables, for example:
for (i = 0, j = some_num; i < some_num; ++i, --j) ....
Divide it up by the operators:
int a, b;
a = b = 10;
^ ^^^^^^
| Right operand
Left operand
Here, first it calculates b = 10 to assign to a. b = 10 "returns" the new value of b which is 10, and that is assigned to a. It works similarly with more variables:
int a, b, c;
a = b = c = 10;
^^^^^^-Done first
^^^^^-Done second
^^^^^-Done last
a, b, c = 10 does not work the same way. First of all, you have to declare them first before assigning. Also, this uses the , operator which has a lower precedence than =, so the line is equivalent to a; b; c = 10;. As you can see, nothing really happens to a or b and only c is set to 10.
The comma used as an operator is called the comma operator.
For details, read a C standard, like n1570 (§6.5.17) or later. The semantics is defined as:
The left operand of a comma operator is evaluated as a void expression; there is a sequence point between its evaluation and that of the right operand. Then the right operand is evaluated; the result has its type and value
Of course, inside function calls like printf("x=%d y=%d\n", x, y) the comma is separating arguments. Inside macro invocations and definitions also. In that case, it is the comma punctuator (see n1570 §6.5.2, 6.7, 6.7.2.1, 6.7.2.2,6.7.2.3, 6.7.9)
If you compile with GCC, invoke it as gcc -Wall -Wextra -g. You could get useful warnings.
Consider, if so allowed, to use the Clang static analyzer (or other tools like Frama-C or Bismon; you may contact me in 2021 by email to basile.starynkevitch#cea.fr)
Take inspiration from the source code of existing free software programs like GNU make or GNU bash. They rarely use the comma operator.
In the C language you can define several variables on one line:
int a,b,c;
This is the same as
int a;
int b;
int c;
You can define them and then assign the same value to all of them like this:
int a,b,c;
a=b=c=10;
If you write
int a,b,c;
a,b,c=10;
it will set c to 10, but a and b will not be set to 10.
a and b are defined but not initialized, so reading them yields an indeterminate (garbage) value.

Advantage of using compound assignment

What is the real advantage of using compound assignment in C/C++ (or may be applicable to many other programming languages as well)?
#include <stdio.h>
int main()
{
int exp1=20;
int b=10;
// exp1=exp1+b;
exp1+=b;
return 0;
};
I looked at few links like microsoft site, SO post1, SO Post2 .
But the stated advantage is that exp1 is evaluated only once in the case of a compound assignment. How is exp1 really evaluated twice in the first case? I understand that the current value of exp1 is read first, then the new value is added, and the updated value is written back to the same location. How does this really happen at a lower level in the case of a compound assignment? I tried to compare the assembly code of the two cases, but I did not see any difference between them.
For simple expressions involving ordinary variables, the difference between
a = a + b;
and
a += b;
is syntactical only. The two expressions will behave exactly the same, and might well generate identical assembly code. (You're right; in this case it doesn't even make much sense to ask whether a is evaluated once or twice.)
Where it gets interesting is when the left-hand side of the assignment is an expression involving side effects. So if you have something like
*p++ = *p++ + 1;
versus
*p++ += 1;
it makes much more of a difference! The former tries to increment p twice (and is therefore undefined). But the latter evaluates p++ precisely once, and is well-defined.
As others have mentioned, there are also advantages of notational convenience and readability. If you have
variable1->field2[variable1->field3] = variable1->field2[variable2->field3] + 2;
it can be hard to spot the bug. But if you use
variable1->field2[variable1->field3] += 2;
it's impossible to even have that bug, and a later reader doesn't have to scrutinize the terms to rule out the possibility.
A minor advantage is that it can save you a pair of parentheses (or from a bug if you leave those parentheses out). Consider:
x *= i + 1; /* straightforward */
x = x * (i + 1); /* longwinded */
x = x * i + 1; /* buggy */
Finally (thanks to Jens Gustedt for reminding me of this), we have to go back and think a little more carefully about what we meant when we said "Where it gets interesting is when the left-hand side of the assignment is an expression involving side effects." Normally, we think of modifications as being side effects, and accesses as being "free". But for variables qualified as volatile (or, in C11, as _Atomic), an access counts as an interesting side effect, too. So if variable a has one of those qualifiers, a = a + b is not a "simple expression involving ordinary variables", and it may not be so identical to a += b, after all.
Evaluating the left side once can save you a lot if it's more than a simple variable name. For example:
int x[5] = { 1, 2, 3, 4, 5 };
x[some_long_running_function()] += 5;
In this case some_long_running_function() is only called once. This differs from:
x[some_long_running_function()] = x[some_long_running_function()] + 5;
Which calls the function twice.
This is what the standard 6.5.16.2 says:
A compound assignment of the form E1 op= E2 is equivalent to the simple assignment expression E1 = E1 op (E2), except that the lvalue E1 is evaluated only once
So the "evaluated once" is the difference. This mostly matters in embedded systems where you have volatile qualifiers and don't want to read a hardware register several times, as that could cause unwanted side-effects.
That's not really possible to reproduce here on SO, so instead here's an artificial example to demonstrate why multiple evaluations could lead to different program behavior:
#include <stdio.h>
typedef enum { SIMPLE, COMPOUND } assignment_t;
int index;
int get_index (void)
{
return index++;
}
void assignment (int arr[3], assignment_t type)
{
if(type == COMPOUND)
{
arr[get_index()] += 1;
}
else
{
arr[get_index()] = arr[get_index()] + 1;
}
}
int main (void)
{
int arr[3];
for(int i=0; i<3; i++) // init to 0 1 2
{
arr[i] = i;
}
index = 0;
assignment(arr, COMPOUND);
printf("%d %d %d\n", arr[0], arr[1], arr[2]); // 1 1 2
for(int i=0; i<3; i++) // init to 0 1 2
{
arr[i] = i;
}
index = 0;
assignment(arr, SIMPLE);
printf("%d %d %d\n", arr[0], arr[1], arr[2]); // 2 1 2 or 0 1 2
}
The simple assignment version did not only give a different result, it also introduced unspecified behavior in the code, so that two different results are possible depending on the compiler.
Not sure what you're after. Compound assignment is shorter, and therefore simpler (less complex) than using regular operations.
Consider this:
player->geometry.origin.position.x += dt * player->speed;
versus:
player->geometry.origin.position.x = player->geometry.origin.position.x + dt * player->speed;
Which one is easier to read and understand, and verify?
This, to me, is a very very real advantage, and is just as true regardless of semantic details like how many times something is evaluated.
Advantage of using compound assignment
There is a disadvantage too.
Consider the effect of types.
long long exp1 = 20;
int b=INT_MAX;
// All additions use `long long` math
exp1 = exp1 + 10 + b;
10 + b addition below will use int math and overflow (undefined behavior)
exp1 += 10 + b; // UB
// That is like the below,
exp1 = (10 + b) + exp1;
A language like C is always going to be an abstraction of the underlying machine opcodes. In the case of addition, the compiler would first move the left operand into the accumulator, and add the right operand to it. Something like this (pseudo-assembler code):
move 1,a
add 2,a
This is what 1+2 would compile to in assembler. Obviously, this is perhaps over-simplified, but you get the idea.
Also, compilers tend to optimise your code, so exp1=exp1+b would very likely compile to the same opcodes as exp1+=b.
And, as @unwind remarked, the compound assignment is a lot more readable.

Why does a=(b++) have the same behavior as a=b++?

I am writing a small test app in C with GCC 4.8.4 pre-installed on my Ubuntu 14.04, and I got confused by the fact that the expression a=(b++); behaves in the same way as a=b++; does. The following simple code is used:
#include <stdint.h>
#include <stdio.h>
int main(int argc, char* argv[]){
uint8_t a1, a2, b1=10, b2=10;
a1=(b1++);
a2=b2++;
printf("a1=%u, a2=%u, b1=%u, b2=%u.\n", a1, a2, b1, b2);
}
The result after gcc compilation is a1=a2=10, while b1=b2=11. However, I expected the parentheses to have b1 incremented before its value is assigned to a1.
Namely, a1 should be 11 while a2 equals 10.
Does anyone get an idea about this issue?
However, I expected the parentheses to have b1 incremented before its value is assigned to a1
You should not have expected that: placing parentheses around an increment expression does not alter the application of its side effects.
Side effects (in this case, it means writing 11 into b1) get applied some time after retrieving the current value of b1. This could happen before or after the full assignment expression is evaluated completely. That is why a post-increment will remain a post-increment, with or without parentheses around it. If you wanted a pre-increment, place ++ before the variable:
a1 = ++b1;
Quoting from the C99:6.5.2.4:
The result of the postfix ++ operator is the value of the operand.
After the result is obtained, the value of the operand is incremented.
(That is, the value 1 of the appropriate type is added to it.) See the
discussions of additive operators and compound assignment for
information on constraints, types, and conversions and the effects of
operations on pointers. The side effect of updating the stored value
of the operand shall occur between the previous and the next sequence
point.
You can look up the C99: annex C to understand what the valid sequence points are.
In your question, adding parentheses doesn't change the sequence points; the only one here is at the end of the full expression, marked by the ; character.
Or in other words, you can view it like there's a temporary copy of b and the side-effect is original b incremented. But, until a sequence point is reached, all evaluation is done on the temporary copy of b. The temporary copy of b is then discarded, the side effect i.e. increment operation is committed to the storage,when a sequence point is reached.
Parentheses can be tricky to think about. But they do not mean, "make sure that everything inside happens first".
Suppose we have
a = b + c * d;
The higher precedence of multiplication over addition tells us that the compiler will arrange to multiply c by d, and then add the result to b. If we want the other interpretation, we can use parentheses:
a = (b + c) * d;
But suppose that we have some function calls thrown into the mix. That is, suppose we write
a = x() + y() * z();
Now, while it's clear that the return value of y() will be multiplied by the return value of z(), can we say anything about the order that x(), y(), and z() will be called in? The answer is, no, we absolutely cannot! If you're at all unsure, I invite you to try it, using x, y, and z functions like this:
int x() { printf("this is x()\n"); return 2; }
int y() { printf("this is y()\n"); return 3; }
int z() { printf("this is z()\n"); return 4; }
The first time I tried this, using the compiler in front of me, I discovered that function x() was called first, even though its result is needed last. When I changed the calling code to
a = (x() + y()) * z();
the order of the calls to x, y, and z stayed exactly the same, the compiler just arranged to combine their results differently.
Finally, it's important to realize that expressions like i++ do two things: they take i's value and add 1 to it, and then they store the new value back into i. But the store back into i doesn't necessarily happen right away, it can happen later. And the question of "when exactly does the store back into i happen?" is sort of like the question of "when does function x get called?". You can't really tell, it's up to the compiler, it usually doesn't matter, it will differ from compiler to compiler, if you really care, you're going to have to do something else to force the order.
And in any case, remember that the definition of i++ is that it gives the old value of i out to the surrounding expression. That's a pretty absolute rule, and it can not be changed just by adding some parentheses! That's not what parentheses do.
Let's go back to the previous example involving functions x, y, and z. I noticed that function x was called first. Suppose I didn't want that, suppose I wanted functions y and z to be called first. Could I achieve that by writing
a = x() + (y() * z())?
I could write that, but it doesn't change anything. Remember, the parentheses don't mean "do everything inside first". They do cause the multiplication to happen before the addition, but the compiler was already going to do it that way anyway, based on the higher precedence of multiplication over addition.
Up above I said, "if you really care, you're going to have to do something else to force the order". What you generally have to do is use some temporary variables and some extra statements. (The technical term is "insert some sequence points.") For example, to cause y and z to get called first, I could write
c = y();
d = z();
b = x();
a = b + c * d;
In your case, if you wanted to make sure that the new value of b got assigned to a, you could write
c = b++;
a = b;
But of course that's silly -- if all you want to do is increment b and have its new value assigned to a, that's what prefix ++ is for:
a = ++b;
Your expectations are completely unfounded.
Parentheses have no direct effect on the order of execution. They don't introduce sequence points into the expression and thus they don't force any side-effects to materialize earlier than they would've materialized without parentheses.
Moreover, by definition, post-increment expression b++ evaluates to the original value of b. This requirement will remain in place regardless of how many pair of parentheses you add around b++. Even if parentheses somehow "forced" an instant increment, the language would still require (((b++))) to evaluate to the old value of b, meaning that a would still be guaranteed to receive the non-incremented value of b.
Parentheses only affect the syntactic grouping between operators and their operands. For example, in your original expression a = b++ one might immediately ask whether the ++ applies to b alone or to the result of a = b. In your case, by adding the parentheses you simply explicitly forced the ++ operator to apply to (to group with) the b operand. However, according to the language syntax (and the operator precedence and associativity derived from it), ++ already applies to b, i.e. unary ++ has higher precedence than binary =. Your parentheses did not change anything; they only reiterated the grouping that was already there implicitly. Hence no change in the behavior.
Parentheses are entirely syntactic. They just group expressions and they are useful if you want to override the precedence or associativity of operators. For example, if you use parentheses here:
a = 2*(b+1);
you mean that the result of b+1 should be doubled, whereas if you omit the parentheses:
a = 2*b+1;
you mean that just b should be doubled and then the result should be incremented. The two syntax trees for these assignments are:
    =                =
   / \              / \
  a   *            a   +
     / \              / \
    2   +            *   1
       / \          / \
      b   1        2   b
a = 2*(b+1);     a = 2*b+1;
By using parentheses, you can therefore change the syntax tree that corresponds to your program and (of course) different syntax may correspond to different semantics.
On the other hand, in your program:
a1 = (b1++);
a2 = b2++;
parentheses are redundant because the assignment operator has lower precedence than the postfix increment (++). The two assignments are equivalent; in both cases, the corresponding syntax tree is the following:
    =
   / \
  a   ++ (postfix)
      |
      b
Now that we're done with the syntax, let's go to semantics. This statement means: evaluate b++ and assign the result to a. Evaluating b++ returns the current value of b (which is 10 in your program) and, as a side effect, increments b (which now becomes 11). The returned value (that is, 10) is assigned to a. This is what you observe, and this is the correct behaviour.
However, I expected the parentheses to have b1 incremented before its value is assigned to a1.
You aren't assigning b1 to a1: you're assigning the result of the postincrement expression.
Consider the following program, which prints the value of b when executing assignment:
#include <iostream>
using namespace std;
int b;
struct verbose
{
int x;
void operator=(int y) {
cout << "b is " << b << " when operator= is executed" << endl;
x = y;
}
};
int main() {
// your code goes here
verbose a;
b = 10;
a = b++;
cout << "a is " << a.x << endl;
return 0;
}
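Note that this is C++, where the overloaded operator= is a function call: its argument b++, including the side effect, is fully evaluated before the operator body runs, so the behavior is well defined. Using ideone.com I get the output shown below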
b is 11 when operator= is executed
a is 10
OK, in a nutshell: b++ is a unary expression, and parentheses around it won't ever take influence on precedence of arithmetic operations, because the ++ increment operator has one of the highest (if not the highest) precedence in C. Whilst in a * (b + c), the (b + c) is a binary expression (not to be confused with binary numbering system!) because of a variable b and its addend c. So it can easily be remembered like this: parentheses put around binary, ternary, quaternary...+INF expressions will almost always have influence on precedence(*); parentheses around unary ones NEVER will - because these are "strong enough" to "withstand" grouping by parentheses.
(*)As usual, there are some exceptions to the rule, if only a handful: e. g. -> (to access members of pointers on structures) has a very strong binding despite being a binary operator. However, C beginners are very likely to take quite awhile until they can write a -> in their code, as they will need an advanced understanding of both pointers and structures.
The parentheses will not change the post-increment behaviour itself.
a1=(b1++); //b1=10
It equals to,
uint8_t mid_value = b1++; //10
a1 = (mid_value); //10
Placing ++ after the operand (known as post-increment) means that the expression yields the value the operand had before the increment.
Enclosing the expression in parentheses doesn't change that: the result is still the old value, and the increment itself is completed by the end of the statement.
From learn.geekinterview.com:
In the postfix form, the increment or decrement takes place after the value is used in expression evaluation.
In prefix increment or decrement operation the increment or decrement takes place before the value is used in expression evaluation.
That's why a = (b++) and a = b++ are the same in terms of behavior.
In your case, if you want to increment b first, you should use pre-increment, ++b instead of b++ or (b++).
Change
a1 = (b1++);
to
a1 = ++b1; // b1 will be incremented before it is assigned to a1.
To make it short:
b++ yields the old value of b; the increment itself is completed by the end of the statement.
Either way, the result of b++ (the old value) is what is put into a.
Because of that, parentheses do not change the value here.

Expression x[--i] = y[++i] = z[i++], which is evaluated first?

When the evaluation of l-value precedes the evaluation of r-value and the assignment also returns a value, which of the following is evaluated first?
int i = 2;
int x[] = {1, 2, 3};
int y[] = {4, 5, 6};
int z[] = {7, 8, 9};
x[--i] = y[++i] = z[i++]; // Out of bound exception or not?
NOTE: generic C-like language with l-value evaluation coming first. From my textbook:
In some languages, for example C,
assignment is considered to be an
operator whose evaluation, in addition
to producing a side effect, also
returns the r-value thus computed.
Thus, if we write in C:
x = 2;
the evaluation of such a command, in
addition to assigning the value 2 to x,
returns the value 2. Therefore, in C,
we can also write:
y = x = 2;
which should be interpreted as:
(y = (x = 2));
I'm quite certain that the behaviour in this case is undefined, because you are modifying and reading the value of the variable i multiple times between consecutive sequence points.
Also, in C, arrays are declared by placing the [] after the variable name, not after the type:
int x[] = {1, 2, 3};
Edit:
Remove the arrays from your example, because they are [for the most part] irrelevant. Consider now the following code:
int main(void)
{
int i = 2;
int x = --i + ++i + i++;
return x;
}
This code demonstrates the operations that are performed on the variable i in your original code but without the arrays. You can see more clearly that the variable i is being modified more than once in this statement. When you rely on the state of a variable that is modified between consecutive sequence points, the behaviour is undefined. Different compilers will (and do, GCC returns 6, Clang returns 5) give different results, and the same compiler can give different results with different optimization options, or for no apparent reason at all.
If this statement has no defined behaviour because i is modified several times between consecutive sequence points, then the same can be said for your original code. The assignment operator does not introduce a new sequence point.
General
In C, the order of operations between two sequence points should not be depended on. I do not remember the exact wording from the standard, but it is for this reason that
i = i++;
is undefined behaviour. The standard defines a list of things that makes up sequence points, from memory this is
the semicolon after a statement
the comma operator
evaluation of all function arguments before the call to the function
the && and || operators
Looking up the page on Wikipedia, the list there is more complete and described in more detail. Sequence points are an extremely important concept in C, and if you do not already know what they mean, do learn it immediately.
Specific
No matter how well defined the order of evaluation and assignment of the x, y and z variables are, for
x[--i] = y[++i] = z[i++];
this statement cannot be anything but undefined behaviour because of the --i, ++i and i++.
On the other hand
x[i] = y[i] = z[i];
is well defined, but I am not sure what the status for the order of evaluation for this. If this is important however I would rather prefer this to be split into two statements along with a comment "It is important that ... is assigned/initialized before ... because ...".
I think it's the same as
x[3] = y[4] = z[2];
i = 3;

Resources