Why are statements with no effect considered legal in C?


Pardon if this question is naive. Consider the following program:
#include <stdio.h>
int main() {
int i = 1;
i = i + 2;
5;
i;
printf("i: %d\n", i);
}
In the above example, the statements 5; and i; seem totally superfluous, yet the code compiles without warnings or errors by default (although gcc does emit "warning: statement with no effect [-Wunused-value]" when run with -Wall). They have no effect on the rest of the program, so why are they considered valid statements in the first place? Does the compiler simply ignore them? Are there any benefits to allowing such statements?

One benefit to allowing such statements is from code that's created by macros or other programs, rather than being written by humans.
As an example, imagine a function int do_stuff(void) that is supposed to return 0 on success or -1 on failure. It could be that support for "stuff" is optional, and so you could have a header file that does
#if STUFF_SUPPORTED
#define do_stuff() really_do_stuff()
#else
#define do_stuff() (-1)
#endif
Now imagine some code that wants to do stuff if possible, but may or may not really care whether it succeeds or fails:
void func1(void) {
if (do_stuff() == -1) {
printf("stuff did not work\n");
}
}
void func2(void) {
do_stuff(); // don't care if it works or not
more_stuff();
}
When STUFF_SUPPORTED is 0, the preprocessor will expand the call in func2 to a statement that just reads
(-1);
and so the compiler pass will see just the sort of "superfluous" statement that seems to bother you. Yet what else can one do? If you #define do_stuff() // nothing, then the code in func1 will break. (And you'll still have an empty statement in func2 that just reads ;, which is perhaps even more superfluous.) On the other hand, if you have to actually define a do_stuff() function that returns -1, you may incur the cost of a function call for no good reason.
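For anyone who wants to try this, here is a minimal compilable sketch of the situation described above. It assumes the feature is compiled out (STUFF_SUPPORTED is set to 0 here purely for illustration), and the return values added to func1 and func2 are not part of the original snippet; they only make the behaviour observable.

```c
#include <stdio.h>

/* Assumption for this sketch: the optional feature is disabled. */
#define STUFF_SUPPORTED 0

#if STUFF_SUPPORTED
int really_do_stuff(void);
#define do_stuff() really_do_stuff()
#else
#define do_stuff() (-1)
#endif

/* Uses the value of the macro: returns 1 if "stuff" failed, 0 otherwise. */
int func1(void) {
    if (do_stuff() == -1) {
        printf("stuff did not work\n");
        return 1;
    }
    return 0;
}

/* Ignores the value: after preprocessing the first line is just "(-1);". */
int func2(void) {
    do_stuff(); /* legal expression statement with no effect */
    return 0;
}
```

With -Wall, gcc will most likely flag the line in func2 as a statement with no effect, but it remains valid C, and the same macro still works inside the condition in func1.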

Simple statements in C are terminated by a semicolon.
The most common simple statement is the expression statement: an expression followed by a semicolon. An expression is a combination of variables, constants and operators. Every non-void expression results in a value of a certain type, which can be assigned to a variable or simply discarded.
Having said that, an optimizing compiler will typically discard statements such as 5; and i; because they have no effect.

Statements with no effect are permitted because it would be more difficult to ban them than to permit them. This was more relevant when C was first designed and compilers were smaller and simpler.
An expression statement consists of an expression followed by a semicolon. Its behavior is to evaluate the expression and discard the result (if any). Normally the purpose is that the evaluation of the expression has side effects, but it's not always easy or even possible to determine whether a given expression has side effects.
For example, a function call is an expression, so a function call followed by a semicolon is a statement. Does this statement have any side effects?
some_function();
It's impossible to tell without seeing the implementation of some_function.
How about this?
obj;
Probably not -- but if obj is defined as volatile, then it does.
Permitting any expression to be made into an expression-statement by adding a semicolon makes the language definition simpler. Requiring the expression to have side effects would add complexity to the language definition and to the compiler. C is built on a consistent set of rules (function calls are expressions, assignments are expressions, an expression followed by a semicolon is a statement) and lets programmers do what they want without preventing them from doing things that may or may not make sense.
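As a rough illustration of the two cases above, the sketch below uses a hypothetical call_count variable so that the side effect of some_function becomes visible to the caller. The names call_count and run_expression_statements are made up for this example.

```c
/* Hypothetical global state, used only to make the side effect observable. */
static int call_count = 0;

/* The caller cannot tell from the call site that this has a side effect. */
int some_function(void) {
    call_count++;
    return 42;
}

int run_expression_statements(void) {
    volatile int obj = 7;

    some_function(); /* value 42 is discarded, the increment is kept */
    obj;             /* looks useless, but a volatile read must be performed */

    return call_count;
}
```

Both statements in the middle of run_expression_statements have the same form, "expression followed by a semicolon", and neither could be rejected by a simple rule that only looks at the syntax.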

The statements you listed with no effect are examples of an expression statement, whose syntax is given in section 6.8.3p1 of the C standard as follows:
expression-statement:
expression(opt) ;
All of section 6.5 is dedicated to the definition of an expression, but loosely speaking an expression consists of constants and identifiers linked with operators. Notably, an expression may or may not contain an assignment operator and it may or may not contain a function call.
So any expression followed by a semicolon qualifies as an expression statement. In fact, each of these lines from your code is an example of an expression statement:
i = i + 2;
5;
i;
printf("i: %d\n", i);
Some operators have side effects, such as the assignment operators and the pre/post increment/decrement operators, and the function call operator () may have a side effect depending on what the function in question does. There is no requirement, however, that any operator in the expression have a side effect.
Here's another example:
atoi("1");
This calls a function and discards the result, just like the call to printf in your example, but unlike printf, the function call itself has no side effect.
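To make the "discarded result" concrete: printf returns the number of characters it wrote, and that value is simply thrown away in a statement like printf("i: %d\n", i);. A small sketch that keeps the value instead (the function name chars_printed is made up for this example):

```c
#include <stdio.h>

/* Returns the value that a plain "printf(...);" statement would discard. */
int chars_printed(int i) {
    int n = printf("i: %d\n", i); /* side effect: output; value: char count */
    return n;
}
```

For a single-digit i the returned count is 5: the characters 'i', ':', ' ', the digit, and the newline.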

Sometimes such statements are very handy:
int foo(int x, int y, int z)
{
(void)y; //prevents warning
(void)z;
return x*x;
}
Or when the reference manual tells us to just read some registers to achieve something, for example to clear or set some flag (a very common situation in the uC world):
#include <stdint.h>
#define SREG ((volatile uint32_t *)0x4000000)
#define DREG ((volatile uint32_t *)0x4004000)
void readSREG(void)
{
*SREG; //we read it here
*DREG; // and here
}
https://godbolt.org/z/6wjh_5
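A version of both idioms that can be compiled on a desktop machine might look like the sketch below. The real hardware addresses are replaced by an ordinary volatile variable (fake_status_reg is an assumption of this example, not a real register), since dereferencing 0x4000000 on a PC would most likely crash.

```c
#include <stdint.h>

/* Unused parameters are "used" in a void expression to silence warnings. */
int foo(int x, int y, int z) {
    (void)y;
    (void)z;
    return x * x;
}

/* Stand-in for a memory-mapped register; on a real uC this would be
   a fixed address taken from the reference manual. */
static volatile uint32_t fake_status_reg = 0x1u;
#define SREG (&fake_status_reg)

void read_sreg(void) {
    /* The read itself is the point. The (void) cast only documents that
       the value is discarded on purpose and quiets some compilers. */
    (void)*SREG;
}
```

Because the object is volatile, the compiler has to emit the load in read_sreg even though nothing is done with the value.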

Related

Defining a function as macro

I am trying to understand defining functions as macros and I have the following code, which I am not sure I understand:
#define MAX(i, limit) do \
{ \
if (i < limit) \
{ \
i++; \
} \
} while(1)
void main(void)
{
MAX(0, 3);
}
As I understand it tries to define MAX as an interval between 2 numbers? But what's the point of the infinite loop?
I have tried to store the value of MAX in a variable inside the main function, but it gives me an error saying expected an expression

I am currently in a software developing internship, and trying to learn embedded C since it's a new field for me. This was an exercise asking me what the following code will do. I was confused since I had never seen a function written like this
You are confused because this is a trick question. The posted code makes no sense whatsoever. The MAX macro does indeed expand to an infinite loop, and since its first argument is a literal value, i++ expands to 0++, which does not compile because a literal is not a modifiable lvalue.
The lesson to be learned is: macros are confusing, error prone and should not be used to replace functions.

You have to understand that before your code gets to the compiler, it first goes through the preprocessor, which rewrites the text of your source code. How it changes the code is controlled with preprocessor directives (lines that begin with #, e.g. #include, #define, ...).
In your case, you use a #define directive, so everywhere the preprocessor finds MAX(i, limit), it replaces it with the macro's definition.
The output of the preprocessor is also a text file, just slightly modified. In your case, the preprocessor will replace MAX(0, 3) with
do
{
if (0 < 3)
{
0++;
}
} while(1)
That preprocessor output is what the compiler then sees.
So writing a function in a #define is not the same as writing a normal function void max(int i, int limit) { ... }.

Suppose you had a large number of statements of the form
if(a < 10) a++;
if(b < 100) b++;
if(c < 1000) c++;
In a comment, @the busybee refers to this pattern as a "saturating incrementer".
When you see a repeated pattern in code, there's a natural inclination to want to encapsulate the pattern somehow. Sometimes this is a good idea, or sometimes it's fine to just leave the repetition, if the attempt to encapsulate it ends up making things worse.
One way to encapsulate this particular pattern — I'm not going to say whether I think it's a good way or not — would be to define a function-like macro:
#define INCR_MAX(var, max) if(var < max) var++
Then you could say
INCR_MAX(a, 10);
INCR_MAX(b, 100);
INCR_MAX(c, 1000);
One reason to want to make this a function-like macro (as opposed to a true function) is that a macro can "modify its argument" — in this case, whatever variable name you hand to it as var — in a way that a true function couldn't. (That is, if your saturating incrementer were a true function, you would have to call it either as incr_max(&a, 10) or a = incr_max(a, 10), depending on how you chose to set it up.)
However, there's an issue with function-like macros and the semicolon at the end. I'm not going to explain that whole issue here; there's a big long previous SO question about it.
Applying the lesson of that other question, an "improved" INCR_MAX macro would be
#define INCR_MAX(var, max) do { if(var < max) var++; } while(0)
Finally, it appears that somewhere between your exercise and this SO question, the while(0) at the end somehow got changed to while(1). This just about has to have been an unintentional error, since while(1) makes no sense in this context whatsoever.
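Here is a small sketch of why the do { ... } while(0) wrapper matters: it lets the macro be used as a single statement, followed by a semicolon, inside an unbraced if/else. The extra parentheses around the macro arguments and the function saturate_twice are additions for this example only.

```c
/* Saturating increment as a macro; do/while(0) makes it one statement. */
#define INCR_MAX(var, max) do { if ((var) < (max)) (var)++; } while (0)

int saturate_twice(int start, int limit, int enabled) {
    int a = start;

    if (enabled)
        INCR_MAX(a, limit); /* the ';' here does not orphan the else */
    else
        a = -1;

    INCR_MAX(a, limit);
    return a;
}
```

With the plain if-based definition, the semicolon after INCR_MAX(a, limit) in the if branch would end the statement early and the following else would fail to compile.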

Yeah, there's a reason you don't understand it - it's garbage.
After preprocessing, the code is
void main(void)
{
do
{
if ( 0 < 3 )
{
0++;
}
} while(1);
}
Yeah, no clue what this thing is supposed to do. The name MAX implies that it should evaluate to the larger of its two arguments, a la
#define MAX(a,b) ((a) < (b) ? (b) : (a))
but that's obviously not what it's doing. It's not defining an interval between two numbers, it's attempting to set the value of the first argument to the second, but in a way that doesn't make a lick of sense.
There are three problems (technically, four):
the compiler will yak on 0++ - a constant cannot be the operand of the ++ or -- operators;
if either i or limit is an expression, such as MAX(i+1, i+5), you're going to have the same problem with the ++ operator, and you're going to have precedence issues;
assuming you fix those problems, you still have an infinite loop.
The (technical) fourth problem is ... using a macro as a function. I know, this is embedded world, and embedded world wants to minimize function call overhead. That's what the inline function specifier is supposed to buy you so you don't have to go through this heartburn.
But, okay, maybe the compiler available for the system you're working on doesn't support inline so you have to go through this exercise.
But you're going to have to go to the person who gave you this code and politely and respectfully ask, "what is this crap?"
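If the intent really was a saturating increment, an inline function is one way to get it without the macro pitfalls, assuming the compiler supports C99 inline. This is only a guess at what the exercise meant; incr_max and count_up are names invented for this sketch.

```c
/* Increment *var, but never past max. Arguments are evaluated exactly once. */
static inline void incr_max(int *var, int max) {
    if (*var < max) {
        (*var)++;
    }
}

/* Example caller: apply the saturating increment a number of times. */
int count_up(int start, int max, int steps) {
    int value = start;
    for (int k = 0; k < steps; k++) {
        incr_max(&value, max);
    }
    return value;
}
```

Passing a literal such as incr_max(0, 3) is now rejected with a clear type error instead of expanding to 0++, and expressions like i+1 as the limit behave as expected.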

Where can I find weird, specific C syntax rules?

I will take an exam and my teacher asks weird C syntax rules. Like:
int q=5;
for(q=-2;q=-5;q+=3) { //assignment in condition part??
printf("%d",q); //prints -5
break;
}
Or
int d[][3][2]={4,5,6,7,8,9,10,11,12,13,14,15,16};
int i=-1;
int j;
j=d[i++][++i][++i];
printf("%d",j); //prints 4?? why j=d[0][0][0] ?
Or
extern int a;
int main() {
do {
do {
printf("%o",a); //prints 12
} while(!1);
} while(0);
return 0;
}
int a=10;
I could not find these rules on any site or in any book. They are really absurd and uncommon. Where can I find them?

To me it seems that your teacher is asking questions which involve undefined behavior.
If you tell him that this is incorrect, you're directly confronting him.
However, you could do the following:
Compile the code on different platforms
Compile the code with different compilers
Compile the code with different versions of the same compiler
Build a matrix with the results. You'll find out that they differ
Show the results to your teacher and ask him to explain why that happens
That way you do not say that he's wrong, you're just showing some facts and you're showing that you're willing to learn and work.
Do that long before the exam so that the teacher can look into it, think about his questions, and change the exam in time.
I could not find these rules on any site or in any book. Where can I find them?
See Where do I find the current C or C++ standard documents?. If you have a good library at university, they should own a copy.

Concerning for(q=-2;q=-5;q+=3) {, all you need to do is to break this down into its components. q=-2 is run first, then q=-5 is tested, and since that is not 0 (an assignment expression has the value that was assigned, here -5), the loop body runs once. Then break forces a premature exit from an otherwise infinite loop. The expression q+=3 is therefore never reached.
The behaviour of d[i++][++i][++i] is undefined. Tell your teacher that, tactfully.
The "%o" format denotes octal output. a is set to 10 in decimal which is 12 in octal. Your code would be clearer if you had written:
int a=012; // octal constant.
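Both points can be checked with a short sketch. The function names loop_demo and to_octal are made up here, and the condition is written as (q = -5) != 0 only to make the assignment explicit; it behaves the same as the teacher's q=-5.

```c
#include <stdio.h>

/* Runs the teacher's loop; returns the final q and counts body executions. */
int loop_demo(int *iterations) {
    int q = 5;
    *iterations = 0;
    for (q = -2; (q = -5) != 0; q += 3) { /* condition assigns, then tests */
        (*iterations)++;
        break; /* q += 3 is never reached */
    }
    return q;
}

/* Formats value in octal into buf; returns the number of digits written. */
int to_octal(unsigned value, char *buf, size_t size) {
    return snprintf(buf, size, "%o", value);
}
```

Calling loop_demo leaves q at -5 after exactly one pass through the body, and to_octal(10, ...) produces the two characters "12".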

The online version of the C language standard has what you need (and is what I will be referring to in this answer); just bear in mind it is a language definition and not a tutorial, and as such may not be easy to read for someone who doesn't have a lot of experience yet.
Having said that, your teacher is throwing you a few foul balls. For example:
j=d[i++][++i][++i];
This statement results in undefined behavior for several reasons. The first several paragraphs of section 6.5 of the document linked above explain the problem, but in a nutshell:
Except in a few situations, C does not guarantee left-to-right evaluation of expressions; neither does it guarantee that side effects are applied immediately after evaluation;
Attempting to modify the value of an object more than once between sequence points [1], or modifying and then trying to use the value of an object without an intervening sequence point, results in undefined behavior.
Basically, don't write anything of the form:
x = x++;
x++ * x++;
a[i] = i++;
a[i++] = i;
C does not guarantee that each ++i and i++ is evaluated from left to right, and it does not guarantee that the side effect of each evaluation is applied immediately. So the result of d[i++][++i][++i] is not well-defined, and the result will not be consistent over different programs, or even different builds of the same program [2].
AND, on top of that, i++ evaluates to the current value of i; so clearly, your teacher's intent was for d[i++][++i][++i] to evaluate to d[-1][1][2], which would also result in undefined behavior since you're attempting to index outside of the array bounds.
This is why I hate, hate, hate it when teachers throw this kind of code at their students - not only is it needlessly confusing, not only does it encourage bad practice, but more often than not it's just plain wrong.
As for the other questions:
for(q=-2;q=-5;q+=3) { //assignment in condition part??
See sections 6.5.16 and 6.8.5.3. In short, an assignment expression has a value (the value of the left operand after any type conversions), and it can appear as part of a controlling expression in a for loop. As long as the result of the assignment is non-zero (as in the case above), the loop will execute.
printf("%o",a); //prints 12
See section 7.21.6.1. The o conversion specifier tells printf to format the integer value as octal: 10 (decimal) == 12 (octal).
[1] A sequence point is a point in a program's execution where an expression has been fully evaluated and any side effects have been applied. Sequence points occur at the ends of statements, between the evaluation of a function's parameters and the function call, after evaluating the left operand of the &&, ||, and ?: operators, and a few other places. See Annex C for the complete list.
[2] Or even different runs of the same build, although in practice you won't see values change from run to run unless you're doing something really hinky.
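For contrast, here is a sketch of how the same kind of indexing can be written so that it is well-defined: each modification of i gets its own statement, and all indices stay inside the array. The fully braced initializer is intended to be equivalent to the teacher's flat list of 13 values; the starting value of i and the function name pick_element are choices made for this example.

```c
int pick_element(void) {
    /* Same data as the flat initializer {4,5,...,16}, with braces added. */
    static const int d[][3][2] = {
        {{4, 5}, {6, 7}, {8, 9}},
        {{10, 11}, {12, 13}, {14, 15}},
        {{16, 0}, {0, 0}, {0, 0}}
    };

    int i = 0;
    int x = i; /* x == 0 */
    ++i;       /* the only modification of i, in its own statement */
    int y = i; /* y == 1 */
    int z = i; /* z == 1 */

    return d[x][y][z]; /* d[0][1][1] */
}
```

Every compiler has to return 7 for this version, because there is a sequence point at the end of each statement.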

Call by Need and Standard C Output?

The following code uses C language syntax:
#include <stdio.h>
int func(int a, int b){
if (b==0)
return 0;
else return func(a,b);
}
int main(){
printf("%d \n", func(func(1,1),func(0,0)));
return 0;
}
What is the output of this code 1) when run as standard C, and 2) in any language that has the call-by-need property?
In (1) the program loops in infinite recursion, and in (2) the output is zero. This example was solved by the TA in a programming languages course. Can anyone explain it to me? Thanks.

1) In C (which uses strict evaluation semantics) we get infinite recursion because in strict evaluation arguments are evaluated before a function is called. So in f(f(1,1), f(0,0)) f(1,1) and f(0,0) are evaluated before the outer f (which one of the two arguments is evaluated first is unspecified in C, but that does not matter). And since f(1,1) causes infinite recursion, we get infinite recursion.
2) In a language using non-strict evaluation (be it call-by-name or call-by-need) arguments are substituted into the function body unevaluated and are only evaluated when and if they're needed. So the outer call to f is evaluated first as such:
if (f(0, 0) == 0)
return 0;
else return f(f(1,1), f(0,0));
So when evaluating the if, we need to evaluate f(0,0), which simply evaluates to 0. So we go into the then-branch of the if and never execute the else-branch. Since all calls to f are only used in the else-branch, they're never needed and thus never evaluated. So there's no recursion, infinite or otherwise, and we just get 0.
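C itself has no lazy evaluation, but the idea can be imitated by passing function pointers ("thunks") instead of values, so that an argument is only computed when it is actually used. The sketch below is closer to call-by-name than call-by-need, since it does not cache results, and all names in it (thunk, func_lazy, arg_a, arg_b, lazy_demo) are invented for illustration.

```c
typedef int (*thunk)(void);

/* Counts how often the first argument gets evaluated. */
static int first_arg_evaluations = 0;

/* Lazy version of func: arguments are evaluated only when needed. */
int func_lazy(thunk a, thunk b) {
    if (b() == 0)
        return 0;
    return func_lazy(a, b);
}

/* Stands in for func(1,1), which would never return if it were evaluated. */
int arg_a(void) {
    first_arg_evaluations++;
    return 1;
}

/* Stands in for func(0,0), which evaluates to 0. */
int arg_b(void) {
    return 0;
}

int lazy_demo(void) {
    return func_lazy(arg_a, arg_b);
}

int count_first_arg_evaluations(void) {
    return first_arg_evaluations;
}
```

lazy_demo returns 0 and arg_a is never called, which mirrors the explanation above: the argument that would recurse forever is simply never needed.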

In C, the order in which the arguments a and b are evaluated is, in general, unspecified for a function like int func(int a, int b).
Obviously evaluating func(1,1) is problematic, and the code suffers from that regardless of whether func(1,1) is evaluated before, after, or simultaneously with func(0,0).
A need-based analysis of func(a,b) may conclude that if b==0, there is no need to call func(), and the call can be replaced with 0.
printf("%d \n", func(func(1,1),func(0,0)));
// functionally then becomes
printf("%d \n", func(func(1,1),0));
Applied again and
// functionally then becomes
printf("%d \n", 0);
Of course this conclusion is not certain as the analysis of b != 0 and else return func(a,b); leads to infinite recursion. Such code may have a useful desired side-effect (e.g. stack-overflow and system reset.) So the analysis may need to be conservative and not assume func(1,1) will ever return and not optimize out the call even if it optimized out the func(0,0) call.

To address the first part,
The C draft, 6.5.2.2p10 (Function calls), says
The order of evaluation of ... the actual arguments... is unspecified.
and for such reason, something such as
printf("%d%d",i++,++i);
has undefined behaviour, because
both ++i and i++ have side effects, i.e. incrementing the value of i by one.
The comma inside printf is just a separator and NOT a sequence point.
Even though there is a sequence point just before the function is actually called, for the above reason the order in which the two modifications of i take place is not defined.
In your case
func(func(1,1),func(0,0))
though, the arguments for outer func ie func(1,1) or func(0,0) have no bearing on each other contrary to the case shown above. Any order of evaluation of these arguments eventually leads to infinite recursion and so the program crashes due to depleted memory.
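When the order does matter, the usual fix is to move each side effect into its own statement before the call. A minimal sketch, using snprintf instead of printf so the result can be inspected (format_two is a made-up name):

```c
#include <stdio.h>

/* Well-defined replacement for printf("%d%d", i++, ++i) with i starting at 0. */
int format_two(char *buf, size_t size) {
    int i = 0;
    int first = i++;  /* first == 0, i == 1 */
    int second = ++i; /* second == 2, i == 2 */
    return snprintf(buf, size, "%d%d", first, second);
}
```

Here the two increments are separated by the end of a statement, so the buffer always contains "02".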

Interpretation of instructions without effect

How can we interpret the following program and its success? (It's obvious that there should not be any error message.) I mean, how does the compiler interpret lines 2 and 3 inside main?
#include <stdio.h>
int main()
{
int a,b;
a; //(2)
b; //(3)
return 0;
}

Your
a;
is just an expression statement. As always in C, the full expression in an expression statement is evaluated and its result is immediately discarded.
For example, this
a = 2 + 3;
is an expression statement containing full expression a = 2 + 3. That expression evaluates to 5 and also has a side-effect of writing 5 into a. The result is evaluated and discarded.
Expression statement
a;
is treated in the same way, except that it has no side effects. Since you forgot to initialize your variables, evaluation of the above expression can formally lead to undefined behavior.
Obviously, practical compilers will simply skip such expression statements entirely, since they have no observable behavior.

That's why you should use some compilation warning flags!
-Wall would trigger a "statement with no effect" warning.
If you want to see what the compilation produces, compile using -S.
Try it with your code, with/without -O (optimization) flag...

This is just like you try something like this:
#include <stdio.h>
int main(void){
1;
2;
return 0;
}
As we can see, we have here two expressions, each followed by a semicolon (1; and 2;). Each is a well-formed statement according to the rules of the language.
There is nothing wrong with it, it is just useless.
But if you try to use those variables (a or b), the behavior will be undefined.
Of course, the compiler will interpret each one as a statement with no effect.
Later edit:
If you run this:
#include <stdio.h>
int main(void){
int a;
int b;
printf("A = %d\n",a);
printf("B = %d\n",b);
if (a < b){
printf("TRUE");
}else{
printf("FALSE");
}
return 0;
}
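You may get:
A = 0
B = 0
FALSE
but this is not guaranteed: a and b are never initialized, so their values are indeterminate and reading them is undefined behavior. They merely happened to be 0 on this run.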
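For comparison, a sketch of the same comparison with the variables explicitly initialized, which is the only way to make the outcome reliable. The function name compare_initialized is made up, and the result is returned as a string instead of printed.

```c
/* Same comparison as above, but with defined values for a and b. */
const char *compare_initialized(void) {
    int a = 0;
    int b = 0;
    return (a < b) ? "TRUE" : "FALSE";
}
```

With both variables set to 0, a < b is false on every conforming compiler, so this always yields "FALSE".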

Statements in C which are not control structures (if, switch, for, while, do while) or jump statements (break, continue, goto, return) are mostly expression statements.
Every expression has a resulting value.
An expression is evaluated for its side effects (change the value of an object, write a file, read volatile objects, and functions doing some of these things).
The final result of such an expression is always discarded.
For example, the function printf() returns an int value, that in general is not used. However this value is produced, and then discarded.
However the function printf() produces side effects, so it has to be processed.
If a statement has no side effects, then the compiler is free to discard it entirely.
I think it is not too hard for a compiler to check whether a statement has no side effects. So what you can expect in this case is that the compiler will choose to do nothing.
Moreover, this will not affect the observable behaviour of the program, so there is no difference in what is obtained in the resulting execution of the program. However, of course, the program will run faster if any computation is ignored at all by the compiler.
Also, note that in some cases the floating point environment can set flags, which are considered side-effects.
The C standard (C11) says, as part of paragraph 5.1.2.3p4:
An actual implementation need not evaluate part of an expression if it
can deduce that its value is not used and that no needed side effects
are produced [...]
CONCLUSION: One has to read the documentation of the particular compiler that one is using.

Legal uses of setjmp and GCC

Using GCC (4.0 for me), is this legal:
if(__builtin_expect(setjmp(buf) != 0, 1))
{
// handle error
}
else
{
// do action
}
I found a discussion saying it caused a problem for GCC back in 2003, but I would imagine that they would have fixed it by now. The C standard says that it's illegal to use setjmp unless it's one of four conditions, the relevant one being this:
one operand of a relational or equality operator with the other operand an integer constant expression, with the resulting expression being the entire controlling expression of a selection or iteration statement;
But if this is a GCC extension, can I guarantee that it will work under GCC, since it's already nonstandard functionality? I tested it and it seemed to work, though I don't know how much testing I'd have to do to actually break it. (I'm hiding the call to __builtin_expect behind a macro, which is defined as a no-op for non-GCC, so it would be perfectly legal for other compilers.)

I think that what the standard was talking about was to account for doing something like this:
int x = printf("howdy");
if (setjmp(buf) != x ) {
function_that_might_call_longjmp_with_x(buf, x);
} else {
do_something_about_them_errors();
}
In this case you could not rely on x having the value that it was assigned in the previous line anymore. The compiler may have moved the place where x had been (reusing the register it had been in, or something), so the code that did the comparison would be looking in the wrong spot. (you could save x to another variable, and then reassign x to something else before calling the function, which might make the problem more obvious)
In your code you could have written it as:
int conditional;
conditional = setjmp(buf) != 0;
if(__builtin_expect( conditional, 1)) {
// handle error
} else {
// do action
}
And I think that we can satisfy ourselves that the line of code that assigns the variable conditional meets that requirement.

But if this is a GCC extension, can I guarantee that it will work under GCC, since it's already nonstandard functionality? I tested it and it seemed to work, though I don't know how much testing I'd have to do to actually break it. (I'm hiding the call to __builtin_expect behind a macro, which is defined as a no-op for non-GCC, so it would be perfectly legal for other compilers.)
You are correct, __builtin_expect should be a macro no-op for other compilers so the result is still defined.
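If you want to stay strictly inside what the standard allows, one option is to keep the setjmp comparison as the entire controlling expression and drop the branch hint at that spot. This sketch does not answer whether GCC guarantees the __builtin_expect form; it only shows the conservative alternative. The names might_fail and run_action are invented for the example.

```c
#include <setjmp.h>

static jmp_buf buf;

/* Hypothetical worker: jumps back to the setjmp point on failure. */
static void might_fail(int fail) {
    if (fail)
        longjmp(buf, 1);
}

/* Returns 0 if the action completed, -1 if it bailed out via longjmp. */
int run_action(int fail) {
    /* setjmp compared against an integer constant, as the whole condition
       of the if statement: one of the forms the standard explicitly allows. */
    if (setjmp(buf) != 0) {
        return -1; /* handle error */
    }
    might_fail(fail); /* do action */
    return 0;
}
```

The parameter fail is not modified between the setjmp call and the longjmp, so its value is still well-defined when control comes back.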
