It's clear from the C standard that general function calls are expressions from the definition:
An expression is a sequence of operators and operands that specifies computation of a value, or that designates an object or a function, or that generates side effects, or that performs a combination thereof. The value computations of the operands of an operator are sequenced before the value computation of the result of the operator.
(6.5, paragraph 1)
Since () is an operator and the call yields a value, regular function calls are obviously expressions.
But those which don't return a value don't seem to fit with this definition. The function name itself does (as it designates a function), but this isn't a function call.
The standard does clearly say that a function call is an expression, and that it can return void, but this seems to conflict with the definition of an expression. What am I missing?
Calling a function is an expression regardless of the function's return type. C's grammar is orthogonal to its type system. They are independent pieces of the language. Grammatically func(); is an expression statement.
expression_statement
: ';'
| expression ';'
;
postfix_expression
: primary_expression
| postfix_expression '[' expression ']'
| postfix_expression '(' ')'
| postfix_expression '(' argument_expression_list ')'
There are very few things you can do with a void result. You can't assign it to a variable since void variables aren't allowed. If func()'s result is void you can use four operators:
Parentheses: (func())
Comma sequencing: func(), 42
Ternary operator: 42 ? func() : func().
Cast to void: (void) func()
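You can also return a void result from a function that itself returns void. Strictly speaking, ISO C (unlike C++) forbids a return statement with an expression in a void function, but common compilers accept it and typically warn only in pedantic mode: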
return func();
Finally, in a for(init; condition; increment) loop the three pieces are all expressions. init and increment (but not condition) can be void.
for (func(); 42; func()) { }
Few of these are useful and none are good style, but they're all legal.
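As a minimal sketch of the forms listed above, the following exercises each context in which the void result of func() may appear. The counter calls and the helper use_void_results are hypothetical names introduced for illustration. The return func(); form is left out because it relies on a compiler extension in C.

```c
int calls = 0;                     /* hypothetical counter, for illustration only */

void func(void) { calls++; }

/* Uses the void result of func() in each permitted context and
   returns how many times func() actually ran. */
int use_void_results(void)
{
    int n;

    calls = 0;
    (func());                      /* parentheses */
    n = (func(), 42);              /* comma: void left operand, value 42 */
    n ? func() : func();           /* conditional: both branches void, one runs */
    (void) func();                 /* cast to void */
    for (func(); n == 42; func())  /* init and increment may be void */
        n = 0;                     /* leave the loop after one iteration */
    return calls;
}
```

Calling use_void_results() from main should report six calls: one per operator form, plus the init and increment clauses of the loop.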
Paragraph 1 of clause 6.5 was not completely thought out with regard to void. The C standard is imperfect and has a number of defects. This paragraph should be received as a general description to orient readers and is not a precise mathematical specification of what an expression is.
It is said that:
An expression is a sequence of operators and operands that
specifies computation of a value, or
that designates an object or a function or
that generates side effects
or that performs a combination thereof.
Specifying the computation of a value is only one of the possibilities. A call to a void function falls under "that generates side effects".
Any expression in the expression statement in C is considered a void expression. C11 6.8.3 Expression and null statements p2:
The expression in an expression statement is evaluated as a void expression for its side effects. 153)
153) Such as assignments, and function calls which have side effects.
i.e. in the expression statement
a = 5;
a = 5 is a void expression that is evaluated for its side effects only, i.e. the assignment of value 5 into a, not for computation of a value, even though a = 5 could be used for a computation of a value in other contexts. Likewise you can write a; and it is a legal use of an expression "evaluated for its side effects", even though it has none. It does not cease to be an expression there.
The LHS of a comma operator is a void expression. A void expression can be used in ? : - then both branches will be void expressions and the entire expression in itself will be a void expression.
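A small sketch of that distinction, using a hypothetical helper named assignment_contexts: the same kind of assignment is evaluated once as a void expression and once for its value.

```c
/* Hypothetical helper: returns the sum of the two variables at the end. */
int assignment_contexts(void)
{
    int a = 0, b = 0;

    a = 5;             /* expression statement: evaluated as a void expression,
                          only the side effect (storing 5) matters */
    b = (a = 7) + 1;   /* here the value of the assignment, 7, is used */
    (void) a;          /* any expression can be cast to void; value discarded */
    return a + b;      /* 7 + 8 */
}
```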
An expression in C can be void.
Such an expression has no value, so it cannot be assigned to an object.
Moreover, any expression can be cast to void.
There is probably a contradiction in the C standard concerning VM (variably modified) types used in the conditional operator. Assume:
int f(void) { return 42; }
Now, what is the type of a following expression?
1 ? 0 : (int(*)[f()]) 0
Its value must be equal to NULL, but I am not sure what its type is.
The int(*)[f()] is a pointer to a VLA of size f(). However, to complete this VM type, the size expression f() must be evaluated. The problem is that it belongs to a branch of the ternary operator which is not evaluated. From 6.5.15p4:
The second operand is evaluated only if the first compares unequal to 0; the third operand is evaluated only if the first compares equal to 0
The rules of the conditional operator require that, when one operand is a null pointer constant, the result has the type of the other operand (6.5.15p6):
... if one operand is a null pointer constant, the result has the type of the other operand; ...
How to solve this contradiction? Possible solutions are:
int(*)[f()] - the f() is evaluated anyway
int(*)[] - the array type stays incomplete
undefined behavior
something else?
The rules of composite type suggest that this may be UB, but I am not sure whether those rules apply to this case. I am looking for an answer that cites the C17 specification, but wording from the upcoming C2X is fine as well.
I think this boils down to what 6.5.15p4 means by "not evaluated". That paragraph has footnote 112:
112) A conditional expression does not yield an lvalue.
We don't really need an lvalue for the null pointer (int(*)[f()]) 0; its type is enough.
Given the above, if partial evaluation that does not produce an lvalue is not counted as evaluation, then there is no contradiction: f() can be evaluated and the type can be int(*)[42].
I am learning C so I tried the below code and am getting an output of 7,6 instead of 6,7. Why?
#include <stdio.h>

int f1(int);

void main()
{
    int b = 5;
    printf("%d,%d", f1(b), f1(b));
}

int f1(int b)
{
    static int n = 5;
    n++;
    return n;
}
The order of the evaluation of the function arguments is unspecified in C. (Note there's no undefined behaviour here; the arguments are not allowed to be evaluated concurrently for example.)
Typically the evaluation of the arguments is either from right to left, or from left to right.
As a rule of thumb, don't call the same function twice in one argument list if that function has side effects (as it does in your case), and don't pass the same argument twice if that lets something at the call site be modified (e.g. passing a pointer).
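One way to get 6,7 regardless of the compiler is to make the calls in separate statements before printing. This is a minimal sketch that reuses f1 from the question; the helper format_in_order and its buffer parameters are hypothetical, and it writes to a buffer instead of calling printf so the result can be checked.

```c
#include <stdio.h>

int f1(int b)
{
    static int n = 5;
    (void) b;              /* unused, kept to match the question's signature */
    n++;
    return n;
}

/* Each declaration below is a full expression followed by a sequence
   point, so the first call completes before the second one starts. */
int format_in_order(char *buf, size_t size)
{
    int first = f1(5);
    int second = f1(5);
    return snprintf(buf, size, "%d,%d", first, second);
}
```

On the first invocation the buffer should hold 6,7, because the order of the two calls no longer depends on how the compiler evaluates an argument list.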
https://en.cppreference.com/w/c/language/eval_order
Rule (2) on that page covers the comma operator:
There is a sequence point after evaluation of the first (left) operand and
before evaluation of the second (right) operand of the following binary
operators: && (logical AND), || (logical OR), and , (comma).
It does not apply here, because the commas that separate function arguments are punctuators, not comma operators. The order in which arguments are evaluated was already unspecified before C11, which lets an implementation evaluate them right to left on platforms where arguments are pushed in that order. C11 restates this in terms of sequencing as Rule (12):
A function call that is not sequenced before or sequenced after another
function call is indeterminately sequenced (CPU instructions that
constitute different function calls cannot be interleaved, even if the
functions are inlined)
The same holds for initializers, including C99 designated initializers: the commas in an initializer list are not comma operators, and the order of side effects among the initializers was unspecified in C99 as well. C11 words this as Rule (13):
In initialization list expressions, all evaluations are indeterminately
sequenced
In other words, Rule (2) guarantees left-to-right order only for the comma operator itself. Requiring that order for arguments and initializers would lead to code that cannot be optimized on some platforms. There are not enough registers if the number of structure members or function parameters exceeds some threshold, so "register pressure" becomes an issue.
The grammar makes the distinction explicit: each function argument and each initializer is an assignment-expression, which cannot contain an unparenthesized comma operator, so Rules (12) and (13) apply there and Rule (2) does not.
I was reading this excerpt from the GNU C manual:
You use the comma operator , to separate two (ostensibly related) expressions.
Later in the description:
If you want to use the comma operator in a function argument, you need
to put parentheses around it. That’s because commas in a function
argument list have a different meaning: they separate arguments.
Until now, everything is alright. The weird part is:
foo (x, (y=47, x), z); is a function call with just three
arguments. (The second argument is (y=47, x) .)
The question is: how is the parameter pushed on the stack, how do I access it from within the function?
In your case,
foo (x, (y=47, x), z);
is functionally similar to
foo (x, x, z);
As per the property of comma operator, the LHS operand is evaluated and the result is discarded, then the RHS operand is evaluated and that's the result.
For the sake of completeness, quoting C11, chapter §6.5.17:
The left operand of a comma operator is evaluated as a void expression; there is a
sequence point between its evaluation and that of the right operand. Then the right
operand is evaluated; the result has its type and value.
Point to note: the variable y will be updated, because the LHS operand is evaluated as a void expression, but its value is not passed to the function. If y is a global variable that foo() reads, foo() will see the value 47.
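The behaviour described above can be sketched as follows. The body of foo and the wrapper call_foo are hypothetical, chosen only so that the effect is observable: foo reports whether its first two arguments are equal and whether the global y had already been set when it ran.

```c
int y = 0;                 /* global, so foo can observe the assignment */

/* Hypothetical foo: returns 1 if the second argument equals the first
   and y was already 47 when the call started. */
int foo(int a, int b, int c)
{
    (void) c;              /* third argument not needed for the check */
    return (a == b) && (y == 47);
}

int call_foo(void)
{
    int x = 1, z = 3;
    /* Three arguments: x, the value of (y = 47, x), which is x, and z.
       There is a sequence point before the actual call, so y is 47
       by the time foo runs. */
    return foo(x, (y = 47, x), z);
}
```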
That said, to answer
how is the parameter pushed on the stack
is very implementation (architecture) dependent. C does not specify any order for function argument passing, and some architectures may not even use a stack for argument passing at all.
In standard C (C99/C11) we have the so-called integer constant expressions, which are constant expressions whose operands are all constant integers.
The following definition applies:
Standard C99, Section 6.6(par.6):
An integer constant expression shall have integer type and shall
only have operands that are integer constants, enumeration constants,
character constants, sizeof expressions whose results are integer
constants, and floating constants that are the immediate operands of
casts.
This appears after the definition of the more general constant expression.
(Since integer constant expressions are defined after constant expressions, I assume that the former are a particular case of the latter.)
On the other hand, a constant expression is syntactically a conditional expression, constrained by the following rule:
Standard C99, Section 6.6:
Constant expressions shall not contain assignment, increment,
decrement, function-call, or comma operators, except when they are
contained within a subexpression that is not evaluated.
By unrolling the grammar of conditional expressions we eventually arrive at postfix expressions and/or unary expressions.
Now, if we apply these constraints to integer constant expressions, we roughly obtain that they consist of conditional expressions restricted in such a way that every operand is an integer, enumeration or character constant (or a floating constant immediately preceded by a cast), and such that there are no assignment, increment, decrement, function-call or comma operators.
For simplicity, let us suppose that E is such an expression, without any sizeof operator and without non-evaluated operands.
MY QUESTION IS:
Are the following operators indirectly forbidden in E:
& (address),
* (indirection),
[] (array-subscript),
. (struct member),
-> (pointer to struct members).
In addition, are compound literals also forbidden?
Additional note: I am interested in answering this question for strictly conforming programs (C99/C11).
I think that they cannot be in any subexpression of E, but I am not sure if this is completely true. My quick reasoning is as follows:
If F is an integer constant subexpression of E, then F has, by definition, an integer type T.
If the unary operator & appears before F in E, then &F is an operand having type "pointer to T", which is not allowed in E (besides, F is not an object but only an integer value, so & cannot be applied to it anyway). Thus & cannot appear in any E.
Since F does not have a pointer type, the expression *F makes no sense.
A subscript operator [] is used to indicate an element inside an array. This means that we would have in E something like A[N]. Here, N must be an integer constant expression. However we note that A is also an operand, but it is an object of type array, which is not allowed in E. This implies that the array-subscript operator cannot appear in E.
If we have the operators . and -> in E, they are used inside E as S.memb and pS->memb. Thus, we have the operand S, whose type is struct or union, and pS, which is a pointer to struct or pointer to union. But these kinds of operands are not allowed in E.
Compound literals are not allowed in E, because they are lvalues, which implies they will have an address when the program runs. Since such an address cannot be known by the compiler, the expression involving a compound literal is not considered a constant.
Do you think that my reasoning is right?
Do you know of exceptional cases in which some of these operators or expressions can be [part of] an integer constant expression (as in the restricted case that I denoted E)?
An ICE only has to have values (rvalues in the jargon) as primary expressions that constitute it, and no objects (lvalues).
If you build up from there to exclude operators you see that
none of the operators that need an lvalue as operand can be used (assignment, increment, decrement, unary &)
none of the operators that produce an lvalue can be used either (unary *, array subscript [], member access ->)
the . operator, which needs a struct as its operand, cannot be used either, since there are no literals for struct
Compound literals are a misnomer, they are objects.
Function calls are not allowed either.
Some of these operators can appear in places where they are not evaluated (or are supposed not to be), in particular in _Alignof, the macro offsetof and some uses of sizeof.
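A minimal sketch of these rules follows. It relies on the fact that an enumerator's value must be an integer constant expression, so each initializer that compiles is an accepted ICE. The struct point and all enumerator names are hypothetical.

```c
#include <stddef.h>   /* offsetof */

struct point { int x; int y; };   /* hypothetical type for illustration */

enum {
    FROM_ARITHMETIC = 2 * 3 + 1,                     /* integer constants only */
    FROM_CHAR       = 'A' - 'A' + 4,                 /* character constants */
    FROM_CAST       = (int) 2.9,                     /* floating constant as the
                                                        immediate operand of a cast */
    FROM_SIZEOF     = sizeof(int[3]) / sizeof(int),  /* sizeof with a constant result */
    FROM_MEMBER     = sizeof(((struct point *) 0)->y) == sizeof(int),
                                                     /* -> appears, but its operand
                                                        is not evaluated */
    FROM_OFFSETOF   = offsetof(struct point, x)      /* offsetof yields an ICE */
};
```

By contrast, an initializer that names an object, such as arr[0] or s.memb for some variables arr and s, is not an integer constant expression in strictly conforming C, even though some compilers may accept it as an extension.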
With respect to the return statement, the Microsoft Visual Studio documentation says
Syntax:
return expression;
where expression is marked as optional. It continues
The value of expression, if present, is returned to the calling function. If expression is
omitted, the return value of the function is undefined.
This is quite clear, but on the other hand, there is the notion of an empty expression. This confuses me. Thinking of the empty expression not as nothing, but as an expression which is empty, I would think that if we have a function
void foo(void)
{
return;
}
then the expression foo() could be used wherever the empty expression is allowed. For example, the behaviour of the code
unsigned int i=0;
for(foo();i<10;i++) printf("%u",i);
would be defined.
I am aware that this is probably of little practical relevance, but I would like to understand why, in this context, the empty expression is not an expression.
It's called a void expression, and you can use a void expression in a for loop as you did. A void expression is evaluated only for its side effects, which here means calling the function.
In fact, if the first clause of for is any type of expression, it's evaluated as a void expression:
C99 6.8.5.3 The for statement
... If clause-1 is an expression, it is evaluated as a void expression before the first evaluation of the controlling expression.
and
C99 6.3.2.2 void
The (nonexistent) value of a void expression (an expression that has type void) shall not
be used in any way, and implicit or explicit conversions (except to void) shall not be
applied to such an expression. If an expression of any other type is evaluated as a void
expression, its value or designator is discarded. (A void expression is evaluated for its
side effects.)
I don't know if there actually is such a thing in C as "the empty expression", but if there is, then you are confusing it with the type void, and at the same time you are confusing the act of returning "nothing" from a non-void function (which is illegal) with leaving out the initializer expression of a for loop (which is legal).
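To make the for-loop case concrete, here is a sketch based on the question's code. It writes the digits into a buffer rather than printing them so the result can be checked; the counter foo_calls and the helper run_loop are hypothetical additions.

```c
#include <stdio.h>

int foo_calls = 0;                 /* hypothetical counter */

void foo(void)
{
    foo_calls++;
    return;                        /* no expression: nothing is returned */
}

/* Writes the digits 0..9 into buf. foo() is clause-1 of the for
   statement, so it is evaluated once, as a void expression, before
   the first test of the controlling expression. */
int run_loop(char *buf, size_t size)
{
    unsigned int i = 0;
    size_t used = 0;

    for (foo(); i < 10 && used + 1 < size; i++)
        used += (size_t) snprintf(buf + used, size - used, "%u", i);
    return foo_calls;
}
```

With a buffer of at least 11 bytes, the buffer should end up holding 0123456789 and foo should have run exactly once.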