Need clarification about constant expressions - c

K&R C, 2nd edition (section 2.3), states:
A constant expression is an expression that involves only constants. Such expressions may be evaluated during compilation rather than run-time, and accordingly may be used in any place that a constant can occur.
However, I have several questions about it:
1. Will this expression be considered a constant expression?
const int x=5;
const int y=6;
int z=x+y;
i.e., does using the const keyword make something a constant expression or not?
2. Is there any technique by which we can check whether an expression was evaluated during compilation or at run-time?
3. Are there any cases where compile-time evaluation produces a different result than run-time evaluation?
4. Should I even care about it? (Maybe I could use it to optimize my programs.)

1. Perhaps. A compiler can add more forms of constant expressions, so if it can prove to itself that the variable references are constant enough it can compute the expression at compile-time.
2. You can (of course) disassemble the code and see what the compiler did.
3. Not if the compiler is standards-compliant, no. The standard says "The semantic rules for the evaluation of a constant expression are the same as for nonconstant expressions" (§6.6 11 in the C11 draft).
4. Not very much, no. :) But do use const for code like that anyway!

using const keyword is considered constant expression or not?
>> No, it is not a constant expression. A variable declared with const is const-qualified, but it is not a compile-time constant.
Is there any technique by which we can check whether an expression was evaluated during compilation or during run-time?
>> (as mentioned in Mr. Unwind's answer) Disassemble the code.
Are there any cases where compile time evaluation produces different result than run-time evaluation?
>> No, it will not. Refer to §6.6, paragraph 11 of the C11 standard.
FWIW, when it appears as the operand of the sizeof operator (which is resolved at compile time unless the operand is a variable length array), a NULL pointer dereference is fine, because the operand is not evaluated. Actually dereferencing a NULL pointer at run time invokes undefined behaviour.
Should I even care about it? (maybe I use it to optimize my programs)
>> Opinion-based, so won't answer.

x and y are const; z is not. The compiler will probably substitute the values of x and y, but it will not treat z as a constant. It will probably also compute 5 + 6 itself and assign the result to z directly.
I am not sure. You could check the generated assembly code, but I do not know how that is done.
No. Compile-time evaluation means the expression has already been calculated by the time the program runs.
I care :) but it applies only when you need fast execution.

In C, the const qualifier is just a guarantee given by the programmer to the compiler that they will not change the object. Beyond that, it has no special meaning as it does in C++. The initializer for such an object at file scope has to be a constant expression.
As an extension, gcc has a builtin function (int __builtin_constant_p (exp)) to determine if a value is constant.
No, it shall not - unless you exploit implementation defined or undefined behaviour and compiler and target behave differently. [1]
As constant expressions are evaluated at compile-time, they save processing time and often code space and possibly data space. Also, in some places (e.g. global initializers), only constant expressions are allowed. See the standard.
[1]: One example is right shifting a signed negative integer constant, e.g. -1 >> 24. As that is implementation defined, the compiler might yield a different result from a program run using a variable which holds the same value:
int i = -1;
(-1 >> 24) == (i >> 24)
    ^            ^--- run-time evaluated by target
    +--- compile-time evaluated by compiler
The comparison might fail.

Related

Why bitwise-or doesn't result in a constant expression, but addition does

In one of my C files, I'm declaring an array foo. Then I'm assigning the address of that variable to an integer type, and I want to bitmask it with 3 to set the lowest two bits. However, the bitmask fails during compiling but adding +3 seems to work. Why?
uint64_t foo[1];
uint64_t bar = (uint64_t)foo | 3;
This fails with:
main.c:6:16: error: initializer element is not constant
uint64_t bar = (uint64_t)foo | 3;
But this works:
uint64_t foo[1];
uint64_t bar = (uint64_t)foo + 3;
As I understand it, the location of foo is not known at compile time because it's global (will be in the .data or .bss section). However, an entry is put into the relocation section so that the linker can patch the address in while linking.
How is it handling the bitwise OR and the addition? Why does one work while the other doesn't?
Initial values for static objects must be constant expressions or string literals. (C 2018 6.7.9 4: “All the expressions in an initializer for an object that has static or thread storage duration shall be constant expressions or string literals.”)
6.6 7 specifies forms of constant expressions for initializers:
More latitude is permitted for constant expressions in initializers. Such a constant expression shall be, or evaluate to, one of the following:
— an arithmetic constant expression,
— a null pointer constant,
— an address constant, or
— an address constant for a complete object type plus or minus an integer constant expression.
Consider uint64_t bar = (uint64_t)foo + 3;. foo is nominally the static array declared earlier, which is automatically converted to a pointer to its first element. This qualifies as an address constant (6.6 9: “An address constant is … a pointer to an lvalue designating an object of static storage duration, …”). However, it is then cast to uint64_t, and the result no longer qualifies as an address constant, an address constant plus or minus an integer constant expression, or a null pointer constant.
Is it an arithmetic constant expression? 6.6 8 excludes it:
… Cast operators in an arithmetic constant expression shall only convert arithmetic types to arithmetic types,…
Thus, (uint64_t)foo + 3 does not qualify as any form of constant expression required by the C standard. However, 6.6 10 says:
An implementation may accept other forms of constant expressions.
So a C implementation may accept (uint64_t) foo + 3 or (uint64_t) foo | 3 as a constant expression. The question then is why your C implementation accepts the former but not the latter.
A common feature of linkers and object module formats is that the object module can record placeholders for certain expressions, and the linkers can evaluate these expressions and replace the placeholders with calculated values. A primary purpose of this feature is to allow for code in a program to refer to places in data or other code whose locations are not completely known during compilation but that will be decided (at least relative to some base reference point) during linking.
Places in data or code are measured relative to symbols (names) defined in the object modules (or relative to the starts of sections or segments). Thus, a place may be described, in effect, as “34 bytes after the start of routine bar” or “8 bytes after the start of object baz”. So the object module has support for placeholders that are composed of a displacement and a symbol name. After the linker assigns addresses to symbols, it reviews each placeholder, adds the displacement to the assigned address, and replaces the placeholder with the calculated result.
It appears your compiler, in spite of the uint64_t cast, is able to recognize that (uint64_t) foo is still the address of foo, and therefore (uint64_t) foo + 3 may be implemented by the regular use of one of these placeholders.
In contrast, the bitwise OR operator is not supported for use in these placeholders, and therefore the compiler is unable to implement (uint64_t) foo | 3. It cannot evaluate the expression itself (because it does not know the final address for foo), and it cannot write a placeholder for the expression. So it does not accept this as a constant expression.
When you say
sometype *p = f(x);
where p is a global variable (or one with static duration) and where f(x) is not an actual function call but rather, some sequence of compile-time operations involving the address of another symbol x which won't be known until link time, the compiler obviously can't compute the initial value immediately. It actually emits an assembly language directive which causes the assembler to construct a relocation record which causes the linker to evaluate f(x) once the final location of the symbol x is known.
So f(x) (whatever sequence of operations it actually is) has to be, in effect, a function that the linker knows how to evaluate (and that there's a relocation record for, and if necessary an assembly language directive for). And while conventional linkers are good at performing addition and subtraction (because they do it all the time), they don't necessarily know how to perform other kinds of arithmetic.
So in consequence of all this, there are some additional rules on what kinds of arithmetic you can do while constructing pointer constants.
I'm in a hurry this morning and don't have time to dig through the Standard, but I'm pretty sure there's a sentence in there somewhere stating that among other restrictions on constant expressions, when you're initializing a pointer, you're limited to an address plus or minus an integer constant expression (since that's all the C Standard is willing to assume the linker is going to know how to do).
Your question has the additional complication that you're not actually initializing a pointer variable, but rather, an integer. In that case you get, in effect, the worst of both worlds: you're either not allowed to do it at all, or if the compiler lets you, the initializer on the right (since it involves an address/pointer), is limited to the kinds of arithmetic you can do while constructing pointer constants, as described above. You don't get to do the arbitrary arithmetic you'd be able to get away with (perhaps with confounding casts) in an integer expression at run time.
According to the standard, the result of casting a pointer to an integer type is not a constant expression. So both of your examples may be rejected by a conforming compiler.
However there is the clause C11 6.6/10:
An implementation may accept other forms of constant expressions.
which unfortunately means that any particular compiler could accept none, one, or both of your examples.

Why can't a static initialization expression in C use an element of a constant array?

The following (admittedly contrived) C program fails to compile:
int main() {
const int array[] = {1,2,3};
static int x = array[1];
}
When compiling the above C source file with gcc (or Microsoft's CL.EXE), I get the following error:
error: initializer element is not constant
static int x = array[1];
^
Such simple and intuitive syntax is certainly useful, so this seems like it should be legal, but clearly it is not. Surely I am not the only person frustrated with this apparently silly limitation. I don't understand why this is disallowed-- what problem is the C language trying to avoid by making this useful syntax illegal?
It seems like it may have something to do with the way a compiler generates the assembly code for the initialization, because if you remove the "static" keyword (such that the variable "x" is on the stack), then it compiles fine.
However, another strange thing is that it compiles fine in C++ (even with the static keyword), but not in C. So, the C++ compiler seems capable of generating the necessary assembly code to perform such an initialization.
Edit:
Credit to Davislor: in an attempt to appease the SO powers-that-be, I am seeking the following types of factual information to answer the question:
Is there any legacy code that supporting these semantics would break?
Have these semantics ever been formally proposed to the standards committee?
Has anyone ever given a reason for rejecting the allowance of these semantics?
Objects with static storage duration (read: variables declared at file scope or with the static keyword) must be initialized by compile time constants.
Section 6.7.9 of the C standard regarding Initialization states:
4 All the expressions in an initializer for an object that has static or thread storage duration shall be constant expressions or string literals.
Section 6.6 regarding Constant Expressions states:
7 More latitude is permitted for constant expressions in initializers. Such a constant expression shall be, or evaluate to, one of the following:
an arithmetic constant expression,
a null pointer constant,
an address constant, or
an address constant for a complete object type plus or minus an integer constant expression.
8 An arithmetic constant expression shall have arithmetic type and shall only have operands that are integer constants, floating constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, and _Alignof expressions. Cast operators in an arithmetic constant expression shall only convert arithmetic types to arithmetic types, except as part of an operand to a sizeof or _Alignof operator.
9 An address constant is a null pointer, a pointer to an lvalue designating an object of static storage duration, or a pointer to a function designator; it shall be created explicitly using the unary & operator or an integer constant cast to pointer type, or implicitly by the use of an expression of array or function type. The array-subscript [] and member-access . and -> operators, the address & and indirection * unary operators, and pointer casts may be used in the creation of an address constant, but the value of an object shall not be accessed by use of these operators.
By the above definition, a const variable does not qualify as a constant expression, so it can't be used to initialize a static object. C++, on the other hand, does treat const variables as true constants and thus allows them to initialize static objects.
If the C standard allowed this, then compilers would have to know what is in arrays. That is, the compiler would have to have a compile-time model of the array contents. Without this, the compiler has a small amount of work to do for each array: It needs to know its name and type (including its size), and a few other details such as its linkage and storage duration. But, where the initialization of the array is specified in the code, the compiler can just write the relevant information to the object file it is growing and then forget about it.
If the compiler had to be able to fetch values out of the array at compile time, it would have to remember that data. As arrays can be very large, that imposes a burden on the C compiler that the committee likely did not desire, as C is intended to operate in a wide variety of environments, including those with constrained resources.
The C++ committee made a different decision, and C++ is much more burdensome to translate.

GCC doesn't support simple integer constant expression?

GCC 4.9 and 5.1 reject this simple C99 declaration at global scope. Clang accepts it.
const int a = 1, b = a; // error: initializer element is not constant
How could such a basic feature be missing? It seems very straightforward.
C99 [1], section 6.6 (Constant expressions), is the controlling section. It states in paragraphs 6 and 7:
6/ An integer constant expression shall have integer type and shall only have operands that are integer constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, and floating constants that are the immediate operands of casts.
Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the sizeof operator.
The definition of integer and floating point constants is specified in 6.4.4 of the standard, and it's restricted to actual values (literals) rather than variables.
7/ More latitude is permitted for constant expressions in initializers. Such a constant expression shall be, or evaluate to, one of the following (a) an arithmetic constant expression, (b) a null pointer constant, (c) an address constant, or (d) an address constant for an object type plus or minus an integer constant expression.
Since a is none of those things in either subsection 6 or 7, it is not considered a constant expression as per the standard.
The real question, therefore, is not why gcc rejects it but why clang accepts it, and that appears to be buried in subsection 10 of that same section:
10/ An implementation may accept other forms of constant expressions.
In other words, the standard states what an implementation must allow for constant expressions but doesn't limit implementations to allowing only that.
[1] C11 is much the same, other than minor things like allowing _Alignof as well as sizeof.
This is just the rules of C. It has always been that way. At file scope, initializers must be constant expressions. The definition of a constant expression does not include variables declared with const qualifier.
The rationale behind requiring initializers computable at compile-time was so that the compiler could just put all of the initialized static data as a block in the executable file, and then at load time that block is loaded into memory as a whole and voila, the global variables all have their correct initial values without any code needing to be executed.
In fact if you could have executable code as initializer for global variables, it introduces quite a lot of complication regarding which order that code should be run in. (This is still a problem in modern C++).
In K&R C, there was no const. They could have had a rule that if a global variable is initialized by a constant expression, then that variable also counts as a constant expression. And when const was added in C89, they could have also added a rule that const int a = 5; leads to a constant expression.
However, they didn't. I don't know why for sure, but it seems likely that it has to do with keeping the language simple. Consider this:
extern const int a, b = a;
with const int a = 5; being in another unit. Whether or not you want to allow this, it is considerably more complication for the compiler, and some more arbitrary decisions.
If you look at the current C++ rules for constant expressions (which still are not settled to everyone's satisfaction!) you'll see that each time you add support for one more "obvious" thing then there are two other "obvious" things that are next in line and it is never-ending.
In the early days of C, in the 1970s, keeping the compiler simple was important so it may have been that making the compiler support this meant the compiler used too many system resources, or something. (Hopefully a coder from that era can step in and comment more on this!)
Finally, the C89 standardization was quite a contentious process since there were so many different C compilers that had each gone their own way with language evolution. Demanding that a compiler vendor who doesn't support this, change their compiler to support it might be met with opposition, lowering the uptake of the standard.
Because const doesn't make a constant expression -- it makes a variable that can't be assigned to (only initialized). You need constexpr to make a constant expression, which in the C99 and C11 era was only available in C++ (C23 later added constexpr for objects). C99 has no way of making a named constant expression (other than a macro, which is sort-of, but not really an expression at all).

Defining a constant in terms of other constants

Sorry if this question seems naive, but I haven't been able to find a clear answer to it anywhere. I must define a constant in terms of previously defined constants, like
#define CONST_A 2
#define CONST_B 3
#define CONST_C CONST_A*CONST_B
The actual values of CONST_A and CONST_B are passed to gcc as define flags, so I can't just write #define CONST_C 6.
If I understand correctly, this will tell the preprocessor to replace any appearance of CONST_C by 2*3 and not 6, right? I'm mainly worried about performance, so I would prefer the latter. I'm guessing this could be done by using static const instead of preprocessor #define. Is this the best option?
Thanks in advance!
Don't worry about performance of constant expressions like 2 * 3 in C. C compilers have been able to eliminate such expressions by evaluating them at compile-time for at least 20 years.
static const can be preferred for other reasons, such as type-safety or not having to worry about precedence (think what happens if CONST_A is defined as 2+2), but not for performance reasons.
C says that constant expressions can be evaluated at compile time, and any decent modern compiler will do so. This compiler optimization is known as constant folding.
(C99, 6.6p2) "A constant expression can be evaluated during translation rather than runtime, and accordingly may be used in any place that a constant may be."

The const modifier in C

I'm quite often confused when coming back to C by the inability to create an array using the following initialisation pattern...
const int SOME_ARRAY_SIZE = 6;
const int myArray[SOME_ARRAY_SIZE];
My understanding of the problem is that the const qualifier does not guarantee const-ness but merely asserts that the value of SOME_ARRAY_SIZE will not change at runtime. But why can the compiler not assume that the value is constant at compile time? It says 6 right there in the source code...
I think I'm missing something core in my fundamental understanding of C. Somebody help me out here. :)
[UPDATE] After reading a bit more about C99 and variable length arrays, I think I understand this a bit better. What I was trying to create was a variable length array: const does not create a compile-time constant but rather a runtime constant. Therefore I was declaring a variable length array, which is only valid in C99 at function/block scope. A variable length array at file scope is impossible, as the compiler cannot assign a fixed memory address to an unbounded array. [/UPDATE]
Well, in C++ the semantics are a bit different. In C++ your code would work fine. You must distinguish between two things: const and constant expression. const simply means, as you described, that the value is read-only. A constant expression, on the other hand, means the value is known at compile time and is a compile-time constant. The semantics of const in C are always of the first type. Constant expressions in C are essentially built from literals (plus things like enumeration constants and sizeof expressions), which is why #define is used for this kind of thing.
In C++ however, any const object initialized with a constant expression is in itself a constant expression.
I don't know exactly WHY this is so in C; it's just the way it is.
The problem is that the language syntax demands an integer constant between the [ ]. SOME_ARRAY_SIZE is still a variable (even if you told the compiler nobody is allowed to vary it!)
The const keyword is basically a read-only indication. It does not, really, indicate the underlying value will not change, even though that is the case in your example.
When it comes to pointers, this is more clear:
void foo(int const * p)
{
    if (*p == 100)
    {
        bar();
        /* Here, the compiler can not assume that *p is 100 */
    }
}
In this case, a compiler should not accept the code in your example, as it requires the array size to be constant. If it did accept it, the user could later run into trouble when porting the code to a stricter compiler.
You can do this in C99, and some compilers prior to C99 also had support for this as an extension to C89 (e.g. gcc). If you're stuck with an old compiler that doesn't have C99 support though (e.g. MSVC) then you'll have to do it the old skool way and use a #define for the array size.
Note that the above comments apply only to such declarations at local scope (i.e. automatic variables). C99 still doesn't allow such declarations at global scope.
I just did a very quick test in Xcode with an Objective-C file I currently had open on my machine, and put this in the .m file:
const int arrs = 6;
const int arr[arrs];
This compiles without any issues.

Resources