The following code has variables that are used to initialize themselves. I have difficulty understanding when a variable declaration is complete, and whether some of these are illegal even though they compile with gcc.
int main(void)
{
int a = a;
int b = (int) &b;
int c = c ? 1 : 0;
int d = sizeof(d);
}
In your code
int a = a;
is UB, because you're reading an indeterminate value.
int b = (int) &b;
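will compile, because b is already in scope and has storage, so taking its address is fine. However, the result of converting a pointer to an integer is implementation-defined, and the standard does not guarantee that an int can hold it. If the result cannot be represented in an int, the behavior is undefined.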
int c = c ? 1 : 0;
is UB for the same reason as first one.
int d = sizeof(d);
is fine, as in this case, sizeof gets evaluated at compile time and the value is a compile time constant.
This concept is better described in the C++ Standard (3.3.2 Point of declaration) and has the same meaning in the C Standard:
1 The point of declaration for a name is immediately after its
complete declarator (Clause 8) and before its initializer (if any),
except as noted below.
[ Example:
int x = 12;
{ int x = x; }
Here the second x is initialized with its own (indeterminate) value.
—end example ]
In the code example you showed, in this declaration
int a = a;
variable a is initialized with itself, so it has an indeterminate value.
This statement
int b = (int) &b;
is valid and variable b has an implementation-defined value.
This declaration
int c = c ? 1 : 0;
in fact is equivalent to the first declaration. Variable c has an indeterminate value.
This declaration
int d = sizeof(d);
is valid because the expression used in the operator sizeof is unevaluated.
See Section 6.2.1 Scopes of identifiers in the C11 specification. The scope of a variable begins just after the completion of its declarator. For the meaning of declarator see Section 6.7.6 Declarators. Note that the initializer (if present) comes after the declarator, so the variable being declared is in scope within the initializer. See Section 6.7 Declarations for the syntax of declarations, the initializer is part of an init-declarator which is defined as
init-declarator:
declarator
declarator = initializer
In
int a = a;
the declaration of a is complete before a is evaluated in the initializer. The same goes for the second and fourth declarations; the difference is that the first and third invoke undefined behavior.
In case of
int c = c ? 1 : 0;
the problem is that variable c is used before initialization and may result in undefined behavior.
Related
#include <stdio.h>
#define a (1,2,3)
#define b {1,2,3}
int main()
{
unsigned int c = a;
unsigned int d = b;
printf("%d\n",c);
printf("%d\n",d);
return 0;
}
Above C code will print output as 3 and 1.
But how are #define a (1,2,3) and #define b {1,2,3} taking a=3 and b=1 without build warning, and also how () and {} are giving different values?
Remember, the preprocessor just replaces macros. So in your case your code will be converted to this:
#include <stdio.h>
int main()
{
unsigned int c = (1,2,3);
unsigned int d = {1,2,3};
printf("%d\n",c);
printf("%d\n",d);
return 0;
}
In the first case you get the result of the , (comma) operator, so c will be equal to 3. In the second case you get the first member of the initializer list for d, so you get 1 as the result.
The second line is an error if you compile the code as C++, but it seems that you can compile it as C.
In addition to other answers,
unsigned int d = {1,2,3};
(after macro substitution)
is not valid in C. It violates 6.7.9 Initialization:
No initializer shall attempt to provide a value for an object not contained within the entity being initialized.
With stricter compilation options (gcc -std=c17 -Wall -Wextra -pedantic test.c), gcc produces:
warning: excess elements in scalar initializer
unsigned int d = {1,2,3};
^
However, note that
unsigned int d = {1};
is valid, because initializing a scalar with braces is allowed. It is only the extra initializer values that are the problem in the former snippet.
For c, the initializer is an expression, and its value is 3. For d, the initializer is a list in braces, and it provides too many values, of which only the first is used.
After macro expansion, the definitions of c and d are:
unsigned int c = (1,2,3);
unsigned int d = {1,2,3};
In the C grammar, the initializer that appears after unsigned int c = or unsigned int d = may be either an assignment-expression or { initializer-list } (and may have a final comma in that list). (This comes from C 2018 6.7.9 1.)
In the first line, (1,2,3) is an assignment-expression. In particular, it is a primary-expression of the form ( expression ). In that, the expression uses the comma operator; it has the form expression , assignment-expression. I will omit the continued expansion of the grammar. Suffice it to say that 1,2,3 is an expression built with comma operators, and the value of the comma operator is simply its right-hand operand. So the value of 1,2 is 2, and the value of 1,2,3 is 3. And the value of the parentheses expression is the value of the expression inside it, so the value of (1,2,3) is 3. Therefore, c is initialized to 3.
In contrast, in the second line, {1,2,3} is { initializer-list }. According to the text in C clause 6.7.9, the initializer-list provides values used to initialize the object being defined. The { … } form is provided to initialize arrays and structures, but it can be used to initialize scalar objects too. If we wrote unsigned int d = {1};, this would initialize d to 1.
However, 6.7.9 2 is a constraint that says “No initializer shall attempt to provide a value for an object not contained within the entity being initialized.” This means you may not provide more initial values than there are things to be initialized. Therefore, unsigned int d = {1,2,3}; violates the constraint. A compiler is required to produce a diagnostic message. Additionally, your compiler seems to have gone on and used only the first value in the list to initialize d. The others were superfluous and were ignored.
(Additionally, 6.7.9 11 says “The initializer for a scalar shall be a single expression, optionally enclosed in braces.”)
The following code is the basic implementation of the if - else conditional statements -
#include <stdio.h>
#include <stdlib.h>
int x;
int main()
{
if(x)
printf("hi");
else
printf("how r u \n");
return 0;
}
6.9.2 External object definitions
Semantics
1 If the declaration of an identifier for an object has file scope and an initializer, the
declaration is an external definition for the identifier.
2 A declaration of an identifier for an object that has file scope without an initializer, and
without a storage-class specifier or with the storage-class specifier static, constitutes a
tentative definition. If a translation unit contains one or more tentative definitions for an
identifier, and the translation unit contains no external definition for that identifier, then
the behavior is exactly as if the translation unit contains a file scope declaration of that
identifier, with the composite type as of the end of the translation unit, with an initializer
equal to 0.
So global variables are never left uninitialized: without an explicit initializer they are initialized to 0. Hence the output is how r u.
According to the C standard, your code does not trigger undefined behaviour:
If an object that has static or thread storage duration is not initialized explicitly, then:
- if it has pointer type, it is initialized to a null pointer;
- if it has arithmetic type, it is initialized to (positive or unsigned) zero;
This is taken from the C11 standard, but C89, and C99 both defined this behaviour, too:
If an object that has static storage duration is not initialized explicitly, it is initialized implicitly as if every member that has arithmetic type were assigned 0 and every member that has pointer type were assigned a null pointer constant.
Because you're declaring x as a global variable, it has static storage duration, and therefore x is guaranteed to be initialized to 0 (it being an int, it obviously has an arithmetic type).
Your main function, therefore reads like this:
int main(void)//use int main(void), not int main()
{
if(0)//x is 0, 0 is false
printf("hi");
else
printf("how r u \n");//because if (x) is false, this is executed
return 0;
}
That's why the output of your program will be "how r u".
//global
int x;
You have declared 'x' as a global variable, therefore its default value is 0.
In C, I believe the following program is valid: it casts a pointer to an allocated memory buffer to a pointer to an array, like this:
#include <stdio.h>
#include <stdlib.h>
#define ARRSIZE 4
int *getPointer(int num){
return malloc(sizeof(int) * num);
}
int main(){
int *pointer = getPointer(ARRSIZE);
int (*arrPointer)[ARRSIZE] = (int(*)[ARRSIZE])pointer;
printf("%zu\n", sizeof(*arrPointer) / sizeof((*arrPointer)[0]));
return 0;
}
(this outputs 4).
However, is it safe, in C99, to do this using VLAs?
int arrSize = 4;
int *pointer = getPointer(arrSize);
int (*arrPointer)[arrSize] = (int(*)[arrSize])pointer;
printf("%zu\n", sizeof(*arrPointer) / sizeof((*arrPointer)[0]));
return 0;
(also outputs 4).
Is this legit, according to the C99 standard?
It'd be quite strange if it is legit, since this would mean that VLAs effectively enable dynamic type creation, for example, types of the kind type(*)[variable].
Yes, this is legit, and yes, the variably-modified type system is extremely useful. You can use natural array syntax to access a contiguous 2-D array both of whose dimensions were not known until runtime.
It could be called syntactic sugar as there's nothing you can do with these types that you couldn't do without them, but it makes for clean code (in my opinion).
I would say it is valid. The final version of the C99 standard (cited on Wikipedia) says in section 6.7.5.2 (Array declarators), paragraph 5:
If the size is an expression that is not an integer constant expression: ...
each time it is evaluated it shall have a value greater than zero. The size of each instance
of a variable length array type does not change during its lifetime.
It even explicitly says that it can be used in a sizeof provided the size never changes: Where a size
expression is part of the operand of a sizeof operator and changing the value of the
size expression would not affect the result of the operator, it is unspecified whether or not
the size expression is evaluated.
But the standard also says that this is only allowed in a block scope declarator or a function prototype : An ordinary identifier (as defined in 6.2.3) that has a variably modified type shall have
either block scope and no linkage or function prototype scope. If an identifier is declared
to be an object with static storage duration, it shall not have a variable length array type.
And an example later explains that it cannot be used for member fields, even in a block scope :
...
void fvla(int m, int C[m][m]); // valid: VLA with prototype scope
void fvla(int m, int C[m][m]) // valid: adjusted to auto pointer to VLA
{
typedef int VLA[m][m]; // valid: block scope typedef VLA
struct tag {
int (*y)[n]; // invalid: y not ordinary identifier
int z[n]; // invalid: z not ordinary identifier
};
...
I believe I am right, but I'm just making sure:
int c; declares c,
c = 5; initializes c to be equal to 5,
and
int c = 5; both declares and initializes c.
Am I correct on all of these? And initialization is just the first value the variable is set to, correct?
int c;
declares and defines c.
c = 5;
is not an initializer, but it assigns the value 5 to c, which has the same effect.
An initializer is a syntactic construct, part of a declaration. An assignment is a different syntactic construct that does more or less the same thing.
This:
int c = 5;
declares and initializes c; the 5 is the initializer.
This:
int c;
c = 5;
has the same effect, but there is no initializer.
(You can informally say that assigning a value to a variable "initializes" it, but it does so without using an initializer.)
One case where the distinction is important:
const int c = 5;
This initializes c to 5. You can't do the same thing with an assignment because you can't assign to a const (read-only) object.
Initialization is the setting of the initial value of a variable, so you are correct.
This is the first line of the Wikipedia article on initialization:
In computer programming, initialization is the assignment of an initial value for a data object or variable.
All your statements are correct, but you are missing one distinction: the difference between definition and declaration.
int c; both declares and defines c, but does not initialize it.
extern int c; will declare it but not define it. (It does not allocate storage.)
After reading the chapter about structures in the K&R book I decided to make some tests to understand them better, so I wrote this piece of code:
#include <stdio.h>
#include <string.h>
struct test func(char *c);
struct test
{
int i ;
int j ;
char x[20];
};
main(void)
{
char c[20];
struct {int i ; int j ; char x[20];} a = {5 , 7 , "someString"} , b;
c = func("Another string").x;
printf("%s\n" , c);
}
struct test func(char *c)
{
struct test temp;
strcpy(temp.x , c);
return temp;
}
My question is: why is c = func("Another string").x; working (I know that it's illegal, but why is it working)? At first I wrote it using strcpy() (because that seemed the most logical thing to do) but I kept having this error:
structest.c: In function ‘main’:
structest.c:16:2: error: invalid use of non-lvalue array
char c[20];
...
c = func("Another string").x;
This is not valid C code. Not in C89, not in C99, not in C11.
Apparently it compiles with the latest gcc versions 4.8 in -std=c89 mode without diagnostic for the assignment (clang issues the diagnostic). This is a bug in gcc when used in C89 mode.
Relevant quotes from the C90 Standard:
6.2.2.1 "A modifiable lvalue is an lvalue that does not have array type, does not have an incomplete type, does not have a const-qualified type, and if it is a structure or union, does not have any member (including, recursively, any member of all contained structures or unions) with a const-qualified type."
and
6.3.16 "An assignment operator shall have a modifiable lvalue as its left operand."
6.3.16 is a constraint, so a conforming compiler is required to issue at least a diagnostic. gcc does not, so this is a bug.
It's a bug in gcc.
An expression of array type is, in most contexts, implicitly converted to a pointer to the first element of the array object. The exceptions are when the expression is (a) the operand of a unary sizeof operator; (b) the operand of a unary & operator; or (c) a string literal in an initializer used to initialize an array object. None of those exceptions applies here.
There's a loophole of sorts in that description. It assumes that, for any given expression of array type, there is an array object to which it refers (i.e., that all array expressions are lvalues). This is almost true, but there's one corner case that you've run into. A function can return a result of struct type. That result is simply a value of the struct type, not referring to any object. (This applies equally to unions, but I'll ignore that.)
This:
struct foo { int n; };
struct foo func(void) {
struct foo result = { 42 };
return result;
}
is no different in principle from this:
int func(void) {
int result = 42;
return result;
}
In both cases, a copy of the value of result is returned; that value can be used after the object result has ceased to exist.
But if the struct being returned has a member of array type, then you have an array that's a member of a non-lvalue struct -- which means you can have a non-lvalue array expression.
In both C90 and C99, an attempt to refer to such an array (unless it's the operand of sizeof) has undefined behavior -- not because the standard says so, but because it doesn't define the behavior.
struct weird {
int arr[10];
};
struct weird func(void) {
struct weird result = { 0 };
return result;
}
Calling func() gives you an expression of type struct weird; there's nothing wrong with that, and you can, for example, assign it to an object of type struct weird. But if you write something like this:
(void)func().arr;
then the standard says that the array expression func().arr is converted to a pointer to the first element of the non-existent object to which it refers. This is not just a case of undefined behavior by omission (which the standard explicitly states is still undefined behavior). This is a bug in the standard. In any case, the standard fails to define the behavior.
In the 2011 ISO C standard (C11), the committee finally recognized this corner case, and created the concept of temporary lifetime. N1570 6.2.4p8 says:
A non-lvalue expression with structure or union type, where the
structure or union contains a member with array type (including,
recursively, members of all contained structures and unions) refers to
an object with automatic storage duration and temporary lifetime.
Its lifetime begins when the expression is evaluated and its initial
value is the value of the expression. Its lifetime ends when the
evaluation of the containing full expression or full declarator ends.
Any attempt to modify an object with temporary lifetime results in
undefined behavior.
with a footnote:
The address of such an object is taken implicitly when an array member is accessed.
So the C11 solution to this quandary was to create a temporary object so that the array-to-pointer conversion would actually yield the address of something meaningful (an element of a member of an object with temporary lifetime).
Apparently the code in gcc that handles this case isn't quite right. In C90 mode, it has to do something to work around the inconsistency in that version of the standard. Apparently it treats func().arr as a non-lvalue array expression (which might arguably be correct under C90 rules) -- but then it incorrectly permits that array value to be assigned to an array object. An attempt to assign to an array object, whatever the expression on the right side of the assignment happens to be, clearly violates the constraint section in C90 6.3.16.1, which requires a diagnostic if the LHS is not an lvalue of arithmetic, pointer, structure, or union type. It's not clear (from the C90 and C99 rules) whether a compiler must diagnose an expression like func().arr, but it clearly must diagnose an attempt to assign that expression to an array object, either in C90, C99, or C11.
It's still a bit of a mystery why this bug appears in C90 mode while it's correctly diagnosed in C99 mode, since as far as I know there was no significant change in this particular area of the standard between C90 and C99 (temporary lifetime was only introduced in C11). But since it's a bug I don't suppose we can complain too much about it showing up inconsistently.
Workaround: Don't do that.
This line
c = func("Another string").x;
with c being declared as
char c[20];
is not valid C in any version of C. If it "works" in your case, it is either a compiler bug or a rather weird compiler extension.
In case of strcpy
strcpy(c, func("Another string").x);
the relevant detail is the nature of the func("Another string").x subexpression. In "classic" C89/90 this subexpression cannot be subjected to array-to-pointer conversion, since in C89/90 array-to-pointer conversion applied to lvalue arrays only. Because your array is an rvalue, it cannot be converted to the const char * type expected by the second parameter of strcpy. That's exactly what the error message is telling you.
That part of the language was changed in C99, allowing array-to-pointer conversion for rvalue arrays as well. So in C99 the above strcpy will compile.
In other words, if your compiler issues an error for the above strcpy, it must be an old C89/90 compiler (or a new C compiler run in strict C89/90 mode). You need C99 compiler to compile such strcpy call.
There is one error in your code, and one place that looks suspicious but is fine:
main(void)
{
char c[20];
struct { int i ; int j ; char x[20];} a = {5 , 7 , "someString"} , b;
c = func("Another string").x;// the error: an array cannot be assigned to
printf("%s\n" , c);
}
struct test func(char *c)
{
struct test temp;
strcpy(temp.x , c);
return temp; // not an error: temp is returned by value, so the caller gets a copy before temp's storage is released
}
If you would rather return a pointer, allocate the structure dynamically and free it in the caller:
main(void)
{
struct test *c;
struct { int i ; int j ; char x[20];} a = {5 , 7 , "someString"} , b;
c = func("Another string");
printf("%s\n" , c->x);
free(c); //free memory
}
struct test * func(char *c)
{
struct test *temp = malloc(sizeof(struct test));//alloc memory
strcpy(temp->x , c);
return temp;
}
OP: but why is it working?
Because apparently when copying a field of a structure, only type and size matters.
I'll search for doc to back this up.
[Edit] Reviewing C11 6.3.2 concerning assignments: because the lvalue c is an array, it is the address of that array that becomes the location to store the assignment (no shock there). The result of the function is the value of an expression, and the sub-field reference is also the value of an expression. So this strange code is allowed because it simply assigns the value of the expression (20 bytes) to the destination location &c[0], which is also a char[20].
[Edit2] The gist is that the result of func().x is a value (the value of an expression), and that is a legit assignment for a matching type char[20] on the left side. Whereas c = c fails because c on the right side (a char[20]) becomes the address of the array rather than the entire array, and thus is not assignable to char[20]. This is so weird.
[Edit3] This fails with gcc -std=c99.
I tried a simplified code. Note the function func returns a structure. Typical coding encourages returning a pointer to a structure, rather than a whole copy of some big bad set of bytes.
ct = func("1 Another string") looks fine. One structure was copied en masse to another.
ct.x = func("2 Another string").x starts to look fishy, but surprisingly works. I'd expect the right half to be OK, but the assignment of an array to an array looks wrong.
c = func("3 Another string").x is simply like the previous. If the previous was good, this flies too. Interestingly, if c was size 21, the compilation fails.
Note: c = ct.x fails to compile.
#include <stdio.h>
#include <string.h>
struct test {
int i;
char x[20];
};
struct test func(const char *c) {
struct test temp;
strcpy(temp.x, c);
return temp;
}
int main(void) {
char c[20];
c[1] = '\0';
struct test ct;
ct = func("1 Another string");
printf("%s\n" , ct.x);
ct.x = func("2 Another string").x;
printf("%s\n" , ct.x);
c = func("3 Another string").x;
printf("%s\n" , c);
return 0;
}
1 Another string
2 Another string
3 Another string