Initialized vs uninitialized global const variable in C

Pardon me, I am not very good at explaining questions, so I will start directly with an example.
Look at the following example:
#include <stdio.h>
const int a=10;
int *ptr;
int main(){
    ptr=&a;
    *ptr=100; // program crashes
    printf("%d",a);
}
But if I make a slight change to the above code, as follows:
const int a; // uninitialized global variable
Then the above code works fine.
So my question is: why does the compiler behave differently for uninitialized and initialized global const variables?
I am using gcc for windows (mingw).

You are modifying a const object, and that is simply undefined behavior - so don't do it, and don't ignore compiler warnings.
Now, the actual reason for the different behavior in your particular case is that for const int a=10; the value 10 has to be stored somewhere. Since the variable is const, the linker places it in the .rodata or a similar read only section of the executable. When you're trying to write to a read-only location, you'll get a segmentation fault.
For the uninitialized case, const int a;, the variable a needs to be initialized to zero since it's at file scope (in other words, a is a global variable). The linker then places the variable in the .bss section, together with other data that is also zero-initialized at program startup. The .bss section is read/write, so you get no segfault when you try to write to it.
None of this is something you can rely on. It could change with a minor modification to the code, or if you use another compiler or a newer or older version of your compiler.
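A minimal sketch of the well-defined alternative, assuming the goal is only to read a through a pointer: giving the pointer a const-qualified target type means no qualifier is discarded, and the bad write is rejected at compile time instead of depending on where the linker happened to put a. The helper name read_a is made up for illustration.

```c
const int a = 10;        /* initialized const object; may be placed in read-only memory */
const int *ptr = &a;     /* pointer-to-const: the qualifier is kept, so no warning */

/* Hypothetical helper: reads a through the pointer. */
int read_a(void)
{
    /* *ptr = 100;   would now be a compile-time error, not a run-time crash */
    return *ptr;
}
```

Calling read_a() from main returns 10 whether or not a has an initializer-dependent placement, because nothing ever writes to the const object.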

As mandated by the C standard, global and static variables are initialized implicitly if your code doesn't initialize them explicitly.
From the doc:
const is a type qualifier. The other type qualifier is volatile. The
purpose of const is to announce objects that may be placed in
read-only memory, and perhaps to increase opportunities for
optimization.
In G++ you will receive an error for the second case, i.e. const int a;.
6.9.2 External object definitions
Semantics
1 If the declaration of an identifier for an object has file scope and
an initializer, the declaration is an external definition for the
identifier.
2 A declaration of an identifier for an object that has file scope
without an initializer, and without a storage-class specifier or with
the storage-class specifier static, constitutes a tentative
definition. If a translation unit contains one or more tentative
definitions for an identifier, and the translation unit contains no
external definition for that identifier, then the behavior is exactly
as if the translation unit contains a file scope declaration of that
identifier, with the composite type as of the end of the translation
unit, with an initializer equal to 0.
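The tentative-definition rule quoted above can be sketched as follows. This is an illustration in C only (C++ has no tentative definitions), and the names counter, cache, and sum_tentative are hypothetical.

```c
int counter;             /* tentative definition: no initializer */
int counter;             /* a second tentative definition of the same object is allowed in C */

static int cache[4];     /* tentative definition with internal linkage */

/* Hypothetical helper: adds up the objects above. Since no external definition
   appears in this translation unit, they behave as if initialized to 0. */
int sum_tentative(void)
{
    int sum = counter;
    for (int i = 0; i < 4; i++)
        sum += cache[i];
    return sum;
}
```

Since neither object is ever given an explicit initializer, sum_tentative() returns 0.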

const int a=10; declares a constant integer variable. This means its value can't be modified. Its value is initially set to 10.
If you try to change its value later, the compiler will issue a warning, or an error, depending on your compiler settings.

Related

Initialization of extern variable warning in GCC (C18)

The question is not "why can't I initialize a variable declared as extern", because it's something completely possible with file scope variables (not with block scope variables). The thing is that GCC yields a warning (with -Wall switch) in this particular case:
extern int n = 10; // file scope declaration
GCC yields:
test.c:5:12: warning: ‘n’ initialized and declared ‘extern’
The code works perfectly, though.
Furthermore, note that the following definition is absolutely equivalent to the first one:
int n = 10; // file scope declaration
In both cases, the variable has the same linkage and storage type. The thing is that, being both absolutely equivalent, the second version doesn't yield any warning in GCC (with -Wall).
Why is that?
My guess is that you usually use extern to explicitly set a reminder about the fact that this is a declaration that refers to an external object defined elsewhere, so that you shouldn't (though you could) initialize the variable (bear in mind that the standard doesn't let you define a variable twice inside the same linkage, in this case, external).
So, is that a right guess, or perhaps there's more to it, which I'm not able to see?
A compiler can warn about anything it likes to. If it is attentive, it warns about things it considers as "suspicious".
So it does here.
My personal opinion about the reasoning agrees with yours:
My guess is that you usually use extern to explicitly set a reminder about the fact that this is a declaration that refers to an external object defined elsewhere so that you shouldn't (though you could) initialize the variable (bear in mind that the standard doesn't let you define a variable twice inside the same linkage, in this case, external).
GCC may find it suspicious to initialize a variable explicitly declared extern because such a variable is usually defined in another file; a second definition there could, depending on the context, cause an error at link time. That may indeed be the reason, but our assumptions aren't worth much.
For the actual "why", you would need to ask the implementors of GCC.
The keyword extern is used to declare a variable but not define it (similar to function declarations). It is typically used in header files to export a variable from a module. However, it is often better to introduce a function which returns its value.
Example:
M.h
extern int M_n;
M.c
int M_n = 10;
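As a rough sketch of the accessor-function alternative mentioned above, the variable can be given internal linkage and exposed only through functions. The names M_get_n and M_set_n and the validation rule are assumptions made for illustration; the header and source file are shown together in one listing.

```c
/* --- M.h (hypothetical interface) --- */
int M_get_n(void);
void M_set_n(int value);

/* --- M.c --- */
static int M_n = 10;     /* internal linkage: other files cannot name it */

int M_get_n(void)
{
    return M_n;
}

void M_set_n(int value)
{
    if (value >= 0)      /* illustrative rule only: reject negative values */
        M_n = value;
}
```

The module keeps control over every write, which a plain extern int M_n cannot offer.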

External linkage of const in C

I was playing with extern keyword in C when I encountered this strange behaviour.
I have two files:
file1.c
#include<stdio.h>
int main()
{
extern int a;
a=10;
printf("%d",a);
return 0;
}
file2.c
const int a=100;
When I compile these files together, there is no error or warning and when I run them, output comes to be 10. I had expected that the compiler should report an error on line a=10;.
Moreover, if I change the contents of file2.c to
const int a;
that is, if I remove the initialization of global const variable a and then compile the files, there is still no error or warning but when I run them, Segmentation Fault occurs.
Why does this phenomenon happen? Is it classified under undefined behaviour? Is this compiler- or machine- dependent?
PS: I have seen many questions related to this one, but either they are for C++ or they discuss extern only.
Compilation and linking are two distinct phases. During compilation, individual files are compiled into object files. The compiler finds both file1.c and file2.c internally consistent. During the linking phase, the linker just points all occurrences of the variable a to the same memory location. This is why you do not see any compilation or linker error.
To avoid exactly the problem you mention, put the extern declaration in a header file and include that header in every C file that uses or defines the variable. This way the compiler can catch any inconsistency between the header and the C file.
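A minimal sketch of that header approach, shown as a single listing for brevity. The header name a.h and the helper read_a are hypothetical; in a real project the three parts would live in separate files.

```c
/* --- a.h (hypothetical shared header) --- */
extern const int a;

/* --- file2.c includes a.h, so this definition is checked against the header --- */
const int a = 100;

/* --- file1.c includes a.h instead of writing its own extern int a; --- */
int read_a(void)
{
    /* a = 10;   now rejected at compile time: a is const in every file */
    return a;
}
```

With one shared declaration, a mismatch such as extern int a; against const int a = 100; becomes a conflicting-types error in file2.c rather than undefined behaviour at run time.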
The following Stack Overflow question also discusses the linker's inability to type-check extern variables:
Is there any type checking in C or C++ linkers?
Similarly, the types of global variables (and static members of classes and so on) aren't checked by the linker, so if you declare extern int test; in one translation unit and define float test; in another, you'll get bad results.
It is undefined behaviour but the compiler won't warn you. How could it? It has no idea how you declare a variable in another file.
Attempting to modify a variable declared const is undefined behaviour. It is possible (but not necessary) that the variable will be stored in read-only memory.
This is a known behavior of C compilers. It is one of the differences between C and C++ where strong compile time type checking is enforced.
The segmentation fault occurs when trying to assign a value to a const because the linker puts const values in a read-only ELF segment, and writing to such a memory address is a run-time (segmentation) fault.
During compile time, however, the compiler does not check any externs against other files, and the C linker does not check types. Therefore the program passes compilation and linking.
Your program causes undefined behaviour with no diagnostic required (whether or not const int a has an initializer). The relevant text in C11 is 6.2.7/2:
All declarations that refer to the same object or function shall have compatible type; otherwise, the behavior is undefined.
Also 6.2.2/2:
In the set of translation units and libraries that constitutes an entire program, each declaration of a particular identifier with external linkage denotes the same object or function.
In C, const int a = 100; means that a has external linkage. So it denotes the same object as extern int a;. However, those two declarations have incompatible types (int is not compatible with const int; see 6.7.3 for the compatibility rule for qualified types).

First executable statement in C

Is main really the first function or first executable statement in a C program? What if there is a global variable int a=0;?
I have always been taught that main is the starting point of a program. But what about global variable which is assigned some value and is an executable statement in my opinion?
The global variable and in general objects of static storage duration are initialized conceptually before program execution.
C11 (N1570) 5.1.2/1 Execution environments:
All objects with static storage duration shall be initialized (set to
their initial values) before program startup.
Given a hosted environment, function main is designated to be the required entry point, where program execution begins. It may be in one of two forms:
int main(void)
int main(int argc, char* argv[])
where the parameter names do not need to be the same as above (they are just a convention).
For a freestanding environment, the entry point is implementation-defined; that's why you can sometimes encounter void main() or some other form in C implementations for embedded devices.
C11 (N1570) 5.1.2.1/1 Freestanding environment:
In a freestanding environment (in which C program execution may take
place without any benefit of an operating system), the name and type
of the function called at program startup are implementation-defined.
main is not the starting point of the program. The starting point is the program's entry point, which in most cases is transparent to a C programmer. Usually it is denoted by the _start symbol and is defined in startup code written in assembly or precompiled into a C runtime initialization library (like crt0.o). It is responsible for the low-level initialization of things you take as given, like setting the uninitialized static variables to zero. After it is done, it calls a predefined symbol main, which is the main you know.
But what about global variable which is assigned some value and is an executable statement in my opinion
Your opinion is wrong.
At file scope, only declarations and definitions (optionally with an explicit initializer) can appear. All executable statements (e.g. an assignment) have to reside inside a function.
To elaborate, at file scope you cannot have a statement like
int globalVar;
globalVar = 0; //error, assignment statement should be inside a function
however, the above would be perfectly valid inside a function, like
int main()
{
    int localVar;
    localVar = 0; //assignment is valid here.
}
Regarding the initialization, like
int globalVar = 0;
the initialization takes place before start of main(), so that's not really the part of execution, per se.
To elaborate the scenario of the initialization of a global variable, quoting the C11, chapter 6.2,
If the declarator or type specifier that declares the identifier
appears outside of any block or list of parameters, the identifier has file scope, which
terminates at the end of the translation unit.
and for file scope variables,
If
the declaration of an identifier for an object has file scope and no storage-class specifier,
its linkage is external.
and for objects with external linkage,
An object whose identifier is declared without the storage-class specifier
_Thread_local, and either with external or internal linkage or with the storage-class
specifier static, has static storage duration. Its lifetime is the entire execution of the
program and its stored value is initialized only once, prior to program startup.
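The distinction between initialization and assignment drawn in this answer can be sketched as below. The names global_var, limit, bump_global, and get_limit are illustrative only.

```c
int global_var = 0;      /* initialization: part of the definition, set before startup */
int limit = 2 * 21;      /* a file-scope initializer must be a constant expression */

/* global_var = 5;          not allowed here: a statement must be inside a function */

int bump_global(void)
{
    global_var = global_var + 1;   /* assignment: an executable statement */
    return global_var;
}

int get_limit(void)
{
    return limit;
}
```

The first call to bump_global() returns 1 because global_var already held 0 before any function ran.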
In a theoretical, C-standards-only program, it is.
In practice, it's usually more involved.
On Linux, AFAIK, the kernel loads your linked image into a reserved address space and first calls the dynamic linker that the executable image specifies (unless the executable is linked statically, in which case there's no dynamic linking step).
The dynamic linker can load dependent libraries, such as the C library.
These libraries may register their own startup code, and so can you (on gcc mainly via __attribute__((constructor))).
(User-supplied init code is especially needed for C++ where you need to run some startup code on C++ globals that have constructors.)
Then the dynamic linker calls the entry point of your image, which is _start by default (linkers allow you to choose a different name if you want to dig that deep) and which is by default supplied by the C library. _start initializes the C library and continues by calling main.
In any case, simple global initializations such as int x = 42; should get compiled and linked into your executable and then get loaded by the OS (rather than your code) all at once, as part of loading the process image so there's no need for user-supplied initialization code for such variables.
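A small sketch of user-supplied startup code, assuming GCC or Clang, since __attribute__((constructor)) is a compiler extension and not standard C. It also illustrates that a plain initializer such as answer = 42 is already in place when the constructor runs. All names here are hypothetical.

```c
static int answer = 42;        /* stored in the image; no user code runs to set it */
static int seen_at_startup;    /* stays 0 until the constructor below runs */
static int init_calls;

/* GCC/Clang extension: this function is run before main is entered. */
__attribute__((constructor))
static void early_init(void)
{
    seen_at_startup = answer;  /* the static initializer is already visible here */
    init_calls++;
}

int get_seen_at_startup(void) { return seen_at_startup; }
int get_init_calls(void)      { return init_calls; }
```

By the time main starts, early_init has run exactly once without ever being called explicitly.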
If you use the watch window in Turbo C, you will find that the global is set up first and only then does execution of main start. That is, the data segment (which gives memory to global and static variables) is already laid out, initialized with 0 by default, before main runs.
So although an assignment is not possible at file scope, the declaration is processed at compile time.
Yes, when you declare a global variable, its memory is reserved at compile time, unlike heap memory (memory allocated to a pointer, i.e. dynamic allocation), which is obtained at run time. Since a global gets its memory from the data segment, its storage is fixed before the program runs.
Hope this helps.

C -- Accessing a non-const through const declaration

Is accessing a non-const object through a const declaration allowed by the C standard?
E.g. is the following code guaranteed to compile and output 23 and 42 on a standard-conforming platform?
translation unit A:
int a = 23;
void foo(void) { a = 42; }
translation unit B:
#include <stdio.h>
extern volatile const int a;
void foo(void);
int main(void) {
printf("%i\n", a);
foo();
printf("%i\n", a);
return 0;
}
In the ISO/IEC 9899:1999, I just found (6.7.3, paragraph 5):
If an attempt is made to modify an object defined with a const-qualified type through use
of an lvalue with non-const-qualified type, the behavior is undefined.
But in the case above, the object is not defined as const (but just declared).
UPDATE
I finally found it in ISO/IEC 9899:1999.
6.2.7, 2
All declarations that refer to the same object or function shall have compatible type;
otherwise, the behavior is undefined.
6.7.3, 9
For two qualified types to be compatible, both shall have the identically qualified
version of a compatible type; [...]
So, it is undefined behaviour.
TU A contains the (only) definition of a. So a really is a non-const object, and it can be accessed as such from a function in A with no problems.
I'm pretty sure that TU B invokes undefined behavior, since its declaration of a doesn't agree with the definition. Best quote I've found so far to support that this is UB is 6.7.5/2:
Each declarator declares one identifier, and asserts that when an
operand of the same form as the declarator appears in an expression,
it designates a function or object with the scope, storage duration,
and type indicated by the declaration specifiers.
[Edit: the questioner has since found the proper reference in the standard, see the question.]
Here, the declaration in B asserts that a has type volatile const int. In fact the object does not have (qualified) type volatile const int, it has (qualified) type int. Violation of semantics is UB.
In practice what will happen is that TU A will be compiled as if a is non-const. TU B will be compiled as if a were a volatile const int, which means it won't cache the value of a at all. Thus, I'd expect it to work provided the linker doesn't notice and object to the mismatched types, because I don't immediately see how TU B could possibly emit code that goes wrong. However, my lack of imagination is not the same as guaranteed behavior.
AFAIK, there's nothing in the standard to say that volatile objects at file scope can't be stored in a completely different memory bank from other objects, that provides different instructions to read them. The implementation would still have to be capable of reading a normal object through, say, a volatile pointer, so suppose for example that the "normal" load instruction works on "special" objects, and it uses that when reading through a pointer to a volatile-qualified type. But if (as an optimization) the implementation emitted the special instruction for special objects, and the special instruction didn't work on normal objects, then boom. And I think that's the programmer's fault, although I confess I only invented this implementation 2 minutes ago so I can't be entirely confident that it conforms.
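For comparison, here is a sketch of a well-defined way to get a read-only, volatile view of a: every file sees the same declaration, and the extra qualifiers are added through a pointer, which C permits. The header name a.h and the names a_view and read_a_view are assumptions; both translation units are shown in one listing.

```c
/* --- a.h (hypothetical shared header): one declaration for every file --- */
extern int a;
void foo(void);

/* --- translation unit A --- */
int a = 23;
void foo(void) { a = 42; }

/* --- translation unit B: qualifiers added on the pointer, not on the declaration --- */
static const volatile int *const a_view = &a;

int read_a_view(void)
{
    return *a_view;      /* volatile access: a is read afresh on every call */
}
```

Reading an int through an lvalue of type const volatile int is allowed, so this version yields 23 before foo() and 42 after it without relying on mismatched declarations.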
In the B translation unit, const would only prohibit modifying the a variable within the B translation unit itself.
Modifications of that value from outside (other translation units) will reflect on the value you see in B.
This is more of a linker issue than a language issue. The linker is free to frown upon the differing qualifications of the a symbol (if there is such information in the object files) when merging the compiled translation units.
Note, however, that if it's the other way around (const int a = 23 in A and extern int a in B), you would likely encounter a memory access violation in case of attempting to modify a from B, since a could be placed in a read-only area of the process, usually mapped directly from the .rodata section of the executable.
The declaration that has the initialization is the definition, so your object is indeed not a const qualified object and foo has all the rights to modify it.
In B you are providing access to that object that has the additional const qualification. Since the types (the const qualified version and the non-qualified version) have the same object representation, read access through that identifier is valid.
Your second printf, though, has a problem. Since you didn't qualify your B version of a as volatile you are not guaranteed to see the modification of a. The compiler is allowed to optimize and to reuse the previous value that it might have kept in a register.
Declaring it as const means that the instance is defined as const. You cannot access it from a not-const. Most compilers will not allow it, and the standard says it's not allowed either.
FWIW: In H&S5 is written (Section 4.4.3 Type Qualifiers, page 89):
"When used in a context that requires a value rather than a designator, the qualifiers are eliminated from the type." So the const only has an effect when someone tries to write something into the variable.
In this case, the printf's use a as an rvalue, and the added volatile (unnecessary IMHO) makes the program read the variable anew, so I would say, the program is required to produce the output the OP saw initially, on all platforms/compilers.
I'll look at the Standard, and add it if/when I find anything new.
EDIT: I couldn't find any definite solution to this question in the Standard (I used the latest draft for C1X), since all references to linker behavior concentrate on names being identical. Type qualifiers on external declarations do not seem to be covered.
Maybe we should forward this question to the C Standard Committee.

Why and when to use static structures in C programming?

I have seen static structure declarations quite often in a driver code I have been asked to modify.
I tried looking for information as to why structs are declared static and the motivation of doing so.
Can anyone of you please help me understand this?
The static keyword in C has several effects, depending on the context it's applied to.
When applied to a variable declared inside a function, the value of that variable will be preserved between function calls.
When applied to a variable declared outside a function, or to a function, the visibility of that variable or function is limited to the "translation unit" it's declared in, i.e. the file itself. For variables this boils down to a kind of "locally visible global variable".
Both usages are pretty common in relatively low-level code like drivers.
The former, and the latter when applied to variables, allow functions to retain a notion of state between calls, which can be very useful, but this can also cause all kinds of nasty problems when the code is being used in any context where it is being used concurrently, either by multiple threads or by multiple callers. If you cannot guarantee that the code will strictly be called in sequence by one "user", you can pass a kind of "context" structure that's being maintained by the caller on each call.
The latter, applied to functions, allows a programmer to make the function invisible from outside of the module, and it MAY be somewhat faster with some compilers for certain architectures because the compiler knows it doesn't have to make the variable/function available outside the module - allowing the function to be inlined for example.
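The trade-off between hidden static state and a caller-owned context can be sketched like this. The struct id_context and both function names are hypothetical.

```c
/* State hidden in a static local: simple, but shared by every caller. */
int next_id(void)
{
    static int id = 0;   /* initialized once; keeps its value between calls */
    return ++id;
}

/* Hypothetical context struct: the caller owns the state instead. */
struct id_context {
    int id;
};

int next_id_ctx(struct id_context *ctx)
{
    return ++ctx->id;    /* no hidden state, so separate contexts do not interfere */
}
```

Two callers of next_id() share one counter, while two id_context objects each count independently, which is what makes the second form usable from several threads or users at once (given one context per user).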
Something that apparently all other answers seem to miss: static also specifies a storage duration for an object, alongside automatic (local variables) and allocated (memory returned by malloc and friends).
Objects with static storage duration are initialized before main() starts, either with the initializer specified, or, if none was given, as if 0 had been assigned to it (for structs and arrays this goes for each member and recursively).
The second property static sets for an identifier, is its linkage, which is a concept used at link time and tells the linker which identifiers refer to the same object. The static keyword makes an identifier have internal linkage, which means it cannot refer to identifiers of the same name in another translation unit.
And to be pedantic about all the sloppy answers I've read before: a static variable cannot be referenced everywhere in the file it is declared. Its scope is only from its declaration (which can be between function definitions) to the end of the source file, or even smaller, to the end of the enclosing block.
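A short sketch of the implicit zero-initialization of a static struct, including nested members and arrays. The types point and shape and the helper default_shape_is_zero are made up for illustration.

```c
#include <stddef.h>

struct point { int x; int y; };
struct shape {
    struct point corners[2];
    const char *name;
    double area;
};

static struct shape default_shape;   /* no initializer: every member starts as zero */

/* Hypothetical helper: returns 1 if all members hold their zero value. */
int default_shape_is_zero(void)
{
    for (int i = 0; i < 2; i++) {
        if (default_shape.corners[i].x != 0 || default_shape.corners[i].y != 0)
            return 0;
    }
    return default_shape.name == NULL && default_shape.area == 0.0;
}
```

Note that the pointer member is a null pointer and the double is 0.0, which is what the rule promises, and that default_shape is only in scope from its declaration onward.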
struct variable
For a struct variable like static struct S s;, this has been widely discussed at: What does "static" mean in C?
struct definition: no effect:
static struct S { int i; int j; };
is the exact same as:
struct S { int i; int j; };
so never use it. GCC 4.8 raises a warning if you do it.
This is because struct definitions have no storage, and do not generate symbols in object files like variables and functions. Just try compiling and inspecting the object file:
struct S { int i; int j; };
int i;
with:
gcc -c main.c
nm main.o
and you will see that there is no S symbol, but there is an i symbol.
The compiler simply uses definitions to calculate the offset of fields at compile time.
This is why struct definitions are usually included in headers: they won't generate multiple copies of data, even if included multiple times.
The same goes for enum.
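As a sketch of the point that a struct definition is only layout information used at compile time, the snippet below checks a member offset with a static assertion and applies static to an object of the type, not to the type itself. The helper names are hypothetical, and _Static_assert assumes a C11 or later compiler.

```c
#include <stddef.h>

struct S { int i; int j; };      /* type only: no storage, no symbol in the object file */

/* Evaluated entirely at compile time from the definition above. */
_Static_assert(offsetof(struct S, j) >= sizeof(int), "j is laid out after i");

static struct S s = { 1, 2 };    /* static applies to the object s, which does have storage */

int sum_s(void)
{
    return s.i + s.j;
}

size_t offset_of_j(void)
{
    return offsetof(struct S, j);
}
```

Only s, sum_s, and offset_of_j would show up in nm output for this file; struct S itself would not.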
C++ struct definition: deprecated in C++11
C++11 N3337 standard draft Annex C 7.1.1:
Change: In C++, the static or extern specifiers can only be applied to names of objects or functions
Using these specifiers with type declarations is illegal in C++. In C, these specifiers are ignored when used
on type declarations.
See also: https://stackoverflow.com/a/31201984/895245
If you declare a variable as being static, it is visible only in that translation unit (if globally declared) or retains its value from call to call (if declared inside a function).
In your case I guess it is the first case. In that case, probably the programmer didn't want the structure to be visible from other files.
The static modifier for the struct limits the scope of visibility of the structure to the current translation unit (i.e. the file).
NOTE: This answer assumes (as other responders have indicated) that your declaration is not within a function.
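A typical shape for such a file-local struct in driver-style code might look like the sketch below. This is an assumption about what the driver in question does, not something taken from it; dev_ops, demo_open, demo_close, and demo_get_ops are invented names.

```c
/* Hypothetical operations table, similar in spirit to what many drivers use. */
struct dev_ops {
    int (*open)(int unit);
    int (*close)(int unit);
};

/* static functions: callable only from this translation unit by name */
static int demo_open(int unit)  { return unit >= 0 ? 0 : -1; }
static int demo_close(int unit) { return unit >= 0 ? 0 : -1; }

/* static struct: the table's name does not leak into other files */
static const struct dev_ops demo_ops = {
    .open  = demo_open,
    .close = demo_close,
};

/* The only symbol with external linkage: hands out the table on request. */
const struct dev_ops *demo_get_ops(void)
{
    return &demo_ops;
}
```

Other files can still use the table through the pointer returned by demo_get_ops(), but they cannot clash with or refer to the names demo_ops, demo_open, or demo_close directly.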
