When does initialisation of global variables happen? - c

I know when a program is run, the main() function is executed first. But when does the initialization of global variables declared outside the main() happens? I mean if I declare a variable like this:
unsigned long current_time = millis();
void main() {
while () {
//some code using the current_time global variable
}
}
Here, the exact time when the global variable initializes is important. Please tell what happens in this context.

Since you didn't define the language you're talking about, I assumed it to be C++.
In computer programming, a global variable is a variable that is accessible in every scope (unless shadowed). Interaction mechanisms with global variables are called global environment (see also global state) mechanisms. The global environment paradigm is contrasted with the local environment paradigm, where all variables are local with no shared memory (and therefore all interactions can be reconducted to message passing). Wikipedia.
In principle, a variable defined outside any function (that is, global, namespace, and class static variables) is initialized before main() is invoked. Such nonlocal variables in a translation unit are initialized in their declaration order (§10.4.9). If such a variable has no explicit initializer, it is by default initialized to the default for its type (§10.4.2). The default initializer value for built-in types and enumerations is 0. [...] There is no guaranteed order of initialization of global variables in different translation units. Consequently, it is unwise to create order dependencies between initializers of global variables in different compilation units. In addition, it is not possible to catch an exception thrown by the initializer of a global variable (§14.7). It is generally best to minimize the use of global variables and in particular to limit the use of global variables requiring complicated initialization. See.

(Quick answer: The C standard doesn't support this kind of initialization; you'll have to consult your compiler's documentation.)
Now that we know the language is C, we can see what the standard has to say about it.
C99 6.7.8 paragraph 4:
All the expressions in an initializer for an object that has static
storage duration shall be constant expressions or string literals.
And the new 2011 standard (at least the draft I has) says:
All the expressions in an initializer for an object that has static
storage duration shall be constant expressions or string literals.
So initializing a static object (e.g., a global such as your current_time) with a function call is a constraint violation. A compiler can reject it, or it can accept it with a warning and do whatever it likes if it provides an language extension.
The C standard doesn't say when the initialization occurs, because it doesn't permit that kind of initialization. Basically none of your code can execute before the main() function starts executing.
Apparently your compiler permits this as an extension (assuming you've actually compiled this code). You'll have to consult your compiler's documentation to find out what the semantics are.
(Normally main is declared as int main(void) or int main(int argc, char *argv[]) or equivalent, or in some implementation-defined manner. In many cases void main() indicates a programmer who's learned C from a poorly written book, of which there are far too many. But this applies only to hosted implementations. Freestanding implementations, typically for embedded systems, can define the program's entry point any way they like. Since you're targeting the Arduino, you're probably using a freestanding implementation, and you should declare main() however the compiler's documentation tells you to.)

Related

Embedded C: Risk of Initializing global data

General question in c langage:
Is it safe to initialize data in the declaration?
example:
static unsigned char myVar =5u;
Is there any risk that this value will be overwritten by the startup code?
Generally, embedded systems microcontroller projects come in two flavours and the IDE often lets you pick one:
Standard C compliant (sometimes referred to as "ANSI" by confused tool vendors).
Minimized start-up.
The former, standard C compliant projects require that all variables with static storage duration, such as those declared at file scope and/or with the keyword static are initialized before main() is called. This initialization happens inside the start-up code ("C run-time"/"CRT"). On such a system, the myVar = 5u; is guaranteed to be written (not overwritten) by the start-up code. It copies down the value 5 from flash to RAM.
The latter, "mimizined"/"fast" start-up version is not strictly C standard compliant. In such projects all the initialization code of static storage duration variables is simply removed. This to reduce the time from reset to when main() is called. On such systems, nothing will execute the static unsigned char myVar =5u; code - your variable remains uninitialized and indeterminate even though you explicitly initialized it. You have to set it manually at "run-time", which is usually done from some init "constructor" code.
If you have static uint8_t foo_count; belonging to foo.c, then the foo module will have to provide a function foo_init() from where the code foo_count = 5; is executed.
Since the "minimized start-up" version is very common in embedded systems, it is usually considered dangerous to rely on default initialization of static storage duration variables, in case the code gets ported to such a system.
I hope your start up code runs before main. with that said, when you declare the variable static, the variables scope is bound to the scope of that translation unit (somefile.c), so yes you could overwrite it but that would be hard to do from other units (direct memory assignment) The golden rule for embedded systems is to avoid global variables but if you most, declare global variables as externs in a header file where it makes the most sense to put it.
Alternatively you could replace this with a #define.
#define MY_VAR 5U
The start up code is what is responsible for applying the initialisation (where did you imagine that was done?), so yes it is safe and with respect to initialisation best practice.
Without initialisation a static or global should have a value of zero. It is not unheard of for the startup code in some embedded systems to deliberately omit zero initialisation in order to minimise start up time. Normally that would not be the default startup code, and you would have to make a conscious decision to use it. It is unsafe and unfair to later maintainers who might be unaware is such non-standard behaviour. It is also in most cases an unnecessary and premature optimisation. Nonetheless you might initialise all such variables, even if the unit is zero, to defend against such non standard startup (whilst also rendering it pointless, which it generally is).
What is not best practice is the use of a global in the first instance (https://www.embedded.com/a-pox-on-globals/). Though to be fair the myVar in question is not truly global in this case.
This may not be directly related with the original question, but I think it's worth mentioning:
Initialization of a global variable using another global variable from a different translation unit is not safe. For example:
In a.c
unsigned var_from_a = 5u;
In b.c
extern unsigned var_from_a;
unsigned var_from_b = var_from_a;
There is no guarantee that var_from_b becomes 5u, because the initialization order of translation units is not defined. If you're unlucky, b.c is processed before a.c, and var_from_b may become 0 or some other garbage value, while a.c is processed later and var_from_a properly initialized to 5u.
I'm not sure if it would change this behavior if var_from_a was defined as const.

First executable statement in C

Is main really the first function or first executable statement in a C program? What if there is a global variable int a=0;?
I have always been taught that main is the starting point of a program. But what about global variable which is assigned some value and is an executable statement in my opinion?
The global variable and in general objects of static storage duration are initialized conceptually before program execution.
C11 (N1570) 5.1.2/1 Execution environments:
All objects with static storage duration shall be initialized (set to
their initial values) before program startup.
Given a hosted environment, function main is designated to be an required entry point, where program execution begins. It may be in one of two forms:
int main(void)
int main(int argc, char* argv[])
where parameters' names does not need to be the same as above (it is just a convention).
For a freestanding environment entry point is implementation-defined, that's why you can sometimes encounter void main() or any different form in C implementations for embedded devices.
C11 (N1570) 5.1.2.1/1 Freestanding environment:
In a freestanding environment (in which C program execution may take
place without any benefit of an operating system), the name and type
of the function called at program startup are implementation-defined.
main is not a starting point of the program. The starting point of the program is the entry point of the program, which is in most cases is transparent for a C programmer. Usually it is denoted by _start symbol, and defined in a startup code written in assembly or precompiled into a C runtime initialization library (like crt0.o). It is responsible for low-level initialization of stuff you are taking as given, like initializing the uninitialized static variables to zeros. After it is done, it is calling to a predefined symbol main, which is the main you know.
But what about global variable which is assigned some value and is an execuatable statement in my opinion
Your opinion is wrong.
In a global context, only a variable definition can exist, with an explicit initialization. All the executable statements (i.e, the assignment) have to reside inside a function.
To elaborate, in global context, you cannot have a statement like
int globalVar;
globalVar = 0; //error, assignement statement should be inside a function
however, the above would be perfectly valid inside a function, like
int main()
{
int localVar;
localVar = 0; //assignment is valid here.
Regarding the initialization, like
int globalVar = 0;
the initialization takes place before start of main(), so that's not really the part of execution, per se.
To elaborate the scenario of the initialization of a global variable, quoting the C11, chapter 6.2,
If the declarator or type specifier that declares the identifier
appears outside of any block or list of parameters, the identifier has file scope, which
terminates at the end of the translation unit.
and for flie scope variables,
If
the declaration of an identifier for an object has file scope and no storage-class specifier,
its linkage is external.
and for objects with external linkage,
An object whose identifier is declared without the storage-class specifier
_Thread_local, and either with external or internal linkage or with the storage-class
specifier static, has static storage duration. Its lifetime is the entire execution of the
program and its stored value is initialized only once, prior to program startup.
In a theoretical, C-standards-only program, it is.
In practice, it's usually more involved.
On Linux, AFAIK, the kernel loads your linked image into the a reserved address space and first calls the dynamic linker that the executable image specifies (unless the executable is compiled statically in which case there's no dynammic linking part).
The dynamic linker can load dependent libraries, such as the C library.
These libraries may register their own startup code, and so can you (on gcc mainly via __attribute__((constructorr))).
(User-supplied init code is especially needed for C++ where you need to run some startup code on C++ globals that have constructors.)
Then the linker calls the entry point of your image, which is _start by default (linkers allow you to choose a different name if you want to dig that deep) which is by default supplied by the C library. _start initializes the C library an continues by calling main.
In any case, simple global initializations such as int x = 42; should get compiled and linked into your executable and then get loaded by the OS (rather than your code) all at once, as part of loading the process image so there's no need for user-supplied initialization code for such variables.
If you use turbo c watch you would find that first global is declared and then execution of main starts that is at compile time data segment (giving memory to global and static variable) is initialized with 0.
So though assignment is not possible but declaration occurs at compile time.
Yes, when you declare a variable memory is allocated to it at compile time until and unless you don't use heap segment (allocating memory to pointer)i.e dynamic allocation which occurs at run time. But since global got its memory from data segment section of RAM variable is allocated memory at compile time.
Hope this helps.

Is a global or static declaration safer in an embedded environment?

I have a choice to between declaring a variable static or global.
I want to use the variable in one function to maintain counter.
for example
void count()
{
static int a=0;
for(i=0;i<7;i++)
{
a++;
}
}
My other choice is to declare the variable a as global.
I will only use it in this function count().
Which way is the safest solution?
It matters only at compile and link-time. A static local variable should be stored and initialised in exactly the same way as a global one.
Declaring a local static variable only affects its visibility at the language level, making it visible only in the enclosing function, though with a global lifetime.
A global variable (or any object in general) not marked static has external linkage and the linker will consider the symbol when merging each of the object files.
A global variable marked static only has internal linkage within the current translation unit, and the linker will not see such a symbol when merging the individual translation units.
The internal static is probably better from a code-readability point of view, if you'll only ever use it inside that function.
If it was global, some other function could potentially modify it, which could be dangerous.
Either using global or static variable within a function both are not safe because then your function will no longer be re-entrant.
However if you are not concerned with function being re-entrant then you can have either based on your choice.
If the variable is only to be accessed within the function count() then it is by definition local, so I cannot see why the question arises. As a rule, always use the most restrictive scope possible for any symbol.
You should really read Jack Ganssle's article A Pox on Globals, it will be enlightening.
Always reduce scope as far as possible. If a variable doesn't need to be visible outside a function, it should not be declared outside it either. The static keyword should be used whenever possible. If you declare a variable at file scope, it should always be static to reduce the scope to the file it was declared in. This is C's way of private encapsulation.
The above is true for all systems. For embedded there is another concern: all variables declared as static or global must be initialized before the program is started. This is enforced by ISO C. So they are always set either to the value the programmer wants them initialized to. If the programmer didn't set any value they are initialized to zero (or NULL).
This means that before main is called, there must be a snippet executed in your program that sets all these static/global values. In an embedded system, the initialization values are copied from ROM (flash, eeprom etc) to RAM. A standard C compiler handles this by creating this snippet and adding it to your program.
However, in embedded systems this snippet is often unfortunate, as it leads to a delay at program startup, especially if there is lots of statics/globals. A common non-standard optimization most embedded compilers support, is to remove this snippet. The program will then no longer behave as expected by the C standard, but it will be faster. Once you have done this optimization, initialization must be done in runtime, roughly static int x; x=0; rather than static int x=0;.
To make your program portable to such non-standard embedded compilers, it is a good habit to always set your globals/statics in runtime. And no matter if you intend to port to such compilers or not, it is certainly a good habit not to rely on the default zero initialization of globals/statics. Because most rookie C programmers don't even know that this static zero initialization rule exists and they will get very confused if you don't init your variables explicitly before using them.
i dont think is there is anything special with static & normal global with embedded domain ...!!
in one way static is good that if you are going to initialize your counter as o in starting then if you just declare with static then there is no need to initialize with it 0 because every static varaible is by default initialized with 0.
Edit :
After Clifford's comment i have checked and get to know that globals are also statically allocated and initialised to zero, so that advantage does not exist..
Pass a pointer to a "standard" variable instead
void count(int *a) {
int i;
for (i = 0; i < 7; i++)
{
(*a)++;
}
}
This way you do not rely neither on global variables nor on static local variables, which makes your program better.
I would say static is better than global if you want only one function to access it in which you declared it . Plus global variables are more prone to be accidentally accessed by other functions.
If you do want to use globals since it can be accessed by other functions in the program, make sure you declare them as volatile .
volatile int a = 0;
volatile makes sure it is not optimised by compilers in the wrong way.

Why and when to use static structures in C programming?

I have seen static structure declarations quite often in a driver code I have been asked to modify.
I tried looking for information as to why structs are declared static and the motivation of doing so.
Can anyone of you please help me understand this?
The static keyword in C has several effects, depending on the context it's applied to.
when applied to a variable declared inside a function, the value of that variable will be preserved between function calls.
when applied to a variable declared outside a function, or to a function, the visibility of that variable or function is limited to the "translation unit" it's declared in - ie the file itself. For variables this boils down to a kind of "locally visible global variable".
Both usages are pretty common in relatively low-level code like drivers.
The former, and the latter when applied to variables, allow functions to retain a notion of state between calls, which can be very useful, but this can also cause all kinds of nasty problems when the code is being used in any context where it is being used concurrently, either by multiple threads or by multiple callers. If you cannot guarantee that the code will strictly be called in sequence by one "user", you can pass a kind of "context" structure that's being maintained by the caller on each call.
The latter, applied to functions, allows a programmer to make the function invisible from outside of the module, and it MAY be somewhat faster with some compilers for certain architectures because the compiler knows it doesn't have to make the variable/function available outside the module - allowing the function to be inlined for example.
Something that apparently all other answers seem to miss: static is and specifies also a storage duration for an object, along with automatic (local variables) and allocated (memory returned by malloc and friends).
Objects with static storage duration are initialized before main() starts, either with the initializer specified, or, if none was given, as if 0 had been assigned to it (for structs and arrays this goes for each member and recursively).
The second property static sets for an identifier, is its linkage, which is a concept used at link time and tells the linker which identifiers refer to the same object. The static keyword makes an identifier have internal linkage, which means it cannot refer to identifiers of the same name in another translation unit.
And to be pedantic about all the sloppy answers I've read before: a static variable can not be referenced everyhere in the file it is declared. Its scope is only from its declaration (which can be between function definitions) to the end of the source file--or even smaller, to the end of the enclosing block.
struct variable
For a struct variable like static struct S s;, this has been widely discussed at: What does "static" mean in C?
struct definition: no effect:
static struct S { int i; int j; };
is the exact same as:
struct S { int i; int j; };
so never use it. GCC 4.8 raises a warning if you do it.
This is because struct definitions have no storage, and do no generate symbols in object files like variables and functions. Just try compiling and decompiling:
struct S { int i; int j; };
int i;
with:
gcc -c main.c
nm main.o
and you will see that there is no S symbol, but there is an i symbol.
The compiler simply uses definitions to calculate the offset of fields at compile time.
This is struct definitions are usually included in headers: they won't generate multiple separate data, even if included multiple times.
The same goes for enum.
C++ struct definition: deprecated in C++11
C++11 N3337 standard draft Annex C 7.1.1:
Change: In C ++, the static or extern specifiers can only be applied to names of objects or functions
Using these specifiers with type declarations is illegal in C ++. In C, these specifiers are ignored when used
on type declarations.
See also: https://stackoverflow.com/a/31201984/895245
If you declare a variable as being static, it is visible only in that translation unit (if globally declared) or retains its value from call to call (if declared inside a function).
In your case I guess it is the first case. In that case, probably the programmer didn't want the structure to be visible from other files.
The static modifier for the struct limits the scope of visibility of the structure to the current translation unit (i.e. the file).
NOTE: This answer assumes (as other responders have indicated) that your declaration is not within a function.

What is the difference between a static global and a static volatile variable?

I have used a static global variable and a static volatile variable in file scope,
both are updated by an ISR and a main loop and main loop checks the value of the variable. here during optimization neither the global variable nor the volatile variable are optimized. So instead of using a volatile variable a global variable solves the problem.
So is it good to use global variable instead of volatile?
Any specific reason to use static volatile??
Any example program would be appreciable.
Thanks in advance..
First let me mention that a static global variable, is the same as a global variable, except that you are limiting the variable to the scope of the file. I.e. you can't use this global variable in other files via the extern keyword.
So you can reduce your question to global variables vs volatile variables.
Now onto volatile:
Like const, volatile is a type modifier.
The volatile keyword was created to prevent compiler optimizations that may make code incorrect, specifically when there are asynchronous events.
Objects declared as volatile may not be used in certain optimizations.
The system always reads the current true value of a volatile object at the point it is used, even if a previous instruction asked for a value from the same object. Also, the value of the object is written immediately on assignment. That means there is no caching of a volatile variable into a CPU register.
Dr. Jobb's has a great article on volatile.
Here is an example from the Dr. Jobb's article:
class Gadget
{
public:
void Wait()
{
while (!flag_)
{
Sleep(1000); // sleeps for 1000 milliseconds
}
}
void Wakeup()
{
flag_ = true;
}
...
private:
bool flag_;
};
If the compiler sees that Sleep() is an external call, it will assume that Sleep() cannot possibly change the variable flag_'s value. So the compiler may store the value of flag_ in a register. And in that case, it will never change. But if another thread calls wakeup, the first thread is still reading from the CPU's register. Wait() will never wake-up.
So why not just never cache variables into registers and avoid the problem completely?
It turns out that this optimization can really save you a lot of time overall. So C/C++ allows you to explicitly disable it via the volatile keyword.
The fact above that flag_ was a member variable, and not a global variable (nor static global) does not matter. The explanation after the example gives the correct reasoning even if you're dealing with global variables (and static global variables).
A common misconception is that declaring a variable volatile is sufficient to ensure thread safety. Operations on the variable are still not atomic, even though they are not "cached" in registers
volatile with pointers:
Volatile with pointers, works like const with pointers.
A variable of type volatile int * means that the variable that the pointer points to is volatile.
A variable of type int * volatile means that the pointer itself is volatile.
They are different things. I'm not an expert in volatile semantics. But i think it makes sense what is described here.
Global
Global just means the identifier in question is declared at file-scope. There are different scopes, called function (where goto-labels are defined in), file (where globals reside), block (where normal local variables reside), and function prototype (where function parameters reside). This concept just exist to structure the visibility of identifiers. It doesn't have anything to do with optimizations.
Static
static is a storage duration (we won't look at that here) and a way to give a name declared within file scope internal linkage. This can be done for functions or objects only required within one translation unit. A typical example might be a help function printing out the accepted parameters, and which is only called from the main function defined in the same .c file.
6.2.2/2 in a C99 draft:
If the declaration of a file scope
identifier for an object or a function
contains the storage class specifier
static, the identifier has internal
linkage.
Internal linkage means that the identifier is not visible outside the current translation unit (like the help function of above).
Volatile
Volatile is a different thing: (6.7.3/6)
An object that has volatile-qualified
type may be modified in ways unknown to
the implementation or have other
unknown side effects. Therefore any
expression referring to such an object
shall be evaluated strictly according
to the rules of the abstract machine,
as described in 5.1.2.3. Furthermore,
at every sequence point the value last
stored in the object shall agree with
that prescribed by the abstract
machine, except as modified by the
unknown factors mentioned
previously.
The Standard provides an excellent example for an example where volatile would be redundant (5.1.2.3/8):
An implementation might define a
one-to-one correspondence between
abstract and actual semantics: at
every sequence point, the values of
the actual objects would agree with
those specified by the abstract
semantics. The keyword volatile
would then be redundant.
Sequence points are points where the effect of side effects concerning the abstract machine are completed (i.e external conditions like memory cell values are not included). Between the right and the left of && and ||, after ; and returning from a function call are sequence points for example.
The abstract semantics is what the compiler can deduce from seeing only the sequence of code within a particular program. Effects of optimizations are irrelevant here. actual semantics include the effect of side effects done by writing to objects (for example, changing of memory cells). Qualifying an object as volatile means one always gets the value of an object straight from memory ("as modified by the unknown factors"). The Standard doesn't mention threads anywhere, and if you must rely on the order of changes, or on atomicity of operations, you should use platform dependent ways to ensure that.
For an easy to understand overview, intel has a great article about it here.
What should i do now?
Keep declaring your file-scope (global) data as volatile. Global data in itself does not mean the variables' value will equal to the value stored in memory. And static does only make your objects local to the current translation unit (the current .c files and all other files #include'ed by it).
The "volatile" keyword suggests the compiler not to do certain optimizations on code involving that variable; if you just use a global variable, nothing prevents the compiler to wrongly optimize your code.
Example:
#define MYPORT 0xDEADB33F
volatile char *portptr = (char*)MYPORT;
*portptr = 'A';
*portptr = 'B';
Without "volatile", the first write may be optimized out.
The volatile keyword tells the compiler to make sure that variable will never be cached. All accesses to it must be made in a consistent way as to have a consistent value between all threads. If the value of the variable is to be changed by another thread while you have a loop checking for change, you want the variable to be volatile as there is no guarantee that a regular variable value won't be cached at some point and the loop will just assume it stays the same.
Volatile variable on Wikipedia
They may not be in different in your current environment, but subtle changes could affect the behavior.
Different hardware (more processors, different memory architecture)
A new version of the compiler with better optimization.
Random variation in timing between threads. A problem may only occur one time in 10 million.
Different compiler optimization settings.
It is much safer in the long run to use proper multithreading constructs from the beginning, even if things seem to work for now without them.
Of course, if your program is not multi-threaded then it doesn't matter.
I +1 friol's answer. I would like to add some precisions as there seem to be a lot of confusions in different answers: C's volatile is not Java's volatile.
So first, compilers can do a lot of optimizations on based on the data flow of your program, volatile in C prevents that, it makes sure you really load/store to the location every time (instead of using registers of wiping it out e.g.). It is useful when you have a memory mapped IO port, as friol's pointed out.
Volatile in C has NOTHING to do with hardware caches or multithreading. It does not insert memory fences, and you have absolutely no garanty on the order of operations if two threads do accesses to it. Java's volatile keyword does exactly that though: inserting memory fences where needed.
volatile variable means that the value assinged to it is not constant, i.e if a function containing a volatile variable "a=10" and the function is adding 1 in each call of that function then it will always return updated value.
{
volatile int a=10;
a++;
}
when the above function is called again and again then the variable a will not be re-initialised to 10, it will always show the updated value till the program runs.
1st output= 10
then 11
then 12
and so on.

Resources