Static variable inside a function - c

This is more of a theoretical question.
Say I have the following C program:
int a;
int f(){
double b;
static float c;
}
The question reads: For each of the variables (a, b, c), name the following: storage duration (lifetime), scope of identifier, the memory segment in which it is kept and its initial value.
As far as I've understood the theory so far:
For the variable a:
lifetime: static
scope of identifier: file level scope
memory segment: data segment
initial value: 0
For the variable b:
lifetime: automatic (local)
scope level: block level scope
memory segment: stack
initial value: indeterminate (whatever happens to be in memory)
But the variable c is what confuses me.
As far as I understand, its lifetime is static and its scope is block scope, but I'm not sure about the memory segment or the initial value.
Usually, the local variables of a function are kept in the stack segment, but since the variable is static, should it then be kept in the data segment instead?

Normally you don't need to deal with concepts like "segment"; the details depend on the object file format (ELF, Mach-O, etc.).
A static variable has the same lifetime and initialization rules no matter where it is defined. The only difference is the visibility of its symbol to the compiler and linker. In your particular example, static float c is also zero-initialized, just like int a.
Technically, if you are dealing with Linux and the ELF format, a static variable without an explicit initializer is put in the .bss section, not the .data section. The .bss section occupies no space in the file, but it is zero-initialized when the ELF file is loaded for execution.
If you are interested, you can use the nm command to see the symbols in your object file.

This is just a complement to your own analysis and #liliscent's answer. Variable a has external linkage, because it is declared at file level with no static specifier. That means it can be accessed from a different translation unit, provided it is declared there as extern int a;. The other variables cannot be accessed by name from other translation units.

The concept of segment can refer to 2 different things:
Either the segments as seen by the CPU, which are references to a part of the memory pointed to by a segment register, or a logical segment, which is a name for some kind of data (as seen in assembler source code).
For example, the .bss segment has no separate existence in the program file. It only means: a part of the data segment which is initialized to zero and which, for this reason, doesn't need to be saved as data in the program file.
For the rest, one can assume there are 3 kinds of segments: code, data and stack, with a special case for the heap, which is dynamically allocated in the data segment; but this is merely an implementation detail, which may vary from one implementation to another.
However, for the purpose of simplification, one can consider that all static variables are allocated in the data segment, with just one special case for data initialized to 0, which is in .bss (and thus still in the data segment, but not imaged in the program file).
The only difference between a global and a local static variable is its visibility and its "name space": you can have multiple static variables with the same name, local to different functions; each is visible only in the function in which it was declared, yet all are initialized at the beginning of execution.
So, unlike automatic variables, which are allocated on the stack each time the function is called (and thus exist multiple times if the function is called recursively), static variables are shared by all simultaneous instances of the function; i.e., if a function calls itself and the callee changes the value of a static variable, the value is changed for the caller too.

Related

Is it possible to access static local variable outside the function where it is declared?

For a static variable defined within a C function, like below:
int f1()
{
static int var2 = 42;
var2++;
printf("var2=%d\n", var2);
}
The var2 will be stored in the .data segment (because it is explicitly initialized to 42, thanks to #busybee pointing this out):
0000000000004014 l O .data 0000000000000004 var2.2316
The var2 will be stored in the .bss segment if I don't explicitly initialize it (or initialize it to 0):
000000000000401c l O .bss 0000000000000004 var2.2316
There are 2 aspects about the var2:
Its lifetime is the same as the whole program.
But its scope is limited to within f1().
The bss section is meant for uninitialized global data. While the data section is meant for initialized global data. The var2 lives in bss so it must be global in a sense.
I think the reason that var2 can only be accessed within f1() is just some syntactical rule placed by the compiler. If we iterate through the bss section, the var2 must be accessible from outside the f1(). Am I right on this? Thanks.
Well you have raw access to memory, so the world's your oyster, but their limited access scope is exactly the whole point of using static local variables.
They're global state with controlled access, so you can apply local reasoning.
If you can access them externally, then local reasoning goes out the door. At that point, one should think: why not just use a regular global?
The following is more or less hacker stuff, in addition to other answers, not from the viewpoint of a language lawyer.
I think the reason that var2 can only be accessed within f1() is just some syntactical rule placed by the compiler.
To be picky, this is true only if the compiler is compliant to the standard. ;-) It is the standard that defines the rule. The technical term is "scope".
If we iterate through the bss section, the var2 must be accessible from outside the f1(). Am I right on this?
Yes. The section exists as long as the program runs. You can use a pointer into this section and access any variable living there. This holds true for the data section, too, of course.
You can also use a pointer to access any dynamic variable. Commonly these are allocated on the stack. To get a grip on a specific value can be quite tricky, though.
But all these accesses are application specific, compiler dependent, and system dependent, at least. You will probably break some rules. But in principle, nothing can stop you.
The bss section is meant for uninitialized global data. While the data section is meant for initialized global data.
This is only true if you mean "explicitly initialized to non-zero values".
Both sections are initialized before main() starts. All values in bss are zeroed, and all values in data are set to their non-zero values. A variable explicitly initialized to a zero value will commonly be allocated in bss.
The reason for the existence of separate sections is to save space in the executable. The bss section is commonly not stored, only defined. It does not make sense to store a whole bunch of zeroes; the startup code will zero the complete section.
The var2 lives in bss
var2 does not live in the bss segment, but in the data segment, as the listing clearly shows. Your variable is initialized with 42, which is obviously non-zero.
Two execution environments are defined: freestanding and hosted. In both cases, program startup occurs when a designated C function is called by the execution environment. All objects with static storage duration shall be initialized (set to their initial values) before program startup. The manner and timing of such initialization are otherwise unspecified. Program termination returns control to the execution environment.
An object whose identifier is declared without the storage-class specifier _Thread_local, and either with external or internal linkage or with the storage-class specifier static, has static storage duration. Its lifetime is the entire execution of the program and its stored value is initialized only once, prior to program startup.
Your static local variable has static storage duration and will be initialized before program execution starts (ie before main is called). You can access it via a pointer to it. The pointer can be only obtained by calling the function.
int *func(void)
{
    static int x;   /* static storage duration: outlives the call */
    return &x;      /* so returning its address is valid */
}
There are 2 aspects about the var2:
Its lifetime is the same as the whole program.
But its scope is limited to within f1().
Yes and no. Lifetime is a property of objects. Scope is a property of identifiers (names). That the scope of var2 is from its declaration in f1() to the end of the function is about the region of the source wherein that name identifies the object in question. On the other hand, that the lifetime of the object identified by var2 inside that scope is the same as the whole program's is about the object itself.
The bss section is meant for uninitialized global data. While the data section is meant for initialized global data. The var2 lives in bss so it must be global in a sense.
Be very careful about trying to infer language semantics from implementation details. It is very easy to get that wrong, and very hard to get it right in all details. In this particular case, the object in question is global in exactly the sense that its lifetime is the same as the whole program's, which you already knew.
For the record, "global" is not a C-language term. When people say "global variable" in C context, they usually mean a variable more properly described as having external linkage, which necessarily identifies an object having static storage duration (i.e. the whole execution of the program). That's not what you're looking at in the case of var2.
I think the reason that var2 can only be accessed within f1() is just some syntactical rule placed by the compiler.
More or less yes. The object can be accessed by name only within the scope of that name. This is among the semantic rules of the C language. It is essentially the definition of "scope".
If we iterate through the bss section, the var2 must be accessible from outside the f1(). Am I right on this?
How do you propose to "iterate through the bss section"? In the first place, that's a characteristic of some executable file formats, not a runtime characteristic of the program. But perhaps you mean "iterate through memory", but even then, C does not define a way to do that.
With that said, if f1() published a pointer to its var2 variable, via an out variable, for example, that pointer could indeed be used outside the function to access that object. Like this, for example:
int f1(int **pptr) {
static int var2 = 42;
*pptr = &var2;
var2++;
return printf("var2=%d\n", var2);
}
// ...
void other_function() {
int *ptr;
int res = f1(&ptr);
printf("%d\n", *ptr);
}

How do static variables in C persist in memory?

We all know the common example of how a static variable works: a static variable is declared inside a function with some value (let's say 5), the function adds 1 to it, and in the next call to that function the variable will have the modified value (6 in my example).
How does that happen behind the scene? What makes the function ignore the variable declaration after the first call? How does the value persist in memory, given the stack frame of the function is "destroyed" after its call has finished?
static variables and other variables with static storage duration are stored in special segments outside the stack. Generally, the C standard doesn't mention how this is done other than that static storage duration variables are initialized before main() is called. However, the vast majority of real-world computers work as described below:
If you initialize a static storage duration variable with a value, then most systems store it in a segment called .data. If you don't initialize it, or explicitly initialize it to zero, it gets stored in another segment called .bss where everything is zero-initialized.
The tricky part to understand is that when we write code such as this:
void func (void)
{
static int foo = 5; // will get stored in .data
...
Then the line containing the initialization is not executed the first time the function is entered (as often taught in beginner classes) - it is not executed inside the function at all and it is always ignored during function execution.
Before main() is even called, the "C run-time libraries" (often called CRT) run various start-up code. This includes copying initial values into .data and zeroing .bss. So the initialization in the above line effectively happens before your program even starts.
So by the time func() is called for the first time, foo is already initialized. Any other changes to foo inside the function will happen in run-time, as with any other variable.
This example illustrates the various memory regions of a program. What gets allocated on the stack and the heap? gives a more generic explanation.
The variable isn't stored in the stack frame, it's stored in the same memory used for global variables. The only difference is that the scope of the variable name is the function where the variable is declared.
Quoting C11, chapter 6.2.4
An object whose identifier is declared [...] with the storage-class specifier static, has static storage duration. Its lifetime is the entire execution of the program and its stored value is initialized only once, prior to program startup.
In a typical implementation, objects with static storage duration are stored either in the data segment or in the BSS (depending on whether they are explicitly initialized to a non-zero value). So a function call does not create a new variable in the stack frame of the called function, as you might have expected. There is a single instance of the variable in memory, and every call accesses that same instance.

Where is the local static variable stored? If it is data segment, why its scope is not whole program?

If a static local variable is also stored in the data segment, why doesn't its value persist when the variable is used in two different functions, as in this example?
void func()
{
static int i=0;
i++;
}
void func1()
{
i++; // here i is stored in the data segment,
// then the scope should be available for entire program
}
Why is the value of i only accessible in block scope if it is stored in the data segment? It might be a silly question, but I am trying to understand the concept. Please help me understand it. Thanks in advance.
You need to differentiate between the scope and the lifetime of a variable.
In simple words:
"scope" means the region of your source code where the variable is known to the compiler. If a variable is (by the rules) not visible to the compiler, it will refuse to compile accesses to it.
"lifetime" means the time beginning with the allocation of memory for the variable until the moment the memory is assigned to another variable or released. A static variable lives as long as the program runs. A non-static variable lives just as long as its scope is in control.
However, just because both the scope and the lifetime of a variable are "finished", that does not mean the memory disappears. The physical cells are still there, and they often keep their last contents. That is why a function that returns a pointer to a local variable may appear to work, with the variable's contents still readable after both its scope and its lifetime have ended. Doing so is undefined behavior in C, though, and it is a classic source of confusion for beginners.
Consider a compiler for an embedded processor like the 8051. Granted, a quite old and simple machine, but a good example. This compiler will commonly put local variables in its data segment. But to make the most of the limited memory space (128 bytes in total, including working registers and stack), the same memory locations are re-used for variables with non-overlapping lifetimes. Even so, you could access any memory from anywhere in the program.
Now, language lawyers, start picking on me. ;-)
A variable in C consists of two things:
A name, called an identifier. An identifier has a scope, which is a region of the program source code in which it is visible (may be used).
A region of storage (memory), called an object. An object has a lifetime, which is a portion of program execution during which memory is reserved for it. The lifetime is determined by the object's storage duration.
For a variable declared inside a function, its identifier has block scope, and the identifier is visible only from its declaration to the } that closes the innermost block it is in. (A block is a list of statements and declarations inside { and }.)
Inside a function, declaring a variable with static makes its object have static storage duration, causing it to exist for all of program execution, but it does not change the scope of its identifier. The object exists throughout program execution, but the identifier is visible only inside the function.
When another function is called, the object still exists (and it can be used if the function has its address, perhaps because it has been passed as a parameter). However, the identifier for the variable is not known inside the source code of other functions, so they cannot use the identifier.

What uses up more space in FLASH? static variable or global variable

As the title says, what uses up more space in FLASH (in an STM32 µC for example)? Declaring a global variable or declaring a static variable inside a function? Or do they take equal space? Both variables are available throughout the whole runtime of the program in my understanding. Just their scopes are different.
You can have 0-initialized global and static variables. Those normally take up no flash, because they are placed in a memory region which is allocated and zeroed when the program starts and does not come from flash.
You can initialize the variables with a value too. In that case they are placed in the initialized data segment, so they take up space in flash according to the size of the data type.
Static variables inside functions can also be initialized by code (directly in C++; in C the initializer must be a constant expression, so this takes an explicit first-call check). That initialization must happen at runtime, but only once, so it actually generates more code, which will in almost any case take more space than the size of the data (not necessarily, at least if you initialize a large enough struct with a function return value). You can do almost the same for non-const global variables too: you just need to leave them 0-initialized originally and put an assignment (for example) at the start of main(), where it takes the same space as the initialization of a function-scope static variable by code takes elsewhere.
In conclusion, global and function-scope static variables take up the same amount of space.
The above assumes "global variable" in an embedded context, or a file-scope static variable. If it is an exported global symbol in a dynamically linkable executable, then relocation information for that symbol will take some space in the executable binary. However, I don't think the given example system supports or uses relocatable executables.
The formal term for "available throughout the whole runtime" is static storage duration. Variables declared at file scope ("global") as well as all variables declared with static both have static storage duration.
So there is a relation between scope and storage duration: scope can dictate what storage duration a variable gets. But there is no relation between scope and memory usage.
How much space a variable takes up depends only on how large that variable's type is. Scope and storage duration have nothing to do with it.
On most compilers/linkers, there are usually two things required for a variable to end up in flash:
It must be declared as const, and
It must have static storage duration
If these conditions aren't met, the variable will not end up in flash/nvm, regardless of which scope it is declared at.
As the title says, what uses up more space in FLASH (in an STM32 µC for example)? Declaring a global variable or declaring a static variable inside a function? Or do they take equal space?
Using arm-none-eabi-gcc as the reference for an STM32 build, neither has its run-time storage in flash.
Global and static variables that are not declared const go either into the .data section if they require startup initialisation or into .bss if they don't. Both of those sections are placed into SRAM by your linker script (the initial values for .data are kept in flash and copied to SRAM at startup). If you're doing C++ then static C++ classes end up in .bss.
If you do declare them const then they'll be placed into the .rodata section, which your linker script will typically locate in flash, often as a subsection of .text. Flash is usually more plentiful than SRAM, so do make use of const where you can.
Finally, the optimizer can come along and totally rearrange anything it sees fit, including the elimination of storage in favour of inlining.

How the static variable gets retrieved for every function call

We know that when the control exits from function the stack space will be freed. So what happens for static variables. Will they be saved in any memory and retrieved when the function gets called ??
The wiki says:
In the C programming language, static is used with global variables and functions to set their scope to the containing file. In local variables, static is used to store the variable in the statically allocated memory instead of the automatically allocated memory. While the language does not dictate the implementation of either type of memory, statically allocated memory is typically reserved in data segment of the program at compile time, while the automatically allocated memory is normally implemented as a transient call stack.
and
Static local variables: variables declared as static inside a function are statically allocated while having the same scope as automatic local variables. Hence whatever values the function puts into its static local variables during one call will still be present when the function is called again.
Yes, static variables persist between function calls. They reside in the data section of the program, like global variables.
You can (and probably should) read more about general memory layout of C applications here.
Adding some more information on top of previously given answers -
The memory for static objects is allocated at compile/link time. Their addresses are fixed by the linker based on the linker control file.
The linker file defines the physical memory layout (Flash/SRAM) and the placement of the different program regions.
The static region is actually subdivided into two further sections: one holding the initial values (in Flash), and one in RAM where the variables live and change at run time.
And finally, remember that if you do not specify otherwise, the value will be 0; the zeroing is done by the startup code before main() runs.
You made an incorrect assumption that static variables are placed on the stack* when the function that uses them is running, so they need to be saved and retrieved.
This is not how C does it: static variables are allocated in an entirely different memory segment outside of stack, so they do not get freed when the function ends the scope of its automatic variables.
Typically, the static data segment is created and initialized once when the program starts. After that the segment stays allocated for as long as your program is running. All your global variables, along with the static variables from all functions, are placed in this segment by the compiler. That is why entering or leaving functions has no effect on these variables.
* The C standard itself does not use the word "stack"; it speaks of objects with automatic storage duration.
Consider this example:
static int foo;
void f(void)
{
static int bar;
}
The only difference between foo and bar is that foo has file scope whereas bar has block scope (it is visible only inside f). Both variables exist during the whole lifetime of the program.

Resources