How do static variables in C persist in memory?


We all know the common example of how a static variable works: a static variable is declared inside a function with some value (let's say 5), the function adds 1 to it, and in the next call to that function the variable has the modified value (6 in my example).
How does that happen behind the scenes? What makes the function ignore the variable declaration after the first call? How does the value persist in memory, given that the stack frame of the function is "destroyed" after its call has finished?

Variables declared static, and other variables with static storage duration, are stored in special segments outside the stack. The C standard doesn't say how this is done; it only requires that variables with static storage duration are initialized before main() is called. However, the vast majority of real-world computers work as described below:
If you initialize a static storage duration variable with a value, then most systems store it in a segment called .data. If you don't initialize it, or explicitly initialize it to zero, it gets stored in another segment called .bss where everything is zero-initialized.
The tricky part to understand is that when we write code such as this:
void func (void)
{
static int foo = 5; // will get stored in .data
...
Then the line containing the initialization is not executed the first time the function is entered (as often taught in beginner classes) - it is not executed inside the function at all and it is always ignored during function execution.
Before main() is even called, the "C run-time libraries" (often called CRT) run various start-up code. This includes copying the initial values into .data and zeroing out .bss. So the initialization on the above line has in effect already been carried out before your program even starts.
So by the time func() is called for the first time, foo is already initialized. Any other changes to foo inside the function will happen in run-time, as with any other variable.
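To make the behaviour concrete, here is a minimal sketch that completes the func() fragment above. The int return type is an assumption added only so that the effect can be observed from the caller.

```c
/* Sketch: the initializer is applied once, before main() runs.
   Only the increment executes when the function is called. */
int func(void)
{
    static int foo = 5;  /* initialized once, before program startup */
    foo++;               /* ordinary run-time modification */
    return foo;
}
```

The first call returns 6 and the second returns 7, because the initializer is never re-applied when the function is entered.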
The related question "What gets allocated on the stack and the heap?" gives a more generic explanation of the various memory regions of a program.

The variable isn't stored in the stack frame, it's stored in the same memory used for global variables. The only difference is that the scope of the variable name is the function where the variable is declared.

Quoting C11, chapter 6.2.4
An object whose identifier is declared [...] with the storage-class specifier static, has static storage duration. Its lifetime is the entire execution of the program and its stored value is initialized only once, prior to program startup.
In a typical implementation, objects with static storage duration are stored either in the data segment or in the BSS (depending on whether they are initialized to a non-zero value or not). So a function call does not create a new variable on the stack of the called function, as you might have expected. There is a single instance of the variable in memory, and every call accesses that same instance.

Related

Is it possible to access static local variable outside the function where it is declared?

For a static variable defined within a C function, like below:
#include <stdio.h>
int f1(void)
{
static int var2 = 42;
var2++;
printf("var2=%d\n", var2);
return var2; // the original omitted a return value
}
The var2 will be stored in the .data segment (because it is explicitly initialized to 42, thanks to @busybee for pointing this out):
0000000000004014 l O .data 0000000000000004 var2.2316
The var2 will be stored in the .bss segment if I don't explicitly initialize it, or if I initialize it to 0:
000000000000401c l O .bss 0000000000000004 var2.2316
There are 2 aspects about the var2:
Its lifetime is the same as the whole program.
But its scope is limited to within f1().
The bss section is meant for uninitialized global data. While the data section is meant for initialized global data. The var2 lives in bss so it must be global in a sense.
I think the reason that var2 can only be accessed within f1() is just some syntactical rule placed by the compiler. If we iterate through the bss section, the var2 must be accessible from outside the f1(). Am I right on this? Thanks.

Well you have raw access to memory, so the world's your oyster, but their limited access scope is exactly the whole point of using static local variables.
They're global state with controlled access, so you can apply local reasoning.
If you can access them externally, then local reasoning goes out the door. At that point, one should think: why not just use a regular global?

The following is more or less hacker stuff, in addition to other answers, not from the viewpoint of a language lawyer.
I think the reason that var2 can only be accessed within f1() is just some syntactical rule placed by the compiler.
To be picky, this is true only if the compiler is compliant with the standard. ;-) It is the standard that defines the rule. The technical term is "scope".
If we iterate through the bss section, the var2 must be accessible from outside the f1(). Am I right on this?
Yes. The section exists as long as the program runs. You can use a pointer into this section and access any variable living there. This holds true for the data section, too, of course.
You can also use a pointer to access any dynamic variable. Commonly these are allocated on the stack. To get a grip on a specific value can be quite tricky, though.
But all these accesses are application specific, compiler dependent, and system dependent, at least. You will probably break some rules. But in principle, nothing can stop you.
The bss section is meant for uninitialized global data. While the data section is meant for initialized global data.
This is only true if you mean "explicitly initialized to non-zero values".
Both sections are initialized before main() starts. All values in bss are zeroed, and all values in data are set to their non-zero values. A variable explicitly initialized to a zero value will commonly be allocated in bss.
The reason for the existence of separate sections is to save space in the executable. The bss section is commonly not stored in the file; only its size is recorded. It does not make sense to store a whole bunch of zeroes, since the startup code will zero the complete section anyway.
The var2 lives in bss
var2 does not live in the bss segment, but in the data segment, as the cutout clearly shows. Your var is initialized with 42, which is apparently non-zero.
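As a rough illustration of the .data/.bss split described above, the sketch below marks where a typical ELF toolchain (for example GCC or Clang on Linux) tends to place each object. The variable and function names are made up for this example, and the placement is an implementation detail, not something the C language guarantees.

```c
/* Placement comments describe common toolchain behaviour only. */
static int file_zero;          /* no initializer: usually .bss */
static int file_nonzero = 7;   /* non-zero initializer: usually .data */

int sum_of_statics(void)
{
    static int local_zero = 0;      /* explicit zero: commonly .bss */
    static int local_nonzero = 42;  /* non-zero initializer: usually .data */
    return file_zero + file_nonzero + local_zero + local_nonzero;
}
```

Compiling this to an object file and listing its symbols (with objdump -t or nm, for instance) should show entries similar to the var2 lines quoted in the question. An optimizing compiler may remove objects that are never modified, so the check is best done without optimization.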

Two execution environments are defined: freestanding and hosted. In
both cases, program startup occurs when a designated C function is
called by the execution environment. All objects with static storage
duration shall be initialized (set to their initial values) before
program startup. The manner and timing of such initialization are
otherwise unspecified. Program termination returns control to the
execution environment.
An object whose identifier is declared without the storage-class
specifier _Thread_local, and either with external or internal linkage
or with the storage-class specifier static, has static storage
duration. Its lifetime is the entire execution of the program and its
stored value is initialized only once, prior to program startup.
Your static local variable has static storage duration and will be initialized before program execution starts (i.e. before main is called). You can access it through a pointer to it. Such a pointer can only be obtained by calling the function:
int *func(void)
{
static int x;
return &x;
}

There are 2 aspects about the var2:
Its lifetime is the same as the whole program.
But its scope is limited to within f1().
Yes and no. Lifetime is a property of objects. Scope is a property of identifiers (names). That the scope of var2 is from its declaration in f1() to the end of the function is about the region of the source wherein that name identifies the object in question. On the other hand, that the lifetime of the object identified by var2 inside that scope is the same as the whole program's is about the object itself.
The bss section is meant for uninitialized global data. While the data section is meant for initialized global data. The var2 lives in bss so it must be global in a sense.
Be very careful about trying to infer language semantics from implementation details. It is very easy to get that wrong, and very hard to get it right in all details. In this particular case, the object in question is global in exactly the sense that its lifetime is the same as the whole program's, which you already knew.
For the record, "global" is not a C-language term. When people say "global variable" in C context, they usually mean a variable more properly described as having external linkage, which necessarily identifies an object having static storage duration (i.e. the whole execution of the program). That's not what you're looking at in the case of var2.
I think the reason that var2 can only be accessed within f1() is just some syntactical rule placed by the compiler.
More or less yes. The object can be accessed by name only within the scope of that name. This is among the semantic rules of the C language. It is essentially the definition of "scope".
If we iterate through the bss section, the var2 must be accessible from outside the f1(). Am I right on this?
How do you propose to "iterate through the bss section"? In the first place, that's a characteristic of some executable file formats, not a runtime characteristic of the program. But perhaps you mean "iterate through memory", but even then, C does not define a way to do that.
With that said, if f1() published a pointer to its var2 variable, via an out variable, for example, that pointer could indeed be used outside the function to access that object. Like this, for example:
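#include <stdio.h>
int f1(int **pptr) {
static int var2 = 42;
*pptr = &var2; // publish the address of the static object
var2++;
return printf("var2=%d\n", var2);
}
// ...
void other_function(void) {
int *ptr;
int res = f1(&ptr);
(void)res; // result of printf is not needed here
printf("%d\n", *ptr); // reads var2 from outside f1()
}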

Where is the local static variable stored? If it is in the data segment, why isn't its scope the whole program?

If a static local variable is also stored in the data segment, why doesn't its value persist when the variable is used in two different functions? For example:
void func()
{
static int i=0;
i++;
}
void func1()
{
i++; // here i is stored in the data segment,
// then the scope should be available for entire program
}
Why is 'i' only accessible within block scope if it is stored in the data segment? It might be a silly question, but I am trying to understand the concept. Please help me understand it. Thanks in advance.

You need to differentiate between the scope and the lifetime of a variable.
In simple words:
"scope" means the region of your source code where the variable is known to the compiler. If a variable is (by the rules) not visible to the compiler, it will refuse to compile accesses to it.
"lifetime" means the time beginning with the allocation of memory for the variable until the moment the memory is assigned to another variable or released. A static variable lives as long as the program runs. A non-static variable lives just as long as its scope is in control.
However, just because both the scope and the lifetime of a variable are "finished", that does not mean that the memory disappears. The physical cells are still there, and they keep their last contents. That's why a function that returns a pointer to some local (automatic) variable may appear to work, with the caller still reading that variable's contents after both the scope and the lifetime of the variable are gone. Doing so is undefined behavior in C, so it must not be relied upon. This is a fine example of an issue that confuses beginners.
Consider a compiler for an embedded processor like the 8051. Granted, it is a quite old and simple machine, but a good example. This compiler will commonly put local variables in its data segment. But to make the most of the limited memory space (128 bytes in total, including working registers and stack), the same memory locations are re-used for variables with non-overlapping lifetimes. Even so, you could access any memory from anywhere in the program.
Now, language lawyers, start picking on me. ;-)

A variable in C consists of two things:
A name, called an identifier. An identifier has a scope, which is a region of the program source code in which it is visible (may be used).
A region of storage (memory), called an object. An object has a lifetime, which is a portion of program execution during which memory is reserved for it. This is also called storage duration.
For a variable declared inside a function, its identifier has block scope, and the identifier is visible only from its declaration to the } that closes the innermost block it is in. (A block is a list of statements and declarations inside { and }.)
Inside a function, declaring a variable with static makes its object have static storage duration, causing it to exist for all of program execution, but it does not change the scope of its identifier. The object exists throughout program execution, but the identifier is visible only inside the function.
When another function is called, the object still exists (and it can be used if the function has its address, perhaps because it has been passed as a parameter). However, the identifier for the variable is not known inside the source code of other functions, so they cannot use the identifier.
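If two functions really do need to share one persistent object, as in the question's func() and func1(), one common approach is to move the declaration to file scope, so that the identifier is visible to both. A minimal sketch, with hypothetical function names:

```c
/* `shared` has file scope: every function defined below this line in the
   same translation unit can name it. Its lifetime is the whole program,
   exactly as it would be for a static local. */
static int shared = 0;

void bump_a(void) { shared++; }
void bump_b(void) { shared++; }

int read_shared(void) { return shared; }
```

After one call to bump_a() and one to bump_b(), read_shared() returns 2. The storage duration is no different from that of a static local; only the scope of the name has been widened.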

How the static variable gets retrieved for every function call

We know that when control exits from a function, its stack space is freed. So what happens to static variables? Are they saved somewhere in memory and retrieved when the function is called again?

The wiki says:
In the C programming language, static is used with global variables
and functions to set their scope to the containing file. In local
variables, static is used to store the variable in the statically
allocated memory instead of the automatically allocated memory. While
the language does not dictate the implementation of either type of
memory, statically allocated memory is typically reserved in data
segment of the program at compile time, while the automatically
allocated memory is normally implemented as a transient call stack.
and
Static local variables: variables declared as static inside a function
are statically allocated while having the same scope as automatic
local variables. Hence whatever values the function puts into its
static local variables during one call will still be present when the
function is called again.

Yes, static variables persist between function calls. They reside in data section of the program, like global variables.
You can (and probably should) read more about general memory layout of C applications here.

Adding some more information on top of previously given answers -
The memory for static objects is allocated at compile/link time. Their address is fixed by the linker based on the linker control file.
The linker file defines the physical memory layout (Flash/SRAM) and placement of the different program regions.
The static region is actually subdivided into two further sections, one for initial value, and the other for changes done in run time.
And finally, remember that if you do not specify an initializer, the object is zero-initialized before the program starts.

You made an incorrect assumption that static variables are placed on the stack* when the function that uses them is running, so they need to be saved and retrieved.
This is not how C does it: static variables are allocated in an entirely different memory segment outside of the stack, so they are not freed when the function returns and its automatic variables go out of scope.
Typically, the static data segment is created and initialized once, when the program starts. After that the segment stays allocated for as long as your program is running. All your global variables, along with the static variables from all functions, are placed in this segment by the compiler. That is why entering or leaving functions has no effect on these variables.
* The C standard does not use the word "stack"; it speaks of objects with "automatic storage duration".
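The contrast between the two kinds of storage can be sketched with two otherwise identical functions (the names are invented for this example):

```c
int count_automatic(void)
{
    int n = 0;         /* new object, re-initialized on every call */
    n++;
    return n;
}

int count_static(void)
{
    static int n = 0;  /* one object, initialized once before startup */
    n++;
    return n;
}
```

count_automatic() returns 1 on every call, while count_static() returns 1, 2, 3 and so on, because nothing is saved or restored: the static object simply never goes away.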

Consider this example:
static int foo;
void f(void)
{
static int bar;
}
The only difference between foo and bar is that the name foo has file scope whereas the name bar has block scope (it is visible only inside f). Both variables exist during the whole lifetime of the program.

Assignment to static variable is ignored

I have following piece of code:
#include <stdio.h>
void f1(void)
{
static int s=10;
printf("s=%d\n",s++);
}
int main()
{
f1();
f1();
return 0;
}
The output is:
s=10
s=11
Why is the line static int s=10 ignored the second time f1 is called?

That is not an assignment, but an initializer. Local static variables are initialized only once, at program startup, like global variables. They keep their last assigned value between invocations of the function. Thus after your first call, s retains the value 11. In fact, they are like file-scope static variables, except that their name is only known in the scope of the block in which they are declared (but you can pass them around by pointer).
The drawback is that there is only one instance of each. If you invoke the same function from multiple threads, they all share the same variable.
Try a third call: you will get 12.
Note: the initializer must be a constant expression. Try static int s = 10, t = s + 5; and read the compiler error message.
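When the starting value is only known at run time, a common workaround is to initialize lazily on the first call, guarded by a second static flag. The sketch below assumes a hypothetical helper compute_offset() standing in for the run-time value. Like any static local, it is shared between threads and is not thread-safe as written.

```c
#include <stdbool.h>

/* Hypothetical helper: stands in for a value only known at run time. */
static int compute_offset(void) { return 5; }

int base_plus_offset(void)
{
    static int s = 10;            /* constant initializer: allowed */
    static int t;                 /* zero-initialized before startup */
    static bool t_ready = false;  /* records whether t has been set */

    if (!t_ready) {               /* body runs only on the first call */
        t = s + compute_offset();
        t_ready = true;
    }
    return t;
}
```

Every call returns 15, but compute_offset() is invoked only once. Here the initialization really is an assignment executed inside the function, unlike the static int s = 10 initializer.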

Initialization of a static variable is one-time (with the time of initialization guaranteed to occur before the first call, which could occur at compile time or at run time; compiler dependent). That's the main reason to use them.

The static variables are initialized only once, conceptually even before application has started.
From C11 (N1570) §5.1.2/p1 Execution environments:
All objects with static storage duration shall be initialized (set to
their initial values) before program startup.
along with §6.2.4/p3 Storage durations of objects:
Its lifetime is the entire execution of the program and its stored
value is initialized only once, prior to program startup.

As others have said, a static variable at function scope is initialized only once. So the initialization doesn't happen again on subsequent calls to the function.
Unlike other local variables, a static local is not placed on the stack but in the data segment, probably in the same region as global variables. Globals are also initialized at application startup (they have to be, since they don't live inside a function and therefore their initialization can't be ordinary executable code), so conceptually you can think of a static local as a global variable with limited visibility.

From the C89 Standard HTML version at 3.1.2.4 Storage durations of objects it specifies:
An object declared with external or internal linkage, or with the storage-class specifier static has static storage duration. For such an object, storage is reserved and its stored value is initialized only once, prior to program startup. The object exists and retains its last-stored value throughout the execution of the entire program
(The emphasis is mine)
So it says that every time you use the static storage-class specifier, the variable preserves its value across multiple function calls.
Local variables that are not static are created anew every time you call the function that declares them, so they do not preserve their value across function calls.
Hope this helped!

Is it correct to call a static variable local?

It is known that the C language supports two kinds of memory allocation through the variables in C programs:
1) Static allocation is what happens when you declare a static
variable. Each static variable defines one block of space, of a fixed
size. The space is allocated once, when your program is started, and
is never freed.
2) Automatic allocation happens when you declare an automatic
variable, such as a function argument or a local variable. The space
for an automatic variable is allocated when the compound statement
containing the declaration is entered, and is freed when that compound
statement is exited.
(this is a full quote from http://www.cs.utah.edu/dept/old/texinfo/glibc-manual-0.02/library_3.html)
The question is: is it correct to call a static variable in a function "local" in terms of memory allocation and why?
Thanks to everyone in advance.
P.S. any quotes from the C standard are welcome.

The C standard doesn't define the term "local variable". Automatic and static refer to storage duration.
C11 (n1570), § 6.2.4 Storage durations of objects
An object has a storage duration that determines its lifetime.

You could call it a "function-local static variable" or something like that, but if you simply call it a "local variable" you may find that people are surprised when they find out it's actually static, and therefore has some of the properties of a global variable.

There are two types of static variables in C.
Global static variables, where static states that the variable can only be seen in this translation unit.
Static variables with a local scope (i.e. inside a function). These are initialized once and keep their value even after going out of scope.
And to your question: no, a variable can't be static and automatic at the same time.
If you check their addresses, you will see that the static variable does not live in the current stack frame.
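One portable way to check this, without comparing raw addresses of unrelated objects, is to let a function call itself once and compare the objects seen by the outer and the inner call. This is only a sketch, and the function name is made up for the example:

```c
#include <stdbool.h>
#include <stddef.h>

/* Returns true if a nested call sees the same static object but a
   different automatic object than the outer call does. */
bool nested_call_shares_static(const int *outer_auto, const int *outer_static)
{
    int a = 0;         /* automatic: one object per active call */
    static int s = 0;  /* static: one object for the whole program */

    if (outer_auto == NULL) {
        /* Outer call: hand our addresses down to a nested call. */
        return nested_call_shares_static(&a, &s);
    }
    /* Inner call: both calls are active, so this `a` is a distinct object. */
    return (&a != outer_auto) && (&s == outer_static);
}
```

Calling nested_call_shares_static(NULL, NULL) returns true: each active call gets its own a, while both calls refer to the single s.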

In the context of variables, the term local most often denotes visibility and scope rather than the storage mechanism and lifetime.
Using the term local variables in C is in fact inaccurate as the standard never talks about that.
Informally, a static variable inside a function could be said to be local within the visible scope of the function, but not much more than that.
I would suggest against using the term local variables at all. Instead, one should talk about static variables within a function, automatic variables, static variables in the file scope and globals.

The question is: is it correct to call a static variable in a function "local" in terms of memory allocation and why?
Static variables are stored in the data section of the memory allocated to the program.
Even after the scope of a static variable ends, its value can still be accessed outside that scope (for example through a returned value or a pointer). This indicates that the contents of the data segment are independent of scope.
Example
#include <stdio.h>
int increment(void);
int main()
{
printf("\ni = %d",increment());
printf("\ni = %d",increment());
printf("\ni = %d",increment());
}
int increment(void)
{
static int i = 1;
return i++ ;
}
In the above example, after each call to increment(), the static variable i inside the function goes out of scope when the function returns, but it persistently retains its value. This is only possible because the variable is not on the stack used by the function; it lives in an entirely different memory area, the data segment.
