Is function pointer address always static? - c

If a function pointer goes out of scope before another thread uses it to call the function, will the pointer be invalid? Or are function pointers always valid, since they point to executable code that doesn't "move around"?
I think my real question is whether what the pointer points to (the function) will ever change, or whether that value is static throughout the lifetime of the program.
Pseudo-code:
static void func(void) { printf("hi\n"); }
int main(void)
{
    start_thread();
    {
        void (*f)(void) = func;
        // edit: void run_on_other_thread(void (*f)(void));
        run_on_other_thread(f); // non-blocking.
    }
    join_thread();
}

In the C base language, the values of function pointers never become invalid. They point to functions, and functions exist for the entire time a program is executing. The value of a pointer is valid for the entire program.
An object that contains a pointer may have a limited lifetime. (Note: The question mentioned scope, but scope is where in the source code an identifier is visible. Lifetime is when during program execution an object exists.) In the question void (*f)(void) = func;, f is an object with automatic storage duration. Once execution of the block it is defined in ends, f no longer exists, and references to it have undefined behavior. However, the value that was assigned to f is still a valid value. For example, if we define int x = 37;, and the lifetime of x ends, that does not mean you can no longer use the value 37 in a program. In this case, the value that f had, which is the address of func, is still valid. The address of func can continue to be used throughout the program’s execution.
The situations discussed in Xypron’s answer regarding dynamically linked functions or dynamically created functions would be extensions to the C language. In these situations, it is not the lifetime of the pointer object that is in question but rather the fact that the function itself is being removed from memory that causes the pointer to be no longer a valid pointer to the original function.

Whether a function pointer remains valid depends on its usage.
If it points to a function in the source code of your process it stays valid during the runtime of the process.
If you use a function pointer to point to a function in a dynamic link library, the pointer becomes invalid when unloading the library.
Code can be written that relocates itself. E.g. when the Linux kernel is started it relocates itself changing the addresses of functions.
You could call a runtime compiler which creates functions in memory during program execution possibly reusing the memory when an object goes out of scope.
As said it depends.

Related

How do static variables in C persist in memory?

We all know the common example of how a static variable works: a static variable is declared inside a function with some value (let's say 5), the function adds 1 to it, and in the next call to that function the variable will have the modified value (6 in my example).
How does that happen behind the scenes? What makes the function ignore the variable declaration after the first call? How does the value persist in memory, given that the stack frame of the function is "destroyed" after its call has finished?
static variables and other variables with static storage duration are stored in special segments outside the stack. Generally, the C standard doesn't mention how this is done other than that static storage duration variables are initialized before main() is called. However, the vast majority of real-world computers work as described below:
If you initialize a static storage duration variable with a value, then most systems store it in a segment called .data. If you don't initialize it, or explicitly initialize it to zero, it gets stored in another segment called .bss where everything is zero-initialized.
The tricky part to understand is that when we write code such as this:
void func (void)
{
static int foo = 5; // will get stored in .data
...
Then the line containing the initialization is not executed the first time the function is entered (as often taught in beginner classes) - it is not executed inside the function at all and it is always ignored during function execution.
Before main() is even called, the "C run-time libraries" (often called CRT) run various start-up code. This includes copying the initial values into .data and zeroing out .bss. So the initialization in the line above effectively happens before your program even starts.
So by the time func() is called for the first time, foo is already initialized. Any other changes to foo inside the function will happen in run-time, as with any other variable.
This example illustrates the various memory regions of a program. The question "What gets allocated on the stack and the heap?" gives a more generic explanation.
The variable isn't stored in the stack frame, it's stored in the same memory used for global variables. The only difference is that the scope of the variable name is the function where the variable is declared.
Quoting C11, chapter 6.2.4
An object whose identifier is declared [...] with the storage-class specifier static, has static storage duration. Its lifetime is the entire execution of the program and its stored value is initialized only once, prior to program startup.
In a typical implementation, the objects with static storage duration are stored either in the data segment or the BSS (depending on whether they are initialized or not). So every function call does not create a new variable in the stack of the called function, as you might have expected. There's a single instance of the variable in memory, which is accessed on every call.

Where is the local static variable stored? If it is data segment, why its scope is not whole program?

If a static local variable is also stored in the data segment, why doesn't its value persist when the variable is used in two different functions? For example:
void func()
{
    static int i=0;
    i++;
}
void func1()
{
    i++; // here i is stored in the data segment,
         // then the scope should be available for entire program
}
Why is the value of 'i' only accessible within block scope if it is stored in the data segment? It might be a silly question, but I am trying to understand the concept. Please help me understand it. Thanks in advance.
You need to differentiate between the scope and the lifetime of a variable.
In simple words:
"scope" means the region of your source code where the variable is known to the compiler. If a variable is (by the rules) not visible to the compiler, it will refuse to compile accesses to it.
"lifetime" means the time beginning with the allocation of memory for the variable until the moment the memory is assigned to another variable or released. A static variable lives as long as the program runs. A non-static local variable lives only from entry into its enclosing block until execution of that block ends.
However, just because both the scope and the lifetime of a variable are "finished", that does not mean that the memory disappears. The physical cells are still there, and they often keep their last contents for a while. That's why a function that returns a pointer to one of its local variables can appear to work: the caller may still find the variable's old contents there. Formally, though, using such a pointer is undefined behavior, which makes this a classic source of confusion for beginners.
Consider a compiler for an embedded processor like the 8051. Granted, a quite old and simple machine, but a good example. This compiler will commonly put local variables in its data segment. But to make good use of the limited memory space (128 bytes in total, including working registers and stack), the same memory locations are re-used for variables with non-overlapping lifetimes. Even so, any of that memory could be accessed from anywhere in the program.
Now, language lawyers, start picking on me. ;-)
A variable in C consists of two things:
A name, called an identifier. An identifier has a scope, which is a region of the program source code in which it is visible (may be used).
A region of storage (memory), called an object. An object has a lifetime, which is a portion of program execution during which memory is reserved for it. The lifetime is determined by the object's storage duration.
For a variable declared inside a function, its identifier has block scope, and the identifier is visible only from its declaration to the } that closes the innermost block it is in. (A block is a list of statements and declarations inside { and }.)
Inside a function, declaring a variable with static makes its object have static storage duration, causing it to exist for all of program execution, but it does not change the scope of its identifier. The object exists throughout program execution, but the identifier is visible only inside the function.
When another function is called, the object still exists (and it can be used if the function has its address, perhaps because it has been passed as a parameter). However, the identifier for the variable is not known inside the source code of other functions, so they cannot use the identifier.

How we can access auto and static variables outside their scope in C?

Auto and static variables have scope limited to the block in which they are defined. Since auto variables live on the stack, when the function exits its stack frame is destroyed and the memory for the auto variable is released. But I read somewhere: "However, they can be accessed outside their scope as well using the concept of pointers given here by pointing to the very exact memory location where the variables reside." Is this correct?
Also, static variables are placed in the data section, so they exist until the end of the program. Their scope is still the block in which they are defined. Is there any way to access a static variable from another function? Also, is there any way to access a static variable from another file?
Here's a very simple example:
#include <stdio.h>

void print_msg(const char* msg) {
    printf("The message is: %s\n", msg);
}
int main(void) {
    char m[] = "Hello, world!";
    print_msg(m);
}
Here, m is an automatic variable, which is not in scope in print_msg. But print_msg clearly has access to its value.
Don't confuse "scope" with "lifetime". The scope of a variable is that part of the program where the variable's name is visible (and thus can be used). The lifetime of a value is the period during program execution in which a value exists. Scope is about program text; it relates to compilation. Lifetime is about program execution.
As you said, static variables exist throughout the life cycle of the program, i.e. the memory allocated to them is not released as long as the program is running. So, to access such a variable outside its scope, we can pass around a pointer to that memory location. A small example to show this:
#include <stdio.h>

int *func(void)
{
    static int a = 0;
    a++;
    printf("a in func = %d\n", a);
    return &a; /* valid: a exists for the whole program run */
}

int main(void)
{
    int *p;
    p = func();
    printf("a in main from ptr : %d\n", *p);
    (*p)++; /* increment a itself; *p++ would advance the pointer instead */
    p = func();
    return 0;
}
As you can see in the example, func() returns a pointer to the static variable it has declared, and anyone who wishes to access the variable a can use that pointer. NOTE: we can only do this because a static variable's lifetime spans the whole program. Regardless of whether the static variable is in a different function or a different file, as long as you can somehow get hold of a pointer to it, you can use it.
Now coming to the case of auto variable.
What happens if you run the above program changing a from static to auto?
You will see that the compiler issues the warning "warning: function returns address of local variable [-Wreturn-local-addr]", and when executing the program we typically get a segmentation fault.
The cause is that the auto variable exists only during its lifetime, i.e. the variable a has memory allocated for itself only as long as the function func() is being executed. As soon as the function exits, that memory is released, so the pointer p no longer points to a valid object. Dereferencing it is undefined behavior, which here shows up as a segmentation fault.
Note, as comments rightly point out, I am making an assumption here: that the simplest case of calling another function is not what the question is about. This assumption has not (yet) been confirmed or rejected by the OP. That case is discussed e.g. in the answer by rici.
Auto variables are not only restricted to use "within" their scope (simplified: only code between the same enclosing {} can use their identifier), they are also restricted to "during" their "chronological scope", i.e. their lifetime (simplified: after execution of the code in the function starts and before it finishes). It is possible to access the memory location of a variable via a pointer that was set to its address (which is only possible within its scope, because taking the address requires the identifier), as long as this is done during its lifetime, yes.
But how would that pointer be found from anywhere else?
Maybe by being written (from inside their scope and during their lifetime) to a global variable.
But which "other" code should then use that value? (remember I am putting the call of functions at the side here)
This requires multithreading/multitasking/multiwhatevering. Let's say there is an interrupt service routine doing it. It would have to see the same address space as the variable's scope, i.e. no memory management units getting in the way with some virtual memory magic. This is not true for many multiwhatevering implementations, but admittedly it is for a few of them, so let's continue.
This imagined ISR would have to ensure that it only accesses the auto variable while it actually exists (i.e. during its lifetime), otherwise it would pretty much access what is effectively a meaningless random memory location. And this assumes that the ISR is actually allowed/able to access that memory. Even without MMUs, there are implementations which can/will have exceptions.
This introduces the need for synchronisation mechanisms, e.g. semaphores.
So in certain environments it would be possible, but completely pointless (global variables are still involved), expensive, hard to understand and next to impossible to port. (Remember, I am putting the call of a function aside here.)
Similar for static variables.
In the case of function local static variables, they would at least reliably exist, but accessing them would still need the pointer value to be somehow transported out of their scope. For static variables that could actually be done via the return value of the function as demonstrated in the answer by yashC.
In the case of "static" variables understood as file scope restricted variables, the pointer still would have to be transported out of the file scope.
This would merely defeat what is probably the point of a file scope restricted variable. But I could imagine some kind of access privilege scheme, as in "Here is the key to the vault. Handle with care."
As mentioned at the start of this answer, I am putting the call of other functions aside. Yes, the easiest way to leave the scope of a function is to call another one. If that other function has a pointer parameter, it can use it to read-access and write-access the auto variable of the calling function. That is the normal case of call-by-reference parameters as supported by C.
Calling a function also provides another, even simpler way of read-accessing the value of an auto variable of the calling function, though not write-accessing it and not actually accessing the auto variable itself, only using its value. That way is the trivial mechanism of a call-by-value parameter; it does not even require a pointer. Both ways (call-by-reference parameter and call-by-value parameter) conveniently guarantee that the value does not change during the execution of the called function. (This time I am putting the multi-threaded case aside, because that is discussed in the main part of this answer.)

What happens to initialization of static variable inside a function

After stumbling onto this question and reading a little more here (C++, but this works the same in C and C++ AFAIK), I saw no mention of what is really happening inside the function.
#include <stdio.h>

void f(void) {
    static int c = 0;
    printf("%d\n", c++);
}

int main(void) {
    int i = 10;
    while (i--)
        f();
    return 0;
}
In this snippet, c's lifetime is the entire execution of the program, so the line static int c = 0; has no meaning in the later calls to f(), since c is already a defined (static) variable, and the assignment part is also obsolete (in the later calls to f()), since it only takes place the first time.
So, what does the compiler do? Does it split f into 2 functions, f_init and f_the_real_thing, where f_init initializes and f_the_real_thing prints, and then call f_init once and from then on only call f_the_real_thing?
The first assignment is not "obsolete" - it ensures c is zero the first time f() is called. Admittedly that is the default for statics: if no initialiser is specified, it will be initialised to zero. But a static int c = 42 will ensure c has the value 42 the first time the function is called, and the sequence of values will continue from there.
The static keyword means that the variable has static storage duration. It is only initialised once (so will have that value the first time the function is called) but changes then persist - any time the value is retrieved, the value retrieved will be the last stored in the variable.
All the compiler does is place the variable c into an area of memory that will exist - and hold whatever value it was last set to - for as long as the program is running. The specifics of how that is achieved depends on the compiler.
However, I have never seen a compiler that splits the logic of the function into multiple parts to accommodate the static.
Although the standard does not dictate how compilers must implement behavior, most compilers do a much less sophisticated thing: they place c into a static memory segment and tell the loader to place zero at c's address. This way f comes straight to a pre-initialized c, and proceeds to printing and incrementing as if the declaration line were not there.
In C++, the compiler may instead add code that initializes c to a static initialization function, which initializes all static variables. In this case, with a constant initializer, no such call is required.
In essence, this amounts to c starting its lifetime before the first call to f. You can think of c's behavior as if it were a static variable outside f() with its visibility constrained to f()'s scope.
The C standard doesn't specify how the required behaviour for static storage duration must be implemented.
If you're curious about how your particular implementation handles this, then you can always check the generated assembly.
(Note that in your particular case, your code is vulnerable to concurrency issues, since c++ is not necessarily atomic. It is also vulnerable to int overflow in general, although here i-- acts as an adequate termination condition.)

What is the meaning of "statically allocated"?

http://linux.die.net/man/3/pthread_mutex_init
In cases where default mutex attributes are appropriate, the macro
PTHREAD_MUTEX_INITIALIZER can be used to initialize mutexes that are
statically allocated. The effect shall be equivalent to dynamic
initialization by a call to pthread_mutex_init() with parameter attr
specified as NULL, except that no error checks are performed.
I know about dynamic allocation. What is the meaning of "statically allocated"?
My question here is to understand the meaning of "statically" allocated. I posted the quote from the man page to provide a context, only.
Statically allocated means that the storage for the variable is laid out at compile time (strictly, when the program is built and loaded), not requested at run time. In C, this can be a global variable at file scope or a static variable in a function.
A good overview is found here:
http://en.wikipedia.org/wiki/Static_memory_allocation
Variables on the stack (i.e., local variables in functions that do not have the static keyword) are allocated when the function is called, sometimes multiple times when a function is called recursively. So they are conceptually different from static memory allocation (which only happens once per program).
