Declare global variable after function - c

Why do we need to declare global variables before function defining and declaring since we call this function after this variable declaration? Isn't it that compiler read line by line? I mean while calling the function compiler should know what's x.
void function()
{
x += 3
}
int x = 3;
int main(void)
{
function();
return 0;
}
And one more question. I know that we can define function after main function provided we declared this function before main. Then how does main function see these functions after main function? Does the compiler first read the whole file and then run main() or sth?

You can think at the job of the compiler like this; it reads the source file token by token (not exactly line by line) and when sufficient tokens are read, it outputs the translation. It repeats that, until the source file is (correctly) finished: the job of the compiler is done.
For every token the compiler reads, it needs to know what the token represents. If it doesn't know, an error is generated.
So, while compiling your function
void function()
{
x += 3
}
it encounters an "x" but does not know what it represents (an integer? A float? Something else?). -error-.
Why do we need to declare global variables before function defining and declaring
Declaration and Definition are two different things. The compiler needs a declaration in order to know how to manage an identifier; the real definition can be somewhere else, even in another source (or already compiled) file.
And one more question. I know that we can define function after main function provided we declared this function before main. Then how does main function see these functions after main function? Does the compiler first read the whole file and then run main() or sth?
As explained before, all the compiler needs is a declaration, so it can output correct (object) code. If you declare function(), then define main(), then define function(), the compiler has enough to generate correct output, which will consist of code for main() and code for function() (we can say "in this order"). The next step, the linker, will take care to connect these two functions.
The definition of function() could also be absent: the compiler still would generate correct code for main(); the linker would instead complain, unless you tell it where to find the definition/implementation of function().
Also note that a definition is also a declaration. So if in your source you declare function() and then main(), you don't need forward declaration.
In the comments I've read that perhaps you are confusing interpreters with compilers - this is true, if you try to compare Python with C: very different beasts. A big difference is compiler vs interpreter, the compiler generates data (object code) but does not link it (and neither runs it). An interpreter instead is a compiler+linker+runtime, all packed together. Normally a compiler generates code that is much faster than the equivalent interpreted program, but to do this it needs accurate informations (precise types and declarations) and often (always?) is less versatile. The interpreter is often more versatile but it can not exploit all the optimizazions a good compiler can do.

Why do we need to declare global variables before function defining
and declaring since we call this variablefunction after this
declaration?
The c language is a strictly typed language. When the compiler processes an identifier it needs to determine its type to generate correct object code.
It is not necessary that a global variable used in a function shall be declared exactly before the function definition. But in any case it shall be declared before its usage in a function.
Here is a demonstrative program.
#include <stdio.h>
void function( void )
{
extern int x;
x += 3;
}
int x = 3;
int main( void )
{
function();
printf( "x = %d\n", x );
}
The program output is
x = 6
Here the variable x declared within the function refers to the global variable x defined after the function.
Then how does main function see these functions after main function?
Does the compiler first read the whole file and then run main() or
sth?
C is a compilation language. It does not run programs. It generates object code that then after some processing by the linker can be run.
In the point where a function is used what the compiler is need is the type of the function that to check whether the function is used correctly. If the function is an inline function then it can substitute its call for the function body It can rebuild the object code when the inline definition in the translation unit will be known.

Related

Why function prototypes are they required in MISRA:2012?

I am wondering why function prototypes are required by MISRA:2012. In the example below, the two prototypes aren't really necessary.
#include <stdio.h>
#include <stdlib.h>
// >>> Truly useless in my opinion
void display(void);
int main(void);
// <<<
void display(void) {
printf("Hello World!\n");
}
int main() {
display();
return EXIT_SUCCESS;
}
The rationale I can read on SO such as here isn't very clear to me. For instance, if main tries to access display before it is declared, the compiler or the static analyzer will raise an error: function display used before declaration.
In other words is it a good idea to create a deviation for this MISRA rule?
void display(void); is a function forward declaration. It has prototype format.
As indicated in the link posted, a function prototype is a function declaration with the types of all parameters specified. If there are no parameters, then the parameter list must be (void) (no parameters) and not () (any parameter).
The exact rule 8.2 says:
Rule 8.2 Function types shall be in prototype form with named parameters
The rationale provided (read it, it is pretty good) mentions that this is to avoid old K&R and C90 programs where not all parameters are specified. C99 still allows this at some extent, as long as the parameter types in the function declaration don't collide with the parameter types in the function definition.
Essentially, the rule seeks to ban these kind of functions:
void func1 (x) // K&R style
int x;
{}
void func2(x) // sloppy style
{}
All parameters (if any) must have types and names specified.
However, I find nothing in MISRA-C that requires you to write a function declaration for each function. This means that your example code would conform to this MISRA rule with or without the function declaration.
Though as I mentioned in a previous answer, writing .c files without function declarations (in prototype format) is sloppy practice. If your functions need to be called in a certain order, it should be made obvious by the program design, function naming and comments/documentation. Not by the order that they happen to be declared inside the .c file.
There should be no tight coupling between the source code line where a function is declared in the .c file and that function's behavior/use.
Instead, functions should be defined in an order that makes sense logically. A common way to write .c files is to keep all public functions, that have their function declaration up in the .h file, at the top of the .c file. Then let the internal functions (those with static/internal linkage) sit at the bottom. This model requires function declarations of all the internal functions. Another option is to put all internal functions on top, and the public functions at the bottom. As long as you are consistent, either is fine.
What's most important is that if function definitions inside the .c file are re-ordered, it should not break the program or cause compiler errors. The easiest way to ensure this is to always provide function declarations for every single function in your program.
Note that the function declarations on top of the file is not "truly useless" at all, as they provide a quick summary of all functions present in the C file. It is a way to write self-documenting code.
Note that the C standard allows no prototype for main(), as a special case.
Note that in addition, rule 8.7 and 8.8 disallows you to use void display(void) without static, since the function is only used in one translation unit.
If you do not declare the function, any function call will call default argument promotions for each argument because it is considered that the function has the semantics of the C89 standard.
Case 1:
Consider a call f(5), where the parameter of the function is of type double. The code of f will treat 5 as double, while the default arith promotions will pass only an integer.
A loss of header file that contains declarations may have your code pass integers and in fact the function to treat them as pointers causing seg faults. Here is a concrete example:
Case 2:
Suppose you want to use the function strtok and you do not include the header string.h with the declaration -- char *strtok(char *str, const char *delim)
Now you make the mistake to consider separator 'A' instead of "A". So if you forget the signature but keep in mind the meaning of parameters, if you do not include the header the code will compile with no warning and of course the code from strtok will consider your char 'A' (that is converted in an integer (=95)) as the actual argument. The code considers it is a pointer to a string and will try to access the pointer from the location 95 finishing with segfault.
Case 3:
Here is another typical example of code that segfaults on some architecture -- even if you do not make any mistake it still segfaults.
char *subtoken;
subtoken = strtok(str, delim, &saveptr);
In this case the function strtok (missing the declaration from string.h) is considered to return an int, so an implicit conversion from int->char* is made. If int is represented on 32 bits and the pointer on 64 bits, clearly the value of subtoken will be errorneous and will produce seg fault.

Determining to which function a pointer is pointing in C?

I have a pointer to function, assume any signature.
And I have 5 different functions with same signature.
At run time one of them gets assigned to the pointer, and that function is called.
Without inserting any print statement in those functions, how can I come to know the name of function which the pointer currently points to?
You will have to check which of your 5 functions your pointer points to:
if (func_ptr == my_function1) {
puts("func_ptr points to my_function1");
} else if (func_ptr == my_function2) {
puts("func_ptr points to my_function2");
} else if (func_ptr == my_function3) {
puts("func_ptr points to my_function3");
} ...
If this is a common pattern you need, then use a table of structs instead of a function pointer:
typedef void (*my_func)(int);
struct Function {
my_func func;
const char *func_name;
};
#define FUNC_ENTRY(function) {function, #function}
const Function func_table[] = {
FUNC_ENTRY(function1),
FUNC_ENTRY(function2),
FUNC_ENTRY(function3),
FUNC_ENTRY(function4),
FUNC_ENTRY(function5)
}
struct Function *func = &func_table[3]; //instead of func_ptr = function4;
printf("Calling function %s\n", func->func_name);
func ->func(44); //instead of func_ptr(44);
Generally, in C such things are not available to the programmer.
There might be system-specific ways of getting there by using debug symbols etc., but you probably don't want to depend on the presence of these for the program to function normally.
But, you can of course compare the value of the pointer to another value, e.g.
if (ptr_to_function == some_function)
printf("Function pointer now points to some_function!\n");
The function names will not be available at runtime.
C is not a reflective language.
Either maintain a table of function pointers keyed by their name, or supply a mode of calling each function that returns the name.
The debugger could tell you that (i.e. the name of a function, given its address).
The symbol table of an unstripped ELF executable could also help. See nm(1), objdump(1), readelf(1)
Another Linux GNU/libc specific approach could be to use at runtime the dladdr(3) function. Assuming your program is nicely and dynamically linked (e.g. with -rdynamic), it can find the symbol name and the shared object path given some address (of a globally named function).
Of course, if you have only five functions of a given signature, you could compare your address (to the five addresses of them).
Notice that some functions don't have any ((globally visible) names, e.g. static functions.
And some functions could be dlopen-ed and dlsym-ed (e.g. inside plugins). Or their code be synthetized at runtime by some JIT-ing framework (libjit, gccjit, LLVM, asmjit). And the optimizing compiler can (and does!) inline functions, clone them, tail-call them, etc.... so your question might not make any sense in general...
See also backtrace(3) & Ian Taylor's libbacktrace inside GCC.
But in general, your quest is impossible. If you really need such reflective information in a reliable way, manage it yourself (look into Pitrat's CAIA system as an example, or somehow my MELT system), perhaps by generating some code during the build.
To know where a function pointer points is something you'll have to keep track of with your program. Most common is to declare an array of function pointers and use an int variable as index of this array.
That being said, it is nowadays also possible to tell in runtime which function that is currently executed, by using the __func__ identifier:
#include <stdio.h>
typedef const char* func_t (void);
const char* foo (void)
{
// do foo stuff
return __func__;
}
const char* bar (void)
{
// do bar stuff
return __func__;
}
int main (void)
{
func_t* fptr;
fptr = foo;
printf("%s executed\n", fptr());
fptr = bar;
printf("%s executed\n", fptr());
return 0;
}
Output:
foo executed
bar executed
Not at all - the symbolic name of the function disappears after compilation. Unlike a reflective language, C isn't aware of how its syntax elements were named by the programmer; especially, there's no "function lookup" by name after compilation.
You can of course have a "database" (e.g. an array) of function pointers that you can compare your current pointer to.
This is utterly awful and non-portable, but assuming:
You're on Linux or some similar, ELF-based system.
You're using dynamic linking.
The function is in a shared library or you used -rdynamic when linking.
Probably a lot of other assumptions you shouldn't be making...
You can obtain the name of a function by passing its address to the nonstandard dladdr function.
set your linker to output a MAP file.
pause the program
inspect the address contained in the pointer.
look up the address in the MAP file to find out which function is being pointed to.
A pointer to a C function is an address, like any pointer. You can get the value from a debugger. You can cast the pointer to any integer type with enough bits to express it completely, and print it. Any compilation unit that can use the pointer, ie, has the function name in scope, can print the pointer values or compare them to a runtime variable, without touching anything inside the functions themselves.

Apparently no meaning line in a program [duplicate]

I am reading the book "Programming in C" and found in Chapter 10 an example like this:
#include <stdio.h>
void test (int  *int_pointer)
{
     *int_pointer = 100;
}
int main (void)
{
     void test (int  *int_pointer);
     int  i = 50, *p = &i;
     printf ("Before the call to test i = %i\n", i);
     test (p);
     printf ("After the call to test i = %i\n", i);
     return 0;
}
I understand the example, but I don't understand the line void test (int *int_pointer); inside of main. Why do I define the signature of test again? Is that idiomatic C?
It's definitely not idiomatic C, despite being fully valid (multiple declarations are okay, multiple definitions are not). It's unnecessary, so the code will still work perfectly without it.
If at all, perhaps the author meant to do
void test (int *int_pointer);
int main (void) {
...
}
in case the function definition was put after main ().
void test (int *int_pointer); is just a declaration (or prototype) of function test. No need of this declaration in main because you already have function definition before main.
If the definition of test were after main then it would be worth of putting its declaration there to let the compiler know about the return type, number of arguments and arguments types of test before calling it.
It's not idomatic C, but still valid.
The line is a declaration of the function test, not definition. A function can't be defined multiple times, but it's valid to have multiple declarations.
It is perfectly idiomatic C, and it actually has a (limited) practical use - although not one that is demonstrated by this example.
When you declare a function or other name at the usual global level, it is brought into scope for all function bodies in the code following the declaration. A declaration cannot be removed from a scope once it has been introduced. The function is permanently visible to the rest of the translation unit.
When you declare a function or other name within a braced block, the scope of the declaration is limited to that block. Declaring a function within the scope of another function will limit its visibility, and not pollute the global namespace or make it visible to any other functions defined in the same translation unit.
This is meaningless in the case of the example, because the definition of test also brings it into scope for all following bodies - but if test were defined in another translation unit, or even if it were defined only at the very bottom of this TU, hiding the declaration inside main would protect any other functions defined afterwards from being able to see its name in their scope.
In practical terms this is of limited use - normally if you don't want a function to be visible, you put it in another translation unit (and preferably make it static) - but you can probably contrive a situation where you might want to use this ability for constructing a module-loading system that doesn't export the original declarations of its components, or something like that (and the fact that this doesn't rely on static/separate object files might potentially have some relevance to embedded/non-hosted target environments where the linking step might not work as it does on PC, allowing you to achieve a measure of namespace protection in a purely-#include-based build system).
Example:
struct module {
void * (* alloc)(size_t);
void (* dealloc)(void *);
} loaded_module;
int main(void) {
if (USE_GC) { // dynamically choose the allocator system
void * private_malloc_gc(size_t);
void private_free_noop(void *);
loaded_module = (struct module){ private_malloc_gc, private_free_noop };
} else {
void * private_malloc(size_t);
void private_free(void *);
loaded_module = (struct module){ private_malloc, private_free };
}
do_stuff();
//...
}
// cannot accidentally bypass the module and manually use the wrong dealloc
void do_stuff(void) {
int * nums = module.alloc(sizeof(int) * 32)
//...
module.dealloc(nums);
}
#include "allocator_implementations.c"
It's not idiomatic; you typically see it in code that has problems getting their header files in order.
Any function is either used in one file only, or it is used in multiple files. If it is only used in its own file, it should be static. If it is used in multiple files, its declaration should be in a header file, and anyone using it should include the header file.
What you see here is very bad style (the function should either be static, or the declaration should be taken from a header style), and also quite pointless because the compiler can see the declaration already. Since the function is in the same file, it's not dangerous; if the declaration and function don't match the compiler will tell you. I have often seen this kind of thing when the function was in a different file; that is dangerous. If someone changes the function, the program is likely to crash or misbehave.

Function declaration inside of function — why?

I am reading the book "Programming in C" and found in Chapter 10 an example like this:
#include <stdio.h>
void test (int  *int_pointer)
{
     *int_pointer = 100;
}
int main (void)
{
     void test (int  *int_pointer);
     int  i = 50, *p = &i;
     printf ("Before the call to test i = %i\n", i);
     test (p);
     printf ("After the call to test i = %i\n", i);
     return 0;
}
I understand the example, but I don't understand the line void test (int *int_pointer); inside of main. Why do I define the signature of test again? Is that idiomatic C?
It's definitely not idiomatic C, despite being fully valid (multiple declarations are okay, multiple definitions are not). It's unnecessary, so the code will still work perfectly without it.
If at all, perhaps the author meant to do
void test (int *int_pointer);
int main (void) {
...
}
in case the function definition was put after main ().
void test (int *int_pointer); is just a declaration (or prototype) of function test. No need of this declaration in main because you already have function definition before main.
If the definition of test were after main then it would be worth of putting its declaration there to let the compiler know about the return type, number of arguments and arguments types of test before calling it.
It's not idomatic C, but still valid.
The line is a declaration of the function test, not definition. A function can't be defined multiple times, but it's valid to have multiple declarations.
It is perfectly idiomatic C, and it actually has a (limited) practical use - although not one that is demonstrated by this example.
When you declare a function or other name at the usual global level, it is brought into scope for all function bodies in the code following the declaration. A declaration cannot be removed from a scope once it has been introduced. The function is permanently visible to the rest of the translation unit.
When you declare a function or other name within a braced block, the scope of the declaration is limited to that block. Declaring a function within the scope of another function will limit its visibility, and not pollute the global namespace or make it visible to any other functions defined in the same translation unit.
This is meaningless in the case of the example, because the definition of test also brings it into scope for all following bodies - but if test were defined in another translation unit, or even if it were defined only at the very bottom of this TU, hiding the declaration inside main would protect any other functions defined afterwards from being able to see its name in their scope.
In practical terms this is of limited use - normally if you don't want a function to be visible, you put it in another translation unit (and preferably make it static) - but you can probably contrive a situation where you might want to use this ability for constructing a module-loading system that doesn't export the original declarations of its components, or something like that (and the fact that this doesn't rely on static/separate object files might potentially have some relevance to embedded/non-hosted target environments where the linking step might not work as it does on PC, allowing you to achieve a measure of namespace protection in a purely-#include-based build system).
Example:
struct module {
void * (* alloc)(size_t);
void (* dealloc)(void *);
} loaded_module;
int main(void) {
if (USE_GC) { // dynamically choose the allocator system
void * private_malloc_gc(size_t);
void private_free_noop(void *);
loaded_module = (struct module){ private_malloc_gc, private_free_noop };
} else {
void * private_malloc(size_t);
void private_free(void *);
loaded_module = (struct module){ private_malloc, private_free };
}
do_stuff();
//...
}
// cannot accidentally bypass the module and manually use the wrong dealloc
void do_stuff(void) {
int * nums = module.alloc(sizeof(int) * 32)
//...
module.dealloc(nums);
}
#include "allocator_implementations.c"
It's not idiomatic; you typically see it in code that has problems getting their header files in order.
Any function is either used in one file only, or it is used in multiple files. If it is only used in its own file, it should be static. If it is used in multiple files, its declaration should be in a header file, and anyone using it should include the header file.
What you see here is very bad style (the function should either be static, or the declaration should be taken from a header style), and also quite pointless because the compiler can see the declaration already. Since the function is in the same file, it's not dangerous; if the declaration and function don't match the compiler will tell you. I have often seen this kind of thing when the function was in a different file; that is dangerous. If someone changes the function, the program is likely to crash or misbehave.

Why's initializing a global variable with return value of a function failing at declaration,but works fine at file scope?

An 80k reputation contributor R.. told me on SO that we can't initialize global variables with the return value of a function as that's not considered a constant,and global variables must be initialized with a constant.And true to his words,I get the following error for this program as expected-- initializer element is not a constant.Here is the program:
#include<stdio.h>
int foo();
int gvar=foo(); //ERROR
int main()
{
printf("%d",gvar);
}
int foo()
{
return 8;
}
But in this context,I just don't understand why the followed altered version of the above program shows no error at all and works fine.In this second program,I am initializing the same global variable with the return value of the same function foo().Can you tell me what is the rigorous technical reason for this variation in results?Why is initializing the global variable with the return value of a function at it's declaration causing error but the same initialization with the same return value works fine when done from within a function?
#include<stdio.h>
int foo();
int gvar;
int main()
{
gvar=foo();
printf("%d",gvar);
}
int foo()
{
return 8;
}
Output 8
The reason behind it is that in order to determine a value produced by a function one needs to execute code, and that there is no code execution done in C when initializing static and global variables.
Compiler and linker work together to prepare a byte image of the global memory segment: the compiler provides the values, and the linker performs their final layout. At runtime, the image of the segment is loaded in memory as is, without further modifications. This happens before any code gets executed, so no function calls can be made.
Note that this does not mean that it is not possible for some technical reason, only that C designers decided against doing it. For example, C++ compiler generates a code segment that calls constructors of global objects, which gets executed before the control is passed to main().
The second version doesn't have an initializer for gvar. gvar is declared and defined at global scope without an initializer. It has static storage duration, so it is initialized with zero.
The assignment in main is just that: an assignment, not an initialization.
In case 1, global variable is assigned with a variable while it is declared.
But in the second case, global variable is assigned(which is already declared) with return value of foo().
Forming of data section, text section all happens during compilation.
Global variables will be in data section(bss or initialized data section), so at compile time, foo() is not invoked right? and return value of foo() is not known during compilation.
But second case, when the text section get executed, gvar is assigned with return value of foo(). It is valid.
You can maybe think of it like this: when main() starts, all global variables must already have their initializer values. And they can't, as you've been told, get those by calling functions since main() is really where execution starts, in a C program.
we could not call any function from outer of the function.Not like shell script.function only allow to called from inside of function body.
In c first execution begins from main(), compiler don't know the function calling if that stands on outer of function it may taken as prototype if arg and return types provided.
we can putting return value of function by calling from main or others function block, to the variable,the function called then (that global) variable modified.
but we can use macro in global variable as needs.
as:
#define max() 12
int glob=max();
main()
{
}

Resources