Benefits and drawbacks of making all functions in main.c static? - c

I have heard that, when you have just 1 (main.c) file (or use a "unity build"), there are benefits to be had if you make all your functions static.
I am kind of confused why this (allegedly) isn't optimized by default, since it's not probable that you will include main.c into another file where you will use one of its functions.
I would like to know the benefits and dangers of doing this before implementing it.
Example:
main.c
static int my_func(void){ /*stuff*/ }
int main(void) {
my_func();
return 0;
}

You have various chunks of wisdom in the comments, assembled here into a Community Wiki answer.
Jonathan Leffler noted:
The primary benefit of static functions is that the compiler can (and will) aggressively inline them when it knows there is no other code that can call the function. I've had error messages from four levels of inlined function calls (three qualifying “inlined from” lines) on occasion. It's staggering what a compiler will do!
and:
FWIW: my rule of thumb is that every function should be static until it is known that it will be called from code in another file. When it is known that it will be used elsewhere, it should be declared in a header file that is included both where the function is defined and where it is used. (Similar rules apply to file scope variables — aka 'global variables'; they should be static until there's a proven need for them elsewhere, and then they should be declared in a header too.)
The main() function is always called from the startup code, so it is never static. Any function defined in the same file as an unconditionally compiled main() function cannot be reused by other programs. (Library code might contain a conditionally compiled test program for the library function(s) defined in the source file — most of my library code has #ifdef TEST / …test program… / #endif at the end.)
Eirc Postpischil generalized on that:
General rule: Anytime you can write code that says the use of something is limited, do it. Value will not be modified? Make it const. Name only needs to be used in a certain section? Declare it in the innermost enclosing scope. Name does not need to be linked externally? Make it static. Every limitation both shrinks the window for a bug to be created and may remove complications that interfere with optimization.

Related

is the purpose of header files in C only warning to users?

I'm a beginner into Linking, sorry if my questions are too basic. lets say I have two .c files
file1.c is
int main(int argc, char *argv[])
{
int a = function2();
return 0;
}
file2.c is
int function2()
{
return 2018;
}
I know the norm is, create a file2.h and include it in file1.c, and I have some questions:
Q1. #include in file1.c doesn't make too much difference or improve much to me, I can still compile file1.c without file2.h correctly, the compiler will just warn me 'implicit declaration of function 'function2', but does this warning help a lot? Programmers might know that function2 is defined in other .c file(if you use function2 but don't define it, you certainly know the definition is somewhere else) and linker will do its job to produce the final executable file? so the only purpose of include file2,c to me is, don't show any warning during compilation, is my understanding correct.
Q2. Image this scenario, a programmer define function2 in file1.c, he doesn't know that his function2 in conflict with the one in file2.c until the linker throws the error(obvious he can compile his file1.c alone correctly. But if we want him to know his mistake when he compiles his file1.c, adding file2.h still don't help, so what's the purpose of adding header file?
Q3. What should we add to let the programmer know he should choose a different name for function2 rather then be informed the error by linker in the final stage.
Per C89 3.3.2.2 Function calls emphasis mine:
If the expression that precedes the parenthesized argument list in a function call consists solely of an identifier, and if no declaration is visible for this identifier, the identifier is implicitly declared exactly as if, in the innermost block containing the function call, the declaration
extern int identifier();
appeared
Now, remember, empty parameter list (declared with nothing inside the () braces) declares a function that takes unspecified type and number of arguments. Type void inside braces to declare that a function takes no arguments, like int func(void).
Q1:
does this warning help a lot?
Yes and no. This is a subjective question. It helps those, who use it. As a personal note, always make this warning an error. Using gcc compiler use -Werror=implicit-function-declaration. But you can also ignore this warning and make the simplest main() { printf("hello world!\n"); } program.
linker will do its job to produce the final executable file? so the only purpose of include file2,c to me is, don't show any warning during compilation, is my understanding correct.
No. In cases the function is called using different/not-compatible pointer type. It invokes undefined behavior. If the function is declared as void (*function2(void))(int a); then calling ((int(*)())function2)() is UB as is calling function2() without previous declaration. Per Annex J.2 (informative):
The behavior is undefined in the following circumstances:
A pointer is used to call a function whose type is not compatible with the pointed-to type (6.3.2.3).
and per C11 6.3.2.3p8:
A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the referenced type, the behavior is undefined.
So in your lucky case int function2() indeed this works. It also works for example for atoi() function. But calling atol() will invoke undefined behavior.
Q2:
the linker throws the error
This should happen, but is really linker dependent. If you compile all sources using a single stage with the gcc compiler it will throw an error. But if you create static libraries and then link them using gcc compiler without -Wl,-whole-archive then it will pick the first declaration is sees, see this thread.
what's the purpose of adding header file?
I guess simplicity and order. It is a convenient and standard way to share data structures (enum, struct, typedefs) and declarations (function and variable types) between developers and libraries. Also to share preprocessor directives. Image you are writing a big library with over 1000+ files that will work with over 100+ other libraries. In the beginning of each file would you write struct mydata_s { int member1; int member2; ... }; int printf(const char*, ...); int scanf(const char *, ...); etc. or just #include "mydata.h" and #include <stdio.h>? If you would need to change mydata_s structure, you would need to change all files in your project and all the other developers which use your library would need to change the definition too. I don't say you can't do it, but it would be more work to do it and no one will use your library.
Q3:
What should we add to let the programmer know he should choose a different name for function2 rather then be informed the error by linker in the final stage.
In case of name clashes you will by informed (hopefully) by the linker that it found two identifiers with the same name. You would need to create a tool to check your sources exactly for that. I don't know why the need for this, the linker is specifically made to resolve symbols so it naturally handles the cases when two symbols with the same identifier exists.
Short answer:
Take away: the earlier the compiler alert the better.
Q1: meaning of .h: consistency and early alerts. Alerting early on common ways of going wrong improves reliability of code and adds up to less debugging and production crashes.
Q2: Clashing Names bring early alerts to developers, which are usually easier to fix.
Q3: Early duplicate definition alerts are not baked into the C standard.
Exercises:
1. Define a function in one file that printf("%d\n",i) an int argument then call that function in another file with a float of 42.0.
2. Call with (double)42.0.
3. Define function with char *str argument printed under %.s then call with int argument.
Longer answers:
Popular convention: in typical use the name of the .h file is derived from the .c file, or files, it is associated with. file.h and file.c. For .h files with many definitions, say string.h, derive the file name from a hither perspective of what's within (as in the str... functions).
My big rule: it’s always better to structure your code so compilers can immediately alert on bugs at compile time rather than letting them slide through to debug or run time where they depend on code actually running in just the right way to find. Run time errors can be very difficult to diagnose, especially if they hit long after the program is in production, and expensive in maintenance and brings down your customer experience. See "yoda notation".
Q1: meaning of .h: consistency and early alerts and improved reliability of code.
C .h files allow developers of .c files compiled at different times to share common declarations. No duplicate code. .h files also allow functions to be consistently called from all files while identifying improper argument signatures (argument counts, bad clashes, etc.). Having.c files defining functions also #include the .h file helps assure the arguments in the definition are consistent with the calls; this may sound elementary, but without it all the human errors of signature clashes can sneak through.
Omitting .h files only works if the argument signatures of all callers perfectly match those in the definitions. This is often not the case so without .h files any clashing signatures would produce bad numbers unless you also had parallel externs in the calling file (bad bad bad). Things like int vs float can produce spectacularly wrong argument values. Bad pointers can produce segment faults and other total crashes.
Advantage: with externs in .h files compilers can correctly cast mismatching arguments to the correct type, assuring better calls. While you can still botch arguments it’s much less likely. It also helps avoid conditions where the mismatches work on one implementation but not another.
Implicit declaration warnings are hugely helpful to me as they usually indicate I’ve forgotten a .h file or spelled the name an external name wrong.
Q2: Clashing Names. Early alerts.
Clashing names are bad and it is the developers responsibility to avoid problems. C++ solves the issue with name spaces, which C, being a lower level language, does not have.
Use of .h files can allow can let compiler diagnostics alert developers where clashes care are early in the game. If compiler diagnostics don’t do this hopefully linkers will do so on multidefined symbol errors, but this is not guaranteed by the standard.
A common way to fake name spaces is by starting all potentially clashing definitions in a .h with some prefix (extern int filex_function1(int arg, char *string) or #define FILEX_DEF 42).
What to do if two different external libraries being used share the same names is beyond the scope of this answer.
Q3: early duplicate alerts. Sorry… early alerts are implementation dependent.
This would be difficult for the C standard to define. As C is an old language there are many creative different ways C programs are written and stored.
Hunting for clashing names before using them is up to the developer. Tools like cross reference programs can help. Even something stupid like ctags associated with vim or emacs can help.
you misunderstand usage of header files and function prototypes.
header files are needed to share common information between multiple code files. such information includes macro definition, data types, and, possibly, function prototypes.
function protoypes are needed for the compiler to correctly handle return data types and to give you early warnings of misuse of function return types and arguments.
function prototypes can be declared in header files or can be declared in the files which use them (more typing).
you have a very simple example, with just 2 files. Now imagine a project with hudreds of files and thousands of functions. You will be lost in linker errors.
'c' allows you to use an undeclared function due to legacy reasons. In this situation it assumes that the function has a return type of 'int'. However, modern data types has a bigger veriety than in early days. The function can return pointers, 64-bit data, structures. To express that you must use prototypes or nothing will work. The compiler has to know how to handle function returns correctly.
Also, it can give you warnings about incorrect use of argument types. Due to leagacy, those are still warnings, but they got addressed in early c++ and converted to errors.
Those warnings give you early debugging capabilities. Type mismatch warnings can save you days of debugging in some cases.
So, in your example you do not need the header file. You can prototype the function in the 'main' file using the 'extern' syntax. You can even do without prototyping. However, in real modern programming world you cannot allow the latter. In particular when you work in a team or want your program to be maintainable.
It is a good idea to store you funcion protypes in header files. This would be a good documentation source, in particular with good comments. BTW, function names must make sense to be maintainable.
Q1. Yes. C is a low level language, and was historically used to bind low level constructs into higher level concepts. For example, traditionally the label _end is at the last address in a program. The label is typeless but you can declare it as any type that is convenient to you. A "properly typed" language would make this sort of abuse difficult.
Q2. By convention, both file1.c and file2.c would include file2.h; one as consumer, the other as producer. Following this simple idiom will catch declaration vs definition errors; although again, the "warning" is not necessarily enforced.
Q3. Many software organizations take a "warnings are errors" rule to socially control their programmers.

is it possible to have only header file in C without source file

I would like to write a C library with fast access by including just header files without using compiled library. For that I have included my code directly in my header file.
The header file contains:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#ifndef INC_TEST_H_
#define INC_TEST_H_
void test(){
printf("hello\n");
}
#endif
My program doesn't compile because I have multiple reference to function test(). If I had a correct source file with my header it works without error.
Is it possible to use only header file by including code inside in a C app?
Including code in a header is generally a really bad idea.
If you have file1.c and file2.c, and in each of them you include your coded.h, then at the link part of the compilation, there will be 2 test functions with global scope (one in file1.c and the other one in file2.c).
You can use the word "static" in order to say that the function will be restricted so it is only visible in the .c file which includes coded.h, but then again, it's a bad idea.
Last but not least: how do you intend to make a library without a .so/.a file? This is not a library; this is copy/paste code directly in your project.
And when a bug is found in your "library", you will be left with no solution apart correcting your code, redispatch it in every project, and recompile every project, missing the very point of a dynamic library: The ability to "just" correct the library without touching every program using it.
If I understand what you're asking correctly, you want to create a "library" which is strictly source code that gets #incuded as necessary, rather than compiled separately and linked.
As you have discovered, this is not easy when you're dealing with functions - the compiler complains of multiple definitions (you will have the same problem with object definitions).
You have a couple of options at this point.
You could declare the function static:
static void test( void )
{
...
}
The static keyword limits the function's visibility to the current translation unit, so you don't run into multiple definition errors at link time. It means that each translation unit is creating its own separate "instance" of the function, leading to a bit of code bloat and slightly longer build times. If you can live with that, this is the easiest solution.
You could use a macro in place of a function:
#define TEST() (printf( "hello\n" ))
except that macros are not functions and do not behave like functions. While macro-based "libraries" do exist, they are not trivial to implement correctly and require quite a bit of thought. Remember that macro arguments are not evaluated, they're just expanded in place, which can lead to problems if you pass expressions with side effects. The classic example is:
#define SQUARE(x) ((x)*(x))
...
y = SQUARE(z++);
SQUARE(z++) expands to ((z++)*(z++)), which leads to undefined behavior.
Separate compilation is a Good Thing, and you should not try to avoid it. Doing everything in one source file is not scalable, and leads to maintenance headaches.
My program do not compiled because I have multiple reference to test() function
That is because the .h file with the function is included and compiled in multiple C source files. As a result, the linker encounters the function with global scope multiple times.
You could have defined the function as static, which means it will have scope only for the curent compilation unit, so:
static void test()
{
printf("hello\n");
}

Difference between static function and normal function in C?

In our project, we have pretty big C file of around 50K lines, written in 90's.
I wanted to split the file based on the functionality. But, all the functions in this file are declared as static. So, file scoped. If I split the file, then the function in file1 cannot call function in file2 and vice-versa.
But, My TL feels like that there could be memory optimization by using static functions.
I wrote some sample code to see if the stacks are different for different threads.
It seemed like it was. Could someone please enlighten me the difference between static function and a normal one other an file scope?
In C, while defining a function, the static keyword has the following 2 major consequences :
Prevents the function name from being exported (i.e. function does NOT have external linkage). Thus, preventing linkage / direct calls from other parts of the code.
As the function is clearly marked private to the file, the compiler is in a better position to generate a complete call-graph for the function. This may result in the compiler deciding to automatically in-line the function for better performance.
All functions are implicitly declared as extern, which means they're visible across translation units. But when we use static it restricts visibility of the function to the translation unit in which it's defined. So we can say Functions that are visible only to other functions in the same file are known as static functions.
The most important difference is you cannot call the static function in any other files. i think so ,yeah?

Does using a prototype + definition instead of just a definition speed up the program?

Im very new to C programing, but have limited experience with Python, Java, and Perl. I was just wondering what the advantage is of having the prototype of a function above main() and the definition of that function below main() rather than just having the definition of said function above main(). From what I have read, the definition can act as a prototype as well.
Thanks in advance,
The use of a prototype above main() (within a single module) is mostly a matter of personal preference. Some people like to see main() at the bottom of the file; other people like to see it at the top. I had a professor in university who complained that I wrote my programs "upside-down" because I put main() at the bottom (and avoided having to write and maintain prototypes for everything).
There is one situation where a prototype may be required:
void b(); // prototype required here
void a()
{
b();
}
void b()
{
a();
}
In this mutually recursive situation, you need at least one of the prototypes to appear prior to the definition of the other function.
When someone else wants to use your library, you can give them the header (containing the prototypes), without giving them your code.
Example? Windows SDK.
If two functions call each other, you will need to have a prototype for at least one of them.
There is no performance impact* on the compiled code.
*Note: Well, there could be, I guess. If the compiler decides to put the method somewhere else in the binary, then you could theoretically see locality issues affecting the performance of your application. It's not something to normally worry about, though.
There's no advantage in splitting a function into a standalone prototype and a definition. In fact, there's a clear and explicit disadvantage to that approach, since it requires extra maintenance efforts to keep the prototype and the definition in perfect sync. For this reason, a reasonable approach would be to provide a separate prototype only when you have to. For example, with external functions declared in header files you have no other choice but to keep the prototype and definition separate.
As for functions internal to a single translation unit (the ones that we'd normally declare static) there's absolutely no reason to create a standalone prototype, since the definition of the function itself includes a prototype. Just define your functions in the natural order: low level functions first, high level functions last, and that's it. If your program consists of a single translation unit, the main function will end up at the very bottom.
The only case you'd have to declare a standalone prototype in this approach would be if some form of recursion was involved.
If you write your code in the correct order, i.e. main last, then there is only one definition in your code for each function. This prevents having to change the code in two, or more, places if changes ever need to be made. That prevents errors and reduces the work to maintain it. The principle is called "single source of truth"

Reasons to use Static functions and variables in C

I wonder about the use of the static keyword as scope limiting for variables in a file, in C.
The standard way to build a C program as I see it is to:
have a bunch of c files defining functions and variables, possibly scope limited with static.
have a bunch of h files declaring the functions and possibly variables of the corresponding c file, for other c files to use. Private functions and variables are not published in the h file.
every c file is compiled separately to an o file.
all o files are linked together to an application file.
I see two reasons for declaring a gobal as static, if the variable is not published in the h file anyway:
one is for readability. Inform future readers including myself that a variable is not accessed in any other file.
the second is to prevent another c file from redeclaring the variable as extern. I suppose that the linker would dislike a variable being both extern and static. (I dislike the idea of a file redeclaring a variable owned by someone else as extern, is it ok practice?)
Any other reason?
Same goes for static functions. If the prototype is not published in the h file, other files may not use the function anyway, so why define it static at all?
I can see the same two reasons, but no more.
When you talk about informing other readers, consider the compiler itself as a reader. If a variable is declared static, that can affect the degree to which optimizations kick in.
Redefining a static variable as extern is impossible, but the compiler will (as usual) give you enough rope to hang yourself.
If I write static int foo; in one file and int foo; in another, they are considered different variables, despite having the same name and type - the compiler will not complain but you will probably get very confused later trying to read and/or debug the code. (If I write extern int foo; in the second case, that will fail to link unless I declare a non-static int foo; somewhere else.)
Global variables rarely appear in header files, but when they do they should be declared extern. If not, depending on your compiler, you risk that every source file which includes that header will declare its own copy of the variable: at best this will cause a link failure (multiply-defined symbol) and at worst several confusing cases of overshadowing.
By declaring a variable static on file level (static within function has a different meaning) you forbid other units to access it, e.g. if you try to the variable use inside another unit (declared with extern), linker won't find this symbol.
When you declare a static function the call to the function is a "near call" and in theory it performs better than a "far call". You can google for more information. This is what I found with a simple google search.
If a global variable is declared static, the compiler can sometimes make better optimizations than if it were not. Because the compiler knows that the variable cannot be accessed from other source files, it can make better deductions about what your code is doing (such as "this function does not modify this variable"), which can sometimes cause it to generate faster code. Very few compilers/linkers can make these sorts of optimizations across different translation units.
If you declare a variable foo in file a.c without making it static, and a variable foo in file b.c without making it static, both are automatically extern which means the linker may complain if you initialise both, and assign the same memory location if it doesn't complain. Expect fun debugging your code.
If you write a function foo () in file a.c without making it static, and a function foo () in file b.c without making it static, the linker may complain, but if it doesn't, all calls to foo () will call the same function. Expect fun debugging your code.
My favorite usage of static is being able to store methods that I wont have to Inject or create an object to use, the way I see it is, Private Static Methods are always useful, where public static you have to put some more time in thinking of what it is your doing to avoid what crazyscot defined as, getting your self too much rope and accidentally hanging ones self!
I like to keep a folder for Helper classes for most of my projects that mainly consist of static methods to do things quickly and efficiently on the fly, no objects needed!

Resources