For a large software developed by C, we first declare all the self-defined functions in a separate header file (e.g. myfun.h). After that, once we write a code (e.g. main.c) that uses the functions listed in myfun.h, we have to #include "myfun.h". I'm wondering how it works, because even if I include the function names declared in header file before the main body, the code cannot see the function details in main.c. I guess it will search the library to get the function details...Am I right?
When you say "it will search the library for the function details" you're not far off, but that isn't quite right. A function declaration, i.e.. a function prototype only contains enough information for the compiler to do two things:
First, the compiler will register the function as a known identifier so that it knows what you're taking about when you call it, as opposed to a random string of letters with parentheses (to the compiler, they are essentially the same thing without a function prototype for either - an error).
Second, the compiler uses the function prototype for checking code correctness. Correctness in this sense means that a function call will match the prototype in both arity and type. In other words a function call to int square(int a, int b); will have two arguments, both integers.
The program doesn't "search the library," though. Function names without parentheses are not function calls but rather function's address. Therefore, when you call a function, the processor jumps to the memory location of the function. (This assumes the function has not been inlined.)
Where is this function located though? It depends. If you wrote the function in the same module, i.e... a .c file that got compiled into an object linked with the main.c file into a single executable, then the location of the function will be somewhere in the .TEXT section of the executable. In other words, it's just a slight offset from the main function's entry point. In a huge project this offset won't be so slight, but it will be shorter than the offset of separate objects.
Having said that, if you compiled this hypothetical function into a DLL which you call from your main program, then the function's address will be determined in one of two ways:
Either you will have generated a .lib/.a? (depending on whether you're on Windows or Linux) file containing the function declaration's and addresses, or:
You will use run-time linking where the main program will calculate the function addresses when it loads the .dll/.so into its address space. First, it will determine where to load it. You can set DLL's to have preferred offsets to optimize load time. Otherwise, libraries will start loading from the first segment available and any additional libraries will need their function address recalculated using this new address, hampering initial load times. Once they are loaded into the program's memory though, there shouldn't be any performance hits thereafter.
Going back to the preprocessor, it's important to note two things. First, it runs before any compilation takes place. This is important. Since the program is not really being "compiled" when the preprocessor is doing its thing, macros are not type-safe. (Insert Haskell joke about C "type safety") This is why you don't -or shouldn't- see macros in C++. Anything that can be accomplished with macros in C can be accomplished by const and inline functions in C++, with the added benefit of type safety.
Second, the preprocessor is almost just a search and replace engine. For example, in the following code, nothing happens because the preprocessor if statement evaluates to false, since I never defined anything. The preprocessor removes the code in this section. Remember that since the compiler has not run in earnest yet, this removed code will not be compiled. This fact is usually utilized to implement functions for debugging or logging in debug builds. In release builds the preprocessor definition is then manipulated such that the debug code is not included.
#include <stdio.h>
#include <stdlib.h>
int main()
{
#if TRUE
printf("Hello, World!");
#endif
return EXIT_SUCCESS;
}
In fact, the EXIT_SUCCESS macro I used is defined in stdlib.h, and replaced by 0. (EXIT_FAILURE =1).
Back in the day, the preprocessor was used as duct tape, basically, to compensate for faults in C.
For example, since const values can't be used as array sizes, macros were used instead, like this:
// Not valid C89, possibly even C99
const int DEFAULT_BUFFER_SIZE = 128;
char user_input[DEFAULT_BUFFER_SIZE];
// Legal since the dawn of time
#define DEFAULT_BUFFER_SIZE 128
char user_input[DEFAULT_BUFFER_SIZE];
Another significant use of the preprocessor was for code portability, for example:
#ifdef WIN32
// Do windows things
#elif
// Handle other OS
#endif
One trick was to define a generic function and set it to the appropriate OS-dependent one (Remember that functions without the parentheses represent the function's address, not an actual function call), like this:
void RequestSomeKernelAction();
#ifdef WIN32
RequestSomeKernelAction = WindowsVersion;
#else
RequestSomeKernelAction = OtherOSFunction;
#endif
This is all to say that the code you see in header files follows these same rules. If I have the following header file:
#ifndef SRC_INCLUDES_TEST_H
#define SRC_INCLUDES_TEST_H
int square(int a);
#endif /** SRC_INCLUDES_TEST_H */
And I have this main.c file:
#define SRC_INCLUDES_TEST_H
#include "test.h"
int main()
{
int n = square(4);
}
This program will not compile. The square function will not be known to main.c because while I did include the header file where square is declared, my #define SRC_INCLUDES_TEST_H statement tells the preprocessor to copy all the header file contents over to main except those in the block where SRC_INCLUDES_TEST_H is defined, i.e... nothing.
These preprocessor commands can be nested, and there are several, which I highly recommend you look up, if only for historical or pedagogical reasons.
The last point I will make is that while the C preprocessor has its faults, it was a powerful tool in the right hands, and in fact, the first C++ compiler Bjarne Stroustroup wrote was essentially just a preprocessor.
Related
I have heard that, when you have just 1 (main.c) file (or use a "unity build"), there are benefits to be had if you make all your functions static.
I am kind of confused why this (allegedly) isn't optimized by default, since it's not probable that you will include main.c into another file where you will use one of its functions.
I would like to know the benefits and dangers of doing this before implementing it.
Example:
main.c
static int my_func(void){ /*stuff*/ }
int main(void) {
my_func();
return 0;
}
You have various chunks of wisdom in the comments, assembled here into a Community Wiki answer.
Jonathan Leffler noted:
The primary benefit of static functions is that the compiler can (and will) aggressively inline them when it knows there is no other code that can call the function. I've had error messages from four levels of inlined function calls (three qualifying “inlined from” lines) on occasion. It's staggering what a compiler will do!
and:
FWIW: my rule of thumb is that every function should be static until it is known that it will be called from code in another file. When it is known that it will be used elsewhere, it should be declared in a header file that is included both where the function is defined and where it is used. (Similar rules apply to file scope variables — aka 'global variables'; they should be static until there's a proven need for them elsewhere, and then they should be declared in a header too.)
The main() function is always called from the startup code, so it is never static. Any function defined in the same file as an unconditionally compiled main() function cannot be reused by other programs. (Library code might contain a conditionally compiled test program for the library function(s) defined in the source file — most of my library code has #ifdef TEST / …test program… / #endif at the end.)
Eirc Postpischil generalized on that:
General rule: Anytime you can write code that says the use of something is limited, do it. Value will not be modified? Make it const. Name only needs to be used in a certain section? Declare it in the innermost enclosing scope. Name does not need to be linked externally? Make it static. Every limitation both shrinks the window for a bug to be created and may remove complications that interfere with optimization.
For reasons out of my control, I have to implement this function in my C code:
double simple_round(double u)
{
return u;
}
When this function is called, is it ignored by the compiler, or does the call take place anyway? For instance:
int y;
double u = 3.3;
y = (int)simple_round(u*5.5); //line1
y = (int)u*5.5; //line2
Will both lines of code take the same time to be executed, or will the first one take longer?
Because the function is defined in a different C file from where it's used, if you don't use link-time optimization, when the compiler calls the function call it won't know what the function does, so it will have to actually compile the function call. The function will probably just have two instructions: copy the argument to the return value, then return.
The extra function call may or may not slow down the program, depending on the type of CPU and what else the CPU is doing (the other instructions nearby)
It will also force the compiler to consider that it might be calling a very complicated function that overwrites lots of registers (whichever ones are allowed to be overwritten by a function call); this will make the register allocation worse in the function that calls it, perhaps making that function longer and making it need to do more memory accesses.
When this function is called, is it ignored by the compiler, or does the call take place anyway?
It depends. If the function definition is in the same *.c file as the places where it's called then the compiler most probably automatically inlines it, because it has some criteria to inline very simple functions or functions that are called only once. Of course you have to specify a high enough optimization level
But if the function definition is in another compilation unit then the compiler can't help unless you use link-time optimization (LTO). That's because in C each *.c file is a separate compilation unit and will be compiled to a separate object (*.o) file and compilers don't know the body of functions in other compilation units. Only at the link stage the unresolved identifiers are filled with their info from the other compilation units
In this case the generated code in a *.c file calls a function that you can change in another *.c file then there are many more reliable solutions
The most correct method is to fix the generator. Provide evidences to show that the function the generated code calls is terrible and fix it
In case you really have no way to fix the generator then one possible way is to remove the generated *.c file from the compilation list (i.e. don't compile it into *.o anymore) and include it in your own *.c file
#define simple_round(x) (x)
#include "generated.c"
#undef simple_round
Now simple_round() calls in generated.c will be replaced with nothing
If the 'generated' code has to be compiled anyway, perhaps you can 'kludge' a macro, Macro, that redefines the call to the 'inefficient' rounding function made by that code.
Here's a notion (all in one file). Perhaps the #define can be 'shimmed in' (and documented!) into the makefile entry for that single source file.
int fnc1( int x ) { return 5 * x; }
void main( void ) {
printf( "%d\n", fnc1( 5 ) );
#define fnc1(x) (x)
printf( "%d\n", fnc1( 7 ) );
}
Output:
25
7
I would like to write a C library with fast access by including just header files without using compiled library. For that I have included my code directly in my header file.
The header file contains:
#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#ifndef INC_TEST_H_
#define INC_TEST_H_
void test(){
printf("hello\n");
}
#endif
My program doesn't compile because I have multiple reference to function test(). If I had a correct source file with my header it works without error.
Is it possible to use only header file by including code inside in a C app?
Including code in a header is generally a really bad idea.
If you have file1.c and file2.c, and in each of them you include your coded.h, then at the link part of the compilation, there will be 2 test functions with global scope (one in file1.c and the other one in file2.c).
You can use the word "static" in order to say that the function will be restricted so it is only visible in the .c file which includes coded.h, but then again, it's a bad idea.
Last but not least: how do you intend to make a library without a .so/.a file? This is not a library; this is copy/paste code directly in your project.
And when a bug is found in your "library", you will be left with no solution apart correcting your code, redispatch it in every project, and recompile every project, missing the very point of a dynamic library: The ability to "just" correct the library without touching every program using it.
If I understand what you're asking correctly, you want to create a "library" which is strictly source code that gets #incuded as necessary, rather than compiled separately and linked.
As you have discovered, this is not easy when you're dealing with functions - the compiler complains of multiple definitions (you will have the same problem with object definitions).
You have a couple of options at this point.
You could declare the function static:
static void test( void )
{
...
}
The static keyword limits the function's visibility to the current translation unit, so you don't run into multiple definition errors at link time. It means that each translation unit is creating its own separate "instance" of the function, leading to a bit of code bloat and slightly longer build times. If you can live with that, this is the easiest solution.
You could use a macro in place of a function:
#define TEST() (printf( "hello\n" ))
except that macros are not functions and do not behave like functions. While macro-based "libraries" do exist, they are not trivial to implement correctly and require quite a bit of thought. Remember that macro arguments are not evaluated, they're just expanded in place, which can lead to problems if you pass expressions with side effects. The classic example is:
#define SQUARE(x) ((x)*(x))
...
y = SQUARE(z++);
SQUARE(z++) expands to ((z++)*(z++)), which leads to undefined behavior.
Separate compilation is a Good Thing, and you should not try to avoid it. Doing everything in one source file is not scalable, and leads to maintenance headaches.
My program do not compiled because I have multiple reference to test() function
That is because the .h file with the function is included and compiled in multiple C source files. As a result, the linker encounters the function with global scope multiple times.
You could have defined the function as static, which means it will have scope only for the curent compilation unit, so:
static void test()
{
printf("hello\n");
}
I'm a beginner to C, but I've had a bit of experience with some other programing languages like Ruby and Python. I would very much like to create some of my own functions in C that I could use in any of my programs that just make life easier, however I'm a little bit confused about how to do this.
From what I understand the first part of this process is to create a header file that contains all of your prototypes, and I understand that, however from what I understand it is frowned upon to include anything other than declarations in your header files, so would you also need to create a .c file that contained the actual code and then #include that in all your programs along with the header file? But if so, why would you need a header file in the first place, since defining a function also declares it?
Finally, what should you put in the main() function of your header file? Do you just leave it blank, or do you not include it?
Thanks!
The declaration of a function lets the compiler know that at link time such a function will be available. The definition of the function provides that implementation, and additionally it also serves as the declaration. There is no harm in having multiple declarations, but only one implementation can be provided. Also, at least one declaration (or the only implementation) must come before any use of the function - this alone makes forward declarations necessary in cases where two functions call one another (both cannot be before the other).
So, if you have the implementation:
int foo(int a, int b) {
return a * b;
}
The corresponding declaration is simply:
int foo(int a, int b);
(The argument names do not matter in the declaration, i.e., they can be omitted or different than in the implementation. In fact you could declare only int foo(); and it would work for the above function, but this is mainly a legacy thing and not recommended. Note that to declare a function that takes no arguments, put void in the argument list, e.g., int bar(void);)
There are a number of reasons why you would want to have separate headers with only the declaration:
The implementation may be in a separate file, which allows for organisation of code into manageable pieces, and may be compiled by itself and need not be recompiled unless that file has changed - in large projects where the total compilation time can be an hour it would be absurd to re-compile everything for a small change.
The implementation source may not be available, e.g., in case of a closed-source proprietary library.
The implementation may be in a different language with a compatible calling convention.
For practical details on how to write code in multiple files and how to use libraries, please consult a book or tutorial on C programming. As for main, you need not declare it in a header unless you are specifically calling main from another function - the convention of C programs is to call main as int main(int, char**) at start of the execution.
When compiling, each .c-file (or .cpp-file) will be compiled to an own binary first.
If one binary file is using functions from another,
it just knows "there is something outside named xyz" at that time.
Then the linker will put them together in one file and rewrite the parts of each file
which are using functions of other files,
so that they actually know where to find the used functions.
What will happen if you put code in a .h file:
At compilation time, each included h-file in a c-file will be integrated in the c-file.
If you have code for xyz in a h-file and you´re including it in more thana one c-file,
each of this compiled c-files will have a xyz. Then, the linker will be confused...
So, function code have to be in a own c file.
Why use a h-file at all?
Because, if you call xyz in your code, how should the compiler know
if this is a function of another c-file (and which parameters...)
or an error because xyz does not exist?
The reason for header files in c are for when you need the same code in multiple scripts. So if you are just repeated the same code in one script then yes it would be easier to just use a function. Also for header files, yes you would need to include a .c file for all the computations.
As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 10 years ago.
At first, I was writing my function in an .h file and then include it with #include "myheader.h". Then, someone said me that it's better to add to these files only functions prototypes and to put the real code in a separate .c file.
Now, I'm able to compile more .c files to generate only an executable, but at this point I can't understand why I should add the header files if the code is in another file.
Moreover, I had a look to standard C libraries (like stdlib.h) in the system and it seemed to me to store only structure definitions, constants and similar... I'm not so good with C (and to be honest, stdlib.h was almost Chinese to me, no offense for Chinese of course :) ), but I didn't spot any single line of 'operative' code. However, I always just include it without adding anything else and I compile my files as if the 'code' was actually there.
Can someone please explain me how do these things work? Or, at least, point me to a good guide? I searched on Google and SO, also, but didn't find anything that explains it clearly.
When you compile C code, the compiler has to actually know that a particular function exists with a defined name, parameter list, return type and optional modifiers. All these things are called function signature and the existence of a particular function is declared in the header file. Having this information, when the compiler finds a call to this function, it will know which type of parameters to look for, can control whether they have the appropriate type and prepare them to a structure that will be pushed to the stack before the code actually jumps to your function implementation. However the compiler does not have to know the actual implementation of the function, it simple puts a "placeholder" in your object file to all function calls. (Note: each c files compiles to exactly one object file). #include simple takes the header file and replaces the #include line with the contents of the file.
After the compilation the build script passes all object files to the linker. The linker will be that resolves all function "placeholders" finding the physical location of the function implementation, let them be among your object files, framework libraries or dlls. It simple places the information where the function implementation can be found to all function calls thus your program will know where to continue execution when it arrives to your function call.
Having said all this it should be clear why you can't put function definition in the header files. If later you would #include this header in to more then one c file, both of them would compile the function implementation into two separate object files. The compiler would work well, but when the linker wanted to link together everything, it would find two implementation of the function and would give you an error.
stdlib.h and friends work the same way. The implementation of the functions declared in them can be found in framework libraries which the compiler links to your code "automatically" even if you are not aware of it.
The C language (together with C++) uses a quite obsolete strategy for making the compiler know the functions defined elsewhere.
This strategy goes like this: the signatures of the functions etc. (this stuff is called declarations in C) go into a special file called header, and every other file which wants to use them is expected to almost literally include that header into the file (actually, #include directive just tells the compiler to include the literal text of the header), so that the compiler sees again the function declarations.
Other languages solve this problem in a different way: compiler sees all the source code, and remembers the metadata of the already compiled classes itself.
The strategy used in C shifts the task of finding all the dependencies from the compiler to the developer; it's a legacy from the old times when the computers were small, silly and slow, so this kind of help from the developer was really valuable.
Although this strategy has numerous drawbacks, and besides it's theoretically possible to change it now, the standard is not going to change, because gigabytes of code have been written in that style already.
tl;dr: it's a legacy from the 70's.
In C it is required that a function is declared before it is called. The reason this is required was that in the 70s it would just take too much time to first parse a file for all its symbols and then parse it a second time to actually compile the code. If all functions are declared before they are called one single parse is enough. However on modern system we no longer face those limitations and that is the reason why mondern languages don't have this requirement.
Imagine you have 2 files a.c b.c in your project. You implement a function foo which you want to use in both files. You can't just define the function in a.c and use it in b.c since you have to declare a function before you call it. So you would add a line void foo(); to b.c. But everytime you change the signature of your function in a.c you would have to change the declaration in b.c. To circumvent this issue it is standard strategy in C to declare all functions your file implements in a seperate header file (in this case a.h. The header file is then included by all other files who want to use that code (so b.c would use this: #include "a.h").
An #include is a preprocessor directive that causes the file to be textually inserted at the point where the #include occurs.
When linking multiple .c files that include the same header files, care must be taken to avoid multiple inclusions of the header files (textually inserting a header file more than once). The #ifndef, #define, and #endif preprocessor directives can be used to prevent multiple inclusions.
#ifndef MY_FILE_H
#define MY_FILE_H
/* This code will not be included more than once. */
#endif /* !MY_FILE_H */
I can't understand why I should add the header files if the code is in another file.
The header file contains the declarations for functions defined in the other file, which is necessary for the code that's calling the function to compile correctly.
For instance, suppose I write the following code:
int main(void)
{
double *foo = malloc(sizeof *foo * 10);
if (foo)
{
// do something with foo
free (foo);
}
return 0;
}
malloc is a standard library function that dynamically allocates memory and returns a pointer to it. The return type of malloc is void *, any value of which can be assigned to any other pointer type. free is another standard library function that deallocates memory allocated through malloc, and its return type is void (IOW, no return value).
However, the compiler doesn't automatically know what malloc or free return (or don't return); it needs to see the declarations for both functions in the current scope before it can correctly translate the function calls. Under the C89 standard and earlier, if a function is called without a declaration in scope, the compiler assumes that the function returns int; since int is not compatible with double * (you can't assign one to the other directly without a cast), you'll get an "incompatible assignment" diagnostic. Under C99 and later, implicit declarations aren't allowed at all. Either way the compiler won't translate the code.
I need to add the line
#include <stdlib.h>
which includes the declarations for malloc and free (and a bunch of other stuff) to the beginning of the file.
There are several reasons you don't want to put function definitions (or variable definitions) in header files. Suppose you define function foo in header a.h. You include a.h in files a.c and b.c. Each file will compile okay individually, but when you try to link them together to build a library or executable, you'll get a "multiple definition" error from the linker -- you've wound up creating two separate instances of a function with the same name, which is a no-no. Same goes for variable definitions.
It also doesn't scale well. If you put a bunch of functions in their own header files and include them in one source file, you're translating all those functions in one big glob. Furthermore, if you only change the code in the source file or one header file, you still wind up recompiling everything each time you recompile the .c file. By putting each function in it's own .c file, you can reduce your overall build times by only recompiling the files that need to be recompiled.