Is the purpose of header files in C only to warn users? - c

I'm a beginner at linking, sorry if my questions are too basic. Let's say I have two .c files:
file1.c is
int main(int argc, char *argv[])
{
    int a = function2();
    return 0;
}
file2.c is
int function2()
{
    return 2018;
}
I know the norm is to create a file2.h and include it in file1.c, and I have some questions:
Q1. Adding the #include to file1.c doesn't seem to make much difference or improve much for me. I can still compile file1.c correctly without file2.h; the compiler will just warn me about an 'implicit declaration of function function2'. But does this warning help a lot? Programmers will know that function2 is defined in some other .c file (if you use function2 but don't define it, you certainly know the definition is somewhere else), and the linker will do its job to produce the final executable. So is the only purpose of including file2.h, as I understand it, to not show any warning during compilation? Is my understanding correct?
Q2. Imagine this scenario: a programmer defines a function2 in file1.c and doesn't know that his function2 conflicts with the one in file2.c until the linker throws an error (obviously he can compile his file1.c alone correctly). But if we want him to know his mistake when he compiles file1.c, adding file2.h still doesn't help, so what's the purpose of adding the header file?
Q3. What should we add to let the programmer know he should choose a different name for function2, rather than being informed of the error by the linker at the final stage?

Per C89 3.3.2.2 Function calls, emphasis mine:
If the expression that precedes the parenthesized argument list in a function call consists solely of an identifier, and if no declaration is visible for this identifier, the identifier is implicitly declared exactly as if, in the innermost block containing the function call, the declaration
extern int identifier();
appeared
Now, remember: an empty parameter list (nothing inside the parentheses) declares a function that takes an unspecified number and type of arguments. Write void inside the parentheses to declare that a function takes no arguments, like int func(void).
Q1:
does this warning help a lot?
Yes and no. This is a subjective question. It helps those who use it. As a personal note, always make this warning an error: with the gcc compiler, use -Werror=implicit-function-declaration. But you can also ignore this warning and write the simplest main() { printf("hello world!\n"); } program.
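A minimal sketch using the file names from the question (the compile commands are shown only as comments):
/* file1.c -- no declaration of function2 is visible here */
int main(void)
{
    int a = function2();   /* C89: implicitly declared as extern int function2(); */
    return a;
}
/* Compiling with plain "gcc -c file1.c" only warns about the implicit      */
/* declaration; "gcc -Werror=implicit-function-declaration -c file1.c"      */
/* turns that warning into a hard error, so the mistake cannot be ignored.  */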
the linker will do its job to produce the final executable? so the only purpose of including file2.h is to not show any warning during compilation, is my understanding correct?
No. In cases where the function is called through a different/incompatible function type, it invokes undefined behavior. If the function is declared as void (*function2(void))(int a); then calling ((int(*)())function2)() is UB, just as calling function2() without a previous declaration is. Per Annex J.2 (informative):
The behavior is undefined in the following circumstances:
A pointer is used to call a function whose type is not compatible with the pointed-to type (6.3.2.3).
and per C11 6.3.2.3p8:
A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the referenced type, the behavior is undefined.
So in your lucky case of int function2() this indeed works. It also works, for example, for the atoi() function. But calling atol() without a declaration will invoke undefined behavior.
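A sketch of why atoi() happens to survive while atol() may not, under the C89 implicit-declaration rules and assuming a platform where long is wider than int:
/* No #include <stdlib.h>, so both functions are implicitly treated as returning int. */
int main(void)
{
    int  a = atoi("2018");   /* atoi really does return int: happens to work       */
    long b = atol("2018");   /* atol returns long, but is called through an        */
                             /* implicit "int atol()" declaration: undefined       */
                             /* behavior, the result may be truncated or garbage   */
    (void)a;
    (void)b;
    return 0;
}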
Q2:
the linker throws the error
This should happen, but it is really linker dependent. If you compile all the sources in a single step with the gcc compiler, it will throw an error. But if you create static libraries and then link them with the gcc compiler without -Wl,--whole-archive, it will pick the first definition it sees; see this thread.
what's the purpose of adding header file?
I guess simplicity and order. It is a convenient and standard way to share data structures (enums, structs, typedefs), declarations (function and variable types), and preprocessor directives between developers and libraries. Imagine you are writing a big library with over 1000 files that will work with over 100 other libraries. At the beginning of each file, would you write struct mydata_s { int member1; int member2; ... }; int printf(const char*, ...); int scanf(const char *, ...); and so on, or just #include "mydata.h" and #include <stdio.h>? If you needed to change the mydata_s structure, you would have to change every file in your project, and all the other developers who use your library would need to change their copies of the definition too. I don't say you can't do it, but it would be much more work and no one would use your library.
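A sketch of what such a shared header could look like (mydata.h and the two function names are hypothetical):
/* mydata.h */
#ifndef MYDATA_H
#define MYDATA_H

struct mydata_s {
    int member1;
    int member2;
};

int mydata_init(struct mydata_s *data);          /* hypothetical API, for illustration */
int mydata_print(const struct mydata_s *data);

#endif /* MYDATA_H */
Every .c file then just writes #include "mydata.h", and a change to struct mydata_s lives in exactly one place.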
Q3:
What should we add to let the programmer know he should choose a different name for function2, rather than being informed of the error by the linker at the final stage?
In case of name clashes you will (hopefully) be informed by the linker that it found two symbols with the same name. You would need to create a tool that checks your sources for exactly that. I don't really see the need for it: the linker is specifically made to resolve symbols, so it naturally handles the case where two symbols with the same identifier exist.

Short answer:
Takeaway: the earlier the compiler alerts, the better.
Q1: meaning of .h: consistency and early alerts. Alerting early on common ways of going wrong improves the reliability of code and adds up to less debugging and fewer production crashes.
Q2: Clashing names bring early alerts to developers, which are usually easier to fix.
Q3: Early duplicate-definition alerts are not baked into the C standard.
Exercises:
1. Define a function in one file that prints an int argument with printf("%d\n", i), then call that function from another file with a float argument of 42.0f (a sketch follows this list).
2. Call it with (double)42.0.
3. Define a function with a char *str argument printed with %s, then call it with an int argument.
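A sketch of exercise 1, assuming the calling file has no prototype in scope (the file and function names are made up):
/* printer.c */
#include <stdio.h>
int print_int(int i) { printf("%d\n", i); return 0; }

/* caller.c -- compiled separately, no prototype of print_int() visible */
int main(void)
{
    print_int(42.0f);   /* the float is promoted to double and passed where an int */
                        /* is expected: undefined behavior, typically prints       */
                        /* garbage instead of 42                                   */
    return 0;
}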
Longer answers:
Popular convention: in typical use the name of the .h file is derived from the .c file, or files, it is associated with: file.h and file.c. For .h files with many declarations, say string.h, derive the file name from a higher-level perspective of what's within (as with the str... functions).
My big rule: it's always better to structure your code so compilers can alert on bugs immediately at compile time rather than letting them slide through to debug or run time, where they depend on the code actually running in just the right way to be found. Run-time errors can be very difficult to diagnose, especially if they hit long after the program is in production, and they are expensive to maintain and bring down your customer experience. See "Yoda notation".
Q1: meaning of .h: consistency, early alerts, and improved reliability of code.
C .h files allow developers of .c files compiled at different times to share common declarations, with no duplicated code. .h files also allow functions to be called consistently from all files while flagging improper argument signatures (argument counts, type mismatches, etc.). Having the .c files that define functions also #include the .h file helps assure that the arguments in the definition are consistent with the calls; this may sound elementary, but without it all the human errors of signature clashes can sneak through.
Omitting .h files only works if the argument signatures of all callers perfectly match those in the definitions. This is often not the case, so without .h files any clashing signatures would produce bad values, unless you also had parallel externs in the calling file (bad bad bad). Things like int vs. float can produce spectacularly wrong argument values. Bad pointers can produce segmentation faults and other total crashes.
Advantage: with externs in .h files, compilers can correctly convert mismatched arguments to the declared types, assuring better calls. While you can still botch arguments, it's much less likely. It also helps avoid cases where the mismatches work on one implementation but not another.
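A minimal sketch of that conversion, with a made-up function scale():
/* scale.h */
double scale(double x);

/* scale.c */
#include "scale.h"
double scale(double x) { return x * 2.0; }

/* caller.c */
#include "scale.h"
int main(void)
{
    double d = scale(3);   /* prototype in scope: the int 3 is converted to 3.0   */
    return (int)d;         /* without the prototype, 3 would be passed as an int  */
                           /* while scale() reads a double: garbage on most ABIs  */
}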
Implicit-declaration warnings are hugely helpful to me, as they usually indicate I've forgotten a .h file or spelled an external name wrong.
Q2: Clashing Names. Early alerts.
Clashing names are bad, and it is the developer's responsibility to avoid problems. C++ solves the issue with namespaces, which C, being a lower-level language, does not have.
Use of .h files can let compiler diagnostics alert developers to clashes early in the game. If compiler diagnostics don't do this, hopefully linkers will do so with multiply-defined-symbol errors, but this is not guaranteed by the standard.
A common way to fake namespaces is to start all potentially clashing definitions in a .h with some prefix (extern int filex_function1(int arg, char *string) or #define FILEX_DEF 42).
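A sketch of that convention, with filex standing in for the library name:
/* filex.h -- every exported name carries the filex_/FILEX_ prefix */
#ifndef FILEX_H
#define FILEX_H

#define FILEX_DEF 42

extern int filex_function1(int arg, char *string);
extern int filex_function2(void);

#endif /* FILEX_H */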
What to do if two different external libraries being used share the same names is beyond the scope of this answer.
Q3: early duplicate alerts. Sorry... early alerts are implementation dependent.
This would be difficult for the C standard to define. As C is an old language, there are many creative and different ways C programs are written and stored.
Hunting for clashing names before using them is up to the developer. Tools like cross-reference programs can help. Even something simple like ctags used with vim or emacs can help.

You misunderstand the usage of header files and function prototypes.
Header files are needed to share common information between multiple code files. Such information includes macro definitions, data types, and, possibly, function prototypes.
Function prototypes are needed for the compiler to correctly handle return data types and to give you early warnings about misuse of function return types and arguments.
Function prototypes can be declared in header files, or they can be declared in the files which use them (more typing).
You have a very simple example, with just 2 files. Now imagine a project with hundreds of files and thousands of functions. You would be lost in linker errors.
C allows you to use an undeclared function for legacy reasons. In this situation it assumes that the function has a return type of int. However, modern data types have a bigger variety than in the early days. A function can return pointers, 64-bit data, or structures. To express that you must use prototypes, or nothing will work: the compiler has to know how to handle function returns correctly.
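A sketch of the return-type problem with a made-up function make_buffer(); under the legacy implicit-int rules the compiler assumes the function returns int, which is fatal on platforms where pointers are wider than int:
/* buffer.c */
#include <stdlib.h>
char *make_buffer(void) { return malloc(128); }

/* user.c -- no prototype in scope, so make_buffer() is assumed to return int */
#include <stdlib.h>
int main(void)
{
    char *p = make_buffer();   /* a 64-bit pointer squeezed through an int:        */
                               /* undefined behavior, likely a broken pointer (the */
                               /* compiler at least warns about int-to-pointer)    */
    free(p);
    return 0;
}
/* Declaring "char *make_buffer(void);" (ideally in a buffer.h) fixes this. */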
Also, it can give you warnings about incorrect use of argument types. Due to legacy, those are still warnings in C, but they were addressed in early C++ and converted to errors.
Those warnings give you early debugging capabilities. Type-mismatch warnings can save you days of debugging in some cases.
So, in your example you do not need the header file. You can prototype the function in the 'main' file using the extern syntax. You can even do without prototyping. However, in the real modern programming world you cannot allow the latter, in particular when you work in a team or want your program to be maintainable.
It is a good idea to store your function prototypes in header files. They make a good documentation source, in particular with good comments. By the way, function names must make sense to be maintainable.

Q1. Yes. C is a low-level language and was historically used to bind low-level constructs into higher-level concepts. For example, traditionally the label _end is at the last address in a program. The label is typeless, but you can declare it as any type that is convenient to you. A "properly typed" language would make this sort of abuse difficult.
Q2. By convention, both file1.c and file2.c would include file2.h; one as consumer, the other as producer. Following this simple idiom will catch declaration vs. definition errors, although again, the "warning" is not necessarily enforced.
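A sketch of that idiom for the files in the question:
/* file2.h */
#ifndef FILE2_H
#define FILE2_H
int function2(void);
#endif

/* file2.c -- producer: includes its own header, so a drifting signature is diagnosed */
#include "file2.h"
int function2(void) { return 2018; }

/* file1.c -- consumer: includes the same header and gets the checked declaration */
#include "file2.h"
int main(void)
{
    int a = function2();
    return a == 2018 ? 0 : 1;
}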
Q3. Many software organizations take a "warnings are errors" rule to socially control their programmers.

Related

Benefits and drawbacks of making all functions in main.c static?

I have heard that, when you have just 1 (main.c) file (or use a "unity build"), there are benefits to be had if you make all your functions static.
I am kind of confused why this (allegedly) isn't optimized by default, since it's not probable that you will include main.c into another file where you will use one of its functions.
I would like to know the benefits and dangers of doing this before implementing it.
Example:
main.c
static int my_func(void) { /* stuff */ }

int main(void) {
    my_func();
    return 0;
}
You have various chunks of wisdom in the comments, assembled here into a Community Wiki answer.
Jonathan Leffler noted:
The primary benefit of static functions is that the compiler can (and will) aggressively inline them when it knows there is no other code that can call the function. I've had error messages from four levels of inlined function calls (three qualifying “inlined from” lines) on occasion. It's staggering what a compiler will do!
and:
FWIW: my rule of thumb is that every function should be static until it is known that it will be called from code in another file. When it is known that it will be used elsewhere, it should be declared in a header file that is included both where the function is defined and where it is used. (Similar rules apply to file scope variables — aka 'global variables'; they should be static until there's a proven need for them elsewhere, and then they should be declared in a header too.)
The main() function is always called from the startup code, so it is never static. Any function defined in the same file as an unconditionally compiled main() function cannot be reused by other programs. (Library code might contain a conditionally compiled test program for the library function(s) defined in the source file — most of my library code has #ifdef TEST / …test program… / #endif at the end.)
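As a sketch of that conditionally compiled test-program pattern (mylib_entry() is a made-up library function; build the test driver with -DTEST):
/* mylib.c */
static int double_it(int x) { return 2 * x; }       /* static: only used in this file */

int mylib_entry(int x) { return double_it(x) + 1; }

#ifdef TEST
#include <stdio.h>
int main(void)
{
    printf("%d\n", mylib_entry(20));   /* prints 41 */
    return 0;
}
#endif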
Eric Postpischil generalized on that:
General rule: Anytime you can write code that says the use of something is limited, do it. Value will not be modified? Make it const. Name only needs to be used in a certain section? Declare it in the innermost enclosing scope. Name does not need to be linked externally? Make it static. Every limitation both shrinks the window for a bug to be created and may remove complications that interfere with optimization.
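A small sketch with all of those limits applied at once:
#include <stdio.h>

static const double SCALE = 2.5;   /* const: never modified; static: no external linkage */

static double scaled(double x)     /* static: only this file calls it                    */
{
    const double bias = 0.5;       /* declared in the innermost scope that needs it      */
    return SCALE * x + bias;
}

int main(void)
{
    printf("%g\n", scaled(4.0));   /* prints 10.5 */
    return 0;
}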

Ansi C - programming language book of K&R - header file inclusion

Going through the K&R ANSI C programming language book (second edition), on page 82 an example is given of a program's files/folders layout.
What I don't understand is, while calc.h gets included in main (use of functions), getop.c (definition of getop) and stack.c (definition of push and pop), it does not get included into getch.c, even though getch and ungetch are defined there.
Although it's a good idea to include the header file, it's not required, as getch.c doesn't actually use any function declared in calc.h; it gets by with those already defined in getch.c itself.
The reason it's a good idea to include the header file anyway is that it provides some safety if you use modern-style prototypes and definitions. The compiler would then complain if, for example, getop weren't defined in getop.c with the same signature as declared in calc.h.
calc.h contains the declarations of getch() and ungetch(). It is included by files that want to use these functions (and therefore need their signatures).
getch.c, instead, contains the definitions of getch() and ungetch(). Therefore, there is no need to include their declarations (which are implicit in the definitions).
The omission you have so aptly discovered can be a source of a real problem. In order to benefit fully from C's static type checking across a multi-translation-unit program (which is almost anything nontrivial), we must ensure that the site which defines an external name (such as a function) as well as all the sites which refer to the name, have the same declaration in scope, ideally from a single source: one header file where that name is declared.
If the definition doesn't have the declaration in scope, then it is possible to change the definition so that it no longer matches the declaration. The program will still translate and link, resulting in undefined behavior when the function is called or the object is used.
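A sketch of how that drift can happen when getch.c does not include calc.h:
/* calc.h -- what every caller sees */
int getch(void);

/* getch.c -- does NOT include calc.h, so nothing stops the definition drifting: */
long getch(void)          /* return type quietly changed from int to long        */
{
    return 42L;           /* this still compiles and links, but every call made  */
}                         /* through the old "int getch(void)" declaration is    */
                          /* undefined behavior                                  */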
If you use the GNU compiler, you can guard against this problem using -Wmissing-prototypes. Straight from the gcc manual page:
-Wmissing-prototypes (C and Objective-C only)
Warn if a global function is defined without a previous prototype
declaration. This warning is issued even if the definition itself
provides a prototype. The aim is to detect global functions that
fail to be declared in header files.
Without that diagnostic, this kind of thing, such as forgetting a header file, can happen to the best of us.
One possible reason why the header was forgotten is that the example project uses the "one big common header" convention. The "one big common header" approach lets the programmer forget all about headers. Everything just sees everything else and the #include "calc.h" which makes it work is just a tiny footnote that can get swallowed up in the amnesia. :)
The other aspect is that the authors had spent a lot of time programming in pre-ANSI "Classic" C without prototype declarations. In Classic C, header files are mainly for common type declarations and macros. The habit is that if a source file doesn't need some type or macros that are defined in some header, then it doesn't need to include that header. A resurgence of that habit could be what is going on here.

Why doesn't the compiler infer function prototypes from function definitions?

I know that it's poor practice to not include function prototypes, but if you don't, then the compiler will infer a prototype based on what you pass into the function when you call it (according to this answer). My question is why does the compiler infer the prototype from what you pass into the function rather than the definition of the function itself? I can imagine some kind of preprocessing step where all declared functions are identified and checked to see if a prototype exists for each one. If one doesn't have a prototype, the first line of the function is copied and stuck under the existing prototypes. Why isn't this done?
Because the C compiler was designed as a single pass compiler, where any given file does not know about the other source files that make up the project.
Although compilers have gotten more sophisticated, and may do multiple passes, the general outline of the compilation process framework remains as it was in K&R's day:
Pre-process each source file (textual work only: #include expansion and macro replacement).
Compile the processed source into an object file.
Link the objects into an executable or library.
Inferring prototypes would have to happen in the first step, but the compiler does not know about the existence of any other objects which may contain the function definition at that time.
It might be possible to make a compiler which did what you suggest, but not without breaking the existing rules for how to infer prototypes. A change with such big consequences would make the language no longer C.
The major use for prototypes is to declare a function and inform the compiler about the number and type of arguments in cases where the definition is not visible. Since C was originally compiled single-pass, the definition is not visible when it occurs later in the translation unit, but the more important case from a modern perspective is when the definition is not visible at all, due to lying in a separate translation unit, possibly even in a library file that exists only in compiled form and where no information about the function's type is recorded.
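A sketch of the single-pass view within one translation unit:
#include <stdio.h>

double half(double x);   /* without this prototype, a C89 compiler reaching the call */
                         /* below would implicitly assume "int half()", because it   */
                         /* has not read the real definition further down yet        */
int main(void)
{
    printf("%g\n", half(5.0));   /* prints 2.5 */
    return 0;
}

double half(double x) { return x / 2.0; }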

How do you create general personalized functions in C and then include them in your program?

I'm a beginner to C, but I've had a bit of experience with some other programming languages like Ruby and Python. I would very much like to create some functions of my own in C that I could use in any of my programs, just to make life easier, but I'm a little bit confused about how to do this.
From what I understand, the first part of this process is to create a header file that contains all of your prototypes, and I understand that. However, from what I understand it is frowned upon to include anything other than declarations in your header files, so would you also need to create a .c file containing the actual code and then #include that in all your programs along with the header file? But if so, why would you need a header file in the first place, since defining a function also declares it?
Finally, what should you put in the main() function of your header file? Do you just leave it blank, or do you not include it?
Thanks!
The declaration of a function lets the compiler know that such a function will be available at link time. The definition of the function provides the implementation, and it also serves as a declaration. There is no harm in having multiple declarations, but only one implementation can be provided. Also, at least one declaration (or the implementation itself) must come before any use of the function - this alone makes forward declarations necessary in cases where two functions call one another (both cannot come before the other).
So, if you have the implementation:
int foo(int a, int b) {
    return a * b;
}
The corresponding declaration is simply:
int foo(int a, int b);
(The argument names do not matter in the declaration, i.e., they can be omitted or different than in the implementation. In fact you could declare only int foo(); and it would work for the above function, but this is mainly a legacy thing and not recommended. Note that to declare a function that takes no arguments, put void in the argument list, e.g., int bar(void);)
There are a number of reasons why you would want to have separate headers with only the declaration:
The implementation may be in a separate file, which allows for organisation of code into manageable pieces, and may be compiled by itself and need not be recompiled unless that file has changed - in large projects where the total compilation time can be an hour it would be absurd to re-compile everything for a small change.
The implementation source may not be available, e.g., in case of a closed-source proprietary library.
The implementation may be in a different language with a compatible calling convention.
For practical details on how to write code in multiple files and how to use libraries, please consult a book or tutorial on C programming. As for main, you need not declare it in a header unless you are specifically calling main from another function - the convention of C programs is to call main as int main(int, char**) at start of the execution.
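A minimal layout for such a personal utility, with made-up names and the build line shown only as a comment:
/* myutil.h */
#ifndef MYUTIL_H
#define MYUTIL_H
int foo(int a, int b);
#endif

/* myutil.c */
#include "myutil.h"
int foo(int a, int b) { return a * b; }

/* program.c */
#include <stdio.h>
#include "myutil.h"
int main(void)
{
    printf("%d\n", foo(6, 7));   /* prints 42 */
    return 0;
}

/* Build sketch: cc -c myutil.c && cc -c program.c && cc myutil.o program.o -o program */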
When compiling, each .c file (or .cpp file) is first compiled to its own object file. If one object file uses functions from another, it just knows "there is something outside named xyz" at that time. Then the linker puts them together into one file and rewrites the parts of each file that use functions from other files, so that they actually know where to find the functions they use.
What happens if you put code in a .h file: at compilation time, each h-file included in a c-file is pasted into that c-file. If you have the code for xyz in an h-file and you include it in more than one c-file, each of these compiled c-files will contain its own xyz, and the linker will be confused... So function code has to go in its own c-file.
Why use an h-file at all? Because if you call xyz in your code, how should the compiler know whether this is a function from another c-file (and which parameters it takes...) or an error because xyz does not exist?
Header files in C are for when you need the same declarations in multiple source files. If you are just repeating the same code within one file, then yes, it would be easier to just use a function. Also, for header files, yes, you would need a .c file for the actual computations.

What's the point of function prototyping?

I'm following a guide to learn curses, and all of the C code in it prototypes its functions before main(), then defines them afterward. In my C++ learning I had heard about function prototyping but never done it, and as far as I know it doesn't make too much of a difference in how the code is compiled. Is it a programmer's personal choice more than anything else? If so, why was it included in C at all?
Function prototyping originally wasn't included in C. When you called a function, the compiler just took your word for it that it would exist and took the type of arguments you provided. If you got the argument order, number, or type wrong, too bad – your code would fail, possibly in mysterious ways, at runtime.
Later versions of C added function prototyping in order to address these problems. Your arguments are implicitly converted to the declared types under some circumstances or flagged as incompatible with the prototype, and the compiler can flag the wrong number or order of arguments as an error. This had the side effect of enabling varargs functions and the special argument handling they require.
Note that, in C (and unlike in C++), a function declared foo_t func() is not the same as a function declared as foo_t func(void). The latter is prototyped to have no arguments. The former declares a function without a prototype.
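A short sketch of the difference under pre-C23 rules, where an empty parameter list leaves the arguments unchecked (foo_t here is just a typedef for int):
typedef int foo_t;

foo_t func_any();        /* no prototype: number and types of arguments unchecked   */
foo_t func_none(void);   /* prototype: declared to take no arguments                */

int main(void)
{
    func_any(1, 2, 3);   /* accepted without a diagnostic, though calling the       */
                         /* zero-argument definition this way is still undefined    */
    /* func_none(1);        would be rejected at compile time                       */
    return 0;
}

foo_t func_any()      { return 0; }
foo_t func_none(void) { return 0; }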
In C, prototyping is needed so that your program knows you have a function called x() before you have gotten to defining it; that way y() knows that an x() exists. C compiles top-down, so a function needs to be declared before it is used - that is the short answer.
void x(void);   /* prototypes: main() and y() now know that x() and y() exist */
void y(void);

int main(void) {
    y();
    return 0;
}

void y(void) {
    x();
}

void x(void) {
    /* ... more code ... */
    /* maybe even y(); */
}
I was under the impression that it was so customers could have access to the .h file for libraries and see what functions were available to them, without having to see the implementation (which would be in another file).
It is useful to see what the function returns and what parameters it takes.
Function prototyping is a remnant from the olden days of compiler writing. It used to be considered horribly inefficient for a compiler to have to make multiple passes over a source file to compile it.
In C, in certain contexts, referring to a function in one manner is syntactically equivalent to referring to a variable: consider taking a pointer to a function versus taking a pointer to a variable. In the compiler's intermediate representation, the two are semantically distinct, but syntactically, whether an identifier is a variable, a function name, or an invalid identifier cannot be determined from the context.
Since it's not determinable from the context, without function prototypes, the compiler would need to make an extra pass over each one of your source files each time one of them compiles. This would add an extra O(n) factor for any compilation (that is, if compilation were O(m), it would now be O(m*n)), where n is the number of files in your project. In large projects, where compilation is already on the order of hours, having a two-pass compiler is highly undesirable.
Forward declaring all your functions would allow the compiler to build a table of functions as it scanned the file, and be able to determine when it encountered an identifier whether it referred to a function or a variable.
As a result of this, C (and by extension, C++) compilers can be extremely efficient in compilation.
It allows a situation in which, say, you have an iterator class defined in a separate .h file which includes the parent container's header. Since you've included the parent header in the iterator's header, the parent can't have a method like "getIterator()", because the return type would have to be the iterator class, which would require including the iterator header inside the parent header, creating a cyclic loop of inclusions (one includes the other, which includes itself, which includes the other again, etc.).
If you put the iterator class prototype (a forward declaration) inside the parent container's header, you can have such a method without including the iterator header. It only works because you're simply saying that such a type exists and will be defined elsewhere.
There are ways of getting around this, like having a precompiled header, but in my opinion that is less elegant and comes with a slew of disadvantages. Of course this is C++, not C. However, in practice you might have a situation in which you'd like to arrange C code in this fashion, classes aside.
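For the C side of that idea, a sketch using a forward-declared struct tag instead of a class prototype (all names made up):
/* container.h */
#ifndef CONTAINER_H
#define CONTAINER_H

struct iterator;                 /* forward declaration: "such a type exists" */

struct container {
    int items[16];
    int count;
};

/* Returning a pointer to the incomplete type means this header does not need */
/* iterator.h, so there is no cyclic inclusion between the two headers.       */
struct iterator *container_begin(struct container *c);

#endif /* CONTAINER_H */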
