Who detects misspelled function name? Compiler or Linker? - c

According to C How to Program (Deitel):
Standard library functions like printf and scanf are not part of the C programming language. For example, the compiler cannot find a spelling error in printf or scanf. When the compiler compiles a printf statement, it merely provides space in the object program for a “call” to the library function. But the compiler does not know where the library functions are—the linker does. When the linker runs, it locates the library functions and inserts the proper calls to these library functions in the object program. Now the object program is complete and ready to be executed. For this reason, the linked program is called an executable. If the function name is misspelled, it is the linker which will spot the error, because it will not be able to match the name in the C program with the name of any known function in the libraries.
These statements leave me doubtful because of the existence of header files. These files are included during the preprocessing phase, before compilation, and, as I read, they are used by the compiler.
So if I write print instead of printf how can't the compiler see that there is no function declared with that name and throw an error?
If it is as the book says, why can I declare function in header files if the compiler doesn't watch them?

So if I write print instead of printf how can't the compiler see that there is no function declared with that name and throw an error?
You are right. If you made a typo in any function name, any modern compiler should complain about it. For example, gcc complains for the following code:
$ cat test.c
int main(void)
{
unknown();
return 0;
}
$ gcc -c -Wall -Wextra -std=c11 -pedantic-errors test.c
test.c: In function ‘main’:
test.c:3:5: error: implicit declaration of function ‘unknown’ [-Wimplicit-function-declaration]
unknown();
^
However, in the pre-C99 era of the C language, if the compiler had not seen a declaration for a function, it assumed that the function returns int. So if you are compiling in pre-C99 mode, the compiler isn't required to warn about it.
Fortunately, implicit function declarations were removed from the C language in C99, and a compiler is required to issue a diagnostic for them in modern C (>= C99).
But if you provide only a declaration or prototype for the function:
$ cat test.c
int unknown(void); /* function prototype */
int main(void)
{
unknown();
return 0;
}
$ gcc -c -Wall -Wextra -std=c11 test.c
$
(Note: I have used the -c flag to compile without linking; if you don't use -c, compiling and linking are done in a single step and the error would still come from the linker.)
There's no issue, even though you do not have a definition for unknown() anywhere. This is because the compiler assumes unknown() is defined elsewhere; only when the linker tries to resolve the symbol unknown will it complain that it can't find the definition.
Typically, header files provide only the necessary declarations or prototypes (in the above example I provided the prototype for unknown directly in the file, but it could just as well come from a header file) and usually not the actual definitions. Hence, the author is correct in the sense that the linker is the one that spots the error.

So if I write print instead of printf how can't the compiler see that there is no function declared with that name and throw an error?
The compiler can see that there is no declaration in scope for the identifier designating the function. Most will emit a warning under those circumstances, and some will emit an error, or can be configured to do so.
But that's not the same thing as the compiler detecting that the function doesn't exist. It's the compiler detecting that the function name has not been declared. The compiler will exhibit the same behavior if you spell the function name correctly but do not include a prior declaration for it.
Furthermore, C90 and pre-standardization C permitted calls to functions without any prior declaration. Such calls do not conform to C99 or later, but most compilers still do accept them (usually with a warning) for compatibility purposes.
If it is as the book says, why can I declare function in header files if the compiler doesn't watch them?
The compiler does see them, and does use the declarations. Moreover, it relies on the prototype, if the declaration provides one, to perform appropriate argument and return value conversions when you call the function. And if you call functions whose parameter types are altered by the default argument promotions (for example, float, char or short parameters), your calls are non-conforming if no prototype is in scope at the point of the call. Undefined behavior results.

Related

How does GCC compiler find the header file corresponding to some implicitly declared functions?

In the case of an implicitly declared function, gcc will sometimes tell you the header file to which the function belongs. From this answer, it seems only some functions are built in - "some compilers contain built-in declarations for them so they can do some basic type checking".
Is this how gcc is able to tell you which header file corresponds to some implicitly declared functions and not others?
For example,
implicit printf usage will generate an additional comment:
compilation.c:4:5: note: include the header <stdio.h> or explicitly provide a declaration for 'printf'
but bsearch from stdlib does not:
compilation.c:5:5: error: implicit declaration of function 'bsearch' is invalid in C99 [-Werror|,-Wimplicit-function-declaration]
how gcc is able to tell you which header file corresponds to some implicitly declared functions and not others?
GCC has a list of symbols and their headers. When a symbol is encountered that has not been declared and it is in the list, a message is displayed with the proposed header name.
See the list at https://github.com/gcc-mirror/gcc/blob/16e2427f50c208dfe07d07f18009969502c25dc8/gcc/c-family/known-headers.cc#L157 from gcc sources.
bsearch is not in the list, so the hint is not displayed. I like the hints; it would be nice if the list included all the symbols from the C standard, including bsearch. It would also be a speedup if the list were sorted and searched with bsearch. You can contribute to gcc, or donate to gcc and write about it on the gcc mailing list.

How to find use of parameter-type-lists in a C code base?

I recently learned that there are parameter-type-lists which can be empty in C:
int print();
int main() {
print("hallo"); // << valid but highly unintuitive
}
int print() {
}
In this code someone might just have forgotten to write print(void) (maybe a C++ developer) but someone else provides a parameter. Compilation does not show any warnings or errors:
$ make test -Wstrict-prototypes -Wimplicit -Wimplicit-function-declaration -Wall
cc test.c -o test
I didn't find a compiler flag which warns about empty parameter-type-lists, only about implicit function declaration.
Is there something I can do which helps me finding all uses of parameter-type-lists in a given code base?
E.g.
letting a C++ compiler compile the C code as C++ and solve type issues (C++ does not allow arguments if the declaration does not list them)
let the compiler list all function declarations (don't know if that's possible) and search manually for empty parentheses
grepping for parameter-type-lists (too complex for me :))
disabling parameter-type-lists via compiler switch (didn't find any)
With gcc, using -Wstrict-prototypes will achieve what you expect:
-Wstrict-prototypes (C and Objective-C only)
Warn if a function is declared or defined without specifying the argument types. (An old-style function definition is permitted without a warning if preceded by a declaration that specifies the argument types.)
With your example, it gives:
hallo.c:1:5: warning: function declaration isn’t a prototype [-Wstrict-prototypes]
int print() {
^
However, you'll have to ensure that all non-prototype function declarations in your existing code are converted to proper prototypes; in particular, functions without parameters should be declared with (void), like this:
int print(void);
Your example is not valid in either C or C++. Since your print() function does not accept any parameters -- which is well established from the function definition you provide (C2011, 6.7.6.3/14) -- it is non-conforming to call it with an argument, regardless of whether a prototype is visible at the point of the call (C2011, 6.5.2.2/6). If the call appears in a scope wherein neither the function definition nor any bona fide prototype is visible then the compiler might nevertheless accept the code, but that does not make it valid or guarantee that it will work.
Additionally, you seem to be using the wrong term, or possibly your idea is very wrong-headed. Parameter-type-lists are the modern way of declaring (and defining) functions, including function prototypes:
int my_func(int a, char *b);
// ^^^^^^^^^^^^^^----- parameter-type-list
You should not get rid of those, but I think you mean that you want to identify and fix K&R-style function definitions and also function declarations without parameter-type-lists. (The formal syntax's terms for the K&R style are "identifier list" and, if needed, an accompanying "declaration list".)
As for identifying the occurrences of K&R-style declarations automatically, that's a job for a C language parser, and the most common implementations of those are C compilers. Your compiler may have an option for just what you want. GCC, for example, has options -Wstrict-prototypes and -Wold-style-definition, which, together, will signal both function declarations that are not prototypes and K&R-style function definitions. GCC has some other options that might be of interest to you too, such as -Wimplicit, -Wtraditional, and -Wc++-compat.
All you need to look for is () (with optional whitespace), followed by { (with optional whitespace, perhaps newline). And you can replace it with (void).
If you want an "industrial grade" solution you can use a compiler which emits something easier to parse, such as GCC-XML or Clang.

Implicit function declarations and linkage

Recently I've learnt about implicit function declarations in C. The main idea is clear, but I have some trouble understanding the linkage process in this case.
Consider the following code (file a.c):
#include <stdio.h>
int main() {
double someValue = f();
printf("%f\n", someValue);
return 0;
}
If I try to compile it:
gcc -c a.c -std=c99
I see a warning about implicit declaration of function f().
If I try to compile and link:
gcc a.c -std=c99
I have an undefined reference error. So everything is fine.
Then I add another file (file b.c):
double f(double x) {
return x;
}
And invoke the next command:
gcc a.c b.c -std=c99
Surprisingly everything is linked successfully. Of course after ./a.out invocation I see a rubbish output.
So, my question is: How are programs with implicitly declared functions linked? And what happens in my example under the hood of compiler/linker?
I read a number of topics on SO like this, this and this one but still have problems.
First of all, since C99, implicit declaration of a function has been removed from the standard. Compilers may support it for compiling legacy code, but that is not mandatory. Quoting the standard's foreword,
remove implicit function declaration
That said, as per C11, chapter §6.5.2.2
If the function is defined with a type that does not include a prototype, and the types of
the arguments after promotion are not compatible with those of the parameters after
promotion, the behavior is undefined.
So, in your case,
the function call itself is an implicit declaration (which has been non-standard since C99),
and due to the mismatch in the function signature [an implicitly declared function was assumed to have an int return type], your code invokes undefined behavior.
Just to add a bit more reference: if you try to define the function in the same compilation unit after the call, you'll get a compilation error due to the mismatched signature.
However, since your function is defined in a separate compilation unit (with no prototype declaration visible at the call), the compiler has no way to check the signatures. After compilation, the linker takes the object files and, because the linker does no type checking (and the object files carry no type information either), happily links them. The end result is successful compilation and linking, and UB.
Here is what is happening.
Without a declaration for f(), the compiler assumes an implicit declaration like int f() (int return type, unspecified parameters), and then happily compiles a.c.
When compiling b.c, the compiler does not have any prior declaration for f(), so it intuits it from the definition of f(). Normally you would put some declaration of f() in a header file, and include it in both a.c and b.c. Because both the files will see the same declaration, the compiler can enforce conformance. It will complain about the entity that does not match the declaration. But in this case, there is no common prototype to refer to.
In C, the compiler does not store any information about the prototype in the object files, and the linker does not perform any checks for conformance (it can't). All it sees is an unresolved symbol f in a.c and a symbol f defined in b.c. It happily resolves the symbols, and completes the link.
Things break down at run time though, because the compiler sets up the call in a.c based on the declaration it assumed there, which does not match what the definition in b.c expects. f() (from b.c) will read a junk argument from the stack or a register (depending on the calling convention) and return it as a double, while the caller in a.c reads the result as an int.
How are programs with implicitly declared functions linked? And what happens in my example under the hood of compiler/linker?
Implicit function declarations (the "implicit int" rule for functions) have been outlawed by the C standard since C99, so programs that rely on them are not valid. Before C99, if no declaration was visible, the compiler implicitly declared the function with an int return type.
Surprisingly everything is linked successfully. Of course after ./a.out invocation I see a rubbish output.
Because there was no prototype, the compiler implicitly declares f() with return type int. But the actual definition of f() returns a double. The two types are incompatible, and this is undefined behaviour.
This is undefined even in C89/C90, in which the implicit int rule is valid, because the implicit declaration is not compatible with the actual type f() returns. So this example (with a.c and b.c) is undefined in all C standards.
It's not useful or valid anymore to have implicit function declarations, so the details of how the compiler/linker handles them are only of historic interest. They go back to the pre-standard times of K&R C, which didn't have function prototypes and in which functions returned int by default. Function prototypes were added to C in the C89/C90 standard. Bottom line: you must have prototypes (or define functions before use) for all functions in valid C programs.
After compiling, all type information is lost (except maybe in debug info, but the linker doesn't pay attention to that). The only thing that remains is "there is a symbol called "f" at address 0xdeadbeef".
The point of headers is to tell C about the type of the symbol, including, for functions, what arguments it takes and what it returns. If you mismatch the real ones with the ones you declare (either explicitly or implicitly), you get undefined behavior.

How to turn "implicit declaration" warnings in $CC into errors?

Preamble: My C may be fairly rusty; I first started writing C programs in somewhere around 1993 -- compilers may have been different back then, but I recall that when one attempted to refer to a C function that was not declared, the compiler would abort. This is from memory.
Currently, I am perplexed as to why GCC (4.4.3) is so forgiving on me when I [intentionally] mismatch or omit declaration of bar below, with its definition in bar.c. Because the compiler does not warn me, the program proceeds to a fatal addressing error at runtime -- since bar wants an address and is given an integer, it ends up de-referencing that integer as an address.
A strict compiler, or so I would think, would abort on me with an error. Am I missing something? My build command line is as follows:
cc -o foobar -g -Wall -std=c99 -fexec-charset=ISO-8859-1 -DDEBUG foo.c bar.c
With foo.c:
int main() {
int a;
bar(a);
return 0;
}
and bar.c:
void bar(int * a) {
*a = 1;
}
I have intentionally omitted declaration of bar and, as mentioned, intentionally pass it an integer (could be anything really) instead of an address that its actual definition would otherwise mandate. Because $(CC) does not stop me, I end up with a segmentation fault (x86, Ubuntu 10.04). I am aware that a compliant C (C99?) compiler would implicitly create an int bar(void) declaration for bar if none otherwise found, but in this case that's obviously not what I want at all!
I want to protect myself from this kind of error, where I make the human mistake of mismatching declarations and definitions or omitting the former altogether.
I tried to instead just invoke the compiler without the linking step (with the -c switch) -- but it doesn't matter as compiling still succeeds with warnings. The linker might complain though, but I want the compiler to stop me before that happens.
I do not actually want to turn all my warnings into errors (e.g. with -Werror), because:
I could have included the wrong float bar(double a); at the top of foo.c, which would eliminate the warning altogether, but doesn't change the fact that the resulting program crashes; alas, a program that compiles successfully without warnings (even with the -Wall switch) but still is faulty
I have and will have other types of warnings that should stay warnings and not prevent successfully building the program
It would be dealing with the effect of the problem, rather than the problem itself
It's not just the types of warnings, but also particular instances thereof; I wouldn't want to turn a specific warning into an error because in some instances that would not be applicable; this would be too "coarse" of a solution which doesn't take into account the specifics of and the context in which the warning occurred
To turn this warning into an error when compiling with gcc, pass the switch -Werror=implicit-function-declaration to the compiler.
Trying to answer your "why" question: yes, it might look odd that this is by default a warning and not an error. This is for historical reasons. For details, see e.g. Why does/did C allow implicit function and typeless variable declarations?, or read it in Ritchie's own words at http://cm.bell-labs.com/who/dmr/chist.html.
You could probably force additional warnings for gcc:
-Wmissing-prototypes
-Wmissing-declarations
Using both (along with -Werror) will probably help you to avoid such situations, but require some more code writing.
Inspired by this.
EDIT: Example
// file: mis1.c
int main(void)
{
int a;
bar(a);
return 0;
}
// file: mis2.c
#include <stdio.h>
double bar(double a)
{
printf("%g\n", a);
return a;
}
Compiling with gcc 3.3.4 (DJGPP) as:
gcc -Wall -Wmissing-prototypes -Wmissing-declarations -Werror mis2.c mis1.c -o mis.exe
Compiler output:
mis2.c:5: warning: no previous prototype for `bar'
mis1.c: In function `main':
mis1.c:6: warning: implicit declaration of function `bar'
Fix? #include the following file in both files:
// file: mis.h
extern int bar(int);
Recompiling you get:
mis2.c:6: error: conflicting types for `bar'
mis.h:3: error: previous declaration of `bar'
Fix? Define and declare bar everywhere in the same way, correct, for example, mis.h:
// file: mis.h
extern double bar(double);
Likewise you could change bar() in mis2.c to match that of mis.h.
From the gcc docs on warnings:
-Wimplicit-function-declaration (C and Objective-C only)
Give a warning whenever a function is used before being declared. In C99 mode (-std=c99 or -std=gnu99), this warning is enabled by default and it is made into an error by -pedantic-errors. This warning is also enabled by -Wall.
...
-pedantic-errors (my emphasis) Like -pedantic, except that errors are produced rather than warnings.
...
-pedantic
Issue all the warnings demanded by strict ISO C and ISO C++; reject all programs that use forbidden extensions, and some other programs that do not follow ISO C and ISO C++. For ISO C, follows the version of the ISO C standard specified by any -std option used.
It looks to me that -pedantic-errors will do what you want (turn these warnings into errors), however it sounds like it will also turn on a host of other checks you may or may not want. =/
The closest I found to a solution to my problem was to simply use the flag -combine which indirectly causes the compiler to abort compilation when attempting to call a function that is missing a prototype or where prototypes mismatch or do not match the definition.
I am afraid it has drawbacks though. Since input files now are combined in one compilation run, one ends up with a single object file, which has some implications of its own. In short, -combine does something more than just fix my problem, and that may be a problem in itself.
You can turn all warnings into errors with
cc [..] -Werror [..]
This will partially solve your problem.
I could have included the wrong float bar(double a); at the top of foo.c, which eliminates the warning altogether, but doesn't change the fact that the resulting program crashes. Alas, a program that compiles successfully without warnings (even with the -Wall switch) and beautifully crashes at runtime.
Therefore it is essential, in addition to other measures, to include the same header file (containing the prototype) in both foo.c and bar.c. This ensures that the same prototype is applied in both places.

What are the implications of having an "implicit declaration of function" warning in C?

As the question states, what exactly are the implications of having the 'implicit declaration of function' warning? We just cranked up the warning flags on gcc and found quite a few instances of these warnings and I'm curious what type of problems this may have caused prior to fixing them?
Also, why is this a warning and not an error? How is gcc even able to successfully link this executable? As you can see in the example below, the executable functions as expected.
Take the following two files for example:
file1.c
#include <stdio.h>
int main(void)
{
funcA();
return 0;
}
file2.c
#include <stdio.h>
void funcA(void)
{
puts("hello world");
}
Compile & Output
$ gcc -Wall -Wextra -c file1.c file2.c
file1.c: In function 'main':
file1.c:3: warning: implicit declaration of function 'funcA'
$ gcc -Wall -Wextra file1.o file2.o -o test.exe
$ ./test.exe
hello world
If the function has a definition that matches the implicit declaration (i.e. it returns int and has a fixed number of arguments, and does not have a prototype), and you always call it with the correct number and types of arguments, then there are no negative implications (other than bad, obsolete style).
That is, in your code above, it is as if the function was declared as:
int funcA();
Since this doesn't match the function definition, the call to funcA() from file1.c invokes undefined behaviour, which means that it can crash. On your architecture, with your current compiler, it obviously doesn't - but architectures and compilers change.
GCC is able to link it because the symbol representing the function entry point doesn't change when the function type changes (again... on your current architecture, with your current compiler - although this is quite common).
Properly declaring your functions is a good thing - if for no other reason than that it allows you to give your function a prototype, which means that the compiler must diagnose it if you are calling it with the wrong number or types of arguments.
An implicit declaration behaves like a non-prototype function declaration at block scope with an int return type. Because the return type can't be specified, it defaults to int, like every declaration under the old implicit int rule that does not specify a type.
The reason that functions can be implicitly declared is that they can only be defined at file scope, whereas it is unclear whether an undeclared variable should have block scope or file scope; undeclared variables are therefore disallowed, rather than the compiler selecting one and providing an implicit tentative definition at file or block scope. Indeed, the actual implicit declaration is a block scope one, so you'll get a warning for the first reference to the function in each function it is referenced in.

Resources