Strangely, I found that C allows linking against a function whose argument list doesn't match:
//add.c
int add(int a, int b, int c) {
return a + b + c;
}
//file.c
int add (int,int); //Note: only 2 arguments
void foo() {
add(1,2);
}
I compiled add.c first, then file.c; both compiled successfully. Strangely, the linker didn't give any error or warning. Probably the reason is that the C linker doesn't compare arguments while linking, but I'm not 100% sure. Could someone comment on this?
Now, the question: what is good practice to avoid this situation, or to get some sort of warning during compilation? In my project there are a lot of functions in different files, and now and then we have to add an extra argument to a function.
Use your header files correctly.
Configure your compiler to emit as many warnings as possible.
Mind the warnings!
add.h
#ifndef ADD_H_INCLUDED
#define ADD_H_INCLUDED
int add(int a, int b, int c);
#endif
add.c
#include "add.h"
int add(int a, int b, int c) {
return a + b + c;
}
file.c
#include "add.h"
void foo() {
add(1, 2);
}
C linker doesn't compare arguments while linking.
That is correct. In C++, parameter types are encoded into the symbol name, so they are effectively part of what the linker matches; in C, the linker sees only the function name. That is why it is possible to create a situation with undefined behavior simply by supplying a wrong function prototype the way that you show.
what is the good practice to avoid this situation or get some sort of warning during compilation?
Always put prototypes of functions that you intend to share into a header, and include that header both where the function is used and where it is defined. This ensures that the compiler issues a diagnostic message on a mismatch. Treat all compiler warnings as errors. C compilers are often rather forgiving, so their warnings usually indicate something very important.
Calling a function with too few arguments leads to undefined behavior, as the values of the missing arguments will be indeterminate.
I have this weird thing:
in a file1.c there's
extern void foo(int x, int y);
..
..
int tmp = foo(1,2);
in the project I could find only this foo():
in file2.c :
int foo(int x, int y, int z)
{
....
}
in file2.h :
int foo(int x, int y, int z);
file2.h isn't included in file1.c (I guess this is why whoever wrote it used extern).
This project compiles fine. I think that's because foo() in file1.c will only be looked up at link time, am I right?
But my real question is: why does linking succeed?
After all, there is no such function as foo with 2 parameters...
And I'm in C, so there's no overloading.
So what's going on?
Because there is no overloading, the C compiler does not decorate function names. The linker finds a reference to function foo in file1.c and a definition of foo in file2.c. It cannot know that their parameter lists do not match, and happily links them.
Of course, when the function foo runs the value of z is garbage and the behavior of the program becomes unpredictable from that point on.
Calling a function with the wrong number (or types) of arguments is an error.
The standard requires the implementation to detect some, but not all of them.
What the standard calls an implementation, is typically a compiler with a separate linker (and some other things), where a compiler translates single translation units (that is, a preprocessed source file) into object files, which later get linked together.
While the standard doesn't distinguish between them, its authors of course wrote it with the typical setup in mind.
C11 (n1570) 6.5.2.2 "Function calls", p2:
If the expression that denotes the called function has a type that includes a prototype, the number of arguments shall agree with the number of parameters. Each argument shall have a type such that its value may be assigned to an object with the unqualified version of the type of its corresponding parameter.
This is in a "constraints" section, which means, the implementation (in this case, that's the compiler) must complain and may abort translation if a "shall" requirement is violated.
In your case, there was a prototype visible, so the arguments of the function call must match with the prototype.
Similar requirements apply for a function definition with a prototype declaration in scope; if your function definition doesn't match the prototype, your compiler must tell you. In other words, as long as you ensure that all calls to a function and that function's definition are in the scope of the same prototype, you are told if there is a mismatch. This can be ensured if the prototype is in a header file which is included by all files with calls to that function and by the file containing its definition. We use header files with prototypes exactly for that reason.
In the code shown, this checking is by-passed by providing a non-matching prototype and not including the header file2.h.
Ibid. p9:
If the function is defined with a type that is not compatible with the type (of the expression) pointed to by the expression that denotes the called function, the behavior is undefined.
Undefined behaviour means, the compiler is free to assume it doesn't happen, and is not required to detect if it does.
And in fact, on my machine, the object files generated from file2.c (I inserted a return 0; to have some function body) don't differ if I remove one of the function arguments. This means the object file doesn't contain any information about the arguments, and thus a compiler seeing only file2.o and file1.c has no chance of detecting the violation.
You've mentioned overloading, so let's compile file2.c (with two and three arguments) as C++ and look at the object files:
$ g++ file2_three_args.cpp -c
$ g++ file2_two_args.cpp -c
$ nm file2_three_args.o
00000000 T _Z3fooiii
$ nm file2_two_args.o
00000000 T _Z3fooii
Function foo has its argument types incorporated into the symbol created for it (a process called name mangling), so the object file does carry some information about the function types. Accordingly, we get an error at link time:
$ cat file1.cpp
extern void foo(int x, int y);
int main(void) {
foo(1,2);
}
$ g++ file2_three_args.o file1.cpp
In function `main':
file1.cpp:(.text+0x19): undefined reference to `foo(int, int)'
collect2: error: ld returned 1 exit status
This behaviour would also be allowed for a C implementation: aborting translation is a valid manifestation of undefined behaviour at compile or link time.
The way overloading is usually implemented in C++ is what makes such checks possible at link time. Because C has no built-in function overloading, and because the behaviour is undefined in the cases where the compiler cannot see the type mismatch, a C implementation can generate function symbols without any type information.
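To make the symbol names above less mysterious, here is a minimal sketch of how a name like _Z3fooiii can be built. It assumes the Itanium C++ ABI scheme that g++ uses, and it handles only free functions whose parameters are all int. The helper name mangle_int_function is hypothetical; real mangling covers far more cases.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build an Itanium-ABI-style symbol for a free function whose parameters
 * are all `int`, e.g. foo(int, int, int) -> "_Z3fooiii".
 * Hypothetical helper for illustration only.
 * Returns 0 on success, -1 if the buffer is too small. */
int mangle_int_function(const char *name, int param_count,
                        char *out, size_t out_size)
{
    assert(name != NULL && out != NULL && param_count >= 0);

    /* Prefix "_Z", then the length of the name, then the name itself. */
    int written = snprintf(out, out_size, "_Z%zu%s", strlen(name), name);
    if (written < 0 || (size_t)written >= out_size)
        return -1;

    size_t pos = (size_t)written;
    if (param_count == 0) {
        /* An empty parameter list is encoded as 'v' (void). */
        if (pos + 1 >= out_size)
            return -1;
        out[pos++] = 'v';
    } else {
        if (pos + (size_t)param_count >= out_size)
            return -1;
        for (int i = 0; i < param_count; i++)
            out[pos++] = 'i';   /* one 'i' per int parameter */
    }
    out[pos] = '\0';
    return 0;
}
```

With this sketch, the two-argument and three-argument versions of foo get different symbol strings, which is why the C++ link step can fail where the C one cannot: in C both would simply be called foo.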
First of all
extern void foo(int x, int y);
means exactly the same thing as
void foo(int x, int y);
The former is just an overly explicit way to write the same thing; extern serves no other purpose here. It is like writing auto int x; instead of int x; for a local variable: it means the very same thing.
In your case, the "foo" module (which you call file2) contains the function prototype as well as the definition. This is proper program design in C. What file1.c should be doing is to #include the foo.h.
For reasons unknown, whoever wrote file1.c didn't do this. Instead they are just saying "elsewhere in the project, there is this function, do not care about its definition, that's handled elsewhere".
This is bad programming practice. file1.c shouldn't concern itself with how things are defined elsewhere: this is spaghetti programming which creates a needless tight coupling between the caller and the module. There is also the chance that the actual function doesn't match the local prototype. You might hope for a linker error in that case, but there are no guarantees: a C linker matches on names only.
The code must be fixed like this:
file1.c
#include "foo.h"
...
int tmp = foo(1,2);
foo.h
#ifndef FOO_H
#define FOO_H
int foo(int x, int y, int z);
#endif
foo.c
#include "foo.h"
int foo(int x, int y, int z)
{
....
}
With the following code:
int main(){
printf("%f\n",multiply(2));
return 0;
}
float multiply(float n){
return n * 2;
}
When I try to compile, I get one warning: "'%f' expects 'double', but argument has type 'int'" and two errors: "conflicting types for 'multiply'", "previous implicit declaration of 'multiply' was here."
Question 1: I am guessing that this happens because the compiler has no knowledge of the function 'multiply' when it first comes across it, so it invents a prototype, and invented prototypes always assume 'int' is both returned and taken as parameter. So the invented prototype would be "int multiply(int)", and hence the errors. Is this correct?
Now, the previous code won't even compile. However, if I break the code in two files like this:
#file1.c
int main(){
printf("%f\n",multiply(2));
return 0;
}
#file2.c
float multiply(float n){
return n * 2;
}
and execute "gcc file1.c file2.c -o file" it will still give one warning (that printf is expecting double but is getting int) but the errors won't show up anymore and it will compile.
Question 2: How come when I break the code into 2 files it compiles?
Question 3: Once I run the program above (the version split into 2 files) the result is that 0.0000 is printed on the screen. How come? I am guessing the compiler again invented a prototype that doesn't match the function, but why is 0 printed? And if I change the printf("%f") to printf("%d") it prints a 1. Again, any explanation of what's going on behind the scenes?
Thanks a lot in advance.
So the invented prototype would be "int multiply(int)", and hence the errors. Is this correct?
Absolutely. This is done for backward compatibility with pre-ANSI C, which lacked function prototypes and in which everything declared without a type was implicitly int. The compiler compiles your main, creates an implicit declaration of multiply returning int, but when it finds the real definition, it discovers the lie, and tells you about it.
How come when I break the code into 2 files it compiles?
The compiler never discovers the lie about the prototype, because it compiles one file at a time: it assumes in your main that multiply takes an int and returns an int, and it does not find any contradiction in file2.c. Running this program produces undefined behavior, though.
Once I run the program above (the version split into 2 files) the result is that 0.0000 is printed on the screen.
That's the result of undefined behavior described above. The program will compile and link, but because the compiler thinks that multiply takes an int, it would never convert 2 to 2.0F, and multiply will never find out. Similarly, the incorrect value computed by doubling an int reinterpreted as a float inside your multiply function will be treated as an int again.
An unspecified function has a return type of int (that's why you get the warning, the compiler thinks it returns an integer) and an unknown number of unspecified arguments.
If you break up your project in multiple files, just declare a function prototype before you call the functions from the other files, and all will work fine.
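A minimal sketch of that advice, shown in one file for brevity. In a real project the prototype line would sit in a shared header included by both files; the helper names call_multiply_with_int and check_multiply are made up for this illustration.

```c
#include <assert.h>

/* Prototype visible before the call. In a multi-file project this line
 * would live in a header included by both the caller and the definition. */
float multiply(float n);

/* Caller: because the prototype is in scope, the int argument 2 is
 * converted to 2.0f before the call, and the result is known to be float. */
float call_multiply_with_int(void)
{
    return multiply(2);
}

/* Definition: the compiler checks it against the prototype above. */
float multiply(float n)
{
    return n * 2;
}

/* Small usage check: with the prototype in scope the result is 4.0f. */
void check_multiply(void)
{
    assert(call_multiply_with_int() == 4.0f);
}
```

Passing the result to printf with %f is then fine as well, since a float argument to a variadic function is promoted to double.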
Question1:
So the invented prototype would be "int multiply(int)", and hence the
errors. Is this correct?
Not exactly: it depends on which C standard you compile against (C89/C90, C99, ...).
For function return values, prior to C99 it was explicitly specified that if no function declaration was visible, the translator provided one. These implicit declarations defaulted to a return type of int.
Justification from C Standard (6.2.5 page 506)
Prior to C90 there were no function prototypes. Developers expected to
be able to interchange arguments that had signed and unsigned
versions of the same integer type. Having to cast an argument, if the
parameter type in the function definition had a different signedness,
was seen as counter to C’s easy-going type-checking system and a
little intrusive. The introduction of prototypes did not completely do
away with the issue of interchangeability of arguments. The ellipsis
notation specifies that nothing is known about the expected type of
arguments. Similarly, for
function return values, prior to C99 it was explicitly specified that
if no function declaration was visible the translator provided one.
These implicit declarations defaulted to a return type of int . If the
actual function happened to return the type unsigned int , such a
default declaration might have returned an unexpected result. A lot of
developers had a casual attitude toward function declarations. The
rest of us have to live with the consequences of the Committee not
wanting to break all the source code they wrote. The
interchangeability of function return values is now a moot point,
because C99 requires that a function declaration be visible at the
point of call (a default declaration is no longer provided)
Question 2:
How come when I break the code into 2 files it compiles?
It compiles, and the call is treated exactly as described for the first question.
Question 1: Yes you are correct. If there is no function prototype, the default type is int
Question 2: When you compile this code as one file, the compiler sees that there is already a function named multiply and that it has a different type than assumed (float instead of int). Thus compilation fails.
When you separate this into two files, the compiler produces two .o files. In the first one it assumes that the multiply() function will be found in another file. The linker then links both files into a binary and, going by the name multiply alone, resolves the call to float multiply() where the compiler had assumed int multiply() in the first .o file.
Question 3: If you read the bits of the int 2 as a float (assuming 32-bit IEEE 754), you get a tiny subnormal number (2^-148, about 2.8e-45). Multiplying it by 2 still leaves it far too small to show up with %f. That's why you see 0.000000. Since the behavior is undefined, other results are possible on other platforms.
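The bit-level effect can be reproduced in well-defined code. The sketch below assumes float is 32-bit IEEE 754 and uses memcpy to reinterpret the bits; the function names are made up for this illustration. Note that on some ABIs (x86-64 System V, for example) int and float arguments travel in different registers, so the real mismatched call may not even see these bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Reinterpret the bits of a 32-bit integer as a float. memcpy is the
 * well-defined way to do this. Assumes float is 32-bit IEEE 754. */
float int_bits_as_float(int32_t bits)
{
    float result;
    _Static_assert(sizeof result == sizeof bits, "float must be 32 bits");
    memcpy(&result, &bits, sizeof result);
    return result;
}

/* Format a value the way printf("%f") would, into the caller's buffer. */
void format_fixed(float value, char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    snprintf(out, out_size, "%f", value);
}
```

Under those assumptions, int_bits_as_float(2) is a positive subnormal value, and formatting it (or twice it) with %f yields 0.000000, matching what the question observed.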
The following test.c program
int main() {
dummySum(1, 2);
return 0;
}
int dummySum(int a, int b) {
return a + b;
}
...doesn't generate any warning when compiled with gcc -o test test.c, whereas the following one does:
int main() {
dummySum(1, 2);
return 0;
}
void dummySum(int a, int b) {
a + b;
}
Why?
When faced with an undeclared function, the compiler assumes a function that accepts the given number of arguments (I think) and returns int (that part I'm sure of). Your second one doesn't, and so you get the redefinition warning.
I believe, based on a very quick scan of the foreword, that C99 removed this. No great surprise that GCC still allows them (with a warning), though; I can't imagine how much code would start failing to compile...
Recommend using -Wall (turning on all warnings) so you get a huge amount of additional information (you can turn off specific warnings when you have a really good reason for whatever you're doing that generates them if need be).
A function cannot be used before it has been declared. When a function declaration is not visible, the implementation assumes in C89 that the function:
takes an unspecified (but fixed) number of parameters
returns an int
This is called an implicit function declaration.
In C99, implicit function declarations were removed from the language, and the implementation is free to refuse to translate the source code.
#include<stdio.h>
int f();
int main()
{
f(1);
f(1,2);
f(1,2,3);
}
f(int i,int j,int k)
{
printf("%d %d %d",i,j,k);
}
It runs fine (without any error). Can you please explain how it executes?
How do f(1) and f(1,2) link to f(int,int,int)?
You must have a different definition of "error" to me :-) What gets printed the first two times you call your f function? I get
1 -1216175936 134513787
1 2 134513787
1 2 3
for my three function calls.
What you're seeing is a holdover from the very early days of C when people played footloose and fancy-free with their function calls.
All that is happening is that you are calling a function f and it's printing out three values from the stack (yes, even when you only give it one or two). What happens when you don't provide enough is that your program will most likely just use what was there anyway, usually leading to data issues when reading and catastrophic failure when writing.
This is perfectly compilable, though very unwise, C. And I mean that in a very real, "undefined behaviour", sense of the word (referring specifically to C99: "If the expression that denotes the called function has a type that does not include a prototype, ... if the number of arguments does not equal the number of parameters, the behaviour is undefined").
You should really provide fully formed function prototypes such as:
int f(int,int,int);
to ensure your compiler picks up this problem, and use ellipses (...) in variable parameter functions.
As an aside, what usually happens under the covers is that the calling function starts with a stack like:
12345678
11111111
and pushes (for example) two values onto a stack, so that it ends up like:
12345678
11111111
2
1
When the called function uses the first three values on the stack (since that's what it wants), it finds that it has 1, 2 and 11111111.
It does what it has to do then returns and the calling function clears those two values off the stack (this is called a caller-makes-good strategy). Woe betide anyone who tries this with a callee-makes-good strategy :-) although that's pretty unusual in C since it makes variable argument functions like printf a little hard to do.
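The stack picture above can be mimicked with an ordinary array, which keeps everything well defined. This is only a toy model under the same assumptions as the description (arguments pushed right to left, caller cleans up); the toy_ names are hypothetical and nothing here touches the real machine stack.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_CAPACITY 8

/* A toy stack standing in for the machine stack (illustration only). */
struct toy_stack {
    int    slots[STACK_CAPACITY];
    size_t depth;               /* number of slots in use */
};

void toy_push(struct toy_stack *s, int value)
{
    assert(s->depth < STACK_CAPACITY);
    s->slots[s->depth++] = value;
}

/* The "callee": reads `wanted` values from the top of the stack,
 * whether or not the caller pushed that many arguments. */
void toy_read_params(const struct toy_stack *s, size_t wanted, int *params)
{
    assert(wanted <= s->depth);
    for (size_t i = 0; i < wanted; i++)
        params[i] = s->slots[s->depth - 1 - i];
}

/* The "caller": has two values of its own on the stack, pushes two
 * arguments (right to left), calls, then removes its two arguments. */
void toy_call_with_two_args(int params[3])
{
    struct toy_stack s = { {0}, 0 };

    toy_push(&s, 12345678);     /* caller's own data */
    toy_push(&s, 11111111);     /* caller's own data */
    toy_push(&s, 2);            /* second argument */
    toy_push(&s, 1);            /* first argument */

    toy_read_params(&s, 3, params);   /* callee expects three */

    s.depth -= 2;               /* caller cleans up what it pushed */
}
```

After toy_call_with_two_args, the three "parameters" are 1, 2 and 11111111: the third one is the caller's own data, just as in the description.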
This declaration:
int f();
...tells the compiler "f is a function that takes some fixed number of arguments, and returns int". You then try to call it with one, two and three arguments - C compilers are conceptually one-pass (after preprocessing), so at this point, the compiler doesn't have the information available to argue with you.
Your actual implementation of f() takes three int arguments, so the calls which only provide one and two arguments invoke undefined behaviour - it's an error which means that the compiler isn't required to give you an error message, and anything could happen when you run the program.
int f();
In C this declares a function with an unspecified number of parameters, so calls are not checked against it. In that respect it resembles the following in C++
int f(...);
To check this use the following instead of int f();
int f(void);
This will cause the compiler to complain.
Please note: A C linker quirk is also involved here...the C linker does not validate the arguments being passed to a function at the point of invocation and simply links to the first public symbol with the same name. Thus the use of f() in main is allowed because of the declaration of int f(). But the linker binds the function f(int, int, int) during link time at the invocation sites. Hope that makes some sense (please let me know if it doesn't)
It runs fine because int f() means what another answer has already said: an unspecified number of arguments. This means you can call it with as many arguments as you want (even more than 3) without the compiler saying anything about it.
The reason why it works "under the cover", is that arguments are pushed on the stack, and then accessed "from" the stack in the f function. If you pass 0 arguments, the i, j, k of the function "corresponds" to values on the stack that, from the function PoV, are garbage. Nonetheless you can access their values. If you pass 1 argument, one of the three i j k accesses the value, the others get garbage. And so on.
Notice that the same reasoning works if the arguments are passed in some other way, but these are the conventions in common use. Another important aspect of these conventions is that the callee is not responsible for adjusting the stack; that is up to the caller, which knows how many arguments were actually pushed. If it were not so, the definition of f could suggest that it has to "adjust" the stack to "release" three integers, and this would cause a crash of some kind.
What you've written is accepted under the current standard (gcc compiles it with no warnings about the calls even with -std=c99 -pedantic; the one warning it gives is about the missing int in front of the f definition), even though many people find it disgusting and call it an "obsolescent feature". Your usage in the example code certainly doesn't show any usefulness, and stricter use of prototypes would likely help catch bugs! (But still, I prefer C to Ada)
Addendum
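A more "useful" usage of the "feature", one that at least never reads the arguments that were not passed (although, by the letter of the standard, calling with the wrong number of arguments is still undefined behaviour), could be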
#include<stdio.h>
int f();
int main()
{
f(1);
f(2,2);
f(3,2,3);
}
int f(int i,int j,int k)
{
if ( i == 1 ) printf("%d\n", i);
if ( i == 2 ) printf("%d %d\n", i, j);
if ( i == 3 ) printf("%d %d %d\n", i, j, k);
}
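For comparison, the same idea can be written with a real variadic function, which is well defined because the first argument says how many values follow. This is a sketch using the standard stdarg.h macros; the name format_ints and the buffer-based interface are choices made here so the result is easy to check, not something from the code above.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Variadic counterpart of f: `count` says how many int arguments follow.
 * Writes them space-separated into `out` and returns `count`,
 * or -1 if the buffer is too small. */
int format_ints(char *out, size_t out_size, int count, ...)
{
    assert(out != NULL && out_size > 0 && count >= 0);

    va_list args;
    size_t used = 0;
    out[0] = '\0';

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        int value = va_arg(args, int);   /* read exactly `count` ints */
        int n = snprintf(out + used, out_size - used,
                         i == 0 ? "%d" : " %d", value);
        if (n < 0 || (size_t)n >= out_size - used) {
            va_end(args);
            return -1;
        }
        used += (size_t)n;
    }
    va_end(args);
    return count;
}
```

A call such as format_ints(buf, sizeof buf, 3, 1, 2, 3) fills buf with "1 2 3". Unlike the int f() version, the prototype with ... tells the compiler that extra arguments are expected, and the function never reads more arguments than the caller supplied.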
When you compile the same program using g++ compiler you see the following errors -
g++ program.c
program.c: In function `int main()':
program.c:2: error: too many arguments to function `int f()'
program.c:6: error: at this point in file
program.c:2: error: too many arguments to function `int f()'
program.c:7: error: at this point in file
program.c:2: error: too many arguments to function `int f()'
program.c:8: error: at this point in file
program.c: At global scope:
program.c:12: error: ISO C++ forbids declaration of `f' with no type
Using gcc with option -std=c99 just gives a warning
Compiling the same program with gcc, using the standard that g++ uses by default, gives the following message:
gcc program.c -std=c++98
cc1: warning: command line option "-std=c++98" is valid for C++/ObjC++ but not for C
My answer, then, would be that C compilers conform to a different standard, one that is not as restrictive as the one C++ compilers conform to.
In C a declaration has to declare at least the return type. So
int f();
declares a function that returns the type int. This declaration doesn't include any information about the parameters the function takes. The definition of the function is
f(int i,int j,int k)
{
printf("%d %d %d",i,j,k);
}
Now it is known, that the function takes three ints. If you call the function with arguments that are different from the definition you will not get a compile-time error, but a runtime error (or if you don't like the negative connotation of error: "undefined behavior"). A C-compiler is not forced by the standard to catch those inconsistencies.
To prevent those errors, you should use proper function prototypes such as
int f(int,int,int); //in your case
int f(void); //if you have no parameters