multiple function declarations in different compilation units - c

zero.c:
int sq();
one.c:
int sq(int i) { return i*i; }
two.c:
int sq(int i, int j);
main.c:
int main() {
printf("%d\n",sq());
printf("%d\n",sq(2));
printf("%d\n",sq(2,3));
}
then I compile each file individually and gcc zero.o one.o two.o main.o -o main
./main gives
1
4
4
I'm a little confused as how this would work successfully. what really happens when I call sq() or sq(2) or sq(2,3)

If you want to know what really happens, have gcc output the assembly for main.o and take a look. I think you will find that when you call sq() the arguments are loaded into the registers your machine uses for argument passing, and then sq(int i) will do a multiply instruction on the first register. If you pass additional arguments they won't have any effect, and if you don't pass any arguments it will just work on whatever value was previously loaded into that register.

zero.c and two.c do not contain any function definition, only a declaration. Thus, they will not produce any assembly code for a function. (Hint: compile with the gcc -S flag to verify.)
Only one.c has a function definition. Thus one.s will have a function sq, which takes the first argument (generally passed on the stack or in the first argument register of the processor, like eax on Intel or r0 on ARM) and returns its square.
Since you have not given any prototype in main.c, an older compiler (.c -> .s) may not complain. It treats the call as an implicit declaration, roughly int sq(), with no information about the parameters.
Thus, for 3 different inputs:
sq(), sq(2), sq(2,3) will all call the same function, which is defined in one.c.
Now, the outputs for sq(2) & sq(2,3) are obvious - return square of the first argument passed. The output for sq() will depend on what is on the stack/eax/r0 as seen in sq's stack. Seems that it was 1. hint: run under gdb to verify.

According to the C spec, your code invokes undefined behavior in multiple ways (and possibly more):
Two declarations of the same function in the same scope use different return or argument types/counts
For a call to a function without a function prototype in scope, the number of arguments does not equal the number of parameters
Since this is undefined behavior, you have no way to predict what will happen. The result doesn't even necessarily have to be consistent. If you aren't seeing compile-time warnings/errors for this code, then you need to turn up your compiler's warning level.
Adding a prototype in main.c will probably resolve the compiler's warnings with this code. The linker may still have issues, though, because you have multiple functions with the same name in the same scope and it's not exactly clear which one you want the code to use.

So, I wrote an answer earlier based on what I read in the post. Which was wrong. Here's the correct answer.
zero.c doesn't generate any code. two.c doesn't generate any code.
main.c and one.c are the only files that actually generate code.
Calling a function that takes one argument, sq(int i) in one.c, with no arguments is undefined behaviour (so "anything can happen", including something resembling what you expect in some cases). Calling it with two arguments is also undefined behaviour. Again, it will not necessarily "go wrong" when you do this, but it is not guaranteed to work (or do what you expect). It could, for example, just as well return 9 because it puts arguments into registers from last to first.
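One way to turn these silent mismatches into compile-time errors is to keep a single prototype in a shared header and include it wherever sq is defined or called. The following is a minimal single-file sketch of that idea; the header name sq.h and the helper print_square are assumptions made for illustration, not part of the question.

```c
#include <stdio.h>

/* Would live in a shared header (hypothetical name: sq.h) that both
   one.c and main.c include. */
int sq(int i);

/* Would live in one.c: the only definition of sq. */
int sq(int i) { return i * i; }

/* Would live in main.c (hypothetical helper). With the prototype in scope,
   sq() and sq(2, 3) are rejected by the compiler instead of compiling
   into undefined behaviour. */
void print_square(int n)
{
    printf("%d\n", sq(n));
}
```

With this layout, zero.c and two.c become unnecessary, since they only carried conflicting declarations.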

Related

C function that only returns input parameter

For reasons out of my control, I have to implement this function in my C code:
double simple_round(double u)
{
return u;
}
When this function is called, is it ignored by the compiler, or does the call take place anyway? For instance:
int y;
double u = 3.3;
y = (int)simple_round(u*5.5); //line1
y = (int)u*5.5; //line2
Will both lines of code take the same time to be executed, or will the first one take longer?
Because the function is defined in a different C file from where it's used, if you don't use link-time optimization, the compiler won't know what the function does when it compiles the call, so it will have to emit an actual function call. The function will probably just have two instructions: copy the argument to the return value, then return.
The extra function call may or may not slow down the program, depending on the type of CPU and what else the CPU is doing (the other instructions nearby)
It will also force the compiler to consider that it might be calling a very complicated function that overwrites lots of registers (whichever ones are allowed to be overwritten by a function call); this will make the register allocation worse in the function that calls it, perhaps making that function longer and making it need to do more memory accesses.
When this function is called, is it ignored by the compiler, or does the call take place anyway?
It depends. If the function definition is in the same *.c file as the places where it's called then the compiler most probably automatically inlines it, because it has some criteria to inline very simple functions or functions that are called only once. Of course you have to specify a high enough optimization level
But if the function definition is in another compilation unit then the compiler can't help unless you use link-time optimization (LTO). That's because in C each *.c file is a separate compilation unit and will be compiled to a separate object (*.o) file and compilers don't know the body of functions in other compilation units. Only at the link stage the unresolved identifiers are filled with their info from the other compilation units
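If the function can be moved into a header, one option is to define it static inline so that every translation unit calling it sees the body and can inline it without LTO. This is only a sketch under that assumption; the helper names line1 and line2 are made up here to mirror the two statements in the question.

```c
/* Would live in a header included by the calling code, so the body is
   visible at every call site. */
static inline double simple_round(double u)
{
    return u;
}

/* Mirrors "line1" from the question: the cast applies to the whole product. */
int line1(double u)
{
    int y = (int)simple_round(u * 5.5);
    return y;
}

/* Mirrors "line2": the cast binds tighter than *, so only u is truncated
   before the multiplication; the double product is then converted to int. */
int line2(double u)
{
    int y = (int)u * 5.5;
    return y;
}
```

Note that, independent of speed, the two statements in the question are not equivalent: for u = 3.3 the first yields 18 and the second yields 16, because (int)u*5.5 parses as ((int)u)*5.5.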
In this case, the generated code in one *.c file calls a function that you can change in another *.c file, so there are more reliable solutions
The most correct method is to fix the generator. Provide evidence showing that the function the generated code calls is terrible and get it fixed
In case you really have no way to fix the generator then one possible way is to remove the generated *.c file from the compilation list (i.e. don't compile it into *.o anymore) and include it in your own *.c file
#define simple_round(x) (x)
#include "generated.c"
#undef simple_round
Now each simple_round() call in generated.c will be replaced by its argument alone, so no function call remains
If the 'generated' code has to be compiled anyway, perhaps you can 'kludge' in a macro that redefines the call to the 'inefficient' rounding function made by that code.
Here's a notion (all in one file). Perhaps the #define can be 'shimmed in' (and documented!) into the makefile entry for that single source file.
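#include <stdio.h>
int fnc1( int x ) { return 5 * x; }
int main( void ) {
printf( "%d\n", fnc1( 5 ) ); /* calls the real function */
#define fnc1(x) (x) /* from here on, fnc1(...) expands to its argument */
printf( "%d\n", fnc1( 7 ) ); /* macro expansion, no call */
return 0;
}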
Output:
25
7

C calling a function with empty declaration arglist, defined with args

I just resolved an absolute headbanger of a problem, and the issue was so simple, yet so elusive. So frustratingly hidden behind a lack of compiler feedback and an excess of compiler complacency (which is rare!). During writing this post, I found a few similar questions, but none that quite match my scenario.
Calling method without typeless argument produces a compiler error when the definition includes strongly typed args.
Why does gcc allow arguments to be passed to a function defined to be with no arguments? and C function with incomplete declaration both pass excess arguments to an argumentless function.
Why does an empty declaration work for definitions with int arguments but not for float arguments? does contain a successfully building declaration/definition mismatch, but has no invocation, where I would expect to see a too few arguments message.
I have a function declaration with no args, a call to that function with no args, and the function definition below with args. Somehow, C manages to successfully call the function, no warning, no error, but very undefined behaviour. Where does the function get the missing argument from? Why don't I get a linker error since the no-arg function isn't defined? Why don't I get a compiler error because I'm redefining a function with a different signature? Why, oh why, is this allowed?
Compiling as C++ code (gcc -x c++, enabling Compile To Binary on Godbolt) I get a linker error as expected, because of course C++ allows overloading, and the no-arg overload isn't defined. By checking with Godbolt, compiling with Clang and MSVC as C code also both build successfully, with only MSVC spitting out a minor warning.
Here is my reduced example for Godbolt.
// Compile with GCC or Clang -x c -Wall -Wextra
// Compile with MSVC /Wall /W4 /Tc
#include <stdio.h>
#include <stdlib.h>
// This is just so Godbolt can do an MSVC build
#ifndef _MSC_VER
# include <unistd.h>
#else
# define read(file, output, count) (InputBuffer[count] = count, fd)
#endif
static char InputBuffer[16];
int ReadInput(); // <-- declared with no args
int main(void)
{
int count;
count = ReadInput(); // <-- called with no args
printf("%c", InputBuffer[0]); // just so the results, and hence the entire function call,
printf("%d", count); // don't get optimised away by not being used (even though I'm
return 0; // not using any optimisation... just being cautious)
};
int ReadInput(int fd) // <-- defined with args!
{
return read(fd, InputBuffer, 1); // arg is definitely used, it's not like it's optimised away!
};
Where does the function get the missing argument from?
Typically, the called function is compiled to get its parameters from the places the arguments would be passed according to the ABI (Application Binary Interface) being used. This is necessarily true when the called function is in a separate translation unit (and there is no link-time optimization), so the compiler cannot adjust it according to the calling code. If the call and the called function are in the same translation unit, the compiler could do other things.
For example, if the ABI says the first int class parameter is passed in processor register r4, then the called function will get its parameter from register r4. Since the caller has not put an argument there, the called function gets whatever value happens to be in r4 from previous use.
Why don't I get a linker error since the no-arg function isn't defined?
C implementations generally resolve identifiers by name only. Type information is not part of the name or part of resolution. A function declared as int ReadInput() has the same name as a function declared as int ReadInput(int fd), and, as far as the linker is concerned, a definition of one will satisfy a reference to the other.
Why don't I get a compiler error because I'm redefining a function with a different signature?
The declarations are compatible. In C, the declaration int ReadInput() does not mean the function has no parameters. It means “There is a function named ReadInput that returns int, and I am not telling you what its parameters are.”
The declaration int ReadInput(int fd) means “There is a function named ReadInput that returns int, and it takes one parameter, an int.” These declarations are compatible; neither says anything inconsistent with the other.
Why, oh why, is this allowed?
History. Originally, C did not supply parameter information in function declarations, just in definitions. The prototype-less declarations are still allowed so that old software continues to work.
Other answers explained why it is legal to call a function that was declared without a prototype (but that it is your responsibility to get the arguments right). But you might be interested in the -Wstrict-prototypes warning option accepted by both GCC and clang, which is documented to "Warn if a function is declared or defined without specifying the argument types." Your code then yields warning: function declaration isn't a prototype.
Try it on godbolt.
(I'm kind of surprised this warning isn't enabled with -Wall -Wextra.)
In C, unlike in C++, declaring a function with no arguments means that the function may have as many arguments as you'd like. If you want to make it really not have any arguments, you just have to explicitly declare that:
int ReadInput(void);
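A minimal sketch of the difference, using a made-up function next_id rather than the ReadInput from the question: with (void) in the declaration, the compiler knows the function takes no arguments and can reject both a mismatched definition and a call that passes one.

```c
/* Prototype: explicitly takes no arguments. */
int next_id(void);

/* The definition must match the prototype; writing int next_id(int seed)
   here would be diagnosed as a conflicting declaration. */
int next_id(void)
{
    static int counter = 0;   /* persists across calls */
    return ++counter;
}

/* next_id(42); would not compile: too many arguments to function. */
```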

Why is '-lm' used explicitly only when passing variables to 'math.h' functions?

First of all, I have read this post Why do you need an explicit `-lm` compiler option & this gcc: why is the -lm flag needed to link the math library?. I wanna know why It doesn't happen in case of constants (when I say constants, I mean random floats/doubles)? If you're confused, call it floating-point literals.
Why do we have to use -lm to tell the linker to use math.h functions only when using variables as parameters but not constants? If I use sqrt(N)(N is some number), it compiles fine without any errors but when I pass some variable, let's say sqrt(var), it doesn't. It says:
/usr/bin/ld: /tmp/cc5P9o72.o: in function `main':
sq.c:(.text+0x1b): undefined reference to `sqrt'
collect2: error: ld returned 1 exit status
It should behave the same all the time (I think so, but I am wrong, of course), as I am using the same function from the same library, whether the argument is a variable or a constant. I first thought it was some kind of compiler optimization (if it's the same value every time, why not calculate it at compile time some other way, i.e. without using the library), but it doesn't work even if I pass a variable that has a fixed value from beginning to end. So I was wrong there. What is actually happening here?
Following is the snippet I tried:
#include <stdio.h>
#include <math.h>
int main () {
float a=9;
printf("%f",sqrt(a));
return 0;
}
It is very simple. When you pass a constant, many compilers evaluate the call at compile time without calling the math.h function (at least in a trivial example like this, where the result is not prone to floating-point inaccuracies or implementation differences).
Even if you pass a non-constant value, if you compile with math error checks disabled and fast math enabled (for GCC, -fno-math-errno or -ffast-math), the compiler can generate the floating-point machine instruction directly without calling the library function
Before asking, check the generated code, for example on godbolt.org; it will usually answer all of your questions

debugging the assembly equivalent of a c code to understand the function call

Just for my curiosity, I was looking on how the values passed to a function are actually operated by the called function. To make my doubt clear, I have an understanding that a compiler generates a code for a c code compiling it sequentially(Please correct if I am wrong). My doubt is how is the parameter value accessed in the called function? I mean parameter must be some a part of the calling function (like main() in my given example). How the compiler arrange that value passed in the calling function is same as the value accessed in the called function. To make my point clear, please look at the following code:
#include <stdio.h>
void check(int);
int main()
{
check(9999);
}
void check(int a)
{
int b;
b = a;
}
In the above code after the execution of code, value of b = 9999; but how come the value of a in the function check() attain the value of 9999 at assembly level when the function check() is called from main(). Is it like parameters are stored in certain registers and accessed accordingly using those registers in the check(). I hope you understood my question.
Calling conventions depend upon the ABI and the target processor (for your compiler).
There is a wikipage on x86 calling conventions. On Linux, you should read the x86-64 ABI. (Your formal argument a to check is passed in a register.)
The compiler is allowed to optimize your check to a nop (since its argument a has no observable effect)
Read also wikipage on evaluation strategy: the C programming language requires a call-by-value semantics.
Consider (if using GCC) compiling with gcc -fverbose-asm -S (perhaps also with optimizing options like -O2) and look inside the produced *.s assembly code.
You can use gcc -S source.c or gcc -S source.c -O2. Both commands generate assembly code. The first generates code without optimization, which typically copies each parameter into the function's stack frame; the second generates optimized code, which typically keeps parameters in registers. How the caller passes them is fixed by the ABI, not by the optimization level.
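To see the argument passing in the generated assembly, it helps to make the parameter's value observable; otherwise an optimizing build may drop the body of check entirely. The following is a small variant of the question's code, changed to return the copied value. The changed return type, the helper call_check, and the file name example.c are assumptions made for this sketch.

```c
/* Variant of check() that returns the copy, so the compiler has to keep
   the code that reads the parameter. Inspect the output of, for example:
     gcc -S -fverbose-asm example.c        (unoptimized)
     gcc -S -fverbose-asm -O2 example.c    (optimized)
   where example.c is a hypothetical file holding this code. */
int check(int a)
{
    int b;
    b = a;      /* a arrives wherever the ABI places the first int argument */
    return b;
}

/* Hypothetical caller: places 9999 in that same location before the call. */
int call_check(void)
{
    return check(9999);
}
```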

Why don't we get a compile time error even if we don't include stdio.h in a C program?

How does the compiler know the prototype of sleep function or even printf function, when I did not include any header file in the first place?
Moreover, if I specify sleep(1,1,"xyz") or any arbitrary number of arguments, the compiler still compiles it.
But the strange thing is that gcc is able to find the definition of this function at link time, I don't understand how is this possible, because actual sleep() function takes a single argument only, but our program mentioned three arguments.
/********************************/
int main()
{
short int i;
for(i = 0; i<5; i++)
{
printf("%d",i);
sleep(1);
}
return 0;
}
Lacking a more specific prototype, the compiler will assume that the function returns int and takes whatever number of arguments you provide.
Depending on the CPU architecture arguments can be passed in registers (for example, a0 through a3 on MIPS) or by pushing them onto the stack as in the original x86 calling convention. In either case, passing extra arguments is harmless. The called function won't use the registers passed in nor reference the extra arguments on the stack, but nothing bad happens.
Passing in fewer arguments is more problematic. The called function will use whatever garbage happened to be in the appropriate register or stack location, and hijinks may ensue.
In classic C, you don't need a prototype to call a function. The compiler will infer that the function returns an int and takes an unknown number of parameters. This may work on some architectures, but it will fail if the function returns something other than int, like a structure, or if there are any parameter conversions.
In your example, sleep is seen and the compiler assumes a prototype like
int sleep();
Note that the argument list is empty. In C, this is NOT the same as void. This actually means "unknown". If you were writing K&R C code, you could have unknown parameters through code like
int sleep(t)
int t;
{
/* do something with t */
}
This is all dangerous, especially on some embedded chips where the way parameters are passed for an unprototyped function differs from one with a prototype.
Note: prototypes aren't needed for linking. Usually, the linker automatically links with a C runtime library like glibc on Linux. The association between your use of sleep and the code that implements it happens at link time long after the source code has been processed.
I'd suggest that you use the feature of your compiler to require prototypes to avoid problems like this. With GCC, it's the -Wstrict-prototypes command line argument. In the CodeWarrior tools, it was the "Require Prototypes" flag in the C/C++ Compiler panel.
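With the right headers included, the prototypes are in scope and a call such as sleep(1,1,"xyz") is rejected at compile time. The following is a sketch of the question's loop with the headers added, assuming a POSIX system for <unistd.h>; the function name print_counts and its parameters are made up for this example.

```c
#include <stdio.h>    /* declares printf */
#include <unistd.h>   /* declares sleep (POSIX) */

/* Prints 0..n-1, pausing delay_seconds after each value.
   Returns the number of values printed. */
int print_counts(int n, unsigned int delay_seconds)
{
    int i;
    for (i = 0; i < n; i++) {
        printf("%d\n", i);
        sleep(delay_seconds);
    }
    return i;
}
```

Calling print_counts(5, 1) reproduces the loop from the question.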
C will guess int for unknown types. So, it probably thinks sleep has this prototype:
int sleep(int);
As for giving multiple parameters and linking...I'm not sure. That does surprise me. If that really worked, then what happened at run-time?
This has to do with the difference between 'K & R C' and 'ANSI C'.
In good old K & R C, if something is not declared, it is assumed to be int. So anything that looks like a function call, but is not declared as a function, will automatically be given a return type of 'int' and argument types depending on the actual call.
However, people later figured out that this can be very bad sometimes, so several compilers added a warning. C++ made this an error. gcc can also make this condition an error, with -Werror=implicit-function-declaration.
So, in a nutshell, this is historical baggage.
Other answers cover the probable mechanics (all guesses as compiler not specified).
The issue that you have is that your compiler and linker have not been set to enable every possible error and warning. For any new project there is (virtually) no excuse for not doing so. For legacy projects there is more excuse, but you should still strive to enable as many as possible.
Depends on the compiler, but with gcc (for example, since that's the one you referred to), some of the standard (both C and POSIX) functions have builtin "compiler intrinsics". This means that the compiler library shipped with your compiler (libgcc in this case) contains an implementation of the function. The compiler will allow an implicit declaration (i.e., using the function without a header), and the linker will find the implementation in the compiler library because you're probably using the compiler as a linker front-end.
Try compiling your objects with the '-c' flag (compile only, no link), and then link them directly using the linker. You will find that you get the linker errors you expect.
Alternatively, gcc supports options to disable the use of intrinsics: -fno-builtin or for granular control, -fno-builtin-function. There are further options that may be useful if you're doing something like building a homebrew kernel or some other kind of on-the-metal app.
In a non-toy example another file may include the one you missed. Reviewing the output from the pre-processor is a nice way to see what you end up with compiling.
