I just wanted to verify: in VC++, are member functions that are never called treated as inline functions by the compiler by default? If so, why? Why not discard such a function completely (since it will never be called) instead of inlining it?
What is the advantage?
Update
The question is why even inline it when it will never be called? Why not simply discard it forever, just like some unused variables are discarded.
Member functions are considered inline without use of the inline keyword if they are defined in the body of the class definition. Whether they are called or not has nothing to do with it.
Unused member functions can't generally be discarded because their names have external linkage -- that is to say, some other translation unit or executable might call them, that hasn't even been written at the time this translation unit is compiled or this executable is linked.
Once you get to link-time, if the implementation somehow knows that this cannot happen then it could discard the code for the function. For example because the OS has no means to look up symbols in an executable, or because you've told the linker to strip them out using some implementation-defined option.
Relating this to VC++ in particular: on Windows you can look up symbols in executables if they're dllexport. So those functions won't generally be discarded even at link time, and other unused functions can't be discarded at compile time just because this TU doesn't use them. For most classes defined in the usual way, with a header file that declares the member functions and a source file that defines them, the functions are unused in that source file. So if the compiler discarded them because they were unused in that TU, nothing would ever work.
I think (I'm not sure) that whether the function is inline or not is relevant to whether it can be discarded, but might not mean that it can be entirely discarded. It's true that if it's inline, and someone calls it, then that someone must have the definition of the function in their TU. So in some sense the function is "not needed". However, any static local variables must be shared no matter what TU it's called from, and the address of the function itself must be the same no matter what TU it's taken in. So there may still have to be "something" there even if it's not the full code for the function.
But as I said -- even if inline functions can be discarded when unused, not all unused functions are inline.
Inline it where? It's never called, so it's impossible to inline it into any call site.
The Standard mandates whether a function is or is not considered inline. Whether or not it is called is irrelevant.
Related
I have heard that, when you have just 1 (main.c) file (or use a "unity build"), there are benefits to be had if you make all your functions static.
I am kind of confused why this (allegedly) isn't optimized by default, since it's not probable that you will include main.c into another file where you will use one of its functions.
I would like to know the benefits and dangers of doing this before implementing it.
Example:
main.c
static int my_func(void){ /*stuff*/ }
int main(void) {
    my_func();
    return 0;
}
You have various chunks of wisdom in the comments, assembled here into a Community Wiki answer.
Jonathan Leffler noted:
The primary benefit of static functions is that the compiler can (and will) aggressively inline them when it knows there is no other code that can call the function. I've had error messages from four levels of inlined function calls (three qualifying “inlined from” lines) on occasion. It's staggering what a compiler will do!
and:
FWIW: my rule of thumb is that every function should be static until it is known that it will be called from code in another file. When it is known that it will be used elsewhere, it should be declared in a header file that is included both where the function is defined and where it is used. (Similar rules apply to file scope variables — aka 'global variables'; they should be static until there's a proven need for them elsewhere, and then they should be declared in a header too.)
The main() function is always called from the startup code, so it is never static. Any function defined in the same file as an unconditionally compiled main() function cannot be reused by other programs. (Library code might contain a conditionally compiled test program for the library function(s) defined in the source file — most of my library code has #ifdef TEST / …test program… / #endif at the end.)
Eric Postpischil generalized on that:
General rule: Anytime you can write code that says the use of something is limited, do it. Value will not be modified? Make it const. Name only needs to be used in a certain section? Declare it in the innermost enclosing scope. Name does not need to be linked externally? Make it static. Every limitation both shrinks the window for a bug to be created and may remove complications that interfere with optimization.
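As a minimal sketch of the static-by-default rule quoted above (the names square and sum_of_squares are invented for illustration), only the one function that other files need gets external linkage. Whether the helper is actually inlined is up to the compiler and its optimization settings:

```c
#include <stddef.h>

/* Internal linkage: only this translation unit can call square(), so the
   compiler is free to inline it and drop the out-of-line copy. */
static int square(int x)
{
    return x * x;
}

/* External linkage: the one entry point other files may call. */
int sum_of_squares(const int *values, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; ++i)
        total += square(values[i]);
    return total;
}
```

If another file later needs square(), that is the point to drop the static and declare it in a header.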
I'm a beginner at linking, sorry if my questions are too basic. Let's say I have two .c files:
file1.c is
int main(int argc, char *argv[])
{
    int a = function2();
    return 0;
}
file2.c is
int function2()
{
    return 2018;
}
I know the norm is, create a file2.h and include it in file1.c, and I have some questions:
Q1. The #include in file1.c doesn't seem to make much difference to me. I can still compile file1.c correctly without file2.h; the compiler just warns me about an 'implicit declaration of function 'function2''. Does this warning help much? Programmers presumably know that function2 is defined in another .c file (if you use function2 but don't define it, you certainly know the definition is somewhere else), and the linker will do its job to produce the final executable. So the only purpose of including file2.h, as far as I can tell, is to avoid the warning during compilation. Is my understanding correct?
Q2. Imagine this scenario: a programmer defines function2 in file1.c and doesn't know that it conflicts with the one in file2.c until the linker throws an error (obviously, he can compile file1.c alone correctly). If we want him to learn of his mistake when he compiles file1.c, adding file2.h still doesn't help, so what's the purpose of adding the header file?
Q3. What should we add to let the programmer know he should choose a different name for function2, rather than being informed of the error by the linker in the final stage?
Per C89 3.3.2.2 Function calls emphasis mine:
If the expression that precedes the parenthesized argument list in a function call consists solely of an identifier, and if no declaration is visible for this identifier, the identifier is implicitly declared exactly as if, in the innermost block containing the function call, the declaration
extern int identifier();
appeared
Now, remember, an empty parameter list (nothing inside the () parentheses) declares a function that takes an unspecified number and type of arguments. Write void inside the parentheses to declare that a function takes no arguments, like int func(void).
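A small sketch of that difference, with invented function names. It assumes a pre-C23 compiler mode, since C23 changed an empty parameter list to mean the same as (void):

```c
/* No prototype: before C23, the argument list is unspecified and unchecked. */
int old_style();

/* Prototype: the function takes exactly zero arguments. */
int no_args(void);

int old_style() { return 1; }
int no_args(void) { return 2; }

int call_both(void)
{
    /* old_style(1, 2, 3) would be accepted before C23 (with undefined
       behavior at run time); no_args(1, 2, 3) is a constraint violation
       that the compiler must diagnose. */
    return old_style() + no_args();
}
```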
Q1:
does this warning help a lot?
Yes and no. This is a subjective question. It helps those who use it. As a personal note, always make this warning an error. With gcc, use -Werror=implicit-function-declaration. But you can also ignore this warning and write the simplest main() { printf("hello world!\n"); } program.
linker will do its job to produce the final executable file? so the only purpose of include file2.h to me is, don't show any warning during compilation, is my understanding correct.
No. If the function is called through an incompatible type, the call invokes undefined behavior. If the function is declared as void (*function2(void))(int a); then calling ((int(*)())function2)() is UB, as is calling function2() without a previous declaration. Per Annex J.2 (informative):
The behavior is undefined in the following circumstances:
A pointer is used to call a function whose type is not compatible with the pointed-to type (6.3.2.3).
and per C11 6.3.2.3p8:
A pointer to a function of one type may be converted to a pointer to a function of another type and back again; the result shall compare equal to the original pointer. If a converted pointer is used to call a function whose type is not compatible with the referenced type, the behavior is undefined.
So in your lucky case of int function2(), this indeed works. It also works, for example, for the atoi() function, which returns int. But calling atol() without a declaration invokes undefined behavior, because it returns long.
Q2:
the linker throws the error
This should happen, but it is really linker dependent. If you compile all sources in a single step with gcc, it will throw an error. But if you create static libraries and then link them with gcc without -Wl,-whole-archive, then it will pick the first definition it sees; see this thread.
what's the purpose of adding header file?
I guess simplicity and order. It is a convenient and standard way to share data structures (enum, struct, typedefs) and declarations (function and variable types) between developers and libraries, and also to share preprocessor directives. Imagine you are writing a big library with over 1000 files that will work with over 100 other libraries. At the beginning of each file, would you write struct mydata_s { int member1; int member2; ... }; int printf(const char*, ...); int scanf(const char *, ...); etc., or just #include "mydata.h" and #include <stdio.h>? If you needed to change the mydata_s structure, you would have to change all the files in your project, and all the other developers who use your library would need to change the definition too. I'm not saying you can't do it, but it would be more work and no one would use your library.
Q3:
What should we add to let the programmer know he should choose a different name for function2 rather than be informed the error by linker in the final stage.
In case of name clashes you will be informed (hopefully) by the linker that it found two identifiers with the same name. Otherwise you would need to create a tool that checks your sources for exactly that. I don't see the need for this: the linker is specifically made to resolve symbols, so it naturally handles the case where two symbols with the same identifier exist.
Short answer:
Takeaway: the earlier the compiler alerts, the better.
Q1: meaning of .h: consistency and early alerts. Alerting early on common ways of going wrong improves reliability of code and adds up to less debugging and fewer production crashes.
Q2: Clashing names: early alerts reach developers while clashes are usually still easy to fix.
Q3: Early duplicate definition alerts are not baked into the C standard.
Exercises:
1. Define a function in one file that prints an int argument with printf("%d\n", i), then call that function from another file with a float of 42.0.
2. Call with (double)42.0.
3. Define a function with a char *str argument printed with %s, then call it with an int argument.
Longer answers:
Popular convention: in typical use the name of the .h file is derived from the .c file, or files, it is associated with: file.h and file.c. For .h files with many definitions, say string.h, derive the file name from a higher-level perspective of what's within (as in the str... functions).
My big rule: it’s always better to structure your code so compilers can immediately alert on bugs at compile time, rather than letting them slide through to debug time or run time, where finding them depends on the code actually running in just the right way. Run-time errors can be very difficult to diagnose, especially if they hit long after the program is in production; they are expensive to maintain and hurt your customer experience. See "yoda notation".
Q1: meaning of .h: consistency and early alerts and improved reliability of code.
C .h files allow developers of .c files compiled at different times to share common declarations. No duplicate code. .h files also allow functions to be called consistently from all files while identifying improper argument signatures (argument counts, type clashes, etc.). Having the .c files that define functions also #include the .h file helps ensure the arguments in the definition are consistent with the calls; this may sound elementary, but without it all the human errors of signature clashes can sneak through.
Omitting .h files only works if the argument signatures of all callers perfectly match those in the definitions. This is often not the case, so without .h files any clashing signatures would produce bad numbers unless you also had parallel externs in the calling file (bad bad bad). Things like int vs float can produce spectacularly wrong argument values. Bad pointers can produce segmentation faults and other total crashes.
Advantage: with externs in .h files, compilers can correctly convert mismatching arguments to the declared type, assuring better calls. While you can still botch arguments, it’s much less likely. It also helps avoid conditions where the mismatches work on one implementation but not another.
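To illustrate the conversion point, here is a minimal sketch with invented names (twice, call_with_double). Because the prototype is in scope at the call, the double argument is converted to int as if by assignment:

```c
/* Prototype, as it would appear in a shared header. */
int twice(int i);

int call_with_double(void)
{
    /* The prototype is visible, so 42.0 is converted from double to int
       before the call is made. */
    return twice(42.0);
}

int twice(int i)
{
    return 2 * i;
}
```

Without the prototype in scope, the same call would pass a double to a function expecting an int, which is undefined behavior.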
Implicit declaration warnings are hugely helpful to me, as they usually indicate I’ve forgotten a .h file or misspelled an external name.
Q2: Clashing Names. Early alerts.
Clashing names are bad, and it is the developer's responsibility to avoid problems. C++ solves the issue with namespaces, which C, being a lower-level language, does not have.
Use of .h files can let compiler diagnostics alert developers to where clashes are early in the game. If compiler diagnostics don’t do this, hopefully linkers will do so with multiply-defined symbol errors, but this is not guaranteed by the standard.
A common way to fake name spaces is by starting all potentially clashing definitions in a .h with some prefix (extern int filex_function1(int arg, char *string) or #define FILEX_DEF 42).
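A sketch of the prefix convention, using the two names from the paragraph above. The split into filex.h and filex.c and the body of the function are assumptions made only to give a complete example:

```c
#include <string.h>

/* --- would live in filex.h --- */
#define FILEX_DEF 42
extern int filex_function1(int arg, char *string);

/* --- would live in filex.c --- */
/* Hypothetical body: returns arg plus the string length, capped at FILEX_DEF. */
int filex_function1(int arg, char *string)
{
    int total = arg + (int)strlen(string);
    return total > FILEX_DEF ? FILEX_DEF : total;
}
```

Every externally visible name carries the filex_ or FILEX_ prefix, so a clash with another module's function1 or DEF is unlikely.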
What to do if two different external libraries being used share the same names is beyond the scope of this answer.
Q3: early duplicate alerts. Sorry… early alerts are implementation dependent.
This would be difficult for the C standard to define. As C is an old language there are many creative different ways C programs are written and stored.
Hunting for clashing names before using them is up to the developer. Tools like cross reference programs can help. Even something stupid like ctags associated with vim or emacs can help.
you misunderstand usage of header files and function prototypes.
header files are needed to share common information between multiple code files. such information includes macro definition, data types, and, possibly, function prototypes.
function prototypes are needed for the compiler to correctly handle return data types and to give you early warnings of misuse of function return types and arguments.
function prototypes can be declared in header files or can be declared in the files which use them (more typing).
you have a very simple example, with just 2 files. Now imagine a project with hundreds of files and thousands of functions. You will be lost in linker errors.
'c' allows you to use an undeclared function due to legacy reasons. In this situation it assumes that the function has a return type of 'int'. However, modern data types have a bigger variety than in the early days. A function can return pointers, 64-bit data, structures. To express that you must use prototypes or nothing will work. The compiler has to know how to handle function returns correctly.
Also, it can give you warnings about incorrect use of argument types. For legacy reasons, those are still warnings in C, but C++ addressed this early on and turned them into errors.
Those warnings give you early debugging capabilities. Type mismatch warnings can save you days of debugging in some cases.
So, in your example you do not need the header file. You can prototype the function in the 'main' file using the 'extern' syntax. You can even do without prototyping. However, in real modern programming world you cannot allow the latter. In particular when you work in a team or want your program to be maintainable.
It is a good idea to store your function prototypes in header files. This would be a good documentation source, in particular with good comments. BTW, function names must make sense to be maintainable.
Q1. Yes. C is a low level language, and was historically used to bind low level constructs into higher level concepts. For example, traditionally the label _end is at the last address in a program. The label is typeless but you can declare it as any type that is convenient to you. A "properly typed" language would make this sort of abuse difficult.
Q2. By convention, both file1.c and file2.c would include file2.h; one as consumer, the other as producer. Following this simple idiom will catch declaration vs definition errors; although again, the "warning" is not necessarily enforced.
Q3. Many software organizations take a "warnings are errors" rule to socially control their programmers.
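The producer/consumer idiom from Q2 can be sketched in a single listing, with comments marking which file each part would live in. The function use_function2 is an invented stand-in for the main() of file1.c:

```c
/* --- file2.h: the single shared declaration --- */
int function2(void);

/* --- file2.c (producer): includes file2.h, so the compiler checks this
       definition against the declaration above --- */
int function2(void)
{
    return 2018;
}

/* --- file1.c (consumer): includes file2.h, so the call is checked too --- */
int use_function2(void)
{
    int a = function2();
    return a;
}
```

If the definition were changed to, say, long function2(void) while the header still said int, the producer file would stop compiling because the two declarations conflict.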
I have been researching this topic and I can not find a specific authoritative answer. I am hoping that someone very familiar with the C spec can answer - i.e. confirm or refute my assertion, preferably with citation to the spec.
Assertion:
If a program consists of more than one compilation unit (separately compiled source file), the compiler must assure that global variables (if modified) are written to memory before any call to a function in another unit or before the return from any function. Also, in any function, the global must be read before its first use. Also after a call of any function, not in the same unit, the global must be read before use. And these things must be true whether the variable is qualified as "volatile" or not because a function in another compilation unit (source file) could access the variable without the compiler's knowledge. Otherwise, "volatile" would always be required for global variables - i.e. non-volatile globals would have no purpose.
Could the compiler treat functions in the same compilation unit differently than ones that aren't? All of the discussions I have found for the "volatile" qualifier on globals show all functions in the same compilation unit.
Edit: The compiler cannot know whether functions in other units use the global or not. Therefore I am assuming the above conditions.
I found these two other questions with information related to this topic but they don't address it head on or they give information that I find suspect:
Are global variables refreshed between function calls?
When do I need to use volatile in ISRs?
[..] in any function, the global must be read before its first use.
Definitely not:
static int variable;
void foo(void) {
    variable = 42;
}
Why should the compiler bother generating code to read the variable?
The compiler must assure that global variables are written to memory before any function call or before the return from a function.
No, why should it?
void bar(void) {
    return;
}
void baz(void) {
    variable = 42;
    bar();
}
bar is a pure function (should be determinable for a decent compiler), so there's no chance of getting any different behaviour when writing to memory after the function call.
The case of "before returning from a function" is tricky, though. But I think the general statement ("must") is false if we count inlined (static) functions, too.
Could the compiler treat functions in the same compilation unit differently than ones that aren't?
Yes, I think so: for a static function (whose address is never taken) the compiler knows exactly how it is used, and this information could be used to apply some more radical optimisations.
I'm basing all of the above on the C version of the As-If rule, specified in §5.1.2.3/6 (N1570):
The least requirements on a conforming implementation are:
Accesses to volatile objects are evaluated strictly according to the rules of the abstract machine.
At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.
The input and output dynamics of interactive devices shall take place as specified in 7.21.3. The intent of these requirements is that unbuffered or line-buffered output appear as soon as possible, to ensure that prompting messages actually appear prior to a program waiting for input.
This is the observable behavior of the program.
In particular, you might want to read the following "EXAMPLE 1".
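A minimal sketch contrasting the two cases under the as-if rule. The variable observed and the functions returning values are additions for illustration. What a given compiler actually emits depends on the compiler and its optimization level:

```c
static int variable;          /* not visible outside this translation unit */
static volatile int observed; /* every access is observable behavior */

static void bar(void) { }

int baz(void)
{
    variable = 42;   /* may stay in a register or be folded into the return */
    bar();           /* the compiler can see that bar() never touches variable */
    return variable;
}

int qux(void)
{
    observed = 42;   /* this store must actually be performed */
    bar();
    return observed; /* and this load must actually be performed */
}
```

Both functions return 42. Only qux() obliges the implementation to perform the memory accesses exactly as written.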
I know that it's poor practice to not include function prototypes, but if you don't, then the compiler will infer a prototype based on what you pass into the function when you call it (according to this answer). My question is why does the compiler infer the prototype from what you pass into the function rather than the definition of the function itself? I can imagine some kind of preprocessing step where all declared functions are identified and checked to see if a prototype exists for each one. If one doesn't have a prototype, the first line of the function is copied and stuck under the existing prototypes. Why isn't this done?
Because the C compiler was designed as a single pass compiler, where any given file does not know about the other source files that make up the project.
Although compilers have gotten more sophisticated, and may do multiple passes, the general outline of the compilation process framework remains as it was in K&R's day:
Pre-process each source file (macro text replacement only).
Compile the processed source into an object file.
Link the objects into an executable or library.
Inferring prototypes would have to happen in the first step, but the compiler does not know about the existence of any other objects which may contain the function definition at that time.
It might be possible to make a compiler which did what you suggest, but not without breaking the existing rules for how to infer prototypes. A change with such big consequences would make the language no longer C.
The major use for prototypes is to declare a function and inform the compiler about the number and type of arguments in cases where the definition is not visible. Since C was originally compiled single-pass, the definition is not visible when it occurs later in the translation unit, but the more important case from a modern perspective is when the definition is not visible at all, due to lying in a separate translation unit, possibly even in a library file that exists only in compiled form and where no information about the function's type is recorded.
I'm following a guide to learn curses, and all of the C code within prototypes functions before main(), then defines them afterward. In my C++ learnings, I had heard about function prototyping but never done it, and as far as I know it doesn't make too much of a difference on how the code is compiled. Is it a programmer's personal choice more than anything else? If so, why was it included in C at all?
Function prototyping originally wasn't included in C. When you called a function, the compiler just took your word for it that it would exist and took the type of arguments you provided. If you got the argument order, number, or type wrong, too bad – your code would fail, possibly in mysterious ways, at runtime.
Later versions of C added function prototyping in order to address these problems. Your arguments are implicitly converted to the declared types under some circumstances or flagged as incompatible with the prototype, and the compiler could flag as an error the wrong order and number of types. This had the side effect of enabling varargs functions and the special argument handling they require.
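As an illustration of the varargs point, a small sketch of a variadic function whose prototype ends in an ellipsis. The name sum_ints and the convention of passing a count first are assumptions for the example:

```c
#include <stdarg.h>

/* The prototype tells every caller that the arguments after count are variadic. */
int sum_ints(int count, ...);

int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; ++i)
        total += va_arg(args, int);  /* each extra argument is read as an int */
    va_end(args);
    return total;
}
```

A call such as sum_ints(3, 1, 2, 3) is only well defined when this prototype is visible at the call site.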
Note that, in C (and unlike in C++), a function declared foo_t func() is not the same as a function declared as foo_t func(void). The latter is prototyped to have no arguments. The former declares a function without a prototype.
In C, prototyping is needed so that the compiler knows a function called x() exists before it reaches the definition; that way y() can call x(). C compiles top-down, so the short answer is that a function needs to be declared beforehand.
void x(void);
void y(void);
int main(void){
    return 0;
}
void y(void){
    x();
}
void x(void){
    /* ...
       more code ...
       maybe even y(); */
}
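A runnable variant of the same idea, using two invented mutually recursive functions. Whichever one is defined first has to call the other before the compiler has seen its definition, so at least one declaration must come up front:

```c
/* Prototypes up front, so each function can be called before its definition. */
int is_even(unsigned n);
int is_odd(unsigned n);

int is_even(unsigned n)
{
    return n == 0 ? 1 : is_odd(n - 1);   /* is_odd is not defined yet here */
}

int is_odd(unsigned n)
{
    return n == 0 ? 0 : is_even(n - 1);
}
```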
I was under the impression that it was so customers could have access to the .h file for libraries and see what functions were available to them, without having to see the implementation (which would be in another file).
Useful to see what the function returns/what parameters.
Function prototyping is a remnant from the olden days of compiler writing. It used to be considered horribly inefficient for a compiler to have to make multiple passes over a source file to compile it.
In C, in certain contexts, referring to a function in one manner is syntactically equivalent to referring to a variable: consider taking a pointer to a function versus taking a pointer to a variable. In the compiler's intermediate representation, the two are semantically distinct, but syntactically, whether an identifier is a variable, a function name, or an invalid identifier cannot be determined from the context.
Since it's not determinable from the context, without function prototypes, the compiler would need to make an extra pass over each one of your source files each time one of them compiles. This would add an extra O(n) factor for any compilation (that is, if compilation were O(m), it would now be O(m*n)), where n is the number of files in your project. In large projects, where compilation is already on the order of hours, having a two-pass compiler is highly undesirable.
Forward declaring all your functions would allow the compiler to build a table of functions as it scanned the file, and be able to determine when it encountered an identifier whether it referred to a function or a variable.
As a result of this, C (and by extension, C++) compilers can be extremely efficient in compilation.
It allows you to have a situation in which say you can have an iterator class defined in a separate .h file which includes the parent container class. Since you've included the parent header in the iterator, you can't have a method like say "getIterator()" because the return type would have to be the iterator class and therefore it would require that you include the iterator header inside the parent header creating a cyclic loop of inclusions (one includes the other which includes itself which includes the other again, etc.).
If you put the iterator class prototype inside the parent container, you can have such a method without including the iterator header. It only works because you're simply saying that such an object exists and will be defined.
There are ways of getting around it like having a precompiled header, but in my opinion it's less elegant and comes with a slew of disadvantages. Of course this is C++, not C. However, in practice you might have a situation in which you'd like to arrange code in this fashion, classes aside.
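In plain C, the same arrangement can be sketched with a forward-declared struct. All names here (container, iterator, container_begin, iterator_next) are invented for illustration:

```c
#include <stddef.h>

/* Forward declaration: the iterator type exists, its layout comes later. */
struct iterator;

struct container {
    int items[4];
    size_t count;
};

/* Uses struct iterator only through a pointer, so the incomplete type is enough. */
void container_begin(const struct container *c, struct iterator *out);

/* Full definition; in a larger project this could sit in its own header. */
struct iterator {
    const struct container *owner;
    size_t index;
};

void container_begin(const struct container *c, struct iterator *out)
{
    out->owner = c;
    out->index = 0;
}

/* Returns 1 and stores the next item, or 0 when the container is exhausted. */
int iterator_next(struct iterator *it, int *value)
{
    if (it->index >= it->owner->count)
        return 0;
    *value = it->owner->items[it->index++];
    return 1;
}
```

The container's declarations never need the iterator's full definition, so neither header would have to include the other.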