Retaining Compatibility To Assembly With inline Functions - c

I'm writing some header files, which are to be accessed by both C code and assembly. Assembly code is preprocessed with the C preprocessor for this sake.
The problem is I have plenty of inline functions in those header files. The assembler cannot process functions that are not symbols in an object file (as with static inline functions), so I cannot use those. I've read this and this invaluable post and have by now grasped how to use extern and static in conjunction with inline, but I am unsure how to make inline functions accessible to both C code and assembly.
My current approach is to write inline functions (with >= GNU99, -O3 inlines the function, anything else calls an external definition of that function, which I need to define explicitly) in a header file and write external definitions in an implementation file. The C code includes the header file (the inline functions) and is compiled with -O3, thus using the inlined versions. The assembly code uses the external definitions.
Questions:
The assembly code can only call the functions, inlining is currently impossible. Can assembly code, by any means, make use of inlining? I mean as in an .S file, not inline assembly.
extern inline would be similarly good as my current method but it boils down to just one definition (the external definition is emitted automatically), so it cannot be divided into header and source file, which is crucial to make it accessible to C code (header) and assembly (source).
Is there any better method to achieve what I've been trying to do?

The overhead of a call forcing you to assume most registers are clobbered is pretty high. For high performance you need to manually inline your functions into asm so you can fully optimize everything.
Getting the compiler to emit a stand-alone definition and calling it should only be considered for code that's not performance-critical. You didn't say what you're writing in asm, or why, but I'm assuming that it is performance critical. Otherwise you'd just write it in C (with inline asm for any special instructions, I guess?).
If you don't want to manually inline, and you want to use these small inline C functions inside a loop, you'll probably get better performance from writing the whole thing in C. That would let the compiler optimize across a lot more code.
The register-arg calling conventions used for x86-64 are nice, but there are a lot of registers that are call-clobbered, so calls in the middle of computing stuff stop you from keeping as much data live in registers.
Can assembly code, by any means, make use of inlining? I mean as in an
.S file, not inline assembly.
No, there's no syntax for the reverse of inline-asm. If there was, it would be something like: you tell the compiler what registers the inputs are in, what registers you want outputs in, and which registers it's allowed to clobber.
Common-subexpression-elimination and other significant optimizations between the hand-written asm and the compiler output wouldn't be possible without a compiler that really understood the hand-written asm, or treated it as source code and then emitted an optimized version of the whole thing.
Optimal inlining of compiler output into asm will typically require adjustments to the asm, which is why there aren't any programs to do it.
Is there any better method to achieve what I've been trying to do?
Now that you've explained in comments what your goals are: make small wrappers in C for the special instructions you want to use, instead of the other way around.
#include <stdint.h>
struct __attribute__((packed)) lgdt_arg {
    uint16_t limit;
    void * base; // FIXME: always 64bit in long mode, including the x32 ABI where pointers and uintptr_t are 32bit.
                 // In 16bit mode, base is 24bit (not 32), so I guess be careful with that too
                 // you could just make this a uint64_t, since x86 is little-endian.
                 // The trailing bytes don't matter since the instruction just uses a pointer to the struct.
};
// static, so that a call the compiler declines to inline still links under C99 inline rules
static inline void lgdt (const struct lgdt_arg *p) {
    asm volatile ("lgdt %0" : : "m"(*p) : "memory");
}
// Or this kind of construct sometimes gets used to make doubly sure compile-time reordering doesn't happen:
static inline void lgdt_v2 (struct lgdt_arg *p) {
    asm volatile ("lgdt %0" : "+m"(*(volatile struct lgdt_arg *)p) :: "memory");
}
// that puts the asm statement into the dependency chain of things affecting the contents of the pointed-to struct, so the compiler is forced to order it correctly.
void set_gdt(unsigned size, char *table) {
    struct lgdt_arg tmp = { size, table };
    lgdt (&tmp);
}
set_gdt compiles to (gcc 5.3 -O3 on godbolt):
movw %di, -24(%rsp)
movq %rsi, -22(%rsp)
lgdt -24(%rsp)
ret
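The FIXME in the struct above suggests a fixed-width base. A minimal sketch of that variant for 64-bit mode only; the names lgdt_arg64 and make_lgdt_arg64 are made up for illustration, and the packed attribute is the same GNU C extension used above:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 64-bit-only variant of lgdt_arg: the base is a fixed-width
   integer, so the layout no longer depends on sizeof(void *). */
struct __attribute__((packed)) lgdt_arg64 {
    uint16_t limit;
    uint64_t base;
};

/* In long mode the lgdt operand is 10 bytes: 2-byte limit, 8-byte base. */
_Static_assert(sizeof(struct lgdt_arg64) == 10, "unexpected lgdt operand size");
_Static_assert(offsetof(struct lgdt_arg64, base) == 2, "base must follow limit");

/* Builds the operand from a limit and a table address. */
static inline struct lgdt_arg64 make_lgdt_arg64(uint16_t limit, const void *table)
{
    struct lgdt_arg64 arg = { limit, (uint64_t)(uintptr_t)table };
    return arg;
}
```

The static assertions catch a layout mistake at compile time, which is cheaper than debugging a triple fault.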
I've never written code involving lgdt. It's probably a good idea to use a "memory" clobber like I did to make sure any loads/stores aren't reordered across it at compile time. That will make sure the GDT it points to is fully initialized before running LGDT. (Same for LIDT.) Compilers might notice that base gives the inline asm a reference to the GDT, and make sure its contents are in sync, but I'm not sure. There should be little to no downside to just using a "memory" clobber here.
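To show what the "memory" clobber buys you without needing a privileged instruction, here is a rough sketch using an empty asm statement as a compile-time barrier. demo_table, compiler_barrier and fill_then_publish are hypothetical names, and the table values are placeholders rather than real descriptors:

```c
/* Hypothetical three-entry table standing in for a GDT; the values are
   placeholders, not real segment descriptors. */
static unsigned long long demo_table[3];

/* An empty asm statement with a "memory" clobber (GNU C extension): the
   compiler treats it as possibly reading or writing any memory, so it does
   not reorder memory accesses across it. It emits no instruction and does
   not order anything at the hardware level. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__("" : : : "memory");
}

static const unsigned long long *fill_then_publish(void)
{
    demo_table[0] = 0;
    demo_table[1] = 11;
    demo_table[2] = 22;
    compiler_barrier();   /* the three stores are emitted before this point */
    return demo_table;    /* where a real wrapper would execute lgdt */
}
```

The lgdt wrapper gets the same compile-time ordering from its own "memory" clobber, so a separate barrier is not needed there.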
Linux (the kernel) uses this sort of wrapper around an instruction or two all over the place, writing as little code as possible in asm. Look there for inspiration if you want.
re: your comments: yes you'll want to write your boot sector in asm, and maybe some other 16bit code since gcc's -m16 code is silly (still basically 32bit code).
No, there's no way to inline C compiler output into asm other than manually. That's normal and expected, for the same reason there aren't programs that optimize assembly. (i.e. read asm source, optimize, write different asm source).
Think about what such a program would have to do: it would have to understand the hand-written asm to be able to know what it could change without breaking the hand-written asm. Asm as a source language doesn't give an optimizer much to work with.

The answer you linked to explains how C99 inline functions work but doesn't explain why the definition is that quirky. The relevant standard paragraph is ISO 9899:2011 §6.7.4 ¶6–7 (ISO 9899:1999 ibid.):
6 A function declared with an inline function specifier is an inline function. Making a function an inline function suggests that calls to the function be as fast as possible.138) The extent to which such suggestions are effective is implementation-defined. 139)
7 Any function with internal linkage can be an inline function. For a function with external linkage, the following restrictions apply: If a function is declared with an inline function specifier, then it shall also be defined in the same translation unit. If all of the file scope declarations for a function in a translation unit include the inline function specifier without extern, then the definition in that translation unit is an inline definition. An inline definition does not provide an external definition for the function, and does not forbid an external definition in another translation unit. An inline definition provides an alternative to an external definition, which a translator may use to implement any call to the function in the same translation unit. It is unspecified whether a call to the function uses the inline definition or the external definition.140)
138) By using, for example, an alternative to the usual function call mechanism, such as ”inline substitution”. Inline substitution is not textual substitution, nor does it create a new function. Therefore, for example, the expansion of a macro used within the body of the function uses the definition it had at the point the function body appears, and not where the function is called; and identifiers refer to the declarations in scope where the body occurs. Likewise, the function has a single address, regardless of the number of inline definitions that occur in addition to the external definition.
139) For example, an implementation might never perform inline substitution, or might only perform inline substitutions to calls in the scope of an inline declaration.
140) Since an inline definition is distinct from the corresponding external definition and from any other corresponding inline definitions in other translation units, all corresponding objects with static storage duration are also distinct in each of the definitions.
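Footnote 138 is easy to skim past, so here is a small sketch of what "not textual substitution" means in practice. The macro STEP and the functions advance and advance_then_step are made-up names for illustration:

```c
#define STEP 1

/* The body below is compiled with STEP == 1, whatever callers see later. */
static inline int advance(int x)
{
    return x + STEP;
}

#undef STEP
#define STEP 100

/* Here STEP is 100, but advance() still adds 1, even if the call is inlined. */
static int advance_then_step(int x)
{
    return advance(x) + STEP;
}
```

If inlining were textual, advance_then_step(5) would add 100 twice; since it is not, it adds 1 and then 100.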
How does the definition of inline come into play? Well, if only inline declarations (without extern or static) of a function exist in a translation unit, no code for the function is emitted. But if a single declaration without inline or with extern exists, then code for the function is emitted, even if it is defined as an inline function. This design aspect allows you to describe the module that contains the machine code for an inline function without having to duplicate the implementation:
In your header file, place inline definitions:
fast_things.h
/* TODO: add assembly implementation */
inline int fast_add(int a, int b)
{
return (a + b);
}
inline int fast_mul(int a, int b)
{
return (a * b);
}
This header can be included in every translation module and provides inline definitions for fast_add and fast_mul. To generate the machine code for these two, add this file:
fast_things.c
#include "fast_things.h"
extern inline int fast_add(int, int);
extern inline int fast_mul(int, int);
You can avoid typing all of this out using some macro magic. Change fast_things.h like this:
#ifndef EXTERN_INLINE
#define EXTERN_INLINE_UNDEFINED
#define EXTERN_INLINE inline
#endif
EXTERN_INLINE int fast_add(int a, int b)
{
return (a + b);
}
EXTERN_INLINE int fast_mul(int a, int b)
{
return (a * b);
}
#ifdef EXTERN_INLINE_UNDEFINED
#undef EXTERN_INLINE
#undef EXTERN_INLINE_UNDEFINED
#endif
Then fast_things.c simply becomes:
#define EXTERN_INLINE extern inline
#include "fast_things.h"
Since code is emitted for the inline functions, you can call them from assembly just fine. You cannot however inline them in assembly as the assembler doesn't speak C.
There are also static inline functions which might be more suitable for your purpose (i.e. tiny helper functions) when you can make reasonably sure that they are always inlined.
The GNU assembler supports macros in its custom macro language. One possibility is to write a custom preprocessor that takes your inline assembly and emits both gcc-style inline assembly for C and gas macros. This should be possible with sed, m4, or awk (in descending order of difficulty). It might also be possible to abuse the C preprocessor's stringify (#) operator for this; if you can give me a concrete example, I could try to throw something together.
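This is not the custom-preprocessor approach described above, but a simpler hand-maintained variant of the same idea, sketched under a few assumptions: the file name fast_ops.h, the constant FAST_OPS_SHIFT, the gas macro SCALE_UP and the function scale_up are all made up, and the assembly side assumes x86-64 AT&T syntax. It relies on GCC and Clang predefining __ASSEMBLER__ while preprocessing a .S file:

```c
/* fast_ops.h (hypothetical name): one header for both C and .S files. */
#ifndef FAST_OPS_H
#define FAST_OPS_H

#define FAST_OPS_SHIFT 4   /* plain constants are visible to both sides */

#ifdef __ASSEMBLER__

/* Assembly side: a gas macro that is expanded in place, so no call is made.
   Usage in a .S file: SCALE_UP %eax */
.macro SCALE_UP reg
    shll $FAST_OPS_SHIFT, \reg
.endm

#else  /* C side */

/* C side: the same operation as a static inline function. */
static inline unsigned scale_up(unsigned x)
{
    return x << FAST_OPS_SHIFT;
}

#endif /* __ASSEMBLER__ */
#endif /* FAST_OPS_H */
```

The obvious drawback is that the operation is written twice and the two versions have to be kept in sync by hand, which is exactly what a generating preprocessor would avoid.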

Related

"inline" directive doesn't work (Pure C) [duplicate]

I defined my function in .c (without header declaration) as here:
inline int func(int i) {
return i+1;
}
Then in the same file below I use it:
...
i = func(i);
And during the linking I got "undefined reference to 'func'". Why?
The inline model in C99 is a bit different than most people think, and in particular different from the one used by C++.
inline is only a hint such that the compiler doesn't complain about doubly defined symbols. It doesn't guarantee that a function is inlined, nor actually that a symbol is generated, if it is needed. To force the generation of a symbol you'd have to add a sort of instantiation after the inline definition:
int func(int i);
Usually you'd have the inline definition in a header file, that is then included in several .c files (compilation units). And you'd only have the above line in exactly one of the compilation units. You probably only see the problem that you have because you are not using optimization for your compiler run.
So, your use case of having the inline in the .c file doesn't make much sense, better just use static for that, even an additional inline doesn't buy you much.
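Putting the pieces of this answer together in a single file, as a minimal sketch: func is the function from the question, bump_twice is a hypothetical caller, and the code assumes a compiler in C99-or-later inline mode (not gnu89):

```c
/* Inline definition: on its own, C99 emits no external symbol for it. */
inline int func(int i)
{
    return i + 1;
}

/* A file-scope declaration without `inline` turns the definition above
   into an external definition, so calls that are not inlined still link. */
int func(int i);

/* Hypothetical caller; links with or without optimization. */
int bump_twice(int i)
{
    return func(func(i));
}
```

Remove the `int func(int i);` line and build without optimization, and the "undefined reference" error from the question should come back.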
C99 inline semantics are often misunderstood. The inline specifier serves two purposes:
First, as a compiler hint in case of static inline and extern inline declarations. Semantics remain unchanged if you remove the specifier.
Second, in case of raw inline (ie without static or extern) to provide an inline definition as an alternative to an external one, which has to be present in a different translation unit. Not providing the external one is undefined behaviour, which will normally manifest as linking failure.
This is particularly useful if you want to put a function into a shared library, but also make the function body available for optimization (eg inlining or specialization). Assuming a sufficiently smart compiler, this allows you to recover many of the benefits of C++ templates without having to jump through preprocessor hoops.
Note that it's a bit more messy than I described here as having another file scope non-inline external declaration will trigger the first case as described in Jens' answer, even if the definition itself is inline instead of extern inline. This is by design so you can have a single inline definition in a header file, which you can include into the source file that provides the external one by adding a single line for the external declaration.
This is because of the way GCC handles inline functions. GCC performs inline substitution as part of optimization.
To remove this error, use static before inline. With static the function has internal linkage, so the compiler either inlines it or emits a local definition in the same translation unit; either way the call resolves and the program links successfully.
static inline int func(int i) {
return i+1;
}
...
i = func(i);

What is the use of the `inline` keyword in C?

I read several questions in stackoverflow about inline in C but still am not clear about it.
static inline void f(void) {} has no practical difference with static void f(void) {}.
inline void f(void) {} in C doesn't work as the C++ way. How does it work in C?
What actually does extern inline void f(void); do?
I never really found a use of the inline keyword in my C programs, and when I see this keyword in other people's code, it's almost always static inline, in which I see no difference with just static.
C code can be optimized in two ways: for code size and for execution time.
inline functions:
gcc.gnu.org says,
By declaring a function inline, you can direct GCC to make calls to that function faster. One way GCC can achieve this is to integrate that function's code into the code for its callers. This makes execution faster by eliminating the function-call overhead; in addition, if any of the actual argument values are constant, their known values may permit simplifications at compile time so that not all of the inline function's code needs to be included. The effect on code size is less predictable; object code may be larger or smaller with function inlining, depending on the particular case.
So, it tells the compiler to build the function into the code where it is used with the intention of improving execution time.
If you declare small functions that are called repeatedly, such as setting/clearing a flag or toggling a bit, as inline, it can make a big difference in execution time, but at the cost of code size.
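The kind of small helper meant here might look like the following sketch. The names flag_set, flag_clear, flag_toggle and flag_test are made up, and all of them assume bit < 32:

```c
#include <stdint.h>

/* Each helper is a single operation, so a call would cost more than the body. */
static inline uint32_t flag_set(uint32_t flags, unsigned bit)
{
    return flags | (UINT32_C(1) << bit);
}

static inline uint32_t flag_clear(uint32_t flags, unsigned bit)
{
    return flags & ~(UINT32_C(1) << bit);
}

static inline uint32_t flag_toggle(uint32_t flags, unsigned bit)
{
    return flags ^ (UINT32_C(1) << bit);
}

/* Returns 1 if the bit is set, 0 otherwise. */
static inline int flag_test(uint32_t flags, unsigned bit)
{
    return (int)((flags >> bit) & 1u);
}
```

Whether the compiler actually inlines these is still its decision; the keyword is only a hint.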
non-static inline and Static inline
Again referring to gcc.gnu.org (this passage describes GNU C90 inlining, i.e. -std=gnu90 or -fgnu89-inline, not C99 inline):
When an inline function is not static, then the compiler must assume that there may be calls from other source files; since a global symbol can be defined only once in any program, the function must not be defined in the other source files, so the calls therein cannot be integrated. Therefore, a non-static inline function is always compiled on its own in the usual fashion.
extern inline?
Again, gcc.gnu.org says it all (this, too, describes GNU C90 inlining rather than C99 inline):
If you specify both inline and extern in the function definition, then the definition is used only for inlining. In no case is the function compiled on its own, not even if you refer to its address explicitly. Such an address becomes an external reference, as if you had only declared the function, and had not defined it.
This combination of inline and extern has almost the effect of a macro. The way to use it is to put a function definition in a header file with these keywords, and put another copy of the definition (lacking inline and extern) in a library file. The definition in the header file causes most calls to the function to be inlined. If any uses of the function remain, they refer to the single copy in the library.
To sum it up:
For inline void f(void){},
inline definition is only valid in the current translation unit.
For static inline void f(void) {}
Since the storage class is static, the identifier has internal linkage and the inline definition is invisible in other translation units.
For extern inline void f(void);
Since the storage class is extern, the identifier has external linkage and the inline definition also provides the external definition.
Note: when I talk about .c files and .h files in this answer, I assume you have laid out your code correctly, i.e. .c files only include .h files. The distinction is that a .h file may be included in multiple translation units.
static inline void f(void) {} has no practical difference with static void f(void) {}.
In ISO C, this is correct. They are identical in behaviour (assuming you don't re-declare them differently in the same TU of course!). The only practical effect may be to cause the compiler to optimize differently.
inline void f(void) {} in C doesn't work as the C++ way. How does it work in C? What actually does extern inline void f(void); do?
This is explained by this answer and also this thread.
In ISO C and C++, you can freely use inline void f(void) {} in header files -- although for different reasons!
In ISO C, it does not provide an external definition at all. In ISO C++ it does provide an external definition; however C++ has an additional rule (which C doesn't), that if there are multiple external definitions of an inline function, then the compiler sorts it out and picks one of them.
extern inline void f(void); in a .c file in ISO C is meant to be paired with the use of inline void f(void) {} in header files. It causes the external definition of the function to be emitted in that translation unit. If you don't do this then there is no external definition, and so you may get a link error (it is unspecified whether any particular call of f links to the external definition or not).
In other words, in ISO C you can manually select where the external definition goes; or suppress external definition entirely by using static inline everywhere; but in ISO C++ the compiler chooses if and where an external definition would go.
In GNU C, things are different (more on this below).
To complicate things further, GNU C++ allows you to write static inline and extern inline in C++ code... I wouldn't like to guess on what that does exactly.
I never really found a use of the inline keyword in my C programs, and when I see this keyword in other people's code, it's almost always static inline
Many coders don't know what they're doing and just put together something that appears to work. Another factor here is that the code you're looking at might have been written for GNU C, not ISO C.
In GNU C, plain inline behaves differently to ISO C. It actually emits an externally visible definition, so having a .h file with a plain inline function included from two translation units causes undefined behaviour.
So if the coder wants to supply the inline optimization hint in GNU C, then static inline is required. Since static inline works in both ISO C and GNU C, it's natural that people ended up settling for that and seeing that it appeared to work without giving errors.
, in which I see no difference with just static.
The difference is just in the intent to provide a speed-over-size optimization hint to the compiler. With modern compilers this is superfluous.
From 6.7.4 Function specifiers in C11 specs
6 A function declared with an inline function specifier is an inline
function. Making a function an inline function suggests that calls to
the function be as fast as possible.138) The extent to which
such suggestions are effective is
implementation-defined.139)
138) By using, for example, an alternative to the usual function call
mechanism, such as inline substitution. Inline substitution is not
textual substitution, nor does it create a new function. Therefore,
for example, the expansion of a macro used within the body of the
function uses the definition it had at the point the function body
appears, and not where the function is called; and identifiers refer
to the declarations in scope where the body occurs. Likewise, the
function has a single address, regardless of the number of inline
definitions that occur in addition to the external
definition.
139) For example, an implementation might
never perform inline substitution, or might only perform inline
substitutions to calls in the scope of an inline declaration.
It suggests to the compiler that this function is widely used and asks it to prefer speed when invoking the function. But with a modern, intelligent compiler this may be more or less irrelevant: compilers can decide for themselves whether a function should be inlined and may ignore the inline request from the user, because they can decide very effectively how to invoke functions.
static inline void f(void) {} has no practical difference with static
void f(void) {}.
So yes: with modern compilers, most of the time there is none. With any compiler, there are no practical or observable differences in program output.
inline void f(void) {} in C doesn't work as the C++ way. How does it
work in C?
In C++, a function that is inline anywhere must be inline everywhere, and the linker does not report a multiple-definition error (the definitions must be identical).
What actually does extern inline void f(void); do?
This gives f external linkage. Because f may also be present in another compilation unit, the compiler may choose a different call mechanism to speed up the calls, or may ignore the inline hint completely.
A function where all the declarations (including the definition) mention inline and never extern.
There must be a definition in the same translation unit. The standard refers to this as an inline definition.
No stand-alone object code is emitted, so this definition can't be called from another translation unit.
In this example, all the declarations and definitions use inline but not extern:
// a declaration mentioning inline
inline int max(int a, int b);
// a definition mentioning inline
inline int max(int a, int b) {
return a > b ? a : b;
}
Here is a reference which can give you more clarity on the inline functions in C & also on the usage of inline & extern.
If you understand where they come from then you'll understand why they are there.
Both "inline" and "const" are C++ innovations that were eventually retrofitted into C. One of the design goals implicit in these innovations, as well as later innovations like templates and even lambdas, was to carve out the most common use-cases for the pre-processor (particularly, of "#define"), so as to minimize the use of and need for the pre-processor phase.
The occurrence of a pre-processor phase in a language severely limits the ability to provide transparency in the analysis of and translation from a language. This turned what ought to have been easy translation shell scripts into more complicated programs, such as "f2c" (Fortran to C) and the original C++ compiler "cfront" (C++ to C); and to a lesser degree, the "indent" utility. If you've ever had to deal with the translation output of convertors like these (and we have) or with actually making your own translators, then you'll know how much of an issue this is.
The "indent" utility, by the way, balks at the whole issue and just wings it, compromising by just treating macro calls as ordinary variables or function calls, and passing over "#include"s. The issue will also arise with other tools that may want to do source-to-source conversion/translation, like automated re-engineering, re-coding and re-factoring tools; that is, things that more intelligently automate what you, the programmer, do.
So, the ideal is to reduce dependency on the pre-processor phase to a bare minimum. This is a goal that is good in its own right, independently of how the issue may have been encountered in the past.
Over time, as more and more of the use-cases became known and even standardized in their usage, they were encapsulated formally as language innovations.
One common use-case of "#define" is to create manifest constants. To a large extent, this can now be handled by the "const" keyword and (in C++) "constexpr".
Another common use-case of "#define" is to create functions with macros. Much of this is now encapsulated by the "inline" function, and that's what it's meant to replace. The "lambda" construct takes this a step further, in C++.
Both "const" and "inline" were present in C++ from the time of its first external release - release E in February 1985. (We're the ones who transcribed and restored it. Before 2016, it only existed as a badly-clipped printout of several hundred pages.)
Other innovations were added later, like "template" in version 3.0 of cfront (having been accepted in the ANSI X3J16 meeting in 1990) and the lambda construct and "constexpr" much more recently.
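As a rough illustration of why inline functions were meant to take over from function-like macros, compare how often each form evaluates its argument. MAX_MACRO, max_inline and the counting helpers are hypothetical names invented for this sketch:

```c
/* Function-like macro: the winning argument is evaluated twice. */
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

/* Inline function: each argument is evaluated exactly once. */
static inline int max_inline(int a, int b)
{
    return a > b ? a : b;
}

static int calls;   /* counts how often next_value() has run */

static int next_value(void)
{
    return ++calls;
}

/* Number of next_value() calls made by one use of the macro. */
static int evaluations_with_macro(void)
{
    calls = 0;
    (void)MAX_MACRO(next_value(), 0);
    return calls;
}

/* Number of next_value() calls made by one use of the inline function. */
static int evaluations_with_inline(void)
{
    calls = 0;
    (void)max_inline(next_value(), 0);
    return calls;
}
```

The function version also gets type checking and a name the debugger can step into, which the macro does not.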
As the word suggests, "inline" means "in line": the keyword asks the compiler to paste the function's code in place of each call when the program is compiled. Since a function call costs more than inline code, this can make the program faster at run time.
So, between static inline void f(void) {} and static void f(void) {}, the inline keyword can make a difference at run time. But when the function has too many lines of code, the compiler is unlikely to inline it and the keyword won't affect run time.
If you add static before a function, the function has internal linkage: its use is restricted to that file only.
To know about extern you can refer to - Effects of the extern keyword on C functions

What is the idea behind C99 inline?

I am confused about inline in C99.
Here is what I want:
I want my function to get inlined everywhere, not just limited to one translation unit (or one compilation unit, a .c file).
I want the address of the function consistent. If I save the address of the function in a function pointer, I want the function callable from the pointer, and I don't want duplication of the same function in different translation units (basically, I mean no static inline).
C++ inline does exactly this.
But (and please correct me if I am wrong) in C99 there is no way to get this behavior.
I could have used static inline, but it leads to duplication (the address of the same function in different translation units is not the same). I don't want this duplication.
So, here are my questions:
What is the idea behind inline in C99?
What benefits does this design give over C++'s approach?
References:
Here's a link that speaks highly of C99 inline, but I don't understand why. Is this “only in exactly one compilation unit” restriction really that nice? http://gustedt.wordpress.com/2010/11/29/myth-and-reality-about-inline-in-c99/
Here's the Rationale for C99 inline. I've read it, but I don't understand it. Is "inline" without "static" or "extern" ever useful in C99?
A nice post that provides strategies for using inline functions. http://www.greenend.org.uk/rjk/tech/inline.html
Answers Summary
How to get C++ inline behavior in C99 (Yes we can)
head.h
#ifndef HEAD_H
#define HEAD_H
inline int my_max(int x, int y) {
    return (x>y) ? (x) : (y);
}
void call_and_print_addr(void);
#endif
src.c
#include "head.h"
#include <stdio.h>
// This is necessary! It must appear in exactly one [.c] file
extern inline int my_max(int x, int y);
void call_and_print_addr(void) {
    // %p expects void *; converting a function pointer to void * is a common extension (POSIX requires it)
    printf("%d %p\n", my_max(10, 100), (void *)my_max);
}
main.c
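#include <stdio.h>
#include "head.h"
int main(void) {
    // %p expects void *; converting a function pointer to void * is a common extension (POSIX requires it)
    printf("%d %p\n", my_max(10, 100), (void *)my_max);
    call_and_print_addr();
    return 0;
}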
Compile it with: gcc -O3 main.c src.c -std=c99
Check the assembly with: gcc -O3 -S main.c src.c -std=c99. You'll find that my_max is inlined in both call_and_print_addr() and main().
Actually, these are exactly the same instructions given by ref 1 and ref 3. So what went wrong for me?
I used too old a version of GCC (3.4.5) to experiment; it gave me a “multiple definition of my_max” error message, and this is the real reason why I was so confused. Shame.
Difference between C99 and C++ inline
Actually you can compile the example above with g++: g++ main.c src.c
extern inline int my_max(int x, int y);
is redundant in C++, but necessary in C99.
So what does it do in C99?
Again, use gcc -O3 -S main.c src.c -std=c99, you'll find something like this in src.s:
_my_max:
movl 4(%esp), %eax
movl 8(%esp), %edx
cmpl %eax, %edx
cmovge %edx, %eax
ret
.section .rdata,"dr"
If you cut extern inline int my_max(int x, int y); and paste it into main.c, you'll find this assembly code in main.s instead.
So, with extern inline, you tell the compiler where the true function my_max(), which you can call through its address, will be defined and compiled.
Now look back at C++: we can't specify it. We will never know where my_max() will be, and this is the “vague linkage” mentioned by @Potatoswatter.
As @Adriano said, most of the time we don't care about this detail, but C99 really removes the ambiguity.
To get C++-like behavior, you need to give each TU with potentially-inlined calls an inline definition, and give one TU an externally-visible definition. This is exactly what is illustrated by Example 1 in the relevant section (Function specifiers) of the C standard. (In that example, external visibility is retroactively applied to an inline definition by declaring the function extern afterward: this declaration could be done in the .c file after the definition in the .h file, which turns usual usage on its head.)
If inlining could be accomplished literally everywhere, you wouldn't need the extern function. Non-inlined calls are used, however, in contexts such as recursion and referencing the function address. You may get "always inline" semantics, in a sense, by omitting the extern parts, however this can arbitrarily fail for any simple function call because the standard does not demand that a call be inlined just because there is no alternative. (This is the subject of the linked question.)
C++ handles this with the implementation concept of "vague linkage"; this isn't specified in the standard but it is very real, and tricky, inside the compiler. C compilers are supposed to be easier to write than C++; I believe this accounts for the difference between the languages.
I want my function get inlined everywhere, not just limited in one translation unit(or one compile unit, a [.c] file).
With inline you politely ask your compiler to inline your function (if it has the time and is in the mood). It's unrelated to any one compilation unit: at best, the function may get inlined at every single call site and won't have a body anywhere (and its code will be duplicated everywhere). That is the purpose of inlining: speed in favor of size.
I want the address of the function consistent. If I save the address of the function in a function pointer, I want the function callable from the pointer, and I don't want duplication of the same function in different translation unit. (Basically, I mean no 'static inline')
Again, you can't. If a function is inlined, there is no function pointer to it. Of course, the compiler will need a compilation unit where the function lives (because, well, yes, you may need a function pointer, or it may sometimes decide not to inline that function at a specific call site).
From your description it seems that static inline is good. IMO it's not: a function body (when used, see the paragraph above) in each compilation unit leads to code duplication (and to problems when comparing function pointers, because each compilation unit has its own version of your function). It's here that C99 did something pretty good: you declare exactly one place to put the function body (when and if required). The compiler won't choose it for you (if you ever care about it) and nothing is left to the implementor.
What is idea behind inline in C99?
Pick a good thing (inline functions) but remove the ambiguity (each C++ compiler did its own thing about where the function body has to live).
What benefits does this design give over C++'s approach?
Honestly I can't see such a big problem (even the article you linked is pretty vague about this benefit). With a modern compiler you won't see any issue and will never need to care about it. Why is what C did good? IMO because it removed an ambiguity, even if, frankly speaking, I'd prefer my compiler to do that for me when I don't care about it (99.999% of the time, I suppose).
That said (but I may be wrong), C and C++ have different targets. If you're using C (not C++ without classes and with only a few C++ features) then you probably want to address this kind of detail because it matters in your context, so C and C++ had to diverge on this. Neither design is better: they are just different decisions for different audiences.

C inline functions and "undefined external" error

I'm trying to replace some macro subroutines with inline functions, so the compiler can optimize them, so the debugger can step into them, etc. If I define them as normal functions it works:
void do_something(void)
{
blah;
}
void main(void)
{
do_something();
}
but if I define them as inline:
inline void do_something(void)
{
blah;
}
void main(void)
{
do_something();
}
it says "Error: Undefined external". What does that mean? Taking a stab in the dark, I tried
static inline void do_something(void)
{
blah;
}
void main(void)
{
do_something();
}
and no more errors. The function definition and call to the function are in the same .c file.
Can someone explain why one works and the other doesn't?
(Second related question: Where do I put inline functions if I want to use them in more than one .c file?)
First, the compiler does not always inline functions marked as inline; e.g., if you turn all optimizations off it will probably not inline them.
When you define an inline function
inline void do_something(void)
{
blah;
}
and use that function, even in the same file, the call to that function is resolved by the linker, not the compiler, because it is implicitly "extern". But this definition alone does not provide an external definition of the function.
If you include a declaration without inline
void do_something(void);
in a C file which can see the inline definition, the compiler will provide an external definition of the function, and the error should go away.
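As a rough sketch of that fix under C99 semantics (clamp_to_byte and brightness_percent are invented names, and everything is placed in one file only to keep the example short):

```c
#include <assert.h>

/* Inline definition, as it would appear in a header. */
inline int clamp_to_byte(int value)
{
    return value < 0 ? 0 : (value > 255 ? 255 : value);
}

/* Declaration without inline, placed in exactly one .c file. Because not
   every declaration in this translation unit says inline, the definition
   above becomes the external definition here. */
int clamp_to_byte(int value);

/* Caller: links whether or not the compiler inlines the call. */
int brightness_percent(int raw)
{
    int byte = clamp_to_byte(raw);
    assert(byte >= 0 && byte <= 255);
    return byte * 100 / 255;
}
```

Without the plain declaration, an unoptimized build would typically leave the call as an unresolved external reference, which is the error from the question.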
The reason static inline works is that it makes the function visible only within that compilation unit, and so allows the compiler to resolve the call to the function (and optimize it) and emit the code for the function within that compilation unit. The linker then doesn't have to resolve it, so there is no need for an external definition.
The best place to put inline functions is in a header file, declared static inline. This removes any need for an external definition, so it resolves the linker problem. However, this causes the compiler to emit the code for the function in every compilation unit that uses it without inlining it, so it could result in code bloat. But since the function is inline, it is probably small anyway, so this usually isn't a problem.
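A small sketch of the static inline header pattern. The header name minmax.h and both function names are assumptions made for the example; the header contents are shown inline so the snippet compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Contents of a hypothetical header "minmax.h". */
#ifndef MINMAX_H
#define MINMAX_H
static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}
#endif

/* Any .c file that includes the header can call max_int directly;
   no external definition is needed anywhere. */
int largest(const int *values, size_t count)
{
    assert(values != NULL && count > 0);
    int best = values[0];
    for (size_t i = 1; i < count; ++i)
        best = max_int(best, values[i]);
    return best;
}
```

Each translation unit that includes the header gets its own private max_int, which is the duplication trade-off mentioned above.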
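The other option, under C99 semantics, is to define it as plain inline in the header and, in exactly one C file, provide an extern declaration without the inline modifier. (Under the older GNU89 semantics the roles are reversed: the header uses extern inline and one C file holds an ordinary non-inline definition.)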
The gcc manual explains it thus:
By declaring a function inline, you can direct GCC to make calls to
that function faster. One way GCC can achieve this is to integrate
that function's code into the code for its callers. This makes
execution faster by eliminating the function-call overhead; in
addition, if any of the actual argument values are constant, their
known values may permit simplifications at compile time so that not
all of the inline function's code needs to be included. The effect on
code size is less predictable; object code may be larger or smaller
with function inlining, depending on the particular case. You can
also direct GCC to try to integrate all "simple enough" functions into
their callers with the option -finline-functions.
GCC implements three different semantics of declaring a function
inline. One is available with -std=gnu89 or -fgnu89-inline or
when gnu_inline attribute is present on all inline declarations,
another when -std=c99, -std=c1x, -std=gnu99 or -std=gnu1x
(without -fgnu89-inline), and the third is used when compiling C++.
To declare a function inline, use the inline keyword in its
declaration, like this:
static inline int
inc (int *a)
{
return (*a)++;
}
If you are writing a header file to be included in ISO C90 programs,
write __inline__ instead of inline.
The three types of inlining behave similarly in two important cases:
when the inline keyword is used on a static function, like the
example above, and when a function is first declared without using the
inline keyword and then is defined with inline, like this:
extern int inc (int *a);
inline int
inc (int *a)
{
return (*a)++;
}
In both of these common cases, the program behaves the same as if you
had not used the inline keyword, except for its speed.
When a function is both inline and static, if all calls to the
function are integrated into the caller, and the function's address is
never used, then the function's own assembler code is never
referenced. In this case, GCC does not actually output assembler code
for the function, unless you specify the option
-fkeep-inline-functions. Some calls cannot be integrated for various
reasons (in particular, calls that precede the function's definition
cannot be integrated, and neither can recursive calls within the
definition). If there is a nonintegrated call, then the function is
compiled to assembler code as usual. The function must also be
compiled as usual if the program refers to its address, because that
can't be inlined.
Note that certain usages in a function definition can make it
unsuitable for inline substitution. Among these usages are: use of
varargs, use of alloca, use of variable sized data types, use of computed goto,
use of nonlocal goto, and nested functions.
Using -Winline will warn when a function marked inline could not
be substituted, and will give the reason for the failure.
As required by ISO C++, GCC considers member functions defined within
the body of a class to be marked inline even if they are not
explicitly declared with the inline keyword. You can override this
with -fno-default-inline.
GCC does not inline any functions when not optimizing unless you
specify the always_inline attribute for the function, like this:
/* Prototype. */
inline void foo (const char) __attribute__((always_inline));
The remainder of this section is specific to GNU C90 inlining.
When an inline function is not static, then the compiler must
assume that there may be calls from other source files; since a global
symbol can be defined only once in any program, the function must not
be defined in the other source files, so the calls therein cannot be
integrated. Therefore, a non-static inline function is always
compiled on its own in the usual fashion.
If you specify both inline and extern in the function definition,
then the definition is used only for inlining. In no case is the
function compiled on its own, not even if you refer to its address
explicitly. Such an address becomes an external reference, as if you
had only declared the function, and had not defined it.
This combination of inline and extern has almost the effect of a
macro. The way to use it is to put a function definition in a header
file with these keywords, and put another copy of the definition
(lacking inline and extern) in a library file. The definition in
the header file will cause most calls to the function to be inlined.
If any uses of the function remain, they will refer to the single copy
in the library.
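As a small illustration of the always_inline attribute mentioned in the quoted manual text, here is a sketch that assumes GCC or a compatible compiler such as Clang; the function names are invented.

```c
#include <assert.h>

/* GCC/Clang extension: request inlining even when not optimizing. */
static inline __attribute__((always_inline)) int square(int x)
{
    return x * x;
}

/* The call to square below is expected to be expanded in place,
   even in an unoptimized build. */
int sum_of_squares(int n)
{
    assert(n >= 0);
    int total = 0;
    for (int i = 1; i <= n; ++i)
        total += square(i);
    return total;
}
```

The attribute is not part of ISO C, so portable headers usually hide it behind a macro.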
For inline functions to work with C99 (which is when they first entered the language), you'd have to give the definition in a header file
inline void do_something(void)
{
blah;
}
and in one compilation unit (aka .c) you place some sort of "instantiation"
void do_something(void);
without the inline.
You have to put them in a header file if you want to use them from multiple files.
And for the linker error: the default declaration of a function implies that it's "extern", but since the only definition is an inline one, no symbol is emitted for it and the linker can't find one, hence the error.

What's the difference between "static" and "static inline" function?

IMO both limit the function's scope to the translation unit.
What's the difference between "static" and "static inline" function?
Why should inline be put in a header file, not in .c file?
By default, an inline definition is only valid in the current translation unit.
If the storage class is extern, the identifier has external linkage and the inline definition also provides the external definition.
If the storage class is static, the identifier has internal linkage and the inline definition is invisible in other translation units.
If the storage class is unspecified, the inline definition is only visible in the current translation unit, but the identifier still has external linkage and an external definition must be provided in a different translation unit. The compiler is free to use either the inline or the external definition if the function is called within the current translation unit.
As the compiler is free to inline (and to not inline) any function whose definition is visible in the current translation unit (and, thanks to link-time optimizations, even in different translation units, though the C standard doesn't really account for that), for most practical purposes, there's no difference between static and static inline function definitions.
The inline specifier (like the register storage class) is only a compiler hint, and the compiler is free to completely ignore it. Standards-compliant non-optimizing compilers only have to honor their side-effects, and optimizing compilers will do these optimizations with or without explicit hints.
inline and register are not useless, though, as they instruct the compiler to throw errors when the programmer writes code that would make the optimizations impossible: An external inline definition can't reference identifiers with internal linkage (as these would be unavailable in a different translation unit) or define modifiable local variables with static storage duration (as these wouldn't share state across translation units), and you can't take addresses of register-qualified variables.
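To make the static-local restriction concrete, here is a sketch with invented names. The static inline version is allowed to keep private state; the same body under a plain inline definition with external linkage would violate the constraint described above.

```c
#include <assert.h>

/* Allowed: the function has internal linkage, so each translation unit
   simply gets its own counter. */
static inline unsigned next_id(void)
{
    static unsigned counter = 0;
    return ++counter;
}

/* Writing the function above as plain `inline unsigned next_id(void)`
   would be a constraint violation, because an inline definition with
   external linkage may not define a modifiable static local. */
unsigned two_ids_apart(void)
{
    unsigned first = next_id();
    unsigned second = next_id();
    assert(second == first + 1);
    return second - first;
}
```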
Personally, I use the convention to mark static function definitions within headers also inline, as the main reason for putting function definitions in header files is to make them inlinable.
In general, I only use static inline function and static const object definitions in addition to extern declarations within headers.
I've never written an inline function with a storage class different from static.
inline instructs the compiler to attempt to embed the function content into the calling code instead of executing an actual call.
For small functions that are called frequently that can make a big performance difference.
However, this is only a "hint": the compiler may ignore it, and most compilers will try to inline where possible even when the keyword is not used, as part of their optimizations.
for example:
static int Inc(int i) {return i+1;}
.... // some code
int i;
.... // some more code
for (i=0; i<999999; i = Inc(i)) {/*do something here*/};
This tight loop will perform a function call on each iteration, and the function content is actually significantly less than the code the compiler needs to put to perform the call. inline will essentially instruct the compiler to convert the code above into an equivalent of:
int i;
....
for (i=0; i<999999; i = i+1) { /* do something here */};
This skips the actual function call and return.
Obviously this is an example to show the point, not a real piece of code.
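A compilable variant of the same idea, for anyone who wants to inspect the generated code. The wrapper count_iterations is a made-up name, and whether the call actually disappears depends on the compiler and optimization level.

```c
#include <assert.h>

/* Tiny helper: the body is cheaper than the call sequence around it. */
static inline int inc(int i)
{
    return i + 1;
}

/* With optimization enabled, compilers typically expand inc() in place,
   so the loop behaves like a plain i = i + 1. */
long count_iterations(int limit)
{
    assert(limit >= 0);
    long iterations = 0;
    for (int i = 0; i < limit; i = inc(i))
        ++iterations;
    return iterations;
}
```

Comparing the assembly from an unoptimized and an optimized build of this file is one way to see whether the call was removed.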
static refers to the scope. In C it means that the function/variable can only be used within the same translation unit.
From my experience with GCC I know that static and static inline differ in how the compiler issues warnings about unused functions. More precisely, when you declare a static function and it isn't used in the current translation unit, the compiler produces a warning about an unused function, but you can suppress that warning by changing it to static inline.
Thus I tend to think that static should be used in translation units, to benefit from the extra check the compiler does to find unused functions, and static inline should be used in header files to provide functions that can be inlined (due to the absence of external linkage) without issuing warnings.
Unfortunately I cannot find any evidence for that logic. Even from the GCC documentation I wasn't able to conclude that inline inhibits unused function warnings. I'd appreciate it if someone would share links to a description of that.
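For what it's worth, GCC's description of -Wunused-function speaks of an unused "non-inline static function", which would be consistent with the behavior described here. A sketch of the two cases, with invented names:

```c
#include <assert.h>

/* Plain static: if nothing in this translation unit calls it,
   gcc -Wall is expected to report it as defined but not used. */
static int checked_half(int x)
{
    assert(x % 2 == 0);
    return x / 2;
}

/* static inline: an unused copy is expected to produce no such warning,
   which is why this form suits functions defined in headers. */
static inline int checked_double(int x)
{
    return x * 2;
}
```

To observe the difference, compile a file containing only these two definitions with -Wall and no callers.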
One difference that's not at the language level but the popular implementation level: certain versions of gcc will remove unreferenced static inline functions from output by default, but will keep plain static functions even if unreferenced. I'm not sure which versions this applies to, but from a practical standpoint it means it may be a good idea to always use inline for static functions in headers.
In C, static means the function or variable you define can only be used in this file (i.e. the compile unit).
So static inline means an inline function which can be used in this file only.
EDIT:
"Compile unit" above should read "translation unit".
In C++, one important effect of inline (that is not mentioned in the other answers yet, I think) is that it prevents linker errors when multiple definitions of the function are found.
Consider a function that is defined in a header file to allow it to be inlined into the source files that include the header. If the compiler decides to not inline (all calls to) this function, the function definition will be included into every object file that references it (i.e. does not inline all calls).
This might cause multiple definitions of the function to reach the linker (though not always, since it depends on the inlining decisions made by the compiler). Without the inline keyword, this produces a linker error, but the inline keyword tells the linker to just pick one definition and discard the rest (which are expected to be equal, but this is not checked).
The static keyword, on the other hand, ensures that if a function is included in the object file, it will be private to that object file. If multiple object files contain the same function, they will coexist and all calls to the function will use their "own" version. This means that more memory is taken up. In practice, I believe this means that using static for functions defined in header files is not a good idea, better to just use inline.
In practice, this also means that static functions cannot produce linker errors, so the effect of inline above is not really useful for static functions. However, as suggested by ony in another answer, adding inline might be helpful to prevent warnings for unused functions.
Note that the above is true for C++. In C, inline works a bit differently, and you have to explicitly put an extern declaration in a single source file to have the inline function emitted into that object file so it is available for any non-inlined uses. In other words, inline means that a function is not emitted into any object file, even when not all calls are inlined, unless it is also declared extern, in which case it is emitted (even if all local calls are inlined). I'm not sure how that interacts with static, though.
An inline definition without a storage-class specifier does not provide an external definition.
// average.h
#ifndef AVERAGE_H
#define AVERAGE_H
inline double average(double a, double b)
{
    return (a + b) / 2;
}
#endif
If a call to average from any module is not inlined, the linker looks for an external definition. With only the inline definition above, none exists and linking fails.
There are two ways to solve this problem:
Make it a static inline function definition.
Example:
// average.h
#ifndef AVERAGE_H
#define AVERAGE_H
static inline double average(double a, double b)
{
    return (a + b) / 2;
}
#endif
Keep the plain inline definition in the header, include it from exactly one .c file, and redeclare the function there as extern so that file emits the external definition.
Example:
// average.c
#include "average.h"
extern inline double average(double a, double b);

Resources