What is the idea behind C99 inline? - c

I am confused about inline in C99.
Here is what I want:
I want my function to get inlined everywhere, not just within one translation unit (or one compilation unit, a .c file).
I want the address of the function consistent. If I save the address of the function in a function pointer, I want the function callable from the pointer, and I don't want duplication of the same function in different translation units (basically, I mean no static inline).
C++ inline does exactly this.
But (and please correct me if I am wrong) in C99 there is no way to get this behavior.
I could use static inline, but it leads to duplication (the address of the same function differs between translation units). I don't want this duplication.
So, here are my questions:
What is the idea behind inline in C99?
What benefits does this design give over C++'s approach?
References:
Here's a link that speaks highly of C99 inline, but I don't understand why. Is this “only in exactly one compilation unit” restriction really that nice? http://gustedt.wordpress.com/2010/11/29/myth-and-reality-about-inline-in-c99/
Here's the Rationale for C99 inline. I've read it, but I don't understand it. Is "inline" without "static" or "extern" ever useful in C99?
A nice post that provides strategies for using inline functions. http://www.greenend.org.uk/rjk/tech/inline.html
Answers Summary
How to get C++ inline behavior in C99 (Yes we can)
head.h
#ifndef __HEAD_H__
#define __HEAD_H__
inline int my_max(int x, int y) {
return (x>y) ? (x) : (y);
}
void call_and_print_addr();
#endif
src.c
#include "head.h"
#include <stdio.h>
// This is necessary! It must appear in exactly one .c file.
extern inline int my_max(int x, int y);
void call_and_print_addr() {
printf("%d %p\n", my_max(10, 100), (void *)my_max); // %p avoids truncating the address on 64-bit targets
}
main.c
#include <stdio.h>
#include "head.h"
int main() {
printf("%d %p\n", my_max(10, 100), (void *)my_max); // %p avoids truncating the address on 64-bit targets
call_and_print_addr();
return 0;
}
Compile it with: gcc -O3 main.c src.c -std=c99
Check the assembly with: gcc -O3 -S main.c src.c -std=c99. You'll find that my_max is inlined in both call_and_print_addr() and main().
Actually, these are exactly the same instructions as given by refs 1 and 3. So what went wrong for me?
I experimented with a version of GCC that was too old (3.4.5); it gave me a “multiple definition of my_max” error, and that is the real reason I was so confused. Shame.
Difference between C99 and C++ inline
Actually, you can also compile the example above with g++: g++ main.c src.c
extern inline int my_max(int x, int y);
is redundant in C++, but necessary in C99.
So what does it do in C99?
Again, run gcc -O3 -S main.c src.c -std=c99 and you'll find something like this in src.s:
_my_max:
movl 4(%esp), %eax
movl 8(%esp), %edx
cmpl %eax, %edx
cmovge %edx, %eax
ret
.section .rdata,"dr"
If you cut extern inline int my_max(int x, int y); and paste it into main.c, you'll find this assembly code in main.s instead.
So, with extern inline, you tell the compiler in which translation unit the real function my_max(), the one you can call through its address, will be defined and compiled.
Now look back at C++: there we can't specify this. We never know where my_max() will end up, and this is the “vague linkage” mentioned by @Potatoswatter.
As @Adriano said, most of the time we don't care about this detail, but C99 really does remove the ambiguity.

To get C++-like behavior, you need to give each TU with potentially-inlined calls an inline definition, and give one TU an externally-visible definition. This is exactly what is illustrated by Example 1 in the relevant section (Function specifiers) of the C standard. (In that example, external visibility is retroactively applied to an inline definition by declaring the function extern afterward: this declaration could be done in the .c file after the definition in the .h file, which turns usual usage on its head.)
If inlining could be accomplished literally everywhere, you wouldn't need the extern function. Non-inlined calls are used, however, in contexts such as recursion and referencing the function address. You may get "always inline" semantics, in a sense, by omitting the extern parts, however this can arbitrarily fail for any simple function call because the standard does not demand that a call be inlined just because there is no alternative. (This is the subject of the linked question.)
C++ handles this with the implementation concept of "vague linkage"; this isn't specified in the standard but it is very real, and tricky, inside the compiler. C compilers are supposed to be easier to write than C++; I believe this accounts for the difference between the languages.
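The arrangement described in this answer can be sketched in a single file. This is only an illustration under assumptions: my_min, apply and min_via_pointer are hypothetical names, and in a real project the inline definition would sit in a header while the extern declaration would sit in exactly one .c file. It assumes a compiler running in C99 (or later) mode.

```c
/* Would normally live in a header included by several .c files:
   an inline definition (no extern, no static). */
inline int my_min(int x, int y)
{
    return (x < y) ? x : y;
}

/* Would normally live in exactly one .c file: this declaration turns the
   inline definition above into the external definition for the program. */
extern inline int my_min(int x, int y);

/* Calls a two-argument function through a pointer (hypothetical helper). */
int apply(int (*op)(int, int), int a, int b)
{
    return op(a, b);
}

/* Taking the address of my_min relies on the external definition. */
int min_via_pointer(int a, int b)
{
    return apply(my_min, a, b);
}
```

Without the extern declaration, a call that the compiler chooses not to inline (or the address-taking use) may be left as an unresolved reference at link time.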

I want my function get inlined everywhere, not just limited in one translation unit(or one compile unit, a [.c] file).
With inline you politely ask your compiler to inline your function (if it has the time and is in the mood). This is not tied to a single compilation unit: at best the function gets inlined at every single call site and has no standalone body anywhere (so its code is duplicated at each call site). That is the purpose of inlining: trading size for speed.
I want the address of the function consistent. If I save the address of the function in a function pointer, I want the function callable from the pointer, and I don't want duplication of the same function in different translation unit. (Basically, I mean no 'static inline')
Again, you can't. If a function is inlined at a call site, there is no function pointer involved there. Of course, the compiler still needs a compilation unit where the function body lives (because you may need a function pointer, or the compiler may decide not to inline the function at a specific call site).
From your description it seems that static inline would do. IMO it won't: a function body (when needed, see the paragraph above) in each compilation unit leads to code duplication (and to problems when comparing function pointers, because each compilation unit has its own copy of your function). This is where C99 did something pretty good: you declare exactly one place to put the function body (when and if it is required). The compiler won't choose it for you (if you ever care about it), and nothing is left to the implementor.
What is the idea behind inline in C99?
Take a good thing (inline functions) but remove the ambiguity (each C++ compiler decides on its own where the function body has to live).
What benefits does this design give over C++'s approach?
Honestly, I can't see such a big problem (even the article you linked is pretty vague about this benefit). With a modern compiler you won't see any issue and you will never care about it. Why is what C did good? IMO because it removed an ambiguity, even if, frankly speaking, I'd prefer that my compiler do this for me when I don't care about it (99.999% of the time, I suppose).
That said (and I may be wrong), C and C++ have different targets. If you're using C (as opposed to C++ without classes and with few C++ features), then you probably want to control this kind of detail because it matters in your context, so C and C++ had to diverge here. Neither design is better: they are just different decisions for different audiences.

Related

Retaining Compatibility To Assembly With inline Functions

I'm writing some header files, which are to be accessed by both C code and assembly. Assembly code is preprocessed with the C preprocessor for this sake.
The problem is that I have plenty of inline functions in those header files. The assembler cannot call functions that are not symbols in an object file (as is the case with static inline functions), so I cannot use those. I've read these two invaluable posts and have now grasped how to use extern and static in conjunction with inline, but I am unsure how to make inline functions accessible to both C code and assembly.
My current approach is to write inline functions in a header file (with >= GNU99, -O3 inlines the function; anything else calls an external definition of that function, which I need to define explicitly) and to write the external definitions in an implementation file. The C code includes the header file (the inline functions) and is compiled with -O3, thus using the inlined versions. The assembly code uses the external definitions.
Questions:
The assembly code can only call the functions, inlining is currently impossible. Can assembly code, by any means, make use of inlining? I mean as in an .S file, not inline assembly.
extern inline would be similarly good as my current method but it boils down to just one definition (the external definition is emitted automatically), so it cannot be divided into header and source file, which is crucial to make it accessible to C code (header) and assembly (source).
Is there any better method to achieve what I've been trying to do?
The overhead of a call forcing you to assume most registers are clobbered is pretty high. For high performance you need to manually inline your functions into asm so you can fully optimize everything.
Getting the compiler to emit a stand-alone definition and calling it should only be considered for code that's not performance-critical. You didn't say what you're writing in asm, or why, but I'm assuming that it is performance critical. Otherwise you'd just write it in C (with inline asm for any special instructions, I guess?).
If you don't want to manually inline, and you want to use these small inline C functions inside a loop, you'll probably get better performance from writing the whole thing in C. That would let the compiler optimize across a lot more code.
The register-arg calling conventions used for x86-64 are nice, but there are a lot of registers that are call-clobbered, so calls in the middle of computing stuff stop you from keeping as much data live in registers.
Can assembly code, by any means, make use of inlining? I mean as in an .S file, not inline assembly.
No, there's no syntax for the reverse of inline asm. If there were, it would be something like this: you tell the compiler which registers the inputs are in, which registers you want the outputs in, and which registers it's allowed to clobber.
Common-subexpression-elimination and other significant optimizations between the hand-written asm and the compiler output wouldn't be possible without a compiler that really understood the hand-written asm, or treated it as source code and then emitted an optimized version of the whole thing.
Optimal inlining of compiler output into asm will typically require adjustments to the asm, which is why there aren't any programs to do it.
Is there any better method to achieve what I've been trying to do?
Now that you've explained in comments what your goals are: make small wrappers in C for the special instructions you want to use, instead of the other way around.
#include <stdint.h>
struct __attribute__((packed)) lgdt_arg {
uint16_t limit;
void * base; // FIXME: always 64bit in long mode, including the x32 ABI where pointers and uintptr_t are 32bit.
// In 16bit mode, base is 24bit (not 32), so I guess be careful with that too
// you could just make this a uint64_t, since x86 is little-endian.
// The trailing bytes don't matter since the instruction just uses a pointer to the struct.
};
inline void lgdt (const struct lgdt_arg *p) {
asm volatile ("lgdt %0" : : "m"(*p) : "memory");
}
// Or this kind of construct sometimes gets used to make doubly sure compile-time reordering doesn't happen:
inline void lgdt_v2 (struct lgdt_arg *p) {
asm volatile ("lgdt %0" : "+m"(*(volatile struct lgdt_arg *)p) :: "memory");
}
// that puts the asm statement into the dependency chain of things affecting the contents of the pointed-to struct, so the compiler is forced to order it correctly.
void set_gdt(unsigned size, char *table) {
struct lgdt_arg tmp = { size, table };
lgdt (&tmp);
}
set_gdt compiles to (gcc 5.3 -O3 on godbolt):
movw %di, -24(%rsp)
movq %rsi, -22(%rsp)
lgdt -24(%rsp)
ret
I've never written code involving lgdt. It's probably a good idea to use a "memory" clobber like I did to make sure any loads/stores aren't reordered across it at compile time. That will make sure the GDT it points to is fully initialized before running LGDT. (Same for LIDT.) Compilers might notice that base gives the inline asm a reference to the GDT, and make sure its contents are in sync, but I'm not sure. There should be little to no downside to just using a "memory" clobber here.
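The effect of the "memory" clobber can be illustrated without a privileged instruction. The sketch below is a stand-in under assumptions, not the lgdt wrapper itself: compiler_barrier and publish are hypothetical names, and the empty asm statement relies on the GNU C extended-asm syntax (GCC and Clang). It emits no instruction; it only tells the compiler that memory may be read or written at that point, so the preceding store is not moved past it at compile time.

```c
/* Empty asm statement with a "memory" clobber: generates no machine
   instruction, but the compiler must assume memory is touched here. */
static inline void compiler_barrier(void)
{
    __asm__ __volatile__("" ::: "memory");
}

/* Hypothetical helper: the store to *slot is emitted before the barrier,
   and the value is read back from memory after it. */
int publish(int *slot, int value)
{
    *slot = value;
    compiler_barrier();
    return *slot;
}
```

This only constrains compile-time ordering; it is not a hardware memory fence.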
Linux (the kernel) uses this sort of wrapper around an instruction or two all over the place, writing as little code as possible in asm. Look there for inspiration if you want.
re: your comments: yes you'll want to write your boot sector in asm, and maybe some other 16bit code since gcc's -m16 code is silly (still basically 32bit code).
No, there's no way to inline C compiler output into asm other than manually. That's normal and expected, for the same reason there aren't programs that optimize assembly. (i.e. read asm source, optimize, write different asm source).
Think about what such a program would have to do: it would have to understand the hand-written asm to be able to know what it could change without breaking the hand-written asm. Asm as a source language doesn't give an optimizer much to work with.
The answer you linked to explains how C99 inline functions work but doesn't explain why the definition is that quirky. The relevant standard paragraph is ISO 9899:2011 §6.7.4 ¶6–7 (ISO 9899:1999 ibid.):
6 A function declared with an inline function specifier is an inline function. Making a function an inline function suggests that calls to the function be as fast as possible.138) The extent to which such suggestions are effective is implementation-defined. 139)
7 Any function with internal linkage can be an inline function. For a function with external linkage, the following restrictions apply: If a function is declared with an inline function specifier, then it shall also be defined in the same translation unit. If all of the file scope declarations for a function in a translation unit include the inline function specifier without extern, then the definition in that translation unit is an inline definition. An inline definition does not provide an external definition for the function, and does not forbid an external definition in another translation unit. An inline definition provides an alternative to an external definition, which a translator may use to implement any call to the function in the same translation unit. It is unspecified whether a call to the function uses the inline definition or the external definition.140)
138) By using, for example, an alternative to the usual function call mechanism, such as "inline substitution". Inline substitution is not textual substitution, nor does it create a new function. Therefore, for example, the expansion of a macro used within the body of the function uses the definition it had at the point the function body appears, and not where the function is called; and identifiers refer to the declarations in scope where the body occurs. Likewise, the function has a single address, regardless of the number of inline definitions that occur in addition to the external definition.
139) For example, an implementation might never perform inline substitution, or might only perform inline substitutions to calls in the scope of an inline declaration.
140) Since an inline definition is distinct from the corresponding external definition and from any other corresponding inline definitions in other translation units, all corresponding objects with static storage duration are also distinct in each of the definitions.
How does the definition of inline come into play? Well, if only inline declarations (without extern or static) of a function exist in a translation unit, no code for the function is emitted. But if a single declaration without inline or with extern exists, then code for the function is emitted, even if it is defined as an inline function. This design aspect allows you to describe the module that contains the machine code for an inline function without having to duplicate the implementation:
In your header file, place inline definitions:
fast_things.h
/* TODO: add assembly implementation */
inline int fast_add(int a, int b)
{
return (a + b);
}
inline int fast_mul(int a, int b)
{
return (a * b);
}
This header can be included in every translation unit and provides inline definitions for fast_add and fast_mul. To generate the machine code for these two, add this file:
fast_things.c
#include "fast_things.h"
extern inline int fast_add(int, int);
extern inline int fast_mul(int, int);
You can avoid typing all of this out using some macro magic. Change fast_things.h like this:
#ifndef EXTERN_INLINE
#define EXTERN_INLINE_UNDEFINED
#define EXTERN_INLINE inline
#endif
EXTERN_INLINE int fast_add(int a, int b)
{
return (a + b);
}
EXTERN_INLINE int fast_mul(int a, int b)
{
return (a * b);
}
#ifdef EXTERN_INLINE_UNDEFINED
#undef EXTERN_INLINE
#undef EXTERN_INLINE_UNDEFINED
#endif
Then fast_things.c simply becomes:
#define EXTERN_INLINE extern inline
#include "fast_things.h"
Since code is emitted for the inline functions, you can call them from assembly just fine. You cannot however inline them in assembly as the assembler doesn't speak C.
There are also static inline functions which might be more suitable for your purpose (i.e. tiny helper functions) when you can make reasonably sure that they are always inlined.
The GNU assembler supports macros in its custom macro language. One possibility is to write a custom preprocessor that takes your inline assembly and emits both gcc-style inline assembly for C and gas macros. This should be possible with sed, m4, or awk (in descending order of difficulty). It might also be possible to abuse the C preprocessor's stringify (#) operator for this; if you can give me a concrete example, I could try to throw something together.

What does the compiler do to a function which contains only one function call

As the title suggests, what happens if I have:
void a(uint8_t i) {
b(i, 0);
}
Will a compiler be able to replace a call to a(i) with b(i, 0)?
Also, in either case, would the following be considered good practice to replace the above:
#define a(i) b(i, 0)
This is pretty easy to test. If the call to a is in the same compilation unit most compilers will optimize it. Let's see what happens:
$ cat > foo.c
void b(int, int);
void
a(int a)
{
b(a, 0);
}
void
foo(void)
{
a(17);
}
Then compile it to just assembler with some basic optimizations (I added omit-frame-pointer to create cleaner output, you can verify that exactly the same thing will happen without that flag):
$ cc -fomit-frame-pointer -S -O2 foo.c
And then look at the output (I cleaned it up and kept just the code; there are a lot of annotations in the generated assembler that aren't relevant here):
$ cat foo.s
a:
xorl %esi, %esi
jmp b
foo:
xorl %esi, %esi
movl $17, %edi
jmp b
So we can see here that the compiler first generated a normal function a that calls b (except it's tail-call optimized, so it's a jmp instead of a call). Then, when compiling foo, instead of calling a it just inlined it.
The compiler I used in this case was a relatively old version of gcc, I also checked that clang does the exact same thing. This is pretty standard optimization and as long as the compiler does any inlining, a simple function like this will always be inlined.
It depends on a few things, not least of which is your choice of toolchain (compiler, linker, etc) and optimisation settings.
If the compiler has visibility of the definition of a() - not just a declaration - it might elect to inline a(). A compiler is not required to do that but, depending on optimisation settings and quality of implementation of the compiler itself, it might. Your case is, however, a fairly common and straight-forward optimisation for modern compilers.
If the function is not declared static (which very over-simplistically makes it local to a particular compilation unit) then most compilers will still keep a definition of the function a() in the object file, so it can be linked in with other object files (for other compilation units), even if the compiler chooses to inline calls of the function within the compilation unit that defines it.
If the function is declared inline (and the compiler has visibility of the definition) the same actually applies. inline is a hint which the standard permits a compiler to ignore, no matter how adamant the programmer is. In practice, modern compilers can often do a better job of deciding which functions to inline than a programmer can.
If you have code that stores the address of a() (e.g. in a pointer to function) the compiler might elect to not inline it.
Even if the compiler does not inline the function, a smart linker might choose to (in effect) inline it. Most C implementations, however, use a traditional dumb linker as part of the toolchain - so this type of link-time optimisation is unlikely in practice.
Even if the linker doesn't, some virtual machine host environments might elect to inline at run time. This would be highly unusual for a C program but not beyond realms of possibility.
Personally, I wouldn't worry about it. There will be few observable differences (e.g. in program performance, size, etc) whether the compiler does this style of optimisation or not, unless you have a truly large number of such functions.
I would not use a macro. If you really don't want to type , 0 whenever you use b(), then simply write your function a(), and let the compiler worry about it. Only try to optimise further by hand if performance measures and profiling show your function a() is a performance hotspot. Which it probably won't be.
Or, use C++, and declare the function b() with a default value of 0 for the second argument. ;)
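The advice to prefer a plain wrapper function over a macro can be sketched as follows. Everything here is an assumption made for illustration: the body of b(), the bookkeeping variables last_value, last_flag and call_count, and the helper for_each are hypothetical. The point shown is that a function, unlike a macro, has a type-checked parameter and an address that can be passed around.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping so that the effect of b() is observable. */
static unsigned last_value;
static int last_flag = -1;
static unsigned call_count;

/* Hypothetical stand-in for the two-argument function in the question. */
void b(uint8_t i, int flag)
{
    last_value = i;
    last_flag = flag;
    ++call_count;
}

/* The wrapper: small enough that an optimizing compiler will usually
   inline it, although the standard does not require that. */
static inline void a(uint8_t i)
{
    b(i, 0);
}

/* A function can be passed as a pointer; a macro cannot. */
void for_each(const uint8_t *values, size_t n, void (*fn)(uint8_t))
{
    for (size_t k = 0; k < n; ++k)
        fn(values[k]);
}
```

A call such as for_each(values, n, a) would not compile if a were a function-like macro, because the bare name a would then not be expanded and no function named a would exist.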
The compiler will most likely optimize this code, and make it an inline function:
inline void a(uint8_t i) {
b(i, 0);
}
So calls like a(i) will indeed be replaced with b(i, 0)

Should an inline function be defined before it is called?

The C language allows source files to be read in a single pass without looking ahead; at any point the compiler only has to consider declarations, prototypes, macro definitions etc. that have appeared before its current position in the file.
Does this mean that for a function call to be inlined, a compiler might require the function to be defined before the call? For example:
int foo(void);
int bar(void) { return foo(); }
inline int foo(void) { return 42; }
Is the call to foo more likely to be inlined if the inline definition is moved in front of the definition of bar?
Should I arrange inline definitions in my code so that they appear before the calls that should best be inlined? Or can I just assume that any optimizing compiler that is advanced enough to do inlining at all will be able to find the definition even if it appears after the call (which seems to be the case with gcc)?
EDIT: I noticed that Pelles C with the /Ob1 option does indeed require the definition to be visible before a call can be inlined. The compiler also offers an /Ob2 option which removes this limitation (and also allows the compiler to inline functions without an inline specifier, similar to what gcc does), but the documentation states that using this second option may require much more memory.
It shouldn't make any difference in practice, because it is the compiler's choice whether to inline a function, even if it is explicitly told to inline it. The compiler may also inline a function even if it is not defined with the inline keyword.
First, I compiled your code with gcc 4.6.3 without any optimizations:
$ gcc -fdump-ipa-inline test.c
From the generated assembly, neither foo nor bar is inlined, even though foo is declared inline.
When I moved the definition of inline foo to the top, the compiler still didn't inline either of them.
Next I did the same with -O3:
$ gcc -fdump-ipa-inline -O3 test.c
Now both functions are inlined, even though only one has the inline declaration.
Basically the compiler can inline a function as it sees fit.
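For compilers that only inline definitions they have already seen (as reported for Pelles C with /Ob1 in the question), a conservative ordering is to place the inline definition before its first use. The sketch below reuses the names foo and bar from the question; using static inline here is an assumption made so that the example needs no separate external definition.

```c
/* Definition first: even a strictly single-pass compiler has seen the
   body of foo by the time it reaches the call in bar. */
static inline int foo(void)
{
    return 42;
}

int bar(void)
{
    return foo();   /* candidate for inlining; the compiler still decides */
}
```

Whether the call is actually inlined remains up to the compiler and its optimization settings.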

multiple definition of inline function

I have gone through some posts related to this topic but was not able to sort out my doubt completely. This might be a very naive question.
I have a header file inline.h and two translation units main.cpp and tran.cpp.
Details of code are as below
inline.h
#ifndef __HEADER__
#include <stdio.h>
extern inline int func1(void)
{ return 5; }
static inline int func2(void)
{ return 6; }
inline int func3(void)
{ return 7; }
#endif
main.c
#define <stdio.h>
#include <inline.h>
int main(int argc, char *argv[])
{
printf("%d\n",func1());
printf("%d\n",func2());
printf("%d\n",func3());
return 0;
}
tran.cpp
//(note that the functions are not inline here)
#include <stdio.h>
int func1(void)
{ return 500; }
int func2(void)
{ return 600; }
int func3(void)
{ return 700; }
The above code compiles in g++, but does not compile in gcc (even if you make changes related to gcc like changing the code to .c, not using any C++ header files, etc.). The error displayed is "duplicate definition of inline function - func3".
Can you clarify why this difference is present across compilers?
Also, when you run the program (g++ compiled) by creating two separate compilation units (main.o and tran.o) and create an executable a.out, the output obtained is:
500
6
700
Why does the compiler pick up the definition of the function that is not inline? Since #include is used to "add" the inline definition, I had actually expected 5, 6, 7 as the output. My understanding was that, since the inline definition is found during compilation, the function call would be "replaced" by the inline function definition.
Can you please tell me in detailed steps the process of compilation and linking which would lead us to 500,6,700 output. I can only understand the output 6.
This answer is divided into the following sections:
How to reproduce the duplicate definition of inline function - func3 problem and why.
Why the definition of func3 is a duplicate instead of func1.
Why it compiles using g++
How to produce the duplicate definition of inline function - func3 problem
The problem can be reproduced by:
Renaming tran.cpp to tran.c
Compiling with gcc -o main main.c tran.c
@K71993 is actually compiling with the old gnu89 inline semantics, which differ from those of C99. The reason for renaming tran.cpp to tran.c is to tell the gcc driver to treat it as C source instead of C++ source.
Why definition of func3 is a duplicate instead of func1.
GNU 89 inline semantics
The following text, quoted from the GCC documentation (An Inline Function is As Fast As a Macro), explains why func3 is the duplicate definition rather than func1: func3 (not func1) is the externally visible symbol under GNU89 inline semantics.
When an inline function is not static, then the compiler must assume that there may be calls from other source files; since a global symbol can be defined only once in any program, the function must not be defined in the other source files, so the calls therein cannot be integrated. Therefore, a non-static inline function is always compiled on its own in the usual fashion.
If you specify both inline and extern in the function definition, then the definition is used only for inlining. In no case is the function compiled on its own, not even if you refer to its address explicitly. Such an address becomes an external reference, as if you had only declared the function, and had not defined it.
C99 inline semantics
If compiled in C99 mode, i.e., gcc -o main main.c tran.c -std=c99, the linker will complain that the definition of func1 is a duplicate, because extern inline in C99 is an external definition, as mentioned in other posts and comments.
Please also refer to this excellent answer about the semantic differences between GNU89 inline and C99 inline.
Why it compiles using g++.
When compiled with g++, the source files are treated as C++ source. Since func1, func2 and func3 are defined in multiple translation units and their definitions differ, the One Definition Rule of C++ is violated. Since the compiler is not required to issue a diagnostic when the conflicting definitions span multiple translation units, the behavior is undefined.
Maybe you should post the actual code. The snippets you show don't compile:
inline.h has extern inline int func1(void). That doesn't make any sense.
main.c has #define <stdio.h>. I think you meant #include instead.
Once I fixed those and compiled with gcc, it compiled fine and I got the following output:
5
6
7
When I compile with g++, I get this output:
5
6
700
That happens because func3() is not static in inline.h
The compile error occurs because there is a duplicate definition of func1().
Because func1() is defined using extern inline, it produces an external definition.
However, there is also an external definition in tran.c, which causes a multiple definition error.
In contrast, func2() and func3() do not produce an external definition, hence no redefinition error.
You might want to look at http://www.greenend.org.uk/rjk/2003/03/inline.html.
Also, note that C++ and C treat inline functions differently, and even within C, different dialects (gnu89 vs. C99) treat inline functions differently.
Your code is invalid from the C++ point of view, since it blatantly violates the One Definition Rule. The only reason you managed to compile it with a C++ compiler is the loose error checking in your C++ compiler (it happens to be one of those parts of the ODR where "no diagnostic is required").
Your code is not valid C, because it provides duplicate external definitions of the function func1. Note that it is func1, not func3, that is problematic from the C point of view. There's nothing formally wrong with your func3. Your func2 is also OK, as long as the two definitions never "meet" each other in the same translation unit.
One possible reason you might be getting a different diagnostic report from your compiler is that your C compiler might be supporting inline functions in some non-standard compiler-specific way (either a pre-C99 compiler or a modern compiler run in non-standard "legacy" mode).
Frankly, I find it hard to believe you are getting an error report about func3 from any compiler, assuming the code you posted accurately represents what you are trying to compile. Most likely what you posted is not the real code.
The compile error you see is actually a linker error.
gcc and g++ are treating static inline a little differently. inline was first part of C++ and then made into an extension to many C compilers, before being added to standard C. The standard semantics could be different, but it could just be the implementations that are different.
It could also have something to do with some crazy stuff that happens with C++ code that gets rid of duplicate template stuff catching other duplicate stuff as well.
Basically, inline was a late addition to GCC (I mean the C compiler).
"[ . . . ] An inline definition does not provide an external definition for the function, and does not forbid an external definition in another translation unit. An inline definition provides an alternative to an external definition, which a translator may use to implement any call to the function in the same translation unit. It is unspecified whether a call to the function uses the inline definition or the external definition."
— ISO 9899:1999(E), the C99 standard, section 6.7.4
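The C99 behavior discussed in these answers can be summarized in a short sketch. It reuses the names func1 and func2 from the question; sum_of_both is a hypothetical helper, and the comments describe C99 (or later) mode only, not the older gnu89 semantics.

```c
/* extern inline on a definition: in C99 this IS the external definition,
   so a second non-inline func1 in another file would be a duplicate. */
extern inline int func1(void)
{
    return 5;
}

/* static inline: internal linkage, a private copy per translation unit,
   so it never clashes with definitions in other files. */
static inline int func2(void)
{
    return 6;
}

/* A plain "inline int func3(void) { ... }" would be only an inline
   definition: it emits no symbol, and any call that is not inlined needs
   an external definition of func3 from some other translation unit. */

int sum_of_both(void)
{
    return func1() + func2();
}
```

Under gnu89 semantics the roles of plain inline and extern inline are essentially swapped, which is why the reported diagnostic names func3 instead of func1.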

Will GCC inline a function that takes a pointer?

I have a function which operates on a piece of data (let's say, an int), and I want to change it in place by passing a reference to the value. As such, I have the function: void myFunction(int *thing) { ... }. When I use it I call it thus: myFunction(&anInt).
As my function is called frequently (but from many different places) I am concerned about its performance. The reason I have refactored it into a function is testability and code reuse.
Will the compiler be able to optimize the function, inlining it to operate directly on my anInt variable?
I hope you'll take this question in the spirit in which it's asked (i.e. I'm not prematurely worrying about optimisation, I'm curious about the answer). Similarly, I don't want to make it into a macro.
One way to find out if the function is inlined is to use the -Winline gcc option:
-Winline
Warn if a function can not be inlined and it was declared as inline.
Even with this option, the compiler will not warn about failures to inline
functions declared in system headers.
The compiler uses a variety of heuristics to determine whether or not to
inline a function. For example, the compiler takes into account the size
of the function being inlined and the amount of inlining that has already
been done in the current function. Therefore, seemingly insignificant
changes in the source program can cause the warnings produced by -Winline
to appear or disappear.
GCC is quite smart. Consider this code fragment:
#include <stdio.h>
void __inline__ inc(int *val)
{
++ *val;
}
int main()
{
int val;
scanf("%d", &val);
inc(&val);
printf("%d\n", val);
return 0;
}
After a gcc -S -O3 test.c you'll get the following relevant asm:
...
call __isoc99_scanf
movl 12(%rsp), %esi
movl $.LC1, %edi
xorl %eax, %eax
addl $1, %esi
movl %esi, 12(%rsp)
call printf
...
As you can see, there's no need to be an asm expert to see the inc() call has been converted to an increment instruction.
There are two issues here - can the code be optimised, and will it. It certainly can be, but the "will" depends on the mood the optimiser is in. If this is really important to you, compile the code and take a look at the assembly language output.
Also, you seem to be conflating two issues. An inlined function effectively has its body pasted at the call site. Whether or not you are using pointers is neither here nor there. But you seem to be asking if the compiler can transform:
int x = 42;
int * p = & x;
* p = * p + 1;
into
x = x + 1;
This is much more difficult for the optimiser to see.
It should not matter whether the argument is a pointer or not.
But first, if the compiler should automatically inline the function, it must be static. And it must be contained in the same compilation unit. NOTE: we are talking about C, not C++. C++ inline rules differ.
If it is not possible to have the function in the same compilation unit, then try global optimizations (check documentation of your compiler for details).
C99 gives you an "inline" keyword, just as in C++, which lifts the restriction to the compilation unit.
Here is some further information.
It will (or at least can). There are some cases where the function cannot be inlined, e.g. when you take a pointer to the function and call it through that pointer (accessing parameters by reference, as you do, is fine). There may be other situations (static variables? unsure).
Try declaring the function with "extern inline": under GCC's traditional gnu89 semantics this prevents the compiler from emitting the standalone body, so if it cannot inline the function, the build fails with an undefined reference.
If you're concerned about the compiler generating suboptimal code and want to change a simple valuetype, declare your function as int myFunction(int) and return the new value.
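The two styles mentioned above, updating through a pointer and returning the new value, can be put side by side. This is a minimal sketch with hypothetical names (increment_in_place, incremented, bump_twice); it makes no claim about what a particular GCC version emits, only that both forms are small enough to be ordinary inlining candidates.

```c
/* Pointer style: modifies the caller's variable in place. */
static inline void increment_in_place(int *thing)
{
    ++*thing;
}

/* Value style: takes the old value and returns the new one. */
static inline int incremented(int thing)
{
    return thing + 1;
}

/* Uses both styles once on a local variable. */
int bump_twice(int start)
{
    int x = start;
    increment_in_place(&x);
    x = incremented(x);
    return x;
}
```

Compiling with gcc -S -O3 and reading the output for bump_twice is the reliable way to see whether either call survives.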
What compiler version are you using? With what options? On what platform?
All these questions affect the answer. You really need to compile the code and look at the assembly to be sure.
This looks to me like a classic case of premature optimization. Do you really know there is a performance issue here? One worth wasting your valuable time worrying about? I mean, really know? Like, have you measured it?
By itself this isn't too bad, but if you take this attitude over a large amount of code, you can do some serious damage and waste a large amount of development and maintenance time for no good reason.

Resources