Inline a function inside another inline function in C

I currently have inline functions that call another inline function (a simple, four-line getAbs() function). However, by looking at the assembler output, I discovered that the "big" inline functions are inlined correctly, but the compiler uses a bl jump to call getAbs().
Is it not possible to inline a function inside another inline function? By the way, this is embedded code; we are not using the standard libraries.
Edit: The compiler is WindRiver, and I have already checked that inlining would be beneficial (4 instructions instead of roughly 40).

Depending on what compiler you are using you may be able to encourage the compiler to be less reluctant to inline, e.g. with gcc you can use __attribute__ ((always_inline)), with Intel ICC you can use icc -inline-level=1 -inline-forceinline, and with Apple's gcc you can use gcc -obey-inline.
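As a minimal sketch of the gcc option mentioned above, the following shows one forced-inline function calling another. The names get_abs, scaled_distance and FORCE_INLINE are hypothetical, and the attribute is a GCC extension; whether the Wind River compiler accepts it is an assumption you would need to check against its documentation.

```c
/* Sketch only: FORCE_INLINE, get_abs and scaled_distance are
   hypothetical names, not taken from the question. */
#if defined(__GNUC__)
/* GCC extension: inline even where the heuristics would refuse. */
#define FORCE_INLINE static inline __attribute__((always_inline))
#else
/* Fallback: plain hint, which the compiler may ignore. */
#define FORCE_INLINE static inline
#endif

/* Small helper that we want inlined into every caller. */
FORCE_INLINE int get_abs(int x)
{
    return (x < 0) ? -x : x;
}

/* Larger inline function that calls the helper. */
FORCE_INLINE int scaled_distance(int a, int b, int scale)
{
    return get_abs(a - b) * scale;
}
```

Inspecting the generated assembly remains the only way to confirm that no call instruction is left for get_abs.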

The inline keyword is a suggestion to the compiler, nothing more. It's free to take that suggestion on board, totally ignore it, or even lie to you and tell you that it's doing it while it's really not.
The only way to force code to be inline is to, well, write it inline. But even then, the compiler may decide it knows better and shift it out to another function. It has a lot of leeway in generating executable code for your particular source, provided it doesn't change the semantics of it.
Modern compilers are more than capable of generating better code than most developers would hand-craft in assembly. I think the inline keyword should go the same path as the register keyword.
If you've seen the output of gcc at its insane optimisation level, you'll understand why. It has produced code that I wouldn't have dreamed possible, and that took me a long time to understand.
As an aside, check this out for what optimisations that gcc actually has, including a great many containing the text "inline" or "inlining".

@gramm: There are quite a few scenarios in which inline isn't necessarily to your benefit. Most compilers use some very advanced heuristics to determine when to inline. When discussing inlining, the simplest idea is: trust your compiler to produce the fastest code.

I recently had a very similar problem, and reading this post has given me a wacky idea. Why not have a simple pre-compilation step (a simple regex should do the job) that parses out the function call and puts the function's source code in-line? Use a tag such as /inline/ /end_of_inline/ so that you can still use normal IDE features (if you use, or might use, an IDE).
Include this in your build process. That way you keep the readability advantage while removing the compiler's assumption that you are only as good a developer as most and do not understand when to inline.
Nonetheless, before trying this you should probably go through the compiler's command-line options.

I would suggest that if your getAbs() function (sounds like absolute value but you really should be showing us code with the question...) is 4 lines long, then you have much bigger optimizations to worry about than whether the code gets inlined or not.

Related

Why is optimizing inline functions easier than normal functions?

I'm reading What Every Programmer Should Know About Memory
https://people.freebsd.org/~lstewart/articles/cpumemory.pdf and it says that inline functions make your code more optimizable.
For example:
Inlining of functions, in particular, allows the compiler to optimize larger chunks of code at a time which, in turn, enables the generation of machine code which better exploits the processor’s pipeline architecture
and:
The handling of both code and data (through dead code elimination or value range propagation, and others) works better when larger parts of the program can be considered as a single unit.
and this also:
If a function is only called once it might as well be inlined. This gives the compiler the opportunity to perform more optimizations (like value range propagation, which might significantly improve the code).
After reading these, to me at least, it seems like inline functions are easier to optimize, but why? Why is it easier to optimize something that is inlined?
The reason it is easier to do a better job when optimizing inlined functions than outlined ones is that the compiler knows the complete context in which the function is called and can use this information to tune the generated code to match this exact context. This often allows more efficient code for the inlined copy of the function, but also for the calling function. The caller and the callee can be optimized to fit each other in a way that is not possible for outlined functions.
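A small sketch of what "knowing the context" can mean in practice. The function names are hypothetical, and what a given compiler actually does depends on its version and optimization flags.

```c
/* Hypothetical helper: clamps a value to the range 0..100. */
static inline int clamp_percent(int value)
{
    if (value < 0)
        return 0;
    if (value > 100)
        return 100;
    return value;
}

/* Once clamp_percent is inlined here, the argument is the constant 50,
   so both range checks are provably false (dead code elimination) and
   an optimizer can reduce the whole body to "return 50". */
int half_percent(void)
{
    return clamp_percent(50);
}

/* Here the argument is an unsigned char, so its value is known to lie
   in 0..255 (value range propagation). After inlining, the "value < 0"
   branch can be removed; only the upper check has to remain. */
int percent_from_byte(unsigned char b)
{
    return clamp_percent(b);
}
```

Without inlining, clamp_percent has to be compiled for every possible int, and neither caller benefits from what it knows about its own argument.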
There is no difference!
All functions are subject to being inlined by gcc in -O3 optimization mode, whether declared inline, static, neither or both.
see: https://stackoverflow.com/a/40783656/9925764
Or here is a modified version of @Eugene Sh.'s example, without the noinline option:
https://godbolt.org/z/arPEf7rd4

Advantage of #define instead of creating a function in embedded

Recently I looked at some embedded code in which they are using
#define print() printf("hello world")
instead of
void print() { printf("hello world"); }
My question: what is the gain from using #define instead of creating a function?
It may be related to performance.
A function call has some overhead (i.e. calling, saving things on the stack, returning, etc.), while a macro is a direct substitution of the macro name with its contents (i.e. no overhead).
In this example, the functions foo and bar do exactly the same thing: foo uses a macro, while bar uses a function call.
As you can see, bar and printY together require more instructions than foo.
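The example this answer refers to is not preserved in the text. A minimal sketch of the kind of comparison it describes could look like the following; foo, bar and printY are the names the answer mentions, while PRINT_Y and the function bodies are assumptions made for illustration.

```c
#include <stdio.h>

/* Hypothetical macro version: the preprocessor pastes the printf call
   directly into the caller, so no separate function exists. */
#define PRINT_Y(y) printf("%d\n", (y))

/* Function version: without optimization, each use is a real call. */
int printY(int y)
{
    return printf("%d\n", y);
}

/* foo uses the macro; after preprocessing it contains printf itself. */
int foo(int x)
{
    return PRINT_Y(x + 1);
}

/* bar calls printY, which in turn calls printf: one extra call level
   when the compiler does not inline printY. */
int bar(int x)
{
    return printY(x + 1);
}
```

Compiling this once without optimization and once with it, and comparing the assembly for foo and bar, shows whether the extra call survives on a given compiler.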
So by using a macro the performance got a little better.
But... there are downsides to this approach:
Macros are hard to debug as you can't single step a macro
Extensive use of a macro increases the size of the binary (compared to using function call). Something that can impact performance in a negative direction.
Also notice that modern compilers (with optimization on) are really good at figuring out when it's a good idea to automatically inline a function (i.e. your code is written with a function call but the compiler decides to inline the function as if it was a macro). So you might get the same performance using function call.
Further, you can use the inline keyword as a hint to the compiler that you think it would be good to inline a function. But even with that keyword the compiler may decide not to inline. The only way to make sure that the code ends up inline is to use a macro.
There is no advantage. Using #define like this is quite ancient C programming style.
In the year 1999, the C language got the inline keyword to make all such macros obsolete. And with modern compilers, inline is often superfluous too, since the compiler is nowadays better than the programmer when it comes to determining when to inline.
Some of the embedded compilers out there can still be rather bad at such optimizations though, and that's why embedded C code tends to lag behind in modernization.
In general, doing micro-optimizations like this is called "premature optimization", meaning the programmer is meddling with optimizations that they should leave to the compiler. This applies even in hard real-time systems. Manual optimization should only be a last resort, once you have 1) detected an actual bottleneck, and 2) disassembled the code to see whether manual inlining actually does anything good for performance.
Sometimes you want to stub out functionality at compile time. Macros give you an easy way to do this.
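A minimal sketch of stubbing out functionality with a macro. ENABLE_TRACE is a hypothetical build flag (for example passed as -DENABLE_TRACE), and TRACE and add_with_trace are illustrative names.

```c
#include <stdio.h>

/* When the hypothetical ENABLE_TRACE flag is defined, TRACE prints a
   message; otherwise it expands to a no-op and costs nothing. */
#ifdef ENABLE_TRACE
#define TRACE(msg) fprintf(stderr, "trace: %s\n", (msg))
#else
#define TRACE(msg) ((void)0)
#endif

/* The call site looks the same in both builds. */
int add_with_trace(int a, int b)
{
    TRACE("add_with_trace called");
    return a + b;
}
```

In a build without the flag, the trace string does not even appear in the compiled code, which a plain function call cannot guarantee on a non-optimizing compiler.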

Is there, as in JavaScript, a performance penalty for creating functions in C?

In JavaScript, there are, often, huge performance penalties for writing functions. For example, if you use this function:
function double(x){ return x*2; }
inside an inner loop, you are probably hurting your performance considerably, so it is really profitable to inline that kind of function in intensive applications. Does this, in general, hold for C? Am I free to create those kinds of functions for everything and rest assured that the compiler will do the job, or is hand inlining still important?
The answer is: it depends.
I'm currently using MSVC compiler and GCC for a project at work and my experience is that they both do a pretty good job. Furthermore, the cost of a function call in native code can be pretty small, especially in functions that do not need to be accessible outside the executable (like functions not exported in a shared library). For these functions, there is more flexibility with how the call is actually implemented.
A few things to note: it's much easier for a compiler to optimize calls to static functions. Functions with external linkage often require link time optimization since one must know how and where the function is actually called, as well as the implementation, to do much optimization or inlining. This requires examining more than one compilation unit at a time.
I would say that you should use functions where it makes sense and makes the code easier to read and maintain. In general, it is safe to assume that the cost is smaller than it would be in JavaScript. But in the end, you'd have to profile the code to say anything more precise.
UPDATE: I want to emphasize that functions can be inlined across compilation units, but this requires link-time optimization (or whole program optimization). This is supported in both GCC (https://gcc.gnu.org/wiki/LinkTimeOptimization) and MSVC (http://msdn.microsoft.com/en-us/library/0zza0de8.aspx).
These days, if you can beat the compiler by copying the body of a function and pasting it everywhere you call that function, you probably need a different compiler.
In general, with optimizations turned on, gcc will tend to inline short functions provided that they are defined in the same compilation unit that they are called in.
Moreover, if the calling function and called function are in different compilation units, the compiler does not have a chance to inline them regardless of what you request.
So, if you want to maximize the chance of the compiler optimizing away a function call (without manually inlining), you should define the function in a .h file or in the same .c file that it is called from.
There are no inner functions in C. Period. So the rest of your question is kind of irrelevant.
Anyway, as for "normal" functions in C, the compiler may or may not inline them (replace the function invocation with its body). If you compile your code with "optimize for size", it may decide not to inline, for obvious reasons.
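A single-file sketch of that layout. The comments mark what would live in a header and what would live in a source file; the file name absutil.h and the function names are hypothetical.

```c
/* ---- contents of a hypothetical header, e.g. "absutil.h" ---- */
#ifndef ABSUTIL_H
#define ABSUTIL_H

/* static inline: every .c file that includes the header gets its own
   private definition, so the body is always visible to the compiler. */
static inline int abs_diff(int a, int b)
{
    return (a > b) ? (a - b) : (b - a);
}

#endif /* ABSUTIL_H */

/* ---- contents of a .c file that would #include "absutil.h" ---- */

/* Largest difference between neighbouring elements. Because abs_diff
   is defined in this translation unit, the compiler is able to inline
   it into the loop without link-time optimization. */
int max_gap(const int *values, int count)
{
    int best = 0;
    for (int i = 1; i < count; ++i) {
        int gap = abs_diff(values[i - 1], values[i]);
        if (gap > best)
            best = gap;
    }
    return best;
}
```

If abs_diff were only declared in the header and defined in a different .c file, the compiler would see nothing but a prototype here and would have to emit a call.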

Difference between macros and functions in C in relation to instruction memory and speed

To my understanding, the difference between a macro and a function is that a macro call is replaced by the instructions in its definition, while a function does the whole push, branch and pop thing. Is this right, or have I misunderstood something?
Additionally, if this is right, it would mean, that macros would take more space, but would be faster (because of the lack of the push, branch and pop instructions), wouldn't it?
What you wrote about the performance implications is correct if the C compiler is not optimizing. But optimizing compilers can inline functions just as if they were macros, so an inlined function call runs at the same speed as a macro, and there is no pushing/popping overhead. To trigger inlining, enable optimization in your compiler settings (e.g. gcc -O2), and put your functions in the .h file as static inline.
Please note that sometimes inlining/macros is faster, sometimes a real function call is faster, depending on the code and the compiler. If the function body is very short (and most of it will be optimized away), usually inlining is faster than a function call.
Another important difference is that macros can take arguments of different types, and the macro definition can make sense for multiple types (but the compiler won't do type checking for you, so you may get undesired behavior or a cryptic error message if you use a macro with the wrong argument type). This polymorphism is hard to mimic with functions in C (but easy in C++ with function overloading and function templates).
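A short sketch of that type-agnostic behaviour. MAX_OF and the two wrapper functions are hypothetical names.

```c
/* One macro definition, usable with any type that supports '>'.
   No type checking happens at the macro level. */
#define MAX_OF(a, b) (((a) > (b)) ? (a) : (b))

/* Expands to an int comparison. */
int max_of_ints(void)
{
    return MAX_OF(3, 7);
}

/* The same macro expands to a double comparison here. */
double max_of_doubles(void)
{
    return MAX_OF(2.5, 1.5);
}
```

A plain C function would need one version per type (for example one taking int and one taking double) to cover the same two uses.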
This might have been right in the 1980s, but modern compilers are much better.
Functions don't always push and pop the stack, especially if they're leaf functions or have tail calls. Also, functions are often inlined, and can be inlined even if they are defined in other translation units (this is called link-time optimization).
But you're right that in general, when optimizations are turned off, a macro will be inlined and a function won't be. Either version may take more space; it depends on the particulars of the macro or function.
A function uses space in two ways: the body uses space, and the function call uses space. If the function body is very small, it may actually save space to inline it.
Yes, your understanding is right. But you should also note that there is no type checking in a macro, and that a macro can lead to unexpected side effects. You should also be very careful about parenthesizing macros.
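A sketch of both pitfalls, using hypothetical names throughout. The first pair of macros shows what missing parentheses do; the second pair of functions shows an argument being evaluated twice.

```c
/* Unparenthesized macro: the argument text is pasted as-is. */
#define SQUARE_BAD(x) x * x
/* Parenthesized macro: each use of the argument and the whole
   expression are wrapped. */
#define SQUARE(x) ((x) * (x))

/* Function equivalent: the argument is evaluated exactly once. */
static inline int square(int x)
{
    return x * x;
}

/* Expands to 1 + 2 * 1 + 2, which is 5, not 9. */
int bad_result(void)
{
    return SQUARE_BAD(1 + 2);
}

/* Expands to ((1 + 2) * (1 + 2)), which is 9. */
int good_result(void)
{
    return SQUARE(1 + 2);
}

/* Helper with a visible side effect: counts how often it is called. */
static int next_value(int *calls)
{
    ++*calls;
    return 3;
}

/* The macro pastes the call twice, so next_value runs twice. */
int macro_evaluations(void)
{
    int calls = 0;
    (void)SQUARE(next_value(&calls));
    return calls;
}

/* The function receives the already computed value: one call. */
int function_evaluations(void)
{
    int calls = 0;
    (void)square(next_value(&calls));
    return calls;
}
```

Even the parenthesized macro still duplicates its argument, which is why an inline function is usually the safer choice when the argument may have side effects.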
Your understanding is half correct. The point is that macros are resolved before compilation. You should think of them as sophisticated text replacement tools (that's oversimplifying it, but is mostly what it comes down to).
So the difference is when in the build process your code is used.
This is orthogonal to the question of what the compiler really does with it when it creates the final binary code. It is more or less free to do whatever it thinks is correct to produce the intended behaviour. In C++, you can only hint at your preference with the inline keyword. The compiler is free to ignore that hint.
Again, this is orthogonal to the whole preprocessor business. Nothing stops you from writing macros which result in C++ code using the inline keyword, after all. Likewise, nobody stops you from writing macros which result in a lot of recursive C++ functions which the compiler will probably not be able to inline even if it wanted to.
The conclusion is that your question is wrong. It's a general question of having binaries with a lot of inlined functions vs. binaries with a lot of real function calls. Macros are just one technique you can use to influence the tradeoff in one way or the other, and you will ask yourself the same general question without macros.
The assumption that inlining a function will always trade space for speed is wrong. Inlining the wrong (i.e. too big) functions can even have a negative impact on speed. As is always the case with such optimisations, do not guess but measure.
You should read the FAQ on this: "Do inline functions improve performance?"

How to use the __attribute__ keyword in GCC C?

I am not clear about the use of the __attribute__ keyword in C. I have read the relevant gcc docs, but I am still not able to understand it. Can someone help me understand?
__attribute__ is not part of C, but is an extension in GCC that is used to convey special information to the compiler. The syntax of __attribute__ was chosen to be something that the C preprocessor would accept and not alter (by default, anyway), so it looks a lot like a function call. It is not a function call, though.
Like much of the information that a compiler can learn about C code (by reading it), the compiler can make use of the information it learns through __attribute__ data in many different ways -- even using the same piece of data in multiple ways, sometimes.
The pure attribute tells the compiler that a function is actually a mathematical function -- using only its arguments and the rules of the language to arrive at its answer with no other side effects. Knowing this the compiler may be able to optimize better when calling a pure function, but it may also be used when compiling the pure function to warn you if the function does do something that makes it impure.
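A minimal sketch of the pure attribute, with hypothetical function names. Note that in GCC's documentation, pure functions are still allowed to read memory (such as the array below); the stricter attribute for functions that depend only on their argument values is const.

```c
/* pure: the result depends only on the arguments and on the memory
   they point to, and the function has no side effects. */
__attribute__((pure))
int count_set_flags(const unsigned char *flags, int n)
{
    int count = 0;
    for (int i = 0; i < n; ++i) {
        if (flags[i] != 0)
            ++count;
    }
    return count;
}

/* Because count_set_flags is declared pure and nothing is written
   between the two calls, an optimizer is allowed to evaluate it once
   and reuse the result. Whether it does so depends on the compiler. */
int count_twice(const unsigned char *flags, int n)
{
    return count_set_flags(flags, n) + count_set_flags(flags, n);
}
```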
If you can keep in mind that (even though a few other compilers support them) attributes are a GCC extension and not part of C and their syntax does not fit into C in an elegant way (only enough to fool the preprocessor) then you should be able to understand them better.
You should try playing around with them. Take the ones that are more easily understood for functions and try them out. Do the same thing with data (it may help to look at the assembly output of GCC for this, but sizeof and checking the alignment will often help).
Think of it as a way to inject syntax into the source code, which is not standard C, but rather meant for consumption of the GCC compiler only. But, of course, you inject this syntax not for the fun of it, but rather to give the compiler additional information about the elements to which it is attached.
You may want to instruct the compiler to align a certain variable in memory at a certain alignment. Or you may want to declare a function deprecated so that the compiler will automatically generate a deprecated warning when others try to use it in their programs (useful in libraries). Or you may want to declare a symbol as a weak symbol, so that it will be linked in only as a last resort, if any other definitions are not found (useful in providing default definitions).
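The three uses above could be sketched as follows. All names are hypothetical, the attributes are GCC extensions (also accepted by Clang), and weak symbols additionally depend on the object file format of the target.

```c
#include <stddef.h>

/* Alignment: every object of this type starts on a 16-byte boundary. */
struct dma_buffer {
    unsigned char bytes[64];
} __attribute__((aligned(16)));

/* Deprecation: code that calls old_sum gets a compile-time warning. */
__attribute__((deprecated))
int old_sum(int a, int b)
{
    return a + b;
}

/* Weak symbol: a default definition. If another object file provides
   a normal (strong) board_init, the linker uses that one instead. */
__attribute__((weak))
int board_init(void)
{
    return 0;
}

/* Reports the alignment the compiler actually applied (C11 _Alignof). */
size_t dma_buffer_alignment(void)
{
    return _Alignof(struct dma_buffer);
}
```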
All of this (and more) can be achieved by attaching the right attributes to elements in your program. You can attach them to variables and functions.
Take a look at this whole bunch of other GCC extensions to C. The attribute mechanism is a part of these extensions.
There are too many attributes for there to be a single answer, but examples help.
For example, __attribute__((aligned(16))) makes the compiler align that struct or variable on a 16-byte boundary.
__attribute__((noreturn)) tells the compiler this function never reaches the end (e.g. standard functions like exit(int) )
__attribute__((always_inline)) makes the compiler inline that function even if it wouldn't normally choose to (using the inline keyword suggests to the compiler that you'd like it inlining, but it's free to ignore you - this attribute forces it).
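A sketch combining noreturn and always_inline from the list above; fatal and checked_div are hypothetical names.

```c
#include <stdio.h>
#include <stdlib.h>

/* noreturn: control never comes back from this function, so the
   compiler does not need to generate code for a return path. */
__attribute__((noreturn))
void fatal(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    exit(EXIT_FAILURE);
}

/* always_inline: inlined even where the heuristics would say no.
   Because fatal is noreturn, the compiler knows that the division
   below is only reached when den is non-zero. */
static inline __attribute__((always_inline))
int checked_div(int num, int den)
{
    if (den == 0)
        fatal("division by zero");
    return num / den;
}
```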
Essentially they're mostly about telling the compiler you know better than it does, or for overriding default compiler behaviour on a function by function basis.
One of the best (but little known) features of GNU C is the attribute mechanism, which allows a developer to attach characteristics to function declarations to allow the compiler to perform more error checking. It was designed in a way to be compatible with non-GNU implementations, and we've been using this for years in highly portable code with very good results.
Note that __attribute__ is spelled with two underscores before and two after, and there are always two sets of parentheses surrounding the contents. There is a good reason for this - see below. GNU CC needs the -Wall compiler option to enable this (yes, there is a finer degree of warnings control available, but we are very big fans of max warnings anyway).
For more information please go to http://unixwiz.net/techtips/gnu-c-attributes.html
Lokesh Venkateshiah
