Check if a specific function is inlined - Intel compiler - c

I have C/C++ binary libraries (*.dll, *.sys) the obj files they consist of, and their symbols (pdb), but not the source code nor map files.
According to the symbols they were built by the Intel compiler (for windows).
Is there any way to check if a specific function is inlined?
Thanks in advance.

ICC is particularly aggressive with inlining, and in many cases when a function is declared as inline (and especially if it's __forceinline'd on MSVC), it'll actually throw an error during the compilation stage if it's unable to inline it (depending, obviously, on your project compilation settings).
That said, honestly the only way you'll be able to do what you need is to attach a debugger, pause the app in MSVC, switch to ASM view, and search for calls to the function you're looking for's name (you say C/C++, it makes a difference which as with C++ you'll have to search for the mangled name). If you find calls to the function (call _myFunc), it's not inlined.
Otherwise, if you know where to look, browse through the ASM to find the caller function, and check its source to verify that a call to the callee either is or isn't there.
It may sound rather daunting, but it's actually easy enough and just a ctrl+f away.

Related

Issue preventing GCC from optimizing out global variable

I am using ARM-GCC v4.9 (released 2015-06-23) for a STM32F105RC processor.
I've searched stackoverflow.com and I've found this in order to try to convince gcc not to optimize out a global variable, as you may see below:
static const char AppVersion[] __attribute__((used)) = "v3.05/10.oct.2015";
Yet, to my real surprise, the compiler optimized away the AppVersion variable!
BTW: I am using the optimize level -O0 (default).
I also tried using volatile keyword (as suggested on other thread), but it didn't work either :(
I already tried (void)AppVersion; but it doesn't work...
Smart compiler!? Too smart I suppose...
In the meantime, I use a printf(AppVersion); some place in my code, just to be able to keep the version... But this is a boorish solution :(
So, the question is: Is there any other trick that does the job, i.e. keep the version from being optimized away by GCC?
[EDIT]:
I also tried like this (i.e. without static):
const char AppVersion[] __attribute__((used)) = "v3.05/10.oct.2015";
... and it didn't work either :(
Unfortunately I am not aware of a pragma to do this.
There is however another solution. Change AppVersion to:
static char * AppVersion = "v3.05/10.oct.2015";
and add:
__asm__ ("" : : "" (AppVersion));
to your main function.
You see I dropped the 'used' attribute, according to the documentation this is a function attribute.
Other solutions: Does gcc have any options to add version info in ELF binary file?
Though I found this one to be the easiest. This basically won't let the compiler and linker remove AppVersion since we told it that this piece of inline assembly uses it, even though we don't actually insert any inline assembly.
Hopefully that will be satisfactory to you.
Author: Andre Simoes Dias Vieira
Original link: https://answers.launchpad.net/gcc-arm-embedded/+question/280104
Given the presence of "static", all your declaration does is ask the compiler to include the bytes representing characters of the string "v3.05/10.oct.2015" in
some order at some arbitrary location within the file, but not bother to tell
anyone where it put them. Given that the compiler could legitimately write
that sequence of bytes somewhere in the code image file whether or not it
appeared anywhere in the code such a declaration really isn't very useful. To
be sure, it would be unlikely that such a sequence would appear in the code
entirely by chance, and so scanning the binary image for it might be a somewhat
reliable way to determine that it appeared in the code, but in general it's
much better to have some means of affirmatively determining where the string
may be found.
If the string isn't declared static, then the compiler is required to tell the
linker where it is. Since the linker generally outputs the names and
addresses of all symbols in a variety of places including symbol tables,
debug-information files, etc. which may be used in a variety of ways that the
linker knows nothing about, it may be able to tell that a symbol isn't used
within the code, but can generally have no clue about whether some other
utility may be expecting to find it in the symbol table and make use of it. A directive saying the symbol is "used" will tell the linker that even though it doesn't know of anything that's interested in that symbol, something out in the larger universe the linker knows nothing about is interested in it.
It's typical for each compilation unit to give a blob of information to the
linker and say "Here's some stuff; I need a symbol for the start of it, but
I can compute all the addresses of all the internals from that". The linker
has no way of knowing which parts of such a blob are actually used, so it
has no choice but to accept the whole thing verbatim. If the compiler were
to include unused static declarations in its blob, they'd make it through
to the output file. On the other hand, the compiler knows that if it doesn't
export a symbol for something within that blob, nobody else downstream would
be able to find it whether or not the object was included; thus, there would
typically be little benefit to being able to include such a blob and compiler writers generally have to reason to provide a feature to force such inclusion.
It seems that using a custom section also works.
Instead of
__attribute__((used))
try with
__attribute__((section(".your.section.name.here")))
The linker won't touch it, nor will the strip command.

Size optimization options

I am trying to sort out an embedded project where the developers took the option of including all the h and c files into a c file, then they can compile just that one file with the -whole-program option to get good size optimization.
I hate this and am determined to make this into a traditional program just using LTO to achieve the same.
The versions included with the dev kit are;
aps-gcc (GCC) 4.7.3 20130524 (Cortus)
GNU ld (GNU Binutils) 2.22
With one .o file .text is 0x1c7ac, fractured into 67 .o files .text comes out as 0x2f73c, I added the LTO stuff and reduced it to 0x20a44, good but nowhere near enough.
I have tried --gc-sections and using the linker plugin option but they made no further improvment.
Any suggestions, am I see the right sort of improvement from LTO?
To get LTO to work perfectly you need to have the same information and optimisation algorithms available at link stage as you have at compile stage. The GNU tools cannot do this and I believe this was actually one of the motivating factors in the creation of LLVM/Clang.
If you want to inspect the difference in detail I'd suggest you generate a Map file (ld option -Map <filename>) for each option and see if there are functions which haven't been in-lined or functions that are larger. The lack of in-lining you can manually resolve by forcing those functions to inline by moving the definition of the function into a header file and defining it as extern inline which effectively turns it into a macro (this is a GNU extension).
Larger functions are likely not being subject to constant propagation and I don't think there's anything you can do about that. You can make some improvements by carefully declaring the function attributes such as const, leaf, noreturn, pure, and returns_nonnull. These effectively promise that the function will behave in a particular way that the compiler may otherwise detect if using a single compilation unit, and that allow additional optimisations.
In contrast, Clang can compile your object code to a special kind of bytecode (LLVM stands for Low Level Virtual Machine, like JVM is Java Virtual Machine, and runs bytecode) and then optimisation of this bytecode can be performed at link time (or indeed run-time, which is cool). Since this bytecode is what is optimised whether you do LTO or not, and the optimisation algorithms are common between the compiler and the linker, in theory Clang/LLVM should give exactly the same results whether you use LTO or not.
Unfortunately now that the C backend has been removed from LLVM I don't know of any way to use the LLVM LTO capabilities for the custom CPU you're targeting.
In my opinion, the method chosen by the previous developers is the correct one. It is the method that gives the compiler the most information and thus the most opportunities to perform the optimizations that you want. It is a terrible way to compile (any change will require the whole project to be compiled) so marking this as just an option is a good idea.
Of course, you would have to run all your integration tests against such a build, but that should be trivial to do. What is the downside of the chosen approach except for compilation time (which shouldn't be an issue because you don't need to build in that manner all the time ... just for integration tests).

Generating link-time error for deprecated functions

Is there a way with gcc and GNU binutils to mark some functions such that they will generate an error at link-time if used? My situation is that I have some library functions which I am not removing for the sake of compatibility with existing binaries, but I want to ensure that no newly-compiled binary tries to make use of the functions. I can't just use compile-time gcc attributes because the offending code is ignoring my headers and detecting the presence of the functions with a configure script and prototyping them itself. My goal is to generate a link-time error for the bad configure scripts so that they stop detecting the existence of the functions.
Edit: An idea.. would using assembly to specify the wrong .type for the entry points be compatible with the dynamic linker but generate link errors when trying to link new programs?
FreeBSD 9.x does something very close to what you want with the ttyslot() function. This function is meaningless with utmpx. The trick is that there are only non-default versions of this symbol. Therefore, ld will not find it, but rtld will find the versioned definition when an old binary is run. I don't know what happens if an old binary has an unversioned reference, but it is probably sensible if there is only one definition.
For example,
__asm__(".symver hidden_badfunc, badfunc#MYLIB_1.0");
Normally, there would also be a default version, like
__asm__(".symver new_badfunc, badfunc##MYLIB_1.1");
or via a Solaris-compatible version script, but the trick is not to add one.
Typically, the asm directive is wrapped into a macro.
The trick depends on the GNU extensions to define symbol versions with the .symver assembler directive, so it will probably only work on Linux and FreeBSD. The Solaris-compatible version scripts can only express one definition per symbol.
More information: .symver directive in info gas, Ulrich Drepper's "How to write shared libraries", the commit that deprecated ttyslot() at http://gitorious.org/freebsd/freebsd/commit/3f59ed0d571ac62355fc2bde3edbfe9a4e722845
One idea could be to generate a stub library that has these symbols but with unexpected properties.
perhaps create objects that have the name of the functions, so the linker in the configuration phase might complain that the symbols are not compatible
create functions that have a dependency "dont_use_symbol_XXX" that is never resolved
or fake a .a file with a global index that would have your functions but where the .o members in the archive have the wrong format
The best way to generate a link-time error for deprecated functions that you do not want people to use is to make sure the deprecated functions are not present in the libraries - which makes them one stage beyond 'deprecated'.
Maybe you can provide an auxilliary library with the deprecated function in it; the reprobates who won't pay attention can link with the auxilliary library, but people in the mainstream won't use the auxilliary library and therefore won't use the functions. However, it is still taking it beyond the 'deprecated' stage.
Getting a link-time warning is tricky. Clearly, GCC does that for some function (mktemp() et al), and Apple has GCC warn if you run a program that uses gets(). I don't know what they do to make that happen.
In the light of the comments, I think you need to head the problem off at compile time, rather than waiting until link time or run time.
The GCC attributes include (from the GCC 4.4.1 manual):
error ("message")
If this attribute is used on a function declaration and a call to such a function is
not eliminated through dead code elimination or other optimizations, an error
which will include message will be diagnosed. This is useful for compile time
checking, especially together with __builtin_constant_p and inline functions
where checking the inline function arguments is not possible through extern
char [(condition) ? 1 : -1]; tricks. While it is possible to leave the function
undefined and thus invoke a link failure, when using this attribute the problem
will be diagnosed earlier and with exact location of the call even in presence of
inline functions or when not emitting debugging information.
warning ("message")
If this attribute is used on a function declaration and a call to such a function is
not eliminated through dead code elimination or other optimizations, a warning
which will include message will be diagnosed. This is useful for compile time
checking, especially together with __builtin_constant_p and inline functions.
While it is possible to define the function with a message in .gnu.warning*
section, when using this attribute the problem will be diagnosed earlier and
with exact location of the call even in presence of inline functions or when not
emitting debugging information.
If the configuration programs ignore the errors, they're simply broken. This means that new code could not be compiled using the functions, but the existing code can continue to use the deprecated functions in the libraries (up until it needs to be recompiled).

How to use the __attribute__ keyword in GCC C?

I am not clear with use of __attribute__ keyword in C.I had read the relevant docs of gcc but still I am not able to understand this.Can some one help to understand.
__attribute__ is not part of C, but is an extension in GCC that is used to convey special information to the compiler. The syntax of __attribute__ was chosen to be something that the C preprocessor would accept and not alter (by default, anyway), so it looks a lot like a function call. It is not a function call, though.
Like much of the information that a compiler can learn about C code (by reading it), the compiler can make use of the information it learns through __attribute__ data in many different ways -- even using the same piece of data in multiple ways, sometimes.
The pure attribute tells the compiler that a function is actually a mathematical function -- using only its arguments and the rules of the language to arrive at its answer with no other side effects. Knowing this the compiler may be able to optimize better when calling a pure function, but it may also be used when compiling the pure function to warn you if the function does do something that makes it impure.
If you can keep in mind that (even though a few other compilers support them) attributes are a GCC extension and not part of C and their syntax does not fit into C in an elegant way (only enough to fool the preprocessor) then you should be able to understand them better.
You should try playing around with them. Take the ones that are more easily understood for functions and try them out. Do the same thing with data (it may help to look at the assembly output of GCC for this, but sizeof and checking the alignment will often help).
Think of it as a way to inject syntax into the source code, which is not standard C, but rather meant for consumption of the GCC compiler only. But, of course, you inject this syntax not for the fun of it, but rather to give the compiler additional information about the elements to which it is attached.
You may want to instruct the compiler to align a certain variable in memory at a certain alignment. Or you may want to declare a function deprecated so that the compiler will automatically generate a deprecated warning when others try to use it in their programs (useful in libraries). Or you may want to declare a symbol as a weak symbol, so that it will be linked in only as a last resort, if any other definitions are not found (useful in providing default definitions).
All of this (and more) can be achieved by attaching the right attributes to elements in your program. You can attach them to variables and functions.
Take a look at this whole bunch of other GCC extensions to C. The attribute mechanism is a part of these extensions.
There are too many attributes for there to be a single answer, but examples help.
For example __attribute__((aligned(16))) makes the compiler align that struct/function on a 16-bit stack boundary.
__attribute__((noreturn)) tells the compiler this function never reaches the end (e.g. standard functions like exit(int) )
__attribute__((always_inline)) makes the compiler inline that function even if it wouldn't normally choose to (using the inline keyword suggests to the compiler that you'd like it inlining, but it's free to ignore you - this attribute forces it).
Essentially they're mostly about telling the compiler you know better than it does, or for overriding default compiler behaviour on a function by function basis.
One of the best (but little known) features of GNU C is the attribute mechanism, which allows a developer to attach characteristics to function declarations to allow the compiler to perform more error checking. It was designed in a way to be compatible with non-GNU implementations, and we've been using this for years in highly portable code with very good results.
Note that attribute spelled with two underscores before and two after, and there are always two sets of parentheses surrounding the contents. There is a good reason for this - see below. Gnu CC needs to use the -Wall compiler directive to enable this (yes, there is a finer degree of warnings control available, but we are very big fans of max warnings anyway).
For more information please go to http://unixwiz.net/techtips/gnu-c-attributes.html
Lokesh Venkateshiah

inline a function inside another inline function in C

I currently have inline functions calling another inline function (a simple 4 lines big getAbs() function). However, I discovered by looking to the assembler code that the "big" inline functions are well inlined, but the compiler use a bl jump to call the getAbs() function.
Is it not possible to inline a function in another inline function? By the way, this is embedded code, we are not using the standard libraries.
Edit : The compiler is WindRiver, and I already checked that inlining would be beneficial (4 instructions instead of +-40).
Depending on what compiler you are using you may be able to encourage the compiler to be less reluctant to inline, e.g. with gcc you can use __attribute__ ((always_inline)), with Intel ICC you can use icc -inline-level=1 -inline-forceinline, and with Apple's gcc you can use gcc -obey-inline.
The inline keyword is a suggestion to the compiler, nothing more. It's free to take that suggestion on board, totally ignore it or even lie to you and tell that it's doing it while it's really not.
The only way to force code to be inline is to, well, write it inline. But, even, then the compiler may decide it knows better and decide to shift it out to another function. It has a lot of leeway in generating executable code for your particular source, provided it doesn't change the semantics of it.
Modern compilers are more than capable of generating better code than most developers would hand-craft in assembly. I think the inline keyword should go the same path as the register keyword.
If you've seen the output of gcc at its insane optimisation level, you'll understand why. It has produced code that I wouldn't have dreamed possible, and that took me a long time to understand.
As an aside, check this out for what optimisations that gcc actually has, including a great many containing the text "inline" or "inlining".
#gramm: There's quite a few scenarios in which inline isn't necessarily to your benefit. Most compilers use some very advanced heuristics to determine when to inline. When discussing inlining, the simplest idea is, trust your compiler to produce the fastest code.
I have recently had a very similar problem, reading this post has given me a wackky idea. Why not Have a simple pre-compilation (a simple reg ex should do the job ) code parser that parses out the function call to actually put the source code in-line. use a tag such as /inline/ /end_of_inline/ so that you can use normal ide features (if you are or might use an ide.
Include this in your build process, that way you have the readability advantage as well as removing the compilers assumption that you are only as good a developer as most and do not understand when to in-line.
Nonetheless before trying this you should probably go through the compilers command line options.
I would suggest that if your getAbs() function (sounds like absolute value but you really should be showing us code with the question...) is 4 lines long, then you have much bigger optimizations to worry about than whether the code gets inlined or not.

Resources