C compiler structure optimisation

The C standard does not allow certain optimisations of structures: for example, rearrangement of fields, merging fields, discarding fields that are never read from, hoisting fields out of the structure if they can be turned into auto variables, etc. This is needed for various reasons, including consistent structure layouts across compilation units and allowing cast-compatible structures.
Do any modern compilers (e.g. gcc, clang, Visual C) support extensions that allow me to tell the compiler that it is okay to do these optimisations?
Naturally, they'd only make sense for definitions that were local to a single compilation unit, so that the compiler could see all possible uses of the structure; and certain things (like the aforesaid cast-compatible structure definitions) would become unusable. But for certain tasks this could be a very valuable optimisation.
I do know that gcc used to have a -fipa-struct-reorg option to allow precisely this, but it never worked very well, bit-rotted, and was eventually removed. But I don't know whether it's been replaced by anything. And I haven't been able to find anything in clang, which surprises me, because I would think this is precisely the kind of optimisation that clang would be all over...

No. There is no reason for such a thing to be supplied.
You can't do it where the structure's address is taken and passed anywhere, as the structure might be aliased. That pretty much rules out anything outside of a single function.
If you can go through and do the analysis required to flag a structure member as "can be optimised away if not used" (beware funky offset-calculating macros), then you can just as well see for yourself whether the member is needed, and take it out yourself.
If unsure, just comment it out and see if you get a compile error.
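For a sense of what such an optimisation buys, here is the field rearrangement done by hand; the struct names are made up, and the sizes assume a typical x86-64 ABI:

struct before { char a; double d; char b; };  /* 1 + 7 padding + 8 + 1 + 7 padding = 24 bytes */
struct after  { double d; char a; char b; };  /* 8 + 1 + 1 + 6 padding             = 16 bytes */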

Related

Is it possible to generate ANSI C functions with type information for a moving GC implementation?

I am wondering what methods there are to add typing information to generated C functions. I'm transpiling a higher-level programming language to C and I'd like to add a moving garbage collector. To do that, I need the function's variables to carry typing information; otherwise I could modify a primitive value that merely looks like a pointer.
An obvious approach would be to encapsulate all (primitive and non-primitive) variables in a struct that has an extra (enum) member for typing information; however, this would cause memory and performance overhead, and the transpiled code is meant for embedded platforms. If I were to accept the memory overhead, the obvious option would be to use a heap handle for all objects, which would let me freely move heap blocks. However, I'm wondering if there's a more efficient approach.
I've come up with a potential solution: predeclare and group variables based on whether they're primitives or not (I can do that in the transpiler), and add an offset variable at the end of each function (so that I can find it reliably when scanning the stack area) that tells me where the non-primitive variables begin and end, letting me scan only those. This means each function will use an additional 16/32 bits (depending on the architecture) of memory, but this should still be more memory-efficient than the heap-handle approach.
Example:
void my_func() {
    int i = 5;                   /* primitives, grouped first */
    int z = 3;
    bool b = false;
    void* person;                /* non-primitives (GC-visible), grouped last */
    void* person_info = ...;
    .... // logic
    volatile int offset = 0x034; /* marks where the non-primitive group lives */
}
My aim is for something that works universally across GCC compilers, so my concerns are:
Can the compiler reorder the variables from how they're declared in the source code?
Can I force the compiler to put some data in the method's stack frame (using volatile)?
Can I find the offset accurately when scanning the stack?
I'd like to avoid assembly so this approach can work (by default) across multiple platforms; however, I'm open to methods even if they involve assembly (if they're reliable).
Typing information could be somehow encoded in the C function name; this is done by C++ and other implementations and called name mangling.
Actually, since all your C code is generated, you could decide to adopt a different convention: generate long C identifiers which are practically unique and sort-of random program-wide, such as tiziw_7oa7eIzzcxv03TmmZ, and keep their typing information elsewhere (e.g. in some database). On Linux, such an approach is friendly to both libbacktrace and dlsym(3) + dladdr(3) (and of course nm(1) or readelf(1) or gdb(1)), so it is used in both the Bismon and RefPerSys projects.
Typing information is practically tied to calling conventions and ABIs. For example, the x86-64 ABI for Linux mandates different processor registers for passing floating points or pointers.
Read the Garbage Collection Handbook or at least P. Wilson's Uniprocessor Garbage Collection Techniques survey. You could decide to use tagged integers instead of boxing them, and you could decide to have a conservative GC (e.g. Boehm's GC) instead of a precise one. In my old GCC MELT project I generated C or C++ code for a generational copying GC. Similar techniques are used in both Bismon and RefPerSys.
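For reference, a tagged-integer representation typically looks like the following sketch (names are mine; it assumes heap pointers are at least 2-byte aligned, so the low bit is free to serve as the tag):

#include <stdint.h>

typedef uintptr_t value;

#define MAKE_INT(n)  ((((value)(n)) << 1) | 1u)  /* low bit 1 marks an unboxed integer */
#define IS_INT(v)    (((v) & 1u) != 0)
#define INT_VAL(v)   ((intptr_t)(v) >> 1)        /* sign-preserving on GCC targets */
#define MAKE_PTR(p)  ((value)(void *)(p))        /* low bit 0: a traceable pointer */

The GC then only traces values whose low bit is clear.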
Since you are transpiling to C, consider also alternatives, such as libgccjit or LLVM. Look into libjit and asmjit.
Study also the implementation of other transpilers (compilers to C), including Chicken/Scheme and Bigloo.
Can the GCC compiler reorder the variables from how they're declared in the source code?
Of course yes, depending upon the optimizations you are asking for. Some variables won't even exist in the binary (e.g. those kept in registers).
Can I force the compiler to put some data in the method's stack frame (using volatile)?
Better generate a single struct variable containing all your language variables, and leave optimizations to the compiler. You will be surprised (see this draft report).
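A minimal sketch of that suggestion (my illustration, not necessarily the exact scheme the answer has in mind): group the GC-visible locals of each generated function into one struct, and chain those frames so the collector can walk exactly the pointer slots, with no stack guessing:

#include <stddef.h>

struct gc_frame {
    struct gc_frame *prev;  /* caller's frame in the shadow chain */
    size_t nroots;          /* number of pointer slots following the header */
};

static struct gc_frame *gc_top;  /* hypothetical chain head (per-thread in real code) */

void my_func(void) {
    struct {
        struct gc_frame hdr;
        void *person;        /* the GC-visible locals, grouped together */
        void *person_info;
    } frame = { { gc_top, 2 }, NULL, NULL };
    gc_top = &frame.hdr;     /* push */
    /* ... logic, accessing objects through frame.person etc. ... */
    gc_top = frame.hdr.prev; /* pop before every return */
}

The collector walks the chain with for (f = gc_top; f; f = f->prev) and scans the nroots slots starting at (void **)(f + 1), assuming no padding after the header (true on common ABIs).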
Can I find the offset accurately when scanning the stack?
This is the most difficult part, and it depends a lot on compiler optimizations (e.g. whether you run gcc with -O1 or -O3 on the generated C code; in some cases a recent GCC, e.g. GCC 9 or GCC 10 on x86-64 for Linux, is capable of tail-call optimizations; check by compiling with gcc -O3 -S -fverbose-asm, then looking into the produced assembler code). If you accept some small target-processor- and compiler-specific tricks, this is doable. Study the implementation of the OCaml compiler.
Send me (to basile#starynkevitch.net) an email for discussion. Please mention the URL of your question in it.
If you want to have an efficient generational copying GC with multi-threading, things become extremely tricky. The question is then how many years of development you can afford to spend.
If you have exceptions in your language, also take great care. You could, with great caution, generate calls to longjmp.
See of course this answer of mine.
With transpiling techniques, the devil is in the details.
On Linux (specifically!) see also my manydl.c program. It demonstrates that on a Linux x86-64 laptop you could generate, in practice, hundreds of thousands of dlopen(3)-ed plugins. Then read How to Write Shared Libraries.
Study also the implementation of SBCL and of GNU Prolog, at least for inspiration.
PS. The dream of a totally architecture-neutral and operating-system independent transpiler is an illusion.

unused `static inline` functions generate warnings with `clang`

When using gcc or clang, it's generally a good idea to enable a number of warnings, and a first batch of warnings is generally provided by -Wall.
This batch is pretty large, and includes the specific warning -Wunused-function.
Now, -Wunused-function is useful to detect static functions which are no longer invoked, meaning they are useless, and should therefore preferably be removed from source code.
When applying a "zero-warning" policy, it's no longer "preferable", but downright compulsory.
For performance reasons, some functions may be defined directly into header files *.h, so that they can be inlined at compile time (disregarding any kind of LTO magic). Such functions are generally declared and defined as static inline.
In the past, such functions would probably have been defined as macros instead, but it's considered better to make them static inline functions instead, whenever applicable (no funny type issue).
OK, so now we have a bunch of functions defined directly into header files, for performance reasons. A unit including such a header file is under no obligation to use all its declared symbols. Therefore, a static inline function defined in a header file may reasonably not be invoked.
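For concreteness, a typical header-defined helper might look like this (the function itself is just an illustration):

/* util.h */
static inline int imax(int a, int b) { return a > b ? a : b; }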
For gcc, that's fine: gcc will flag an unused static function, but not an unused static inline one.
For clang, though, the outcome is different: static inline functions declared in headers trigger a -Wunused-function warning if a single unit does not invoke them. And it doesn't take a lot of flags to get there: -Wall is enough.
A work-around is to introduce a compiler-specific extension, such as __attribute__((unused)), which explicitly states to the compiler that the function defined in the header may not necessarily be invoked by all its units.
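Applied to the sketch above, the work-around looks like:

/* util.h */
__attribute__((unused)) static inline int imax(int a, int b) { return a > b ? a : b; }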
OK, but now, the code which used to be clean C99 is including some form of specific compiler extension, adding to the weight of portability and maintenance.
The question therefore is more about the logic of such a choice: why does clang choose to trigger a warning when a static inline function defined in a header is not invoked? In which cases is that a good idea?
And what does clang propose to cover the relatively common case of inlined functions defined in header files, without requiring the use of a compiler extension?
edit:
After further investigation, it appears the question is incorrect.
The warning is triggered in the editor (VSCode) by a clang-based linter applying a selected list of compilation flags (-Wall, etc.).
But when the source code is actually compiled with clang and with exactly the same list of flags, the "unused function" warning is not present.
So far, the results visible in the editor used to be exactly the ones found at compilation time. It's the first time I witness a difference.
So the problem seems related to the way the linter uses clang to produce its list of warnings. That's a much more complex and specific question.
Note the comment:
OK, sorry, this is actually different from expectation. It appears the warning is triggered in the editor using clang linter with selected compilation flags (-Wall, etc.). But when the source code is compiled with exactly the same flags, the "unused function" warning is actually not present. So far, the results visible in the editor used to be exactly the ones found at compilation time; it's the first time I witness a difference. So the problem seems related to the way the linter uses clang to produce its list of warnings. It seems to be a more complex question [than I realized].
I'm not sure you'll find any "why". I think this is a bug, possibly one that they don't care to fix. As you hint in your question, it does encourage really bad practice (annotation with compiler extensions where no annotation should be needed), and this should not be done; rather, the warning should just be turned off unless/until the bug is fixed.
If you haven't already, you should search their tracker for an existing bug report, and open one if none already exists.
Follow-up: I'm getting reports which I haven't verified that this behavior only happens for functions defined in source files directly, not from included header files. If that's true, it's nowhere near as bad, and probably something you can ignore.
'#ifdef USES_FUNCTION_XYZ'
One would have to configure the used inline functions before including the header.
Sounds like a hassle and looks clumsy.
When using gcc or clang, it's generally a good idea to enable a number of warnings,
When using any C compiler, it's a good idea to ensure that the warning level is turned up, and to pay attention to the resulting warnings. Much breakage, confusion, and wasted effort can be saved that way.
Now, -Wunused-function is useful to detect static functions which are no longer invoked, meaning they are useless, and should therefore preferably be removed from source code. When applying a "zero-warning" policy, it's no longer "preferable", but downright compulsory.
Note well that
Such zero-warning policies, though well-intended, are a crutch. I have little regard for policies that substitute inflexible rules for human judgement.
Such zero-warning policies can be subverted in a variety of ways, with disabling certain warnings being high on the list. Just how useful are they really, then?
Policy is adopted by choice, as a means to an end. Maybe not your choice personally, but someone's. If existing policy is not adequately serving the intended objective, or is interfering with other objectives, then it should be re-evaluated (though that does not necessarily imply that it will be changed).
For performance reasons, some functions may be defined directly into header files *.h, so that they can be inlined at compile time (disregarding any kind of LTO magic).
That's a choice. More often than not, one affording little advantage.
Such functions are generally declared and defined as static inline. In the past, such functions would probably have been defined as macros instead, but it's considered better to make them static inline functions instead, whenever applicable (no funny type issue).
Considered by whom? There are reasons to prefer functions over macros, but there are also reasons to prefer macros in some cases. Not all such reasons are objective.
A unit including such a header file is under no obligation to use all its declared symbols.
Correct.
Therefore, a static inline function defined in a header file may reasonably not be invoked.
Well, that's a matter of what one considers "reasonable". It's one thing to have reasons to want to do things that way, but whether those reasons outweigh those for not doing it that way is a judgement call. I wouldn't do that.
The question therefore is more about the logic of such a choice: why does clang choose to trigger a warning when a static inline function defined in a header is not invoked? In which cases is that a good idea?
If we accept that it is an intentional choice, one would presume that the Clang developers have a different opinion about how reasonable the practice you're advocating is. You should consider this a quality-of-implementation issue, there being no rules for whether compilers should emit diagnostics in such cases. If they have different ideas about what they should warn about than you do, then maybe a different compiler would be more suitable.
Moreover, it would be of little consequence if you did not also have a zero-warning policy, so multiple choices on your part are going into creating an issue for you.
And what does clang propose to cover the relatively common case of inlined functions defined in header files, without requiring the use of a compiler extension?
I doubt that clang or its developers propose any particular course of action here. You seem to be taking the position that they are doing something wrong. They are not. They are doing something that is inconvenient for you, and that therefore you (understandably) dislike. You will surely find others who agree with you. But none of that puts any onus on Clang to have a fix.
With that said, you could try defining the functions in the header as extern inline instead of static inline. You are then obligated to provide one non-inline definition of each somewhere in the whole program, too, but those can otherwise be lexically identical to the inline definitions. I speculate that this may assuage Clang.
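A minimal sketch of that pattern under C99 inline semantics (names illustrative): the header carries the inline definition, and exactly one translation unit requests the external definition:

/* util.h */
inline int imax(int a, int b) { return a > b ? a : b; }

/* util.c -- exactly one .c file in the program contains this */
#include "util.h"
extern inline int imax(int a, int b);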

Assembly-level function fingerprint

I would like to determine, whether two functions in two executables were compiled from the same (C) source code, and would like to do so even if they were compiled by different compiler versions or with different compilation options. Currently, I'm considering implementing some kind of assembler-level function fingerprinting. The fingerprint of a function should have the properties that:
two functions compiled from the same source under different circumstances are likely to have the same fingerprint (or similar one),
two functions compiled from different C source are likely to have different fingerprints,
(bonus) if the two source functions were similar, the fingerprints are also similar (for some reasonable definition of similar).
What I'm looking for right now is a set of properties of compiled functions that individually satisfy (1.) and taken together hopefully also (2.).
Assumptions
Of course, this is generally impossible, but there might exist something that will work in most cases. Here are some assumptions that could make it easier:
Linux ELF binaries (without debugging information available, though),
not obfuscated in any way,
compiled by gcc,
on x86 Linux (an approach that can be implemented on other architectures would be nice).
Ideas
Unfortunately, I have little to no experience with assembly. Here are some ideas for the abovementioned properties:
types of instructions contained in the function (i.e. floating point instructions, memory barriers)
memory accesses from the function (does it read/write from/to the heap? the stack?)
library functions called (their names should be available in the ELF; also their order shouldn't usually change)
shape of the control flow graph (I guess this will be highly dependent on the compiler)
Existing work
I was able to find only tangentially related work:
Automated approach which can identify crypto algorithms in compiled code: http://www.emma.rub.de/research/publications/automated-identification-cryptographic-primitives/
Fast Library Identification and Recognition Technology in IDA disassembler; identifies concrete instruction sequences, but still contains some possibly useful ideas: http://www.hex-rays.com/idapro/flirt.htm
Do you have any suggestions regarding the function properties? Or a different idea which also accomplishes my goal? Or was something similar already implemented and I completely missed it?
FLIRT uses byte-level pattern matching, so it breaks down with any changes in the instruction encodings (e.g. different register allocation or reordered instructions).
For graph matching, see BinDiff. While it's closed source, Halvar has described some of the approaches on his blog. They have even open-sourced some of the algorithms they use to generate fingerprints, in the form of the BinCrowd plugin.
In my opinion, the easiest way to do something like this would be to decompose the function's assembly back into some higher-level form where constructs (like for, while, function calls, etc.) exist, then match the structure of these higher-level constructs.
This would prevent instruction reordering, loop hoisting, loop unrolling, and any other optimizations from messing with the comparison. You can even (de)optimize these higher-level structures to their maximum on both ends to ensure they are at the same point, so comparisons between unoptimized debug code and -O3 won't fail due to missing temporaries or lack of register spills, etc.
You can use something like boomerang as a basis for the decompilation (except you wouldn't spit out C code).
I suggest you approach this problem from the standpoint of the language the code was written in and what constraints that code puts on compiler optimization.
I'm not really familiar with the C standard, but C++ has the concept of "observable" behavior. The standard carefully defines this, and compilers are given great latitude in optimizing as long as the result gives the same observable behavior. My recommendation for trying to determine if two functions are the same would be to try to determine what their observable behavior is (what I/O they do and how they interact with other areas of memory, and in what order).
If the problem set can be reduced to a small set of known C or C++ source code functions being compiled by n different compilers, each with m[n] different sets of compiler options, then a straightforward, if tedious, solution would be to compile the code with every combination of compiler and options and catalog the resulting instruction bytes, or more efficiently, their hash signature in a database.
The set of likely compiler options used is potentially large, but in actual practice engineers typically use a pretty standard and small set, usually just minimally optimized for debugging and fully optimized for release. Researching many project configurations might reveal only two or three more per engineering culture, stemming from prejudice or superstition about how compilers work, whether accurate or not.
I suspect this approach is closest to what you actually want: a way of investigating suspected misappropriated source code. All the suggested techniques of reconstructing the compiler's parse tree might bear fruit, but have great potential for overlooked symmetric solutions or ambiguous unsolvable cases.

How to use the __attribute__ keyword in GCC C?

I am not clear on the use of the __attribute__ keyword in C. I have read the relevant GCC docs but I am still not able to understand it. Can someone help me understand?
__attribute__ is not part of C, but is an extension in GCC that is used to convey special information to the compiler. The syntax of __attribute__ was chosen to be something that the C preprocessor would accept and not alter (by default, anyway), so it looks a lot like a function call. It is not a function call, though.
Like much of the information that a compiler can learn about C code (by reading it), the compiler can make use of the information it learns through __attribute__ data in many different ways -- even using the same piece of data in multiple ways, sometimes.
The pure attribute tells the compiler that a function is actually a mathematical function -- using only its arguments and the rules of the language to arrive at its answer with no other side effects. Knowing this the compiler may be able to optimize better when calling a pure function, but it may also be used when compiling the pure function to warn you if the function does do something that makes it impure.
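For example (a sketch; the function is made up, but gcd really is pure in this sense):

__attribute__((pure)) int gcd(int a, int b) {
    while (b) { int t = a % b; a = b; b = t; }
    return a;
}

int use(int x, int y) {
    /* Because gcd is declared pure, the compiler may evaluate it just once here. */
    return gcd(x, y) + gcd(x, y);
}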
If you can keep in mind that (even though a few other compilers support them) attributes are a GCC extension and not part of C and their syntax does not fit into C in an elegant way (only enough to fool the preprocessor) then you should be able to understand them better.
You should try playing around with them. Take the ones that are more easily understood for functions and try them out. Do the same thing with data (it may help to look at the assembly output of GCC for this, but sizeof and checking the alignment will often help).
Think of it as a way to inject syntax into the source code, which is not standard C, but rather meant for consumption of the GCC compiler only. But, of course, you inject this syntax not for the fun of it, but rather to give the compiler additional information about the elements to which it is attached.
You may want to instruct the compiler to align a certain variable in memory at a certain alignment. Or you may want to declare a function deprecated so that the compiler will automatically generate a deprecation warning when others try to use it in their programs (useful in libraries). Or you may want to declare a symbol as a weak symbol, so that it will be linked in only as a last resort, if no other definition is found (useful for providing default definitions).
All of this (and more) can be achieved by attaching the right attributes to elements in your program. You can attach them to variables and functions.
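Each of those uses corresponds to a one-line annotation; these declarations are illustrative:

__attribute__((aligned(64))) static char dma_buffer[512];   /* force 64-byte alignment */
__attribute__((deprecated)) void old_api(void);             /* warn at every use site */
__attribute__((weak)) void on_event(void) { }               /* default, overridable definition */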
Take a look at this whole bunch of other GCC extensions to C. The attribute mechanism is a part of these extensions.
There are too many attributes for there to be a single answer, but examples help.
For example __attribute__((aligned(16))) makes the compiler align that variable or type on a 16-byte boundary.
__attribute__((noreturn)) tells the compiler this function never returns to its caller (e.g. standard functions like exit(int)).
__attribute__((always_inline)) makes the compiler inline that function even if it wouldn't normally choose to (using the inline keyword suggests to the compiler that you'd like it inlined, but it's free to ignore you; this attribute forces it).
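Written out as compilable declarations, those examples might look like this (names are mine):

struct vec4 { float v[4]; } __attribute__((aligned(16)));  /* 16-byte aligned type */
__attribute__((noreturn)) void fatal(const char *msg);     /* never returns */
static inline __attribute__((always_inline)) int add(int a, int b) { return a + b; }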
Essentially they're mostly about telling the compiler you know better than it does, or for overriding default compiler behaviour on a function by function basis.
One of the best (but little known) features of GNU C is the attribute mechanism, which allows a developer to attach characteristics to function declarations to allow the compiler to perform more error checking. It was designed in a way to be compatible with non-GNU implementations, and we've been using this for years in highly portable code with very good results.
Note that __attribute__ is spelled with two underscores before and two after, and there are always two sets of parentheses surrounding the contents. There is a good reason for this (see the linked article). GCC needs the -Wall compiler flag to enable the related warnings (yes, there is a finer degree of warning control available, but we are very big fans of max warnings anyway).
For more information please go to http://unixwiz.net/techtips/gnu-c-attributes.html

Large C macros. What's the benefit?

I've been working with a large codebase written primarily by programmers who no longer work at the company. One of the programmers apparently had a special place in his heart for very long macros. The only benefit I can see to using macros is being able to write function-like code that doesn't need all its parameters passed in (which is recommended against in a best-practices guide I've read). Other than that, I see no benefit over an inline function.
Some of the macros are so complicated I have a hard time imagining someone even writing them. I tried creating one in that spirit and it was a nightmare. Debugging is extremely difficult, as a macro collapses N+ lines of code into 1 in the debugger (e.g. there was a segfault somewhere in this large block of code. Good luck!). I had to actually pull the macro out and run it un-macro-ized to debug it. The only way I could see the person having written these is by automatically generating them out of code written in a function after he had debugged it (or by being smarter than me and writing them perfectly the first time, which is always possible, I guess).
Am I missing something? Am I crazy? Are there debugging tricks I'm not aware of? Please fill me in. I would really like to hear from the macro-lovers in the audience. :)
To me the best use of macros is to compress code and reduce errors. The downside is obviously in debugging, so they have to be used with care.
I tend to think that if the resulting code isn't an order of magnitude smaller and less prone to errors (meaning the macros take care of some bookkeeping details) then it wasn't worth it.
In C++, many uses like this can be replaced with templates, but not all. A simple example of useful macros is the event-handler macros in MFC; without them, creating event tables would be much harder to get right, and the code you'd have to write (and read) would be much more complex.
If the macros are extremely long, they probably make the code short but efficient. In effect, he might have used macros to explicitly inline code or remove decision points from the run-time code path.
It might be important to understand that, in the past, such optimizations weren't done by many compilers, and some things that we take for granted today, like fast function calls, weren't valid then.
To me, macros are evil. With their many side effects, and the fact that in C++ you can get the same performance gains with inline, they are not worth the risk.
For example, see this short macro:
#define max(a, b) ((a)>(b)?(a):(b))
then try this call:
max(i++, j++)
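After preprocessing, that call expands to:

((i++) > (j++) ? (i++) : (j++))

so whichever of i and j is larger gets incremented twice.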
More. Say you have
#define PLANETS 8
#define SOCCER_MIDDLE_RIGHT 8
If an error is thrown, it will refer to '8', but not to either of its meaningful representations.
I only know of two reasons for doing what you describe.
First is to force functions to be inlined. This is pretty much pointless, since the inline keyword usually does the same thing, and function inlining is often a premature micro-optimization anyway.
Second is to simulate nested functions in C or C++. This is related to your "writing functions that don't need to be passed in all their parameters" but can actually be quite a bit more powerful than that. Walter Bright gives examples of where nested functions can be useful.
There are other reasons to use of macros, such as using preprocessor-specific functionality (like including __FILE__ and __LINE__ in autogenerated error messages) or reducing boilerplate code in ways that functions and templates can't (the Boost.Preprocessor library excels here; see Boost.ScopeExit or this sample enum code for examples), but these reasons don't seem to apply for doing what you describe.
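A typical instance of the preprocessor-specific kind (a sketch; the macro name is made up):

#include <stdio.h>

#define LOG_ERROR(msg) \
    fprintf(stderr, "%s:%d: error: %s\n", __FILE__, __LINE__, (msg))

/* LOG_ERROR("connection lost") might print "net.c:42: error: connection lost" */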
Very long macros will have performance drawbacks, like increased compiled binary size, and there are certainly other reasons for not using them.
For the most problematic macros, I would consider running the code through the preprocessor and replacing the macro output with function calls (inline if possible) or straight LOC. If the macros exist for compatibility with other architectures/OSes, you might be stuck, though.
Part of the benefit is code replication without the eventual maintenance cost - that is, instead of copying code elsewhere you create a macro from it and only have to edit it once...
Of course, you could also just make a function to be called, but that is sort of more work... I'm against much macro use myself; just trying to present a potential rationale.
There are a number of good reasons to write macros in C.
Some of the most important are creating configuration tables using X-macros, making function-like macros that can accept multiple parameter types as inputs, and converting tables from human-readable/configurable/understandable values into the values the computer actually uses.
I can't really see a reason for people to write very long macros, except for the historic forced function inlining.
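A minimal X-macro configuration table, for readers unfamiliar with the idiom (identifiers invented for the example): the list is written once and expanded twice, so the enum and the name table can never drift apart:

#define COLOR_LIST \
    X(RED)         \
    X(GREEN)       \
    X(BLUE)

enum color {
#define X(name) COLOR_##name,
    COLOR_LIST
#undef X
    COLOR_COUNT
};

static const char *color_names[] = {
#define X(name) #name,
    COLOR_LIST
#undef X
};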
I would say that when debugging complex macros (when writing X-macros, etc.), I tend to preprocess the source file (e.g. with gcc -E) and substitute the preprocessed file for the original.
This allows you to see the C code generated, and gives you real lines to work with in the debugger.
I don't use macros at all. Inline functions serve every useful purpose a macro can. Macros allow you to do very weird and counterintuitive things, like splitting up identifiers (how does someone search for the identifier then?).
I have also worked on a product where a legacy programmer (who thankfully is long gone) also had a special love affair with macros. His "custom" scripting language is the height of sloppiness. This was compounded by the fact that he wrote his C++ classes in C, meaning all class functions and variables were public. Anyway, he wrote almost everything in macros and variadic functions (another hideous monstrosity foisted on the world). So instead of writing a proper template class, he would use a macro! He also resorted to macros to create factory classes, instead of normal code... His code is pretty much unmaintainable.
From what I have seen, macros can be used when they are small, are used declaratively, and don't contain moving parts like loops and other program-flow expressions. It's OK if the macro is one or at most two lines long and it declares an instance of something. Something that won't break during runtime. Also, macros should not contain class definitions or function definitions. If the macro contains code that needs to be stepped into using a debugger, then the macro should be removed and replaced with something else.
They can also be useful for wrapping custom tracing/debugging functionality. For instance you want custom tracing in debug builds but not release builds.
Anyway, when you are working in legacy code like that, just be sure to remove a bit of the macro mess at a time. If you keep it up, eventually you will remove them all and make life a bit easier for yourself. I have done this in the past with especially messy macros. What I do is turn on the compiler switch to have the preprocessor generate an output file (gcc's -E, or -save-temps). Then I raid that file, copy the code, re-indent it, and replace the macro with the generated code. Thank goodness for that compiler feature.
Some of the legacy code I've worked with used macros very extensively in the place of methods. The reasoning was that the computer/OS/runtime had an extremely small stack, so that stack overflows were a common problem. Using macros instead of methods meant that there were fewer methods on the stack.
Luckily, most of that code was obsolete, so it is (mostly) gone now.
C89 did not have inline functions. If using a compiler with extensions disabled (which is a desirable thing to do for several reasons), then the macro might be the only option.
Although C99 came out in 1999, there was resistance to it for a long time; commercial compiler vendors didn't feel it was worth their time to implement C99. Some (e.g. MS) still haven't. So for many companies it was not a viable practical decision to use C99 conforming mode, even up to today in the case of some compilers.
I have used C89 compilers that did have an extension for inline functions, but the extension was buggy (e.g. multiple-definition errors when there should not be). Things like that may dissuade a programmer from using inline functions.
Another thing is that the macro version effectively forces the code to actually be inlined. The C99 inline keyword is only a compiler hint, and the compiler may still decide to generate a single instance of the function code, linked like a non-inline function. (One compiler that I still use will do this if the function is not trivial and returns void.)
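The contrast in sketch form: the macro is expanded textually at every use, while the C99 keyword is only a request:

/* C89-compatible, always "inlined", with the usual side-effect hazards: */
#define MAX(a, b) ((a) > (b) ? (a) : (b))

/* C99: a hint only; the compiler may still emit an out-of-line copy. */
static inline int max_int(int a, int b) { return a > b ? a : b; }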
