Negative stigma with C macros [closed]

As I dive deeper into the C programming language, I am having trouble understanding why macros should be used only as a last resort. For instance, see this post. This is not the first time I've heard that they are a last resort. Some suggest that the memory footprint is larger than that of calling a function (this post). Although I understand these arguments, as well as why macros should not be used in C++ (compiler optimizations and the like), I do not understand the following:
Since macros 'unroll' (if you will) into the .text segment of the executable, there is less call overhead with macros than with function calls - e.g. no memory need be set aside between the frame pointer and stack pointer. This overhead is quantifiable, whereas this post suggests that the costs of macros are not.
Much of the work I do is in embedded systems, microcontrollers, and systems programming. I have read many books and articles by Bjarne Stroustrup. I am also in the process of reading Clean Code, where Robert Martin insists that readability is king (not implying that macros increase readability in their own right).
TLDR:
Considering that macros reduce the overhead associated with stack frames and (if used appropriately) can increase readability - why the negative stigma? They are littered throughout BSD papers and man pages.

Some will vote to close because this could be construed as an opinion question, but there are some facts.
If you have a decent compiler, calls to simple functions are expanded inline just like macros, but they're considerably more type-safe. I've seen cases where inlined functions were faster because the compiler missed common subexpressions in the compiled macro that were eliminated when function arguments were inlined.
When you call a simple function multiple times, you can give the compiler hints on whether to expand it inline each time or call it - trading code space for the small stack and runtime overhead you mention. Better yet, lots of compilers (like gcc) will make this call automatically based on heuristics that may be better than your intuition. With a macro, your only option is to rewrite the macro as a function call, and code changes are more error-prone than build hints.
In most compilation systems, error messages and debuggers don't reference the body code of macros. OTOH, modern debuggers will step correctly line-by-line through the body of a function even though it's actually been expanded inline. Similarly, compilers will correctly point to error locations in function bodies.
It's extremely easy to code subtle bugs when a macro expands an argument more than once: #define MAX(X,Y) (X>Y?X:Y) followed by MAX(++x, 3). Here x is incremented once if it's less than 3, twice otherwise. Ouch.
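To make the double evaluation concrete, here is a minimal sketch (the static inline replacement is the conventional fix; max_int is an illustrative name):

#define MAX(X, Y) ((X) > (Y) ? (X) : (Y))

/* MAX(++x, 3) expands to ((++x) > (3) ? (++x) : (3)), so x is
   incremented a second time whenever the first ++x exceeds 3. */

static inline int max_int(int x, int y)   /* evaluates each argument once */
{
    return x > y ? x : y;
}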
Multi-level macro expansion (where a macro call produces other macro calls) relies on complex rules and is therefore error-prone. E.g. it's not hard to create macros that depend on specific behavior of one preprocessor so that they fail on another.
Functions can be recursive. Macros can't.
Newer C features provide ways to do things for which macros have been (ab)used in the past: compound literals (C99) and _Generic selections (C11), for example. Again, the advantages are type safety and accurate error and debugger references.
The upshot is that when you use a macro, you're adding error risk and maintenance cost that don't exist with inlined functions. It's hard to escape the conclusion that you should use macros only as a fallback. The main defensible reason to do otherwise is that you're forced to use an old or junky compiler.

Macros provide no type safety, and as pure textual replacement they can have surprising side effects. These are the main reasons we should avoid them.

Macros can do a lot - quite a lot - but, as is well known, it's easy to accidentally build macros that do the wrong thing, whether by mishandling side effects, evaluating arguments multiple times, or messing up operator precedence. I think the right approach is to weigh the benefits of macros against the potential risks and drawbacks, then make the call from there.
For example, let's take function-like macros. As you mention, they're often used to make code faster by eliminating short function calls. But nowadays you can achieve the same thing either by using the inline keyword adopted from C++ or just by cranking up compiler optimization settings, since compilers these days are dramatically better at optimization than they were years back. If you're using those macros to perform an operation like taking a min or max that has essentially identical code for different types, use the C11 _Generic keyword. These alternatives are less error-prone and easier to debug and test, so it's probably worth avoiding the risk of a macro error by using them.
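For instance, a type-generic max can dispatch to type-checked inline functions (a sketch of the common C11 idiom; the helper names are made up):

static inline int    max_i(int a, int b)       { return a > b ? a : b; }
static inline double max_d(double a, double b) { return a > b ? a : b; }

/* Select an implementation from the type of the first argument (C11).
   The chosen function then evaluates each argument exactly once. */
#define max(a, b) _Generic((a), int: max_i, double: max_d)(a, b)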
Then there's defining constants with #define. This fell out of favor in C++ in favor of typed constants, and the same is now possible in C using static const variables at global scope. You can use macros for this, but it's more type-safe to use the other option instead. Most compilers are smart enough to inline such constants and optimize on their values in ways that previously only macros would guarantee.
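A minimal before/after sketch of that shift:

#define BUFFER_SIZE 512                /* untyped textual replacement */

static const int buffer_size = 512;    /* typed, scoped, visible to the debugger */

One C-specific caveat: unlike in C++, a const variable is not an integer constant expression in C, so it cannot size a file-scope array or appear in a case label; enum constants (or #define) still cover those spots.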
For these more routine operations, the benefits of using macros aren't as high as they used to be because new language features and vastly smarter compilers have provided less-risky alternatives. That's the main reason why in general the advice is to avoid using macros - they're just not the best tool for the job.
This doesn't mean to never use macros. There are many times and places where they're fantastic. X Macros are a really neat way to automatically generate code, and macro substitution is super helpful for taking advantage of compiler-specific or OS-specific features while maintaining portability. I don't foresee those uses going away any time soon. But do consider the alternatives in other cases, since in many instances they were specifically invented to address weaknesses of the macro preprocessing system!
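For a taste of the X-macro technique, here is a minimal sketch (the table and names are invented for illustration):

/* Define the table once... */
#define COLOR_LIST \
    X(RED)         \
    X(GREEN)       \
    X(BLUE)

/* ...then expand it in several ways. */
#define X(name) COLOR_##name,
enum color { COLOR_LIST COLOR_COUNT };
#undef X

#define X(name) #name,
static const char *color_names[] = { COLOR_LIST };
#undef X

/* color_names[COLOR_GREEN] is "GREEN"; adding a color means
   editing exactly one line of the table. */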

A good optimizing compiler should give efficient code for calls to inline functions, as efficient as if you used macros (think of getc as an example).
But you may want to use macros when they are not replaceable by inline functions. (Here is an example.)

Related

Advantage of #define instead of creating a function in embedded

Recently I came across some embedded code that uses
#define print() printf("hello world")
instead of
void print() { printf("hello world"); }
My question is: what is the gain of using #define instead of creating a function?
It may be related to performance.
A function call has some overhead (calling, saving things on the stack, returning, etc.) while a macro is a direct substitution of the macro name with its contents (i.e. no overhead).
In this example the functions foo and bar do exactly the same thing: foo uses a macro while bar uses a function call; the code is reconstructed below. Compiled without optimization, bar and printY together require more instructions than foo, so by using the macro the performance got a little better.
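The original post's code and disassembly were lost here; the example presumably looked something like this (a hypothetical reconstruction):

#include <stdio.h>

#define PRINT_Y(y) printf("y=%d\n", (y))

void printY(int y) { printf("y=%d\n", y); }

void foo(int y) { PRINT_Y(y); }   /* macro: the printf is substituted in place */
void bar(int y) { printY(y); }    /* function: compiles to a call to printY */

/* Without optimization, bar needs call/return sequences plus the
   separate body of printY, while foo contains the printf directly. */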
But... there are downsides to this approach:
Macros are hard to debug, as you can't single-step through a macro
Extensive use of a macro increases the size of the binary (compared to using a function call), which can itself hurt performance
Also notice that modern compilers (with optimization on) are really good at figuring out when it's a good idea to automatically inline a function (i.e. your code is written with a function call but the compiler decides to inline the function as if it was a macro). So you might get the same performance using function call.
Further, you can use the inline keyword as a hint to the compiler that you think it would be good to inline a function. But even with that keyword the compiler may decide not to inline. Short of compiler-specific attributes (e.g. GCC's always_inline), the only way to make sure the code gets inlined is to use a macro.
There is no advantage. Using #define like this is quite ancient C programming style.
In the year 1999, the C language got the inline keyword to make all such macros obsolete. And with modern compilers, inline is often superfluous too, since the compiler is nowadays better than the programmer when it comes to determining when to inline.
Some of the embedded compilers out there can still be rather bad at such optimizations, though, and that's why embedded C code tends to lag behind in modernization.
In general, micro-optimization like this is called "premature optimization", meaning the programmer is meddling with optimizations that they should leave to the compiler - even in hard real-time systems. Such optimization should be a last resort, applied only when you have 1) detected an actual bottleneck, and 2) disassembled the code to check whether manual inlining actually does anything good for performance.
Sometimes you want to stub out functionality at compile time. Macros give you an easy way to do this.
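For instance (a hypothetical sketch), a host-side test build might replace a hardware access with a fixed value:

#ifdef UNIT_TEST
#define read_sensor() (42)    /* stub: no hardware needed on the host */
#else
int read_sensor(void);        /* real implementation lives in the driver */
#endif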

Difference between macros and functions in C in relation to instruction memory and speed

To my understanding the difference between a macro and a function is that a macro call is replaced by the instructions in its definition, while a function does the whole push, branch, and pop thing. Is this right, or have I understood something wrong?
Additionally, if this is right, it would mean that macros take more space but are faster (because of the lack of the push, branch, and pop instructions), wouldn't it?
What you wrote about the performance implications is correct if the C compiler is not optimizing. But optimizing compilers can inline functions just as if they were macros, so an inlined function call runs at the same speed as a macro, and there is no pushing/popping overhead. To trigger inlining, enable optimization in your compiler settings (e.g. gcc -O2), and put your functions in the .h file as static inline.
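A sketch of that arrangement (the file and function names are illustrative):

/* util.h */
#ifndef UTIL_H
#define UTIL_H

static inline int clamp(int v, int lo, int hi)
{
    return v < lo ? lo : v > hi ? hi : v;
}

#endif

Compiled with, e.g., gcc -O2, each call to clamp() is expanded in place, with full type checking and single evaluation of each argument.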
Please note that sometimes inlining/macros is faster, sometimes a real function call is faster, depending on the code and the compiler. If the function body is very short (and most of it will be optimized away), usually inlining is faster than a function call.
Another important difference is that macros can take arguments of different types, and the macro definition can make sense for multiple types (but the compiler won't do type checking for you, so you may get undesired behavior or a cryptic error message if you use a macro with the wrong argument type). This polymorphism is hard to mimic with functions in C (but easy in C++ with function overloading and function templates).
This might have been right in the 1980s, but modern compilers are much better.
Functions don't always push and pop the stack, especially if they're leaf functions or have tail calls. Also, functions are often inlined, and can be inlined even if they are defined in other translation units (this is called link-time optimization).
But you're right that in general, when optimizations are turned off, a macro will be expanded in place and a function won't be inlined. Either version may take more space; it depends on the particulars of the macro/function.
A function uses space in two ways: the body uses space, and the function call uses space. If the function body is very small, it may actually save space to inline it.
Yes, your understanding is right. But you should also note that there is no type checking in macros, and that they can lead to unintended side effects. You should also be very careful to parenthesize macros, as shown below.
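The classic parenthesization pitfall, as a standard illustration (not from the original answer):

#define SQUARE_BAD(x)  x * x
#define SQUARE(x)      ((x) * (x))

/* SQUARE_BAD(a + 1) expands to a + 1 * a + 1, which is 2*a + 1.
   SQUARE(a + 1) expands to ((a + 1) * (a + 1)) as intended,
   though it still evaluates x twice. */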
Your understanding is half correct. The point is that macros are resolved before compilation. You should think of them as sophisticated text replacement tools (that's oversimplifying it, but is mostly what it comes down to).
So the difference is when in the build process your code is used.
This is orthogonal to the question of what the compiler really does with it when it creates the final binary code. It is more or less free to do whatever it thinks is correct to produce the intended behaviour. In C++, you can only hint at your preference with the inline keyword. The compiler is free to ignore that hint.
Again, this is orthogonal to the whole preprocessor business. Nothing stops you from writing macros which result in C++ code using the inline keyword, after all. Likewise, nobody stops you from writing macros which result in a lot of recursive C++ functions which the compiler will probably not be able to inline even if it wanted to.
The conclusion is that your question is wrong. It's a general question of having binaries with a lot of inlined functions vs. binaries with a lot of real function calls. Macros are just one technique you can use to influence the tradeoff in one way or the other, and you will ask yourself the same general question without macros.
The assumption that inlining a function will always trade space for speed is wrong. Inlining the wrong (i.e. too big) functions will even have a negative impact on speed. As is always the case with such optimisations, do not guess but measure.
You should read the FAQ on this: "Do inline functions improve performance?"

What's the purpose of using assembly language inside a C program?

What's the purpose of using assembly language inside a C program? Compilers are able to generate assembly language already. In what cases would it be better to write assembly than C? Is performance a consideration?
In addition to what everyone said: not all CPU features are exposed to C. Sometimes, especially in driver and operating system programming, one needs to explicitly work with special registers and/or commands that are not otherwise available.
The same goes for vector extensions. That was especially true before the advent of compiler intrinsics, which alleviate the need for inline assembly somewhat.
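For example, reading the x86 time-stamp counter takes an instruction plain C cannot express; with GCC/Clang extended asm (compiler- and architecture-specific, sketched from the usual idiom):

#include <stdint.h>

static inline uint64_t read_tsc(void)
{
    uint32_t lo, hi;
    /* RDTSC leaves the low 32 bits in EAX and the high 32 bits in EDX. */
    __asm__ volatile ("rdtsc" : "=a" (lo), "=d" (hi));
    return ((uint64_t)hi << 32) | lo;
}

On compilers that provide it, the __rdtsc() intrinsic expresses the same thing without inline assembly, which is exactly the alleviation mentioned above.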
One more use case for inline assembly has to do with interfacing C with reflected languages. Specifically, assembly is all but necessary if you need to call a function when its prototype is not known at compile time - that is, when the quantity and datatypes of the function's arguments are only known at runtime. C variadic functions and the stdarg machinery won't help you in this case - they would help you parse a stack frame, but not build one. In assembly, on the other hand, it's quite doable.
This is not an OS/driver scenario. There are at least two technologies out there - Java's JNI and COM Automation - where this is a must. In case of Automation, I'm talking about the way the COM runtime is marshaling dual interfaces using their type libraries.
I can think of a very crude C alternative to assembly for that, but it'd be ugly as sin. Slightly less ugly in C++ with templates.
Yet another use case: crash/run-time error reporting. For postmortem debugging, you want to capture as much of the program state at the point of the crash as possible (i.e. all the CPU registers), and assembly is a much better vehicle for that than C. Postmortem debugging of crashing native code usually involves staring at the assembly anyway.
Yet another use case - code that is intended for execution in another process without that process' co-operation or knowledge. This is often referred to as "shellcode", but it doesn't have to be shell related. Code like that needs to be very carefully written, and it can't rely on the conveniences of a high level language (like the run time library, or having a data section) that are normally taken for granted. When one is after injecting a significant piece of functionality into a target process, they usually end up loading a dynamic library, but the initial trampoline code that loads the library and passes control to it tends to be in assembly.
I've been only covering cases where assembly is necessary. Hand-optimizing for performance is covered in other answers.
There are a few, although not many, cases where hand-optimized assembly language can be made to run more efficiently than assembly language generated by C compilers from C source code. Also, for developers used to assembly language, some things can just seem easier to write in assembler.
For these cases, many C compilers allow inline assembly.
However, this is becoming increasingly rare as C compilers get better and better at producing efficient code, and as most platforms put restrictions on the kind of low-level software that benefits most from being written in assembler.
In general, it is performance, but performance of a very specific kind. For example, the SIMD parallel instructions of a processor might not be generated by the compiler. By using processor-specific data formats and then issuing processor-specific parallel instructions (e.g. ARM NEON or Intel SSE), very fast performance on graphics or signal-processing problems can be achieved. Even then, some compilers allow these to be expressed in C using intrinsic functions.
While it used to be common to use assembly language inserts to hand-optimize critical functions, those days are largely done. Modern compilers are very good and modern processors have very complicated timing requirements so hand optimized code is often less optimal than expected.
There are various reasons to write inline assembly in C. We can roughly categorize them as necessary and unnecessary.
The unnecessary reasons include:
platform compatibility
performance concerns
code optimization
etc.
I consider these unnecessary because they can often be discarded or implemented in pure C. For platform compatibility, for example, you could implement a separate version for each platform, though inline assembly might reduce the effort. We won't dwell on the unnecessary reasons here.
The necessary reasons include:
something the standard libraries are insufficient to do
an instruction set not supported by the compiler
object code being generated incorrectly
writing stack-sensitive code
etc.
These reasons are necessary because they are almost impossible to address in pure C. For example, under old DOS, the software interrupt INT 21h was not reentrant. If you wanted to write a virtual drive that made full use of INT 21h, it was impossible with compiler support alone. You would need to hook the original INT 21h handler and make it reentrant. However, the compiler wraps every call with a prologue/epilogue, so you could never get around that restriction without crashing the code. You can try tricks in pure C with libraries, but even if you find one that works, it only means you have found a particular order in which the compiler happens to generate machine code - in other words, you are trying to coax the compiler into emitting exact machine code. So why not just write the inline assembly directly?
This example illustrates all of the necessary reasons above except an unsupported instruction set, but that one is easy to imagine.
In fact, there are more reasons to write inline assembly, but by now you have some idea of them.
Just as a curiosity, I'm adding here a concrete example of something not-so-low-level you can only do in assembly. I read this in an assembly book from my university time where it was used to show an inherent limitation of C/C++, and how to overcome it with assembly.
The problem is: how do I invoke a function when the exact number of parameters is only known at runtime? In fact, in C/C++ you can easily define a function that takes a variable number of arguments, like printf. But when it comes to calling that function, the compiler wants to know exactly how many parameters must be passed. You may pass more parameters than required; that won't do any harm. But what if the number grows unexpectedly to 100 or 1000 parameters, and must be picked out of an array?
The solution of course is using assembly, where you can dynamically create a stack frame of the proper size, copy the parameters on the stack, invoke the function, and finally reset the stack.
In practice, this would hardly ever be a limitation (except if the library you're using is really badly designed). People who use assembly in C have much better reasons to do so, as others have pointed out in their answers. Still, I think it may be an interesting fact to know.
I would rather think of inline assembly as a way to write very specific code for a specific platform; optimization, though still common, is used less nowadays. Knowledge and use of assembly in C is also practiced by hats of all colors.

Alternatives to C "inline" keyword

My course instructor has repeatedly emphasized and asked us not to use the "inline" keyword for functions. He says it is not "portable" across compilers and is not "standard". Considering this, are there any "standard" alternatives that allow for "inline expansion"?
Your course instructor is wrong. It is standard. It's actually in the current standard, right there in section 6.7.4 Function specifiers (C99). The fact that it's a suggestion to the compiler that may be totally ignored does not make it any less standard.
I don't think it was in C89/90, which may be what some embedded compilers use, but I would give serious consideration to upgrading in that case.
However, even where inline is available, I generally leave those decisions up to the compiler itself since most modern ones are more than capable of figuring out how best to optimise code (and usually far better than I). The inline keyword, like register and auto, is not something I normally worry about at all.
You can use macros instead, since that's relatively simple text substitution that happens before the compile phase, but you should be aware of the limitations and foibles.
Or you can manually inline code (i.e., duplicate it), although I wouldn't suggest this as an option since it may quickly become a maintenance nightmare.
Myself, I would write the code using normal functions without any of those tricks and then introduce them where necessary (and only if you can demonstrate that they're needed, such as a specific performance issue).
You should always assume that the coder who has to maintain your code is a psychopathic killer who knows where you live :-)
As others have said, inline was integrated into the C standard 11 years ago.
Contrary to what was indicated, inline does make a difference, since it changes the visibility properties of the function. In particular, for large libraries with a lot of functions declared only static, you might otherwise end up with a copy of each such function in every object file that uses it (e.g. when you compile with debugging switched on).
Please have a look into that post: Myth and reality about inline in C99
As evil as they may be, macros are still king (although specific compilers may support extra capabilities).
Here, now it's "portable across compilers":
#if (__STDC_VERSION__ < 199901L)
#define inline
#endif
static inline int foobar(int x) { /* ... */ }
By the way, as others have said, the inline keyword is just a hint and rather useless, but the important keyword is static. Unless your function is declared static, it will have external linkage, and the compiler is unlikely to consider it a candidate for inlining when it makes its own decisions about which functions to inline.
Also note that, unlike in C++, plain inline in C does not by itself provide an external definition of the function; an inline function without static needs a matching extern declaration in exactly one translation unit, which is why static inline is the usual idiom.

Large C macros. What's the benefit?

I've been working with a large codebase written primarily by programmers who no longer work at the company. One of the programmers apparently had a special place in his heart for very long macros. The only benefit I can see to using macros is being able to write functions that don't need to have all their parameters passed in (which is recommended against in a best-practices guide I've read). Other than that, I see no benefit over an inline function.
Some of the macros are so complicated I have a hard time imagining someone even writing them. I tried creating one in that spirit and it was a nightmare. Debugging is extremely difficult, as a macro collapses N+ lines of code into one line in the debugger (e.g. there was a segfault somewhere in this large block of code - good luck!). I had to actually pull the macro out and run it un-macro-ized to debug it. The only way I could see the person having written these is by automatically generating them out of code written in a function after he had debugged it (or by being smarter than me and writing it perfectly the first time, which is always possible, I guess).
Am I missing something? Am I crazy? Are there debugging tricks I'm not aware of? Please fill me in. I would really like to hear from the macro-lovers in the audience. :)
To me the best use of macros is to compress code and reduce errors. The downside is obviously in debugging, so they have to be used with care.
I tend to think that if the resulting code isn't an order of magnitude smaller and less prone to errors (meaning the macros take care of some bookkeeping details) then it wasn't worth it.
In C++, many uses like this can be replaced with templates, but not all. A simple example of useful macros is the event-handler macros of MFC - without them, creating event tables would be much harder to get right, and the code you'd have to write (and read) would be much more complex.
If the macros are extremely long, they probably make the code short but efficient. In effect, he might have used macros to explicitly inline code or remove decision points from the run-time code path.
It might be important to understand that, in the past, such optimizations weren't done by many compilers, and some things that we take for granted today, like fast function calls, weren't valid then.
To me, macros are evil. With so many potential side effects, and given that in C++ you can get the same performance gains with inline, they are not worth the risk.
For example, see this short macro:
#define max(a, b) ((a)>(b)?(a):(b))
then try this call:
max(i++, j++)
Another example. Say you have
#define PLANETS 8
#define SOCCER_MIDDLE_RIGHT 8
If an error is reported, it will refer to '8', not to either of its meaningful representations.
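This is one reason enumeration constants are often preferred for such values; the name survives into diagnostics and the debugger:

enum { PLANETS = 8 };
enum { SOCCER_MIDDLE_RIGHT = 8 };

/* Unlike #define, these are known to the compiler by name, and they
   still work anywhere an integer constant expression is required. */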
I only know of two reasons for doing what you describe.
First is to force functions to be inlined. This is pretty much pointless, since the inline keyword usually does the same thing, and function inlining is often a premature micro-optimization anyway.
Second is to simulate nested functions in C or C++. This is related to your "writing functions that don't need to be passed in all their parameters" but can actually be quite a bit more powerful than that. Walter Bright gives examples of where nested functions can be useful.
There are other reasons to use macros, such as using preprocessor-specific functionality (like including __FILE__ and __LINE__ in autogenerated error messages) or reducing boilerplate code in ways that functions and templates can't (the Boost.Preprocessor library excels here; see Boost.ScopeExit or this sample enum code for examples), but these reasons don't seem to apply to what you describe.
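For instance, a function cannot report its caller's source location, but a macro can (a minimal sketch):

#include <stdio.h>

/* Only the preprocessor sees the caller's file and line. */
#define FAIL(msg) \
    fprintf(stderr, "error at %s:%d in %s(): %s\n", \
            __FILE__, __LINE__, __func__, (msg))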
Very long macros will have performance drawbacks, like increased compiled binary size, and there are certainly other reasons for not using them.
For the most problematic macros, I would consider running the code through the preprocessor and replacing the macro output with function calls (inline if possible) or straight lines of code. If the macros exist for compatibility with other architectures/OSes, you might be stuck, though.
Part of the benefit is code replication without the eventual maintenance cost - that is, instead of copying code elsewhere you create a macro from it and only have to edit it once...
Of course, you could also just write a function to be called, but that is somewhat more work... I'm against heavy macro use myself; I'm just trying to present a potential rationale.
There are a number of good reasons to write macros in C.
Some of the most important are creating configuration tables using X macros, making function-like macros that can accept multiple parameter types as inputs, and converting tables from human-readable/configurable values into the values the machine actually uses.
I can't really see a reason for people to write very long macros, except for the historical use of forcing function inlining.
I would say that when debugging complex macros (when writing X macros, etc.), I tend to preprocess the source file (e.g. with gcc -E) and substitute the preprocessed file for the original.
This allows you to see the C code generated, and gives you real lines to work with in the debugger.
I don't use macros at all. Inline functions serve every useful purpose a macro can. Macros allow you to do very weird and counterintuitive things like pasting identifiers together (how does someone search for the resulting identifier then?), as sketched below.
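A sketch of the identifier-pasting problem (hypothetical names):

struct config { int timeout; int retries; } config;

#define DEFINE_GETTER(field) \
    int get_##field(void) { return config.field; }

DEFINE_GETTER(timeout)   /* defines int get_timeout(void) */
DEFINE_GETTER(retries)   /* defines int get_retries(void) */

/* Grepping the source for "get_timeout" finds nothing: the
   identifier only exists after preprocessing. */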
I have also worked on a product where a legacy programmer (who thankfully is long gone) also had a special love affair with macros. His 'custom' scripting language is the height of sloppiness. This was compounded by the fact that he wrote his C++ classes in C, meaning all class functions and variables were public. Anyway, he wrote almost everything with macros and variadic functions (another hideous monstrosity foisted on the world). So instead of writing a proper template class he would use a macro! He also resorted to macros to create factory classes, instead of normal code... His code is pretty much unmaintainable.
From what I have seen, macros can work when they are small, are used declaratively, and don't contain moving parts like loops and other control-flow expressions. It's OK if the macro is one or at most two lines long and declares an instance of something - something that won't break at runtime. Macros should not contain class definitions or function definitions. If a macro contains code that needs to be stepped into with a debugger, the macro should be removed and replaced with something else.
They can also be useful for wrapping custom tracing/debugging functionality. For instance you want custom tracing in debug builds but not release builds.
Anyway, when you are working in legacy code like that, just be sure to remove the macro mess a bit at a time. If you keep at it, eventually you will remove them all and make life a bit easier for yourself. I have done this in the past with especially messy macros. What I do is turn on the compiler switch that makes the preprocessor write its output to a file. Then I raid that file, copy the code, re-indent it, and replace the macro with the generated code. Thank goodness for that compiler feature.
Some of the legacy code I've worked with used macros very extensively in place of functions. The reasoning was that the computer/OS/runtime had an extremely small stack, so stack overflows were a common problem. Using macros instead of functions meant fewer frames on the stack.
Luckily, most of that code was obsolete, so it is (mostly) gone now.
C89 did not have inline functions. If using a compiler with extensions disabled (which is a desirable thing to do for several reasons), then the macro might be the only option.
Although C99 came out in 1999, there was resistance to it for a long time; commercial compiler vendors didn't feel it was worth their time to implement C99. Some (e.g. MS) still haven't. So for many companies it was not a viable practical decision to use C99 conforming mode, even up to today in the case of some compilers.
I have used C89 compilers that did have an extension for inline functions, but the extension was buggy (e.g. multiple-definition errors when there should not be); things like that may dissuade a programmer from using inline functions.
Another thing is that the macro version effectively forces the code to be inlined. The C99 inline keyword is only a compiler hint, and the compiler may still decide to generate a single instance of the function code, linked like a non-inline function. (One compiler I still use does this if the function is not trivial and returns void.)
