Use of #define to alias structure members - c

This is a subjective question, so I will accept 'there is no answer' but read fully as this is specifically on a system where the code is safety critical.
I've adopted some embedded C code for a safety critical system, where the original author has (in random places) used syntax like this:
#include <stdio.h>
typedef struct tag_s2 {
int a;
}s2;
typedef struct tag_s1 {
s2 a;
}s1;
s1 inst;
#define X_myvar inst.a.a
int main(int argc, char **argv)
{
X_myvar = 10;
printf("myvar = %d\n", X_myvar + 1);
return 0;
}
Effectively using a #define to alias and obscure a deep structure member. Mostly two or three, but occasionally four deep.
BTW: This is a simple example, the real code is far more complicated but I can't publish any part of that here.
The use of this is not consistent: in some places the aliased variable is used directly, in others by its alias, and some parts of the code are not aliased at all.
IMO this is bad practice: it obscures the code for no gain and reduces maintainability and readability, leading to future errors and misunderstanding.
If the style was 100% consistent then perhaps I would be more happy with it.
However, being safety critical a change is costly. So not wanting to fix 'wot aint broke' I am open to other arguments.
Should I fix it or leave well alone?
Is there any guidance (e.g. Generic C, MISRA or DO178B style guides) that would have an opinion on this?

However, being safety critical a change is costly. So not wanting to fix 'wot aint broke' I am open to other arguments.
It's a paradoxical death spiral that the most critical code gets the least attention because people are afraid to change it.
That you are hesitant to make a simple, rote refactoring to this code tells me the code either has no tests or you don't trust the tests. When you're afraid to improve code because you might break it, that delays improvements to the code. You're likely to do the smallest possible thing which will make the code even more brittle and unsafe.
I'd advise the first thing is to get some tests in place along with a staging environment for trials. Then all changes become safer. There might be some gaffes initially while you find all the weird and dangerous things this code is doing, but that's what the staging area is for. In the medium and long term everyone will improve this code faster and with more confidence. Making code easier and safer to change allows it to be made easier and safer to change; the spiral then goes up, not down.
The technique of making a macro seem like a single variable is a technique I've seen before in the Perl 5 code base. It is written more in C macros than in C. For example, here's a bit of manipulating the Perl call stack.
#define SP sp
#define MARK mark
#define TARG targ
#define PUSHMARK(p) \
STMT_START { \
I32 * mark_stack_entry; \
if (UNLIKELY((mark_stack_entry = ++PL_markstack_ptr) \
== PL_markstack_max)) \
mark_stack_entry = markstack_grow(); \
*mark_stack_entry = (I32)((p) - PL_stack_base); \
DEBUG_s(DEBUG_v(PerlIO_printf(Perl_debug_log, \
"MARK push %p %" IVdf "\n", \
PL_markstack_ptr, (IV)*mark_stack_entry))); \
} STMT_END
#define TOPMARK S_TOPMARK(aTHX)
#define POPMARK S_POPMARK(aTHX)
#define INCMARK \
STMT_START { \
DEBUG_s(DEBUG_v(PerlIO_printf(Perl_debug_log, \
"MARK inc %p %" IVdf "\n", \
(PL_markstack_ptr+1), (IV)*(PL_markstack_ptr+1)))); \
PL_markstack_ptr++; \
} STMT_END
#define dSP SV **sp = PL_stack_sp
#define djSP dSP
#define dMARK SV **mark = PL_stack_base + POPMARK
#define dORIGMARK const I32 origmark = (I32)(mark - PL_stack_base)
#define ORIGMARK (PL_stack_base + origmark)
#define SPAGAIN sp = PL_stack_sp
#define MSPAGAIN STMT_START { sp = PL_stack_sp; mark = ORIGMARK; } STMT_END
#define GETTARGETSTACKED targ = (PL_op->op_flags & OPf_STACKED ? POPs : PAD_SV(PL_op->op_targ))
#define dTARGETSTACKED SV * GETTARGETSTACKED
These are macros upon macros upon macros. The Perl 5 source is riddled with them. There is a lot of opaque magic happening there. Some of them need to be macros to allow assignment, but many could be inline functions. Despite being part of a public API they are indifferently documented in part because they are macros and not functions.
This style is very clever and useful if you're already very familiar with the Perl 5 source code. For everyone else it has made the Perl 5 internals extremely difficult to work with. While some compilers will provide stack traces for macro expansions, others will only report on the expanded macro, leaving one scratching their head over what the hell const I32 origmark = (I32)(mark - PL_stack_base) is, because it never appears in your source.
Like many macro hacks, while the technique is very clever it is also mind-bending and unfamiliar to many programmers. Mind-bending is not what you want in safety critical code. You want simple, boring code. That alone is the simplest argument to replace it with well named getter and setter functions. Trust the compiler to optimize them.
A good example of this is GLib which carefully uses well-documented function-like macros to make generic data structures. For example, adding a value to an array.
#define g_array_append_val(a,v)
While this is a macro, it acts and is documented like a function. It's a macro solely as a mechanism to create a safe, type generic array. It hides no variables. You can safely use it without ever being aware it's a macro.
In conclusion, yes, change it. But instead of simply replacing X_myvar with inst.a.a consider creating functions that continue to provide encapsulation.
void s1_set_a( s1 *s, int val ) {
s->a.a = val;
}
int s1_get_a( s1 *s ) {
return s->a.a;
}
s1_set_a(&inst, 10);
printf("myvar = %d\n", s1_get_a(&inst) + 1);
The internals of s1 are hidden making it easier to change the internals later (for example, changing s1.a to a pointer to save memory). What variable you're working with is clear making the overall code easier to understand. The function names provide a clear explanation of what's happening. Because they're functions they have an obvious place for documentation. Trust the compiler to know how best to optimize it.

Yeah you should get rid of it. Obscure macros are dangerous.
It was common in older C to avoid spelling out deep nesting to do things like
#define a inst.a
In which case you only had to type inst.a instead of inst.a.a. Although this is questionable practice, macros like these were used to repair a shortcoming in the language, namely the lack of anonymous structs. Modern C supports that from C11 though. We can use anonymous structures to get rid of unnecessarily nested structs:
typedef struct {
struct
{
int a;
};
}s1;
But MISRA-C:2012 doesn't support C11 so that might not be an option.
Another trick you can use to get rid of long names is something like this:
int* x = &longstructname.somemember.anotherstruct.x;
// use *x from here on
x is a local variable with limited scope. That's much more readable than the obscure macros and it gets optimized away in the machine code.
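As a minimal sketch of that local-pointer trick (the struct layout and the bump_counter helper are made up for illustration; only the long member path comes from the line above):

```c
/* Hypothetical nested structs standing in for the real, deeper ones. */
struct inner  { int x; };
struct middle { struct inner anotherstruct; };
struct outer  { struct middle somemember; };

static struct outer longstructname;

/* Increment the deeply nested member through a short-lived local alias. */
static int bump_counter(void)
{
    /* The alias is visible only inside this function, unlike a macro. */
    int *const x = &longstructname.somemember.anotherstruct.x;

    *x += 1;
    return *x;
}
```

Because the alias is an ordinary variable, a debugger can show it and its scope ends with the function, which is not true of a #define.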

From a maintenance perspective, yes, this is definitely code that needs to be fixed.
However, it is only from that perspective that the code needs to be fixed. It does not harm program correctness, and if the code is correct as-is, that is the paramount consideration.
That's why code like this should never be fixed unless a thorough unit test and regression test regimen is already in place. You should only fix code like this if you can be certain that you don't break correctly-functioning code in the process.

Related

Is it "proper" to define macros for use in a single function

I'm not asking if it's possible to #define a macro inside a function. I understand that it is possible and that macros are preprocessing mechanisms that are almost copy-paste (to an extent). What I'm asking is are there broad reasons to avoid making a macro for use in a single function or is it simply up to the codebase maintainers as to what they decide is good style?
Here is a contrived example similar to my usecase:
int main() {
/* ... */
#define POS(x, y) grid_array[width*(y) + (x)]
/* Use POS(x, y) in this function */
#undef POS
/* ... */
}
Another way to put the question is: Would other C developers nod their head in understanding or shake their head in disdain?
Edit:
This is not a duplicate of the question "Macro vs function." I understand (some of) the differences between the two, e.g. MIN(a, b) (a)>(b)?(b):(a) evaluates a and b twice. I'm asking if it is good practice to use a macro for a single function.
The answer (and comments) pointed out that my simple example doesn't merit using a macro. Though my actual use case is not that simple, I have to agree. The "saved typing" doesn't merit convoluted code.
Here's my actual use case if you were curious:
/* array represents a graph with max degree 2. These operations are domain specific. */
#define CONNECT(a, b, i) {array[8*(a) + i] = b; array[8*(b) + (i+4)%8] = a;}
The primary purpose of any programming language, outside of just hex values, is to make the intent and implementation more obvious to the reader. Programs are meant to be read by people, and translated by machines.
To that extent, if POS(x,y) merely saves you typing, that might be of value to the reader, but you should consider the reader more than the writer. Any worthwhile program will have many more readers than writers, so your obligation as a writer is to lessen their load.
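For comparison, here is one way the POS(x, y) shorthand could be written as a small function instead. This is only a sketch: the name grid_at and the choice of int cells are assumptions, not part of the question.

```c
#include <stddef.h>

/* Hypothetical replacement for POS(x, y). It returns a pointer to the cell
   so the result can be both read and assigned, as the macro allowed. */
static inline int *grid_at(int *grid, size_t width, size_t x, size_t y)
{
    return &grid[width * y + x];
}
```

A call such as *grid_at(grid_array, width, 2, 1) = 7; says where the array and the width come from, which the macro left implicit.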

How can I maintain correlation between structure definitions and their construction / destruction code?

When developing and maintaining code, I add a new member to a structure and sometimes forget to add the code to initialize or free it which may later result in a memory leak, an ineffective assertion, or run-time memory corruption.
I try to maintain symmetry in the code where things of the same type are structured and named in a similar manner, which works for matching Construct() and Deconstruct() code but because structures are defined in separate files I can't seem to align their definitions with the functions.
Question: is there a way through coding to make myself more aware that I (or someone else) has changed a structure and functions need updating?
Efforts:
The simple:
-Have improved code organization to help minimize the problem
-Have worked to get into the habit of updating everything at once
-Have used comments to document struct members, but this just results in duplication
-Do use IDE's auto-suggest to take a look and compare suggested entries to implemented code, but this doesn't detect changes.
I had thought that maybe structure definitions could appear multiple times as long as they were identical, but that doesn't compile. I believe duplicate structure names can appear as long as they do not share visibility.
The most effective thing I've come up with is to use a compile time assertion:
static_assert(sizeof(struct Foobar) == 128, "Foobar structure size changed, reevaluate construct and destroy functions");
It's pretty good, definitely good enough. I don't mind updating the constant when modifying the struct. Unfortunately compile time assertions are very platform (compiler) and C Standard dependent, and I'm trying to maintain the backwards compatibility and cross platform compatibility of my code.
This is a good link regarding C Compile Time Assertions:
http://www.pixelbeat.org/programming/gcc/static_assert.html
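For compilers without C11 static_assert, one widely used fallback relies on the rule that an array type cannot have a negative size. The sketch below is illustrative only: the macro name COMPILE_TIME_ASSERT and the toy struct are assumptions, not taken from the linked page.

```c
/* Hypothetical fallback: a false condition yields an array of size -1,
   which a conforming compiler must reject at compile time. */
#define COMPILE_TIME_ASSERT(cond, tag) \
    typedef char compile_time_assert_##tag[(cond) ? 1 : -1]

/* Toy struct made only of bytes so its size is predictable here. */
struct Foobar {
    unsigned char header[4];
    unsigned char payload[12];
};

/* Stops the build if someone changes the struct without revisiting
   the construct and destroy functions. */
COMPILE_TIME_ASSERT(sizeof(struct Foobar) == 16, foobar_size);
```

The error message is less friendly than the one from static_assert, so the tag argument is there to make the failing check recognizable in the compiler output.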
Edit:
I just had a thought; although a structure definition can't easily be relocated to a source file (unless it does not need to be shared with other source files), I believe a function can actually be relocated to a header file by inlining it.
That seems like a hacked way to make the language serve my unintended purpose, which is not what I want. I want to be professional. If the professional practice is not to approach this code-maintainability issue this way, then that is the answer.
I've been programming in C for almost 40 years, and I don't know of a good solution to this problem.
In some circles it's popular to use a set of carefully-contrived macro definitions so that you can write the structure once, not as a direct C struct declaration but as a sequence of these macros and then, by defining the macro differently and re-expanding, turn your "definition" into either a declaration or a definition or an initialization. Personally, I feel that these techniques are too obfuscatory and are more trouble than they're worth, but they can be used to decent effect.
Otherwise, the only solution -- though it's not what you're looking for -- is "Be careful."
In an ideal project (although I realize full well there's no such thing) you can define your data structures first, and then spend the rest of your time writing and debugging the code that uses them. If you never have occasion to add fields to structs, then obviously you won't have this problem. (I'm sorry if this sounds like a facetious or unhelpful comment, but I think it's part of the reason that I, just as @CoffeeTableEspresso mentioned in a comment, tend not to have too many problems like this in practice.)
It's perhaps worth noting that C++ has more or less the same problem. My biggest wishlist feature in C++ was always that it would be possible to initialize class members in the class declaration. (Actually, I think I've heard that a recent revision to the C++ standard does allow this -- in which case another not-necessarily-helpful answer to your question is "Use C++ instead".)
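The macro technique mentioned above, where the member list is written once and re-expanded, is often called "X-macros". A minimal sketch might look like this; every name in it (FOO_FIELDS, AS_MEMBER, AS_INIT, foo_init) is invented for the example.

```c
#include <stddef.h>

/* The single source of truth: one X(type, name, initial value) row per member. */
#define FOO_FIELDS(X)     \
    X(int,    x, 0)       \
    X(double, y, 1.5)     \
    X(void *, p, NULL)

/* Expansion 1: turn each row into a member declaration. */
#define AS_MEMBER(type, name, init) type name;
struct foo { FOO_FIELDS(AS_MEMBER) };

/* Expansion 2: turn each row into an assignment in the constructor. */
#define AS_INIT(type, name, init) self->name = (init);
static void foo_init(struct foo *self)
{
    FOO_FIELDS(AS_INIT)
}
```

Adding a row to FOO_FIELDS updates both the struct and foo_init, at the cost of the obfuscation the answer warns about.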
C doesn't let you have benign struct redefinitions but it does let you have benign macro redefinitions.
So as long as you
-save the struct body in a macro (according to a fixed naming convention)
-redefine the macro at the point of your constructor
you will get a warning if the struct body changes and you haven't updated the corresponding constructor.
Example:
header.h:
#define MC_foo_bod \
int x; \
double y; \
void *p
struct foo{ MC_foo_bod; };
foo__init.c
#include "header.h"
#ifdef MC_foo_bod
//try for a silent redefinition
//if it wasn't silent, the macro changed and so should this code
#define MC_foo_bod \
int x; \
double y; \
void *p
#else
#error ""
//oops--not a redefinition
//perhaps a typo in the macro name or a failure to include the header?
#endif
void foo__init(struct foo*X)
{
//...
}

"with" macro in C

I was looking for a macro that will resemble the with-construct.
The usage should be something like:
with (lock(&x), unlock(&x)) {
...
}
It might be useful for some other purposes.
I came up with this macro:
#define __with(_onenter, _onexit, v) \
for (int __with_uniq##v=1; __with_uniq##v > 0; )\
for (_onenter; __with_uniq##v > 0; _onexit) \
while (__with_uniq##v-- > 0)
#define _with(x, y, z) __with(x, y, z)
#define with(_onenter, _onexit) _with(_onenter, _onexit, __COUNTER__)
It has 3 nested loops because it should:
Initialize loop counter (C99 only, of course)
Possibly initialize variable _onenter (such as with (int fd=open(..), close(fd)))
Allow break inside the code block. (continue is allowed too. And the macro could be adjusted to assert() it out)
I used it on the code for the XV6 OS and it seems quite useful.
My question is - what are the worst problems with such a macro? I mean, besides the mere usage of a C macro (especially one that implements new control-flow construct).
So far have found these drawbacks / problems:
No support for return or goto (but it can save some gotos in kernel code)
No support for errors (such as fd < 0). I think this one is fixable.
gnu89 / c99 and above only (loop counter. the unique variable trick is not necessary)
Somewhat less efficient than simple lock-unlock. I believe it to be insignificant.
Are there any other problems? Is there a better way to implement similar construct in C?
That macro scares me. I'd prefer the traditional approach using gotos.
That approach is primitive, but most C programmers are familiar with the pattern and if they're not, they can understand it by reading the local code. There is no hidden behavior. As a consequence, it's pretty reliable.
Your macro is clever, but it would be new to most everybody and it comes with hidden gotchas. New contributors would have to be taught rules such as "don't return or goto out of a with block" and "break will break out of the with block, not out of the surrounding loop". I fear mistakes would be common.
The balance would shift if you could add warnings for misuses of this construct to the compiler. With clang, that seems to be an option. In this case, misuses would be detected and your code would remain portable to other compilers.
If you're willing to restrict yourself to GCC and Clang, you can use the cleanup attribute. That would make your example look like this:
lock_t x __attribute__((cleanup(unlock))) = NULL;
lock(&x);
And unlock will be called with a pointer to the variable when it goes out of scope. This integrates with other language features like return and goto, and even with exceptions in mixed C/C++ projects.
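A slightly fuller sketch of the cleanup attribute, assuming GCC or Clang. The toy lock_t type, the unlock_calls counter and the guarded_work function are made up for the demonstration and do not come from XV6.

```c
/* Assumes GCC or Clang: __attribute__((cleanup)) is a compiler extension. */
typedef struct { int held; } lock_t;   /* hypothetical toy lock */

static int unlock_calls;               /* counts cleanups, for demonstration */

static void lock(lock_t *l)   { l->held = 1; }
static void unlock(lock_t *l) { l->held = 0; ++unlock_calls; }

static int guarded_work(int fail_early)
{
    /* unlock(&x) runs automatically whenever x goes out of scope. */
    lock_t x __attribute__((cleanup(unlock))) = { 0 };
    lock(&x);

    if (fail_early)
        return -1;     /* the cleanup still runs on this early return */

    return 0;          /* and on the normal path */
}
```

Note that the attribute sits between the variable name and its initializer, and that the cleanup function receives a pointer to the variable.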

How to give readable names to elements of an array in C?

I'm inexperienced with C, and working on a microcontroller with messages stored in arrays where each byte does something different. How do I give each element of the array a human-readable name instead of referencing them as msg[1], msg[2], etc.?
Is this what structs are for? But "you cannot make assumptions about the binary layout of a structure, as it may have padding between fields."
Should I just use macros like this? (I know "macros are bad", but the code is already full of them)
#define MSG_ID msg[0]
#define MSG_COMMAND msg[1]
Oh! Or I guess I could just do
MSG_ID = 0;
MSG_COMMAND = 1;
MSG[MSG_ID];
That's probably better, if a little uglier.
If you want to go that route, use a macro, for sure, but make them better than what you suggest:
#define MSG_ID(x) (x)[0]
#define MSG_COMMAND(x) (x)[1]
Which will allow the code to name the arrays in ways that make sense, instead of ways that work with the macro.
Otherwise, you can define constants for the indexes instead (sorry I could not come up with better names for them...):
#define IDX_MSG_ID 0
#define IDX_MSG_COMMAND 1
And macros are not bad if they are used responsibly. This kind of "simple aliasing" is one of the cases where macros help making the code easier to read and understand, provided the macros are named appropriately and well documented.
Edit: per @Lundin's comments, the best way to improve readability and safety of the code is to introduce a type and a set of functions, like so (assuming you store in char and a message is MESSAGE_SIZE long):
typedef char MESSAGE[MESSAGE_SIZE];
char get_message_id(MESSAGE msg) { return msg[0]; }
char get_message_command(MESSAGE msg) { return msg[1]; }
This method, though it brings some level of type safety and allows you to abstract the storage away from the use, also introduces call overhead, which in the microcontroller world might be problematic. The compiler may alleviate some of this by inlining the functions (which you could encourage by adding the inline keyword to the definitions).
The most natural concept for naming a set of integers in C are enumerations:
enum msg_pos { msg_id, msg_command, };
By default they start counting at 0 and increment by one. You would then access a field by msg[msg_id] for example.
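A short sketch of the enumeration approach. The third field and the MSG_POS_COUNT sentinel are assumptions added for illustration; the question only mentions an ID and a command byte.

```c
#include <stdint.h>

/* Hypothetical message layout: one enumerator per byte position. */
enum msg_pos {
    MSG_POS_ID,        /* byte 0 */
    MSG_POS_COMMAND,   /* byte 1 */
    MSG_POS_LENGTH,    /* byte 2 */
    MSG_POS_COUNT      /* number of named bytes, not a real field */
};

/* Read the command byte by name instead of by a bare index. */
static uint8_t msg_command(const uint8_t *msg)
{
    return msg[MSG_POS_COMMAND];
}
```

Keeping a trailing count enumerator means a buffer can be declared as uint8_t msg[MSG_POS_COUNT] and grows automatically when a position is added.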
It's fine to use a struct if you take the time to figure out how your compiler lays them out, and structs can be very useful in embedded programming. It will always lay out the members in order, but there may be padding if you are not on an 8-bit micro. GCC has a "packed" attribute you can apply to the struct to prohibit padding, and some other compilers have a similar feature.
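As a rough illustration of the packed attribute, assuming GCC or Clang (the field names and their order are invented for the example):

```c
#include <stddef.h>
#include <stdint.h>

/* Assumes GCC or Clang; other compilers spell this differently.
   Without "packed", length would typically be aligned to offset 2. */
struct __attribute__((packed)) msg_header {
    uint8_t  id;        /* byte 0 */
    uint16_t length;    /* bytes 1-2 */
    uint8_t  command;   /* byte 3 */
};
```

Packing removes the padding but not the other layout concerns: multi-byte fields still use the CPU's byte order, and unaligned access may be slow or unsupported on some targets.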

What are C macros useful for?

I have written a little bit of C, and I can read it well enough to get a general idea of what it is doing, but every time I have encountered a macro it has thrown me completely. I end up having to remember what the macro is and substitute it in my head as I read. The ones that I have encountered that were intuitive and easy to understand were always like little mini functions, so I always wondered why they weren't just functions.
I can understand the need to define different build types for debug or cross platform builds in the preprocessor but the ability to define arbitrary substitutions seems to be useful only to make an already difficult language even more difficult to understand.
Why was such a complex preprocessor introduced for C? And does anyone have an example of using it that will make me understand why it still seems to be used for purposes other than simple if #debug style conditional compilations?
Edit:
Having read a number of answers I still just don't get it. The most common answer is to inline code. If the inline keyword doesn't do it then either it has a good reason to not do it, or the implementation needs fixing. I don't understand why a whole different mechanism is needed that means "really inline this code" (aside from the code being written before inline was around). I also don't understand the idea that was mentioned that "if it's too silly to be put in a function". Surely any piece of code that takes an input and produces an output is best put in a function. I think I may not be getting it because I am not used to the micro optimisations of writing C, but the preprocessor just feels like a complex solution to a few simple problems.
I end up having to remember what the macro is and substitute it in my head as I read.
That seems to reflect poorly on the naming of the macros. I would assume you wouldn't have to emulate the preprocessor if it were a log_function_entry() macro.
The ones that I have encountered that were intuitive and easy to understand were always like little mini functions, so I always wondered why they weren't just functions.
Usually they should be, unless they need to operate on generic parameters.
#define max(a,b) ((a)<(b)?(b):(a))
will work on any type with an < operator.
More than just functions, macros let you perform operations using the symbols in the source file. That means you can create a new variable name, or reference the source file and line number the macro is on.
In C99, macros also allow you to call variadic functions such as printf
#define log_message(guard,format,...) \
do { if (guard) printf("%s:%d: " format "\n", __FILE__, __LINE__, __VA_ARGS__); } while (0)
log_message( foo == 7, "x %d", x);
In which the format works like printf. If the guard is true, it outputs the message along with the file and line number that printed the message. If it were a function call, it would not know the file and line you called it from, and using vprintf would be a bit more work.
This excerpt pretty much sums up my view on the matter, by comparing several ways that C macros are used, and how to implement them in D.
copied from DigitalMars.com
Back when C was invented, compiler technology was primitive. Installing a text macro preprocessor onto the front end was a straightforward and easy way to add many powerful features. The increasing size & complexity of programs have illustrated that these features come with many inherent problems. D doesn't have a preprocessor; but D provides a more scalable means to solve the same problems.
Macros
Preprocessor macros add powerful features and flexibility to C. But they have a downside:
Macros have no concept of scope; they are valid from the point of definition to the end of the source. They cut a swath across .h files, nested code, etc. When #include'ing tens of thousands of lines of macro definitions, it becomes problematical to avoid inadvertent macro expansions.
Macros are unknown to the debugger. Trying to debug a program with symbolic data is undermined by the debugger only knowing about macro expansions, not the macros themselves.
Macros make it impossible to tokenize source code, as an earlier macro change can arbitrarily redo tokens.
The purely textual basis of macros leads to arbitrary and inconsistent usage, making code using macros error prone. (Some attempt to resolve this was introduced with templates in C++.)
Macros are still used to make up for deficits in the language's expressive capability, such as for "wrappers" around header files.
Here's an enumeration of the common uses for macros, and the corresponding feature in D:
Defining literal constants:
The C Preprocessor Way
#define VALUE 5
The D Way
const int VALUE = 5;
Creating a list of values or flags:
The C Preprocessor Way
int flags;
#define FLAG_X 0x1
#define FLAG_Y 0x2
#define FLAG_Z 0x4
...
flags |= FLAG_X;
The D Way
enum FLAGS { X = 0x1, Y = 0x2, Z = 0x4 };
FLAGS flags;
...
flags |= FLAGS.X;
Setting function calling conventions:
The C Preprocessor Way
#ifndef _CRTAPI1
#define _CRTAPI1 __cdecl
#endif
#ifndef _CRTAPI2
#define _CRTAPI2 __cdecl
#endif
int _CRTAPI2 func();
The D Way
Calling conventions can be specified in blocks, so there's no need to change it for every function:
extern (Windows)
{
int onefunc();
int anotherfunc();
}
Simple generic programming:
The C Preprocessor Way
Selecting which function to use based on text substitution:
#ifdef UNICODE
int getValueW(wchar_t *p);
#define getValue getValueW
#else
int getValueA(char *p);
#define getValue getValueA
#endif
The D Way
D enables declarations of symbols that are aliases of other symbols:
version (UNICODE)
{
int getValueW(wchar[] p);
alias getValueW getValue;
}
else
{
int getValueA(char[] p);
alias getValueA getValue;
}
There are more examples on the DigitalMars website.
They are a programming language (a simpler one) on top of C, so they are useful for doing metaprogramming at compile time... in other words, you can write macro code that generates C code in fewer lines and less time than it would take to write it directly in C.
They are also very useful to write "function like" expressions that are "polymorphic" or "overloaded"; e.g. a max macro defined as:
#define max(a,b) ((a)>(b)?(a):(b))
is useful for any numeric type; and in C you could not write:
int max(int a, int b) {return a>b?a:b;}
float max(float a, float b) {return a>b?a:b;}
double max(double a, double b) {return a>b?a:b;}
...
even if you wanted, because you cannot overload functions.
And not to mention conditional compiling and file including (that are also part of the macro language)...
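Regarding the overloading point above: since C11, the _Generic selection can dispatch to differently named functions by argument type, which gets close to overloading. This is a sketch under the assumption of a C11 compiler; the max_int, max_float and max_double helper names are made up.

```c
/* Assumes a C11 compiler; the helper names are invented for this sketch. */
static inline int    max_int(int a, int b)          { return a > b ? a : b; }
static inline float  max_float(float a, float b)    { return a > b ? a : b; }
static inline double max_double(double a, double b) { return a > b ? a : b; }

/* The controlling expression (a) + (b) is only inspected for its type and
   never evaluated, so each argument is evaluated exactly once in the call. */
#define max(a, b) _Generic((a) + (b), \
    int:    max_int,                  \
    float:  max_float,                \
    double: max_double)((a), (b))
```

Unlike the plain macro, max(i++, 0) increments i only once here. Types that are not listed, such as long, produce a compile error rather than a silent conversion.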
Macros allow someone to modify the program behavior during compilation time. Consider this:
C constants allow fixing program behavior at development time
C variables allow modifying program behavior at execution time
C macros allow modifying program behavior at compilation time
At compilation time means that unused code won't even go into the binary and that the build process can modify the values, as long as it's integrated with the macro preprocessor. Example: make ARCH=arm (assumes forwarding macro definition as cc -DARCH=arm)
Simple examples:
(from glibc limits.h, define the largest value of long)
#if __WORDSIZE == 64
#define LONG_MAX 9223372036854775807L
#else
#define LONG_MAX 2147483647L
#endif
Verifies (using the #define __WORDSIZE) at compile time if we're compiling for 32 or 64 bits. With a multilib toolchain, using parameters -m32 and -m64 may automatically change bit size.
(POSIX version request)
#define _POSIX_C_SOURCE 200809L
Requests during compilation time POSIX 2008 support. The standard library may support many (incompatible) standards but with this definition, it will provide the correct function prototypes (example: getline(), no gets(), etc.). If the library doesn't support the standard it may give an #error during compile time, instead of crashing during execution, for example.
(hardcoded path)
#ifndef LIBRARY_PATH
#define LIBRARY_PATH "/usr/lib"
#endif
Defines, at compile time, a hardcoded directory. It could be changed with -DLIBRARY_PATH='"/home/user/lib"', for example (the inner quotes are needed so the macro still expands to a string literal). If that were a const char *, how would you configure it during compilation?
(pthread.h, complex definitions at compile time)
# define PTHREAD_MUTEX_INITIALIZER \
{ { 0, 0, 0, 0, 0, 0, { 0, 0 } } }
Large pieces of text that otherwise couldn't be simplified may be declared (always at compile time). It's not possible to do this with functions or constants (at compile time).
To avoid really complicating things and to avoid suggesting poor coding styles, I won't give an example of code that compiles on different, incompatible operating systems. Use your cross build system for that, but it should be clear that the preprocessor allows that without help from the build system, without breaking compilation because of absent interfaces.
Finally, think about the importance of conditional compilation on embedded systems, where processor speed and memory are limited and systems are very heterogeneous.
Now, if you ask, is it possible to replace all macro constant definitions and function calls with proper definitions ? The answer is yes, but it won't simply make the need for changing program behavior during compilation go away. The preprocessor would still be required.
Remember that macros (and the pre-processor) come from the earliest days of C. They used to be the ONLY way to do inline 'functions' (because, of course, inline is a very recent keyword), and they are still the only way to FORCE something to be inlined.
Also, macros are the only way you can do such tricks as inserting the file and line into string constants at compile time.
These days, many of the things that macros used to be the only way to do are better handled through newer mechanisms. But they still have their place, from time to time.
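The file-and-line trick mentioned above can be sketched as follows. The macro names STRINGIFY and SOURCE_LOCATION and the where_am_i function are assumptions chosen for the example.

```c
/* Two-level expansion: STRINGIFY_ turns its argument into a string literal,
   and STRINGIFY lets __LINE__ expand to its number before that happens. */
#define STRINGIFY_(x) #x
#define STRINGIFY(x)  STRINGIFY_(x)

/* Adjacent string literals are concatenated by the compiler, so this is
   one constant of the form "file.c:123", built entirely at compile time. */
#define SOURCE_LOCATION __FILE__ ":" STRINGIFY(__LINE__)

static const char *where_am_i(void)
{
    return SOURCE_LOCATION;
}
```

A function cannot do this, because __FILE__ and __LINE__ inside a function always describe the function's own location rather than the caller's.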
Apart from inlining for efficiency and conditional compilation, macros can be used to raise the abstraction level of low-level C code. C doesn't really insulate you from the nitty-gritty details of memory and resource management and exact layout of data, and supports very limited forms of information hiding and other mechanisms for managing large systems. With macros, you are no longer limited to using only the base constructs in the C language: you can define your own data structures and coding constructs (including classes and templates!) while still nominally writing C!
Preprocessor macros offer a surprisingly capable language executed at compile time (strictly speaking it is not Turing-complete, because a macro cannot expand itself recursively, but it goes a long way in practice). One of the impressive (and slightly scary) examples of this is over on the C++ side: the Boost Preprocessor library uses the C99/C++98 preprocessor to build (relatively) safe programming constructs which are then expanded to whatever underlying declarations and code you input, whether C or C++.
In practice, I'd recommend regarding preprocessor programming as a last resort, when you don't have the latitude to use high level constructs in safer languages. But sometimes it's good to know what you can do if your back is against the wall and the weasels are closing in...!
From Computer Stupidities:
I've seen this code excerpt in a lot of freeware gaming programs for UNIX:
/*
* Bit values.
*/
#define BIT_0 1
#define BIT_1 2
#define BIT_2 4
#define BIT_3 8
#define BIT_4 16
#define BIT_5 32
#define BIT_6 64
#define BIT_7 128
#define BIT_8 256
#define BIT_9 512
#define BIT_10 1024
#define BIT_11 2048
#define BIT_12 4096
#define BIT_13 8192
#define BIT_14 16384
#define BIT_15 32768
#define BIT_16 65536
#define BIT_17 131072
#define BIT_18 262144
#define BIT_19 524288
#define BIT_20 1048576
#define BIT_21 2097152
#define BIT_22 4194304
#define BIT_23 8388608
#define BIT_24 16777216
#define BIT_25 33554432
#define BIT_26 67108864
#define BIT_27 134217728
#define BIT_28 268435456
#define BIT_29 536870912
#define BIT_30 1073741824
#define BIT_31 2147483648
A much easier way of achieving this is:
#define BIT_0 0x00000001
#define BIT_1 0x00000002
#define BIT_2 0x00000004
#define BIT_3 0x00000008
#define BIT_4 0x00000010
...
#define BIT_28 0x10000000
#define BIT_29 0x20000000
#define BIT_30 0x40000000
#define BIT_31 0x80000000
An easier way still is to let the compiler do the calculations:
#define BIT_0 (1u)
#define BIT_1 (1u << 1)
#define BIT_2 (1u << 2)
#define BIT_3 (1u << 3)
#define BIT_4 (1u << 4)
...
#define BIT_28 (1u << 28)
#define BIT_29 (1u << 29)
#define BIT_30 (1u << 30)
#define BIT_31 (1u << 31) /* unsigned: shifting a signed 1 into bit 31 of a 32-bit int is undefined */
But why go to all the trouble of defining 32 constants? The C language also has parameterized macros. All you really need is:
#define BIT(x) (1u << (x))
Anyway, I wonder if the guy who wrote the original code used a calculator or just computed it all out on paper.
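As a rough sketch of how a parameterized bit macro like this tends to be used in practice: the helper names below (set_bit, clear_bit, test_bit) are made up for illustration, and the cast to uint32_t is my own choice so that bit 31 is well defined even where int is 32 bits or narrower.

```c
#include <stdint.h>

/* Cast to a 32-bit unsigned type so that BIT(31) does not overflow a signed int. */
#define BIT(x) ((uint32_t)1 << (x))

/* Hypothetical helpers, for illustration only. */
static uint32_t set_bit(uint32_t flags, unsigned n)
{
    return flags | BIT(n);
}

static uint32_t clear_bit(uint32_t flags, unsigned n)
{
    return flags & ~BIT(n);
}

static int test_bit(uint32_t flags, unsigned n)
{
    return (flags & BIT(n)) != 0;
}
```

One macro replaces all 32 constants, and the bit number can be a run-time value rather than a name chosen at edit time.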
That's just one possible use of macros.
I will add to what's already been said.
Because macros work by text substitution, they allow you to do very useful things which wouldn't be possible using functions.
Here are a few cases where macros can be really useful:
/* Get the number of elements in array 'A'. */
#define ARRAY_LENGTH(A) (sizeof(A) / sizeof((A)[0]))
This is a very popular and frequently used macro. It is very handy when, for example, you need to iterate through an array. Note that it only works on a real array, not on a pointer to its first element.
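#include <stddef.h>
#include <stdio.h>

int main(void)
{
    int a[] = {1, 2, 3, 4, 5};
    size_t i; /* sizeof yields a size_t, so use the same type for the index */
    for (i = 0; i < ARRAY_LENGTH(a); ++i) {
        printf("a[%zu] = %d\n", i, a[i]);
    }
    return 0;
}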
Here it doesn't matter if another programmer adds five more elements to a in the declaration. The for-loop will always iterate through all elements.
The C library's functions to compare memory and strings are quite ugly to use.
You write:
char *str = "Hello, world!";
if (strcmp(str, "Hello, world!") == 0) {
/* ... */
}
or
char *str = "Hello, world!";
if (!strcmp(str, "Hello, world!")) {
/* ... */
}
to check whether str points to a string equal to "Hello, world!". I personally think that both of these solutions look quite ugly and confusing (especially !strcmp(...)).
Here are two neat macros some people (myself included) use when they need to compare strings or memory using strcmp/memcmp:
/* Compare strings */
#define STRCMP(A, o, B) (strcmp((A), (B)) o 0)
/* Compare memory */
#define MEMCMP(A, o, B, N) (memcmp((A), (B), (N)) o 0) /* memcmp needs the byte count N */
Now you can write the code like this:
char *str = "Hello, world!";
if (STRCMP(str, ==, "Hello, world!")) {
/* ... */
}
Here the intention is a lot clearer!
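For completeness, here is a small sketch of the memory variant in use. The fourth macro parameter N (the byte count that memcmp requires) and the functions has_magic and sorts_before are my own additions for illustration, as is the "DATA" magic value.

```c
#include <string.h>

/* The comparison operator is passed as the middle argument. */
#define STRCMP(A, o, B)    (strcmp((A), (B)) o 0)
#define MEMCMP(A, o, B, N) (memcmp((A), (B), (N)) o 0)

/* Returns 1 if the first four bytes of buf match the (made-up) magic "DATA". */
static int has_magic(const unsigned char *buf)
{
    static const unsigned char magic[4] = { 'D', 'A', 'T', 'A' };
    return MEMCMP(buf, ==, magic, sizeof magic);
}

/* Returns 1 if a sorts strictly before b. */
static int sorts_before(const char *a, const char *b)
{
    return STRCMP(a, <, b);
}
```

The operator sits between the two operands, so the call reads like the comparison it performs.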
These are cases where macros are used for things functions cannot accomplish. Macros should not be used to replace functions, but they have other good uses.
One of the cases where macros really shine is code generation.
I used to work on an old C++ system that used a plugin system with its own way of passing parameters to the plugin (using a custom map-like structure). Some simple macros were used to deal with this quirk and allowed us to use real C++ classes and functions with normal parameters in the plugins without too many problems, with all the glue code being generated by macros.
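I can't reproduce that plugin code, but a common way to generate glue code with macros in plain C is the so-called X-macro technique, sketched below. This is a different, generic illustration, not the system described above; PARAM_LIST, the PARAM_* identifiers and param_lookup are all hypothetical names.

```c
#include <string.h>

/* Hypothetical list of plugin parameters: one line per entry. */
#define PARAM_LIST(X)         \
    X(PARAM_WIDTH,  "width")  \
    X(PARAM_HEIGHT, "height") \
    X(PARAM_DEPTH,  "depth")

/* Expansion 1: an enum of identifiers. */
#define AS_ENUM(id, name) id,
enum param_id { PARAM_LIST(AS_ENUM) PARAM_COUNT };
#undef AS_ENUM

/* Expansion 2: a table of names generated from the same list,
 * so the two cannot drift apart. */
#define AS_NAME(id, name) name,
static const char *const param_names[PARAM_COUNT] = { PARAM_LIST(AS_NAME) };
#undef AS_NAME

/* Look up an id by name; returns PARAM_COUNT if the name is unknown. */
static enum param_id param_lookup(const char *name)
{
    int i;
    for (i = 0; i < PARAM_COUNT; ++i) {
        if (strcmp(param_names[i], name) == 0) {
            return (enum param_id)i;
        }
    }
    return PARAM_COUNT;
}
```

Adding a parameter means adding one line to PARAM_LIST; the enum and the name table are regenerated together.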
Given the comments in your question, what you may not fully appreciate is that calling a function can entail a fair amount of overhead. The parameters and key registers may have to be copied to the stack on the way in, and the stack unwound on the way out. This was particularly true of the older Intel chips. Macros let the programmer keep the abstraction of a function (almost) while avoiding the costly overhead of a function call. The inline keyword is only advisory, and the compiler may not always get it right. The glory and peril of 'C' is that you can usually bend the compiler to your will.
In your bread and butter, day-to-day application programming this kind of micro-optimization (avoiding function calls) is generally worse than useless, but if you are writing a time-critical function called by the kernel of an operating system, then it can make a huge difference.
Unlike regular functions, macros can expand into control-flow constructs (if, while, for, ...) that wrap the caller's own block of code. Here's an example:
#include <stdio.h>
#define Loop(i,x) for ((i) = 0; (i) < (x); (i)++)
int main(int argc, char *argv[])
{
int i;
int x = 5;
Loop(i, x)
{
printf("%d", i); // Output: 01234
}
return 0;
}
It's good for inlining code and avoiding function call overhead, and it is also useful when you want to change the behaviour later without editing lots of places. It's not suited to complex things, but for simple lines of code that you want to inline, it's not bad.
By leveraging the C preprocessor's text manipulation, one can construct the C equivalent of a polymorphic data structure. Using this technique we can build a reliable toolbox of primitive data structures that can be used in any C program, since they take advantage of C syntax and not the specifics of any particular implementation.
A detailed explanation of how to use macros for managing data structures is given here - http://multi-core-dump.blogspot.com/2010/11/interesting-use-of-c-macros-polymorphic.html
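To give a flavour of the idea without following the link, here is a minimal sketch of a type-generic, fixed-capacity stack generated by a macro. It is not the code from the linked article; DEFINE_STACK and the generated names are assumptions chosen for this example.

```c
#include <stddef.h>

/* DEFINE_STACK(T, NAME, CAP) generates a fixed-capacity stack of T,
 * plus NAME_init, NAME_push and NAME_pop. Hypothetical names throughout. */
#define DEFINE_STACK(T, NAME, CAP)                                   \
    typedef struct { T items[CAP]; size_t count; } NAME;             \
    static void NAME##_init(NAME *s) { s->count = 0; }               \
    static int NAME##_push(NAME *s, T value)                         \
    {                                                                \
        if (s->count == (size_t)(CAP)) return 0;  /* full */         \
        s->items[s->count++] = value;                                \
        return 1;                                                    \
    }                                                                \
    static int NAME##_pop(NAME *s, T *out)                           \
    {                                                                \
        if (s->count == 0) return 0;              /* empty */        \
        *out = s->items[--s->count];                                 \
        return 1;                                                    \
    }

/* Two instantiations of the same "template" for different element types. */
DEFINE_STACK(int, int_stack, 4)
DEFINE_STACK(double, double_stack, 8)
```

Each DEFINE_STACK line produces a separate, fully typed set of functions (int_stack_push, double_stack_pop and so on), which is roughly what a C++ template would give you.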
Macros let you get rid of copy-pasted fragments, which you can't eliminate in any other way.
For instance (the real code, syntax of VS 2010 compiler):
for each (auto entry in entries)
{
sciter::value item;
item.set_item("DisplayName", entry.DisplayName);
item.set_item("IsFolder", entry.IsFolder);
item.set_item("IconPath", entry.IconPath);
item.set_item("FilePath", entry.FilePath);
item.set_item("LocalName", entry.LocalName);
items.append(item);
}
This is the place where you pass a field value under the same name into a script engine. Is this copy-pasted? Yes. DisplayName is used as a string for a script and as a field name for the compiler. Is that bad? Yes. If you refactor your code and rename LocalName to RelativeFolderName (as I did) and forget to do the same with the string (as I did), the script will work in a way you don't expect (in fact, in my example it depends on whether you also forgot to rename the field in a separate script file, but if the script is used for serialization, it would be a 100% bug).
If you use a macro for this, there will be no room for the bug:
for each (auto entry in entries)
{
#define STR_VALUE(arg) #arg
#define SET_ITEM(field) item.set_item(STR_VALUE(field), entry.field)
sciter::value item;
SET_ITEM(DisplayName);
SET_ITEM(IsFolder);
SET_ITEM(IconPath);
SET_ITEM(FilePath);
SET_ITEM(LocalName);
#undef SET_ITEM
#undef STR_VALUE
items.append(item);
}
Unfortunately, this opens the door to other types of bugs. You can make a typo while writing the macro and never see the broken code, because the compiler doesn't show what the code looks like after preprocessing unless you ask for it (for example with gcc -E or MSVC's /P option). Someone else could use the same name (that's why I "release" macros ASAP with #undef). So, use it wisely. If you see another way of getting rid of copy-pasted code (such as functions), use that way. If you see that getting rid of copy-pasted code with macros isn't worth the result, keep the copy-pasted code.
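The same stringification trick works in plain C. The sketch below is a hypothetical analogue of the snippet above, not a translation of it: struct entry, its members and FORMAT_FIELD are invented for the example.

```c
#include <stdio.h>

/* Hypothetical record type, standing in for the entry in the snippet above. */
struct entry {
    int is_folder;
    int size_kb;
};

/* #field turns the member name into a string literal, so the label that is
 * written out and the member that is read come from the same token.
 * Evaluates to snprintf's return value (the length of the formatted text). */
#define FORMAT_FIELD(buf, len, obj, field) \
    snprintf((buf), (len), "%s=%d", #field, (obj).field)
```

Renaming size_kb in the struct now forces a compile error at every FORMAT_FIELD use until the call site is updated, and the string changes with it.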
One of the obvious reasons is that by using a macro, the code will be expanded at compile time, and you get a pseudo function-call without the call overhead.
Otherwise, you can also use it for symbolic constants, so that you don't have to edit the same value in several places to change one small thing.
Macros .. for when your &#(*$& compiler just refuses to inline something.
That should be a motivational poster, no?
In all seriousness, google preprocessor abuse (you may see a similar SO question as the #1 result). If I'm writing a macro that goes beyond the functionality of assert(), I usually try to see if my compiler would actually inline a similar function.
Others will argue against using #if for conditional compilation .. they would rather you:
if (RUNNING_ON_VALGRIND)
rather than
#if RUNNING_ON_VALGRIND
.. for debugging purposes, since you can see the if() but not #if in a debugger. Then we dive into #ifdef vs #if.
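Here is a minimal sketch of the two styles side by side. TRACE_ENABLED is a hypothetical switch (RUNNING_ON_VALGRIND itself comes from Valgrind's headers, which I don't want to assume here), and trace_calls exists only so the effect can be observed.

```c
#include <stdio.h>

/* Hypothetical switch; normally set by the build system, e.g. -DTRACE_ENABLED=1. */
#ifndef TRACE_ENABLED
#define TRACE_ENABLED 0
#endif

static int trace_calls = 0;   /* counts how often tracing actually ran */

/* Style 1: ordinary if. Both branches are parsed and type-checked in every
 * build; with a constant condition the optimizer can drop the dead branch. */
static int square_checked(int x)
{
    if (TRACE_ENABLED) {
        ++trace_calls;
        fprintf(stderr, "square_checked(%d)\n", x);
    }
    return x * x;
}

/* Style 2: #if. The disabled code is removed before the compiler proper sees
 * it, so it can go stale (e.g. refer to a renamed variable) without any error. */
static int square_preprocessed(int x)
{
#if TRACE_ENABLED
    ++trace_calls;
    fprintf(stderr, "square_preprocessed(%d)\n", x);
#endif
    return x * x;
}
```

Besides debugger visibility, the plain if has the side benefit that the tracing code keeps compiling even in builds where it is switched off.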
If it's under 10 lines of code, try to inline it. If it can't be inlined, try to optimize it. If it's too silly to be a function, make a macro.
While I'm not a big fan of macros and don't tend to write much C anymore, based on my current tasking, something like this (which could obviously have some side-effects) is convenient:
#define MIN(X, Y) ((X) < (Y) ? (X) : (Y))
Now I haven't written anything like that in years, but 'functions' like that were all over code that I maintained earlier in my career. I guess the expansion could be considered convenient.
I didn't see anyone mention this, so regarding function-like macros, e.g.:
#define MIN(X, Y) ((X) < (Y) ? (X) : (Y))
Generally it's recommended to avoid using macros when not necessary, for many reasons, readability being the main concern. So:
When should you use these over a function?
Almost never, since there's a more readable alternative which is inline, see https://www.greenend.org.uk/rjk/tech/inline.html
or http://www.cplusplus.com/articles/2LywvCM9/ (the second link is a C++ page, but the point is applicable to C compilers as far as I know).
Now, the difference is that macros are handled by the preprocessor and inline is handled by the compiler. For the generated code there's usually no practical difference nowadays, although a macro still pastes its arguments in textually (so they may be evaluated more than once) and does no type checking.
when is it appropriate to use these?
For small functions (two or three lines at most). The goal is to gain some advantage at run time: function-like macros (and inline functions) are code replacements done during preprocessing (or during compilation in the case of inline) and are not real functions living in memory, so there's no function call overhead (more details in the linked pages).
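To make the side-effect problem with MIN concrete, here is a small sketch comparing the macro with an inline function. min_int, next_value and the two evaluations_* helpers are hypothetical names used only for this demonstration.

```c
/* Function-like macro: each argument is pasted in textually, so the
 * argument that "wins" the comparison is evaluated a second time. */
#define MIN(X, Y) ((X) < (Y) ? (X) : (Y))

/* Inline function: arguments are evaluated exactly once and type-checked. */
static inline int min_int(int x, int y)
{
    return x < y ? x : y;
}

static int calls = 0;   /* counts evaluations of next_value() */

/* Hypothetical function with a side effect. */
static int next_value(void)
{
    ++calls;
    return calls;
}

/* Returns how many times next_value() ran inside MIN(next_value(), 100). */
static int evaluations_with_macro(void)
{
    calls = 0;
    (void)MIN(next_value(), 100);
    return calls;
}

/* Returns how many times next_value() ran inside min_int(next_value(), 100). */
static int evaluations_with_inline(void)
{
    calls = 0;
    (void)min_int(next_value(), 100);
    return calls;
}
```

With the macro, next_value() runs twice (once for the comparison, once for the result); with the inline function it runs once. That difference is the main reason to prefer inline when both are available.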

Resources