if defined in C struct - c

Can #if defined(XX) be placed inside a struct/union in a header file?
struct
{
int i;
#if defined(xx)
int j;
#endif
}t;
I am dealing with a large file base of .c and .h files and I need to know the possible cons of this type of usage.

While this is completely valid, a definite con is that any time you need to use t.j you also have to surround that code with #if defined(xx); otherwise you will get compiler errors.
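One way to keep that guard from spreading through a large code base is to hide the conditional member behind a small accessor, so that only one place tests the macro. This is a minimal sketch, not something from the question: the names struct settings and settings_get_j are hypothetical, and XX stands in for the question's xx.

```c
#include <stdbool.h>

/* Hypothetical struct with a member that only exists when XX is defined. */
struct settings {
    int i;
#if defined(XX)
    int j;
#endif
};

/* Stores j in *out and returns true if the member exists in this build;
   otherwise leaves *out untouched and returns false. */
static bool settings_get_j(const struct settings *s, int *out)
{
#if defined(XX)
    *out = s->j;
    return true;
#else
    (void)s;
    (void)out;
    return false;
#endif
}
```

Callers then test the return value instead of repeating the #if around every use of the member.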

Sure you can. The preprocessor can be used on anything; there is no need to feed it C. The con of this usage is that you have a struct which changes size depending on whether xx is defined or not. This is asking for trouble, because a library built with this define and somebody using the library without the define end up with different structs.
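To make the size problem concrete, the two views of the struct can be written out side by side. This is only an illustration: the two struct names are hypothetical stand-ins for "the struct as the library was compiled" and "the struct as the caller sees it", and the exact byte counts depend on the platform.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the two views of the same struct:
   the library was compiled with xx defined, the caller without it. */
struct t_as_built_by_library { int i; int j; };
struct t_as_seen_by_caller   { int i; };

/* Number of bytes by which the two sides disagree about the struct size. */
static size_t layout_mismatch_bytes(void)
{
    return sizeof(struct t_as_built_by_library)
         - sizeof(struct t_as_seen_by_caller);
}
```

On typical platforms the difference is sizeof(int), which is enough to break arrays of the struct, copies made with memcpy, and any member declared after the conditional one.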

Preprocessor directives such as #if can be placed anywhere in your program. They have no actual relationship to the C code (or anything else) that is present in the text (except comments), since they are processed before the compilation phase. You can do stupid things like the code below, although it is generally a bad idea.
int foo(int x)
{
#if defined MONKEY
return 0;
}
int bar(int x)
{
#endif
return x;
}

Related

Is it possible to determine the type of object on-line in one-pass, including macros?

I have a very simple parser that handles a small subset of the C language; it looks at a well-formed translation unit and, in one pass and online, determines what the global symbols and types are (function, struct, union, variable), provided one is not trying to trick it. However, I'm having trouble determining whether it's a struct or a function in this example:
#define CAT_(x, y) x ## y
#define CAT(x, y) CAT_(x, y)
#define F_(thing) CAT(foo, thing)
static struct F_(widget) { int i; }
F_(widget);
static struct F_(widget) a(void) { int i;
return i = 42, F_(widget).i = i, F_(widget); }
int main(void) {
a();
return 0;
}
It assumes that the parenthesis means a function and parses it this way:
[ID<stati>, ID<struc>, ID<F_>, LPAR<(>, ID<widge>, RPAR<)>, LBRA<{>, RBRA<}>].
[ID<F_>, LPAR<(>, ID<widge>, RPAR<)>, SEMI<;>].
[ID<stati>, ID<struc>, ID<F_>, LPAR<(>, ID<widge>, RPAR<)>, ID<a>, LPAR<(>, ID<void>, RPAR<)>, LBRA<{>, RBRA<}>].
[ID<int>, ID<main>, LPAR<(>, ID<void>, RPAR<)>, LBRA<{>, RBRA<}>].
When in fact what it thinks is a function at the top is actually a struct declaration, and the top two should be concatenated. What is the simplest way to recognise this?
Two-pass, emulating what actually happens in macro replacement; I would have to build a subset of the C pre-processor;
like the C lexer hack, except with macros;
backtrack with the semicolon at the end; that seems hard;
somehow recognise the difference at the beginning, (probably requiring me to add struct to my symbol table.)
As mentioned in the comments, if you want to be able to handle preprocessor macros, you will need to implement (or borrow) a preprocessor.
Writing a preprocessor mostly involves coming to terms with the formal description in the C standard, but it is not otherwise particularly challenging. It can be done online with the resulting token stream fed into a parser, so it doesn't really require a second pass.
(This depends on how you define a "pass" I suppose, but in my usage a one-pass parser reads the input only once without creating and rereading a temporary file. And that is definitely doable.)
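To see why the parser cannot decide without macro expansion, it helps to look at what the preprocessor actually hands over in place of F_(widget). The sketch below reuses the question's three macros and adds two hypothetical helpers, STR_ and STR, that turn the fully expanded result into a string.

```c
#include <string.h>

#define CAT_(x, y) x ## y
#define CAT(x, y) CAT_(x, y)
#define F_(thing) CAT(foo, thing)

/* Hypothetical helpers: stringify the argument after it is fully expanded. */
#define STR_(x) #x
#define STR(x) STR_(x)

/* The single identifier the compiler proper sees instead of F_(widget). */
static const char *expanded_tag(void)
{
    return STR(F_(widget));
}
```

After expansion, the first declaration reads static struct foowidget { int i; } foowidget; so no parenthesis is left for the parser to mistake for a function.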

Static functions declared in "C" header files

For me it's a rule to define and declare static functions inside source files, I mean .c files.
However in very rare situations I saw people declaring it in the header file.
Since static functions have internal linkage we need to define it in every file we include the header file where the function is declared. This looks pretty odd and far from what we usually want when declaring something as static.
On the other hand, if someone naive tries to use that function without defining it, the compiler will complain. So in some sense it is not really unsafe to do this, even if it looks strange.
My questions are:
What is the problem of declaring static functions in header files?
What are the risks?
What is the impact on compilation time?
Is there any risk in runtime?
First I'd like to clarify my understanding of the situation you describe: The header contains (only) a static function declaration while the C file contains the definition, i.e. the function's source code. For example
some.h:
static void f();
// potentially more declarations
some.c:
#include "some.h"
static void f() { printf("Hello world\n"); }
// more code, some of it potentially using f()
If this is the situation you describe, I take issue with your remark
Since static functions have internal linkage we need to define it in every file we include the header file where the function is declared.
If you declare the function but do not use it in a given translation unit, I don't think you have to define it. gcc accepts that with a warning; the standard does not seem to forbid it, unless I missed something. This may be important in your scenario because translation units which do not use the function but include the header with its declaration don't have to provide an unused definition.
Now let's examine the questions:
What is the problem of declaring static functions in header files?
It is somewhat unusual. Typically, static functions are functions needed in only one file. They are declared static to make that explicit by limiting their visibility. Declaring them in a header therefore is somewhat antithetical. If the function is indeed used in multiple files with identical definitions it should be made external, with a single definition. If only one translation unit actually uses it, the declaration does not belong in a header.
One possible scenario, therefore, is to ensure a uniform function signature for different implementations in the respective translation units. The common header leads to a compile-time error for different return types in C (and C++); different parameter types would cause a compile-time error only in C (but not in C++, because of function overloading).
What are the risks?
I do not see risks in your scenario. (As opposed to also including the function definition in a header which may violate the encapsulation principle.)
What is the impact on compilation time?
A function declaration is small and its complexity is low, so the overhead of having additional function declarations in a header is likely negligible. But if you create and include an additional header for the declaration in many translation units the file handling overhead can be significant (i.e. the compiler idles a lot while it waits for the header I/O)
Is there any risk in runtime? I cannot see any.
This is not an answer to the stated questions, but hopefully shows why one might implement a static (or static inline) function in a header file.
I can personally only think of two good reasons to declare some functions static in a header file:
If the header file completely implements an interface that should only be visible in the current compilation unit
This is extremely rare, but might be useful in e.g. an educational context, at some point during the development of some example library; or perhaps when interfacing to another programming language with minimal code.
A developer might choose to do so if the library or interface implementation is trivial or nearly so, and ease of use (to the developer using the header file) is more important than code size. In these cases, the declarations in the header file often use preprocessor macros, allowing the same header file to be included more than once, providing some sort of crude polymorphism in C.
Here is a practical example: Shoot-yourself-in-the-foot playground for linear congruential pseudorandom number generators. Because the implementation is local to the compilation unit, each compilation unit will get their own copies of the PRNG. This example also shows how crude polymorphism can be implemented in C.
prng32.h:
#if defined(PRNG_NAME) && defined(PRNG_MULTIPLIER) && defined(PRNG_CONSTANT) && defined(PRNG_MODULUS)
#define MERGE3_(a,b,c) a ## b ## c
#define MERGE3(a,b,c) MERGE3_(a,b,c)
#define NAME(name) MERGE3(PRNG_NAME, _, name)
static uint32_t NAME(state) = 0U;
static uint32_t NAME(next)(void)
{
NAME(state) = ((uint64_t)PRNG_MULTIPLIER * (uint64_t)NAME(state) + (uint64_t)PRNG_CONSTANT) % (uint64_t)PRNG_MODULUS;
return NAME(state);
}
#undef NAME
#undef MERGE3
#endif
#undef PRNG_NAME
#undef PRNG_MULTIPLIER
#undef PRNG_CONSTANT
#undef PRNG_MODULUS
An example using the above, example-prng32.c:
#include <stdlib.h>
#include <stdint.h>
#include <stdio.h>
#define PRNG_NAME glibc
#define PRNG_MULTIPLIER 1103515245UL
#define PRNG_CONSTANT 12345UL
#define PRNG_MODULUS 2147483647UL
#include "prng32.h"
/* provides glibc_state and glibc_next() */
#define PRNG_NAME borland
#define PRNG_MULTIPLIER 22695477UL
#define PRNG_CONSTANT 1UL
#define PRNG_MODULUS 2147483647UL
#include "prng32.h"
/* provides borland_state and borland_next() */
int main(void)
{
int i;
glibc_state = 1U;
printf("glibc lcg: Seed %u\n", (unsigned int)glibc_state);
for (i = 0; i < 10; i++)
printf("%u, ", (unsigned int)glibc_next());
printf("%u\n", (unsigned int)glibc_next());
borland_state = 1U;
printf("Borland lcg: Seed %u\n", (unsigned int)borland_state);
for (i = 0; i < 10; i++)
printf("%u, ", (unsigned int)borland_next());
printf("%u\n", (unsigned int)borland_next());
return EXIT_SUCCESS;
}
The reason for marking both the _state variable and the _next() function static is that this way each compilation unit that includes the header file has their own copy of the variables and the functions -- here, their own copy of the PRNG. Each must be separately seeded, of course; and if seeded to the same value, will yield the same sequence.
One should generally shy away from such polymorphism attempts in C, because it leads to complicated preprocessor macro shenanigans, making the implementation much harder to understand, maintain, and modify than necessary.
However, when exploring the parameter space of some algorithm (like here, the types of 32-bit linear congruential generators), this lets us use a single implementation for each of the generators we examine, ensuring there are no implementation differences between them. Note that even this case is more like a development tool, and not something you ought to see in an implementation provided for others to use.
If the header implements simple static inline accessor functions
Preprocessor macros are commonly used to simplify code accessing complicated structure types. static inline functions are similar, except that they also provide type checking at compile time, and can refer to their parameters several times (with macros, that is problematic).
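The problem with macros referring to their parameters several times can be shown with an argument that has a side effect. The following is a small sketch with hypothetical names (SQUARE_MACRO, square, next_value); it counts how often the argument expression is evaluated in each case.

```c
/* Hypothetical macro: its parameter appears twice in the expansion. */
#define SQUARE_MACRO(x) ((x) * (x))

/* Equivalent static inline function: the argument is evaluated once. */
static inline int square(int x)
{
    return x * x;
}

static int calls;

/* Argument with a side effect: counts how often it is evaluated. */
static int next_value(void)
{
    return ++calls;
}

/* Returns how many times next_value() ran for one use of the macro. */
static int evaluations_with_macro(void)
{
    calls = 0;
    (void)SQUARE_MACRO(next_value());
    return calls;
}

/* Returns how many times next_value() ran for one call of the function. */
static int evaluations_with_inline(void)
{
    calls = 0;
    (void)square(next_value());
    return calls;
}
```

The macro version evaluates next_value() twice, the function version once, and the function additionally has its argument type checked against int.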
One practical use case is a simple interface for reading files using low-level POSIX.1 I/O (using <unistd.h> and <fcntl.h> instead of <stdio.h>). I've done this myself when reading very large (dozens of megabytes to gigabytes range) text files containing real numbers (with a custom float/double parser), as the GNU C standard I/O is not particularly fast.
For example, inbuffer.h:
#ifndef INBUFFER_H
#define INBUFFER_H
typedef struct {
unsigned char *head; /* Next buffered byte */
unsigned char *tail; /* Next byte to be buffered */
unsigned char *ends; /* data + size */
unsigned char *data;
size_t size;
int descriptor;
unsigned int status; /* Bit mask */
} inbuffer;
#define INBUFFER_INIT { NULL, NULL, NULL, NULL, 0, -1, 0 }
int inbuffer_open(inbuffer *, const char *);
int inbuffer_close(inbuffer *);
int inbuffer_skip_slow(inbuffer *, const size_t);
int inbuffer_getc_slow(inbuffer *);
static inline int inbuffer_skip(inbuffer *ib, const size_t n)
{
if (ib->head + n <= ib->tail) {
ib->head += n;
return 0;
} else
return inbuffer_skip_slow(ib, n);
}
static inline int inbuffer_getc(inbuffer *ib)
{
if (ib->head < ib->tail)
return *(ib->head++);
else
return inbuffer_getc_slow(ib);
}
#endif /* INBUFFER_H */
Note that the above inbuffer_skip() and inbuffer_getc() do not check if ib is non-NULL; this is typical for such functions. These accessor functions are assumed to be "in the fast path", i.e. called very often. In such cases, even the function call overhead matters (and is avoided with static inline functions, since they are duplicated in the code at the call site).
Trivial accessor functions, like the above inbuffer_skip() and inbuffer_getc(), may also let the compiler avoid the register moves involved in function calls, because functions expect their parameters to be located in specific registers or on the stack, whereas inlined functions can be adapted (wrt. register use) to the code surrounding the inlined function.
Personally, I do recommend writing a couple of test programs using the non-inlined functions first, and comparing the performance and results to the inlined versions. Comparing the results ensures the inlined versions do not have bugs (off-by-one errors are common here!), and comparing the performance and generated binaries (size, at least) tells you whether inlining is worth it in general.
Why would you want a function that is both global and static? In C, functions are global by default. You only use static functions if you want to limit access to a function to the file in which it is declared. So you actively restrict access by declaring it static...
The only case where implementations are required in the header file is C++ template functions and template class member functions.

Declaring data type of a variable using a condition in c

I want to declare data type of a variable depending on a condition in C. Is it possible?
I have written a program to implement stack using integer array,
and I want the same code to implement a stack of characters, which is nothing but replacing some "int"s by "char"s. So how do I do that?
I tried something like,
if(x == 1)
#define DATATYPE int
else
#define DATATYPE char
and many other things too but nothing worked.
Your code could work with #if x==1 ... #endif if x is a preprocessor symbol, e.g. if you compile with -Dx=1 command-line option to gcc ; please understand that the C preprocessor is the first phase of a C compiler, which in fact sees preprocessed code (use e.g. gcc -C -E source.c > source.i to get into source.i the preprocessed form of source.c)
In general, you could implement such generic containers using huge preprocessor macros. See e.g. sglib and this question. Or you could generate your C code with some specialized source code generator (perhaps using another preprocessor like m4 or gpp, or crafting your own generator in some scripting language).
Alternatively, use a lot of void* pointers, and pass the size of data to your routines, like qsort(3) does. See e.g. Glib containers
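The void* approach can be sketched for the stack from the question as follows. This is an assumption-laden illustration rather than code from any library: the gstack type and its functions are hypothetical, the capacity is fixed at initialisation, and elements are copied as raw bytes in the same spirit as qsort(3) taking an element size.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical generic stack: elements are stored as raw bytes. */
typedef struct {
    unsigned char *data;
    size_t elem_size;   /* size of one element in bytes */
    size_t count;       /* number of elements currently stored */
    size_t capacity;    /* maximum number of elements */
} gstack;

/* Returns 0 on success, -1 if the allocation failed. */
static int gstack_init(gstack *s, size_t elem_size, size_t capacity)
{
    s->data = malloc(elem_size * capacity);
    s->elem_size = elem_size;
    s->count = 0;
    s->capacity = s->data ? capacity : 0;
    return s->data ? 0 : -1;
}

/* Copies one element onto the stack; returns -1 if the stack is full. */
static int gstack_push(gstack *s, const void *elem)
{
    if (s->count == s->capacity)
        return -1;
    memcpy(s->data + s->count * s->elem_size, elem, s->elem_size);
    s->count++;
    return 0;
}

/* Copies the top element into *out; returns -1 if the stack is empty. */
static int gstack_pop(gstack *s, void *out)
{
    if (s->count == 0)
        return -1;
    s->count--;
    memcpy(out, s->data + s->count * s->elem_size, s->elem_size);
    return 0;
}

static void gstack_free(gstack *s)
{
    free(s->data);
    s->data = NULL;
    s->count = s->capacity = 0;
}
```

The same functions then serve an int stack (gstack_init(&s, sizeof(int), n)) and a char stack (gstack_init(&s, sizeof(char), n)); the price is that the compiler no longer checks the element type for you.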
You might be interested in learning C++11 or Ocaml (or even Common Lisp). They offer a standard library with several generic containers (in C++ with templates in the library, in Ocaml with functors in it); read also about generic programming
You probably have a design flaw. You should really ask yourself why you want to treat C as a dynamic language like Python. C is a statically typed language, so types are fixed.
Use this solution, which encourages you to redesign by creating a struct, tagged_t, for each value in the stack and then filling in the data. I hope you get the idea.
typedef union {
int i;
char c;
float f;
} evil;
typedef struct {
evil value;
int type;
} tagged_t;
enum {
TYPE_INT, TYPE_CHAR, TYPE_FLOAT
};
tagged_t bar;
bar.value.c = 'a';
bar.type = TYPE_CHAR;
See the answer of Yann Ramin
Firstly, please learn about the pre-processor. Now, on to your question.
This does not work, due to the fact that the compiler only actually sees:
if(x == 1)
else
The # indicates that the instruction will be executed by the pre-processor. The pre-processor is really a glorified find-and-replace when we talk about the #define command. eg:
#define PI_5_DIGITS 3.14159f
The pre-processor will find all occurrences of the tag PI_5_DIGITS and replace it with 3.14159f.
Should you want to use this, make x a pre-processor symbol, by adding the switch e.g. -Dx=1.
Your code would then need to be changed to:
#ifdef x
#define DATATYPE int
#else
#define DATATYPE char
#endif
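Applied to the stack from the question, the compile-time switch might look like the sketch below. STACK_USE_INT is a hypothetical symbol used here in place of x (a one-letter macro name is easy to clash with), and the test is written as #if ... == 1 so that passing -DSTACK_USE_INT=0 selects char; the stack type and its two functions are likewise only illustrative.

```c
#include <stddef.h>

/* Assumption: STACK_USE_INT is a hypothetical switch, normally set on the
   command line (e.g. -DSTACK_USE_INT=0); it defaults to 1 here. */
#ifndef STACK_USE_INT
#define STACK_USE_INT 1
#endif

#if STACK_USE_INT == 1
#define DATATYPE int
#else
#define DATATYPE char
#endif

#define STACK_CAPACITY 16

typedef struct {
    DATATYPE items[STACK_CAPACITY];
    size_t count;
} stack;

/* Returns 0 on success, -1 if the stack is full. */
static int stack_push(stack *s, DATATYPE value)
{
    if (s->count == STACK_CAPACITY)
        return -1;
    s->items[s->count++] = value;
    return 0;
}

/* Returns 0 on success, -1 if the stack is empty. */
static int stack_pop(stack *s, DATATYPE *out)
{
    if (s->count == 0)
        return -1;
    *out = s->items[--s->count];
    return 0;
}
```

The element type is fixed once per compilation, so this gives either an int stack or a char stack in a given build, not both at the same time.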
Suggested reading:
http://www.phanderson.com/C/preprocess.html
http://gcc.gnu.org/onlinedocs/cpp

Struct initialisation through macro overuse

I've got some structs to initialise, which would be tedious to do manually. I'd like to create a macro that will help me with it... but I'm not sure the C preprocessor is good enough for this.
I've got structs which represent menus. They consist of function pointers only:
typedef uint8_t (*button_handler) (uint8_t);
typedef void (*pedal_handler) (void);
typedef void (*display_handler) (void);
typedef void (*menu_switch_handler) (void);
#define ON_BUTTON(x) uint8_t menu_frame_##x##_button (uint8_t button)
#define ON_PEDAL(x) void menu_frame_##x##_pedal (void)
#define ON_DISPLAY(x) void menu_frame_##x##_display (void)
#define ON_SWITCH(x) void menu_frame_##x##_switch (void)
typedef struct menu_frame {
button_handler on_button;
pedal_handler on_pedal;
display_handler on_display;
menu_switch_handler on_switch;
} menu_frame;
That allows me to write the functions and separate functions as (.c file):
ON_BUTTON(blah) { ... }
and menus as (.h file):
ON_BUTTON(blah);
ON_DISPLAY(blah);
menu_frame menu_frame_blah = {
menu_frame_blah_button,
NULL,
menu_frame_blah_display,
NULL
};
Is there any way I can fold the menu definition into one define? I could do something that expands MENU(blah, menu_frame_blah_button, NULL, menu_frame_blah_display, NULL) of course, but is there any way to:
make it shorter (NULL or some name)
remove the need of ON_BUTTON(...); from before the struct
Ideally, I'd like MENU(blah, button, NULL, display, NULL) to both define the handlers and the menu struct itself. I don't know for example how to prevent expanding the last term into ON_SWITCH(NULL).
Or maybe I should approach it from some other way?
I've written Python scripts to generate this sort of code for me before. You may want to go that route and just work the script into your build process.
You cannot do conditional macro expansion in C, so that your macro would be expanded differently depending on the arguments, as in: you cannot use #if within macro definition.
I guess the best you could get would be something like MENU(blah, ITEM(blah,button), NULL, ITEM(blah,display), NULL), and you still need a separate set for prototypes because of lack of conditional expansion.
Personally, I would write a simple script to generate that sort of boilerplate C code. One that would understand your desired syntax. In Python or whatever suits you best…
You can program conditionals, finite loops, default arguments and all such stuff in the preprocessor alone. The Boost library has an implementation of some of that in their preprocessor section. Boost is primarily for C++, but the preprocessor stuff should basically work in C as well.
With such techniques you can write macros that are complicated to implement but simple to use. It gets a bit simpler to implement when using C99 instead of C89 (you have designated initializers and __VA_ARGS__), but still.
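For the specific MENU(blah, button, NULL, display, NULL) form asked about, one fairly small trick is to paste each argument onto a prefix and define one helper macro per allowed spelling, so that NULL selects a helper that expands to nothing. This is only a sketch under assumptions: the PROTO_ and SLOT_ helper names are hypothetical, each argument must be spelled exactly button, pedal, display, switch or NULL, and the trick depends on the argument being pasted with ## before NULL itself gets expanded.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint8_t (*button_handler)(uint8_t);
typedef void (*pedal_handler)(void);
typedef void (*display_handler)(void);
typedef void (*menu_switch_handler)(void);

typedef struct menu_frame {
    button_handler on_button;
    pedal_handler on_pedal;
    display_handler on_display;
    menu_switch_handler on_switch;
} menu_frame;

/* Hypothetical helpers: one PROTO_/SLOT_ pair per allowed argument.
   PROTO_x emits the prototype, SLOT_x emits the initializer value. */
#define PROTO_NULL(name)
#define SLOT_NULL(name)     NULL
#define PROTO_button(name)  uint8_t menu_frame_##name##_button(uint8_t button);
#define SLOT_button(name)   menu_frame_##name##_button
#define PROTO_pedal(name)   void menu_frame_##name##_pedal(void);
#define SLOT_pedal(name)    menu_frame_##name##_pedal
#define PROTO_display(name) void menu_frame_##name##_display(void);
#define SLOT_display(name)  menu_frame_##name##_display
#define PROTO_switch(name)  void menu_frame_##name##_switch(void);
#define SLOT_switch(name)   menu_frame_##name##_switch

/* b, p, d, s are only ever used next to ##, so NULL is pasted as the
   token NULL (giving PROTO_NULL / SLOT_NULL) before it can expand. */
#define MENU(name, b, p, d, s) \
    PROTO_##b(name) PROTO_##p(name) PROTO_##d(name) PROTO_##s(name) \
    menu_frame menu_frame_##name = { \
        SLOT_##b(name), SLOT_##p(name), SLOT_##d(name), SLOT_##s(name) }

/* Declares the two handlers and defines menu_frame_blah in one line. */
MENU(blah, button, NULL, display, NULL);

/* The handlers themselves, as they would appear in the .c file. */
uint8_t menu_frame_blah_button(uint8_t button) { return button; }
void menu_frame_blah_display(void) { }
```

Since MENU defines the menu_frame object, it belongs in exactly one .c file rather than in a header included from several places.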

#undef-ing in Practice?

I'm wondering about the practical use of #undef in C. I'm working through K&R, and am up to the preprocessor. Most of this was material I (more or less) understood, but something on page 90 (second edition) stuck out at me:
Names may be undefined with #undef,
usually to ensure that a routine is
really a function, not a macro:
#undef getchar
int getchar(void) { ... }
Is this a common practice to defend against someone #define-ing a macro with the same name as your function? Or is this really more of a sample that wouldn't occur in reality? (EG, no one in his right, wrong nor insane mind should be rewriting getchar(), so it shouldn't come up.) With your own function names, do you feel the need to do this? Does that change if you're developing a library for others to use?
What it does
If you read Plauger's The Standard C Library (1992), you will see that the <stdio.h> header is allowed to provide getchar() and getc() as function-like macros (with special permission for getc() to evaluate its file pointer argument more than once!). However, even if it provides macros, the implementation is also obliged to provide actual functions that do the same job, primarily so that you can take a pointer to the function getchar() or getc() and pass that to other functions.
That is, by doing:
#include <stdio.h>
#undef getchar
extern int some_function(int (*)(void));
int core_function(void)
{
int c = some_function(getchar);
return(c);
}
As written, the core_function() is pretty meaningless, but it illustrates the point. You can do the same thing with the isxxxx() macros in <ctype.h> too, for example.
Normally, you don't want to do that - you don't normally want to remove the macro definition. But, when you need the real function, you can get hold of it. People who provide libraries can emulate the functionality of the standard C library to good effect.
Seldom needed
Also note that one of the reasons you seldom need to use the explicit #undef is because you can invoke the function instead of the macro by writing:
int c = (getchar)();
Because the token after getchar is not an (, it is not an invocation of the function-like macro, so it must be a reference to the function. Similarly, the first example above, would compile and run correctly even without the #undef.
If you implement your own function with a macro override, you can use this to good effect, though it might be slightly confusing unless explained.
/* function.h */
…
extern int function(int c);
extern int other_function(int c, FILE *fp);
#define function(c) other_function(c, stdout)
…
/* function.c */
…
/* Provide function despite macro override */
int (function)(int c)
{
return function(c);
}
}
The function definition line doesn't invoke the macro because the token after function is not (. The return line does invoke the macro.
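The rule that a function-like macro is only expanded when its name is followed by ( can be checked with a pair whose results differ on purpose. In this sketch the name twice and the three wrapper functions are hypothetical; the macro adds 1000 so that it is obvious which of the two a given call reached.

```c
/* Hypothetical pair: the macro deliberately differs from the function
   so that we can tell which one a call reached. */
#define twice(x) (1000 + (x))

/* Parenthesised name: it is followed by ')' rather than '(', so the
   macro is not expanded and this defines a real function. */
int (twice)(int x)
{
    return 2 * x;
}

/* Name followed by '(': the macro is used. */
static int via_macro(void)
{
    return twice(1);
}

/* Parenthesised name: the function is called. */
static int via_function(void)
{
    return (twice)(1);
}

/* Bare name with no '(' after it: this is the function's address. */
static int via_pointer(void)
{
    int (*fp)(int) = twice;
    return fp(1);
}
```

Here via_macro() yields 1001, while via_function() and via_pointer() both yield 2.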
Macros are often used to generate bulk of code. It's often a pretty localized usage and it's safe to #undef any helper macros at the end of the particular header in order to avoid name clashes so only the actual generated code gets imported elsewhere and the macros used to generate the code don't.
/Edit: As an example, I've used this to generate structs for me. The following is an excerpt from an actual project:
#define MYLIB_MAKE_PC_PROVIDER(name) \
struct PcApi##name { \
many members …
};
MYLIB_MAKE_PC_PROVIDER(SA)
MYLIB_MAKE_PC_PROVIDER(SSA)
MYLIB_MAKE_PC_PROVIDER(AF)
#undef MYLIB_MAKE_PC_PROVIDER
Because preprocessor #defines are all in one global namespace, it's easy for namespace conflicts to result, especially when using third-party libraries. For example, if you wanted to create a function named OpenFile, it might not compile correctly, because the header file <windows.h> defines the token OpenFile to map to either OpenFileA or OpenFileW (depending on if UNICODE is defined or not). The correct solution is to #undef OpenFile before defining your function.
Although I think Jonathan Leffler gave you the right answer, here is a very rare case where I use an #undef. Normally a macro should be reusable inside many functions; that's why you define it at the top of a file or in a header file. But sometimes you have some repetitive code inside a function that can be shortened with a macro.
int foo(int x, int y)
{
#define OUT_OF_RANGE(v, vlower, vupper) \
if (v < vlower) {v = vlower; goto EXIT;} \
else if (v > vupper) {v = vupper; goto EXIT;}
/* do some calcs */
x += (x + y)/2;
OUT_OF_RANGE(x, 0, 100);
y += (x - y)/2;
OUT_OF_RANGE(y, -10, 50);
/* do some more calcs and range checks*/
...
EXIT:
/* undefine OUT_OF_RANGE, because we don't need it anymore */
#undef OUT_OF_RANGE
...
return x;
}
To show the reader that this macro is only useful inside of the function, it is undefined at the end. I don't want to encourage anyone to use such hackish macros. But if you have to, #undef them at the end.
I only use it when a macro in an #included file is interfering with one of my functions (e.g., it has the same name). Then I #undef the macro so I can use my own function.
Is this a common practice to defend against someone #define-ing a macro with the same name as your function? Or is this really more of a sample that wouldn't occur in reality? (EG, no one in his right, wrong nor insane mind should be rewriting getchar(), so it shouldn't come up.)
A little of both. Good code will not require use of #undef, but there's lots of bad code out there you have to work with. #undef can prove invaluable when somebody pulls a trick like #define bool int.
In addition to fixing problems with macros polluting the global namespace, another use of #undef is the situation where a macro might be required to have a different behavior in different places. This is not a really common scenario, but a couple that come to mind are:
the assert macro can have its definition changed in the middle of a compilation unit for the case where you might want to perform debugging on some portion of your code but not others. In addition to assert itself needing to be #undef'ed to do this, the NDEBUG macro needs to be redefined to reconfigure the desired behavior of assert
I've seen a technique used to ensure that globals are defined exactly once by using a macro to declare the variables as extern, but the macro would be redefined to nothing for the single case where the header/declarations are used to define the variables.
Something like (I'm not saying this is necessarily a good technique, just one I've seen in the wild):
/* globals.h */
/* ------------------------------------------------------ */
#undef GLOBAL
#ifdef DEFINE_GLOBALS
#define GLOBAL
#else
#define GLOBAL extern
#endif
GLOBAL int g_x;
GLOBAL char* g_name;
/* ------------------------------------------------------ */
/* globals.c */
/* ------------------------------------------------------ */
#include "some_master_header_that_happens_to_include_globals.h"
/* define the globals here (and only here) using globals.h */
#define DEFINE_GLOBALS
#include "globals.h"
/* ------------------------------------------------------ */
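For the assert case mentioned above, the standard <assert.h> header takes care of the redefinition itself: each time it is included, assert is redefined according to whether NDEBUG is defined at that point. The sketch below uses hypothetical function names and an expression with a side effect so that it is visible whether the check was evaluated.

```c
#undef NDEBUG                      /* make sure checks start out enabled */
#include <assert.h>

static int checks_run;

/* Side effect lets us observe whether the assert expression ran. */
static int note_check(void)
{
    return ++checks_run;
}

static void checked_part(void)
{
    assert(note_check() > 0);      /* active: expression is evaluated */
}

#define NDEBUG
#include <assert.h>                /* re-inclusion redefines assert */

static void unchecked_part(void)
{
    assert(note_check() > 0);      /* disabled: expands to ((void)0) */
}

#undef NDEBUG
#include <assert.h>                /* checks are active again from here on */
```

Calling checked_part() increments checks_run, while calling unchecked_part() leaves it unchanged, which is also a reminder not to put required side effects inside assert.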
If a macro can be def'ed, there must be a facility to undef.
A memory tracker I use defines its own new/delete macros to track file/line information. This macro breaks the standard C++ library (SC++L).
#pragma push_macro( "new" )
#undef new
#include <vector>
#pragma pop_macro( "new" )
Regarding your more specific question: namespaces are often emulated in C by prefixing library functions with an identifier.
Blindly undefing macros is going to add confusion, reduce maintainability, and may break things that rely on the original behavior. If you were forced, at least use push/pop to preserve the original behavior everywhere else.
