Is introspection on a function's arguments in C possible?

For example, in Python I can use getargspec from inspect to access a function's arguments in the following way:
>>> def test(a,b,c):
... return a*b*c
...
>>> getargspec(test)
ArgSpec(args=['a', 'b', 'c'], varargs=None, keywords=None, defaults=None)
Is this possible to do in C at all? More specifically I am only interested in the arguments' names, I don't particularly care about their types.

The language doesn't include anything along this line at all.
Depending on the implementation, there's a pretty fair chance that if you want this badly enough, you can get at it. To do so, you'll typically have to compile with debugging information enabled, and use code specific to a precise combination of compiler and platform to do it. Most compilers do support creating and accessing debugging information that would include the names of the parameters to a function -- but code to do it will not be portable (and in many cases, it'll also be pretty ugly).

No, variable names are gone after compilation (except perhaps for identifiers with external linkage, whose names survive in the symbol table), so you can't get the declared names of the arguments.

No, this is absolutely impossible in C, since variable names exist only at compile time.
It could, however, probably be worked around with a macro.
You could define
#define defFunc3(resultType,name,type0,arg0name,type1,arg1name,type2,arg2name) \
...store the variable names somehow... \
...you can access the variable name strings with the # operator, e.g. #arg0name \
...or build a function with concatenation: getargspec_##name \
resultType name(type0 arg0name, type1 arg1name, type2 arg2name )
and then declare your function with this macro:
defFunc3( float, test, float, a, float, b, float, c ) {
return a * b * c;
}
In the macro body, you could store the variable names (with the stringification preprocessor operator) and/or the function address somewhere, or create some kind of "getargspec" function (via the concatenation preprocessor operator).
But this will definitely be ugly, error-prone and tricky (since you cannot execute code directly in such a function definition). I would avoid such macro magic whenever possible.
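One way the macro idea above might be fleshed out is sketched here. The names DEF_FUNC3, struct argspec and the argspec_ prefix are made up for illustration; the macro emits a constant table of parameter names (via #) next to the function definition, so no code has to run at definition time.

```c
/* Hypothetical record: one table per function defined through DEF_FUNC3. */
struct argspec {
    const char *name;     /* function name, captured with # */
    const char *args[3];  /* parameter names, captured with # */
};

/* Emits a constant table named argspec_<name>, then opens the function
 * definition itself; the function body follows the macro invocation. */
#define DEF_FUNC3(ret, name, t0, a0, t1, a1, t2, a2)        \
    static const struct argspec argspec_##name = {          \
        #name, { #a0, #a1, #a2 }                            \
    };                                                      \
    ret name(t0 a0, t1 a1, t2 a2)

DEF_FUNC3(float, test, float, a, float, b, float, c)
{
    return a * b * c;
}
```

After this, argspec_test.args[0] is the string "a". The obvious limits remain: it only works for functions declared through the macro, and a separate macro is needed for every parameter count.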

Related

How do function renaming macros work, and should one use them?

Everyone knows about the classic #define DEFAULT_VALUE 100 macro, where the preprocessor will just find the "token" and replace it with whatever the value is.
The problem I am having is understanding the function version of this #define my_puts(x) puts(x). I have K&R in front of me but I simply cannot find a suitable explanation. For instance:
why do I need to supply the number of arguments?
why can their name be whatever?
why don't I have to supply the type?
But mainly I would like to know how this replacement functions under the hood.
In the back of my mind I think I have a memory of someone saying somewhere that this is bad because there are no types.
In short, I would like to know if it is safe and secure to use macros to rename functions (as opposed to the alternative of manually wrapping the function in another function).
Thank you!
The problem I am having is understanding the function version of this #define my_puts(x) puts(x).
Part of your confusion might arise from thinking of this variety as a "function renaming" macro. A more conventional term is "function-like", referring to the form of the macro definition and usage. Providing aliases for function names or converting from one function name to another is a relatively minor use for this kind of macro.
Such macros are better regarded more generally, simply as macros that accept parameters. From that standpoint, your specific questions have relatively clear answers:
why do I need to supply the number of arguments?
You are primarily associating parameter names with the various positions in the macro's parameter list. This is necessary so that the preprocessor can properly expand the macro. That the number of parameters is thereby conveyed (except for variadic macros) is of secondary importance.
why can their name be whatever?
"Whatever" is a little too strong, but the answer is that the names of macro parameters are significant only within the scope of the macro definition. The preprocessor substitutes the actual arguments into each expansion in place of the parameter names whenever it expands the macro. This is analogous to bona fide functions, actually, so I'm not really sure why this particular uncertainty arises for you.
why don't I have to supply the type?
Of the macro? Because to the extent that macros have a type, they all have the same one. They all expand to sequences of zero or more tokens. You can view this as a source-to-source translation. The resulting token sequence will be interpreted by the compiler at a subsequent stage in the process.
But mainly I would like to know how this replacement functions under the hood.
Roughly speaking, wherever the name of an in-scope function-like macro appears in the source code followed by a parenthesized list of arguments, the macro name and argument list are replaced by the expansion of the macro, with the macro arguments substituted appropriately.
For example, consider this function-like macro, which you might see in real source code:
#define MIN(x, y) (((x) <= (y)) ? (x) : (y))
Within the scope of that definition, this code ...
n = MIN(10, z);
... expands to
n = (((10) <= (z)) ? (10) : (z));
Note well that
the function-like macro is not providing a function alias in this case.
the macro arguments are substituted into the macro expansion wherever they appear as complete tokens in the macro's defined replacement text.
In the back of my mind I think I have a memory of someone saying somewhere that this is bad because there are no types.
Well, there are no types declared in the macro definition. That doesn't prevent all the normal rules around data type from applying to the source code resulting from the preprocessing stage. Both of these factors need to be taken into account. In some ways, the MIN() macro in the above example is more flexible than any one function can be. Is that bad? I don't mean to deny that there are arguments against, but it's a multifaceted question that is not well captured by a single consideration or a plain "good" vs. "bad" evaluation.
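To make the flexibility point concrete, here is a small sketch that reuses the MIN() macro from above with two different operand types. The wrapper function names are invented for the example; the type rules are applied by the compiler to the expanded text, not by the macro.

```c
#define MIN(x, y) (((x) <= (y)) ? (x) : (y))

/* The same macro text expands into an int comparison here ... */
static int min_of_ints(int a, int b)
{
    return MIN(a, b);
}

/* ... and into a double comparison here. A single C function could not
 * serve both without converting its arguments to one fixed type. */
static double min_of_doubles(double a, double b)
{
    return MIN(a, b);
}
```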
In short, I would like to know if it is safe and secure to use macros to rename functions (as opposed to the alternative of manually wrapping the function in another function).
That's largely a different question from any of the above. The semantics of function-like macros are well-defined. There is no inherent safety or security issue. But function-like macros do obscure what is going on, and thereby make it more difficult to analyze code. This is therefore mostly a stylistic issue.
Function-like macros do have detractors these days, especially in the C++ community. In most cases, they have little to offer to distinguish themselves as superior to functions.

How can I get the function name as text not string in a macro?

I am trying to use a function-like macro to generate an object-like macro name (generically, a symbol). The following will not work because __func__ (C99 6.4.2.2-1) puts quotes around the function name.
#define MAKE_AN_IDENTIFIER(x) __func__##__##x
The desired result of calling MAKE_AN_IDENTIFIER(NULL_POINTER_PASSED) would be MyFunctionName__NULL_POINTER_PASSED. There may be other reasons this would not work (such as __func__ being taken literally and not interpreted, but I could fix that) but my question is what will provide a predefined macro like __func__ except without the quotes? I believe this is not possible within the C99 standard so valid answers could be references to other preprocessors.
Presently I have simply created my own object-like macro and redefined it manually before each function to be the function name. Obviously this is a poor and probably unacceptable practice. I am aware that I could take an existing cpp program or library and modify it to provide this functionality. I am hoping there is either a commonly used cpp replacement which provides this or a preprocessor library (prefer Python) which is designed for extensibility so as to allow me to 'configure' it to create the macro I need.
I wrote the above to try to provide a concise and well defined question but it is certainly the Y referred to by @Ruud. The X is...
I am trying to manage unique values for reporting errors in an embedded system. The values will be passed as a parameter to a(some) particular function(s). I have already written a Python program using pycparser to parse my code and identify all symbols being passed to the function(s) of interest. It generates a .h file of #defines maintaining the values of previously existing entries, commenting out removed entries (to avoid reusing the value and also allow for reintroduction with the same value), assigning new unique numbers for new identifiers, reporting malformed identifiers, and also reporting multiple use of any given identifier. This means that I can simply write:
void MyFunc(int * p)
{
if (p == NULL)
{
myErrorFunc(MYFUNC_NULL_POINTER_PASSED);
return;
}
// do something actually interesting here
}
and the Python program will create the #define MYFUNC_NULL_POINTER_PASSED 7 (or whatever next available number) for me with all the listed considerations. I have also written a set of macros that further simplify the above to:
#define FUNC MYFUNC
void MyFunc(int * p)
{
RETURN_ASSERT_NOT_NULL(p);
// do something actually interesting here
}
assuming I provide the #define FUNC. I want to use the function name since that will be constant throughout many changes (as opposed to __LINE__) and will make it much easier for someone to transfer the value from the old generated #define to the new generated #define when the function itself is renamed. Honestly, I think the only reason I am trying to 'solve' this 'issue' is because I have to work in C rather than C++. At work we are writing fairly object-oriented C and so there is a lot of NULL pointer checking and IsInitialized checking. I have two-line functions that turn into 30 because of all these basic checks (these macros reduce those lines by a factor of five). While I do enjoy the challenge of crazy macro development, I much prefer to avoid them. That said, I dislike repeating myself and hiding the functional code in a pile of error checking even more than I dislike crazy macros.
If you prefer to take a stab at this issue, have at.
__FUNCTION__ used to compile to a string literal (I think in gcc 2.96), but it hasn't for many years. Now instead we have __func__, which behaves like a predefined character array rather than a macro, and __FUNCTION__ is kept in GCC as a backward-compatibility alias for it. (The change was a bit painful.)
But in neither case was it possible to use this predefined macro to generate a valid C identifier (i.e. "remove the quotes").
But could you instead use the line number rather than function name as part of your identifier?
If so, the following would work. As an example, compiling the following source file (where the declaration ends up on line 5):
#define CONCAT_TOKENS4(a,b,c,d) a##b##c##d
#define EXPAND_THEN_CONCAT4(a,b,c,d) CONCAT_TOKENS4(a,b,c,d)
#define MAKE_AN_IDENTIFIER(x) EXPAND_THEN_CONCAT4(line_,__LINE__,__,x)
static int MAKE_AN_IDENTIFIER(NULL_POINTER_PASSED);
will generate the warning:
foo.c:5: warning: 'line_5__NULL_POINTER_PASSED' defined but not used
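The extra EXPAND_THEN_CONCAT layer matters because operands of ## are not macro-expanded before pasting. A minimal sketch of the difference follows; the fixed macro ID stands in for __LINE__ so the result does not depend on where the code sits, and all names here are made up for the example.

```c
#define ID 42                      /* stand-in for __LINE__ */

#define PASTE(a, b) a##b           /* pastes its arguments as written */
#define EXPAND_THEN_PASTE(a, b) PASTE(a, b)  /* arguments expand first */

#define STR(x) #x
#define XSTR(x) STR(x)             /* expands x, then stringifies it */

/* Direct pasting glues the unexpanded name: the identifier is err_ID. */
static const char *direct_name(void)
{
    return XSTR(PASTE(err_, ID));
}

/* With the extra layer, ID becomes 42 before ## sees it: err_42. */
static const char *indirect_name(void)
{
    return XSTR(EXPAND_THEN_PASTE(err_, ID));
}
```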
As pointed out by others, there is no macro that returns the (unquoted) function name (mainly because the C preprocessor has insufficient syntactic knowledge to recognize functions). You would have to explicitly define such a macro yourself, as you already did:
#define FUNC MYFUNC
To avoid having to do this manually, you could write your own preprocessor to add the macro definition automatically. A similar question is this: How to automatically insert pragmas in your program
If your source code has a consistent coding style (particularly indentation), then a simple line-based filter (sed, awk, perl) might do. In its most naive form: every function starts with a line that does not start with a hash or whitespace, and ends with a closing parenthesis or a comma. With awk:
{
print $0;
}
/^[^# \t].*[,\)][ \t]*$/ {
sub(/\(.*$/, "");
sub(/^.*[ \t]/, "");
print "#define FUNC " toupper($0);
}
For a more robust solution, you need a compiler framework like ROSE.
GNU C has a __FUNCTION__ identifier, but sadly even that cannot be used in the way you are asking.

Shall I prefer constants over defines?

In C, shall I prefer constants over defines? I've been reading a lot of code lately, and all of the examples make heavy use of defines.
No, in general you should not use const-qualified objects in C to create named constants. In order to create a named constant in C you should use either macros (#define) or enums. In fact, the C language has no constants in the sense that you seem to imply. (C is significantly different from C++ in this regard.)
In C language the notions of constant and constant expression are defined very differently from C++. In C constant means a literal value, like 123. Here are some examples of constants in C
123
34.58
'x'
Constants in C can be used to build constant expressions. However, since const-qualified objects of any type are not constants in C, they cannot be used in constant expressions, and, consequently, you cannot use const-qualified objects where constant expressions are required.
For example, the following is not a constant
const int C = 123; /* C is not a constant!!! */
and since the above C is not a constant, it cannot be used to declare an array type in file scope
typedef int TArray[C]; /* ERROR: constant expression required */
It cannot be used as a case label
switch (i) {
case C: ; /* ERROR: constant expression required */
}
It cannot be used as bit-field width
struct S {
int f : C; /* ERROR: constant expression required */
};
It cannot be used as an initializer for an object with static storage duration
static int i = C; /* ERROR: constant expression required */
It cannot be used as an enum initializer
enum {
E = C /* ERROR: constant expression required */
};
i.e. it cannot be used anywhere where a constant is required.
This might seem counter-intuitive, but this is how the C language is defined.
This is why you see these numerous #define-s in the code you are working with. Again, in the C language const-qualified objects have very limited use. They are basically completely useless as "constants", which is why in C you are basically forced to use #define or enums to declare true constants.
Of course, in situations when a const-qualified object works for you, i.e. it does what you want it to do, it is indeed superior to macros in many ways, since it is scoped and typed. You should probably prefer such objects where applicable, however in general case you'll have to take into account the above limitations.
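For contrast, a short sketch of the same contexts using an enum constant and a macro, both of which are integer constant expressions in C. The names BUF_LEN, NAME_LEN and the surrounding declarations are invented for the example.

```c
enum { BUF_LEN = 8 };   /* enumeration constant: a true constant in C */
#define NAME_LEN 16     /* macro: replaced by the literal 16 */

typedef int Buffer[BUF_LEN];          /* array size at file scope */

struct flags {
    unsigned ready : BUF_LEN;         /* bit-field width */
};

static int table_size = NAME_LEN;     /* static initializer */

enum { DOUBLE_LEN = BUF_LEN * 2 };    /* enum initializer */

static int classify(int n)
{
    switch (n) {
    case BUF_LEN:  return 1;          /* case labels */
    case NAME_LEN: return 2;
    default:       return 0;
    }
}
```

Every use that the const int version rejects is accepted here, which is why the enum or #define form tends to show up in C headers.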
Constants should be preferred over defines. There are several advantages:
Type safety. While C is a weakly typed language, using a define gives up what type checking there is, whereas with a typed constant the compiler can pick up problems for you.
Ease of debugging. You can change the value of constants through the debugger, while defines are automatically changed in the code by the pre-processor to the actual value, meaning that if you want to change the value for test/debugging purposes, you need to re-compile.
Maybe I have been using them wrong but, at least in gcc, you can't use constants in case statements.
const int A=12;
switch (argc) {
case A:
break;
}
Though this question is specific to C, I guess it is good to know this:
#include<stdio.h>
int main() {
const int CON = 123;
int* A = &CON;
(*A)++;
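printf("%d\n", CON); // may print 124 in C, but the write above is undefined behavior
}
compiles in C (typically with a warning about the discarded const qualifier), but not in C++. Modifying a const object through such a pointer is undefined behavior, so the printed value is not guaranteed.
One of the reasons to use #define is to keep such things from messing up your code, especially if it is a mix of C and C++.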
A lot of people here are giving you "C++ style" advice. Some even say the C++ arguments apply to C. That may be a fair point. (Whether it is or not feels kind of subjective.) The people who say const sometimes means something different in the two languages are also correct.
But these are mostly minor points and personally, I think in truth there is relatively minor consequence to going either way. It's a matter of style, and I think different groups of people will give you different answers.
In terms of common usage, historical usage, and most common style, in C, it's a lot more typical to see #define. Using C++isms in C code can come off as strange to a certain narrow segment of C coders. (Including me, so that's where my biases lie.)
But I'm surprised no one has suggested a middle ground solution, that "feels right" in both languages: if it fits into a group of integer constants, use an enum.
#define can be used for many purposes (it is very loose) and should be avoided where you can substitute const, which defines a variable that you can do a lot more with.
In cases like below, define has to be used
directive switch
substitution to your source line
code macros
An example where you have to use define over const is when you have a version number, say 3, and you want version 4 to include some methods that are not available in version 3
#define VERSION 4
...
#if VERSION==4
................
#endif
Defines have been part of the language longer than constants, so a lot of older code will use them because defines were the only way to get the job done when the code was written. For more recent code it may be simply a matter of programmer habit.
Constants have a type as well as a value, so they would be preferred when it makes sense for your value to have a type, but not when it is typeless (or polymorphic).
If it's something that isn't determined programmatically, I use #define. For example, if I want all of my UI objects to have the same space between them, I might use #define kGUISpace 20.
Apart from the excellent reasons given by AndreyT for using DEFINES rather than constants in "C" code, there is another, more pragmatic reason for using DEFINES.
DEFINES are easy to define and use from (.h) header files, which is where any experienced C coder would expect to find constants defined. Defining consts in header files is not quite so easy -- it's more code to avoid duplicate definitions etc.
Also, the "typesafe" arguments are moot: most compilers will pick up glaring errors such as assigning a string to an int, or will "do the right thing" on a slight mismatch such as assigning an integer to a float.
Macros (defines) can be used by the pre-processor and at compile time, constants cannot.
You can do compile-time checks to make sure a macro is within a valid range (and issue an #error if it isn't). You can use default values for a macro if it hasn't already been defined. You can use a macro in the size of an array.
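The three preprocessor uses just listed could look roughly like this. MAX_CLIENTS, its limits and the array are hypothetical names chosen for the sketch.

```c
/* Default value: used only if the build did not already define it,
 * for example with -DMAX_CLIENTS=64 on the compiler command line. */
#ifndef MAX_CLIENTS
#define MAX_CLIENTS 32
#endif

/* Compile-time range check: the build stops here if the value is bad. */
#if MAX_CLIENTS < 1 || MAX_CLIENTS > 256
#error "MAX_CLIENTS must be between 1 and 256"
#endif

/* The macro is a constant expression, so it can size a file-scope array. */
static int client_ids[MAX_CLIENTS];

static unsigned client_capacity(void)
{
    return (unsigned)(sizeof client_ids / sizeof client_ids[0]);
}
```

None of this works with a const int, because #if is evaluated by the preprocessor, which knows nothing about objects.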
A compiler can optimize with macros better than it can with constants:
const int SIZE_A = 15;
#define SIZE_B 15
for (i = 0; i < SIZE_A + 1; ++i); // if not optimized may load A and add 1 on each pass
for (i = 0; i < SIZE_B + 1; ++i); // compiler will replace "SIZE_B + 1" with 16
Most of my work is with embedded processors that don't have amazing optimizing compilers. Maybe gcc will treat SIZE_A like a macro at some optimization level.

When should you use macros instead of inline functions?

In a previous question what I thought was a good answer was voted down for the suggested use of macros
#define radian2degree(a) (a * 57.295779513082)
#define degree2radian(a) (a * 0.017453292519)
instead of inline functions. Please excuse the newbie question, but what is so evil about macros in this case?
Most of the other answers discuss why macros are evil including how your example has a common macro use flaw. Here's Stroustrup's take: http://www.research.att.com/~bs/bs_faq2.html#macro
But your question was asking what macros are still good for. There are some things where macros are better than inline functions, and that's where you're doing things that simply can't be done with inline functions, such as:
token pasting
dealing with line numbers or such (as for creating error messages in assert())
dealing with things that aren't expressions (for example, how many implementations of offsetof() use a type name to create a cast operation)
the macro to get a count of array elements (can't do it with a function, as the array name decays to a pointer too easily)
creating 'type polymorphic' function-like things in C where templates aren't available
But with a language that has inline functions, the more common uses of macros shouldn't be necessary. I'm even reluctant to use macros when I'm dealing with a C compiler that doesn't support inline functions. And I try not to use them to create type-agnostic functions if at all possible (creating several functions with a type indicator as a part of the name instead).
I've also moved to using enums for named numeric constants instead of #define.
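As an illustration of the array-count item in the list above, here is a minimal sketch; ARRAY_COUNT and the sample array are names made up for the example. The macro sees the array itself, whereas a function parameter would already have decayed to a pointer.

```c
#include <stddef.h>

/* Element count of a true array. This must be a macro: inside a function
 * taking "const int *a", sizeof(a) would be the size of a pointer. */
#define ARRAY_COUNT(a) (sizeof(a) / sizeof((a)[0]))

static const int primes[] = { 2, 3, 5, 7, 11 };

static size_t prime_count(void)
{
    return ARRAY_COUNT(primes);
}
```

The macro silently gives a wrong answer if it is handed a pointer instead of an array, so it is best used only where the array declaration is visible.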
There's a couple of strictly evil things about macros.
They're text processing, and aren't scoped. If you #define foo 1, then any subsequent use of foo as an identifier will fail. This can lead to odd compilation errors and hard-to-find runtime bugs.
They don't take arguments in the normal sense. You can write a function that will take two int values and return the maximum, because the arguments will be evaluated once and the values used thereafter. You can't write a macro to do that, because it will evaluate at least one argument twice, and fail with something like max(x++, --y).
There's also common pitfalls. It's hard to get multiple statements right in them, and they require a lot of possibly superfluous parentheses.
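The double-evaluation problem can be shown with a small sketch. MAX_MACRO, max_int and the struct used to report results are invented names; the values follow from the rule that the ?: operator evaluates its condition first and then exactly one of the other two operands.

```c
#define MAX_MACRO(a, b) ((a) > (b) ? (a) : (b))

static inline int max_int(int a, int b)
{
    return a > b ? a : b;
}

struct outcome { int max, x, y; };

/* x++ appears twice in the expansion, so x is incremented twice. */
static struct outcome with_macro(void)
{
    int x = 5, y = 3;
    struct outcome r;
    r.max = MAX_MACRO(x++, y++);
    r.x = x;
    r.y = y;
    return r;   /* max == 6, x == 7, y == 4 */
}

/* Function arguments are evaluated once each before the call. */
static struct outcome with_function(void)
{
    int x = 5, y = 3;
    struct outcome r;
    r.max = max_int(x++, y++);
    r.x = x;
    r.y = y;
    return r;   /* max == 5, x == 6, y == 4 */
}
```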
In your case, you need parentheses:
#define radian2degree(a) (a * 57.295779513082)
needs to be
#define radian2degree(a) ((a) * 57.295779513082)
and you're still stepping on anybody who writes a function radian2degree in some inner scope, confident that that definition will work in its own scope.
For this specific macro, if I use it as follows:
int x=1;
x = radian2degree(x);
float y=1;
y = radian2degree(y);
there would be no type checking, and x,y will contain different values.
Furthermore, the following code
float x=1, y=2;
float z = radian2degree(x+y);
will not do what you think, since it will translate to
float z = (x+y*57.295779513082);
instead of
float z = ((x+y)*57.295779513082);
which is the expected result.
These are just a few examples of the misbehavior and misuse macros are prone to.
Edit
you can see additional discussions about this here
If possible, always use an inline function. These are typesafe and cannot be easily redefined.
Defines can be redefined or undefined, and there is no type checking.
Macros are relatively often abused and one can easily make mistakes using them as shown by your example. Take the expression radian2degree(1 + 1):
with the macro it will expand to 1 + 1 * 57.29... = 58.29...
with a function it will be what you want it to be, namely (1 + 1) * 57.29... = ...
More generally, macros are evil because they look like functions, so they trick you into using them just like functions, but they have subtle rules of their own. In this case, the correct way to write it would be (notice the parentheses around a):
#define radian2degree(a) ((a) * 57.295779513082)
But you should stick to inline functions. See these links from the C++ FAQ Lite for more examples of evil macros and their subtleties:
inline vs. macros
macros containing if
macros with multiple lines
macros used to paste two tokens together
The compiler's preprocessor is a finicky thing, and therefore a terrible candidate for clever tricks. As others have pointed out, it's easy for the compiler to misunderstand your intention with the macro, and it's easy for you to misunderstand what the macro will actually do, but most importantly, you can't step into macros in the debugger!
Macros are evil because you may end up passing more than a variable or a scalar to them, and this could result in unwanted behavior (define a max macro to determine the max between a and b, but pass a++ and b++ to the macro and see what happens).
If your function is going to be inlined anyway, there is no performance difference between a function and a macro. However, there are several usability differences between a function and a macro, all of which favor using a function.
If you build the macro correctly, there is no problem. But if you use a function, the compiler will do it correctly for you every time. So using a function makes it harder to write bad code.
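Putting the variants from this thread side by side, as a rough sketch: the macro names RAD2DEG_UNSAFE and RAD2DEG_SAFE are made up to keep all three forms in one file, and static inline assumes a C99 or later compiler.

```c
/* The macro from the question: the argument is not parenthesized. */
#define RAD2DEG_UNSAFE(a) (a * 57.295779513082)

/* The corrected macro. */
#define RAD2DEG_SAFE(a) ((a) * 57.295779513082)

/* The inline function: the argument is evaluated first, as a double. */
static inline double radian2degree(double a)
{
    return a * 57.295779513082;
}

static double unsafe_sum(void) { return RAD2DEG_UNSAFE(1.0 + 1.0); } /* 1 + 57.29... */
static double safe_sum(void)   { return RAD2DEG_SAFE(1.0 + 1.0); }   /* 2 * 57.29... */
static double inline_sum(void) { return radian2degree(1.0 + 1.0); }  /* 2 * 57.29... */
```

With the function there is nothing to get wrong at the call site, which is the point made in the last answer above.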

Defining const values in C

I have a C project where all code is organized in *.c/*.h file pairs, and I need to define a constant value in one file, which will, however, also be used in other files. How should I declare and define this value?
Should it be as static const ... in the *.h file? As extern const ... in the *.h file and defined in the *.c file? In what way does it matter if the value is not a primitive datatype (int, double, etc), but a char * or a struct? (Though in my case it is a double.)
Defining stuff inside *.h files doesn't seem like a good idea generally; one should declare things in the *.h file, but define them in the *.c file. However, the extern const ... approach seems inefficient, as the compiler wouldn't be able to inline the value, it instead having to be accessed via its address all the time.
I guess the essence of this question is: Should one define static const ... values in *.h files in C, in order to use them in more than one place?
The rule I follow is to only declare things in H files and define them in C files. You can declare and define in a single C file, assuming it will only be used in that file.
By declaration, I mean notify the compiler of its existence but don't allocate space for it. This includes #define, typedef, extern int x, and so on.
Definitions assign values to declarations and allocate space for them, such as int x and const int x. This includes function definitions; including these in header files frequently leads to wasted code space.
I've seen too many junior programmers get confused when they put const int x = 7; in a header file and then wonder why they get a link error for x being defined more than once. I think at a bare minimum, you would need static const int x so as to avoid this problem.
I wouldn't be too worried about the speed of the code. The main issue with computers (in terms of speed and cost) long ago shifted from execution speed to ease of development.
If you need constants (real, compile time constants) you can do that three ways, putting them into header files (there is nothing bad with that):
enum {
FOO_SIZE = 1234,
BAR_SIZE = 5678
};
#define FOO_SIZE 1234
#define BAR_SIZE 5678
static const int FOO_SIZE = 1234;
static const int BAR_SIZE = 5678;
In C++, I tend to use the enum way, since it can be scoped into a namespace. For C, I use the macro. This basically comes down to a matter of taste though. If you need floating point constants, you can't use the enumeration anymore. In C++ I use the last way, the static const double, in that case (note in C++ static would be redundant then; they would become static automatically since they are const). In C, I would keep using the macros.
It's a myth that using the third method will slow down your program in any way. I just prefer the enumeration since the values you get are rvalues - you can't get their address, which I regard as an added safety. In addition, there is much less boilerplate code written. The eye is concentrated on the constants.
Do you really have a need to worry about the advantage of inline? Unless you're writing embedded code, stick to readability. If it's really a magic number of something, I'd use a define; I think const is better for things like const version strings and modifying function call arguments. That said, the define in .c, declare in .h rule is definitely a fairly universally accepted convention, and I wouldn't break it just because you might save a memory lookup.
As a general rule, you do not define things as static in a header. If you do define static variables in a header, each file that uses the header gets its own private copy of whatever is declared static, which is the antithesis of DRY principle: don't repeat yourself.
So, you should use an alternative. For integer types, using enum (defined in a header) is very powerful; it works well with debuggers too (though the better debuggers may be able to help with #define macro values too). For non-integer types, an extern declaration (optionally qualified with const) in the header and a single definition in one C file is usually the best way to go.
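The layout recommended above might look as follows. It is shown as one listing for brevity; the file names in the comments and the identifiers MAX_RETRIES, GRAVITY and weight are hypothetical.

```c
/* ---- constants.h (hypothetical): declarations only ---- */
enum { MAX_RETRIES = 5 };      /* integer constant, fine in a header */
extern const double GRAVITY;   /* declared here, no storage allocated */

/* ---- constants.c (hypothetical): the single definition ---- */
const double GRAVITY = 9.80665;

/* ---- any other .c file that includes constants.h ---- */
static double weight(double mass)
{
    return mass * GRAVITY;     /* refers to the one shared object */
}
```

There is exactly one GRAVITY object in the whole program, at the cost that other translation units read it from memory unless link-time optimization is used.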
I'd like to see more context for your question. The type of the value is critical, but you've left it out. The meaning of the const keyword in C is quite subtle; for example
const char *p;
does not mean that pointer p is a constant; you can write p all you like. What you cannot write is the memory that p points to, and this stays true even as p's value changes. This is about the only case I really understand; in general, the meaning of the subtle placement of const eludes me. But this special case is extremely useful for function parameters because it extracts a promise from the function that the memory the argument points to will not be mutated.
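A short sketch of the two placements of const on a pointer may help here; the function names are invented for the example.

```c
/* Pointer to const char: p itself may change, the characters may not. */
static char second_char(const char *p)
{
    p++;            /* allowed: p is not const */
    /* *p = 'x';       would not compile: *p is const */
    return *p;
}

/* Const pointer to char: the characters may change, p itself may not. */
static void set_first(char *const p, char c)
{
    *p = c;         /* allowed: *p is not const */
    /* p++;            would not compile: p is const */
}

static char set_first_demo(void)
{
    char buf[2] = "x";
    set_first(buf, 'y');
    return buf[0];
}
```

A common way to read such declarations is right to left: "p is a pointer to a const char" versus "p is a const pointer to a char".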
There is one other special case everyone should know: integers. Almost always, constant, named integers should be defined in a .h file as enumeration literals. enum types not only allow you to group related constants together in a natural way, but also allow the names of those constants to be seen in the debugger, which is a huge advantage.
I've written tens of thousands of lines of C; probably hundreds if I try to track it down. (wc ~/src/c/*.c says 85 thousand, but some of that is generated, and of course there's a lot of C code lurking elsewhere). Aside from the two cases above, I've never found much use for const. I would be pleased to learn a new, useful example.
I can give you an indirect answer. In C++ (as opposed to C) const implies static. That is to say, in C++ static const is the same thing as const. So that tells you how the C++ standards body feels about the issue, i.e. all consts should be static.
for autoconf environment:
You can always define constants in the configure file as well. AC_DEFINE(), I guess, is the macro to define a value across the entire build.
To answer the essence of your question:
You generally do NOT want to define a static variable in a header file.
This would cause you to have duplicated variables in each translation unit (C file) that includes the header.
Variables in a header should really be declared extern since that is the implied visibility.
See this question for a good explanation.
Actually, the situation might not be so dire, as the compiler would probably convert a const type to a literal value. But you might not want to rely on that behavior, especially if optimizations are turned off.
In C++, you should always use
const int SOME_CONST = 17;
for constants and never
#define SOME_CONST 17
Defines will almost always come back and bite you later. Consts are in the language, and are strongly typed, so you won't get weird errors because of some hidden interaction. I would put the const in the appropriate header file. As long as it's #pragma once (or #ifndef x / #define x / #endif), you won't ever get any compile errors.
In vanilla C, you might have compatibility problems where you must use #defines.

Resources