My C header file contains about 300 functions, all with names beginning with "foo_db_" and all taking a "db_t" as their first parameter (exactly what a db_t is isn't really relevant here; it's just a struct).
function foo_db_my_first_function(db_t *db, char *param1, int param2);
function foo_db_my_second_function(db_t *db, double param1, const char *param2, int param3);
(...)
function foo_db_my_Nth_function(db_t *db, int param1);
My job is to write another 300 wrapper functions named "foo_XXXX" (XXXX being the suffix of the "foo_db_" function) that supply a default value for the first parameter.
static __inline function foo_my_first_function(char *param1, int param2) {
foo_db_my_first_function(DEFAULT_DB, param1, param2);
}
(...)
I was wondering if I could write some macros to ease my job: declare the "db" function and the corresponding "default" function (without the first parameter).
Unfortunately, I cannot use C99 and variadic macros arguments :( so I think I'm screwed :), but I prefer to ask first here before burning my fingers to write those 300 functions :/
Assuming the original header file for the API is regular enough, then a script in your favorite text processing language (Perl, Lua, Python, Awk, or even /bin/sh in a pinch) will likely be the simplest approach.
Your script would collect all public function declarations using a regex or simple text matching to identify them (likely based on the foo_db_ prefix). It could then write two output files. First, a suitable .h file declaring your wrappers, and second the .c source file implementing them by stuffing DEFAULT_DB into their first parameter. You will need to do a minimal amount of work to copy the rest of the parameters through, but with luck the declarations are all regular enough that the text manipulation can be as simple as "rest of line" or the like.
Having done that, I would check the script into revision control and have it invoked at build time, treating the generated files as transient build products. However, if you don't have a sufficiently flexible build system (this is why I still prefer make to nearly everything else I've seen proposed), then you will have to find a suitable kludge to signal that your generated default wrappers are out of date when the API changes.
This approach will require investing some time in the code generator script, but you should be ahead on that well before the time you imagine hand-coding your 100th wrapper. And the second time you run it....
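The answer above suggests a scripting language; purely as an illustration, and only to keep the examples in this thread in one language, here is a minimal sketch of the same text transformation in C. The names make_wrapper and append_arg_names are hypothetical. It assumes one declaration per line, and that every parameter ends in a plain identifier (no arrays, function pointers, or unnamed parameters). The generator itself uses snprintf, so it needs a C99 library, but the text it emits is as C89-friendly as the question's own wrapper.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define PREFIX "foo_db_"

/* Append the trailing identifier of each parameter in [p, end) to args,
 * e.g. "char *param1, int param2" appends ", param1, param2". */
static void append_arg_names(const char *p, const char *end,
                             char *args, size_t args_size)
{
    while (p < end) {
        const char *comma = memchr(p, ',', (size_t)(end - p));
        const char *stop = comma ? comma : end;
        const char *name_end = stop;
        const char *name;

        while (name_end > p && isspace((unsigned char)name_end[-1]))
            name_end--;
        name = name_end;
        while (name > p && (isalnum((unsigned char)name[-1]) || name[-1] == '_'))
            name--;

        snprintf(args + strlen(args), args_size - strlen(args),
                 ", %.*s", (int)(name_end - name), name);
        p = stop + 1;
    }
}

/* Turn one "foo_db_" declaration into the text of its default-DB wrapper.
 * Returns 0 on success, -1 if the line does not look like such a declaration. */
int make_wrapper(const char *decl, char *out, size_t out_size)
{
    const char *name, *lparen, *rparen, *comma, *rest;
    char args[256] = "DEFAULT_DB";
    size_t plen = sizeof PREFIX - 1;
    int is_void;

    assert(decl != NULL && out != NULL);

    name = strstr(decl, PREFIX);
    if (name == NULL)
        return -1;
    lparen = strchr(name, '(');
    rparen = strrchr(decl, ')');
    if (lparen == NULL || rparen == NULL || rparen < lparen)
        return -1;

    /* Everything after the first comma is the wrapper's own parameter list. */
    comma = memchr(lparen, ',', (size_t)(rparen - lparen));
    rest = comma ? comma + 1 : rparen;
    while (rest < rparen && isspace((unsigned char)*rest))
        rest++;
    append_arg_names(rest, rparen, args, sizeof args);

    /* Only a return type of exactly "void" suppresses the "return". */
    is_void = strncmp(decl, "void ", 5) == 0 && name == decl + 5;

    snprintf(out, out_size,
             "static __inline %.*sfoo_%.*s(%.*s) { %sfoo_db_%.*s(%s); }",
             (int)(name - decl), decl,
             (int)(lparen - name - plen), name + plen,
             rest < rparen ? (int)(rparen - rest) : 4,
             rest < rparen ? rest : "void",
             is_void ? "" : "return ",
             (int)(lparen - name - plen), name + plen,
             args);
    return 0;
}
```

A driver would read the header line by line, call make_wrapper on each line, and write the successful results to the generated file. Lines that return -1 are simply skipped, which is also where a real header's irregular declarations would need manual attention.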
In extreme cases, you could end up needing to implement much of the front-end of a C compiler. In that case, I see two approaches that are both more socially acceptable than arranging a meeting with the architect in a dark alley. First, there is a GCC back-end that emits its AST in XML; the resulting XML is a bear, but has been reduced down to a tree of tokens that can be manipulated. Second, there is always LPeg, a full parser that is easily used from Lua (and I suspect that there are other PEG parsers out there for other scripting languages too). Sample code for LPeg that lexes and parses C is referenced at the Wiki page.
Do it in Excel. Create a cell with "foo_db_ (db_t *db)", drag it down as many places as you need, fill in all the blanks, then copy it all into your program (you can test that the copy will work ahead of time, but I just tried with Notepad and it seems to work as intended). Now you have all your function headers, and can fill in the rest from there.
I am refactoring a program which requires types to have a globally unique number that increases by one for every object (i.e. the maximum number should equal the number of types declared with this ID) across the entire project (__COUNTER__ only works for the current file).
This currently looks like
struct foo {
static const int index = __GLOBAL_COUNTER__(TYPES, _3862e1e60a2749c2bfd2add9f3ddbb25);
};
A Python script is run on this, which runs the normal C++ preprocessor and then uses a regex to find uses of __GLOBAL_COUNTER__ and replace them with a number. The first macro argument is the name of the counter to use and the second is a UUID (so that the value is constant between includes).
The issues with this approach are that the macro doesn't work properly when mixed with other macros, and that the Python script and regex can replace things I don't intend (e.g. in strings). Also, having to manually generate a UUID for each type is cumbersome.
So my question is whether it is possible to write a macro as a GCC plugin to provide this functionality and where I should start. I have searched the documentation and read some of the GCC source code, but I haven't found anything.
NOTE: this is generally merged into another macro to save a bit of typing
#define TYPE_INDEX(x) static const int index = __GLOBAL_COUNTER__(TYPES, x);
So other approaches would also work if they are easier, for example changing the syntax to something like this, but I am not sure how I would go about it.
indexed_struct foo {
};
How can I get a function's name without calling it, or is that even possible?
I have an array of sorting functions; my goal is to be able to list the name of each one, dynamically, without having to invoke any.
After searching the web, I couldn't find any solution that doesn't require the function to be invoked and use __FUNCTION__ or __func__.
The array of functions that I use:
// Pointer to functions
char *(*srtFunc[])(int *, int) = {selection, bubble, recursiveBubble, insertion, recursiveInsertion};
More information about what I want to achieve with this:
I want to loop over each function in the given array, create a file with the name of the function, invoke the function 100 times with different arguments each time, and print the time spent by the function each time in its dedicated file, redo for the remaining functions.
Unfortunately, not easily. C is not built for introspection and doesn't have features like this-- the name of function foo and the call to function foo are compiled down to just some jump and call instructions in the output; the actual name "foo" is essentially a convenience for you when programming and disappears in the compiled output.
As you note, __func__ only works within a function. That is because it is not really a macro at all: since C99 it is a predefined identifier that behaves as if each function body began with a declaration of a static const char array holding that function's name (GCC treats __FUNCTION__ as another name for it). It is very "dumb": it names the enclosing function and nothing else, so it cannot tell you the name of a function you merely hold a pointer to.
There are various ways to get the effective result you want here, the simplest being to manually build a table of string literals that match the names of your functions. You can do this fairly cleanly using macros (see @nielsen's answer for a useful snippet). But the preprocessor and compiler can't derive such a table from the actual functions or fully enforce that it matches them, so you will always have some risk of a runtime problem when you change it. Unfortunately C just doesn't have the capability for the kind of elegance you're looking for in this design.
You may be able to do something with smart preprocessor tricks, but your code would be difficult to read. I think I would go for the really low-tech solution here and just add an array of the function names matching the array of function pointers:
#define ARRAY_SIZE(A) (sizeof(A)/sizeof(A[0]))
// Pointer to functions
char *(*srtFunc[])(int *, int) = {selection, bubble, recursiveBubble, insertion, recursiveInsertion};
const char *srtFuncNames[] = {"selection", "bubble", "recursiveBubble", "insertion", "recursiveInsertion"};
_Static_assert(ARRAY_SIZE(srtFuncNames)==ARRAY_SIZE(srtFunc), "Function table and names out of synch!");
Having the two definitions right after each other makes it easy to keep them synchronized, and the code is easy to read. The _Static_assert (available from C11) helps you remember to add new names as new functions are added.
Alternatively, a structure can be defined holding a function pointer and corresponding name. This can be initialized using a macro as follows:
typedef struct
{
char *(*srtFunc)(int *, int);
const char *srtName;
} sortMethod;
#define SORT_METHOD(S) {(S), #S}
sortMethod methods[] = {
SORT_METHOD(selection),
SORT_METHOD(bubble),
SORT_METHOD(recursiveBubble),
SORT_METHOD(insertion),
SORT_METHOD(recursiveInsertion)
};
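As a rough sketch of how such a table might be used for the goal in the question (one output file per function, named after it), the following assumes two stub sorting functions and two hypothetical helpers, method_file_name and find_method. None of these helpers come from the answer itself, and the table is shortened to two entries to keep the example small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the real sorting functions, which are not shown. */
static char *selection(int *a, int n) { (void)a; (void)n; return NULL; }
static char *bubble(int *a, int n)    { (void)a; (void)n; return NULL; }

typedef struct {
    char *(*srtFunc)(int *, int);
    const char *srtName;
} sortMethod;

#define SORT_METHOD(S) {(S), #S}
#define ARRAY_SIZE(A) (sizeof(A) / sizeof((A)[0]))

static const sortMethod methods[] = {
    SORT_METHOD(selection),
    SORT_METHOD(bubble)
};

/* Build "<name>.txt" for entry i without calling the function itself.
 * Returns the length of the file name, as snprintf does. */
int method_file_name(size_t i, char *buf, size_t size)
{
    assert(i < ARRAY_SIZE(methods));
    return snprintf(buf, size, "%s.txt", methods[i].srtName);
}

/* Look a method up by name; returns NULL when no entry matches. */
const sortMethod *find_method(const char *name)
{
    size_t i;

    for (i = 0; i < ARRAY_SIZE(methods); i++)
        if (strcmp(methods[i].srtName, name) == 0)
            return &methods[i];
    return NULL;
}
```

A timing loop would then iterate over methods, open the file named by method_file_name, and call methods[i].srtFunc(data, n) as often as needed. Because the name and the pointer come from the same SORT_METHOD entry, they cannot drift apart the way two parallel arrays can.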
I'm a beginner to C, but I've had a bit of experience with some other programing languages like Ruby and Python. I would very much like to create some of my own functions in C that I could use in any of my programs that just make life easier, however I'm a little bit confused about how to do this.
From what I understand the first part of this process is to create a header file that contains all of your prototypes, and I understand that, however from what I understand it is frowned upon to include anything other than declarations in your header files, so would you also need to create a .c file that contained the actual code and then #include that in all your programs along with the header file? But if so, why would you need a header file in the first place, since defining a function also declares it?
Finally, what should you put in the main() function of your header file? Do you just leave it blank, or do you not include it?
Thanks!
The declaration of a function lets the compiler know that at link time such a function will be available. The definition of the function provides that implementation, and additionally it also serves as the declaration. There is no harm in having multiple declarations, but only one implementation can be provided. Also, at least one declaration (or the only implementation) must come before any use of the function - this alone makes forward declarations necessary in cases where two functions call one another (both cannot be before the other).
So, if you have the implementation:
int foo(int a, int b) {
return a * b;
}
The corresponding declaration is simply:
int foo(int a, int b);
(The argument names do not matter in the declaration, i.e., they can be omitted or different than in the implementation. In fact you could declare only int foo(); and it would work for the above function, but this is mainly a legacy thing and not recommended. Note that to declare a function that takes no arguments, put void in the argument list, e.g., int bar(void);)
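The mutual-call case mentioned earlier, where neither function can be defined before the other, can be sketched as follows. The function names is_even and is_odd are made up for the illustration.

```c
/* Forward declaration: is_even calls is_odd before is_odd is defined. */
int is_odd(unsigned n);

/* Returns 1 if n is even, 0 otherwise. */
int is_even(unsigned n)
{
    return n == 0 ? 1 : is_odd(n - 1);
}

/* Returns 1 if n is odd, 0 otherwise. */
int is_odd(unsigned n)
{
    return n == 0 ? 0 : is_even(n - 1);
}
```

Without the first line, the call to is_odd inside is_even would refer to a function the compiler has not yet seen. A header file is just a convenient place to keep such declarations so that several .c files can share them.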
There are a number of reasons why you would want to have separate headers with only the declaration:
The implementation may be in a separate file, which allows for organisation of code into manageable pieces, and may be compiled by itself and need not be recompiled unless that file has changed - in large projects where the total compilation time can be an hour it would be absurd to re-compile everything for a small change.
The implementation source may not be available, e.g., in case of a closed-source proprietary library.
The implementation may be in a different language with a compatible calling convention.
For practical details on how to write code in multiple files and how to use libraries, please consult a book or tutorial on C programming. As for main, you need not declare it in a header unless you are specifically calling main from another function - the convention of C programs is to call main as int main(int, char**) at start of the execution.
When compiling, each .c file (or .cpp file) is first compiled into its own object file.
If one object file uses functions from another,
at that point it only knows "there is something outside named xyz".
Then the linker puts them together in one file and rewrites the parts of each file
that use functions from other files,
so that they actually know where to find those functions.
What will happen if you put code in a .h file:
At compilation time, each h-file included in a c-file is pasted into that c-file.
If you have the code for xyz in an h-file and you include it in more than one c-file,
each of those compiled c-files will contain its own xyz. Then the linker will be confused...
So function code has to be in its own c-file.
Why use a h-file at all?
Because, if you call xyz in your code, how should the compiler know
if this is a function of another c-file (and which parameters...)
or an error because xyz does not exist?
The reason for header files in C is for when you need the same code in multiple source files. If you are just repeating the same code within one file, then yes, it would be easier to just use a function. As for header files: yes, you would also need a .c file for all the computations, compiled and linked into the program.
I want to run simple analysis on C files (such as: if you call the foo macro with INT_TYPE as its argument, then cast the response to int*). I do not want to preprocess the file, I just want to parse it (so that, for instance, I'll have correct line numbers).
Ie, I want to get from
#include <a.h>
#define FOO(f)
int f() {FOO(1);}
a list of tokens like
<include_directive value="a.h"/>
<macro name="FOO"><param name="f"/><result/></macro>
<function name="f">
<return>int</return>
<body>
<macro_call name="FOO"><param>1</param></macro_call>
</body>
</function>
with no need to set include path, etc.
Is there any preexisting parser that does it? All parsers I know assume C is preprocessed. I want to have access to the macros and actual include instructions.
Our C Front End can parse code containing preprocessor elements to a fair extent and still build a usable AST. (Yes, the parse tree has precise file/line/column number information).
It imposes a number of restrictions, under which it handles most code. In the few cases it cannot handle, a small, easy change to the source file that gives equivalent code often solves the problem.
Here's a rough set of rules and restrictions:
#includes and #defines can occur wherever a declaration or statement can occur, but not in the middle of a statement. These rarely cause a problem.
Macro calls can occur where function calls occur in expressions, or can appear without a semicolon in place of statements. Macro calls that span non-well-formed chunks are not handled well (anybody surprised?). These occur occasionally and need manual revision. OP's example of "j(v,oid)*" is problematic, but this is really rare in code.
#if ... #endif must be wrapped around major language concepts (nonterminals) (constant, expression, statement, declaration, function) or sequences of such entities, or around certain non-well-formed but commonly occurring idioms, such as if (exp) {. Each arm of the conditional must contain the same kind of syntactic construct as the other arms. #if wrapped around random text as a bad kind of comment is problematic, but easily fixed in the source by making it a real comment. Where these conditions are not met, you need to modify the original source code, often by moving the #if #elif #else #endif a few tokens.
In our experience, one can revise a code base of 50,000 lines in a few hours to get around these issues. While that seems annoying (and it is), the alternative is to not be able to parse the source code at all, which is far worse than annoying.
You also want more than just a parser. See Life After Parsing, to know what happens after you succeed in getting a parse tree. We've done some additional work in building symbol tables in which the declarations are recorded with the preprocessor context in which they are embedded, enabling type checking to include the preprocessor conditions.
You can have a look at this ANTLR grammar. You will have to add rules for preprocessor tokens, though.
Your specific example can be handled by writing your own parser and ignoring macro expansion,
because FOO(1) can itself be parsed as a function call.
When more cases are considered, however, the parser becomes much more difficult. You can refer to PDF Link to find more information.
Suppose I have a call to a function which takes a variable number of arguments in my source code. I want to do some kind of static analysis on this source code to find the type of the arguments being actually passed to the function. For example, if my function call is -
foo(a, b, c)
I want to find the data type of a, b and c and store this information.
You pretty well have to do the parse-and-build-a-symbol-table part of compiling the program.
Which means running the preprocessor, and lexing as well.
That's the bad news.
The good news is that you don't have to do much of the hard stuff. There is no need to build an AST: every part of the code can be a no-op except typedefs; struct, union, and enum definitions; variable and function declarations and definitions; and the analysis of the function call arguments.
On further thought prompted by Chris' comments: You do have to be able to analyze the types of expressions and handle the va-arg promotions, as well.
It is still a smaller project than writing a whole compiler, but should be approached with some thought.
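To illustrate why the promotions matter, here is a minimal sketch using a hypothetical variadic function sum_args with a made-up type-letter convention ('i' for int, 'd' for double). The declared types of the arguments at the call site are not the types the callee actually receives, so an analysis that records only declared types would get this wrong.

```c
#include <stdarg.h>

/* Sum the variadic arguments described by `types`:
 * 'i' means the argument is read as int, 'd' as double. */
double sum_args(const char *types, ...)
{
    va_list ap;
    double total = 0.0;

    va_start(ap, types);
    for (; *types != '\0'; types++) {
        if (*types == 'i')
            total += va_arg(ap, int);     /* char and short arrive as int */
        else if (*types == 'd')
            total += va_arg(ap, double);  /* float arrives as double */
    }
    va_end(ap);
    return total;
}
```

Calling sum_args("iid", c, s, f) with a char c, a short s, and a float f works because the default argument promotions turn them into int, int, and double. The static analysis would therefore need to record the promoted types, not char, short, and float.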
If this is in C++, you can hack together some RTTI using typeid etc.
http://en.wikipedia.org/wiki/Run-time_type_information