Translate headers to Cython - c

Is it feasible for Cython to have the ability to translate C headers into Cython directives? (See here, in Conditional Compilation).
A similar suggestion was made here too.
In my case, I would like these C-directives in my .h:
/* myheader.h */
#define MONIT_STEP 1 // stuff for step monit
//#define MONIT_SCATTERING 1 // monitor particle scattering
#define clight (3.0*1e10)
//#define NORM(x,y,z) (sqrt(x*x+y*y+z*z)) // maybe this is too much to ask?
to be translated to:
# cython_header.pyx
DEF MONIT_STEP = 1
DEF clight = (3.0*1e10)
So that later, I can do:
include "cython_header.pyx"
in any other .pyx code I want to compile.
Of course, this implies the ability to ignore everything after any "//" in myheader.h.
I left NORM(x,y,z) commented out because I don't see it as trivial to implement, due to its "function" nature (i.e. it's not just copy-paste).
I thought I could catch the C-preprocessor with this (see Cython docs, in "Referencing C header files"):
cdef extern from "spam.h":
pass
but it doesn't work.
Of course, I can always use this method, but I'm hoping we can be more consistent.
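For plain object-like macros, the translation described above is mechanical enough to script. The following is a minimal sketch in C, not a Cython feature: define_to_def is a hypothetical helper name, and it assumes one directive per line, no line continuations, and no "//" inside string literals. Function-like macros such as NORM(x,y,z) are deliberately skipped.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Cython): converts one header line of the
 * form "#define NAME value" into "DEF NAME = value".
 * Returns 1 and fills `out` on success. Returns 0 for anything else:
 * commented-out lines, function-like macros, and defines without a value.
 */
int define_to_def(const char *line, char *out, size_t out_size)
{
    char buf[512];
    char *p, *name, *value, *end;

    /* Work on a copy and cut the line at the first "//". */
    snprintf(buf, sizeof buf, "%s", line);
    if ((p = strstr(buf, "//")) != NULL)
        *p = '\0';

    /* The line must start with "#define" followed by whitespace. */
    p = buf;
    while (isspace((unsigned char)*p))
        p++;
    if (strncmp(p, "#define", 7) != 0 || !isspace((unsigned char)p[7]))
        return 0;
    p += 7;
    while (isspace((unsigned char)*p))
        p++;

    /* The macro name is a run of [A-Za-z0-9_] followed by whitespace.
       A following '(' means a function-like macro, which is rejected. */
    name = p;
    while (isalnum((unsigned char)*p) || *p == '_')
        p++;
    if (p == name || !isspace((unsigned char)*p))
        return 0;
    *p++ = '\0';

    /* The value is whatever remains, with surrounding whitespace trimmed. */
    value = p;
    while (isspace((unsigned char)*value))
        value++;
    end = value + strlen(value);
    while (end > value && isspace((unsigned char)end[-1]))
        end--;
    *end = '\0';
    if (*value == '\0')
        return 0;

    snprintf(out, out_size, "DEF %s = %s", name, value);
    return 1;
}
```

A small driver could read myheader.h line by line with fgets, call this helper on each line, and write every successful result to cython_header.pyx.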

Related

Preprocess C files, but only expand #ifdefs? [duplicate]

Original Question
What I'd like is not a standard C pre-processor, but a variation on it which would accept from somewhere - probably the command line via -DNAME1 and -UNAME2 options - a specification of which macros are defined, and would then eliminate dead code.
It may be easier to understand what I'm after with some examples:
#ifdef NAME1
#define ALBUQUERQUE "ambidextrous"
#else
#define PHANTASMAGORIA "ghostly"
#endif
If the command were run with '-DNAME1', the output would be:
#define ALBUQUERQUE "ambidextrous"
If the command were run with '-UNAME1', the output would be:
#define PHANTASMAGORIA "ghostly"
If the command were run with neither option, the output would be the same as the input.
This is a simple case - I'd be hoping that the code could handle more complex cases too.
To illustrate with a real-world but still simple example:
#ifdef USE_VOID
#ifdef PLATFORM1
#define VOID void
#else
#undef VOID
typedef void VOID;
#endif /* PLATFORM1 */
typedef void * VOIDPTR;
#else
typedef mint VOID;
typedef char * VOIDPTR;
#endif /* USE_VOID */
I'd like to run the command with -DUSE_VOID -UPLATFORM1 and get the output:
#undef VOID
typedef void VOID;
typedef void * VOIDPTR;
Another example:
#ifndef DOUBLEPAD
#if (defined NT) || (defined OLDUNIX)
#define DOUBLEPAD 8
#else
#define DOUBLEPAD 0
#endif /* NT */
#endif /* !DOUBLEPAD */
Ideally, I'd like to run with -UOLDUNIX and get the output:
#ifndef DOUBLEPAD
#if (defined NT)
#define DOUBLEPAD 8
#else
#define DOUBLEPAD 0
#endif /* NT */
#endif /* !DOUBLEPAD */
This may be pushing my luck!
Motivation: large, ancient code base with lots of conditional code. Many of the conditions no longer apply - the OLDUNIX platform, for example, is no longer made and no longer supported, so there is no need to have references to it in the code. Other conditions are always true. For example, features are added with conditional compilation so that a single version of the code can be used for both older versions of the software where the feature is not available and newer versions where it is available (more or less). Eventually, the old versions without the feature are no longer supported - everything uses the feature - so the condition on whether the feature is present or not should be removed, and the 'when feature is absent' code should be removed too. I'd like to have a tool to do the job automatically because it will be faster and more reliable than doing it manually (which is rather critical when the code base includes 21,500 source files).
(A really clever version of the tool might read #include'd files to determine whether the control macros - those specified by -D or -U on the command line - are defined in those files. I'm not sure whether that's truly helpful except as a backup diagnostic. Whatever else it does, though, the pseudo-pre-processor must not expand macros or include files verbatim. The output must be source similar to, but usually simpler than, the input code.)
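The simplest part of this behaviour can be sketched in a few dozen lines. This is an illustration only, not a replacement for a real tool: partial_ifdef and in_list are hypothetical names, and the sketch assumes it only has to resolve plain #ifdef NAME / #ifndef NAME blocks. #if expressions (such as the DOUBLEPAD example) are passed through unchanged, an #elif attached to a resolved #ifdef is not handled, and lines are assumed to be shorter than 1024 characters.

```c
#include <stdio.h>
#include <string.h>

#define MAX_DEPTH 64

/* State of each open conditional: left alone, branch kept, branch dropped. */
enum state { UNKNOWN, KEEP, DROP };

/* Returns 1 if `name` appears in the NULL-terminated array `list`. */
static int in_list(const char *name, const char *const *list)
{
    for (; *list != NULL; list++)
        if (strcmp(name, *list) == 0)
            return 1;
    return 0;
}

/*
 * Copies `in` to `out`. Names in `defs` are treated as if given with -D,
 * names in `undefs` as if given with -U. Returns 0 on success, -1 if the
 * conditionals nest deeper than MAX_DEPTH.
 */
int partial_ifdef(FILE *in, FILE *out,
                  const char *const *defs, const char *const *undefs)
{
    enum state stack[MAX_DEPTH];
    int depth = 0;      /* number of open conditionals */
    int dropping = 0;   /* how many of them are currently in a DROP branch */
    char line[1024], kw[16], name[128];

    while (fgets(line, sizeof line, in) != NULL) {
        /* n >= 1: the line is a directive; n == 2: it also has an operand. */
        int n = sscanf(line, " # %15[a-z] %127s", kw, name);
        if (n < 1)
            kw[0] = '\0';

        int is_ifdef = strcmp(kw, "ifdef") == 0;
        int is_ifndef = strcmp(kw, "ifndef") == 0;

        if (is_ifdef || is_ifndef || strcmp(kw, "if") == 0) {
            enum state s = UNKNOWN;
            if (depth == MAX_DEPTH)
                return -1;
            if ((is_ifdef || is_ifndef) && n == 2) {
                if (in_list(name, defs))
                    s = is_ifdef ? KEEP : DROP;
                else if (in_list(name, undefs))
                    s = is_ifdef ? DROP : KEEP;
            }
            if (s == UNKNOWN && !dropping)
                fputs(line, out);   /* unresolved: keep the directive */
            if (s == DROP)
                dropping++;
            stack[depth++] = s;
        } else if (strcmp(kw, "else") == 0 && depth > 0) {
            enum state *top = &stack[depth - 1];
            if (*top == KEEP) { *top = DROP; dropping++; }
            else if (*top == DROP) { *top = KEEP; dropping--; }
            else if (!dropping) fputs(line, out);
        } else if (strcmp(kw, "endif") == 0 && depth > 0) {
            enum state s = stack[--depth];
            if (s == DROP)
                dropping--;
            else if (s == UNKNOWN && !dropping)
                fputs(line, out);
        } else if (!dropping) {
            fputs(line, out);       /* ordinary line in a live branch */
        }
    }
    return 0;
}
```

Called with defs containing USE_VOID and undefs containing PLATFORM1, the sketch reduces the USE_VOID example above to the three lines shown for it. With both lists empty, every line is copied through unchanged.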
Status Report (one year later)
After a year of use, I am very happy with 'sunifdef' recommended by the selected answer. It hasn't made a mistake yet, and I don't expect it to. The only quibble I have with it is stylistic. Given an input such as:
#if (defined(A) && defined(B)) || defined(C) || (defined(D) && defined(E))
and run with '-UC' (C is never defined), the output is:
#if defined(A) && defined(B) || defined(D) && defined(E)
This is technically correct because '&&' binds tighter than '||', but it is an open invitation to confusion. I would much prefer it to include parentheses around the sets of '&&' conditions, as in the original:
#if (defined(A) && defined(B)) || (defined(D) && defined(E))
However, given the obscurity of some of the code I have to work with, for that to be the biggest nit-pick is a strong compliment; it is a valuable tool to me.
The New Kid on the Block
Having checked the URL for inclusion in the information above, I see that (as predicted) there is a new program called Coan that is the successor to 'sunifdef'. It is available on SourceForge and has been since January 2010. I'll be checking it out...further reports later this year, or maybe next year, or sometime, or never.
I know absolutely nothing about C, but it sounds like you are looking for something like unifdef. Note that it hasn't been updated since 2000, but there is a successor called "Son of unifdef" (sunifdef).
You can also try this tool: http://coan2.sourceforge.net/
Something like this will remove the #ifdef blocks:
coan source -UYOUR_FLAG --filter c,h --recurse YourSourceTree
I used unifdef years ago for just the sort of problem you describe, and it worked fine. Even if it hasn't been updated since 2000, the syntax of preprocessor ifdefs hasn't changed materially since then, so I expect it will still do what you want. I suppose there might be some compile problems, although the packages appear recent.
I've never used sunifdef, so I can't comment on it directly.
Around 2004 I wrote a tool that did exactly what you are looking for. I never got around to distributing the tool, but the code can be found here:
http://casey.dnsalias.org/exifdef-0.2.zip (that's a dsl link)
It's about 1.7k lines and implements enough of the C grammar to parse preprocessor statements, comments, and strings using bison and flex.
If you need something similar to a preprocessor, the flexible solution is Wave (from boost). It's a library designed to build C-preprocessor-like tools (including such things as C++03 and C++0x preprocessors). As it's a library, you can hook into its input and output code.

Dynamically prefix macro names with a variadic macro

Background
I've utilized a set of preprocessor macros from another question that allows me to prefix symbol names (enums, function names, struct names, etc) in my source, i.e.:
#include <stdio.h>
#define VARIABLE 3
#define PASTER(x,y) x ## _ ## y
#define EVALUATOR(x,y) PASTER(x,y)
#define NAME(fun) EVALUATOR(fun, VARIABLE)
void NAME(func)(int i);
int main(void)
{
NAME(func)(123);
return 0;
}
void NAME(func)(int i)
{
printf("i is %d in %s.\n", i, __func__);
}
Problem
This works as expected, with the following output: i is 123 in func_3.
Edit
I would like this code:
#define NAME(SOME_MACRO_CONST) (123)
#define NAME(SOME_MACRO_CONST2) (123)
To expand to:
#define 3SOME_MACRO_CONST (123)
#define 3SOME_MACRO_CONST2 (123)
I realize the macro shouldn't start with a digit. In the final code I'll be using names like LIB_A_ and LIB_B_ as prefixes.
/Edit
However, if I attempt to do the same with macros as the arguments to my NAME variadic macro, it fails like so:
Re-using NAME macro:
Code
#define NAME(MY_CONST) (3)
Output
test.c:7:0: warning: "NAME" redefined
#define NAME(MY_CONST) 3
Manually pasting prefix:
Code:
#define VARIABLE ## MY_CONST (3)
Output:
test.c:8:18: error: '##' cannot appear at either end of a macro expansion
#define VARIABLE ## MY_CONST (3)
Question
How can I create simple macro definitions (name + value) that have a common prefix for all the macros? The goal is to be able to make multiple copies of the source file and compile them with different flags so all versions can be linked together into the same final binary without symbol/macro name collisions (the macros will later be moved into header files). The final file will be too big to write in something like M4 or a template language. Ideally, the solution would involve being able to use a single macro-function/variadic-macro for all use cases, but I'm OK with one macro for symbol prefixing, and another for macro-name prefixing.
I would like this code:
#define NAME(SOME_MACRO_CONST) (123)
#define NAME(SOME_MACRO_CONST2) (124)
To expand to:
#define 3SOME_MACRO_CONST (123)
#define 3SOME_MACRO_CONST2 (124)
(I corrected the second number to 124 to make it different from the first one, for readability purposes)
This is impossible with the C preprocessor
for several reasons:
3SOME_MACRO_CONST is not a valid identifier (both for the preprocessor, and for the C compiler itself) since it does not start with a letter or an underscore. So let's assume you want your code to be expanded to:
/// new desired expansion
#define THREE_SOME_MACRO_CONST (123)
#define THREE_SOME_MACRO_CONST2 (124)
this is still impossible, because the preprocessor works before anything else and cannot generate any preprocessor directive (e.g. #define).
A workaround, if you only want to #define some numbers (computable at compile-time !!!) might be to expand to some anonymous enum like
enum {
THREE_SOME_MACRO_CONST= 123,
THREE_SOME_MACRO_CONST2= 124,
};
and you know how to do that in the details. Read also about X-macros.
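As a rough sketch of the enum workaround combined with an X-macro list: the PASTER and EVALUATOR macros are taken from the question, while PREFIX, CONSTANT_LIST and AS_ENUMERATOR are hypothetical names introduced here. Note that NAME pastes the prefix in front of the symbol, unlike the question's version, which appends a suffix. It assumes a C11 compiler (for static_assert) and values that fit in an int.

```c
#include <assert.h>   /* static_assert (C11) */

#define PREFIX LIB_A                  /* hypothetical per-copy prefix */
#define PASTER(x, y) x ## _ ## y
#define EVALUATOR(x, y) PASTER(x, y)
#define NAME(sym) EVALUATOR(PREFIX, sym)

/* Single list of (name, value) pairs; X is applied to each entry. */
#define CONSTANT_LIST(X)          \
    X(SOME_MACRO_CONST,  123)     \
    X(SOME_MACRO_CONST2, 124)

/* Expand every entry to "LIB_A_<name> = (<value>)," in an anonymous enum. */
#define AS_ENUMERATOR(sym, value) NAME(sym) = (value),
enum { CONSTANT_LIST(AS_ENUMERATOR) };
#undef AS_ENUMERATOR

/* The prefixed names are ordinary enumerators, usable as constants. */
static_assert(LIB_A_SOME_MACRO_CONST == 123, "prefix was pasted onto the name");
static_assert(LIB_A_SOME_MACRO_CONST2 == 124, "prefix was pasted onto the name");
```

Each copy of the source would then be compiled with a different PREFIX. The resulting names are enumerators rather than macros, so they cannot be tested with #ifdef or used in #if expressions.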
However, even if you can change your requirement to something that is possible, it might not be advisable, because your code becomes very unreadable (IMHO). You could sometimes consider writing some simple script (e.g. in sed or awk ...), or use some other preprocessor like GPP, to generate a C file from something else.
Notice that most serious build automation tools (like GNU make or ninja), and even IDEs (they can be configured to do so), make it quite easy (by adding extra targets, recipes, commands, etc.) to generate some C (or C++) code from some other file, and that this meta-programming practice has been used routinely for decades (e.g. bison, flex, autoconf, rpcgen, Qt moc, SWIG ...), so I am surprised you cannot do so. Generating a header file containing many #define-s is so common a practice that I am surprised you are forbidden to do so. Perhaps you just need to discuss with your manager or colleagues. Maybe you need to look for some more interesting job.
Personally, I am very fond of such meta-programming approaches (I did my PhD on these in 1990, and I would discuss them at every job interview; a job where metaprogramming is forbidden is not for me. Look for example at my past GCC MELT project, and my future project also will have metaprogramming). Another way of promoting that approach is to defend domain specific languages (and the ability to make your DSL inside some large software project; for example the GCC compiler has about a dozen of such DSLs inside it....). Then, your DSL can (naturally) be compiled to C which is a common practice. On modern operating systems that generated C code could be compiled at runtime and dynamically loaded as a (generated) plugin (using dlopen on POSIX...)
Sometimes, you can trick the compiler. For a project compiled by GCC, you could consider writing your GCC plugin..... (that is a lot more work than adding a command generating C code; your plugin could provide extra magic pragmas or builtins or attributes used by some other macros).
You could also configure the spec file of your gcc to handle specifically some C files. Beware, that could affect every future compilation!

C Header Files - Dividing the main code

I am doing my first "big/medium" project for school and I need to divide my code into several other C files. My question is whether it is better to have many source/header files with only a little code in each, or fewer source/header files with a little more code and more functions in each.
Thank you!
p.s. I am a newbie programmer, so be patient and try to make the explanation easy to understand.
My experience is that grouping code into source/header files according to functionality makes it easier to understand, test, maintain and reuse.
How much code goes into each file will really depend on how complex the encapsulated functionality is. For example, I have a source file containing functions to create and append to WAV files. They are relatively small, and because they are cohesive, I can use them in whatever project I have that needs to create WAV files without bringing in a lot of other baggage. Other files may be large (or very large) but if the functionality is cohesive, I get the same benefits.
One thing that tripped me up when I started doing this was “multiple inclusions” caused by including the same header in a project multiple times without “protecting” it. Since you say you are a newbie, I’ll add a quick sample of what you can do to prevent it.
/**
@file my_header.h
*/
#ifndef MY_HEADER_H // <- Prevents multiple inclusions
#define MY_HEADER_H // <- ...
#ifdef __cplusplus // <- Allows this to be called from c++
extern "C" { // <- See "name mangling" for more info.
#endif // <- ...
/**************************/
// your stuff goes here
struct my_struct
{
// ...
};
// function prototypes, etc.
/**************************/
#ifdef __cplusplus
}
#endif
#endif // MY_HEADER_H

How do I create a sophisticated macro check for resources in a static embedded OS?

I have an embedded OS that needs its resources to be defined statically by compile time.
So e.g.
#define NUM_TASKS 200
At the moment, I have one header file where every developer needs to declare the tasks he/she needs, kind of like this:
#define ALL_TASKS ( \
1 + /* need one task in module A */ \
2   /* need two tasks in module B */ \
)
and during compilation of the OS, there is a check:
#if (ALL_TASKS > NUM_TASKS)
#error Please increase NUM_TASKS in CONF.H
#endif
So when I compile and more Tasks are needed the compilation stops and gives explicit notice that the static OS won't have enough tasks for this to work.
So far so good.
Enter the lazy programmer that forgets to add the tasks he added in module A to the global declaration file in directory x/y/z/foo/bar/baz/.
What I would like is the following construct, which I can't seem to achieve with any macro tricks I tried:
Have macros to declare the resources needed in a module like so:
OS_RESERVE_NUMBER_OF_TASKS(2)
placed in a module, this adds 2 to the global number of tasks.
My first rough idea was something like this:
#define OS_RESERVE_NUMBER_OF_TASKS(max_tasks) #undef RESOURCE_CALC_TEMP_MAX_NUM_TASKS \
#define RESOURCE_CALC_TEMP_MAX_NUM_TASKS RESOURCE_CALC_MAX_NUM_TASKS + max_tasks \
#undef RESOURCE_CALC_MAX_NUM_TASKS \
#define RESOURCE_CALC_MAX_NUM_TASKS RESOURCE_CALC_TEMP_MAX_NUM_TASKS \
#undef RESOURCE_CALC_TEMP_MAX_NUM_TASKS
but that doesn't work because a #define in a #define does not work.
So the question basically is:
Do you have an idea how it would be possible to split the calculation of the number of tasks into multiple files (namely the modules themselves) and have that number compared to the defined max number of tasks during compile time?
If this isn't solvable with pure C preprocessor, I'll have to wait until we change the make system to scons...
Can you require that tasks have some ID or similar which all task users must use?
tasks.h
enum {
TASK_ID_TASK_A_1,
TASK_ID_TASK_B_1,
TASK_ID_TASK_B_2,
NUM_TASKS
};
void * giveResource(int taskId, int resourceId);
user_module_B.c
#include "tasks.h"
...
resource = giveResource(TASK_ID_TASK_B_1, resource_id_B_needs);
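With the enum in tasks.h, the count of declared tasks becomes a compile-time constant that can be checked directly. The following is a minimal sketch, assuming a C11 compiler: MAX_TASKS is a hypothetical name standing in for the limit configured in CONF.H (the question calls that limit NUM_TASKS, which the enum above reuses for the count).

```c
#include <assert.h>   /* static_assert (C11) */

/* Hypothetical configured limit, standing in for the value in CONF.H. */
#define MAX_TASKS 200

enum {
    TASK_ID_TASK_A_1,
    TASK_ID_TASK_B_1,
    TASK_ID_TASK_B_2,
    NUM_TASKS          /* always last: equals the number of IDs above */
};

/* Fails the build if more task IDs are declared than the OS provides. */
static_assert(NUM_TASKS <= MAX_TASKS, "Please increase MAX_TASKS in CONF.H");
```

Because a module can only obtain a task through an ID, forgetting to add the ID to the enum shows up as an undeclared identifier when that module is compiled, rather than as a silently wrong total.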
You can update the value of a macro in the course of translating one unit using the Boost Preprocessor library's evaluated slots functionality. It defines several "slots" that can be treated as mutable global variables by preprocessor code, which will let you add to a value as you go along rather than defining it with a single expression.
(It's pure standard C, but the code that implements it is rather complicated)
You can't do it in one line, because it relies on calls to #include, and that means you can't wrap it up into a pretty one-liner macro, but it would look something like this:
#include <boost/preprocessor/slot/slot.hpp>
#define TASK_SLOT 2 //just pick one
#define ALL_TASKS BOOST_PP_SLOT(TASK_SLOT)
#define BOOST_PP_VALUE 1 + 2
#include BOOST_PP_ASSIGN_SLOT(TASK_SLOT) // ALL_TASKS now evals to 3
#define BOOST_PP_VALUE 3 + ALL_TASKS
#include BOOST_PP_ASSIGN_SLOT(TASK_SLOT) // ALL_TASKS now evals to 6
As Drew comments (I may have misunderstood your req.), this is only valid for one translation unit. Which is fine if you have a central NUM_TASKS and each unit is allowed to add up its own ALL_TASKS figure.
If you need the value to increment across multiple .c files (so that the final ALL_TASKS is the total across all modules, not for one), you'd need to wrap them up into a unity build for this technique to work, which most people reckon is a bad idea. A more advanced build system would then probably be appropriate, because the preprocessor is only designed to work on single units.

Sensible way to write function prototypes

I'm looking for a (clean) way of writing a function definition and a function prototype without code duplication. Since DRY is well established as a good idea and hand coding prototypes in header files is a clear violation this seems like a reasonable requirement.
The example code below indicates a (crude) way of solving the problem with the preprocessor. It seems unlikely to be optimal, but does appear to work correctly.
Using separate files and duplication:
foo.h:
#ifndef FOO_H
#define FOO_H
// Normal header file stuff
int dofoo(int a);
#endif /* FOO_H */
foo.c:
#include "foo.h"
int dofoo(int a) {
return a * 2;
}
Using the C preprocessor:
foo.h:
#ifndef FOO_H
#define FOO_H
// Normal header file stuff
#ifdef PROTOTYPE // if incorrect:
// No consequences for this test case, but we lose a sanity check
#error "PROTOTYPE set elsewhere, include mechanism will fall over"
#endif
#define PROTOTYPE // if incorrect:
// "error: redefinition of 'dofoo'" in clang & gcc,
// referring to int dofoo() line in foo.c
#include "foo.c"
#undef PROTOTYPE //if incorrect:
// No warnings, but should trigger the earlier #error statement if
// this method is used in more than one file
#endif /* FOO_H */
foo.c:
#include "foo.h"
int dofoo (int a)
#ifdef PROTOTYPE // if incorrect:
// "error: redefinition of 'dofoo'" in clang & gcc,
// referring to int dofoo() line in foo.c
;
#else
{
return a * 2;
}
#endif
The mechanism is a bit odd - the .h file doesn't conventionally include the .c file! The include guard halts the recursion. It compiles cleanly and looks reasonable when run through a standalone preprocessor. Otherwise though, embedding preprocessor conditionals throughout the source doesn't look great.
There are a couple of alternative approaches I can think of.
Don't worry about the code duplication
Change to a language which generates the interface automatically
Use a code generator (e.g. sqlite's makeheaders)
A code generator would work but seems overkill as a solution for a minor annoyance. Since C has been around for somewhere over 25 years at this point there's hopefully a community consensus on the best path to take.
Thank you for reading.
edit: Compiler warnings with gcc 4.8.2 and clang 5.1
Messing up the macro statements produces fairly coherent compiler error messages. Missing an #endif (easily done if the function definition is long) produces "error: unterminated #else" or "error: unterminated conditional directive", both referring to the #ifdef line.
Missing #else means the code is no longer valid C. gcc "error: expected identifier or '(' before '{' token" and clang adds "expected function body after function declarator". Both point to the correct line number, but neither suggest an #else is missing.
Spelling PROTOTYPE wrong produces coherent messages if the result is fatal and no warning if the result doesn't matter. The compiler warnings aren't quite as specific as they can be when definition and declaration differ, but they're probably specific enough.
The generally accepted path is your option 1), to not worry and just write the declaration twice.
The repetition coming from prototypes is only a small percentage compared to the function implementations. Macro hacks like in your question quickly become unwieldy and provide little gain. The macro machinery ends up being just as much code as the original prototypes, only that it's now much harder to understand what's going on and that you'll get more cryptic error messages. The trivial to understand duplication gets replaced by about the same amount of much harder to understand trickery.
With normal prototypes the compiler will issue warnings when things don't match up, with such a macro base solution you get hard to understand errors if you forget an #endif or something else doesn't match up. For example any mention of foo.c in an error might be with or without PROTOTYPE defined.
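To illustrate the point about the compiler checking plain prototypes, here is a small sketch that places the contents of foo.h and foo.c from the question into one block, which is what the translation unit looks like after the #include is processed. The mismatching variant is shown only in a comment so that the block still compiles.

```c
/* --- foo.h (normally behind an include guard) --- */
int dofoo(int a);

/* --- foo.c, which includes foo.h --- */
int dofoo(int a)          /* must agree with the prototype above */
{
    return a * 2;
}

/*
 * If the definition drifted, for example to
 *     long dofoo(long a) { return a * 2; }
 * the compiler would see both declarations in one translation unit and
 * reject the file (gcc and clang report conflicting types for 'dofoo').
 */
```

The check only happens because foo.c includes its own header; a source file that skips that include gets no such diagnostic.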
I would like to look at this from another point of view. As I see it, the DRY principle is meaningful for code that provides logic; it is not about avoiding literally repeated strings.
Seen this way, it does not apply to declarations, as they introduce no logic. When you see several pieces of code that perform the same task with only the arguments changing, that is what should be avoided or refactored.
And this is what you actually do. You have just introduced new preprocessing logic into the code, i.e. #ifdef PROTOTYPE ... #else ... #endif, which you will repeat over and over, changing only the prototype and the body. If you could wrap it up into something that does not force you to repeat the branch, I'd say it would be somewhat OK.
But currently you really do repeat logic in the code just to eliminate a duplicated declaration, which is basically harmless in the context you provide. If you forget something, the compiler will tell you something is mismatched. It's C.
I'd say your proposed approach violates DRY more than the repeated declarations do.

Resources