I want to implement a new language, and I would like to do it in C, with the famous flex+yacc combination. The thing is, writing the whole AST code is very time-consuming. Is there a tool that automatically generates the constructors for the structs?
I would like something with the following behavior:
input:
enum AgentKind {A_KIND1, A_KIND2};
typedef struct Agent_st
{
enum AgentKind kind;
union {
struct {int a, b, c;} k1;
struct {int a; GList* rest;} k2;
} u;
} Agent;
output:
Agent* agent_A_KIND1_new(int a, int b, int c)
{
Agent* agent = (Agent*)malloc(sizeof(Agent));
agent->kind = A_KIND1;
agent->u.k1.a = a;
...
...
return agent;
}
Agent* agent_A_KIND2_new(int a, GList* rest)
{ ... }
Thank you!
You might be able to get something working with clever use of pre-processor macros.
First the header file:
#ifndef AST_NODE
# define AST_NODE(token) \
struct AST_ ## token \
{ \
int kind; \
};
#endif
AST_NODE(TokenType1)
AST_NODE(TokenType2)
Then the source file:
#define AST_NODE(token) \
struct AST_ ## token *AST_ ## token ## _new() \
{ \
struct AST_ ## token *node = malloc(sizeof(struct AST_ ## token)); \
node->kind = token; \
return node; \
}
#include "ast.h"
If you include the "ast.h" file in any other file, you will have two structures: AST_TokenType1 and AST_TokenType2.
The source file described above creates two functions, AST_TokenType1_new() and AST_TokenType2_new(), which allocate the correct structure and set its kind member.
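One caveat with the snippets as written: if AST_NODE is defined before ast.h is included, the #ifndef branch is skipped, so that translation unit never sees the struct definitions, and token would also have to name a value that can be stored in kind. A minimal single-file sketch of the same idea follows. The AST_NODE_LIST macro, the AST_Kind enum and the AST_KIND_ prefix are hypothetical names introduced here, not part of the answer above.

```c
#include <stdlib.h>

/* Hypothetical node list: one line per node type (names taken from the
 * placeholder tokens used above). */
#define AST_NODE_LIST(X) \
    X(TokenType1)        \
    X(TokenType2)

/* Pass 1: one enum constant per node type, so `kind` has a value to hold. */
#define AST_ENUM(token) AST_KIND_##token,
enum AST_Kind { AST_NODE_LIST(AST_ENUM) AST_KIND_COUNT };
#undef AST_ENUM

/* Pass 2: one struct per node type. */
#define AST_STRUCT(token) struct AST_##token { enum AST_Kind kind; };
AST_NODE_LIST(AST_STRUCT)
#undef AST_STRUCT

/* Pass 3: one constructor per node type. Returns NULL if malloc fails. */
#define AST_CTOR(token)                                   \
    struct AST_##token *AST_##token##_new(void)           \
    {                                                     \
        struct AST_##token *node = malloc(sizeof *node);  \
        if (node != NULL)                                 \
            node->kind = AST_KIND_##token;                \
        return node;                                      \
    }
AST_NODE_LIST(AST_CTOR)
#undef AST_CTOR
```

With this, AST_TokenType1_new() returns a node whose kind is AST_KIND_TokenType1, and adding a node type means adding one line to the list. Node-specific members and constructor arguments would still need a richer list format.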
Well, since there was no tool I decided to code something this afternoon.
I started something that looks like a nice project, and I would like to continue it.
I coded a fairly simple code generator in Haskell (just a bunch of nested folds inside the IO monad), based on built-in Haskell types.
The AST type declaration:
http://pastebin.com/gF9xF1vf
The C code generator, based on the AST declaration:
http://pastebin.com/83Z4GH38
And the generated result:
http://pastebin.com/jJPgm5PE
How can somebody not love Haskell?
:)
PS: I coded this because the project I'm currently working on is going to undergo a huge number of changes in the near future, and those changes will invalidate the AST, forcing me to code another AST module...
Now I can do it quite fast!
Thanks for the answer though.
I'm trying to define a Cpp macro, say ENO_DECL(eno, func) that expands as follows
#define ENO_DECL(eno, func) { eno, #eno, func }
so I can write, e.g., ENO_DECL(ENOMEM, resource_error). That works fine, of course. Some platforms may not define all errors, though, so we'd need something like this:
#ifdef EXYZ
ENO_DECL(EXYZ, type_error)
#endif
This is of course not so elegant. I'd like to get rid of the #ifdef and have ENO_DECL() expand to nothing if EXYZ is not defined. I've read a couple of interesting posts and blogs with really nice cpp tricks, but this one doesn't seem to be covered. I have the impression this can be done portably, though. Is that right? ... and how?
I'd like to get rid of the #ifdef
That is not possible. You could use some code generation tool or other preprocessing tools. For example, using m4:
#include <errno.h>
/** m4 code here
define(`ENO', `#ifdef $1
ENO_DECL($1, $2),
#endif')
*/
#define ENO(x) /* just so that IDE does not complain */
#define ENO_DECL(eno, func) { eno, #eno, func }
typedef struct {
int eno;
const char* str;
void (*handler)(void);
} eno_t;
const eno_t eno [] = {
ENO(ENOMEM, abc)
ENO(E_DOES_NOT_EXISTS, def)
};
You can inspect such code with, for example, m4 file.c.m4 | gcc -E -xc - and the preprocessor output is:
const eno_t eno [] = {
{
# 16 "<stdin>" 3 4
12
# 16 "<stdin>"
, "ENOMEM", abc },
};
Ideally you would get rid of all these macros to begin with. These aren't "nice tricks", they are last resorts, for when you are maintaining some badly written code and have to patch everything together with weird function-like macros and other desperate measures.
Anyway, it would appear that the solution you are looking for is "X macros". How they work is explained on Wikipedia etc. Basically it's an "industry de facto" standardized way to centralize all code repetition in a single place in the code base. Useful for code maintenance but should be avoided when developing new code. An example:
#define ENO_LIST(X) \
X(ENOMEM, resource_error) \
X(ENOFOO, foo_error) \
...
typedef struct
{
int eno;
const char* str;
void (*handler)(void);
} eno_t;
const eno_t eno [ENO_N] =
{
#define X(n, func) {n, #n, func},
ENO_LIST(X)
#undef X
};
This struct list will then expand to:
const eno_t eno [ENO_N] =
{
{ENOMEM, "ENOMEM", resource_error},
{ENOFOO, "ENOFOO", foo_error},
...
};
If ENOMEM is part of an enum with indices 0 to N, you could increase data integrity a bit further like this:
typedef enum
{
ENOMEM,
ENOFOO,
...
ENO_N
} enom;
...
#define X(n, func) [n] = {n, #n, func}, // [n] is a designated initializer
...
_Static_assert(sizeof eno / sizeof *eno == ENO_N, "Enum and struct don't match");
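To make the fragments above concrete, here is a small self-contained sketch. The handler functions, the eno_table and eno_lookup names, and the choice of ENOMEM and EDOM as list entries are assumptions for illustration. ENO_N is derived from the list itself here, since real errno values are not consecutive enum indices.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical handlers, standing in for the real ones. */
static void resource_error(void) { /* e.g. free caches and retry */ }
static void domain_error(void)   { /* e.g. report bad input */ }

/* The single place to maintain: one line per errno value. */
#define ENO_LIST(X)            \
    X(ENOMEM, resource_error)  \
    X(EDOM,   domain_error)

typedef struct {
    int eno;
    const char *str;
    void (*handler)(void);
} eno_t;

/* Count the entries: each list line contributes "+1". */
#define X(n, func) +1
enum { ENO_N = 0 ENO_LIST(X) };
#undef X

/* Build the table: each list line becomes one initializer. */
#define X(n, func) { n, #n, func },
static const eno_t eno_table[] = { ENO_LIST(X) };
#undef X

_Static_assert(sizeof eno_table / sizeof *eno_table == ENO_N,
               "List and table don't match");

/* Linear search by errno value; returns NULL for unknown codes. */
static const eno_t *eno_lookup(int code)
{
    for (size_t i = 0; i < ENO_N; i++) {
        if (eno_table[i].eno == code)
            return &eno_table[i];
    }
    return NULL;
}
```

Note that this still does not make an entry disappear when the platform lacks the errno constant; it only centralizes the list so that the table, the count and any other per-entry code stay in sync.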
I'm looking for a way to define a struct where the user can enable or disable struct members, as in this example (pseudo-code):
#define DEF_STRUCT_1(NAME,VAL1,VAL2) \
struct my_struct_t \
{ \
#if(NAME == TRUE) \
bool name; \
#endif \
#if(VAL1 == TRUE) \
bool val1; \
#endif \
#if(VAL2 == TRUE) \
bool val2; \
#endif \
} instance1
void main() {
DEF_STRUCT_1(TRUE,FALSE,TRUE);
instance1.name = true;
//instance1.val1 = false; // error, unavailable
instance1.val2 = false;
}
I'm not sure how useful this is, but the following should do what you ask:
#include <stdbool.h>
#define CONDITIONAL_TRUE(code) code
#define CONDITIONAL_FALSE(code)
#define DEF_STRUCT_1(NAME,VAL1,VAL2) \
struct my_struct_t \
{ \
CONDITIONAL_##NAME(bool name;) \
CONDITIONAL_##VAL1(bool val1;) \
CONDITIONAL_##VAL2(bool val2;) \
} instance1
int main() {
DEF_STRUCT_1(TRUE,FALSE,TRUE);
instance1.name = true;
//instance1.val1 = false; // error, unavailable
instance1.val2 = false;
}
All the TRUE/FALSE parameters would have to be available at compile-time. And if you want more than one version of these parameters to be used in the same program, you should make the struct name a parameter as well.
Since you say that this is intended for a library, it isn't clear how you're planning for the library code to be able to access this struct, since it would need to know which members are available. This significantly reduces the usefulness of this method.
A more common method used by libraries is to have a config.h file, editable by the library user, with definitions such as #define USE_NAME_MEMBER 1. Then you can make a normal struct definition with #if directives:
//in mylibrary.h:
#include <mylibrary_config.h>
struct my_struct_t {
#if USE_NAME_MEMBER
bool name;
#endif
// ...
};
Then you would also put #if directives around any library code that accesses the name member.
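A minimal sketch of that arrangement, collapsed into one file for brevity. The two #define lines stand in for the user-editable config header, and USE_VAL1_MEMBER and my_struct_reset are hypothetical names added for illustration.

```c
#include <stdbool.h>

/* Stand-in for the user-editable mylibrary_config.h. */
#define USE_NAME_MEMBER 1
#define USE_VAL1_MEMBER 0

/* Stand-in for mylibrary.h: members exist only if enabled. */
struct my_struct_t {
#if USE_NAME_MEMBER
    bool name;
#endif
#if USE_VAL1_MEMBER
    bool val1;
#endif
    bool val2;
};

/* Library code: every access to an optional member needs the same guard. */
static void my_struct_reset(struct my_struct_t *s)
{
#if USE_NAME_MEMBER
    s->name = false;
#endif
#if USE_VAL1_MEMBER
    s->val1 = false;
#endif
    s->val2 = false;
}
```

The library and the application must be compiled with the same configuration values, otherwise they disagree on the struct layout.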
Given that the struct needs to be generated differently at compile-time, given some conditions, you will be facing the problem that all code using the struct will need to be modified accordingly. Compiler switches (#ifdef FOO .... #endif) tend to scale badly with increased complexity. If there is a large number of struct members, all the needed compiler switches will make a horrible, unmaintainable mess out of the program.
There is a well-known design pattern known as "X macros" that can be used to centralize maintenance in a program to one single place, as well as allowing compile-time iteration over all items involved. They make the code hard to read too, and therefore they are a bit of a last resort. But they are something of a de facto standard and their ugliness doesn't scale with complexity, so they are preferable to compiler switch madness. It goes like this:
#define INSTANCE_LIST \
/* name, type */ \
X(name, bool) \
X(val1, bool) \
X(val2, bool)
typedef struct
{
#define X(name, type) type name;
INSTANCE_LIST
#undef X
} instance_t;
This code gets pre-processed into:
typedef struct
{
bool name;
bool val1;
bool val2;
} instance_t;
The only part that needs to be maintained is the "INSTANCE_LIST". By commenting out a line in the list, that struct member will go away. This means that all code using the struct has to use the same list accordingly. For example, let's add code to the same example that lists the init value of each member and then sets them:
#include <stdbool.h>
#include <stdio.h>
#define INSTANCE_LIST \
/* name, type, init */ \
X(name, bool, true) \
X(val1, bool, false) \
X(val2, bool, false)
typedef struct
{
#define X(name, type, init) type name;
INSTANCE_LIST
#undef X
} instance_t;
int main (void)
{
instance_t inst;
#define X(name, type, init) inst.name = init;
INSTANCE_LIST
#undef X
printf("%d ", inst.name);
printf("%d ", inst.val1);
printf("%d ", inst.val2);
}
Very flexible and maintainable - you can easily add more struct members without changing any other macro than the list. But as mentioned, the down-side is that the code looks quite cryptic, especially to those who aren't used to this design pattern.
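The same list can drive more than the struct and its initialization. As a hedged sketch, the version below also generates a table of member names through stringification; instance_member_names, INSTANCE_MEMBER_COUNT and instance_init are hypothetical names, not part of the example above.

```c
#include <stdbool.h>
#include <stddef.h>

#define INSTANCE_LIST        \
    /* name, type, init */   \
    X(name, bool, true)      \
    X(val1, bool, false)     \
    X(val2, bool, false)

/* Expansion 1: the struct members. */
typedef struct {
#define X(name, type, init) type name;
    INSTANCE_LIST
#undef X
} instance_t;

/* Expansion 2: the member names as strings, e.g. for debug output. */
static const char *const instance_member_names[] = {
#define X(name, type, init) #name,
    INSTANCE_LIST
#undef X
};

enum {
    INSTANCE_MEMBER_COUNT =
        sizeof instance_member_names / sizeof *instance_member_names
};

/* Expansion 3: assign every member its init value from the list. */
static void instance_init(instance_t *inst)
{
#define X(name, type, init) inst->name = init;
    INSTANCE_LIST
#undef X
}
```

Removing a line from INSTANCE_LIST removes the member, its name string and its initialization together, which is the point of the pattern.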
I am trying to write some reusable generic type-safe code in C, using macros, similar to how klib works:
#define Fifo_define(TYPE) \
\
typedef struct { \
TYPE *head; \
TYPE *tail; \
size_t capacity; \
} Fifo_##TYPE, *pFifo_##TYPE; \
\
static inline Fifo_##TYPE * Fifo_##TYPE##_init(size_t capacity) { \
Fifo_##TYPE * fifo = calloc(1, sizeof(Fifo_##TYPE)); \
TYPE * data = calloc(capacity, sizeof(TYPE)); \
fifo->head = data; \
fifo->tail = data; \
fifo->capacity = capacity; \
return fifo; \
}
// define macros
#define Fifo(TYPE) Fifo_##TYPE
#define Fifo_init(TYPE, capacity) Fifo_##TYPE##_init(capacity)
And then I just use it with any type parameter:
Fifo_define(int32_t);
...
Fifo(int32_t) *myFifo = Fifo_init(int32_t, 100);
However, writing this is rather convoluted and error prone, with no IDE editor support (IntelliSense), so I wondered if there are any tricks which might allow me to (perhaps) add a few defines and then include the file, without having to end each line with \?
Something like:
// no idea how to do this, just checking if similar concept is possible
#define FIFO_TYPE int
#define FIFO_NAME Fifo_int
#include <generic-fifo.h>
#undef FIFO_NAME
#undef FIFO_TYPE
And I would somehow get all the right structs and functions. The problem is that there is a lot of parameter concatenation in these macros, so I am not sure if this can be done in a simpler manner than the first snippet?
Not really recommended in this case, but you can do something like what you want to achieve with X-macros:
#define SUPPORTED_TYPES \
X(int) \
X(double) \
X(char)
#define X(TYPE) \
typedef struct { \
TYPE *head; \
TYPE *tail; \
size_t capacity; \
} Fifo_##TYPE, *pFifo_##TYPE;
SUPPORTED_TYPES
#undef X
#define X(TYPE) \
static inline Fifo_##TYPE * Fifo_##TYPE##_init(size_t capacity) \
{ \
Fifo_##TYPE * fifo = calloc(1, sizeof(Fifo_##TYPE)); \
TYPE * data = calloc(capacity, sizeof(TYPE)); \
fifo->head = data; \
fifo->tail = data; \
fifo->capacity = capacity; \
return fifo; \
}
SUPPORTED_TYPES
#undef X
But this didn't really improve the situation all that much. It got rid of the need for a single, ugly Fifo_define macro, so you can split up the code in several sections. But the macro mess remains.
I would recommend some completely different approach. Two suggestions:
Handle the type-generic things in the classic C way, in run-time. Use callbacks. Keep track of the used type with an enum, if needed.
C11 _Generic allows all kinds of type safety tricks and can be used to phase out such messy macros. Example that implements "functors". The macro itself is kept minimal and the different implementations for the various types are typed out. (That's usually what you end up doing anyway, when you do type-generic programming.)
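A rough sketch of the _Generic suggestion, assuming a C11 compiler. The per-type functions are written out by hand and only the dispatch macro Fifo_push is generic. The extra data member (the start of the buffer) and the push functions are assumptions added for illustration, and there is no wrap-around, so this is not a complete FIFO.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct { int    *data, *head, *tail; size_t capacity; } Fifo_int;
typedef struct { double *data, *head, *tail; size_t capacity; } Fifo_double;

static Fifo_int *Fifo_int_init(size_t capacity)
{
    Fifo_int *fifo = calloc(1, sizeof *fifo);
    if (fifo == NULL)
        return NULL;
    fifo->data = calloc(capacity, sizeof *fifo->data);
    if (fifo->data == NULL) {
        free(fifo);
        return NULL;
    }
    fifo->head = fifo->tail = fifo->data;
    fifo->capacity = capacity;
    return fifo;
}

static Fifo_double *Fifo_double_init(size_t capacity)
{
    Fifo_double *fifo = calloc(1, sizeof *fifo);
    if (fifo == NULL)
        return NULL;
    fifo->data = calloc(capacity, sizeof *fifo->data);
    if (fifo->data == NULL) {
        free(fifo);
        return NULL;
    }
    fifo->head = fifo->tail = fifo->data;
    fifo->capacity = capacity;
    return fifo;
}

/* Append at the tail; returns 0 on success, -1 if the buffer is full. */
static int Fifo_int_push(Fifo_int *fifo, int value)
{
    if ((size_t)(fifo->tail - fifo->data) >= fifo->capacity)
        return -1;
    *fifo->tail++ = value;
    return 0;
}

static int Fifo_double_push(Fifo_double *fifo, double value)
{
    if ((size_t)(fifo->tail - fifo->data) >= fifo->capacity)
        return -1;
    *fifo->tail++ = value;
    return 0;
}

/* The only macro: pick the implementation from the pointer type. */
#define Fifo_push(fifo, value)                 \
    _Generic((fifo),                           \
        Fifo_int *:    Fifo_int_push,          \
        Fifo_double *: Fifo_double_push)((fifo), (value))
```

Passing a pointer of any other type to Fifo_push is a compile-time error, which is where the type safety comes from. The price is that each supported type is spelled out, and an editor can at least index these functions normally.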
If you are using complex macros, consider using m4 instead of the C pre-processor. m4 is similar to the C pre-processor but is much more powerful and can do things like have multiple lines without a line continuation character.
Using code generators like m4 is called meta-programming.
Using m4 in C can be accomplished by treating it as a pre-pre-processor like this:
% grep -v '#include' file1 file2 | m4 > outfile
% m4 file1 file2 | cc
Since m4 works in a similar way to the C pre-processor at the basic level, it will generally convert any ordinary C macros correctly in addition to supporting its own advanced features.
I have a question about a syntax used in the Linux kernel code. I have an intuition of what it does, but I want to understand it more formally. I am using kernel v3.5.4.
In file /include/linux/sched.h the following is defined
struct task_struct {
volatile long state;
//some more data members
};
and in file /include/linux/init_task.h the following is defined:
#define INIT_TASK(tsk) \
{ \
.state = 0, \
/* some more initializations */ \
}
I am confused about two things:
a) I feel it is used for initialization, but can anyone suggest a good read on this type of initialization for structures?
b) I do not understand how this initialization works, i.e. how this #define and the corresponding task_struct structure are related.
[EDIT]
I noticed the following things also:
c) Is the \ at the end of every line necessary?
d) Many parts of the kernel code are wrapped in #ifdef ... #endif. If we want to initialize a data member that is wrapped in #ifdef ... #endif, can we use this form of initialization? I mean, can we use #ifdef ... #endif inside INIT_TASK() like this:
#define INIT_TASK(tsk) \
{ \
.state = 0, \
/* some more initializations */ \
#ifdef CX
.tickets = 5, \
#endif
}
struct task_struct whatever = INIT_TASK(someTsk);
This results in the following code:
struct task_struct whatever = { .state = 0 };
which is valid C syntax (a C99 designated initializer) for initializing the fields of a struct by name instead of by position. Doing so keeps the initializer correct even if members are later added to the struct somewhere other than at the end.
Regarding the backslashes: Yes, they are necessary so the preprocessor knows that the macro continues on the next line.
No, you cannot use #ifdef inside a macro definition: preprocessor directives are not recognized inside a macro's replacement list.
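The usual workaround is to move the condition outside the macro: define a small helper macro that expands either to the initializer or to nothing, and use that helper inside INIT_TASK. As far as I recall, the kernel's own init_task.h handles configuration-dependent members in this style. The sketch below uses a cut-down task_struct; CX and tickets come from the question, and INIT_TICKETS is a hypothetical helper name.

```c
/* Pretend configuration switch; remove this line to drop the member. */
#define CX 1

struct task_struct {
    volatile long state;
#ifdef CX
    int tickets;
#endif
};

/* The #ifdef lives outside INIT_TASK and selects what the helper expands to. */
#ifdef CX
# define INIT_TICKETS .tickets = 5,
#else
# define INIT_TICKETS
#endif

#define INIT_TASK(tsk) \
{                      \
    .state = 0,        \
    INIT_TICKETS       \
}

/* Expands to { .state = 0, .tickets = 5, } when CX is defined. */
static struct task_struct init_task = INIT_TASK(init_task);
```

When CX is not defined, INIT_TICKETS expands to nothing and the initializer is simply { .state = 0, }.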
I have a C program in which I need to create a whole family of functions which have the same signatures and bodies, and differ only in their types. What I would like to do is define a macro which generates all of those functions for me, as otherwise I will spend a long time copying and modifying the original functions. As an example, one of the functions I need to generate looks like this:
int copy_key__sint_(void *key, void **args, int argc, void **out) {
if ((*out = malloc(sizeof(int))) == NULL) {
return 1;
}
**((_int_ **) out) = *((_int_ *) key);
return 0;
}
The idea is that I could call a macro, GENERATE_FUNCTIONS("int", "sint") or something like this, and have it generate this function. The italicized parts (shown above between underscores, i.e. _sint_ and _int_) are what need to be plugged in.
Is this possible?
I don't understand the example function you give very well, but using macros for this task is relatively easy. You just wouldn't pass strings to the macro as arguments, but tokens:
#define DECLARE_MY_COPY_FUNCTION(TYPE, SUFFIX) \
int copy_function_ ## SUFFIX(unsigned count, TYPE* arg)
#define DEFINE_MY_COPY_FUNCTION(TYPE, SUFFIX) \
int copy_function_ ## SUFFIX(unsigned count, TYPE* arg) { \
/* do something with TYPE */ \
return whatever; \
}
You may then use this to declare the functions in a header file
DECLARE_MY_COPY_FUNCTION(unsigned, toto);
DECLARE_MY_COPY_FUNCTION(double, hui);
and define them in a .c file:
DEFINE_MY_COPY_FUNCTION(unsigned, toto);
DEFINE_MY_COPY_FUNCTION(double, hui);
In this version as stated here you might get warnings on superfluous `;'. But you can get rid of them by adding dummy declarations in the macros like this
#define DEFINE_MY_COPY_FUNCTION(TYPE, SUFFIX) \
int copy_function_ ## SUFFIX(unsigned count, TYPE* arg) { \
/* do something with TYPE */ \
return whatever; \
} \
enum { dummy_enum_for_copy_function_ ## SUFFIX }
Try something like this (I just tested the compilation, but not the result in an executed program):
#include <stdlib.h>
#define COPY_KEY(type, name) \
int name(void *key, void **args, int argc, void **out) { \
if ((*out = malloc(sizeof(type))) == NULL) { \
return 1; \
} \
**((type **) out) = *((type *) key); \
return 0; \
}
COPY_KEY(int, copy_key_sint)
For more on the subject of generic programming in C, read this blog, which contains a few examples, and also this book, which contains interesting solutions to the problem for basic data structures and algorithms.
That should work. To create copy_key_sint, use copy_key_ ## sint.
If you can't get this to work with CPP, then write a small C program which generates a C source file.
Wouldn't a macro which just takes sizeof(*key) and calls a single function that uses memcpy be a lot cleaner (less preprocessor abuse and code bloat) than making a new function for each type just so it can do a native assignment rather than memcpy?
My view is that the whole problem is your attempt to apply C++ thinking to C. C has memcpy for a very good reason.
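For what it's worth, the memcpy suggestion above could be sketched like this. The names copy_key_bytes and COPY_KEY are hypothetical, and the signature is simplified compared to the question's (no args/argc). Since key is a void * in the original signature, sizeof has to be applied at the call site to a typed pointer.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One function for all types: allocate `size` bytes and copy the key into them.
 * Returns 0 on success, 1 if allocation fails (same convention as the question). */
static int copy_key_bytes(const void *key, size_t size, void **out)
{
    *out = malloc(size);
    if (*out == NULL)
        return 1;
    memcpy(*out, key, size);
    return 0;
}

/* Thin wrapper: derive the size from the pointed-to type of a typed pointer. */
#define COPY_KEY(key_ptr, out) \
    copy_key_bytes((key_ptr), sizeof *(key_ptr), (out))
```

For example, with int k = 42; and void *copy;, the call COPY_KEY(&k, &copy) allocates sizeof(int) bytes and copies k into them. The caller frees the copy.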