There is a C structure
struct a
{
int val1,val2;
};
I have made changes to the code like
struct b
{
int val2;
};
struct a
{
int val1;
struct b b_obj;
};
Now, val2 is used in the other C files as a_obj->val2;.
There are a lot of such uses and I want to replace them all, so I have defined a macro in the header file where struct a is defined, as follows:
#define a_obj->val2 (a_obj->b_obj.val2)
It's not working. Is -> illegal in the identifier part of a macro definition #define?
Could someone please tell me where I am going wrong?
Edit, as suggested by @Basile:
It's a legacy source code, a very huge project. Not sure of LOC.
I want to make such changes because I want to make it more modular.
For example, I want to group similar fields of the structure under the same name, and that's the reason I want to create another struct B with the fields that are related to feature B and also common to A.
I can't use Find Replace feature of other text editors, I am using VIM.
This kind of macro magic will get you into trouble soon,
because it is making your source code unreadable and brittle (credits Basile for the phrasing).
But this should work for what you describe.
struct b
{
int val2m;
};
struct a
{
int val1;
struct b b_obj;
};
#define val2 b_obj.val2m
The trick is to give the actual identifier inside the struct declaration a new name (val2m), so that the name all the other code uses can be turned into a magic alias,
which then can contain the modified access to take a detour via the additionally introduced inner struct.
This is only a kind of band-aid for the problematic situation of having to change something backstage in existing code with many references. Only use it if there is no chance of refactoring the code cleanly. ("band-aid", appropriate image by StoryTeller, credits).
I explicitly recommend looking at Basile's answer for a cleaner, more "future-proof" way. It is the way to go to avoid the trouble I predict with this macro magic. Prefer it unless very good reasons force you to use the macro.
As others explained, the preprocessor works only on tokens, and you can only #define a name. Read the documentation of cpp and the C11 standard n1570.
What you want to do is very ugly (and there are few occasions where it is worthwhile). It makes your code messy, unreadable, and brittle.
Learn to use your source code editor better (you probably have some interactive replace, or interactive replace with regexps; if you don't, switch to a better editor like GNU emacs or vim, and study the documentation of your editor). You could also use scripting tools like ed, sed, grep, awk, etc. to help you with those replacements.
In a small project, replacing relevant occurrences of ->val2 (or .val2) with ->b_obj.val2 (or .b_obj.val2) is really easy, even if you have a hundred of them. And that keeps your code readable. Don't forget to use some version control system (to keep both old and new versions of your code).
In a large project of at least a million lines of source code, you might ask how to find every occurrence of field usage of val2 for a given type (but you should probably name val2 well enough to have most occurrences of it be relevant; in other words, take care of the naming of your fields). That is a very different question (e.g. you could write some GCC plugin to find such occurrences and help you in replacing the relevant ones).
If you are refactoring an old and large legacy code, you need to be sure to keep it readable, and you don't want fancy macro tricks. For example, you might add some static inline function to access that field. And it could be then worthwhile to use some better tools (e.g. a compiler plugin, some kind of C parser, etc...) to help you in that refactoring.
Keep the source code readable by human developers. Otherwise, you are shooting yourself in the foot. What you want to do is unreasonable, it decreases the readability of the code base.
I can't use Find Replace feature of other text editors, I am using VIM.
vim is scriptable (e.g. in lua) and accepts plugins (so if interactive replace is not enough, consider writing some vim plugin or script to help you), and has powerful find-replace-regexp facilities. You might also use some combination of scripts to help you. In many cases they are enough. If they are not, you should explain why.
Also, you could temporarily replace the val2 field of struct a with a unique name like val2_3TYRxW1PuK7 (or whatever is appropriate, making some unique "random-looking" name is easy). Then you run your full build (e.g. after some make clean). The compiler would emit error messages for every place where you need to replace val2 used as a field of struct a (but won't mind for any other occurrence of the val2 name used for some other purpose). That could help you a lot -once you have corrected your code to get rid of all errors- (especially when combined with some editor scripting) because then you just need to replace val2_3TYRxW1PuK7 with b_obj.val2 everywhere.
Is -> illegal in #define?
Yes.
A #define identifier can only contain letters, digits, and underscores.
Macro names must be regular identifiers, so you can't use any special character like - or >.
I thought that maybe you could use a union, like this:
struct b
{
int val2;
};
struct a
{
int val1;
union {
struct b b_obj;
int val2;
};
};
so you can still use a_obj->val2.
I need a tool to automatically flag unused structure members in a C codebase. My definition of "unused" is simple - if the structure member definition is removed from the code, and the code compiles successfully, then the structure member is declared unused. The question is - how can this be done in an automated way? (speed isn't too much of concern as the codebase is small).
The existing stack overflow articles on this topic seem to hint that there is no existing static analysis tool that can do this today. On the other hand, given the modularity of Clang, I feel that this should be doable with AST manipulation. Let's take a single file, for example. What I would like to do is the following (this can be later generalized to a set of source files in the codebase):
1. Generate AST from C code.
2. Recursively visit all structure field definitions and remove them one by one. We can keep a "seen" dictionary to ensure we don't remove already seen field definition nodes.
3. Filter out field definitions to only those that are present in the codebase to be analyzed (to avoid definitions in standard libraries, for example).
4. Compile the code.
5. If the code compiles successfully, then the corresponding field declaration is unused and is flagged.
6. Proceed to #1.
The keyword above is remove. How can I remove a field definition? There seem to be two ways using Clang.
1. At the source level, we can remove the field declaration using Clang Rewriter (there is a "RemoveText(SourceRange)" option). But I don't know if this will work all the time (e.g. for structures that are autogenerated using macro expansion).
2. Delete the field declaration node from the AST, and then "re-compile" the AST (whatever that means).
Among the above two options, #1 seems hacky - you'll need to create a copy of the source file, re-write it after a field definition is removed, and then re-compile the modified source. And, I am not sure how well it will work when there are complex MACROS involved for generating structure field definitions.
#2 seems clean, but from Googling, there seems to be no such thing as "deleting an AST node" (it is immutable). Please correct me if I am wrong. Even if I succeed in this, how do I proceed from this point to re-evaluate the AST for missing references to structure fields (the "compilation" step)?
Any suggestions appreciated (thanks in advance!). I already have some initial success with approach #1 above, but I feel that this isn't the right direction.
cppcheck can do this. For example:
// test.cpp
struct Struct
{
int used;
int unused;
};
int main()
{
Struct s;
s.used = 0;
return s.used;
}
$ cppcheck test.cpp --enable=all
Checking test.cpp ...
test.cpp:5:9: style: struct member 'Struct::unused' is never used. [unusedStructMember]
int unused;
^
While I used C++ code in the example, it behaves the same for C.
When developing and maintaining code, I add a new member to a structure and sometimes forget to add the code to initialize or free it which may later result in a memory leak, an ineffective assertion, or run-time memory corruption.
I try to maintain symmetry in the code where things of the same type are structured and named in a similar manner, which works for matching Construct() and Deconstruct() code but because structures are defined in separate files I can't seem to align their definitions with the functions.
Question: is there a way through coding to make myself more aware that I (or someone else) have changed a structure and that functions need updating?
Efforts:
The simple:
-Have improved code organization to help minimize the problem
-Have worked to get into the habit of updating everything at once
-Have used comments to document struct members, but this just results in duplication
-Do use IDE's auto-suggest to take a look and compare suggested entries to implemented code, but this doesn't detect changes.
I had thought that maybe structure definitions could appear multiple times as long as they were identical, but that doesn't compile. I believe duplicate structure names can appear as long as they do not share visibility.
The most effective thing I've come up with is to use a compile time assertion:
static_assert(sizeof(struct Foobar) == 128, "Foobar structure size changed, reevaluate construct and destroy functions");
It's pretty good, definitely good enough. I don't mind updating the constant when modifying the struct. Unfortunately compile time assertions are very platform (compiler) and C Standard dependent, and I'm trying to maintain the backwards compatibility and cross platform compatibility of my code.
This is a good link regarding C Compile Time Assertions:
http://www.pixelbeat.org/programming/gcc/static_assert.html
Edit:
I just had a thought; although a structure definition can't easily be relocated to a source file (unless it does not need to be shared with other source files), I believe a function can actually be relocated to a header file by inlining it.
That seems like a hacked way to make the language serve my unintended purpose, which is not what I want. I want to be professional. If the professional practice is not to approach this code-maintainability issue this way, then that is the answer.
I've been programming in C for almost 40 years, and I don't know of a good solution to this problem.
In some circles it's popular to use a set of carefully-contrived macro definitions so that you can write the structure once, not as a direct C struct declaration but as a sequence of these macros and then, by defining the macro differently and re-expanding, turn your "definition" into either a declaration or a definition or an initialization. Personally, I feel that these techniques are too obfuscatory and are more trouble than they're worth, but they can be used to decent effect.
Otherwise, the only solution -- though it's not what you're looking for -- is "Be careful."
In an ideal project (although I realize full well there's no such thing) you can define your data structures first, and then spend the rest of your time writing and debugging the code that uses them. If you never have occasion to add fields to structs, then obviously you won't have this problem. (I'm sorry if this sounds like a facetious or unhelpful comment, but I think it's part of the reason that I, just as @CoffeeTableEspresso mentioned in a comment, tend not to have too many problems like this in practice.)
It's perhaps worth noting that C++ has more or less the same problem. My biggest wishlist feature in C++ was always that it would be possible to initialize class members in the class declaration. (Actually, I think I've heard that a recent revision to the C++ standard does allow this -- in which case another not-necessarily-helpful answer to your question is "Use C++ instead".)
C doesn't let you have benign struct redefinitions but it does let you have benign macro redefinitions.
So as long as you
save the struct body in a macro (according to a fixed naming convention)
redefine the macro at the point of your constructor
you will get a warning if the struct body changes and you haven't updated the corresponding constructor.
Example:
header.h:
#define MC_foo_bod \
int x; \
double y; \
void *p
struct foo{ MC_foo_bod; };
foo__init.c:
#include "header.h"
#ifdef MC_foo_bod
//try for a silent redefinition
//if it wasn't silent, the macro changed and so should this code
#define MC_foo_bod \
int x; \
double y; \
void *p
#else
#error ""
//oops--not a redefinition
//perhaps a typo in the macro name or a failure to include the header?
#endif
void foo__init(struct foo*X)
{
//...
}
I have several structs in C and I want to write the following three functions:
get_field_list(...)
get_value_by_name(...)
set_value_by_name(...)
The first should return the list of fields defined in the struct. The second and third should get and set the appropriate field by its name.
I'm writing the structs. I'm willing to use any macro magic if required. It's OK if I'll have a triplet of functions per struct, but generic structures are better. Function pointers are also fine...
Basically I want some elementary reflection for structs....
Relevant:
https://natecraun.net/articles/struct-iteration-through-abuse-of-the-c-preprocessor.html
motivation
I'm trying to build a DAL (Data Access Layer) for a native app written in C. I'm using SQLite as a DB. I need to store various structures, and to be able to insert\ update\ get(select by key)\ search (select by query), and also to create\ drop the required table.
Basically I want something like Hibernate for C ...
My best idea so far is to use macros, or some code generation utility, or a script, to create my structs together with metadata I could use to dynamically build all my SQL commands. And also to have a small 'generic' module to implement all the basic procedures I need...
Different or better ideas to solve my actual problem will also be appreciated!
It can be done with "macro magic" as you suggested:
For each struct, create a header file (mystruct-fields.h) like this:
FIELD(int, field1)
FIELD(int*, field2)
FIELD(char*, string1)
Then, in another header (mystruct.h) you include that as many times as you need:
#define FIELD(T,N) T N;
struct mystruct {
#include "mystruct-fields.h"
};
#undef FIELD
#include <stddef.h> /* offsetof, size_t, NULL */
#define STRINGIFY1(S) #S
#define STRINGIFY(S) STRINGIFY1(S)
#define FIELD(T,N) { STRINGIFY(T), STRINGIFY(N), offsetof(struct mystruct, N) },
/* A plain array is used because standard C cannot initialize a flexible array member. */
struct mystruct_field {
const char *type, *name;
size_t offset;
} table[] = {
#include "mystruct-fields.h"
{NULL, NULL, 0}
};
#undef FIELD
You can then implement your reflection functions, using the table, however you choose.
It might be possible, using another layer of header file includes, to reuse the above code for any struct without rewriting it, so your top-level code might only have to say something like:
#define STRUCT_NAME mystruct
#include "reflectable-struct.h"
#undef STRUCT_NAME
Honestly though, it's easier for the people who come after you if you just write the struct normally, and then write out the table by hand; it's much easier to read, your IDE will be able to auto-complete your types, and prominent warnings in the comments should help prevent people breaking it in future (and anyway, you do have tests for this, right?)
The way to do it is to keep your struct definition in a database, XML, a text file, or whatever format you are comfortable with, and use a C program to write a .h file for each struct. The .h file contains the struct, an enum of the fields, and an array of strings containing the names of each field. From there you can build anything you need, preferably using a program generator.
Take a look at the Metaresc library. It provides reflection capabilities in plain C. Type metadata can be derived either from a custom macro language that replaces the standard C type definition syntax or from compiler debug info. A sample app is provided in README.md.
I'm writing a Scheme interpreter. For each built-in type (integer, character, string, etc) I want to have the read and print functions named consistently:
READ_ERROR Scheme_read_integer(FILE *in, Value *val);
READ_ERROR Scheme_read_character(FILE *in, Value *val);
I want to ensure consistency in the naming of these functions
#define SCHEME_READ(type_) Scheme_read_##type_
#define DEF_READER(type_, in_strm_, val_) READ_ERROR SCHEME_READ(type_)(FILE *in_strm_, Value *val_)
So that now, instead of the above, in code I can write
DEF_READER(integer, in, val)
{
// Code here ...
}
DEF_READER(character, in, val)
{
// Code here ...
}
and
if (SOME_ERROR != SCHEME_READ(integer)(stdin, my_value)) do_stuff(); // etc.
Now is this considered an unidiomatic use of the preprocessor? Am I shooting myself in the foot somewhere unknowingly? Should I instead just go ahead and use the explicit names of the functions?
If not, are there examples in the wild of this sort of thing done well?
I've seen this done extensively in a project, and there's a severe danger of foot-shooting going on.
The problem happens when you try to maintain the code. Even though your macro-ized function definitions are all neat and tidy, under the covers you get function names like Scheme_read_integer. Where this can become an issue is when something like Scheme_read_integer appears on a crash stack. If someone does a search of the source pack for Scheme_read_integer, they won't find it. This can cause great pain and gnashing of teeth ;)
If you're the only developer, and the code base isn't that big, and you remember using this technique years down the road and/or it's well documented, you may not have an issue. In my case it was a very large code base, poorly documented, with none of the original developers around. The result was much tooth-gnashing.
I'd go out on a limb and suggest using a C++ template, but I'm guessing that's not an option since you specifically mentioned C.
Hope this helps.
I'm usually a big fan of macros, but you should probably consider inlined wrapper functions instead. They will add negligible runtime overhead and will appear in stack backtraces, etc., when you're debugging.
Introduction
Hello folks, I recently learned to program in C! (This was a huge step for me, since C++ was the first language, I had contact with and scared me off for nearly 10 years.) Coming from a mostly OO background (Java + C#), this was a very nice paradigm shift.
I love C. It's such a beautiful language. What surprised me the most, is the high grade of modularity and code reusability C supports - of course it's not as high as in a OO-language, but still far beyond my expectations for an imperative language.
Question
How do I prevent naming conflicts between the client code and my C library code? In Java there are packages, in C# there are namespaces. Imagine I write a C library, which offers the operation "add". It is very likely, that the client already uses an operation called like that - what do I do?
I'm especially looking for a client friendly solution. For example, I wouldn't like to prefix all my api operations like "myuniquelibname_add" at all. What are the common solutions to this in the C world? Do you put all api operations in a struct, so the client can choose its own prefix?
I'm very much looking forward to the insights I get through your answers!
EDIT (modified question)
Dear Answerers, thank You for Your answers! I now see that prefixes are the only way to safely avoid naming conflicts. So, I would like to modify my question: What possibilities do I have to let the client choose his own prefix?
The answer Unwind posted, is one way. It doesn't use prefixes in the normal sense, but one has to prefix every api call by "api->". What further solutions are there (like using a #define for example)?
EDIT 2 (status update)
It all boils down to one of two approaches:
Using a struct
Using #define (note: There are many ways, how one can use #define to achieve, what I desire)
I will not accept any answer, because I think that there is no correct answer. The solution one chooses rather depends on the particular case and one's own preferences. I, by myself, will try out all the approaches You mentioned to find out which suits me best in which situation. Feel free to post arguments for or against certain approaches in the comments of the corresponding answers.
Finally, I would like to especially thank:
Unwind - for his sophisticated answer including a full implementation of the "struct-method"
Christoph - for his good answer and pointing me to Namespaces in C
All others - for Your great input
If someone finds it appropriate to close this question (as no further insights to expect), he/she should feel free to do so - I can not decide this, as I'm no C guru.
I'm no C guru, but from the libraries I have used, it is quite common to use a prefix to separate functions.
For example, SDL will use SDL, OpenGL will use gl, etc...
The struct way that Ken mentions would look something like this:
struct MyCoolApi
{
int (*add)(int x, int y);
};
struct MyCoolApi * my_cool_api_initialize(void);
Then clients would do:
#include <stdio.h>
#include <stdlib.h>
#include "mycoolapi.h"
int main(void)
{
struct MyCoolApi *api;
if((api = my_cool_api_initialize()) != NULL)
{
int sum = api->add(3, 39);
printf("The cool API considers 3 + 39 to be %d\n", sum);
}
return EXIT_SUCCESS;
}
This still has "namespace-issues"; the struct name (called the "struct tag") needs to be unique, and you can't declare nested structs that are useful by themselves. It works well for collecting functions though, and is a technique you see quite often in C.
UPDATE: Here's how the implementation side could look, this was requested in a comment:
#include "mycoolapi.h"
/* Note: This does **not** pollute the global namespace,
* since the function is static.
*/
static int add(int x, int y)
{
return x + y;
}
struct MyCoolApi * my_cool_api_initialize(void)
{
/* Since we don't need to do anything at initialize,
* just keep a const struct ready and return it.
*/
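static const struct MyCoolApi the_api = {
add
};
/* The cast drops const to match the declared return type;
 * callers must not modify the struct.
 */
return (struct MyCoolApi *)&the_api;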
}
It's a shame you got scared off by C++, as it has namespaces to deal with precisely this problem. In C, you are pretty much limited to using prefixes - you certainly can't "put api operations in a struct".
Edit: In response to your second question regarding allowing users to specify their own prefix, I would avoid it like the plague. 99.9% of users will be happy with whatever prefix you provide (assuming it isn't too silly) and will be very UNHAPPY at the hoops (macros, structs, whatever) they will have to jump through to satisfy the remaining 0.1%.
As a library user, you can easily define your own shortened namespaces via the preprocessor; the result will look a bit strange, but it works:
#define ns(NAME) my_cool_namespace_ ## NAME
makes it possible to write
ns(foo)(42)
instead of
my_cool_namespace_foo(42)
As a library author, you can provide shortened names as described here.
If you follow unwind's advice and create an API structure, you should make the function pointers compile-time constants to make inlining possible, i.e. in your .h file, use the following code:
// canonical name
extern int my_cool_api_add(int x, int y);
// API structure
struct my_cool_api
{
int (*add)(int x, int y);
};
typedef const struct my_cool_api *MyCoolApi;
// define in header to make inlining possible
static MyCoolApi my_cool_api_initialize(void)
{
static const struct my_cool_api the_api = { my_cool_api_add };
return &the_api;
}
Unfortunately, there's no sure way to avoid name clashes in C. Since it lacks namespaces, you're left with prefixing the names of global functions and variables. Most libraries pick some short and "unique" prefix (unique is in quotes for obvious reasons), and hope that no clashes occur.
One thing to note is that most of the code of a library can be statically declared - meaning that it won't clash with similarly named functions in other files. But exported functions indeed have to be carefully prefixed.
Since you are exposing functions with common names, a client cannot include your library's header files alongside other header files that declare the same names. In this case, you can add the following to your header file before the function prototype; it would not affect client usage either.
#define add myuniquelibname_add
Please note this is a quick fix solution and should be the last option.
For a really huge example of the struct method, take a look at the Linux kernel; 30-odd million lines of C in that style.
Prefixes are the only choice at the C level.
On some platforms (those that support separate namespaces for linkers, like Windows, OS X and some commercial Unices, but not Linux and FreeBSD) you can work around conflicts by stuffing code in a library and exporting only the symbols from the library you really need (and, e.g., aliasing in the import library in case there are conflicts in exported symbols).