In the Doxygen documentation I am writing, I have set ENABLE_PREPROCESSING = NO, because I want all of the code to be documented, independently of any #if statements.
The problem is that there is a #define that I need to be documented, but since I have disabled the preprocessor, nothing is generated for it (the other structures in that file are being documented just fine).
One option would be to enable the preprocessor and use the PREDEFINED option to set all the #if, but that is not realistically achievable in my case (too many of them).
Are there any other ways to achieve the intended result?
Thanks!
In the Doxygen documentation I am writing, I have set ENABLE_PREPROCESSING = NO, because I want all of the code to be documented, independently of any #if statements.
That's got some code smell to it. The interface presented by your code should be documented according to how it was built. It's pretty pointless to document features that could have been built but weren't, or to document alternative ways in which your features could have been built. Generally speaking, that means having Doxygen pre-process conditional-compilation directives.
And if you have conditional compilation that you intend for users of your library to trigger when they build their own programs, then I suggest taking a different approach: split your headers, so that your users select which headers to include instead of relying on conditional compilation to customize the content of a single header.
HOWEVER, if you must document all the code in every conditional-compilation branch in a single set of documentation, and you also want to document macros, then you could consider leaving preprocessing on, and filtering out the conditional compilation directives with an input filter. The latter part might be specified like this, for example:
INPUT_FILTER = "sed '/^[ ]*#[ ]*\(if\|el\|endif\)/ d'"
That does not account for line continuations, so as to keep it relatively simple, but even in that form it might be sufficient for your purposes. It could be augmented to handle line continuations if needed.
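As a rough sketch of what handling line continuations could look like, the filter can also be a small C program instead of a sed one-liner. Doxygen runs an INPUT_FILTER as `<filter> <input-file>` and reads the result from standard output, so a real filter would additionally need a main() that opens argv[1] and calls the function below with stdout; that wrapper is left out here. The function name strip_conditionals and the 4096-byte line limit are assumptions of this sketch. Unlike the sed command, it replaces each directive line with an empty line rather than deleting it, because Doxygen's documentation asks that filters neither add nor remove lines.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if the line starts with '#' followed by "if", "el" or "endif"
 * (optional blanks allowed), i.e. the same prefixes the sed pattern matches. */
static int is_conditional_directive(const char *line)
{
    while (*line == ' ' || *line == '\t') line++;
    if (*line != '#') return 0;
    line++;
    while (*line == ' ' || *line == '\t') line++;
    return strncmp(line, "if", 2) == 0 ||
           strncmp(line, "el", 2) == 0 ||
           strncmp(line, "endif", 5) == 0;
}

/* Return 1 if the physical line ends in a backslash (ignoring the newline). */
static int ends_with_backslash(const char *line)
{
    size_t n = strlen(line);
    while (n > 0 && (line[n - 1] == '\n' || line[n - 1] == '\r')) n--;
    return n > 0 && line[n - 1] == '\\';
}

/* Copy in to out, blanking conditional directives and their continuation
 * lines. Assumes every physical line fits in the buffer. */
void strip_conditionals(FILE *in, FILE *out)
{
    char line[4096];
    int in_directive = 0;  /* current logical line is a conditional directive */
    int continued = 0;     /* previous physical line ended in a backslash */

    assert(in != NULL && out != NULL);
    while (fgets(line, sizeof line, in) != NULL) {
        if (!continued)
            in_directive = is_conditional_directive(line);
        continued = ends_with_backslash(line);
        if (in_directive)
            fputs("\n", out);  /* keep the line count unchanged */
        else
            fputs(line, out);
    }
}
```

A multi-line #define is left alone even if one of its continuation lines happens to start with #if, because the directive check only runs at the start of a logical line.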
I am working on an open source C driver for a cheap sensor that is used mostly for Arduino projects. The project is set up in such a way that it is possible to support multiple platforms outside the Arduino ecosystem, like the Raspberry Pi.
The project is set up with a platform.h file, with the intention of having different implementations of this header file. Like the example below:
platform.h
platform_arduino.c
platform_rpi.c
platform_windows.c
There is this (Cross-Platform C++ code and single header - multiple implementations) Stack Overflow post that goes fairly in depth in how to handle this for C++ but I feel like none of those examples really apply to this C implementation.
I have come up with some solutions like just adding the requirements for each platform at the top of the file.
#if SOME_REQUIREMENT
#include "platform.h"
int8_t t_open(void)
{
// Implementation here
}
#endif //SOME_REQUIREMENT
But this seems like a clunky solution.
It impacts readability of the code. [1]
It will probably make debugging conflicting requirements a nightmare.
[1] Many editors (like VS Code) try to gray out code which does not match requirements. While I want this most of the time, it is really annoying when working on cross-platform drivers. I could just disable it for the entirety of the project, but in other parts of the project it is useful. I understand that it could probably be solved with some VS Code setting. However, I am asking for alternative methods of selecting the right file/code for the platform because I am interested in seeing what other strategies there are.
Part of the "problem" is that support for Arduino is the primary focus, which means it can't easily be solved with makefile magic. My question is, what are alternative ways of implementing a solution to this problem, that are still readable?
If it cannot be done without makefile magic, then that is an answer too.
For reference, here is a simplified example of the header file and implementation
platform.h
#ifndef __PLATFORM__
#define __PLATFORM__
#include <stdint.h> // for int8_t
int8_t t_open(void);
#endif //__PLATFORM__
platform_arduino.c
#include "platform.h"
int8_t t_open(void)
{
// Implementation here
}
this (Cross-Platform C++ code and single header - multiple implementations) Stack Overflow post that goes fairly in depth in how to handle this for C++ but I feel like none of those examples really apply to this C implementation.
I don't see why you say that. The first suggestions in the two highest-scoring answers are variations on the idea of using conditional macros, which not only is valid in C, but is a traditional approach. You yourself present an alternative along these lines.
Part of the "problem" is that support for Arduino is the primary focus, which means it can't easily be solved with makefile magic.
I take you to mean that the approach to platform adaptation has to be encoded somehow into the C source, as opposed to being handled via the build system. Frankly, this is an unusual constraint, except inasmuch as it can be addressed by use of the various system-identification macros provided by C compilers of interest.
Even if you don't want to rely specifically on makefiles, you should consider attributing some responsibility to the build system, which you can do even without knowing specifically what build system that is. For example, you can designate macro names, such as for_windows, that request builds for non-default platforms. You then leave it to the person building an instance of the driver to figure out how to configure their tools to provide the appropriate macro definition for their needs (which generally is not hard), based on your build documentation.
My question is, what are alternative ways of implementing a solution to this problem, that are still readable?
If the solution needs to be embodied entirely in the C source, then you have three main alternatives:
write code that just works correctly on all platforms, or
perform runtime detection and adaptation, or
use conditional compilation based on macros automatically defined by supported compilers.
If you're prepared to rely on macro definitions supplied by the user at build time, then the last becomes simply
use conditional compilation
Do not dismiss the first out of hand, but it can be a difficult path, and it might not be fully possible for your particular problem (and probably isn't if you're writing a driver or other code for a freestanding implementation).
Runtime adaptation could be viewed as a specific case of code that just works, but what I have in mind for this is a higher level of organization that performs runtime analysis of the host environment and chooses function variants and internal parameters suited to that, as opposed to those choices being made at compile time. This is a real thing that is occasionally done, but it may or may not be viable for your particular case.
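To make the runtime-adaptation idea concrete, here is a minimal sketch in which each variant supplies a probe function, and the first variant whose probe succeeds is used. Everything in it apart from t_open is hypothetical: the platform_ops type, the backend names, and the probe functions are placeholders that return fixed values instead of inspecting real hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry per variant: a probe that says whether it fits the host,
 * and the implementation to use if it does. */
typedef struct {
    const char *name;
    int (*probe)(void);
    int8_t (*open)(void);
} platform_ops;

/* Placeholder variants; real probes would inspect the environment. */
static int probe_fast_bus(void) { return 0; }  /* pretend: not available */
static int8_t open_fast_bus(void) { return 1; }
static int probe_generic(void) { return 1; }   /* always usable */
static int8_t open_generic(void) { return 0; }

static const platform_ops backends[] = {
    { "fast_bus", probe_fast_bus, open_fast_bus },
    { "generic",  probe_generic,  open_generic  },
};

/* Return the first variant whose probe succeeds, or NULL if none does. */
const platform_ops *select_backend(void)
{
    for (size_t i = 0; i < sizeof backends / sizeof backends[0]; i++) {
        assert(backends[i].probe != NULL);  /* every entry needs a probe */
        if (backends[i].probe())
            return &backends[i];
    }
    return NULL;
}

/* The public function dispatches to whatever was selected at run time. */
int8_t t_open(void)
{
    const platform_ops *ops = select_backend();
    return ops != NULL ? ops->open() : (int8_t)-1;
}
```

The cost of this design is that every variant must at least compile and link on every platform, which is often exactly what a driver for very different targets cannot satisfy.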
On the other hand, conditional compilation is the traditional basis for platform adaptation in C, and the general form does not have the caveat of the other two that it might or might not work in your particular situation. The level of readability and maintainability you achieve this way is a function of the details of how you implement it.
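A sketch of how the two flavors of conditional compilation might be combined: macros supplied by the person building the driver take priority, then macros the toolchain defines on its own, then the Arduino default. The names for_windows and for_rpi are the examples used in this answer; ARDUINO, _WIN32 and __linux__ are macros commonly predefined by the respective toolchains. Treating every Linux host as a Raspberry Pi is an assumption made only to keep the example short, and platform_name and platform_is are hypothetical helpers.

```c
#include <assert.h>
#include <string.h>

/* 1. Explicit requests made at build time win. */
#if defined(for_windows)
#define PLATFORM_NAME "windows"
#elif defined(for_rpi)
#define PLATFORM_NAME "rpi"
/* 2. Otherwise use macros the toolchain defines by itself. */
#elif defined(ARDUINO)
#define PLATFORM_NAME "arduino"
#elif defined(_WIN32)
#define PLATFORM_NAME "windows"
#elif defined(__linux__)
#define PLATFORM_NAME "rpi"      /* assumption: the only Linux target is the Pi */
/* 3. Fall back to the primary platform. */
#else
#define PLATFORM_NAME "arduino"
#endif

/* Name of the platform this translation unit was built for. */
const char *platform_name(void)
{
    return PLATFORM_NAME;
}

/* Return 1 if the build targets the named platform. */
int platform_is(const char *name)
{
    assert(name != NULL);
    return strcmp(name, PLATFORM_NAME) == 0;
}
```

Keeping the whole decision in one #if chain means there is a single place to read when two requirements appear to conflict.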
I have come up with some solutions like just adding the requirements for each platform at the top of the file. [...] But this seems like a clunky solution.
If you must include a source file in your build but you don't want anything in it to actually contribute to the target, then that's exactly what you must do. You complain that "It will probably make debugging conflicting requirements a nightmare", but to the extent that that's a genuine issue, I think it's not so much a question of syntax as of the whole "different code for different platforms" plan.
You also complain that the conditional compilation option might be a practical difficulty for you with your choice of development tools. It certainly seems to me that there ought to be good workarounds for that available from your tools and development workflow. But if you must have a workaround grounded only in the C language, then there is one (albeit a bad one): introduce a level of preprocessing indirection. That is, put the conditional compilation directives in a different source file, like so:
platform.c
#if defined(for_windows)
#include "platform_windows.c"
#elif defined(for_rpi)
#include "platform_rpi.c"
#else
#include "platform_arduino.c"
#endif
You then designate platform.c as a file to be built, but not (directly) any of the specific-platform files.
This solves your tool-presentation issue because when you are working on one of the platform-specific .c files, the editor is unlikely to be able to tell whether it would actually be included in a build or not.
Do note well that it is widely considered bad practice to #include files containing function implementations, or those not ending with an extension conventionally designating a header. I don't say otherwise about the above, but I would say that if the whole platform.c contains nothing else, then that's about the least bad variation that I can think of within the category.
I have two boards, each with the same MCU as target. The difference is that the peripherals are not 100% the same (let's say they overlap by maybe 90%). So far my colleague has two macros and he either comments them out or not, so that #ifdef/#endif can be used to tell the preprocessor which includes to use and which to ignore.
I'm thinking of better ways to do this. I don't like the idea of people having to search for the correct line to comment out each time they want the correct build for their hardware system; this should be automated and/or better documented imho.
Best I came up with are multiple "build-sets" that would then be called "hardware-1" and "hardware-2" or something (of course more descriptive...). These build sets would then each have different "-I"-options to define the two macros my colleague used already before.
For cmake I found this thread:
Define preprocessor macro through CMake?
Is this the way to go or are there better ways that are more elegant? How would you solve this situation? The question maybe also goes into "What are the best practices to tackle this"
Thanks for your input
J
Best I came up with are multiple "build-sets" that would then be called "hardware-1" and "hardware-2" or something (of course more descriptive...). These build sets would then each have different "-I"-options to define the two macros my colleague used already before.
You mean -D, not -I, but yes, defining the macros via the compiler command line is one of the traditional approaches to this. How you might achieve that depends somewhat on your build system, but with a hand-rolled makefile, it is common to define make variables for target-specific flags, and to put those, appropriately commented, at the top of the top-level makefile. Sometimes these are intended to be modified at build time, but sometimes there are just different makefiles, or else which set of flags to use is controlled by the target requested on the make command line.
For cmake I found [...]. Is this the way to go or are there better ways that are more elegant?
If you are using cmake already then yes, cmake's facilities for adding macro definitions to the compiler command line would be a great approach. If you are not using cmake then no, switching to a cmake-based build system would be way overkill for just solving the problem described. For systems where CMake will generate makefiles, it is basically a wrapper for what I already described.
I happen to be a fan of the Autotools. If you have an Autotools-based build system then there are different ways to set up this sort of thing, but if you don't, then setting up autotooling for just this purpose would be overkill. It is perhaps worth mentioning, however, that a standard Autotools approach would work by putting the definitions of the adjustable control macros in a header file, and having all the source files include that header. The Autotools would generate that header programmatically, but that's not essential -- you could set up such a header manually and update it as needed, and that would still solve the problem of knowing where to look for the macro definitions.
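A hand-maintained configuration header along those lines might look like the sketch below. The file name hw_config.h, the macro names HW_BOARD, HW_BOARD_1 and HW_BOARD_2, and the peripheral counts are all made up for illustration; the header and one consumer are shown in a single listing so that it compiles on its own.

```c
/* ---- hw_config.h (hypothetical name): the one place to pick the board ---- */
#ifndef HW_CONFIG_H
#define HW_CONFIG_H

#define HW_BOARD_1 1
#define HW_BOARD_2 2

/* Edit this default, or override it with -DHW_BOARD=2 on the command line. */
#ifndef HW_BOARD
#define HW_BOARD HW_BOARD_1
#endif

/* Fail the build early on a typo instead of silently picking a branch. */
#if HW_BOARD != HW_BOARD_1 && HW_BOARD != HW_BOARD_2
#error "HW_BOARD must be HW_BOARD_1 or HW_BOARD_2"
#endif

#endif /* HW_CONFIG_H */

/* ---- any source file; in a real project: #include "hw_config.h" ---- */

/* Number of UARTs wired up on the selected board (placeholder values). */
int uart_count(void)
{
#if HW_BOARD == HW_BOARD_1
    return 2;
#else
    return 3;
#endif
}
```

Using one macro with distinct values, rather than two independent on/off macros, makes it impossible to select both boards or neither by accident.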
Normally one can specify preprocessor defines as part of the compilation command.
gcc -Wall -Darduino embedded.c
So assuming Linux/Make you could use
make clean arduino
or
make clean atmega2560
and simply have two targets named that in the make file.
Each one would pass -Darduino or -Datmega2560 (note the capital -D) as part of the compile command.
If you are using some sort of IDE like MSVC, on the project properties page, under C/C++ you would find a Preprocessor area, and you can add one or the other as part of the preprocessor defines.
Preprocessor Definitions arduino;_DEBUG;_CONSOLE;%(PreprocessorDefinitions)
I need to do some source-to-source manipulations in Linux kernel. I tried to use clang for this purpose but there is a problem. Clang does preprocessing of the source code, i.e. macro and include expansion. This causes clang to sometimes produce broken C code in terms of Linux kernel. I can't maintain all the changes manually, since I expect to have thousands of changes per single file.
I tried ANTLR, but the public grammars available are incomplete and not suitable for such projects as Linux kernel.
So my question is the following. Are there any ways to perform source-to-source manipulations for a C code without preprocessing it?
So assume following code.
#define AAA 1
void f1(int a){
    if(a == AAA)
        printf("hello");
}
After applying source-to-source manipulation I want to get this
#define AAA 1
void f1(int a){
    if(functionCall(a == AAA))
        printf("hello");
}
But Clang, for instance, produces following code which does not fit my requirements, i.e. it expands macro AAA
#define AAA 1
void f1(int a){
    if(functionCall(a == 1))
        printf("hello");
}
I hope I was clear enough.
Edit
The above code is only an example. The source-to-source manipulations I want to do are not restricted to if() statement substitution, but also include inserting a unary operator in front of an expression, replacing an arithmetic expression with its positive or negative value, etc.
Solution
There is one solution I found for myself. I use gcc to produce preprocessed source code and then apply Clang. Then I don't have any issues with macro expansion and includes, since that job is done by gcc. Thanks for the answers!
You may consider http://coccinelle.lip6.fr/ : it provides a nice semantic patching framework.
An idea would be to replace all occurrences of
if(a == AAA)
with
if(functionCall(a == AAA))
You can do this easily using, e.g., the sed tool.
If you have a finite collection of patterns to be replaced you can write a sed script to perform the substitution.
Would this solve your problem?
Handling the preprocessor is one of the most difficult problems in applying transformations to C (and C++) code.
Our DMS Software Reengineering Toolkit with its C Front End comes relatively close to doing this. DMS can parse C source code, preserving most preprocessor conditionals, macro definitions and uses.
It does so by allowing preprocessor actions in "well-structured" places. Examples: #defines are allowed where declarations or statements can occur; macro calls and conditionals are allowed as replacements for many of the nonterminals in the language (e.g., function head, expression, statement, declarations) and in many non-structured places that people commonly place them (e.g., the three consecutive lines "#if foo", "if (...) {", "#endif"). It parses the source code and preprocessor directives as if they were part of one language (they ARE; it's called "C"), and builds corresponding ASTs, which can be transformed and will regenerate correctly with the captured preprocessor directives. [This level of capability handles OP's example perfectly.]
Some directives are poorly placed (both in the syntax sense, e.g., across multiple fragments of the language, and the "you've got to be kidding" understandability sense). DMS handles these by expanding them away, with some guidance from the engineer ("always expand this macro"). A less satisfactory approach is to hand-convert the unstructured preprocessor conditionals/macro calls into structured ones; this is a bit painful but more workable than one might expect, since the bad cases occur with considerably less frequency than the good ones.
To do better than this, one needs to have symbol tables and flow analysis that take into account the preprocessor conditions, and capture all the preprocessor conditionals. We've done some experimental work with DMS to capture conditional declarations in the symbol table (seems to work fine), and we're just starting work on a scheme for the latter.
Not easy being green.
Clang maintains extremely accurate information about the original source code.
Most notably, the SourceManager is able to tell if a given token has been expanded from a macro or written as is, and Chandler Carruth recently implemented macro diagnostics that are able to display the actual macro expansion stack (at the various stages of expansion), tracing back to the actual written code (3.0).
Therefore, it is possible to use the generated AST and then rewrite the source code with all its macros still in place. You would have to query virtually every node to know whether it comes from a macro expansion or not, and if it does retrieve the original code of the expansion, but still it seems possible.
There is a rewriter module in Clang
You can dig up Chandler's code on the macro diagnosis stack
So I guess you should have all you need :) (And hope so because I won't be able to help much more :p)
I would advise trying the ROSE framework. The source is available on GitHub.
A project I'm working on (in C) has a lot of sections of code that can be included or omitted based on compile-time configuration, using preprocessor directives.
I'm interested in estimating how many lines of code different configurations are adding to, or subtracting from, my core project. In other words, I'd like to write a few #define and #undef lines somewhere, and get a sense of what that does to the LOC count.
I'm not familiar with LOC counters, but from a cursory search, it doesn't seem like most of the easily-available tools do that. I'm assuming this isn't a difficult problem, but just a rather uncommon metric to measure.
Is there an existing tool that would do what I'm looking for, or some easy way to do it myself? Excluding comments and blank lines would be a major nice-to-have, too.
Run it through a preprocessor. For example, with gcc, the -E option gives just the kind of output you seem to want:
-E Stop after the preprocessing stage; do not run the compiler proper.
The output is in the form of preprocessed source code, which is sent
to the standard output.
You could get the preprocessor output from your compiler, but this might have other unwanted side effects, like expanding complex multi-line macros, and adding to the LOC count in ways you didn't expect.
Why not write your own simple pre-processor, and use your own include/exclude directives? You can make them trivially simple to parse, and then pipe your code through this pre-processor before sending it to a full featured LOC counter like CLOC.
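A minimal sketch of that idea follows. The marker syntax (//@if NAME and //@endif, written as comments so the compiler ignores them) and the function name count_loc are inventions for this example, not an existing convention. For brevity the sketch counts lines itself instead of piping the filtered text to a full LOC counter, and it only recognizes // comments and blank lines, not /* ... */ blocks.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the name (len bytes) is in the list of enabled features. */
static int name_enabled(const char *name, size_t len,
                        const char *const *enabled, size_t n_enabled)
{
    for (size_t i = 0; i < n_enabled; i++)
        if (strlen(enabled[i]) == len && strncmp(enabled[i], name, len) == 0)
            return 1;
    return 0;
}

/* Count lines that are not blank, not //-only comments, and not inside a
 * //@if NAME ... //@endif region whose NAME is not enabled. */
size_t count_loc(const char *src, const char *const *enabled, size_t n_enabled)
{
    size_t count = 0;
    int depth = 0;      /* current //@if nesting depth */
    int skip_from = 0;  /* depth at which skipping started, 0 if not skipping */

    assert(src != NULL);
    while (*src != '\0') {
        const char *nl = strchr(src, '\n');
        size_t len = nl ? (size_t)(nl - src) : strlen(src);
        const char *end = src + len;
        const char *p = src;

        while (p < end && (*p == ' ' || *p == '\t' || *p == '\r')) p++;
        size_t rest = (size_t)(end - p);

        if (rest >= 6 && strncmp(p, "//@if ", 6) == 0) {
            const char *name = p + 6;
            size_t name_len = 0;
            while (name + name_len < end && name[name_len] != ' ' &&
                   name[name_len] != '\t' && name[name_len] != '\r')
                name_len++;
            depth++;
            if (skip_from == 0 &&
                !name_enabled(name, name_len, enabled, n_enabled))
                skip_from = depth;
        } else if (rest >= 8 && strncmp(p, "//@endif", 8) == 0) {
            if (skip_from == depth) skip_from = 0;
            if (depth > 0) depth--;
        } else if (skip_from == 0 && rest > 0 &&
                   !(rest >= 2 && p[0] == '/' && p[1] == '/')) {
            count++;
        }
        src = nl ? nl + 1 : end;
    }
    return count;
}
```

Calling count_loc once per configuration and subtracting the results gives the per-feature line cost the question asks about.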
I'm working on a refactoring tool for C with preprocessor support...
I don't know the kind of refactoring involved in large C projects and I would like to know what people actually do when refactoring C code (and preprocessor directives)
I'd like to know also if some features that would be really interesting are not present in any tool and so the refactoring has to be done completely manually... I've seen for instance that Xref could not refactor macros that are used as iterators (don't know exactly what that means though)...
thanks
Anybody interested in this (specific to C), might want to take a look at the coccinelle tool:
Coccinelle is a program matching and transformation engine which provides the language SmPL (Semantic Patch Language) for specifying desired matches and transformations in C code. Coccinelle was initially targeted towards performing collateral evolutions in Linux. Such evolutions comprise the changes that are needed in client code in response to evolutions in library APIs, and may include modifications such as renaming a function, adding a function argument whose value is somehow context-dependent, and reorganizing a data structure. Beyond collateral evolutions, Coccinelle is successfully used (by us and others) for finding and fixing bugs in systems code.
Huge topic!
The stuff I need to clean up is contorted nests of #ifdefs. A refactoring tool would understand when conditional stuff appears in argument lists (function declaration or definitions), and improve that.
If it was really good, it would recognize that
#if defined(SysA) || defined(SysB) || ... || defined(SysJ)
was really equivalent to:
#if !defined(SysK) && !defined(SysL)
If you managed that, I'd be amazed.
It would allow me to specify 'this macro is now defined - which code is visible' (meaning, visible to the compiler); it would also allow me to choose to see the code that is invisible.
It would handle a system spread across over 100 top-level directories, with varying levels of sub-directories under those. It would handle tens of thousands of files, with lengths of 20K lines in places.
It would identify where macro definitions come from makefiles instead of header files (aargh!).
Well, since it is part of the preprocessor... #include refactoring is a huge huge topic and I'm not aware of any tools that do it really well.
Trivial problems a tool could tackle:
Enforcing consistent case and backslash usage in #includes
Enforce a consistent header guarding convention, automatically add redundant external guards, etc.
Harder problems a tool could tackle:
Finding and removing spurious includes.
Suggest the use of predeclarations wherever practical.
For macros... perhaps some sort of scoping would be interesting, where if you #define a macro inside a block, the tool would automatically #undef it at the end of a block. Other quick things I can think of:
A quick analysis on macro safety could be helpful as a lot of people still don't know to use do { } while (0) and other techniques.
Alternately, find and flag spots where expressions with side-effects are passed as macro arguments. This could possibly be really helpful for things like... asserts with unintentional side-effects.
Macros can often get quite complex, so I wouldn't try supporting much more than simple renaming.
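For readers who have not met the two macro pitfalls mentioned above, here is a small illustration. The macro and function names (SWAP, MAX_UNSAFE, max_int and so on) are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* A multi-statement macro wrapped in do { } while (0) expands to exactly one
 * statement. With a bare { ... } body instead, "SWAP(x, y); else" would
 * expand to "{ ... }; else" and fail to compile. */
#define SWAP(a, b) do { int tmp_ = (a); (a) = (b); (b) = tmp_; } while (0)

/* Evaluates the larger argument twice, so side effects happen twice. */
#define MAX_UNSAFE(a, b) ((a) > (b) ? (a) : (b))

/* A function evaluates each argument exactly once. */
static int max_int(int a, int b) { return a > b ? a : b; }

/* Put two values in ascending order, using the macro in an unbraced if/else. */
void sort_pair(int *lo, int *hi)
{
    assert(lo != NULL && hi != NULL);
    if (*lo > *hi)
        SWAP(*lo, *hi);  /* one statement, so the else below still parses */
    else
        (void)0;         /* already ordered */
}

/* i++ is evaluated twice: once in the comparison, once as the result. */
int unsafe_max_demo(int *i_after)
{
    int i = 5;
    int m = MAX_UNSAFE(i++, 3);
    *i_after = i;  /* 7 */
    return m;      /* 6, not 5 */
}

/* The same call through a function increments i only once. */
int safe_max_demo(int *i_after)
{
    int i = 5;
    int m = max_int(i++, 3);
    *i_after = i;  /* 6 */
    return m;      /* 5 */
}
```

A tool that flags call sites like MAX_UNSAFE(i++, 3) only needs to know which macro parameters appear more than once in the expansion, which is a fairly tractable analysis.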
I will tell you honestly that there are no good tools for refactoring C++ like there are for Java. Most of it will be painful search and replace, but this depends on the actual task. Look at Netbeans and Eclipse C++ plugins.
I've seen for instance that Xref could not refactor macros that are used as iterators (don't know exactly what that means though)
To be honest, you might be in over your head - consider if you are the right person for this task.
If you can handle reliable renaming of various types, variables and macros over a big project with an arbitrarily complex directory hierarchy, I want to use your product.
Just discovered this old question, but I wanted to mention that I've rescued the free version of Xrefactory for C, now named c-xrefactory, which manages to do some refactorings in macros such as rename macro, rename macro parameter. It is an Emacs plugin.