Using Sparse to check C code - c

Does anyone have experience with Sparse? I seem unable to find any documentation, so the warnings and errors it produces are unclear to me. I tried checking the mailing list and man page, but there really isn't much in either.
For instance, I use INT_MAX in one of my files. This generates an error (undefined identifier) even though I #include limits.h.
Is there any place where the errors and warnings have been explained?

Sparse isn't intended to be a lint, per se. Sparse is intended to produce a parse tree of arbitrary code so that it can be further analyzed.
In your example, you probably want to define _GNU_SOURCE (which I believe turns on __GNUC__), which exposes the bits you need in limits.h.
I would avoid defining __GNUC__ on its own, as several things it activates might behave in an undefined way without all of the other switches that _GNU_SOURCE turns on being defined.
My point isn't to help you squash error by error; it's to reiterate that sparse is mostly used as a library, not as a standalone static analysis tool.
From my copy of the README (not sure if I have the current version) :
This means that a user of the library will literally just need to do
struct string_list *filelist = NULL;
char *file;
action(sparse_initialize(argc, argv, filelist));
FOR_EACH_PTR_NOTAG(filelist, file) {
action(sparse(file));
} END_FOR_EACH_PTR_NOTAG(file);
and he is now done - having a full C parse of the file he opened. The
library doesn't need any more setup, and once done does not impose any
more requirements. The user is free to do whatever he wants with the
parse tree that got built up, and needs not worry about the library ever
again. There is no extra state, there are no parser callbacks, there is
only the parse tree that is described by the header files. The action
function takes a pointer to a symbol_list and does whatever it likes with it.
The library also contains (as an example user) a few clients that do the
preprocessing, parsing and type evaluation and just print out the
results. These clients were done to verify and debug the library, and
also as trivial examples of what you can do with the parse tree once it
is formed, so that users can see how the tree is organized.
The included clients are more 'functional test suites and examples' than anything. It's a very useful tool, but you might consider another usage angle if you want to employ it. I like it because it doesn't use *lex / bison, which makes it remarkably easier to hack.

If you look at limits.h you'll see that INT_MAX is defined inside this #if
/* If we are not using GNU CC we have to define all the symbols ourself.
Otherwise use gcc's definitions (see below). */
#if !defined __GNUC__ || __GNUC__ < 2
so to get it to work you should undefine __GNUC__ before including limits.h
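If the goal is only to get past the undefined-identifier error while a checker is running, one possible workaround is a guarded fallback definition. The sketch below is an illustration under assumptions, not something taken from the Sparse documentation: largest_int is a hypothetical helper, and the fallback expression assumes an int without padding bits.

```c
#include <limits.h>

/* Fallback only if <limits.h> did not provide INT_MAX (for example, because
 * the tool's predefined macros steered the header down an unexpected path).
 * Assumption: int has no padding bits, so (~0u >> 1) equals INT_MAX. */
#ifndef INT_MAX
#define INT_MAX ((int)(~0u >> 1))
#endif

/* Hypothetical helper so the value can be inspected from a test. */
int largest_int(void)
{
    return INT_MAX;
}
```

With a normal compiler the fallback is never used, because limits.h already defines INT_MAX; it only takes effect when the header leaves the macro undefined.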

Related

Cross-Platform C single header file and multiple implementations

I am working on an open source C driver for a cheap sensor that is used mostly for Arduino projects. The project is set up in such a way that it is possible to support multiple platforms outside the Arduino ecosystem, like the Raspberry Pi.
The project is set up with a platform.h file, with the intention of having different implementations of this header file. Like the example below:
platform.h
platform_arduino.c
platform_rpi.c
platform_windows.c
There is this (Cross-Platform C++ code and single header - multiple implementations) Stack Overflow post that goes fairly in depth in how to handle this for C++ but I feel like none of those examples really apply to this C implementation.
I have come up with some solutions like just adding the requirements for each platform at the top of the file.
#if SOME_REQUIREMENT
#include "platform.h"
int8_t t_open(void)
{
// Implementation here
}
#endif //SOME_REQUIREMENT
But this seems like a clunky solution.
It impacts readability of the code. [1]
It will probably make debugging conflicting requirements a nightmare.
[1] Many editors (like VS Code) try to gray out code which does not match requirements. While I want this most of the time, it is really annoying when working on cross-platform drivers. I could just disable it for the entirety of the project, but in other parts of the project it is useful. I understand that it could probably be solved with some VS Code setting. However, I am asking for alternative methods of selecting the right file/code for the platform because I am interested in seeing what other strategies there are.
Part of the "problem" is that support for Arduino is the primary focus, which means it can't easily be solved with makefile magic. My question is, what are alternative ways of implementing a solution to this problem, that are still readable?
If it cannot be done without makefile magic, then that is an answer too.
For reference, here is a simplified example of the header file and implementation
platform.h
#ifndef __PLATFORM__
#define __PLATFORM__
#include <stdint.h> // for int8_t
int8_t t_open(void);
#endif //__PLATFORM__
platform_arduino.c
#include "platform.h"
int8_t t_open(void)
{
// Implementation here
}
this (Cross-Platform C++ code and single header - multiple implementations) Stack Overflow post that goes fairly in depth in how to handle this for C++ but I feel like none of those examples really apply to this C implementation.
I don't see why you say that. The first suggestions in the two highest-scoring answers are variations on the idea of using conditional macros, which not only is valid in C, but is a traditional approach. You yourself present an alternative along these lines.
Part of the "problem" is that support for Arduino is the primary focus, which means it can't easily be solved with makefile magic.
I take you to mean that the approach to platform adaptation has to be encoded somehow into the C source, as opposed to being handled via the build system. Frankly, this is an unusual constraint, except inasmuch as it can be addressed by use of the various system-identification macros provided by C compilers of interest.
Even if you don't want to rely specifically on makefiles, you should consider attributing some responsibility to the build system, which you can do even without knowing specifically what build system that is. For example, you can designate macro names, such as for_windows, etc that request builds for non-default platforms. You then leave it to the person building an instance of the driver to figure out how to configure their tools to provide the appropriate macro definition for their needs (which generally is not hard), based on your build documentation.
My question is, what are alternative ways of implementing a solution to this problem, that are still readable?
If the solution needs to be embodied entirely in the C source, then you have three main alternatives:
write code that just works correctly on all platforms, or
perform runtime detection and adaptation, or
use conditional compilation based on macros automatically defined by supported compilers.
If you're prepared to rely on macro definitions supplied by the user at build time, then the last becomes simply
use conditional compilation
Do not dismiss the first out of hand, but it can be a difficult path, and it might not be fully possible for your particular problem (and probably isn't if you're writing a driver or other code for a freestanding implementation).
Runtime adaptation could be viewed as a specific case of code that just works, but what I have in mind for this is a higher level of organization that performs runtime analysis of the host environment and chooses function variants and internal parameters suited to that, as opposed to those choices being made at compile time. This is a real thing that is occasionally done, but it may or may not be viable for your particular case.
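As a minimal sketch of what runtime adaptation could look like, the code below picks a table of function pointers from a platform id that is assumed to have been detected at run time. Everything here (platform_ops, select_platform, the id numbering, the return values) is hypothetical and only illustrates the shape of the approach; how detection would really work depends on the host environment.

```c
#include <stddef.h>
#include <stdint.h>

/* Table of operations for one platform; all names here are hypothetical. */
typedef struct {
    const char *name;
    int8_t (*open)(void);
} platform_ops;

static int8_t arduino_open(void) { return 0; }
static int8_t rpi_open(void)     { return 1; }

static const platform_ops known_platforms[] = {
    { "arduino", arduino_open },
    { "rpi",     rpi_open     },
};

/* Pick the operations table for a platform id detected at run time.
 * Unknown ids fall back to entry 0 (the default platform). */
const platform_ops *select_platform(size_t detected_id)
{
    size_t count = sizeof known_platforms / sizeof known_platforms[0];
    return &known_platforms[detected_id < count ? detected_id : 0];
}
```

Callers would then go through the returned table, for example select_platform(id)->open(), instead of calling a platform function directly. Note that every variant must still compile on every target, which is why this may not be viable for a driver.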
On the other hand, conditional compilation is the traditional basis for platform adaptation in C, and the general form does not have the caveat of the other two that it might or might not work in your particular situation. The level of readability and maintainability you achieve this way is a function of the details of how you implement it.
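A small sketch of conditional compilation driven by user-supplied macros follows. It reuses the request macros for_windows and for_rpi suggested above and treats Arduino as the default; PLATFORM_NAME and platform_is are hypothetical names introduced only for this illustration.

```c
#include <string.h>

/* The person building the driver requests a non-default platform by
 * defining for_windows or for_rpi (e.g. with -Dfor_rpi); with neither
 * defined, the Arduino default is selected. */
#if defined(for_windows)
#  define PLATFORM_NAME "windows"
#elif defined(for_rpi)
#  define PLATFORM_NAME "rpi"
#else
#  define PLATFORM_NAME "arduino"
#endif

/* Returns nonzero if the build was configured for the named platform. */
int platform_is(const char *name)
{
    return strcmp(PLATFORM_NAME, name) == 0;
}
```

Keeping the selection logic in one place like this means the rest of the code only has to test a single, well-named result rather than repeating the platform checks.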
I have come up with some solutions like just adding the requirements for each platform at the top of the file. [...] But this seems like a clunky solution.
If you must include a source file in your build but you don't want anything in it to actually contribute to the target then that's exactly what you must do. You complain that "It will probably make debugging conflicting requirements a nightmare", but to the extent that that's a genuine issue, I think it's not so much a question of syntax as of the whole different code for different platforms plan.
You also complain that the conditional compilation option might be a practical difficulty for you with your choice of development tools. It certainly seems to me that there ought to be good workarounds for that available from your tools and development workflow. But if you must have a workaround grounded only in the C language, then there is one (albeit a bad one): introduce a level of preprocessing indirection. That is, put the conditional compilation directives in a different source file, like so:
platform.c
#if defined(for_windows)
#include "platform_windows.c"
#elif defined(for_rpi)
#include "platform_rpi.c"
#else
#include "platform_arduino.c"
#endif
You then designate platform.c as a file to be built, but not (directly) any of the specific-platform files.
This solves your tool-presentation issue because when you are working on one of the platform-specific .c files, the editor is unlikely to be able to tell whether it would actually be included in a build or not.
Do note well that it is widely considered bad practice to #include files containing function implementations, or those not ending with an extension conventionally designating a header. I don't say otherwise about the above, but I would say that if the whole platform.c contains nothing else, then that's about the least bad variation that I can think of within the category.

Gathering test symbols into an array statically in C/C++

Short version of question
Is it possible to gather specific symbols in C into a single list/array in the executable statically at compile time, without relying on crt initialization? (I frequently support embedded targets and have limited support for dynamic memory.)
EDIT: I'm 100% ok with this happening at link time and also ok with not having symbols cross library boundaries.
EDIT 2: I'm also OK with compiler specific answers if it's gcc or clang but would prefer cross platform if possible.
Longer version with more background
This has been a pain in my side for a while.
Right now I have a number of built-in self tests that I like to run in order.
I enforce the same calling convention on all functions and am manually gathering all the tests into an array statically.
// ThisLibrary_testlist.h
#define DECLARE_TEST(TESTNAME) void TESTNAME##_test(void * test_args)
DECLARE_TEST(test1);
DECLARE_TEST(test2);
DECLARE_TEST(test3);
// ThisLibrary_some_module.c
#include "ThisLibrary_testlist.h"
DECLARE_TEST(test1)
{
// ... do good stuff here
}
// ThisLibrary_testarray.c
#include "ThisLibrary_testlist.h"
typedef void (*testfunc_t) (void*);
#define LIST_TEST(TESTNAME) TESTNAME##_test
testfunc_t tests[] =
{
&LIST_TEST(test1),
&LIST_TEST(test2)
};
// now it's an array... you know what to do.
So far this has kept me alive but it's getting kind of ridiculous that I have to basically modify the code in 3 separate locations if I want to update a test.
Not to mention the absolute #ifdef nightmare that comes with conditionally compiled tests.
Is there a better way?
With a bit of scripting magic you could do the following: After compiling your source files (but before linking) you search the object files for symbols that match your test name pattern. See man nm how to obtain symbol names from object files (well, on Unix, that is - no idea about windows, sorry). Based on the list of object names found, you auto-create the file ThisLibrary_testarray.c, putting in all the extern declarations and then the function pointer table. After generation of this file, you compile it and finally link everything.
This way you only have to add new test functions to the source files. No need to maintain the header file ThisLibrary_testlist.h, but you have to make sure the test functions have external linkage, follow the naming pattern - and be sure no other symbol uses the naming pattern :-)
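For illustration, the generated ThisLibrary_testarray.c might look roughly like the sketch below. To keep the sketch self-contained, the two test functions are defined inline; in a real generated file they would only be extern declarations collected from the nm output. The names tests_run, test_count and run_all_tests are hypothetical.

```c
#include <stddef.h>

typedef void (*testfunc_t)(void *);

int tests_run; /* hypothetical counter, only here to make the sketch observable */

/* In the generated file these would be "extern void test1_test(void *);"
 * lines; they are defined here so the sketch compiles on its own. */
void test1_test(void *test_args) { (void)test_args; tests_run++; }
void test2_test(void *test_args) { (void)test_args; tests_run++; }

/* The function pointer table the script would emit, one entry per symbol. */
testfunc_t tests[] = {
    test1_test,
    test2_test,
};
const size_t test_count = sizeof tests / sizeof tests[0];

/* Runs every gathered test in table order and returns how many were run. */
size_t run_all_tests(void *test_args)
{
    size_t i;
    for (i = 0; i < test_count; i++)
        tests[i](test_args);
    return test_count;
}
```

Because the table is a plain initialized array, it needs no dynamic memory and no startup-time registration.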

Annotate functions with macros

I'm looking into Flutter's embedder API because I might be calling it via Rust in an upcoming project, and found this:
FLUTTER_EXPORT
FlutterResult FlutterEngineShutdown(FlutterEngine engine);
The FLUTTER_EXPORT part is (from what I can see) a macro defined as:
#ifndef FLUTTER_EXPORT
#define FLUTTER_EXPORT
#endif // FLUTTER_EXPORT
I'm by no means a C guru but I have done my fair share of C programming but have never had the opportunity to use something like this. What is it? I've tried to Google it but don't really know what to call it other than "annotation" which doesn't really feel like a perfect fit.
Flutter embedder.h
Also, as pointed out, this macro expands to nothing as defined here; for example, writing it a couple of times in a row changes nothing.
Another nice use, helpful when testing code (from here):
You can pass -DFLUTTER_EXPORT=SOMETHING when compiling with gcc. This is useful for running your code for testing purposes without touching the codebase. It can also be used to provide different expansions depending on the parameter passed at compile time.
Also, a significant part of my answer is about visibility using the empty macro. gcc also provides a way to achieve the same thing, as described here, using __attribute__ ((visibility ("default"))) together with -fvisibility=hidden (as IharobAlAsimi/nemequ mentioned) for the macro FLUTTER_EXPORT. The name also gives us an idea that it might be necessary to add the attribute __declspec(dllimport) (which means it will be imported from a DLL). An example of gcc's visibility support over there will be helpful.
It can also be useful for associating some kind of debug operation, like this (by this I mean that empty macros can be used this way too, though the name suggests that this was not the intended use):
#ifdef FLUTTER_EXPORT
#define ...
#else
#define ...
#endif
Here the #define will specify some printing or logging macro, and if FLUTTER_EXPORT is not defined it will be replaced with a blank statement such as do{}while(0).
This is going to be a bit rambling, but I wanted to talk about some stuff that came up in the comments in addition to your initial question.
On Windows, under certain situations the compiler needs to be explicitly told that certain symbols are to be publicly exposed by a DLL. In Microsoft's compiler (hereafter referred to as MSVC), this is done by adding a __declspec(dllexport) annotation to the function, so you would end up with something like
__declspec(dllexport)
FlutterResult FlutterEngineShutdown(FlutterEngine engine);
Alas, the declspec syntax is non-standard. While GCC does actually support it (IIRC it's ignored on non-Windows platforms), other conformant compilers may not, so it should only be emitted for compilers which support it. The path that the Flutter devs have taken is one easy way to do this; if FLUTTER_EXPORT isn't defined elsewhere, they simply define it to nothing, so that on compilers where __declspec(dllexport) is unnecessary the prototype you've posted would become
FlutterResult FlutterEngineShutdown(FlutterEngine engine);
But on MSVC you'll get the declspec.
Once you have the default value (nothing) set, you can start thinking about how to define special cases. There are a few ways to do this, but the most popular solutions would be to include a platform-specific header which defines macros to the correct value for that platform, or to use the build system to pass a definition to the compiler (e.g., -DFLUTTER_EXPORT="__declspec(dllexport)"). I prefer to keep logic in the code instead of the build system when possible to make it easier to reuse the code with different build systems, so I'll assume that's the method for the rest of the answer, but you should be able to see the parallels if you choose the build-system route.
As you can imagine, the fact that there is a default definition (which, in this case, is empty) makes maintenance easier; instead of having to define every macro in each platform-specific header, you only have to define it in the headers where a non-default value is required. Furthermore, if you add a new macro you needn't add it to every header immediately.
I think that's pretty much the end of the answer to your initial question.
Now, if we're not building Flutter, but instead using the header in a library which links to the Flutter DLL, __declspec(dllexport) isn't right. We're not exporting the FlutterEngineShutdown function in our code, we're importing it from a DLL. So, if we want to use the same header (which we do, otherwise we introduce the possibility of headers getting out of sync), we actually want to map FLUTTER_EXPORT to __declspec(dllimport). AFAIK this isn't usually necessary even on Windows, but there are situations where it is.
The solution here is to define a macro when we're building Flutter, but never define it in the public headers. Again, I like to use a separate header, something like
#define FLUTTER_COMPILING
#include "public-header.h"
I'd also throw some include guards in, and a check to make sure the public API wasn't accidentally included first, but I'm too lazy to type it here.
Then you can define FLUTTER_EXPORT with something like
#if defined(FLUTTER_COMPILING)
#define FLUTTER_EXPORT __declspec(dllexport)
#else
#define FLUTTER_EXPORT __declspec(dllimport)
#endif
You may also want to add a third case, where neither is defined, for situations where you're building the Flutter SDK into an executable instead of building Flutter as a shared library then linking to it from your executable. I'm not sure if Flutter supports that or not, but for now let's just focus on the Flutter SDK as a shared library.
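A sketch of that three-way split could look like the following. It deliberately uses hypothetical DEMO_ names (DEMO_API, DEMO_STATIC, DEMO_COMPILING, demo_shutdown) rather than Flutter's real macros, since how Flutter actually configures this is not shown here.

```c
/* Normally set by the library's private header or build, never by users. */
#define DEMO_COMPILING

#if defined(DEMO_STATIC) || !defined(_WIN32)
#  define DEMO_API                        /* static build, or no DLLs involved */
#elif defined(DEMO_COMPILING)
#  define DEMO_API __declspec(dllexport)  /* building the DLL itself */
#else
#  define DEMO_API __declspec(dllimport)  /* consuming the DLL */
#endif

/* What the public header would declare. */
DEMO_API int demo_shutdown(int engine);

/* What the library's own source file would define. */
int demo_shutdown(int engine)
{
    return engine >= 0 ? 0 : -1;
}
```

A consumer linking the library statically would define DEMO_STATIC before including the header, so that neither declspec is emitted.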
Next, let's look at a related issue: visibility. Most compilers which aren't MSVC masquerade as GCC; they define __GNUC__, __GNUC_MINOR__, and __GNUC_PATCHLEVEL__ (and other macros) to the appropriate values and, more importantly, if they're pretending to be GCC ≥ 4.2 they support the visibility attribute.
Visibility isn't quite the same as dllexport/dllimport. Instead, it's more like telling the compiler whether the symbol is internal ("hidden") or publicly visible ("default"). This is a bit like the static keyword, but while static restricts a symbol's visibility to the current compilation unit (i.e., source file), hidden symbols can be used in other compilation units but they're not exposed by the linker.
Hiding symbols which needn't be public can be a huge performance win, especially for C++ (which tends to expose a lot more symbols than people think). Obviously having a smaller symbol table makes the linking a lot faster, but perhaps more significant is that the compiler can perform a lot of optimizations which it can't or doesn't for public symbols. Sometimes this is as simple as inlining a function which otherwise wouldn't be, but another huge performance gain can come from the compiler being able to assume data from a caller is aligned, which in turn allows vectorization without unnecessary shuffling. Another might allow the compiler to assume that a pointer isn't aliased, or maybe a function is never called and can be pruned. Basically, there are lots of optimizations the compiler can only do if it knows it can see all the invocations of a function, so if you care about runtime efficiency you should never expose more than is necessary.
This is also a good chance to note that FLUTTER_EXPORT isn't a very good name. Something like FLUTTER_API or FLUTTER_PUBLIC would be better, so let's use that from now on.
By default, symbols are publicly visible, but you can change this by passing -fvisibility=hidden to the compiler. Whether or not you do this will determine whether you need to annotate public functions with __attribute__((visibility("default"))) or private functions with __attribute__((visibility("hidden"))), but I would suggest passing the argument so that if you forget to annotate something you'll end up with an error when you try to use it from a different module instead of silently exposing it publicly.
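As a rough sketch of how the annotations might be wrapped, assuming a GCC-compatible compiler and a build that passes -fvisibility=hidden: the DEMO_PUBLIC and DEMO_PRIVATE macros and both functions are hypothetical names made up for this example.

```c
/* Expand to the GCC visibility attributes where they are available and to
 * nothing elsewhere, so the same declarations compile everywhere. */
#if defined(__GNUC__) && __GNUC__ >= 4
#  define DEMO_PUBLIC  __attribute__((visibility("default")))
#  define DEMO_PRIVATE __attribute__((visibility("hidden")))
#else
#  define DEMO_PUBLIC
#  define DEMO_PRIVATE
#endif

/* Internal helper: callable from other source files of the same library,
 * but not exported from the resulting shared object. */
DEMO_PRIVATE int demo_clamp(int value, int lo, int hi)
{
    if (value < lo) return lo;
    if (value > hi) return hi;
    return value;
}

/* Part of the public API: stays exported even under -fvisibility=hidden. */
DEMO_PUBLIC int demo_percent(int value)
{
    return demo_clamp(value, 0, 100);
}
```

With -fvisibility=hidden on the command line, the DEMO_PRIVATE annotation becomes redundant and only the public functions need marking.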
Two other things also came up in the comments: debugging macros and function annotations. Because of where FLUTTER_EXPORT is used, we know it's not a debugging macro, but we can talk about them for a second anyways. The idea is that you can insert extra code into your compiled software depending on whether or not a macro is defined. The idea is something like this:
#if defined(DISABLE_LOGGING)
# define my_log_func(msg)
#else
# define my_log_func(msg) my_log_func_ex(msg)
#endif
This is actually a pretty bad idea; think about code like this:
if (foo)
my_log_func("Foo is true!");
bar();
With the above definitions, if you compile with DISABLE_LOGGING defined, you'll end up with
if (foo)
bar();
Which probably isn't what you wanted (unless you're entering an obfuscated C competition, or trying to insert a backdoor).
Instead, what you usually want is (as coderredoc mentioned) basically a no-op statement:
#if defined(DISABLE_LOGGING)
# define my_log_func(msg) do{}while(0)
#else
# define my_log_func(msg) my_log_func_ex(msg)
#endif
You may end up with a compiler error in some weird situations, but compile-time errors are vastly preferable to difficult-to-find bugs like you can end up with for the first version.
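To see the difference in a form you can compile, here is a small sketch with logging disabled. The bar_calls counter, run_once and the body of my_log_func_ex are hypothetical additions for the demonstration; the point is that bar() runs exactly once whether or not foo is true.

```c
#include <stdio.h>

#define DISABLE_LOGGING /* remove this line to enable logging */

/* Hypothetical logging backend, only used when logging is enabled. */
void my_log_func_ex(const char *msg)
{
    fputs(msg, stderr);
    fputc('\n', stderr);
}

#if defined(DISABLE_LOGGING)
#  define my_log_func(msg) do { } while (0)
#else
#  define my_log_func(msg) my_log_func_ex(msg)
#endif

static int bar_calls;
static void bar(void) { bar_calls++; }

/* Returns how many times bar() ran; the no-op statement keeps the "if"
 * attached to the log call instead of swallowing bar(). */
int run_once(int foo)
{
    bar_calls = 0;
    if (foo)
        my_log_func("Foo is true!");
    bar();
    return bar_calls;
}
```

Had the disabled macro expanded to nothing, and had the call site omitted its semicolon, the if would have captured bar() instead.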
Annotations for static analysis is another situation which was mentioned in the comments, and it's something I'm a huge fan of. For example, let's say we have a function in our public API which takes a printf-style format string. We can add an annotation:
__attribute__((format(printf, 2, 3)))
void print_warning(Context* ctx, const char* fmt, ...);
Now if you try to pass an int to a %f, or forget an argument, the compiler can emit a diagnostic at compile time just like with printf itself. I'm not going to go into each and every one of these, but taking advantage of them is a great way to get the compiler to catch bugs before they make it into production code.
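A fuller sketch of such an annotated function is shown below. The DEMO_PRINTF wrapper macro, the layout of Context and the int return value are assumptions made for this example (the prototype above returns void); the attribute itself is GCC's format attribute with the printf archetype.

```c
#include <stdarg.h>
#include <stdio.h>

/* Only GCC-compatible compilers understand the attribute; others get nothing. */
#if defined(__GNUC__)
#  define DEMO_PRINTF(fmt_idx, first_arg) \
       __attribute__((format(printf, fmt_idx, first_arg)))
#else
#  define DEMO_PRINTF(fmt_idx, first_arg)
#endif

/* Hypothetical context that simply remembers the last warning text. */
typedef struct {
    char last[128];
} Context;

/* Argument 2 is the format string, variadic arguments start at 3. */
DEMO_PRINTF(2, 3)
int print_warning(Context *ctx, const char *fmt, ...);

int print_warning(Context *ctx, const char *fmt, ...)
{
    va_list ap;
    int written;

    va_start(ap, fmt);
    written = vsnprintf(ctx->last, sizeof ctx->last, fmt, ap);
    va_end(ap);
    return written; /* length the full message needs, as vsnprintf reports it */
}
```

With the attribute in place, a call such as print_warning(&ctx, "%f", 42) should draw a format warning from GCC or clang when format checking is enabled (for example via -Wall).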
Now for some self-promotion. Pretty much all this stuff is highly platform-dependent; whether a feature is available and how to use it properly can depend on the compiler, compiler version, the OS, and more. If you like to keep your code portable you end up with a lot of preprocessor cruft to get it all right. To deal with this, I put together a project called Hedley a while back. It's a single header you can drop into your source tree which should make it a lot easier to take advantage of this type of functionality without making people's eyes bleed when they look at your headers.

Is commenting out a #include a safe way to see if it's unneeded?

I like to keep my files clean, so I prefer to take out includes I don't need. Lately I've been just commenting the includes out and seeing if it compiles without warnings (-Wall -Wextra -pedantic, minus a couple very specific ones). I figure if it compiles without warnings I didn't need it.
Is this actually a safe way to check if an include is needed or can it introduce UB or other problems? Are there any specific warnings I need to be sure are enabled to catch potential problems?
n.b. I'm actually using Objective C and clang, so anything specific to those is appreciated, but given the flexibility of Objective C I think if there's any trouble it will be a general C thing. Certainly any problems in C will affect Objective C.
In principle, yes.
The exception would be if two headers interact in some hidden way. Say, if you:
include two different headers which define the same symbol differently,
both definitions are syntactically valid and well-typed,
but one definition is good, the other breaks your program at run-time.
Hopefully, your header files are not structured like that. It's somewhat unlikely, though not inconceivable.
I'd be more comfortable doing this if I had good (unit) tests.
Usually just commenting out the inclusion of the header is safe, meaning: if the header is needed then there will be compiler errors when you remove it, and (usually) if the header is not needed, the code will still compile fine.
This should not be done without inspecting the header to see what it adds though, as there is the (not exactly typical) possibility that a header only provides optional #define's (or #undef's) which will alter, but not break, the way a program is compiled.
The only way to be sure is to build your code without the header (if it's able to build in the first place) and run a proper regimen of testing to ensure its behavior has not changed.
No. Apart from the reasons already mentioned in other answers, it's possible that the header is needed and another header includes it indirectly. If you remove the #include, you won't see an error but there may be errors on other platforms.
In general, no. It is easy to introduce silent changes.
Suppose header.h defines some macros like
#define WITH_FEATURE_FOO
The C file including header.h tests the macro
#ifdef WITH_FEATURE_FOO
do_this();
#else
do_that();
#endif
Your files compile cleanly and with all warnings enabled with or without the inclusion of header.h, but the result behaves differently. The only way to get a definitive answer is to analyze which identifiers a header defines/declares and see if at least one of them appears in the preprocessed C file.
One tool that does this is FlexeLint from Gimpel. I don't get paid for saying this, even though they should :-) If you want to avoid shelling out big bucks, an approach I have been taking is compiling a C file to an object file with and without the header; if both succeed, check for identical object files. If they are the same, you don't need the header
(but watch out for include directives wrapped in #ifdefs that are enabled by a -DWITH_FEATURE_FOO option).
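If a source file is only correct when the macro is present, one way to turn the silent change into a loud one is an explicit check. The sketch below assumes exactly that (a mandatory feature macro); the #define stands in for the #include of header.h, and active_path is a hypothetical function that replaces the do_this()/do_that() calls with return values.

```c
/* Stand-in for: #include "header.h" (delete this line to simulate removal). */
#define WITH_FEATURE_FOO

/* Assumption: this file is only correct with the feature enabled, so fail
 * the build loudly instead of silently taking the other branch. */
#ifndef WITH_FEATURE_FOO
#error "WITH_FEATURE_FOO is required here; is header.h still included?"
#endif

/* Reports which branch the preprocessor selected. */
int active_path(void)
{
#ifdef WITH_FEATURE_FOO
    return 1; /* the do_this() path */
#else
    return 2; /* the do_that() path */
#endif
}
```

Without the #error check, deleting the first line still compiles cleanly and active_path() quietly switches branches, which is the failure mode described above.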

What is this IN part of the parameter to a function?

I am trying to do some work on Windows drivers but I am having trouble understanding one part of the example source code. I have never seen this before in my C experience and I couldn't find anything on it. Anyway, I was wondering what the "IN" part of the parameter declarations is. Below is an example of the header of a function. It is also possible for it to be a few other things like "OUT", "INOUT", "INOPT", and maybe more (couldn't find anything else).
VOID
PLxReadRequestComplete(
IN WDFDMATRANSACTION DmaTransaction,
IN NTSTATUS Status
)
Those are simply markers (from the early days of the Windows DDK) that describe the intended use of the parameter.
In normal builds the macros are defined as nothing, however they could conceivably be defined to implementation-specific keywords that allow the compiler (using SAL or other static code analysis tools) to perform deeper analysis about the correct use of the argument/parameter. I don't think that they're used for SAL because they simply aren't 'rich' enough to describe all the attributes that SAL likes to take into account. So I think they're mainly intended to communicate intent to programmers.
That's not standard C. Most likely, IN has been defined to have some other value using a #define -- i.e., a macro. Search your *.h files for #define IN, #define OUT, etc, and see if you can find out what.
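A minimal sketch of how such markers can work is shown below. The empty definitions mirror the "defined as nothing" case described above; demo_divide is a hypothetical function, and the #ifndef guards are only there so the sketch does not clash with headers that already define IN and OUT.

```c
/* Marker macros that expand to nothing: they document intent for the
 * reader and vanish before the compiler proper sees the code. */
#ifndef IN
#define IN
#endif
#ifndef OUT
#define OUT
#endif

/* Hypothetical example: num and den are inputs, *quot receives the result.
 * Returns 0 on success and -1 if den is zero. */
int demo_divide(IN int num, IN int den, OUT int *quot)
{
    if (den == 0)
        return -1;
    *quot = num / den;
    return 0;
}
```

After preprocessing, the signature is simply int demo_divide(int num, int den, int *quot), which is why the markers have no effect on the generated code.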
