Stubbing functions in simulations - c

I'm working on an embedded C project that depends on some external HW. I wish to stub out the code accessing these parts, so I can simulate the system without using any HW. Until now I have used some macros, but this forces me to make small changes to my production code, which I would like to avoid.
Example:
stub.h
#ifdef _STUB_HW
#define STUB_HW(name) Stub_##name
#else /*_STUB_HW*/
#define STUB_HW(name) name
#endif /*_STUB_HW*/
my_hw.c
WORD STUB_HW(clear_RX_TX)()
{ /* clear my rx/tx buffer on target HW */ }
test_my_hw.c
#ifdef _STUB_HW
WORD clear_RX_TX()
{ /* simulate clear rx/tx buffer on target HW */ }
#endif /*_STUB_HW*/
With this code I can turn the stubbing on and off with the preprocessor symbol _STUB_HW.
Is there a way to accomplish this without having to change my prod code and without a lot of ifdefs? I would also rather not mix prod and test code in the same file if I can avoid it. I don't care how the test code looks as long as I can keep as much as possible out of the production code.
Edit:
It would be nice if it were possible to select/rename functions without replacing the whole file. For example, take all functions starting with nRF_## and give them a new name, and then substitute test_nRF_## for nRF_##, if that is possible.

I just make two files, ActualDriver.c and StubDriver.c, containing exactly the same function names. By making two builds that link the production code against the different objects, there are no naming conflicts. This way the production code contains no testing or conditional code.
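A minimal sketch of what the stub side of this arrangement might look like. The WORD typedef, the simulated buffers, and the two sim_ helper functions are hypothetical and not taken from the original project; ActualDriver.c would define clear_RX_TX() with the same signature against the real hardware, and only one of the two files is linked into any given build.

```c
/* StubDriver.c: linked into the simulation build instead of ActualDriver.c.
 * Everything below is a hypothetical illustration. */
#include <stdint.h>
#include <string.h>

/* Normally these declarations would live in the shared driver header. */
typedef uint16_t WORD;
WORD clear_RX_TX(void);

/* Simulated RX/TX buffers standing in for the hardware buffers. */
static uint8_t sim_rx[32];
static uint8_t sim_tx[32];

/* Test helper: put recognizable data into the simulated buffers. */
void sim_fill_buffers(uint8_t value)
{
    memset(sim_rx, value, sizeof sim_rx);
    memset(sim_tx, value, sizeof sim_tx);
}

/* Test helper: count the non-zero bytes left in the simulated buffers. */
unsigned sim_dirty_bytes(void)
{
    unsigned n = 0;
    for (size_t i = 0; i < sizeof sim_rx; i++) n += (sim_rx[i] != 0);
    for (size_t i = 0; i < sizeof sim_tx; i++) n += (sim_tx[i] != 0);
    return n;
}

/* Same name and signature as the production function. */
WORD clear_RX_TX(void)
{
    memset(sim_rx, 0, sizeof sim_rx);
    memset(sim_tx, 0, sizeof sim_tx);
    return 0; /* assumed "success" code */
}
```

The production code calls clear_RX_TX() exactly as before; which behaviour it gets is decided purely by which object file the build links in.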

As Gerhard said, use a common header file "driver.h" and separate hardware layer implementation files containing the actual and stubbed functions.
In Eclipse, I have two targets, and I "exclude from build" the driver.c file that is not to be used and make sure the proper one is included in the build. Eclipse then generates the makefile at build time.
Another issue to point out is to ensure you are defining fixed size integers so your code behaves the same from an overflow perspective. (Although from your code sample I can see you are doing that.)

I agree with the above. The standard solution to this is to define an opaque abstracted set of function calls that are the "driver" to the hw, and then call that in the main program. Then provide two different driver implementations, one for hw, one for sw. The sw variant will simulate the IO effect of the hw in some appropriate way.
Note that if the goal is at a lower level, i.e., writing code where each hardware access is to be simulated rather than entire functions, it might be a bit trickier. But here, different "write_to_memory" and "read_from_memory" functions (or macros, if speed on target is essential) could be defined.
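As a rough sketch of the host-side variant of such access functions: the register-file size, the offset-to-slot mapping, and the exact signatures below are assumptions made for illustration. The target build would supply the same two functions (or macros) performing volatile accesses to the real addresses.

```c
/* hw_access_sim.c: host-side implementation of a hypothetical
 * register-access layer, used only in the simulation build. */
#include <stdint.h>

#define SIM_REG_COUNT 64u   /* assumed size of the simulated register file */

static uint32_t sim_regs[SIM_REG_COUNT];

/* Map a byte offset to a slot in the simulated register file. */
static uint32_t *sim_slot(uint32_t offset)
{
    return &sim_regs[(offset / 4u) % SIM_REG_COUNT];
}

/* On target this might instead be:
 *     *(volatile uint32_t *)(BASE + offset) = value;
 * where BASE is the (hypothetical) peripheral base address. */
void write_to_memory(uint32_t offset, uint32_t value)
{
    *sim_slot(offset) = value;
}

uint32_t read_from_memory(uint32_t offset)
{
    return *sim_slot(offset);
}
```

A more faithful simulation could add side effects in these two functions, for example clearing a status bit when a particular register is read.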
There is no need in either case to change the names of functions, just have two different batch files, make files, or IDE build targets (depending on what tools you are using).
Finally, in many cases a better technical solution is to go for a full-blown target system simulator, such as QEMU, Simics, SystemC, CoWare, VaST, or similar. This lets you run the same code all the time, and instead you build a model of the hardware that works like the actual hardware from the perspective of the software. It does take a much larger up-front investment, but for many projects it is well worth the effort. It basically gets rid of the nasty issue of having different builds for target and host, and makes sure you always use your cross-compiler with deployment build options. Note that many embedded compiler suites come with some basic simulation ability of this kind built in.

Related

Cross-Platform C single header file and multiple implementations

I am working on an open source C driver for a cheap sensor that is used mostly for Arduino projects. The project is set up in such a way that it is possible to support multiple platforms outside the Arduino ecosystem, like the Raspberry Pi.
The project is set up with a platform.h file, with the intention of having different implementations of this header file. Like the example below:
platform.h
platform_arduino.c
platform_rpi.c
platform_windows.c
There is this (Cross-Platform C++ code and single header - multiple implementations) Stack Overflow post that goes fairly in depth in how to handle this for C++ but I feel like none of those examples really apply to this C implementation.
I have come up with some solutions like just adding the requirements for each platform at the top of the file.
#if SOME_REQUIREMENT
#include "platform.h"
int8_t t_open(void)
{
// Implementation here
}
#endif //SOME_REQUIREMENT
But this seems like a clunky solution.
It impacts readability of the code. [1]
It will probably make debugging conflicting requirements a nightmare.
[1] Many editors (like VS Code) try to gray out code that does not match the requirements. While I want this most of the time, it is really annoying when working on cross-platform drivers. I could just disable it for the entire project, but in other parts of the project it is useful. I understand that it could probably be solved with some VS Code setting. However, I am asking for alternative methods of selecting the right file/code for the platform because I am interested in seeing what other strategies there are.
Part of the "problem" is that support for Arduino is the primary focus, which means it can't easily be solved with makefile magic. My question is, what are alternative ways of implementing a solution to this problem, that are still readable?
If it cannot be done without makefile magic, then that is an answer too.
For reference, here is a simplified example of the header file and implementation
platform.h
#ifndef PLATFORM_H
#define PLATFORM_H
#include <stdint.h> /* for int8_t */
int8_t t_open(void);
#endif /* PLATFORM_H */
platform_arduino.c
#include "platform.h"
int8_t t_open(void)
{
// Implementation here
}
this (Cross-Platform C++ code and single header - multiple implementations) Stack Overflow post that goes fairly in depth in how to handle this for C++ but I feel like none of those examples really apply to this C implementation.
I don't see why you say that. The first suggestions in the two highest-scoring answers are variations on the idea of using conditional macros, which not only is valid in C, but is a traditional approach. You yourself present an alternative along these lines.
Part of the "problem" is that support for Arduino is the primary focus, which means it can't easily be solved with makefile magic.
I take you to mean that the approach to platform adaptation has to be encoded somehow into the C source, as opposed to being handled via the build system. Frankly, this is an unusual constraint, except inasmuch as it can be addressed by use of the various system-identification macros provided by C compilers of interest.
Even if you don't want to rely specifically on makefiles, you should consider attributing some responsibility to the build system, which you can do even without knowing specifically what build system that is. For example, you can designate macro names, such as for_windows, that request builds for non-default platforms. You then leave it to the person building an instance of the driver to figure out how to configure their tools to provide the appropriate macro definition for their needs (which generally is not hard), based on your build documentation.
My question is, what are alternative ways of implementing a solution to this problem, that are still readable?
If the solution needs to be embodied entirely in the C source, then you have three main alternatives:
write code that just works correctly on all platforms, or
perform runtime detection and adaptation, or
use conditional compilation based on macros automatically defined by supported compilers.
If you're prepared to rely on macro definitions supplied by the user at build time, then the last becomes simply
use conditional compilation
Do not dismiss the first out of hand, but it can be a difficult path, and it might not be fully possible for your particular problem (and probably isn't if you're writing a driver or other code for a freestanding implementation).
Runtime adaptation could be viewed as a specific case of code that just works, but what I have in mind for this is a higher level of organization that performs runtime analysis of the host environment and chooses function variants and internal parameters suited to that, as opposed to those choices being made at compile time. This is a real thing that is occasionally done, but it may or may not be viable for your particular case.
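To make the runtime-adaptation idea concrete, here is a minimal sketch using a table of function pointers. The struct, the two open_ variants, and platform_select() are hypothetical names; a real probe would inspect the host environment rather than take a flag.

```c
#include <stdint.h>

/* Operations the driver needs from a platform. */
struct platform_ops {
    const char *name;
    int8_t (*open)(void);
};

/* Two illustrative variants of the same operation. */
static int8_t open_default(void)  { return 0; }
static int8_t open_fallback(void) { return 1; }

static const struct platform_ops ops_table[] = {
    { "default",  open_default  },
    { "fallback", open_fallback },
};

/* Variant currently in use; chosen at run time, not at compile time. */
static const struct platform_ops *active_ops = &ops_table[0];

/* Hypothetical selection step standing in for real host detection. */
void platform_select(int use_fallback)
{
    active_ops = &ops_table[use_fallback ? 1 : 0];
}

/* The public entry point simply dispatches through the table. */
int8_t t_open(void)
{
    return active_ops->open();
}
```

The cost of this design is that every variant must compile and link on every platform, which is often exactly what driver code cannot guarantee.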
On the other hand, conditional compilation is the traditional basis for platform adaptation in C, and the general form does not have the caveat of the other two that it might or might not work in your particular situation. The level of readability and maintainability you achieve this way is a function of the details of how you implement it.
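A small sketch of conditional compilation driven by macros that toolchains define on their own. ARDUINO is defined by the Arduino build environment, _WIN32 by Windows compilers, and __linux__ by GCC and Clang on Linux; treating any Linux host as the Raspberry Pi case is a simplification made here, and platform_name() is a hypothetical helper.

```c
/* Select a platform from compiler-provided macros, with a catch-all. */
#if defined(ARDUINO)
#define PLATFORM_NAME "arduino"
#elif defined(_WIN32)
#define PLATFORM_NAME "windows"
#elif defined(__linux__)
#define PLATFORM_NAME "rpi"      /* assumption: Linux stands in for the Pi */
#else
#define PLATFORM_NAME "unknown"
#endif

/* Report which branch the preprocessor selected. */
const char *platform_name(void)
{
    return PLATFORM_NAME;
}
```

Keeping the detection in one place like this, and letting the rest of the code test a single project-level macro, tends to be more readable than repeating compiler-specific checks in every file.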
I have come up with some solutions like just adding the requirements for each platform at the top of the file. [...] But this seems like a clunky solution.
If you must include a source file in your build but you don't want anything in it to actually contribute to the target then that's exactly what you must do. You complain that "It will probably make debugging conflicting requirements a nightmare", but to the extent that that's a genuine issue, I think it's not so much a question of syntax as of the whole different-code-for-different-platforms plan.
You also complain that the conditional compilation option might be a practical difficulty for you with your choice of development tools. It certainly seems to me that there ought to be good workarounds for that available from your tools and development workflow. But if you must have a workaround grounded only in the C language, then there is one (albeit a bad one): introduce a level of preprocessing indirection. That is, put the conditional compilation directives in a different source file, like so:
platform.c
#if defined(for_windows)
#include "platform_windows.c"
#elif defined(for_rpi)
#include "platform_rpi.c"
#else
#include "platform_arduino.c"
#endif
You then designate platform.c as a file to be built, but not (directly) any of the specific-platform files.
This solves your tool-presentation issue because when you are working on one of the platform-specific .c files, the editor is unlikely to be able to tell whether it would actually be included in a build or not.
Do note well that it is widely considered bad practice to #include files containing function implementations, or those not ending with an extension conventionally designating a header. I don't say otherwise about the above, but I would say that if the whole platform.c contains nothing else, then that's about the least bad variation that I can think of within the category.

Test embedded code by replacing static symbols at compile time

Background
I'm building a C application for an embedded Cortex M4 TI-RTOS SYS/BIOS target, however this question should apply to all embedded targets where a single binary is loaded onto some microprocessor.
What I want
I'd like to do some in situ regression tests on the target where I just replace a single function with some test function instead. E.g. a GetAdcMeasurement() function would return predefined values from a read-only array instead of doing the actual measurement and returning that value.
This could of course be done with a mess of #ifndefs, but I'd rather keep the production code as untouched as possible.
My attempt
I figure one way to achieve this would be to have duplicate symbol definitions at the linker stage, and then have the linker prioritise the definitions from the test suite (given some #define).
I've looked into using LD_PRELOAD, but that doesn't really seem to apply here (since I'm using only static objects).
Details
I'm using TI Code Composer, with TI-RTOS & SYS/BIOS on the Sitara AM57xx platform, compiling for the M4 remote processor (denoted IPU1).
Here's the path to the compiler and linker
/opt/ti/ccsv7/tools/compiler/ti-cgt-arm_16.9.6.LTS/bin/armcl
One solution could be to have multiple .c files for each module, one with the production code and one with the test code, and compile and link with one of the two. The globals and function signatures in both .c files must match; the test file may define additional symbols, but not fewer.
Another solution, building on the previous one, is to have two libraries, one with the production code and one with the test code, and link with one of the two. You could even link with both libraries, with the test version first, as linkers often resolve symbols in the order they are encountered.
And, as you said, you could work with a bunch of #ifdefs, which would have the advantage of having just one .c file, but would make the code less readable.
I would not go for #ifdefs at the function level, i.e. defining just one function of a .c file for test and keeping the others as is; however, if necessary, it could be a way. And if necessary, you could have one .c file (or rather two) for each function, but that would negate the module concept.
I think the first approach would be the cleanest.
One additional approach (apart from Paul Ogilvie's) would be to have your mocking header also create a define which will replace the original function symbol at the pre-processing stage.
I.e. if your mocking header looks like this:
// mock.h
#ifdef MOCKING_ENABLED
adcdata_t GetAdcMeasurement_mocked(void);
stuff_t GetSomeStuff_mocked(void);
#define GetAdcMeasurement GetAdcMeasurement_mocked
#define GetSomeStuff GetSomeStuff_mocked
#endif
Then whenever you include the file, the preprocessor will replace the calls before it even hits the compiler:
#include "mock.h"
void SomeOtherFunc(void)
{
// preprocessor will change this symbol into 'GetAdcMeasurement_mocked'
adcdata_t data = GetAdcMeasurement();
}
The approach might confuse an unsuspecting reader of your code, because they won't necessarily realize that you are calling a different function altogether. Nevertheless, I find this approach to have the least impact on the production code (apart from adding the include, obviously).
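For completeness, a sketch of what the mocked function itself might look like, returning predefined values from a read-only array as the question describes. The adcdata_t typedef, the sample values, and the wrap-around behaviour are assumptions for illustration.

```c
/* mock_adc.c: would be compiled only when MOCKING_ENABLED is defined. */
#include <stddef.h>

typedef int adcdata_t;   /* assumed; the real type comes from the project */

/* Predefined measurements replayed in order. */
static const adcdata_t canned_samples[] = { 100, 250, 4095, 0 };
static size_t next_sample;

adcdata_t GetAdcMeasurement_mocked(void)
{
    const size_t count = sizeof canned_samples / sizeof canned_samples[0];
    adcdata_t value = canned_samples[next_sample];

    /* Wrap around so the mock never reads past the end of the array. */
    next_sample = (next_sample + 1) % count;
    return value;
}
```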
(This is a quick summary of the discussion in the comments; thanks for the answers.)
A function can be redefined if it has the weak attribute, see
https://en.wikipedia.org/wiki/Weak_symbol
On GCC that would be the weak attribute, e.g.
int __attribute__((weak)) power2(int x);
and with armcl (as in my question) that would be the pragma directive
#pragma weak power2
int power2(int x);
Letting the production code consist of partly weak functions will allow a test framework to replace single functions.
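A minimal sketch of how this could look with the GCC/Clang attribute syntax. The caller area_of_square() and the override shown in the comment are hypothetical; with only this file linked, the weak definition is the one that runs.

```c
/* production.c (GCC/Clang syntax): default implementation, marked weak. */
__attribute__((weak)) int power2(int x)
{
    return x * x;
}

/* Ordinary production code calling the replaceable function. */
int area_of_square(int side)
{
    return power2(side);
}

/* A test build could additionally link a file such as test_power2.c
 * containing a normal (strong) definition, which the linker then
 * prefers over the weak one:
 *
 *     int power2(int x) { (void)x; return 42; }
 */
```

Only the functions that tests need to replace have to be marked weak; everything else in the production code stays untouched.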

Why not concatenate C source files before compilation? [duplicate]

This question already has answers here:
#include all .cpp files into a single compilation unit?
(6 answers)
The benefits / disadvantages of unity builds? [duplicate]
(3 answers)
Closed 6 years ago.
I come from a scripting background and the preprocessor in C has always seemed ugly to me. Nonetheless, I have embraced it as I learn to write small C programs. I am only really using the preprocessor for including the standard libraries and header files I have written for my own functions.
My question is why don't C programmers just skip all the includes and simply concatenate their C source files and then compile it? If you put all of your includes in one place you would only have to define what you need once, rather than in all your source files.
Here's an example of what I'm describing. Here I have three files:
// includes.c
#include <stdio.h>
// main.c
int main() {
foo();
printf("world\n");
return 0;
}
// foo.c
void foo() {
printf("Hello ");
}
By doing something like cat *.c > to_compile.c && gcc -o myprogram to_compile.c in my Makefile I can reduce the amount of code I write.
This means that I don't have to write a header file for each function I create (because they're already in the main source file) and it also means I don't have to include the standard libraries in each file I create. This seems like a great idea to me!
However I realise that C is a very mature programming language and I'm imagining that someone else a lot smarter than me has already had this idea and decided not to use it. Why not?
Some software is built that way.
A typical example is SQLite. It is sometimes compiled as an amalgamation (done at build time from many source files).
But that approach has pros and cons.
Obviously, the compile time will increase by quite a lot. So it is practical only if you compile that stuff rarely.
Perhaps, the compiler might optimize a bit more. But with link time optimizations (e.g. if using a recent GCC, compile and link with gcc -flto -O2) you can get the same effect (of course, at the expense of increased build time).
I don't have to write a header file for each function
That is the wrong approach (having one header file per function). For a single-person project (of less than a hundred thousand lines of code, i.e. 100 KLOC, where KLOC = kilo lines of code), it is quite reasonable, at least for small projects, to have a single common header file (which you could pre-compile if using GCC), which will contain declarations of all public functions and types, and perhaps definitions of static inline functions (those small enough and called frequently enough to profit from inlining). For example, the sash shell is organized that way (and so is the lout formatter, with 52 KLOC).
You might also have a few header files, and perhaps have some single "grouping" header which #include-s all of them (and which you could pre-compile). See for example jansson (which actually has a single public header file) and GTK (which has lots of internal headers, but most applications using it have just one #include <gtk/gtk.h> which in turn includes all the internal headers). On the opposite side, POSIX has a large number of header files, and it documents which ones should be included and in which order.
Some people prefer to have a lot of header files (and some even favor putting a single function declaration in its own header). I don't (for personal projects, or small projects on which only two or three persons would commit code), but it is a matter of taste. BTW, when a project grows a lot, it happens quite often that the set of header files (and of translation units) changes significantly. Look also into REDIS (it has 139 .h header files and 214 .c files, i.e. translation units, totaling 126 KLOC).
Having one or several translation units is also a matter of taste (and of convenience and habits and conventions). My preference is to have source files (that is translation units) which are not too small, typically several thousand lines each, and often have (for a small project of less than 60 KLOC) a common single header file. Don't forget to use some build automation tool like GNU make (often with a parallel build through make -j; then you'll have several compilation processes running concurrently). The advantage of having such a source file organization is that compilation is reasonably quick. BTW, in some cases a metaprogramming approach is worthwhile: some of your (internal header, or translation units) C "source" files could be generated by something else (e.g. some script in AWK, some specialized C program like bison or your own thing).
Remember that C was designed in the 1970s, for computers much smaller and slower than your favorite laptop today (typically, memory was at that time a megabyte at most, or even a few hundred kilobytes, and the computer was at least a thousand times slower than your mobile phone today).
I strongly suggest studying the source code of some existing free software projects, and building them (e.g. those on GitHub or SourceForge or in your favorite Linux distribution). You'll learn that there are different approaches. Remember that in C, conventions and habits matter a lot in practice, so there are different ways to organize your project in .c and .h files. Read about the C preprocessor.
It also means I don't have to include the standard libraries in each file I create
You include header files, not libraries (but you should link libraries). But you could include them in each .c file (and many projects do that), or you could include them in one single header and pre-compile that header, or you could have a dozen headers and include them after system headers in each compilation unit. YMMV. Notice that preprocessing time is quick on today's computers (at least, when you ask the compiler to optimize, since optimizations take more time than parsing & preprocessing).
Notice that what goes into some #include-d file is conventional (and is not defined by the C specification). Some programs have some of their code in some such file (which should then not be called a "header", just some "included file"; and which then should not have a .h suffix, but something else like .inc). Look for example into XPM files. At the other extreme, you might in principle not have any of your own header files (you still need header files from the implementation, like <stdio.h> or <dlfcn.h> from your POSIX system) and copy and paste duplicated code in your .c files -e.g. have the line int foo(void); in every .c file, but that is very bad practice and is frowned upon. However, some programs are generating C files sharing some common content.
BTW, neither C nor C++14 has modules (like OCaml has). In other words, in C a module is mostly a convention.
(Notice that having many thousands of very small .h and .c files of only a few dozen lines each may slow down your build time dramatically; having hundreds of files of a few hundred lines each is more reasonable, in terms of build time.)
If you begin to work on a single-person project in C, I would suggest first having one header file (and pre-compiling it) and several .c translation units. In practice, you'll change .c files much more often than .h ones. Once you have more than 10 KLOC you might refactor that into several header files. Such a refactoring is tricky to design, but easy to do (just a lot of copying and pasting of chunks of code). Other people would have different suggestions and hints (and that is ok!). But don't forget to enable all warnings and debug information when compiling (so compile with gcc -Wall -g, perhaps setting CFLAGS= -Wall -g in your Makefile). Use the gdb debugger (and valgrind...). Ask for optimizations (-O2) when you benchmark an already-debugged program. Also use a version control system like Git.
On the contrary, if you are designing a larger project on which several persons would work, it could be better to have several files -even several header files- (intuitively, each file has a single person mainly responsible for it, with others making minor contributions to that file).
In a comment, you add:
I'm talking about writing my code in lots of different files but using a Makefile to concatenate them
I don't see why that would be useful (except in very weird cases). It is much better (and very usual and common practice) to compile each translation unit (e.g. each .c file) into its object file (a .o ELF file on Linux) and link them later. This is easy with make (in practice, when you change only one .c file, e.g. to fix a bug, only that file gets compiled and the incremental build is really quick), and you can ask it to compile object files in parallel using make -j (and then your build goes really fast on your multi-core processor).
You could do that, but we like to separate C programs into separate translation units, chiefly because:
It speeds up builds. You only need to rebuild the files that have changed, and those can be linked with other compiled files to form the final program.
The C standard library consists of pre-compiled components. Would you really want to have to recompile all that?
It's easier to collaborate with other programmers if the code base is split up into different files.
Your approach of concatenating .c files is completely broken:
Even though the command cat *.c > to_compile.c will put all functions into a single file, order matters: You must have each function declared before its first use.
That is, you have dependencies between your .c files which force a certain order. If your concatenation command fails to honor this order, you won't be able to compile the result.
Also, if you have two functions that recursively use each other, there is absolutely no way around writing a forward declaration for at least one of the two. You may as well put those forward declarations into a header file where people expect to find them.
When you concatenate everything into a single file, you force a full rebuild whenever a single line in your project changes.
With the classic .c/.h split compilation approach, a change in the implementation of a function necessitates recompilation of exactly one file, while a change in a header necessitates recompilation of the files that actually include this header. This can easily speed up the rebuild after a small change by a factor of 100 or more (depending on the count of .c files).
You lose all ability for parallel compilation when you concatenate everything into a single file.
Have a big fat 12 core processor with hyper-threading enabled? Pity, your concatenated source file is compiled by a single thread. You just lost a speedup of a factor greater than 20... Ok, this is an extreme example, but I have built software with make -j16 already, and I tell you, it can make a huge difference.
Compilation times are generally not linear.
Usually compilers contain at least some algorithms that have a quadratic runtime behavior. Consequently, there is usually some threshold from which on aggregated compilation is actually slower than compilation of the independent parts.
Obviously, the precise location of this threshold depends on the compiler and the optimization flags you pass to it, but I have seen a compiler take over half an hour on a single huge source file. You don't want to have such an obstacle in your change-compile-test loop.
Make no mistake: Even though it comes with all these problems, there are people who use .c file concatenation in practice, and some C++ programmers get pretty much to the same point by moving everything into templates (so that the implementation is found in the .hpp file and there is no associated .cpp file), letting the preprocessor do the concatenation. I fail to see how they can ignore these problems, but they do.
Also note that many of these problems only become apparent with larger project sizes. If your project is less than 5000 lines of code, it's still relatively irrelevant how you compile it. But when you have more than 50000 lines of code, you definitely want a build system that supports incremental and parallel builds. Otherwise, you are wasting your working time.
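The forward-declaration point made earlier in this answer can be illustrated with two mutually recursive functions. Whichever order they are concatenated in, one of them is called before it is defined, so at least one declaration has to come first. The function names are just an example.

```c
#include <stdbool.h>

/* Forward declaration: is_even() calls is_odd() before its definition. */
bool is_odd(unsigned n);

bool is_even(unsigned n)
{
    return n == 0 ? true : is_odd(n - 1);
}

bool is_odd(unsigned n)
{
    return n == 0 ? false : is_even(n - 1);
}
```

Swapping the two definitions only moves the problem: then is_even() would need the forward declaration instead.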
With modularity, you can share your library without sharing the code.
For large projects, if you change a single file, you would end up compiling the complete project.
You may run out of memory more easily when you attempt to compile large projects.
You may have circular dependencies in modules, modularity helps in maintaining those.
There may be some gains in your approach, but for languages like C, compiling each module makes more sense.
Because splitting things up is good program design. Good program design is all about modularity, autonomous code modules, and code re-usability. As it turns out, common sense will get you very far when doing program design: Things that don't belong together shouldn't be placed together.
Placing non-related code in different translation units means that you can localize the scope of variables and functions as much as possible.
Merging things together creates tight coupling, meaning awkward dependencies between code files that really shouldn't even have to know about each other's existence. This is why a "global.h" which contains all the includes in a project is a bad thing, because it creates a tight coupling between every non-related file in your whole project.
Suppose you are writing firmware to control a car. One module in the program controls the car FM radio. Then you re-use the radio code in another project, to control the FM radio in a smart phone. And then your radio code won't compile because it can't find brakes, wheels, gears, etc. Things that don't make the slightest sense for the FM radio, let alone the smart phone, to know about.
What's even worse is that if you have tight coupling, bugs escalate throughout the whole program, instead of staying local to the module where the bug is located. This makes the bug consequences far more severe. You write a bug in your FM radio code and then suddenly the brakes of the car stop working. Even though you haven't touched the brake code with your update that contained the bug.
If a bug in one module breaks completely non-related things, it is almost certainly because of poor program design. And a certain way to achieve poor program design is to merge everything in your project together into one big blob.
Header files should define interfaces - that's a desirable convention to follow. They aren't meant to declare everything that's in a corresponding .c file, or a group of .c files. Instead, they declare all functionality in the .c file(s) that is available to their users. A well designed .h file comprises a basic document of the interface exposed by the code in the .c file even if there isn't a single comment in it. One way to approach the design of a C module is to write the header file first, and then implement it in one or more .c files.
Corollary: functions and data structures internal to the implementation of a .c file don't normally belong in the header file. You might need forward declarations, but those should be local and all variables and functions thus declared and defined should be static: if they are not a part of the interface, the linker shouldn't see them.
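A small sketch of that split, using a hypothetical counter module. The two declarations at the top are what the header would contain; everything marked static stays private to the .c file and never reaches the linker.

```c
/* counter.h would contain only these two declarations (the interface). */
void counter_reset(void);
int  counter_next(void);

/* counter.c: internal state and helpers are static. */
static int current;        /* internal state, not part of the interface */
static int clamp(int v);   /* local forward declaration of a helper */

void counter_reset(void)
{
    current = 0;
}

/* Advance the counter, saturating at the limit enforced by clamp(). */
int counter_next(void)
{
    current = clamp(current + 1);
    return current;
}

/* Helper used only inside this file. */
static int clamp(int v)
{
    return v > 3 ? 3 : v;
}
```

Another translation unit can call counter_reset() and counter_next(), but any attempt to use clamp() or current from outside fails at link time, which is the enforcement the header-as-interface convention relies on.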
While you can still write your program in a modular way and build it as a single translation unit, you will miss all the mechanisms C provides to enforce that modularity. With multiple translation units you have fine control on your modules' interfaces by using e.g. extern and static keywords.
By merging your code into a single translation unit, you will miss any modularity issues you might have because the compiler won't warn you about them. In a big project this will eventually result in unintended dependencies spreading around. In the end, you will have trouble changing any module without creating global side-effects in other modules.
The main reason is compilation time. Compiling one small file when you change it may take a short amount of time. If, however, you compiled the whole project whenever you changed a single line, then you would compile, for example, 10,000 files each time, which could take a lot longer.
If you have, as in the example above, 10,000 source files and compiling one takes 10 ms, then rebuilding the project after changing a single file takes either (10 ms + linking time) if you compile just the changed file, or (10 ms * 10,000 = 100 s, plus a short linking time) if you compile everything as a single concatenated blob.
If you put all of your includes in one place you would only have to define what you need once, rather than in all your source files.
That's the purpose of .h files, so you can define what you need once and include it everywhere. Some projects even have an everything.h header that includes every individual .h file. So, your pro can be achieved with separate .c files as well.
This means that I don't have to write a header file for each function I create [...]
You're not supposed to write one header file for every function anyway. You're supposed to have one header file for a set of related functions. So your con is not valid either.
This means that I don't have to write a header file for each function I create (because they're already in the main source file) and it also means I don't have to include the standard libraries in each file I create. This seems like a great idea to me!
The pros you noticed are actually the reason why this is sometimes done on a smaller scale.
For large programs, it's impractical. As other good answers mentioned, this can increase build times substantially.
However, it can be used to break up a translation unit into smaller bits, which share access to functions in a way reminiscent of Java's package accessibility.
The way the above is achieved involves some discipline and help from the preprocessor.
For example, you can break your translation unit into two files:
// a.c
static void utility(void) {
}
static void a_func(void) {
    utility();
}
// b.c
static void b_func(void) {
    utility();
}
Now you add a file for your translation unit:
// ab.c
static void utility();
#include "a.c"
#include "b.c"
And your build system doesn't build either a.c or b.c, but instead builds only ab.o out of ab.c.
What does ab.c accomplish?
It includes both files to generate a single translation unit and provides a prototype for utility, so that the code in both a.c and b.c can see it regardless of the order in which they are included, and without requiring the function to be extern.
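To see why the prototype makes the include order irrelevant, here is roughly what the compiler sees if ab.c includes b.c before a.c. This is a sketch rather than real preprocessor output; the call counter and the run_both wrapper are additions for illustration and are not part of the original files.

```c
/* ab.c after preprocessing, with b.c included first (illustrative) */

static void utility(void);      /* prototype from ab.c */

static int utility_calls = 0;   /* added only to make the effect observable */

/* --- contents of b.c --- */
static void b_func(void)
{
    utility();                  /* fine: declared above, defined below */
}

/* --- contents of a.c --- */
static void utility(void)
{
    utility_calls++;
}

static void a_func(void)
{
    utility();
}

/* Hypothetical non-static entry point so the unit exposes something. */
int run_both(void)
{
    a_func();
    b_func();
    return utility_calls;
}
```

Without the prototype, the call in b_func would appear before any declaration of utility, and a modern compiler would reject or at least warn about it.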

How do you include standard CUDA libraries to link with NVRTC code?

Specifically, my issue is that I have CUDA code that needs <curand_kernel.h> to run. This isn't included by default in NVRTC. Presumably then when creating the program context (i.e. the call to nvrtcCreateProgram), I have to send in the name of the file (curand_kernel.h) and also the source code of curand_kernel.h? I feel like I shouldn't have to do that.
It's hard to tell; I haven't managed to find an example from NVIDIA of someone needing standard CUDA files like this as a source, so I really don't understand what the syntax is. Some issues: curand_kernel.h also has includes... Do I have to do the same for each of these? I am not even sure the NVRTC compiler will even run correctly on curand_kernel.h, because there are some language features it doesn't support, aren't there?
Next: if I've sent in the source code of a header file to nvrtcCreateProgram, do I still have to #include it in the code to be executed, or will it cause an error if I do so?
A link to example code that does this or something like it would be appreciated much more than a straightforward answer; I really haven't managed to find any.
You have to send the "filename" and the source of each header separately.
When the preprocessor does its thing, it'll use any #include filenames as a key to find the source for the header, based on the collection that you provide.
I suspect that, in this case, the compiler (driver) doesn't have file system access, so you have to give it the source in much the same way that you would for shader includes in OpenGL.
So:
Pass each header's name and source when calling nvrtcCreateProgram. The compiler will, internally, generate the equivalent of a std::map<string,string> containing the source of each header indexed by the given name.
In your kernel source, use #include "foo.cuh" as usual.
The compiler will use foo.cuh as an index or key into its internal map (created when you called nvrtcCreateProgram), and will retrieve the header source from that collection.
Compilation proceeds as normal.
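The lookup described in these steps can be pictured with a toy model in plain C. It is not NVRTC's implementation and calls no NVRTC function. It only mirrors the numHeaders, headers and includeNames parameters of nvrtcCreateProgram, which are two parallel arrays holding header sources and the names they are registered under. The helper name find_header_source is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Toy model only: resolve an #include name against parallel arrays shaped
   like the headers / includeNames arguments of nvrtcCreateProgram.
   Returns the matching header source, or NULL if the name is unknown. */
static const char *find_header_source(int numHeaders,
                                      const char *const *headers,
                                      const char *const *includeNames,
                                      const char *name)
{
    for (int i = 0; i < numHeaders; ++i) {
        if (strcmp(includeNames[i], name) == 0) {
            return headers[i];
        }
    }
    return NULL;
}
```

In this picture, #include "foo.cuh" in the kernel source succeeds only if "foo.cuh" was one of the registered names; the #include line is still needed, because registering a header only makes it findable.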
One of the reasons that nvrtc provides only a "subset" of features is that the compiler plays in a somewhat sandboxed environment, without necessarily having all of the supporting tools and utilities lying around that you have with offline compilation. So, you have to manually handle a lot of the stuff that the normal nvcc + (gcc | MSVC | clang) combination provides.
A possible, but non-ideal, solution would be to preprocess the file that you need in your IDE, save the result, and then #include that. However, I bet there is a better way to do that. If you just want curand, consider diving into the library and extracting the part you need (blech) or using another GPU-friendly rand implementation. On older CUDA versions, I just generated a big array of random floats on the host, uploaded it to the GPU, and sampled it in the kernels.
This related link may be helpful.
You do not need to load curand_kernel.h yourself and add it to the include "aliases" mechanism.
Instead, you can simply add the CUDA include directory to your (set of) include paths, e.g. by adding --include-path=/usr/local/cuda/include to your NVRTC compiler options.
(I do this in my GPU-kernel-runner test harness, by default, to be on the safe side.)

Compile-time test if function is optimized out

I'm writing a small operating system for microcontrollers in C (not C++, so I can't use templates). It makes heavy use of some gcc features, one of the most important being the removal of unused code. The OS doesn't load anything at runtime; the user's program and the OS source are compiled together to form a single binary.
This design allows gcc to include only the OS functions that the program actually uses. So if the program never uses i2c or USB, support for those won't be included in the binary.
The problem is when I want to include optional support for those features without introducing a dependency. For example, a debug console should provide functions to debug i2c if it's being used, but including the debug console shouldn't also pull in i2c if the program isn't using it.
The methods that come to mind to achieve this aren't ideal:
Have the user explicitly enable the modules they need (using #define), and use #if to only include support for them in the debug console if enabled. I don't like this method, because currently the user doesn't have to do this, and I'd prefer to keep it that way.
Have the modules register function pointers with the debug module at startup. This isn't ideal, because it adds some runtime overhead and means the debug code is split up over several files.
Do the same as above, but using weak symbols instead of pointers. But I'm still not sure how to actually accomplish this.
Do a compile-time test in the debug code, like:
if(i2cInit is used) {
debugShowi2cStatus();
}
The last method seems ideal, but is it possible?
This seems like an interesting problem. Here's an idea, although it's not perfect:
Two-pass compile.
First, compile the program with a flag like FINDING_DEPENDENCIES=1. Surround all the dependency checks with #ifs for this (I'm assuming you're not as concerned about adding extra #ifs there).
Then, when the compile is done (without any optional features), use nm or similar to detect the usage of functions/features in the program (such as i2cInit), and format this information into a .h file.
#ifndef FINDING_DEPENDENCIES
#include "dependency_info.h"
#endif
Now the optional dependencies are known.
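A minimal sketch of how the debug code might consume that generated header. The macro names PROGRAM_USES_I2C and PROGRAM_USES_USB and the function debugShowStatus are hypothetical; the real names would be whatever the nm-based script emits. To keep the sketch self-contained, a #define stands in for the contents of dependency_info.h.

```c
#include <stdio.h>

/* In the real build this block would be:
 *     #ifndef FINDING_DEPENDENCIES
 *     #include "dependency_info.h"
 *     #endif
 * The define below stands in for the generated header's contents. */
#ifndef FINDING_DEPENDENCIES
#define PROGRAM_USES_I2C 1      /* hypothetical: emitted because nm listed i2cInit */
#endif

static void debugShowi2cStatus(void)
{
    puts("i2c: in use");
}

/* Returns how many optional sections were printed. */
int debugShowStatus(void)
{
    int sections = 0;
#ifdef PROGRAM_USES_I2C
    debugShowi2cStatus();
    sections++;
#endif
#ifdef PROGRAM_USES_USB         /* not defined: the first pass found no USB symbols */
    /* a USB status function would be called here */
    sections++;
#endif
    return sections;
}
```

During the first pass (built with FINDING_DEPENDENCIES defined) neither macro exists, so the debug console references no optional module and cannot pull one into the binary.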
This still doesn't seem like a perfect solution, but ultimately, it's mostly a chicken-and-the-egg problem. When compiling, the compiler doesn't know what symbols are going to be gc'd out. You basically need to get this information from the linker stage and feed it back to the compilation stage.
Theoretically, this might not increase build times much, especially if you used a temp file for the generated .h file and only replaced the real one when its contents differed. You'd need to use different object dirs, though.
Also this might help (pre-strip, of course):
How can I view function names and parameters contained in an ELF file?

Resources