How do you include standard CUDA libraries to link with NVRTC code?

Specifically, my issue is that I have CUDA code that needs <curand_kernel.h> to run. This isn't included by default in NVRTC. Presumably then when creating the program context (i.e. the call to nvrtcCreateProgram), I have to send in the name of the file (curand_kernel.h) and also the source code of curand_kernel.h? I feel like I shouldn't have to do that.
It's hard to tell; I haven't managed to find an example from NVIDIA of anyone needing standard CUDA files like this as a source, so I really don't understand what the syntax is. Some issues: curand_kernel.h also has includes... Do I have to do the same for each of those? I am not even sure the NVRTC compiler will handle curand_kernel.h correctly, since there are some language features it doesn't support, aren't there?
Next: if I've sent in the source code of a header file to nvrtcCreateProgram, do I still have to #include it in the code to be executed, and will it cause an error if I do?
A link to example code that does this or something like it would be appreciated much more than a straightforward answer; I really haven't managed to find any.

You have to send the "filename" and the source of each header separately.
When the preprocessor does its thing, it'll use any #include filenames as keys to find the source for each header in the collection that you provide.
I suspect that, in this case, the compiler (driver) doesn't have file system access, so you have to give it the source in much the same way that you would for shader includes in OpenGL.
So:
Include your header's name when calling nvrtcCreateProgram. The compiler will, internally, generate the equivalent of a std::map<string,string> containing the source of each header indexed by the given name.
In your kernel source, use #include "foo.cuh" as usual.
The compiler will use foo.cuh as a key into its internal map (created when you called nvrtcCreateProgram) and will retrieve the header source from that collection.
Compilation proceeds as normal.
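For illustration, here is a minimal sketch of that call; the kernel source, the header name foo.cuh, and its contents are placeholders rather than real CUDA headers:

#include <nvrtc.h>

/* kernel source that includes a header by name */
static const char *kernel_src =
    "#include \"foo.cuh\"\n"
    "extern \"C\" __global__ void kern() { foo(); }\n";

/* the header's source, embedded or loaded by the host program */
static const char *foo_cuh_src = "__device__ void foo() {}\n";

nvrtcProgram create_program(void)
{
    const char * const header_sources[] = { foo_cuh_src };  /* header contents          */
    const char * const header_names[]   = { "foo.cuh" };    /* keys matched by #include */

    nvrtcProgram prog;
    nvrtcCreateProgram(&prog,
                       kernel_src,      /* program source      */
                       "prog.cu",       /* program "filename"  */
                       1,               /* number of headers   */
                       header_sources,
                       header_names);
    return prog;
}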
One of the reasons that nvrtc provides only a "subset" of features is that the compiler plays in a somewhat sandboxed environment, without necessarily having all of the supporting tools and utilities lying around that you have with offline compilation. So, you have to manually handle a lot of the stuff that the normal nvcc + (gcc | MSVC | clang) combination provides.
A possible, but non-ideal, solution would be to preprocess the file that you need in your IDE, save the result, and then #include that. However, I bet there is a better way to do it. If you just want curand, consider diving into the library and extracting the part you need (blech) or using another GPU-friendly rand implementation. On older CUDA versions, I just generated a big array of random floats on the host, uploaded it to the GPU, and sampled it in the kernels.
This related link may be helpful.

You do not need to load curand_kernel.h yourself and add it to the include "aliases" mechanism.
Instead, you can simply add the CUDA include directory to your (set of) include paths, e.g. by adding --include-path=/usr/local/cuda/include to your NVRTC compiler options.
(I do this in my GPU-kernel-runner test harness, by default, to be on the safe side.)
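A minimal sketch of passing that option (assuming prog was created with nvrtcCreateProgram, and adjusting the path to wherever CUDA is installed on your system):

#include <nvrtc.h>

/* assumes `prog` was created with nvrtcCreateProgram as described above */
nvrtcResult compile_with_cuda_headers(nvrtcProgram prog)
{
    const char * const opts[] = {
        "--include-path=/usr/local/cuda/include"   /* adjust to your CUDA install */
    };
    return nvrtcCompileProgram(prog, 1, opts);
}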

Related

Cross-Platform C single header file and multiple implementations

I am working on an open source C driver for a cheap sensor that is used mostly for Arduino projects. The project is set up in such a way that it is possible to support multiple platforms outside the Arduino ecosystem, like the Raspberry Pi.
The project is set up with a platform.h file, with the intention of having different implementations of this header file. Like the example below:
platform.h
platform_arduino.c
platform_rpi.c
platform_windows.c
There is this Stack Overflow post (Cross-Platform C++ code and single header - multiple implementations) that goes fairly in depth into how to handle this for C++, but I feel like none of those examples really apply to this C implementation.
I have come up with some solutions like just adding the requirements for each platform at the top of the file.
#if SOME_REQUIREMENT
#include "platform.h"
int8_t t_open(void)
{
    // Implementation here
}
#endif //SOME_REQUIREMENT
But this seems like a clunky solution.
It impacts readability of the code. [1]
It will probably make debugging conflicting requirements a nightmare.
[1] Many editors (like VS Code) try to gray out code which does not match the requirements. While I want this most of the time, it is really annoying when working on cross-platform drivers. I could just disable it for the entire project, but in other parts of the project it is useful. I understand that this could probably be solved within VS Code. However, I am asking for alternative methods of selecting the right file/code for the platform because I am interested in seeing what other strategies there are.
Part of the "problem" is that support for Arduino is the primary focus, which means it can't easily be solved with makefile magic. My question is, what are alternative ways of implementing a solution to this problem, that are still readable?
If it cannot be done without makefile magic, then that is an answer too.
For reference, here is a simplified example of the header file and an implementation:
platform.h
#ifndef __PLATFORM__
#define __PLATFORM__
#include <stdint.h>   /* for int8_t */
int8_t t_open(void);
#endif //__PLATFORM__
platform_arduino.c
#include "platform.h"
int8_t t_open(void)
{
    // Implementation here
}
this Stack Overflow post (Cross-Platform C++ code and single header - multiple implementations) goes fairly in depth into how to handle this for C++, but I feel like none of those examples really apply to this C implementation.
I don't see why you say that. The first suggestions in the two highest-scoring answers are variations on the idea of using conditional macros, which not only is valid in C, but is a traditional approach. You yourself present an alternative along these lines.
Part of the "problem" is that support for Arduino is the primary focus, which means it can't easily be solved with makefile magic.
I take you to mean that the approach to platform adaptation has to be encoded somehow into the C source, as opposed to being handled via the build system. Frankly, this is an unusual constraint, except inasmuch as it can be addressed by use of the various system-identification macros provided by C compilers of interest.
Even if you don't want to rely specifically on makefiles, you should consider attributing some responsibility to the build system, which you can do even without knowing specifically what build system that is. For example, you can designate macro names, such as for_windows, etc that request builds for non-default platforms. You then leave it to the person building an instance of the driver to figure out how to configure their tools to provide the appropriate macro definition for their needs (which generally is not hard), based on your build documentation.
My question is, what are alternative ways of implementing a solution to this problem, that are still readable?
If the solution needs to be embodied entirely in the C source, then you have three main alternatives:
write code that just works correctly on all platforms, or
perform runtime detection and adaptation, or
use conditional compilation based on macros automatically defined by supported compilers.
If you're prepared to rely on macro definitions supplied by the user at build time, then the last becomes simply
use conditional compilation
Do not dismiss the first out of hand, but it can be a difficult path, and it might not be fully possible for your particular problem (and probably isn't if you're writing a driver or other code for a freestanding implementation).
Runtime adaptation could be viewed as a specific case of code that just works, but what I have in mind for this is a higher level of organization that performs runtime analysis of the host environment and chooses function variants and internal parameters suited to that, as opposed to those choices being made at compile time. This is a real thing that is occasionally done, but it may or may not be viable for your particular case.
On the other hand, conditional compilation is the traditional basis for platform adaptation in C, and the general form does not have the caveat of the other two that it might or might not work in your particular situation. The level of readability and maintainability you achieve this way is a function of the details of how you implement it.
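As a sketch of that last option, keying off macros the compilers already define (ARDUINO and __AVR__ come from the Arduino/AVR toolchain, _WIN32 from Windows compilers, __linux__ from Linux compilers; the file and function names are taken from the question's example):

#include <stdint.h>
#include "platform.h"

#if defined(ARDUINO) || defined(__AVR__)
int8_t t_open(void)
{
    /* Arduino implementation here */
    return 0;
}
#elif defined(_WIN32)
int8_t t_open(void)
{
    /* Windows implementation here */
    return 0;
}
#elif defined(__linux__)
int8_t t_open(void)
{
    /* Raspberry Pi / Linux implementation here */
    return 0;
}
#else
#error "no t_open() implementation for this platform"
#endif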
I have come up with some solutions like just adding the requirements for each platform at the top of the file. [...] But this seems like a clunky solution.
If you must include a source file in your build but you don't want anything in it to actually contribute to the target then that's exactly what you must do. You complain that "It will probably make debugging conflicting requirements a nightmare", but to the extent that that's a genuine issue, I think it's not so much a question of syntax as of the whole different code for different platforms plan.
You also complain that the conditional compilation option might be a practical difficulty for you with your choice of development tools. It certainly seems to me that there ought to be good workarounds for that available from your tools and development workflow. But if you must have a workaround grounded only in the C language, then there is one (albeit a bad one): introduce a level of preprocessing indirection. That is, put the conditional compilation directives in a different source file, like so:
platform.c
#if defined(for_windows)
#include "platform_windows.c"
#elif defined(for_rpi)
#include "platform_rpi.c"
#else
#include "platform_arduino.c"
#endif
You then designate platform.c as a file to be built, but not (directly) any of the specific-platform files.
This solves your tool-presentation issue because when you are working on one of the platform-specific .c files, the editor is unlikely to be able to tell whether it would actually be included in a build or not.
Do note well that it is widely considered bad practice to #include files containing function implementations, or those not ending with an extension conventionally designating a header. I don't say otherwise about the above, but I would say that if the whole platform.c contains nothing else, then that's about the least bad variation that I can think of within the category.

Compile-time test if function is optimized out

I'm writing a small operating system for microcontrollers in C (not C++, so I can't use templates). It makes heavy use of some gcc features, one of the most important being the removal of unused code. The OS doesn't load anything at runtime; the user's program and the OS source are compiled together to form a single binary.
This design allows gcc to include only the OS functions that the program actually uses. So if the program never uses i2c or USB, support for those won't be included in the binary.
The problem is when I want to include optional support for those features without introducing a dependency. For example, a debug console should provide functions to debug i2c if it's being used, but including the debug console shouldn't also pull in i2c if the program isn't using it.
The methods that come to mind to achieve this aren't ideal:
Have the user explicitly enable the modules they need (using #define), and use #if to only include support for them in the debug console if enabled. I don't like this method, because currently the user doesn't have to do this, and I'd prefer to keep it that way.
Have the modules register function pointers with the debug module at startup. This isn't ideal, because it adds some runtime overhead and means the debug code is split up over several files.
Do the same as above, but using weak symbols instead of pointers. But I'm still not sure how to actually accomplish this.
Do a compile-time test in the debug code, like:
if (i2cInit is used) {
    debugShowi2cStatus();
}
The last method seems ideal, but is it possible?
This seems like an interesting problem. Here's an idea, although it's not perfect:
Two-pass compile.
What you can do is first, compile the program with a flag like FINDING_DEPENDENCIES=1. Surround all the dependency checks with #ifs for this (I'm assuming you're not as concerned about adding extra ifs there.)
Then, when the compile is done (without any optional features), use nm or similar to detect the usage of functions/features in the program (such as i2cInit), and format this information into a .h file.
#ifndef FINDING_DEPENDENCIES
#include "dependency_info.h"
#endif
Now the optional dependencies are known.
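For illustration, a sketch of what the pieces might look like; the macro USES_I2C, the file dependency_info.h, and its generation step are hypothetical:

/* dependency_info.h -- regenerated between the two passes, e.g. by a
   script that runs nm on the first-pass binary and greps for i2cInit */
#define USES_I2C 1

/* debug console source */
void debugShowi2cStatus(void);      /* provided by the optional i2c module */

#ifdef FINDING_DEPENDENCIES
#define USES_I2C 0                  /* first pass: build with no optional features */
#else
#include "dependency_info.h"        /* second pass: real usage information */
#endif

void debugShowStatus(void)
{
#if USES_I2C
    debugShowi2cStatus();           /* compiled in only when i2c is actually used */
#endif
}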
This still doesn't seem like a perfect solution, but ultimately it's mostly a chicken-and-egg problem: when compiling, the compiler doesn't know which symbols are going to be gc'd out. You basically need to get this information from the linker stage and feed it back to the compilation stage.
Theoretically, this might not increase build times much, especially if you use a temp file for the generated .h and only replace it when it differs. You'd need to use different object dirs, though.
Also this might help (pre-strip, of course):
How can I view function names and parameters contained in an ELF file?

Is it right to simply include all header files?

Remembering the names of system header files is a pain...
Is there a way to include all existing header files at once?
Why doesn't anyone do that?
Including unneeded header files is a very bad practice. The issue of slowing down compilation might or might not matter; the bigger issue is that it hides dependencies. The set of header files you include in a source file is the documentation of what functionality the module depends upon, and unlike external documentation or comments, it is automatically checked for completeness by the compiler (failing to include a needed header file results in an error). Ensuring the absence of unwanted dependencies not only improves portability; it also helps you track down unneeded and potentially dangerous interactions, for instance cases where a module which should be purely computational or purely data-structure management is accessing the filesystem.
These principles apply whether the headers are standard system headers or headers for modules within your own program or third-party libraries.
Your source code files are preprocessed before the compiler proper looks at them, and #include is one of the directives the preprocessor handles. During preprocessing, each #include is replaced with the entire contents of the file being included. Including all of the system headers would therefore produce very large translation units that the compiler then needs to work through, which costs a lot of time during compilation.
No one includes all the header files. There are too many, and a few of them are mutually exclusive with other files (like ncurses.h and curses.h).
It really is not that bad, even when writing a program from scratch. A few are quite easy to remember: stdio.h for any FILE stuff, ctype.h for any character classification, stdlib.h for any use of malloc(), etc.
If you don't remember one:
leave the #include out
compile
examine first few error messages for indication of a missing header file, such as some type not declared, or calling a function with assumed parameter types
figure out which function call is the cause
look at the man page (or whatever documentation your compiler has) for that function
notice the #include shown by the documentation and add it
repeat until all errors fixed
It is quite a bit easier for adding to an existing code base. You could go hundreds or thousands of working hours and never have to add a #include.
No, it is a terrible idea; it will massively increase your compile times and possibly make your executable a lot larger by including massive amounts of unused code.
I know what you're talking about, but I need to double-check the function prototypes for the functions I'm using (for ones I don't use daily, anyway) -- I'll just copy and paste the #includes straight out of the manpage for the associated functions. I'm already looking at the manpage (it's a simple K in vim(1)), so it doesn't feel like an extra burden.
You can create a "master" header, where you put all your includes into. Then in everything else include it! Beware of conflicting definitions and circular references... So.... Master1.h, master2.h, ...
Not advocating it. Just saying.
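For concreteness, the idea looks something like this (the file name master.h is just an example):

/* master.h -- the "master" header idea: one header that includes everything */
#ifndef MASTER_H
#define MASTER_H

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
/* ...and every other header the project ever needs... */

#endif /* MASTER_H */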

Any good reason to #include source (*.c *.cpp) files?

I've been working for some time with an open-source library ("fast artificial neural network"). I'm using its source in my static library. When I compile it, however, I get hundreds of linker warnings, which are probably caused by the fact that the library includes its *.c files in other *.c files (I'm only including some headers I need, and I did not touch the code of the lib itself).
My question: is there a good reason why the developers of the library used this approach, which is strongly discouraged? (Or at least I've been told all my life that this is bad, and from my own experience I believe it IS bad.) Or is it just bad design and there is no gain in this approach?
I'm aware of this related question, but it does not answer my question. I'm looking for reasons that might justify this.
A bonus question: is there a way to fix this without touching the library code too much? I have a lot of work of my own and don't want to create more ;)
As far as I can see (grep '#include .*\.c'), they only do this in doublefann.c, fixedfann.c, and floatfann.c, and each time they include the reason:
/* Easy way to allow for build of multiple binaries */
This exact use of the preprocessor for simple copy-pasting is indeed the only valid use of including implementation (*.c) files, and relatively rare. (If you want to include some code for another reason, just give it a different name, like *.h or *.inc.) An alternative is to specify configuration in macros given to the compiler (e.g. -DFANN_DOUBLE, -DFANN_FIXED, or -DFANN_FLOAT), but they didn't use this method. (Each approach has drawbacks, so I'm not saying they're necessarily wrong, I'd have to look at that project in depth to determine that.)
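In the abstract, the pattern looks something like this; this is only a sketch of the idea, not the library's actual code, and the macro name FANN_TYPE is purely illustrative:

/* doublefann.c -- thin wrapper: one shared implementation file is included
   into several wrappers, each producing a differently-configured object   */
#define FANN_TYPE double    /* illustrative configuration macro */
#include "fann.c"           /* the shared implementation        */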
They provide makefiles and MSVS projects which should already avoid linking doublefann.o (from doublefann.c) with either fann.o (from fann.c) or fixedfann.o (from fixedfann.c), and so on; if you are seeing these warnings, either their project files are broken or something has gone wrong in your build setup.
Did you try to create a project from scratch (or use your existing project) and add all the files to it? If you did, what is happening is each implementation file is being compiled independently and the resulting object files contain conflicting definitions. This is the standard way to deal with implementation files and many tools assume it. The only possible solution is to fix the project settings to not link these together. (Okay, you could drastically change their source too, but that's not really a solution.)
While you're at it, if you continue without using their project settings, you can likely skip compiling fann.c et al., and possibly just removing those from the project is enough – then they won't be compiled and linked. You'll want to choose exactly one of double-/fixed-/floatfann to use, otherwise you'll get the same link errors. (I haven't looked at their instructions, but I would not be surprised to see this summary explained a bit more in depth there.)
Including C/C++ code leads to all the code being stuck together in one translation unit. With a good compiler, this can lead to a massive speed boost (as stuff can be inlined and function calls optimized away).
If actual code is going to be included like this, though, it should have static in most of its declarations, or it will cause the warnings you're seeing.
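A sketch of what that looks like (the file and function names here are hypothetical):

/* helper.c -- written to be #included into other translation units */
static int helper_add(int a, int b)   /* internal linkage: no multiple-definition
                                         clash if this file is included from
                                         several .c files in the same binary    */
{
    return a + b;
}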
If you ever declare a single global variable or function in that .c file, it cannot be included in two places which both compile to the same binary, or the two definitions will collide. If it is included in even one place, it cannot also be compiled on its own while still being linked into the same binary as its user.
If the file is only included in one place, why not just make it a discrete compilation unit (and use its globals via extern declarations)? Why bother having it included at all?
If your C files declare no global variables or functions, they are header files and should be named as such.
Therefore, by exhaustive search, I can say that the only time you would ever potentially want to include C files is if the same C code is used in building multiple different binaries. And even there, you're increasing your compile time for no real gain.
This is assuming that functions which should be inlined are marked inline and that you have a decent compiler and linker.
I don't know of a quick way to fix this.
I don't know that library, but as you describe it, it is either bad practice or your understanding of how to use it is not good enough.
A C project that wants to be included by others should always provide well-structured .h files for others and then the compiled library for linking. If it wants to include function definitions in header files it should either mark them as static (old-fashioned) or as inline (possible since C99).
I haven't looked at the code, but it's possible that the .c or .cpp files being included actually contain code that works in a header. For example, a template or an inline function. If that is the case, then the warnings would be spurious.
I'm doing this at the moment at home because I'm a relative newcomer to C++ on Linux and don't want to get bogged down in difficulties with the linker. But I wouldn't recommend it for proper work.
(I also once had to include a header.dat into a C++ program, because Rational Rose didn't allow headers to be part of the issued software and we needed that particular source file on the running system (for arcane reasons).)

Stubbing functions in simulations

I'm working on an embedded C project that depends on some external HW. I wish to stub out the code accessing these parts, so I can simulate the system without using any HW. Until now I have used some macros, but this forces me to change my production code a little, which I would like to avoid.
Example:
stub.h
#ifdef _STUB_HW
#define STUB_HW(name) Stub_##name
#else /*_STUB_HW*/
#define STUB_HW(name) name
#endif /*_STUB_HW*/
my_hw.c
WORD STUB_HW(clear_RX_TX)()
{ /* clear my rx/tx buffer on target HW */ }
test_my_hw.c
#ifdef _STUB_HW
WORD clear_RX_TX()
{ /* simulate clear rx/tx buffer on target HW */ }
#endif /*_STUB_HW*/
With this code I can turn the stubbing on/off with the preprocessor macro _STUB_HW.
Is there a way to accomplish this without having to change my production code, and avoiding a lot of ifdefs? And I won't mix production and test code in the same file if I can avoid it. I don't care how the test code looks as long as I can keep as much as possible out of the production code.
Edit:
It would be nice if it were possible to select/rename functions without replacing the whole file, for example taking all functions whose names start with nRF_##, giving them new names, and then mapping test_nRF_## to nRF_##, if that is possible.
I just make two files, ActualDriver.c and StubDriver.c, containing exactly the same function names. By making two builds that link the production code against the different objects, there are no naming conflicts. This way the production code contains no testing or conditional code.
As Gerhard said, use a common header file "driver.h" and separate hardware layer implementation files containing the actual and stubbed functions.
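A sketch of that layout, reusing the clear_RX_TX example from the question (the file names and the WORD typedef here are illustrative):

/* driver.h -- the shared interface both builds compile against */
#ifndef DRIVER_H
#define DRIVER_H
#include <stdint.h>
typedef uint16_t WORD;              /* illustrative; use your real type */
WORD clear_RX_TX(void);
#endif /* DRIVER_H */

/* actual_driver.c -- linked into the production build */
#include "driver.h"
WORD clear_RX_TX(void)
{
    /* clear the rx/tx buffer on the target HW */
    return 0;
}

/* stub_driver.c -- linked into the simulation build instead */
#include "driver.h"
WORD clear_RX_TX(void)
{
    /* pretend the buffer was cleared */
    return 0;
}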
In eclipse, I have two targets and I "exclude from build" the driver.c file that is not to be used and make sure the proper one is included in the build. Eclipse then generates the makefile at build time.
Another issue to point out is to ensure you are defining fixed-size integers so your code behaves the same from an overflow perspective. (Although from your code sample I can see you are doing that.)
I agree with the above. The standard solution to this is to define an opaque abstracted set of function calls that are the "driver" to the hw, and then call that in the main program. Then provide two different driver implementations, one for hw, one for sw. The sw variant will simulate the IO effect of the hw in some appropriate way.
Note that if the goal is at a lower level, i.e., writing code where each hardware access is to be simulated rather than entire functions, it might be a bit trickier. But here, different "write_to_memory" and "read_from_memory" functions (or macros, if speed on target is essential) could be defined.
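For instance, a sketch of the macro variant (SIMULATION_BUILD and sim_write are hypothetical names, and the cast assumes 32-bit registers):

#include <stdint.h>

void sim_write(uintptr_t addr, uint32_t val);    /* hypothetical simulation hook */

#ifdef SIMULATION_BUILD                          /* hypothetical build-time macro */
#define write_to_memory(addr, val)  sim_write((uintptr_t)(addr), (val))
#else
#define write_to_memory(addr, val)  (*(volatile uint32_t *)(addr) = (val))
#endif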
There is no need in either case to change the names of functions, just have two different batch files, make files, or IDE build targets (depending on what tools you are using).
Finally, in many cases a better technical solution is to go for a full-blown target system simulator, such as Qemu, Simics, SystemC, CoWare, VaST, or similar. This lets you run the same code all the time, and instead you build a model of the hardware that works like the actual hardware from the perspective of the software. It does take a much larger up-front investment, but for many projects it is well worth the effort. It basically gets rid of the nasty issue of having different builds for target and host, and makes sure you always use your cross-compiler with deployment build options. Note that many embedded compiler suites come with some basic such simulation ability built in.
