Proper feature checks for kernel module source code - c

Scenario 1: I'm trying to install IBM GPFS driver onto RHEL6 with a vanilla kernel 3.10 (actually, kernel-lt from Elrepo). The GPL part won't compile due to:
Too many/too few arguments passed to function
struct x has no such member
type mismatch
Their code compiles fine on stock RHEL/Suse kernels older or newer than mine, but fails here.
Scenario 2:
I'm trying to compile the opensource softiwarp driver on RHEL6 with stock kernel, but it fails with same errors as in scenario 1. However, it compiles fine on a vanilla kernel.
This all is because their feature-check headers look like this:
#if LINUX_KERNEL_VERSION >= 2061300
#define FOO <newer variant>
#else
#define FOO <older variant>
#endif
But RHEL and Suse have many backports and bugfixes, so their 3.10.101 is not the same as vanilla 3.10.101.
How to write code that will check features, not version number? In a user-space program I would use autoconf macros AC_CHECK_MEMBER/AC_CHECK_FUNC

How to write code that will check features, not version number? In a user-space program I would use autoconf macros AC_CHECK_MEMBER/AC_CHECK_FUNC
The standard preprocessor's capabilities are much less than some people seem to think. It has no ability to do what you want directly. Autoconf provides no magic in this regard, either; it simply performs tests at configuration time, often simply by checking whether the compiler accepts a given piece of code, and it communicates the results to the compiler largely by causing preprocessor macros to be defined. (And you are responsible for using those macros as needed in conditional tests much like the one in your example.)
Since we're talking about Autoconf, however, as long as it runs against the kernel headers that correspond to the kernel you're building for, at least some Autoconf macros should work for you, and you should be able to write custom Autoconf tests for others. Indeed, any issue that that the compiler can detect at build time, Autoconf should also be able to test for.
Of course, there is also the option of giving the module builder the ability to indicate needed configuration details explicitly where a thorny issue such as this arises. For example, adjust the feature selection macros to pay attention also to a symbol reserved for the builder to use to modulate the results.

Related

Conditional compilation based on functionality in Linux kernel headers

Consider the case where I'm using some functionality from the Linux headers exported to user space, such as perf_event_open from <linux/perf_event.h>.
The functionality offered by this API has changed over time, as members have been added to the perf_event_attr, such as perf_event_attr.cap_user_time.
How can I write source that compiles and uses these new functionalities if they are available locally, but falls back gracefully if they aren't and doesn't use them?
In particular, how can I detect in the pre-processor whether this stuff is available?
I've used this perf_event_attr as an example, but my question is a general one because structure members, new structures, definitions and functions are added all the time.
Note that here I'm only considering the case where a process is compiled on the same system that it will run on: if you want to compile on one host and run on another you need a different set of tricks.
Use the macros from /usr/include/linux/version.h:
#include <linux/version.h>
int main() {
#if LINUX_VERSION_CODE <= KERNEL_VERSION(2,6,16)
// ^^^^^^ change for the proper version when `perf_event_attr.cap_user_time` was introduced
// use old interface
#else
// use new interface
// use perf_event_attr.cap_user_time
#endif
}
You might go into this with the following assumptions
The features available in the header files correspond to those documented for the specific Linux version.
The kernel running during execution corresponds to <linux/version.h> during compilation
Ideally, I suggest not to rely on these two assumptions at all.
The first assumption fails primarily due to backports, e.g. in enterprise Linux versions based on ancient kernels. If you care about different versions, you probably care about them.
Instead, I recommend utilizing the methods for checking for struct members and include files in build system, e.g. for CMake:
CHECK_STRUCT_HAS_MEMBER("struct perf_event_attr" cap_user_time linux/perf_event.h HAVE_PERF_CAP_USER_TIME)
CHECK_INCLUDE_FILES can also be useful.
The second assumption can fail for many reasons, even if the binary is not moved between systems; E.g. updating the kernel but not recompiling the binary or simply booting another kernel. Specifically perf_event_open fails with EINVAL if a reserved bit is set. This allows you to retry with an alternative implementation not using the requested feature.
In short, statically check for the feature instead of the version. Dynamically, try and retry the legacy implementation if it failed.
Just in addition to other answers.
If you're aiming for supporting both cross-version and cross-distro code, you should also keep in mind that there are distros (Centos/RHEL) which pull some recent changes from new kernels to old. So you may encounter a situation in which you'll have LINUX_VERSION_CODE equal to some old kernel version, but there will be some changes (new fields in data structures, new functions, etc.) from recent kernel. In such case this macro is insufficient.
You can add something like (to avoid preprocessor errors in case it is not a Centos distro):
#ifndef RHEL_RELEASE_CODE
#define RHEL_RELEASE_CODE 0
#endif
#ifndef RHEL_RELEASE_VERSION
#define RHEL_RELEASE_VERSION(x,y) 1
#endif
And use it with > or >= where you need:
#if LINUX_VERSION_CODE >= KERNEL_VERSION(4,3,0) || RHEL_RELEASE_CODE > RHEL_RELEASE_VERSION(7,2)
...
for Centos/RHEL custom kernels support.
P.S. of course it's necessary to examine an appropriate versions of Centos/RHEL, and understand when and what exactly has changed in the code sections that affect you.

Compile-time test if function is optimized out

I'm writing a small operating system for microcontrollers in C (not C++, so I can't use templates). It makes heavy use of some gcc features, one of the most important being the removal of unused code. The OS doesn't load anything at runtime; the user's program and the OS source are compiled together to form a single binary.
This design allows gcc to include only the OS functions that the program actually uses. So if the program never uses i2c or USB, support for those won't be included in the binary.
The problem is when I want to include optional support for those features without introducing a dependency. For example, a debug console should provide functions to debug i2c if it's being used, but including the debug console shouldn't also pull in i2c if the program isn't using it.
The methods that come to mind to achieve this aren't ideal:
Have the user explicitly enable the modules they need (using #define), and use #if to only include support for them in the debug console if enabled. I don't like this method, because currently the user doesn't have to do this, and I'd prefer to keep it that way.
Have the modules register function pointers with the debug module at startup. This isn't ideal, because it adds some runtime overhead and means the debug code is split up over several files.
Do the same as above, but using weak symbols instead of pointers. But I'm still not sure how to actually accomplish this.
Do a compile-time test in the debug code, like:
if(i2cInit is used) {
debugShowi2cStatus();
}
The last method seems ideal, but is it possible?
This seems like an interesting problem. Here's an idea, although it's not perfect:
Two-pass compile.
What you can do is first, compile the program with a flag like FINDING_DEPENDENCIES=1. Surround all the dependency checks with #ifs for this (I'm assuming you're not as concerned about adding extra ifs there.)
Then, when the compile is done (without any optional features), use nm or similar to detect the usage of functions/features in the program (such as i2cInit), and format this information into a .h file.
#ifndef FINDING_DEPENDENCIES
#include "dependency_info.h"
#endif
Now the optional dependencies are known.
This still doesn't seem like a perfect solution, but ultimately, it's mostly a chicken-and-the-egg problem. When compiling, the compiler doesn't know what symbols are going to be gc'd out. You basically need to get this information from the linker stage and feed it back to the compilation stage.
Theoretically, this might not increase build times much, especially if you used a temp file for the generated h, and then only replaced it if it was different. You'd need to use different object dirs, though.
Also this might help (pre-strip, of course):
How can I view function names and parameters contained in an ELF file?

What does gcc -D_REENTRANT really do?

I am writing Java bindings for a C library, and therefore working with JNI. Oracle specifies, reasonably, that native libraries for use with Java should be compiled with multithread-aware compilers.
The JNI docs give the specific example that for gcc, this multithread-awareness requirement should be met by defining one of the macros _REENTRANT or _POSIX_C_SOURCE. That seems odd to me. _REENTRANT and _POSIX_C_SOURCE are feature-test macros. GCC and POSIX documentation describe their effects in terms of defining symbols and making declarations visible, just as I would expect for any feature-test macro.
If I do not need the additional symbols or functions, then do these macros in fact do anything useful for me? Does one or both cause gcc to generate different code than it otherwise would? Do they maybe cause my code's calls to standard library functions to be linked to different implementations? Or is Oracle just talking out of its nether regions?
Edit:
Additionally, it occurs to me that reentrancy is a separate consideration from threading. Non-reentrancy can be an issue even for single-threaded programs, so Oracle's suggestion that defining _REENTRANT makes gcc multithread-aware now seems even more dubious.
The Oracle recommendation was written for Solaris, not for Linux.
On Solaris, if you compiled a .so without _REENTRANT and ended up loaded by a multi-threaded application then very bad things could happen (e.g. random data corruption of libc internals). This was because without the define you ended up with unlocked variants of some routines by default.
This was the case when I first read this documentation, which was maybe 15 years ago, the mention of the -mt flag for the sun studio compiler was added after I last read this document in any detail.
This is no longer the case - You always get the same routine now whether or not you compile with the _REENTRANT flag; it's now only a feature macro, and not a behaviour macro.

Can I run GCC as a daemon (or use it as a library)?

I would like to use GCC kind of as a JIT compiler, where I just compile short snippets of code every now and then. While I could of course fork a GCC process for each function I want to compile, I find that GCC's startup overhead is too large for that (it seems to be about 50 ms on my computer, which would make it take 50 seconds to compile 1000 functions). Therefore, I'm wondering if it's possible to run GCC as a daemon or use it as a library or something similar, so that I can just submit a function for compilation without the startup overhead.
In case you're wondering, the reason I'm not considering using an actual JIT library is because I haven't found one that supports all the features I want, which include at least good knowledge of the ABI so that it can handle struct arguments (lacking in GNU Lightning), nested functions with closure (lacking in libjit) and having a C-only interface (lacking in LLVM; I also think LLVM lacks nested functions).
And no, I don't think I can batch functions together for compilation; half the point is that I'd like to compile them only once they're actually called for the first time.
I've noticed libgccjit, but from what I can tell, it seems very experimental.
My answer is "No (you can't run GCC as a daemon process, or use it as a library)", assuming you are trying to use the standard GCC compiler code. I see at least two problems:
The C compiler deals in complete translation units, and once it has finished reading the source, compiles it and exits. You'd have to rejig the code (the compiler driver program) to stick around after reading each file. Since it runs multiple sub-processes, I'm not sure that you'll save all that much time with it, anyway.
You won't be able to call the functions you create as if they were normal statically compiled and linked functions. At the least you will have to load them (using dlopen() and its kin, or writing code to do the mapping yourself) and then call them via the function pointer.
The first objection deals with the direct question; the second addresses a question raised in the comments.
I'm late to the party, but others may find this useful.
There exists a REPL (read–eval–print loop) for c++ called Cling, which is based on the Clang compiler. A big part of what it does is JIT for c & c++. As such you may be able to use Cling to get what you want done.
The even better news is that Cling is undergoing an attempt to upstream a lot of the Cling infrastructure into Clang and LLVM.
#acorn pointed out that you'd ruled out LLVM and co. for lack of a c API, but Clang itself does have one which is the only one they guarantee stability for: https://clang.llvm.org/doxygen/group__CINDEX.html

removing unneeded code from gcc andd mingw

i noticed that mingw adds alot of code before calling main(), i assumed its for parsing command line parameters since one of those functions is called __getmainargs(), and also lots of strings are added to the final executable, such as mingwm.dll and some error strings (incase the app crashed) says mingw runtime error or something like that.
my question is: is there a way to remove all this stuff? i dont need all these things, i tried tcc (tiny c compiler) it did the job. but not cross platform like gcc (solaris/mac)
any ideas?
thanks.
Yes, you really do need all those things. They're the startup and teardown code for the C environment that your code runs in.
Other than non-hosted environments such as low-level embedded solutions, you'll find pretty much all C environments have something like that. Things like /lib/crt0.o under some UNIX-like operating systems or crt0.obj under Windows.
They are vital to successful running of your code. You can freely omit library functions that you don't use (printf, abs and so on) but the startup code is needed.
Some of the things that it may perform are initialisation of atexit structures, argument parsing, initialisation of structures for the C runtime library, initialisation of C/C++ pre-main values and so forth.
It's highly OS-specific and, if there are things you don't want to do, you'll probably have to get the source code for it and take them out, in essence providing your own cut-down replacement for the object file.
You can safely assume that your toolchain does not include code that is not needed and could safely be left out.
Make sure you compiled without debug information, and run strip on the resulting executable. Anything more intrusive than that requires intimate knowledge of your toolchain, and can result in rather strange behaviour that will be hard to debug - i.e., if you have to ask how it could be done, you shouldn't try to do it.

Resources