Which system call macro to use - c

There are some Linux system calls (e.g., gettid) which have no glibc wrapper and thus must be invoked via the syscall function.
Looking in /usr/include/x86_64-linux-gnu/bits/syscall.h, I see definitions like
#ifdef __NR_gettid
# define SYS_gettid __NR_gettid
#endif
What is the reason for two aliases for the same system call number? Is one more portable/stable? I guess my question boils down to: Which macro should I be using in my code?
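For what it's worth, the example in the syscall(2) man page includes <sys/syscall.h> and passes the SYS_ name, so that spelling seems to be the one intended for user code. A minimal sketch on that assumption (the helper name current_tid is made up for illustration):

```c
#define _GNU_SOURCE        /* exposes the syscall() declaration in <unistd.h> */
#include <assert.h>
#include <sys/syscall.h>   /* SYS_gettid and the other SYS_* constants */
#include <sys/types.h>     /* pid_t */
#include <unistd.h>        /* syscall() */

/* Hypothetical helper: returns the kernel thread ID of the calling thread. */
static pid_t current_tid(void)
{
    long tid = syscall(SYS_gettid);
    assert(tid > 0);       /* gettid is documented as always succeeding */
    return (pid_t)tid;
}
```

Calling current_tid() from the main thread of a process should give the same value as getpid(); other threads get their own IDs.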

Related

Linux Kernel should I use asmlinkage for a function that implements a system call?

I am trying to implement a new syscall in linux kernel, so I wrote:
asmlinkage int my_func(void) {
return my_func_internal();
}
my question, should I define my_func_internal as asmlinkage or not?
in other words, should I write A or B?
A) asmlinkage int my_func_internal(void) {return 1;}
B) int my_func_internal(void) {return 1;}
I would like some explanation too
Note: I have added my_func to syscalls.h should I add the internal one too (probably the answer is no)
It doesn't matter (for correctness) what calling convention you use for functions that aren't called directly by hand-written asm. (Which syscall implementation functions might be on some architectures, that's why they should be asmlinkage.) As long as all callers can see a prototype that matches the definition, it will work.
If asmlinkage is a different calling convention from the default one (e.g. on i386, asmlinkage means to use stack args, overriding the -mregparm=3 build option that makes internal functions use register args), the compiler will have to emit a definition for my_func that handles the difference if it calls a function that isn't asmlinkage. Or it can simply inline my_func_internal() into it.
If they use the same calling convention, and the compiler chooses not to inline, it could just do an optimized tailcall to my_func_internal, e.g. on x86 jmp my_func_internal. So there's a possible efficiency advantage to using the same calling convention if there's a possibility of an optimized tailcall. Otherwise don't; asmlinkage makes the calling convention less efficient on i386.
(IIRC, asmlinkage has no effect on x86-64 and most other modern ISAs with register-args calling conventions; the default calling convention on x86-64 is already good, so the kernel doesn't need to override it the way it does with -mregparm=3 on i386.)
In your example where there are no args, there's no difference.
BTW, the usual naming convention for the function name is sys_foo to implement a system-call called foo. i.e. the function that will get called when user-space passes __NR_foo as the call number.
Note: I have added my_func to syscalls.h should I add the internal one too (probably the answer is no)
Of course not, unless my_func_internal implements a different system call that you want user-space to be able to call directly.

Removing functions included from a header from scope of the next files

In my project we make heavy use of a C header that provides an API for communicating with an external piece of software. Long story short, bugs in our project show up most often around calls to the functions declared in those headers (it is old and ugly legacy code).
I would like to add a layer of indirection around those calls, so that I can do some profiling before calling the actual implementation.
Because I'm not the only person working on this project, I would like to write those wrappers in such a way that calling the original implementations directly causes a compile error.
If those headers were C++ sources, I could simply make a namespace, wrap the included files in it, and implement my functions using it (the other developers could still reach the original implementation with the :: operator, but merely not being able to call it directly is enough encapsulation for me). However, the headers are C sources (which I have to include inside an extern "C" block), so namespaces won't help me AFAIK.
I tried to play around with defines, but with no luck, like this:
#define my_func api_func
#define api_func NULL
What I wanted with the above code is to make my_func to be translated to api_func during the preprocessing, while making a direct call to api_func give a compile error, but that won't work because it will actually make my_func to be translated to NULL too.
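The rescanning behaviour described here can be observed directly. The sketch below is only an illustration: STR, XSTR and expansion_of_my_func are made-up helper names, and BLOCKED stands in for NULL so that the result does not depend on how the C library spells NULL.

```c
#include <assert.h>
#include <string.h>

/* Two-level stringification: XSTR fully expands its argument, then STR quotes it. */
#define STR(x)  #x
#define XSTR(x) STR(x)

/* Same shape as the attempt above, with BLOCKED in place of NULL. */
#define my_func  api_func
#define api_func BLOCKED

/* Returns the text that a use of my_func turns into after preprocessing. */
static const char *expansion_of_my_func(void)
{
    const char *text = XSTR(my_func);
    assert(text[0] != '\0');   /* the expansion is never empty here */
    return text;
}
```

Because the result of expanding my_func is rescanned for further macros, the string comes out as "BLOCKED" rather than "api_func", which is why the two defines cannot be combined this way.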
So, basically, I would like to make a wrapper, and make sure the only way to access the API is through this wrapper (unless the other developers make some workaround, but this is inevitable).
Please note that I need to wrap hundreds of functions, which show up spread in the whole code several times.
My wrapper will necessarily have to include those C headers, but I would like them to go out of scope outside my wrapper's file, so that they are unavailable to every other file that includes my wrapper. I guess this is not possible in C/C++.
You have several options, none of them wonderful.
If you have the sources of the legacy software, so that you can recompile it, you can just change the names of the API functions to make room for the wrapper functions. If you additionally make the original functions static and put the wrappers in the same source files, then you can ensure that the originals are called only via the wrappers. Example:
static int api_func_real(int arg);
int api_func(int arg) {
// ... instrumentation ...
int result = api_func_real(arg);
// ... instrumentation ...
return result;
}
static int api_func_real(int arg) {
// ...
}
The preprocessor can help you with that, but I hesitate to recommend specifics without any details to work with.
If you do not have sources for the legacy software, or if you are otherwise unwilling to modify it, then you need to make all the callers call your wrappers instead of the original functions. In this case you can modify the headers, or include an additional header ahead of them that uses #define to change each of the original function names. That header must not be included in the source files containing the API function implementations, nor in those providing the wrapper function implementations. Each define would be of the form:
#define api_func api_func_wrapper
You would then implement the various api_func_wrapper() functions.
Among the ways those cases differ is that if you change the legacy function names, then internal calls among those functions will go through the wrappers bearing the original names (unless you change the calls, too), but if you implement wrappers with new names then they will be used only when called explicitly, which will not happen for internal calls within the legacy code (unless, again, you modify those calls).
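A single-file sketch of the second option, under the assumption that everything lives in one translation unit for brevity (in a real project the three parts would sit in the legacy library, the wrapper source file, and the redirect header respectively). The function body, the counter, and the precondition are hypothetical:

```c
#include <assert.h>

/* --- stand-in for a legacy API function (hypothetical body) --- */
static int api_func(int arg) { return arg * 2; }

/* --- wrapper, compiled before the redirect below is visible --- */
static int api_func_calls = 0;          /* simple instrumentation counter */

static int api_func_wrapper(int arg)
{
    assert(arg >= 0);                   /* example of a check a wrapper can add */
    ++api_func_calls;
    return api_func(arg);               /* still refers to the real function */
}

/* --- redirect seen by ordinary callers from here on --- */
#define api_func api_func_wrapper

static int caller(int x)
{
    return api_func(x);                 /* expands to api_func_wrapper(x) */
}
```

After caller(21) returns 42, api_func_calls is 1, which shows the call went through the wrapper. Note that this redirects callers but does not stop anyone from writing #undef api_func.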
You can do something like
[your wrapper's include file, wrapper.h]
int origFunc1 (int x);
int origFunc2 (int x, int y);
/* Declared unconditionally so that redirected callers also see a prototype. */
int wrappedFunc1(int x);
int wrappedFunc2(int x, int y);
#ifndef WRAPPER_IMPL
#define origFunc1 wrappedFunc1
#define origFunc2 wrappedFunc2
#endif
[your wrapper implementation]
#include <stdio.h>
#define WRAPPER_IMPL
#include "wrapper.h"
int wrappedFunc1 (int x) {
    printf("Wrapper1 called\n");
    return origFunc1(x); /* not redirected here, because WRAPPER_IMPL is defined */
}
Your wrapper's C file obviously needs to #define WRAPPER_IMPL before including the header.
That is neither nice nor clean (and anyone who wants to cheat can simply define WRAPPER_IMPL), but it is at least a starting point.
There are two ways to wrap or override C functions in Linux:
Using LD_PRELOAD:
There is a shell environment variable in Linux called LD_PRELOAD,
which can be set to a path of a shared library,
and that library will be loaded before any other library (including glibc).
Using 'ld --wrap=symbol':
This makes the linker use a wrapper function for symbol.
Any undefined reference to symbol is resolved to __wrap_symbol, and any undefined reference to __real_symbol is resolved to the original symbol.
A complete writeup can be found at:
http://samanbarghi.com/blog/2014/09/05/how-to-wrap-a-system-call-libc-function-in-linux/

Purpose of `#ifdef MODULE` around module_exit()?

I am currently looking through the code of a "third-party" driver in an attempt to figure out/learn how it functions. I've had a look at sites such as this one, so I sort of understand how the basic premise works, but I don't understand the purpose of #ifdef MODULE here. Google isn't really much help, but I think the definition refers to a kernel module? (I am also completely new to this.)
module_init(os_driver_init);
#ifdef MODULE
module_exit(os_driver_cleanup);
#endif
My question is, what happens if I remove the #ifdef statement? Also, why/when would it be necessary to include the #ifdef statement?
In the Linux kernel, most drivers can be either statically linked (built-in) to the kernel image itself, or built as dynamically-loaded modules (.ko files).
The MODULE macro is defined for a C file when it is being compiled as part of a module, and undefined when it is being built directly into the kernel.
The code you're showing is only defining os_driver_cleanup as a module-exit function when it is being compiled as a module. However, this construct is unnecessary in modern kernel code; include/linux/init.h defines module_exit() as a macro, whose implementation depends on #ifdef MODULE.
Basically, you should always provide an exit function and leave off the #ifdef around module_exit(). You should also mark your exit function with __exit, which properly controls whether that code is included in the modular and non-modular cases.
Here's an example of proper init/exit code.
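#include <linux/init.h>
#include <linux/module.h>
static int __init foo_init(void)
{
    /* Register driver, etc. */
    return 0; /* 0 means success; return a negative errno on failure */
}
static void __exit foo_cleanup(void)
{
    /* Unregister driver, etc. */
}
module_init(foo_init);
module_exit(foo_cleanup);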

What is __FUNCT__ for?

I was looking at some PETSc example code, and I came across this snippet:
#undef __FUNCT__
#define __FUNCT__ "main"
right before main begins.
Is setting __FUNCT__ or something like it before every function (or just main?) a standard C programming convention?
If so, why is this done?
From comments in the PETSc source code (${PETSC_DIR}/src/snes/examples/tutorials/ex3.c, lines 33-40):
Note that immediately before each routine below,
we define the macro __FUNCT__ to be a string containing the routine name.
If defined, this macro is used in the PETSc error handlers to provide a
complete traceback of routine names. All PETSc library routines use this
macro, and users can optionally employ it as well in their application
codes. Note that users can get a traceback of PETSc errors regardless of
whether they define __FUNCT__ in application codes; this macro merely
provides the added traceback detail of the application routine names.
Looking at petsc.h, there appear to be a bunch of macros which pass __FUNCT__ as a parameter to a function, e.g.:
#define PetscFree(a) ((a) ? ((*PetscTrFree)((a),__LINE__,__FUNCT__,__FILE__,__SDIR__) || ((a = 0),0)) : 0)
My guess is that PetscTrFree() (etc.) take these arguments for debugging/logging purposes.
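To make the mechanism concrete, here is a small sketch of the same pattern outside PETSc. The names last_routine, RECORD_ROUTINE, solve and assemble are invented for illustration and are not PETSc's; the macro just stores the current value of __FUNCT__ where an error handler could find it.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for what an error handler would record. */
static const char *last_routine = "";

/* Picks up whatever __FUNCT__ is defined as at the point of use. */
#define RECORD_ROUTINE() (last_routine = __FUNCT__)

#undef __FUNCT__
#define __FUNCT__ "solve"
static int solve(int x)
{
    RECORD_ROUTINE();            /* expands to last_routine = "solve" */
    return x + 1;
}

#undef __FUNCT__
#define __FUNCT__ "assemble"
static int assemble(int x)
{
    RECORD_ROUTINE();            /* expands to last_routine = "assemble" */
    /* C99's __func__ gives the same name without any #define. */
    assert(strcmp(__func__, __FUNCT__) == 0);
    return x * 2;
}
```

The manual #undef/#define pair before each routine is what keeps the recorded name in step with the function that follows it; forgetting to update it silently reports the previous routine's name.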
This appears to be a work-around for C compilers that support neither the standard __func__ identifier (C99) nor the older __FUNCTION__ extension.
First, #undef removes any previous definition of __FUNCT__; then the line #define __FUNCT__ "main" defines it again as the string "main".
Personally I've never seen anyone set it to "main". I can see it being useful if you want to use a library but don't want to use its declared function name, though I don't know why you would use a macro for that instead of writing another function that takes the same parameters and naming it whatever you want.
In any case, I do not believe this is a standard C programming convention, and from the limited code snippet it is not clear exactly what it is being used for or why it is done.

Question about a wrapper macro function

I was reading jemalloc's realloc function and noticed that all the non-static functions (at least the ones I've seen) in jemalloc are wrapped with the JEMALLOC_P macro, and JEMALLOC_P is:
#define JEMALLOC_P(s) s
Why would they need such a thing?
From the jemalloc configure script:
AC_DEFINE_UNQUOTED([JEMALLOC_P(string_that_no_one_should_want_to_use_as_a_jemalloc_API_prefix)], [${JEMALLOC_PREFIX}##string_that_no_one_should_want_to_use_as_a_jemalloc_API_prefix])
I'd guess that it is intended to provide a prefix for all of the jemalloc functions.
You'll also see things like this in jemalloc.h:
void *JEMALLOC_P(malloc)(size_t size)
So, by default, jemalloc takes over the malloc() name but if you need to still use plain malloc() then you could
#define JEMALLOC_P(s) je_##s
and get je_malloc() and plain malloc() at the same time.
You should look at the context that line is in. The code is actually:
#ifndef JEMALLOC_P
# define JEMALLOC_P(s) s
#endif
This means that, prior to including the header file, you could have provided your own definition of JEMALLOC_P(). If you haven't, the definition above is the default.
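The two answers can be combined into one small sketch. DEMO_P, demo_alloc_count and demo_both_allocators are hypothetical names used only here, and the body of je_malloc simply forwards to the C library, which is not what jemalloc does:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Defined before the "header" part, as a build that wants a prefix might do. */
#define DEMO_P(s) je_##s

/* --- what the header would contain --- */
#ifndef DEMO_P
# define DEMO_P(s) s              /* default: no prefix */
#endif

/* Declares je_malloc, because DEMO_P was already defined above. */
void *DEMO_P(malloc)(size_t size);

/* --- hypothetical implementation that just forwards to the C library --- */
static size_t demo_alloc_count = 0;

void *DEMO_P(malloc)(size_t size)
{
    ++demo_alloc_count;
    return malloc(size);          /* plain malloc is left untouched */
}

/* Both names are usable side by side. */
static void demo_both_allocators(void)
{
    void *a = je_malloc(32);
    void *b = malloc(32);
    assert(a != NULL && b != NULL);
    free(a);
    free(b);
}
```

Removing the first #define makes the #ifndef branch take effect, and the same declaration would then try to define malloc itself, which is the takeover behaviour described above.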
