Does kallsyms have all the symbol of kernel functions?

Does kallsyms have all the symbol of kernel functions? - c

In the Linux kernel I want to probe the kernel function effective_prio(). It defined as static.
When I go to search the symbol of it in the kallsyms I cannot find it. Does kallsyms have all the symbol of the kernel functions? If not, which symbols are not included?

There are two possibilities for a function not appearing in /proc/kallsyms:
If the function is marked as static, and the compiler decides to inline the function (with or without the inline keyword)
If a config option or another #define removes a function from being compiled, e.g.:
#ifdef CONFIG_OPT
void foo(void) {
}
#endif
As far as I know, if a function does not appear in /proc/kallsyms, it is not possible to call or probe it from a module.
However, /proc/kallsyms contains all functions of the kernel, not just the ones exported via EXPORT_SYMBOL/EXPORT_SYMBOL_GPL.

CONFIG_KALLSYMS_ALL=y is also required to see non-static variables, e.g.:
grep sysctl_sched_nr_migrate /proc/kallsyms
which is defined as:
const_debug unsigned int sysctl_sched_nr_migrate = 32;

kallsyms only lists the symbols exported by EXPORT_SYMBOL and EXPORT_SYMBOL_GPL macros.
This is done for security. We don'T usually want modules to be able to access for example internal or security functions. Those just go against the idea of making kernel modules as safe as possible, but allowing them to do as much as it is possible.

Related

Include standard library functions in a specific memory section

In C, we can force the linker to put a specific function in a specific section from the source code, using something like the following example.
Here, the function my_function is tagged with the preprocessor macro PUT_IN_USER_SECTION in order to tell the linker to put it in the section .user_section.
#define PUT_IN_USER_SECTION __attribute__((__section__(".user_section"))) __attribute__ ((noinline))
double PUT_IN_USER_SECTION my_function(double a, double b)
{
// Function content
}
Now, what I'd like to know is, when we use standard functions (for example log functions from the math.h library) from the GLIBC, MUSL, ... and we perform static linking: is it possible to put those functions in specific sections? and how to that?

There is a technique called overlaying that comes from a time when memory resources were limited. It is still used in Embedded Systems with restricted resources :
https://en.wikipedia.org/wiki/Overlay_(programming)
source : https://de.wikipedia.org/wiki/Overlay_(Programmierung)#/media/Datei:Overlay_Programming.svg
In https://developer.arm.com/documentation/100066/0611/mapping-code-and-data-to-target-memory/manual-overlay-support/writing-an-overlay-manager-for-manually-placed-overlays is an implementation of overlaying, the load addresses are in the scatter file.
https://sourceware.org/gdb/onlinedocs/gdb/How-Overlays-Work.html
For the addresses of functions in memory see https://ftp.gnu.org/pub/old-gnu/Manuals/ld-2.9.1/html_node/ld_22.html
https://www.keil.com/support/man/docs/armlink/armlink_pge1362066004071.htm

Does XC8 compiler support weak symbols?

gcc has __attribute__((weak)) which allows to create a weak symbol such as a function. This allows the user to redefine a function. I would like to have the same behavior in XC8.
More info:
I am writing a driver for XC8 and I would like to delegate low level initialization to a user defined function.
I know it is possible to redefine a function: there is the putch function that is implemented in XC8's source file and which is called by the printf function. The user is allowed to reimplement putch inside his application. There are two functions with the same name, but no error is raised.
putch's implementation in XC8's source files has a comment saying "Weak implementation. User implementation may be required", so it must be possible.
I looked at pragmas in XC8's user guide, but there is no directive related to this question.

A linker will only search static libraries to resolve a symbol that is not already resolved by input object files, so replacing static library functions can be done without weak linkage. Weak linkage is useful for code provided as source or object code rather then as a static library.
So if no weak linkage directive is supported, you could create a static library for the "weak" symbols and link that.
The XC8 manual documents behaviour for both the IAR compatibility directive __weak and a weak pragma, and in both cases the directives are ignored (supported only in XC16 and XC32), so you will have to use the above suggested method, which is in any case far more portable - if somewhat inconvenient.
In the case of putch() I suspect that this is not working as you believe. I would imagine that this is not a matter of weak linkage at all; in the static library containing printf() an unresolved link to putch() exists, and the linker resolves it with whatever you provide; if you were to compile and link both the Microchip implementation and yours from source code you would get a linker error; equally if you were to provide no implementation whatsoever you would get a linker error.

XC8 compiler does support the "weak" attribute.
The weak attribute causes the declaration to be emitted as a weak symbol. A weak symbol indicates that if a global version of the same symbol is available, that version should be used instead. When the weak attribute is applied to a reference to an external symbol, the symbol is not required for linking.
For example:
extern int __attribute__((weak)) s;
int foo(void)
{
if (&s)
return s;
return 0; /* possibly some other value */
}
In the above program, if s is not defined by some other module, the program will still link but s will not be given an address.
The conditional verifies that s has been defined (and returns its value if it has). Otherwise '0' is returned.
There are many uses for this feature, mostly to provide generic code that can link with an optional library.
A variable can also be qualified with the "weak" attribute.
For example:
char __attribute__((weak)) input;
char input __attribute__((weak));

How static init and exit functions in a module are called outside the file by the kernel?

I am reading LDD3 and had a doubt regarding the usage of static storage class in the __init and __exit function calls.
http://static.lwn.net/images/pdf/LDD3/ch02.pdf
"Initialization functions should be declared static, since they are
not meant to be visible outside the specific file; there is no hard
rule about this, though, as no function is exported to the rest of the
kernel unless explicitly requested"
But then the kernel is able to use init and exit function using insmod and rmmod system calls. If static functions are functions that are only visible to other functions in the same file, then how is the kernel able to use the __init and __exit functions defined static in our module?

Static function in one translation unit can be exported through back-doors like function pointers. The dynamic linking uses the same approach.
In your driver code you must also be writing
module_init(__init_function_name__) and module_exit(exit_function_name).
You may see how these macros are implemented.

Technically in C, by exporting function pointer of static function in one file and calling them from another file using extern function pointer is possible. And many such framework like Linux kernel are doing the same.
module_inint(your_func);
module_init is macro which expands such that your your_func's function pointer is getting added in one array. Which getting called at boot up time or module loading time.
A great explannation about module_init() is at https://stackoverflow.com/a/18606561/775964

How can I share an array between the C files of a library, without the array being visible to the outside? [duplicate]

I am writing a C (shared) library. It started out as a single translation unit, in which I could define a couple of static global variables, to be hidden from external modules.
Now that the library has grown, I want to break the module into a couple of smaller source files. The problem is that now I have two options for the mentioned globals:
Have private copies at each source file and somehow sync their values via function calls - this will get very ugly very fast.
Remove the static definition, so the variables are shared across all translation units using extern - but now application code that is linked against the library can access these globals, if the required declaration is made there.
So, is there a neat way for making private global variable shared across multiple, specific translation units?

You want the visibility attribute extension of GCC.
Practically, something like:
#define MODULE_VISIBILITY __attribute__ ((visibility ("hidden")))
#define PUBLIC_VISIBILITY __attribute__ ((visibility ("default")))
(You probably want to #ifdef the above macros, using some configuration tricks à la autoconfand other autotools; on other systems you would just have empty definitions like #define PUBLIC_VISIBILITY /*empty*/ etc...)
Then, declare a variable:
int module_var MODULE_VISIBILITY;
or a function
void module_function (int) MODULE_VISIBILITY;
Then you can use module_var or call module_function inside your shared library, but not outside.
See also the -fvisibility code generation option of GCC.
BTW, you could also compile your whole library with -Dsomeglobal=alongname3419a6 and use someglobal as usual; to really find it your user would need to pass the same preprocessor definition to the compiler, and you can make the name alongname3419a6 random and improbable enough to make the collision improbable.
PS. This visibility is specific to GCC (and probably to ELF shared libraries such as those on Linux). It won't work without GCC or outside of shared libraries.... so is quite Linux specific (even if some few other systems, perhaps Solaris with GCC, have it). Probably some other compilers (clang from LLVM) might support also that on Linux for shared libraries (not static ones). Actually, the real hiding (to the several compilation units of a single shared library) is done mostly by the linker (because the ELF shared libraries permit that).

The easiest ("old-school") solution is to simply not declare the variable in the intended public header.
Split your libraries header into "header.h" and "header-internal.h", and declare internal stuff in the latter one.
Of course, you should also take care to protect your library-global variable's name so that it doesn't collide with user code; presumably you already have a prefix that you use for the functions for this purpose.
You can also wrap the variable(s) in a struct, to make it cleaner since then only one actual symbol is globally visible.

You can obfuscate things with disguised structs, if you really want to hide the information as best as possible. e.g. in a header file,
struct data_s {
void *v;
};
And somewhere in your source:
struct data_s data;
struct gbs {
// declare all your globals here
} gbss;
and then:
data.v = &gbss;
You can then access all the globals via: ((struct gbs *)data.v)->

I know that this will not be what you literally intended, but you can leave the global variables static and divide them into multiple source files.
Copy the functions that write to the corresponding static variable in the same source file also declared static.
Declare functions that read the static variable so that external source files of the same module can read it's value.
In a way making it less global. If possible, best logic for breaking big files into smaller ones, is to make that decision based on the data.
If it is not possible to do it this way than you can bump all the global variables into one source file as static and access them from the other source files of the module by functions, making it official so if someone is manipulating your global variables at least you know how.
But then it probably is better to use #unwind's method.

How to create modules in C

I have an interface with which I want to be able to statically link modules. For example, I want to be able to call all functions (albeit in seperate files) called FOO or that match a certain prototype, ultimately make a call into a function in the file without a header in the other files. Dont say that it is impossible since I found a hack that can do it, but I want a non hacked method. (The hack is to use nm to get functions and their prototypes then I can dynamically call the function). Also, I know you can do this with dynamic linking, however, I want to statically link the files. Any ideas?

Put a table of all functions into each translation unit:
struct functions MOD1FUNCS[]={
{"FOO", foo},
{"BAR", bar},
{0, 0}
};
Then put a table into the main program listing all these tables:
struct functions* ALLFUNCS[]={
MOD1FUNCS,
MOD2FUNCS,
0
};
Then, at run time, search through the tables, and lookup the corresponding function pointer.

This is somewhat common in writing test code. e.g., you want to call all functions that start with test_. So you have a shell script that grep's through all your .C files and pulls out the function names that match test_.*. Then that script generates a test.c file that contains a function that calls all the test functions.
e.g., generated program would look like:
int main() {
initTestCode();
testA();
testB();
testC();
}
Another way to do it would be to use some linker tricks. This is what the Linux kernel does for its initialization. Functions that are init code are marked with the qualifier __init. This is defined in linux/init.h as follows:
#define __init __section(.init.text) __cold notrace
This causes the linker to put that function in the section .init.text. The kernel will reclaim memory from that section after the system boots.
For calling the functions, each module will declare an initcall function with some other macros core_initcall(func), arch_initcall(func), et cetera (also defined in linux/init.h). These macros put a pointer to the function into a linker section called .initcall.
At boot-time, the kernel will "walk" through the .initcall section calling all of the pointers there. The code that walks through looks like this:
extern initcall_t __initcall_start[], __initcall_end[], __early_initcall_end[];
static void __init do_initcalls(void)
{
initcall_t *fn;
for (fn = __early_initcall_end; fn < __initcall_end; fn++)
do_one_initcall(*fn);
/* Make sure there is no pending stuff from the initcall sequence */
flush_scheduled_work();
}
The symbols __initcall_start, __initcall_end, etc. get defined in the linker script.
In general, the Linux kernel does some of the cleverest tricks with the GCC pre-processor, compiler and linker that are possible. It's always been a great reference for C tricks.

You really need static linking and, at the same time, to select all matching functions at runtime, right? Because the latter is a typical case for dynamic linking, i'd say.
You obviusly need some mechanism to register the available functions. Dynamic linking would provide just this.

I really don't think you can do it. C isn't exactly capable of late-binding or the sort of introspection you seem to be requiring.
Although I don't really understand your question. Do you want the features of dynamically linked libraries while statically linking? Because that doesn't make sense to me... to static link, you need to already have the binary in hand, which would make dynamic loading of functions a waste of time, even if you could easily do it.