What is zalloc in embedded programming? - c

I am looking into programming the ESP8266 serial-wifi chip. In its SDK examples it makes extensive use of a function called os_zalloc where I would expect malloc.
Occasionally though, os_malloc is used as well. So they do not appear to be identical in function.
Unfortunately there is no documentation. Can anybody make an educated guess from the following header file?
#ifndef __MEM_H__
#define __MEM_H__
//void *pvPortMalloc( size_t xWantedSize );
//void vPortFree( void *pv );
//void *pvPortZalloc(size_t size);
#define os_malloc pvPortMalloc
#define os_free vPortFree
#define os_zalloc pvPortZalloc
#endif

Since os_zalloc is a macro, and the definition is given in mem.h, a better question to ask would be about what pvPortZalloc does.
Given the function names pvPortMalloc, vPortFree and pvPortZalloc, it would appear that the OS in use is FreeRTOS (or its commercially licensed equivalent, OpenRTOS), which is documented. pvPortZalloc is not specifically documented, but it would be strange if it did not simply allocate and zero-initialise; that is, for example, what it means here. The functions are part of the target porting layer for FreeRTOS and are not normally called from the application level, but I imagine the macro wrapper is used here to give application code access to the porting-layer code rather than writing it twice.
In an RTOS kernel, RTOS-aware dynamic memory allocation functions are required to ensure thread safety, although some standard library implementations include thread-safety stubs that you implement using the RTOS mutex calls, which is a better method since existing libraries and C++ new/delete can be more easily used.

I would say "allocate memory and fill with zeros"
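As a rough illustration of that guess, a zero-initialising allocator can be sketched on top of the standard allocator. The name my_zalloc is hypothetical, and this is not the SDK's actual pvPortZalloc, whose source is not shown in the question.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a "zalloc": allocate, then zero-fill.
 * Returns NULL if the underlying allocation fails. */
void *my_zalloc(size_t size)
{
    void *p = malloc(size);

    if (p != NULL) {
        memset(p, 0, size);
    }
    return p;
}
```

On a hosted system, calloc(1, size) gives the same observable result; a port layer would use its own heap instead of malloc.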

Related

Portable way to implement variadic arguments in kernel space?

I am wondering if it is possible to implement the variadic macros in C or assembly.
I would prefer to have at least va_start() be a C macro but looks like this might not be possible. I have seen other answers to different questions saying it is not possible to do in C because you have to rely on undefined behaviour.
For context I am writing a kernel and I do not want to rely on any specific C89 compiler or unix-like assembler. Building the source with any C compiler is important for the project. Keeping it simple is another goal, unfortunately supporting something like variadic arguments seems to be complex on some architectures (amd64 ABI).
I know the __builtin_va_start(v,l), __builtin_va_arg(v, l), etc. macros exist but these are only available to specific compilers?
Right now I have the kernel printf(, ...) and panic(, ...) routines written in assembly (i386 ABI) which setup the va_list (pointer to first va argument on the stack) and pass it to vprintf(, va_list) which then uses the va_arg() macro (written in C). This does not rely on any undefined or implementation defined behaviour but I would prefer that all the macros are written in C.
Summary: Just #include <stdarg.h> and use va_start and friends as you normally would. A standard-conformant C compiler will support this, even without what we normally think of as a "C library", and it is perfectly usable in a kernel that must run on the bare metal without OS support. This is also the most portable solution, and avoids needing an architecture-, compiler- or ABI-dependent solution.
Of course when writing a kernel, you are used to not using library facilities like the functions from <stdio.h>, <stdlib.h>, and even <string.h> (printf, malloc, strcpy, etc), or having to write your own. But <stdarg.h> is in a different category. Its functionality can be provided by the compiler without OS support or extensive library code, and is in some sense more a part of the compiler/language than the "library".
From the point of view of the C standard, there are two kinds of conforming implementations (see C17 section 4, "Conformance"). Application programmers mostly think about conforming hosted implementations, which must provide printf and all that. But for a kernel or embedded code or anything else that runs on the bare metal, what you want is a conforming freestanding implementation (I'll write CFI for short). This is, informally speaking, "just the compiler" without "the standard library". But there are a few standard headers whose contents a CFI must still support, and <stdarg.h> is one of them. The others are things like <limits.h>, <stddef.h> and <stdint.h> that are mainly constants, macros and typedefs.
(This same distinction has existed all the way back to C89, with the same guarantee of <stdarg.h> being available.)
If your kernel will build with any CFI, that's pretty much the gold standard of portability for a kernel. In fact, you'll be pretty hard-pressed not to use some more compiler-specific feature at some points (inline assembly is awfully useful, for instance). But <stdarg.h> doesn't have to be one of them; you're really not giving up any portability by using it. You can expect it to be supported by any usable compiler targeting any given architecture, and that includes cross compilers (which will be configured to use the correct header for the target). For instance, in the case of a GNU system, <stdarg.h> ships with the gcc compiler itself, and not with the glibc standard library.
As some further assurance, until very recently, the Linux kernel itself used <stdarg.h> in precisely this way. (About a month ago there was a commit to create their own <linux/stdarg.h> file, which just copy-pastes from an old version of gcc's <stdarg.h> and defines the macros as their gcc-specific __builtin versions. Linux only supports building with gcc anyway, so this doesn't hurt them. But my best guess is that this was done for licensing reasons - the commit message emphasizes that they copied a GPL 2 version - rather than based on anything technical.)
By contrast, writing your variadic functions in assembly will naturally tie you to that specific architecture, and they'd be one more thing to be rewritten if you ever want to port to another architecture. And trying to access variadic arguments on the stack from C, with tricks like arg = *((int *)&fixed_arg + 1), is (a) ABI-dependent, (b) only possible at all for ABIs which actually pass args on the stack, which these days isn't much besides x86-32, and (c) is undefined behavior that might be "miscompiled" by some compilers. Finally, things like __builtin_va_start are strictly compiler-dependent (gcc and clang in this case), and using <stdarg.h> is no worse because gcc's <stdarg.h> simply contains macros like #define va_start __builtin_va_start.
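To make the summary concrete, here is a minimal sketch that needs only <stdarg.h> and <stddef.h>, both of which a freestanding implementation must provide. The names ksum and ksum_v are hypothetical; the split mirrors the printf/vprintf arrangement described in the question.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical worker: consumes `count` int arguments from a va_list. */
long ksum_v(size_t count, va_list ap)
{
    long total = 0;
    size_t i;

    for (i = 0; i < count; i++) {
        total += va_arg(ap, int);
    }
    return total;
}

/* Hypothetical variadic front end, written entirely in C. */
long ksum(size_t count, ...)
{
    va_list ap;
    long total;

    va_start(ap, count);
    total = ksum_v(count, ap);
    va_end(ap);
    return total;
}
```

For example, ksum(3, 1, 2, 3) walks the three int arguments through the compiler's own va_arg, with no assembly and no assumption about where the ABI places them.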
Since you are highlighting kernel space, is it right that you want a user space function which is implemented via some sort of kernel call and is variadic?
This is a bit problematic, as a typical kernel entry point transitions the flow of control onto a kernel stack. Your va_start(), va_arg() implementations would have to be aware of how to traverse to the user's stack, and possibly map bits of a register save area into the vector.
An easier approach would be to have the user function:
int ufunc(char *fmt, ...) {
va_list v;
int n;
va_start(v, fmt);
n = __ufunc(fmt, v);
va_end(v);
return n;
}
And implement __ufunc in the kernel. Traditionally this is how the execl and execv family of functions co-operate to make handy interfaces, but only use one kernel call.
Your kernel will still have a bit of work dealing with the user stack though. For example, I could craft a va_list value for your call that caused the kernel to read out some private data. But if you are able to sort that the va_list points somewhere valid, and whatever processing you are doing with va_arg() supplied values are also valid, you would be able to use the stock compiler provided implementation.
Do note that if the user program used a different calling convention than your kernel, you could be in for a bit of work. For example, Microsoft ignored the published ABI for the amd64, so that might cause a problem.

Linux kernel NULL-pointer dereference in memset from kzalloc

Quite by chance stumbled upon some code in kernel jungles and was a bit confused. There are two implementations of kzalloc(): in tools/virtio/linux/kernel.h and the main one in linux/slab.h. Obviously, in most cases the second one is used. But sometimes the "virtio" kzalloc() is used.
"virtio" kzalloc() looks like this:
static inline void *kzalloc(size_t s, gfp_t gfp)
{
void *p = kmalloc(s, gfp);
memset(p, 0, s);
return p;
}
My confusion is that "fake" kmalloc() used inside "tools" directory can return NULL-pointer. Also it looks like the memset() implementation doesn't check NULL-pointers so there could be NULL-pointer dereference.
Is it a bug or am I missing something?
Yes, that definitely looks like a bug.
The tools/ subdirectory is a collection of user space tools (as the name suggests). You can also see this by the fact that several C standard library headers are included. So this of course is not a kernel bug (that would have been very bad), just a minor oversight in the virtio testing tool.
That virtio testing tool seems to re-define some kernel APIs to mock their behavior in userspace. That function though doesn't seem to be ever used in practice, just merely defined.
marco:~/git/linux/tools/virtio$ grep -r kzalloc
linux/kernel.h:static inline void *kzalloc(size_t s, gfp_t gfp)
ringtest/ptr_ring.c:static inline void *kzalloc(unsigned size, gfp_t flags)
marco:~/git/linux/tools/virtio$
It's probably meant to be used by someone who wishes to test some virtio kernel code in userspace.
In any case, you could try reporting the bug. The get_maintainer.pl script suggests:
$ perl scripts/get_maintainer.pl -f tools/virtio/linux/kernel.h
Bad divisor in main::vcs_assign: 0
"Michael S. Tsirkin" <mst@redhat.com> (maintainer:VIRTIO CORE AND NET DRIVERS)
Jason Wang <jasowang@redhat.com> (maintainer:VIRTIO CORE AND NET DRIVERS)
virtualization@lists.linux-foundation.org (open list:VIRTIO CORE AND NET DRIVERS)
linux-kernel@vger.kernel.org (open list)
The header is mainly used for userspace testing, such as virtio_test.
From the git-log of tools/virtio/virtio_test.c:
This is the userspace part of the tool: it includes a bunch of stubs for
linux APIs, somewhat simular to linuxsched. This makes it possible to
recompile the ring code in userspace.
A small test example is implemented combining this with vhost_test
module.
So yes, the code is a bit unsafe (clean coding would test for a NULL pointer prior to memset() and bail out with an appropriate error message), but since it is just a testing tool, it seems to have been considered uncritical to skip this test.
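A NULL-safe variant of the stub could look roughly like the sketch below. The gfp_t typedef and the malloc-backed kmalloc here are assumed userspace stand-ins written for illustration; the real stubs in tools/virtio may differ.

```c
#include <stdlib.h>
#include <string.h>

/* Assumed userspace stand-ins for the kernel type and allocator. */
typedef unsigned int gfp_t;

void *kmalloc(size_t s, gfp_t gfp)
{
    (void)gfp;              /* allocation flags mean nothing in userspace */
    return malloc(s);
}

void *kzalloc(size_t s, gfp_t gfp)
{
    void *p = kmalloc(s, gfp);

    if (p)                  /* only clear memory that was actually obtained */
        memset(p, 0, s);
    return p;
}
```

The caller still has to check the return value, exactly as with the in-kernel kzalloc().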

Thread-safe init of read-only global data

Let's imagine that I'm writing a library that has a reasonably large amount of read-only global data that needs to be initialized before the library can be used. For example, perhaps the global data are lookup tables for various parts of the application logic that won't change during the lifetime of the program.
Now I have a few ways to initialize this data:
(1) I may require that the user call some kind of init() function before the library is used.
(2) I may lazily construct the data the first time a function is called on my library.
(3) I may include the data in an initializer statement in the source, such that variables are statically initialized to their final value.
Now if my data is read-only and should be the same for every environment the library runs in, then (3) is fairly appealing. Even in that case it has some downsides: if the data is very large (but easy to generate procedurally), the size of the binary will bloat up a lot (e.g., a library with 50K of code but 8MB of lookup tables would end up around 8050K). Similarly, the source itself may be very large, or the build system needs to handle the generation of the source at compile time.
The main reason you might not be able to use (3) is that the tables might be fixed (read-only), but require generation at runtime because they embed some information about the environment (e.g., the value of an environment variable, a configuration setting read from a file, information about the machine architecture, whatever). This data can't be embedded in the source since it depends on the runtime environment.
So we have methods (1) and (2) at least - but I can't see how to make these thread-safe in a simple way. The rest of the library can be thread-safe simply by not mutating any global state - just like the vast majority of C functions can be written in a thread-safe way w/o any explicit use of threading primitives.
I can't figure out a similar alternative for this global init, however:
(1) Is undesirable because we prefer not to require the user to call this method, and in any case it simply moves the problem up to the calling code: the calling code then needs to organize to call this init() method exactly once across all threads using the library, and before any thread uses the library.
(2) Fails since concurrent calls to the library might do a double init.
In C++ you can just initialize globals with a method call, like int data[] = loadData(). Is there any equivalent in C? Or am I stuck using threading primitives (which vary by platform, e.g., pthread_once, call_once and whatever Windows has) just to get my thread-safe init?
I don't know of any platform-independent way of initializing a library in a thread-safe manner. That's not surprising since there's no platform-independent threading model in C.
So your solution is going to be platform-specific.
@ThingyWotsit mentions in the comments using C++ to initialize your library, and that will be thread-safe. But it may very well lock you into a specific C++ run-time, so it may not be a useful solution for your C shared object/library. You may not be willing or able to add a dependency on C++ and you may especially not be willing or able to be locked into a specific C++ run-time.
For GCC, you can use __attribute__((constructor)) to have your initialization function called when the shared object is loaded:
constructor
destructor
constructor (priority)
destructor (priority)
The constructor attribute causes the function to be called automatically before execution enters main ().
Similarly, the destructor attribute causes the function to be called
automatically after main () has completed or exit () has been called.
Functions with these attributes are useful for initializing data that
will be used implicitly during the execution of the program.
You may provide an optional integer priority to control the order in
which constructor and destructor functions are run. A constructor with
a smaller priority number runs before a constructor with a larger
priority number; the opposite relationship holds for destructors. So,
if you have a constructor that allocates a resource and a destructor
that deallocates the same resource, both functions typically have the
same priority. The priorities for constructor and destructor functions
are the same as those specified for namespace-scope C++ objects (see
C++ Attributes).
For example:
static __attribute__((constructor)) void my_lib_init_func( void )
{
...
}
Your code will run before main() is called.
If your library is dynamically loaded (explicit call to dlopen(), for example), your init function will be called when your library is loaded, and your library won't be considered loaded until it returns.
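A slightly fuller sketch of the constructor approach, assuming GCC or Clang: the table, the table_ready flag and my_lib_square are hypothetical names invented for the example.

```c
#include <stddef.h>

#define SQUARE_TABLE_SIZE 256

static unsigned int square_table[SQUARE_TABLE_SIZE];
int table_ready;   /* hypothetical flag, set once the constructor has run */

/* With GCC/Clang this runs before main(), or before dlopen() returns. */
static __attribute__((constructor)) void my_lib_init_func(void)
{
    size_t i;

    for (i = 0; i < SQUARE_TABLE_SIZE; i++)
        square_table[i] = (unsigned int)(i * i);
    table_ready = 1;
}

/* Library function: reads the table, never writes it. */
unsigned int my_lib_square(unsigned char x)
{
    return square_table[x];
}
```

Because the table is filled before any caller can reach my_lib_square(), the library functions themselves need no locking.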
Other compilers provide the functionally-identical #pragma init():
#pragma init(my_lib_init_func)
static void my_lib_init_func( void )
{
...
}
See #pragma init and #pragma fini using gcc compiler on linux
For Windows? The Windows C++ run-time is pretty stable and ubiquitous. I'd just use a C++ solution on Windows, especially if you're compiling with MSVC. (But see the comments...)
Option 3 is always preferable when possible. Your reasoning about the cons is wrong. If you have an 8MB constant table in the executable file, it's directly mapped and shared by all instances of the program or users of the shared library on any remotely modern operating system. If you generate it at runtime, each process will have its own copy of the table.
When option 3 is not available you must use pthread_once or equivalent or implement your own version of the same (much less efficiently) using a lock. There is little reason to use weird OS-specific replacements for it; all major platforms either support POSIX threads API natively or have existing libraries which provide it on top of the platform's low-level primitives.
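For the lazy-initialization case, a minimal pthread_once sketch might look like this. The table contents, init_calls counter and lookup_get are hypothetical, chosen only to show that the init routine runs once.

```c
#include <pthread.h>
#include <stddef.h>

#define LOOKUP_SIZE 256

static unsigned int lookup[LOOKUP_SIZE];
int init_calls;    /* hypothetical counter: how often the init routine ran */
static pthread_once_t lookup_once = PTHREAD_ONCE_INIT;

/* Runs exactly once, no matter how many threads race into lookup_get(). */
static void lookup_init(void)
{
    size_t i;

    for (i = 0; i < LOOKUP_SIZE; i++)
        lookup[i] = (unsigned int)(i * 3u + 1u);
    init_calls++;
}

unsigned int lookup_get(unsigned char idx)
{
    pthread_once(&lookup_once, lookup_init);
    return lookup[idx];
}
```

Every public entry point that touches the table would start with the same pthread_once() call; after the first call it is cheap.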

Preferred method to use two names to call the same function in C

I know there are at least three popular methods to call the same function with multiple names. I haven't actually heard of someone using the fourth method for this purpose.
1). Could use #defines:
int my_function (int);
#define my_func my_function
OR
#define my_func(a) my_function(a)
2). Embedded function calls are another possibility:
int my_func(int a) {
return my_function(a);
}
3). Use a weak alias in the linker:
int my_func(int a) __attribute__((weak, alias("my_function")));
4). Function pointers:
int (* const my_func)(int) = my_function;
The reason I need multiple names is for a mathematical library that has multiple implementations of the same method.
For example, I need an efficient method to calculate the square root of a scalar floating point number. So I could just use math.h's sqrt(). This is not very efficient. So I write one or two other methods, such as one using Newton's Method. The problem is each technique is better on certain processors (in my case microcontrollers). So I want the compilation process to choose the best method.
I think this means it would be best to use either the macros or the weak alias since those techniques could easily be grouped in a few #ifdef statements in the header files. This simplifies maintenance (relatively). It is also possible to do using the function pointers, but it would have to be in the source file with extern declarations of the general functions in the header file.
Which do you think is the better method?
Edit:
From the proposed solutions, there appear to be two important questions that I did not address.
Q. Are the users working primarily in C/C++?
A. All known development will be in C/C++ or assembly. I am designing this library for my own personal use, mostly for work on bare metal projects. There will be either no or minimal operating system features. There is a remote possibility of using this in full blown operating systems, which would require consideration of language bindings. Since this is for personal growth, it would be advantageous to learn library development on popular embedded operating systems.
Q. Are the users going to need/want an exposed library?
A. So far, yes. Since it is just me, I want to make direct modifications for each processor I use after testing. This is where the test suite would be useful. So an exposed library would help somewhat. Additionally, each "optimal implementation" for a particular function may have failing conditions. At this point, it has to be decided who fixes the problem: the user or the library designer. A user would need an exposed library to work around failing conditions. I am both the "user" and "library designer". It would almost be better to allow for both. Then non-realtime applications could let the library solve all of the stability problems as they come up, but real-time applications would be empowered to consider algorithm speed/space vs. algorithm stability.
Another alternative would be to move the functionality into a separately compiled library optimised for each different architecture and then just link to this library during compilation. This would allow the project code to remain unchanged.
Depending on the intended audience for your library, I suggest you choose between 2 alternatives:
If the consumer of your library is guaranteed to be Cish, use #define sqrt newton_sqrt for optimal readability
If some consumers of your library are not of the C variety (think bindings to Delphi, .NET, whatever) try to avoid consumer-visible #defines. This is a major PITA for bindings, as macros are not visible in the binary - embedded function calls are the most binding-friendly.
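The #define option could be sketched as follows. Both implementations and the USE_BISECT_SQRT switch are hypothetical; in a real library the switch would come from the build (for instance a -D flag per target) and the functions would live in separate source files.

```c
/* Hypothetical implementation A: Newton's method. */
double newton_sqrt(double x)
{
    double guess;
    int i;

    if (x <= 0.0)
        return 0.0;
    guess = (x > 1.0) ? x : 1.0;
    for (i = 0; i < 60; i++)
        guess = 0.5 * (guess + x / guess);
    return guess;
}

/* Hypothetical implementation B: bisection on [0, max(x, 1)]. */
double bisect_sqrt(double x)
{
    double lo = 0.0, hi, mid;
    int i;

    if (x <= 0.0)
        return 0.0;
    hi = (x > 1.0) ? x : 1.0;
    for (i = 0; i < 200; i++) {
        mid = 0.5 * (lo + hi);
        if (mid * mid > x)
            hi = mid;
        else
            lo = mid;
    }
    return 0.5 * (lo + hi);
}

/* USE_BISECT_SQRT is an assumed build-time switch, not a real flag. */
#ifdef USE_BISECT_SQRT
#define my_sqrt bisect_sqrt
#else
#define my_sqrt newton_sqrt
#endif
```

Callers write my_sqrt(x) everywhere. For non-C consumers, replacing the macro with a real wrapper function named my_sqrt gives the binary an actual symbol to bind to.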
What you can do is this. In header file (.h):
int function(void);
In the source file (.c):
static int function_implementation_a(void);
static int function_implementation_b(void);
static int function_implementation_c(void);
#if ARCH == ARCH_A
int function(void)
{
return function_implementation_a();
}
#elif ARCH == ARCH_B
int function(void)
{
return function_implementation_b();
}
#else
int function(void)
{
return function_implementation_c();
}
#endif // ARCH
Static functions called once are often inlined by the implementation. With gcc, for example, -finline-functions-called-once is enabled at -O1 and above. Static functions that are never called are also usually not included in the final binary.
Note that I don't put the #if and #else in a single function body because I find the code more readable when #if directives are outside the functions body.
Note this way works better with embedded code where libraries are usually distributed in their source form.
I usually like to solve this with a single declaration in a header file with a different source file for each architecture/processor-type. Then I just have the build system (usually GNU make) choose the right source file.
I usually split the source tree into separate directories for common code and for target-specific code. For instance, my current project has a toplevel directory Project1 and underneath it are include, common, arm, and host directories. For arm and host, the Makefile looks for source in the proper directory based on the target.
I think this makes it easier to navigate the code since I don't have to look up weak symbols or preprocessor definitions to see what functions are actually getting called. It also avoids the ugliness of function wrappers and the potential performance hit of function pointers.
You might create a test suite for all algorithms and run it on the target to determine which are the best performing, then have the test suite automatically generate the necessary linker aliases (method 3).
Beyond that, a simple #define (method 1) is probably the simplest, and will not add any overhead. It does, however, expose to the library user that there might be multiple implementations, which may be undesirable.
Personally, since only one implementation of each function is likely to be optimal on any specific target, I'd use the test suite to determine the required versions for each target and build a separate library for each target containing only that one version of each function, given the correct function name directly.

C Shared library: static variable initialization + global variable visibility among processes

I want to modify an existing shared library so that it uses different memory management routines depending on the application using the shared library.
(For now) there will be two families of memory management routines:
The standard malloc, calloc etc functions
specialized versions of malloc, calloc etc
I have come up with a potential way of solving this problem (with the help of some people here on SO). There are still a few grey areas and I would like some feedback on my proposal so far.
This is how I intend to implement the modification:
1. Replace existing calls to malloc/calloc etc with my_malloc/my_calloc etc. These new functions will invoke correctly assigned function pointers instead of calling hard coded function names.
2. Provide a mechanism for the shared library to initialize the function pointers used by my_malloc etc to point to the standard C memory mgmt routines - this allows me to provide backward compatibility to applications which depend on this shared library - so they don't have to be modified as well. In C++, I could have done this by using static variable initialization (for example) - I'm not sure if the same 'pattern' can be used in C.
3. Introduce a new idempotent initAPI(type) function, which is called (at startup) by applications that need to use different mem mgmt routines in the shared library. The initAPI() function assigns the memory mgmt func ptrs to the appropriate functions.
Clearly, it would be preferable if I could restrict who could call initAPI() or when it was called - for example, the function should NOT be called after API calls have been made to the library - as this will change the memory mgmt routines. So I would like to restrict where it is called and by whom. This is an access problem which can be solved by making the method private in C++, I am not sure how to do this in C.
The problems in 2 and 3 above can be trivially resolved in C++, however I am constrained to using C, so I would like to solve these issues in C.
Finally, assuming that the function pointers can be correctly set during initialisation as described above - I have a second question, regarding the visibility of global variables in a shared library, across different processes using the shared library. The function pointers will be implemented as global variables (I'm not too concerned about thread safety FOR NOW - although I envisage wrapping access with mutex locking at some point)* and each application using the shared library should not interfere with the memory management routines used for another application using the shared library.
I suspect that it is code (not data) that is shared between processes using a shlib - however, I would like that confirmed - preferably, with a link that backs up that assertion.
*Note: if I am naively downplaying threading issues that may occur in the future as a result of the 'architecture' I described above, someone please alert me!..
BTW, I am building the library on Linux (Ubuntu)
Since I'm not entirely sure what the question being asked is, I will try to provide information that may be of use.
Since you've indicated C and Linux, it is probably safe to assume you are also using the GNU toolchain.
GCC provides a constructor function attribute that causes a function to be called automatically before execution enters main(). You could use this to better control when your library initialization routine, initAPI() is called.
void __attribute__ ((constructor)) initAPI(void);
In the case of library initialization, constructor routines are executed before dlopen() returns if the library is loaded at runtime or before main() is started if the library is loaded at load time.
The GNU linker has a --wrap <symbol> option which allows you to provide wrappers for system functions.
If you link with --wrap malloc, references to malloc() will redirect to __wrap_malloc() (which you implement), and references to __real_malloc() will redirect to the original malloc() (so you can call it from within your wrapper implementation).
Instead of using the --wrap malloc option to provide a reference to the original malloc() you could also dynamically load a pointer to the original malloc() using dlsym(). You cannot directly call the original malloc() from the wrapper because it will be interpreted as a recursive call to the wrapper itself.
#define _GNU_SOURCE
#include <stdio.h>
#include <stdint.h>
#include <dlfcn.h>
void * malloc(size_t size) {
static void * (*func)(size_t) = NULL;
void * ret;
if (!func) {
/* get reference to original (libc provided) malloc */
func = (void *(*)(size_t)) dlsym(RTLD_NEXT, "malloc");
}
/* code to execute before calling malloc */
...
/* call original malloc */
ret = func(size);
/* code to execute after calling malloc */
...
return ret;
}
I suggest reading Jay Conrod's blog post entitled Tutorial: Function Interposition in Linux for additional information on replacing calls to functions in dynamic libraries with calls to your own wrapper functions.
-1 for the lack of concrete questions. The text is long, could have been written more succinctly, and does not contain a single question mark.
Now to address your problems:
Static data (what you call "global variables") of a shared library is per-process. Your global variables in one process will not interfere with global variables in another process. No need for mutexes.
In C, you cannot restrict[1] who can call a function. It can be called by anybody who knows its name or has a pointer to it. You can code initAPI() such that it visibly aborts the program (crashes it) if it is not the first library function called. You are the library writer, you set the rules of the game, and you have NO obligation towards coders who do not respect the rules.
[1] You can declare the function with static, meaning it can be called by name only by the code within the same translation unit; it can still be called through a pointer by anybody who manages to obtain a pointer to it. Such functions are not "exported" from libraries, so this is not applicable to your scenario.
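A minimal sketch of these two points, under stated assumptions: the pointers default to malloc/free through plain static initialization (which C allows, since a function's address is a constant expression), and initAPI() refuses to run once the library is in use. The signature taking function pointers and the names alloc_fn, free_fn and api_in_use are hypothetical; the question's initAPI(type) takes a type selector instead, and this sketch returns an error where a stricter library could call abort(). It is not thread-safe as written.

```c
#include <stdlib.h>

/* Defaults are set by static initialization, so existing applications
 * that never call initAPI() keep using the standard allocator. */
static void *(*alloc_fn)(size_t) = malloc;
static void (*free_fn)(void *) = free;
static int api_in_use;   /* set once any library call has allocated */

/* Returns 0 on success, -1 if called too late or with NULL pointers.
 * A stricter library could call abort() instead of returning -1. */
int initAPI(void *(*a)(size_t), void (*f)(void *))
{
    if (api_in_use || a == NULL || f == NULL)
        return -1;
    alloc_fn = a;
    free_fn = f;
    return 0;
}

void *my_malloc(size_t size)
{
    api_in_use = 1;          /* from now on, initAPI() is rejected */
    return alloc_fn(size);
}

void my_free(void *p)
{
    free_fn(p);
}
```

Because these are ordinary static variables, each process that loads the library gets its own copy of the pointers and the flag.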
Achieving this:
(For now) there will be two families of memory management routines:
The standard malloc, calloc etc functions
specialized versions of malloc, calloc etc
with dynamic libraries on Linux is trivial, and does not require the complicated scheme you have concocted (nor the LD_PRELOAD or dlopen suggested by @ugoren).
When you want to provide specialized versions of malloc and friends, simply link these routines into your main executable. Voila: your existing shared library will pick them up from there, no modifications required.
You could also build specialized malloc into e.g. libmymalloc.so, and put that library on the link line before libc, to achieve the same result.
The dynamic loader will use the first malloc it can see, and searches the list starting from the a.out, and proceeding to search other libraries in the same order they were listed on link command line.
UPDATE:
On further reflection, I don't think what you propose will work.
Yes, it will work (I use that functionality every day, by linking tcmalloc into my main executable).
When your shared library (the one providing an API) calls malloc "behind the scenes", which (of possibly several) malloc implementations does it get? The first one that is visible to the dynamic linker. If you link a malloc implementation into a.out, that will be the one.
It's easy enough for you to require that your initialization function is:
called from the main thread
that the client may call it exactly once
and that the client may provide the optional function pointers by parameter
If different applications run in separate processes, it's quite simple to do using dynamic libraries.
The library can simply call malloc() and free(), and applications that want to override it could load another library, with alternative implementations of these functions.
This can be done with the LD_PRELOAD environment variable.
Or, if your library is loaded with dlopen(), just load the malloc library first.
This is basically what tools such as valgrind, which replace malloc, do.
