Difference between Re-entrant and Thread-Safe function - c

What is the difference between a re-entrant function and a thread safe function?

Re-entrant means no global state (local only).
Thread safe means it is not possible for 2 (or more) threads to conflict with each other (by writing conflicting values).

A thread-safe function can be called simultaneously from multiple
threads, even when the invocations use shared data, because all
references to the shared data are serialized.
A reentrant function can
also be called simultaneously from multiple threads, but only if each
invocation uses its own data.
Hence, a thread-safe function is always reentrant, but a reentrant
function is not always thread-safe.
The difference is easiest to grasp with an example:
A class is said to be reentrant if its member functions can be called
safely from multiple threads, as long as each thread uses a different
instance of the class. The class is thread-safe if its member
functions can be called safely from multiple threads, even if all the
threads use the same instance of the class.
Source: Qt
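A minimal C sketch of the Qt wording above, translated from classes to plain functions. The names counter_t, counter_next, shared_next and run_shared_demo are made up for illustration. counter_next is only safe if every thread brings its own counter_t, while shared_next may be hammered by several threads on the same data because a mutex serializes the access.

```c
#include <pthread.h>

/* "Reentrant" in the Qt sense: all state lives in the caller's object,
 * so concurrent calls are safe only if each thread uses its own counter_t. */
typedef struct { long value; } counter_t;

long counter_next(counter_t *c) {
    return ++c->value;
}

/* "Thread-safe" in the Qt sense: every call touches the same shared
 * counter, and a mutex serializes all references to it. */
static long shared_value = 0;
static pthread_mutex_t shared_lock = PTHREAD_MUTEX_INITIALIZER;

long shared_next(void) {
    pthread_mutex_lock(&shared_lock);
    long v = ++shared_value;
    pthread_mutex_unlock(&shared_lock);
    return v;
}

/* Hypothetical worker: bumps the shared counter n times. */
static void *bump_shared(void *arg) {
    long n = *(long *)arg;
    for (long i = 0; i < n; i++)
        shared_next();
    return NULL;
}

/* Runs two threads against the shared counter and returns its final value. */
long run_shared_demo(long n) {
    pthread_t a, b;
    pthread_create(&a, NULL, bump_shared, &n);
    pthread_create(&b, NULL, bump_shared, &n);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    pthread_mutex_lock(&shared_lock);
    long v = shared_value;
    pthread_mutex_unlock(&shared_lock);
    return v;
}
```

Calling counter_next on one counter_t from two threads at once would be a data race; that is exactly the "as long as each thread uses a different instance" condition.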

Did you check the wiki article on the subject? It explains it well, so please see that for a full discussion.
A few relevant bits from the article:
In computing, a computer program or subroutine is called reentrant if it can be interrupted in the middle of its execution, and then be safely called again ("re-entered") before its previous invocations complete execution. The interruption could be caused by an internal action such as a jump or call, or by an external action such as a hardware interrupt or signal. Once the reentered invocation completes, the previous invocations will resume correct execution.
and
This definition of reentrancy differs from that of thread-safety in multi-threaded environments. A reentrant subroutine can achieve thread-safety, but being reentrant alone might not be sufficient to be thread-safe in all situations. Conversely, thread-safe code does not necessarily have to be reentrant (see below for examples).
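To make the "re-entered before its previous invocation completes" part concrete, here is a small sketch using strtok_r from POSIX, which keeps its position in a caller-supplied pointer instead of hidden static state. The function names count_fields and count_all_fields are invented for this example. The inner tokenization starts while the outer one is still in progress, which plain strtok cannot handle because it has only one hidden position.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* Counts the comma-separated fields of one record.
 * strtok_r keeps its position in the caller-supplied `save` pointer. */
static int count_fields(char *record) {
    int n = 0;
    char *save = NULL;
    for (char *f = strtok_r(record, ",", &save); f != NULL;
         f = strtok_r(NULL, ",", &save))
        n++;
    return n;
}

/* Splits `text` into ';'-separated records and, while that outer
 * tokenization is still in progress, tokenizes each record again. */
int count_all_fields(char *text) {
    int total = 0;
    char *save = NULL;
    for (char *rec = strtok_r(text, ";", &save); rec != NULL;
         rec = strtok_r(NULL, ";", &save))
        total += count_fields(rec);
    return total;
}
```

Note that this only shows nested use within one thread; it says nothing about being interrupted by a signal handler, which is a stricter requirement.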

Related

Are we allowed to call pthread_key_create and pthread_key_delete from different threads? Should lmdb make this clear?

Details about the issue leading to the question
We're facing a SIGSEGV error under extremely rare, not easily reproducible circumstances when using the lmdb database library. The best we got out of it is a stack trace from the core dump that looks like this:
#0 in mdb_env_reader_dest (ptr=...) at .../mdb.c: 4935
#1 in __nptl_deallocate_tsd () at pthread_create.c:301
...
The line the stack-trace is pointing to is this (it's not the same because we attempted some changes to fix this).
Having tracked the issue for a while, the only explanation I have is that, when closing the lmdb environment from a thread other than the main thread, there's some kind of race between this line and this line, where the latter deletes the key, calls the custom destructor mdb_env_reader_dest, which causes SIGSEGV according to the documentation of lmdb, when attempting to use resources after freeing them.
The question
The documentation of pthread_key_create and pthread_key_delete is ambiguous to me, personally, as to whether it is talking about the calls to pthread_key_create and pthread_key_delete or the data underneath the pointers. This is what the docs say:
The associated destructor functions used to free thread-specific data at thread exit time are only guaranteed to work correctly when called in the thread that allocated the thread-specific data.
So the question is, can we call mdb_env_open and mdb_env_close from two different threads, leading to pthread_key_create and pthread_key_delete from different threads, and expect correct behavior?
I couldn't find such a requirement in the lmdb documentation. The closest I could find is this, which doesn't seem to reference the mdb_env_open function.
Are we allowed to call pthread_key_create and pthread_key_delete from different threads?
Yes.
However, a key must be created via pthread_key_create before it can be used by any thread, and that is not, itself, inherently thread safe. The key creation is often synchronized by performing it before starting any (other) threads that will use the key, but there are other alternatives.
Similarly, a key must not be deleted before all threads are finished with it, and the deletion is not, itself, inherently thread safe. TSD keys often are not deleted at all, and when they are deleted, that is often synchronized by first joining all (other) threads that may use the key. But again, there are other alternatives.
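One of the alternatives for key creation is pthread_once. The sketch below is only an illustration (tsd_make_key, worker and run_tsd_demo are made-up names): whichever worker thread arrives first creates the key, every worker stores its own value, and the main thread deletes the key only after joining all workers. So creation and deletion happen in different threads, and the destructor still runs once per worker at thread exit.

```c
#include <pthread.h>
#include <stdlib.h>

static pthread_key_t tsd_key;
static pthread_once_t tsd_once = PTHREAD_ONCE_INIT;
static pthread_mutex_t count_lock = PTHREAD_MUTEX_INITIALIZER;
static int destructor_calls = 0;

/* Runs in each exiting thread whose value for tsd_key is non-NULL. */
static void tsd_destroy(void *ptr) {
    free(ptr);
    pthread_mutex_lock(&count_lock);
    destructor_calls++;
    pthread_mutex_unlock(&count_lock);
}

static void tsd_make_key(void) {
    pthread_key_create(&tsd_key, tsd_destroy);
}

static void *worker(void *arg) {
    (void)arg;
    /* Whichever thread gets here first creates the key, exactly once. */
    pthread_once(&tsd_once, tsd_make_key);
    int *slot = malloc(sizeof *slot);
    if (slot != NULL) {
        *slot = 42;
        pthread_setspecific(tsd_key, slot);
    }
    return NULL;
}

/* Starts n workers (n >= 1), joins them all, then deletes the key.
 * Returns how many times the destructor ran. */
int run_tsd_demo(int n) {
    pthread_t *threads = malloc((size_t)n * sizeof *threads);
    if (threads == NULL)
        return -1;
    for (int i = 0; i < n; i++)
        pthread_create(&threads[i], NULL, worker, NULL);
    for (int i = 0; i < n; i++)
        pthread_join(threads[i], NULL);
    free(threads);
    /* Every user of the key has exited, so deleting it is safe now. */
    pthread_key_delete(tsd_key);
    return destructor_calls;
}
```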
The documentation of pthread_key_create and pthread_key_delete is
ambiguous to me, personally, as to whether it is talking
about the calls to pthread_key_create and pthread_key_delete or the
data underneath the pointers. This is what the docs say:
The associated destructor functions used to free thread-specific data
at thread exit time are only guaranteed to work correctly when called
in the thread that allocated the thread-specific data.
The destructor functions those docs are talking about are the ones passed as the second argument to pthread_key_create().
And note well that that text is drawn from the Rationale section of the docs, not the Description section. It is talking about why the TSD destructor functions are not called by pthread_key_delete(), not trying to explain what the function does. That particular point is that TSD destructor functions must run in each thread carrying non-NULL TSD, as opposed to in the thread that calls pthread_key_delete().
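A quick way to see that point for yourself: set a value, delete the key, and count destructor calls. This is just a sketch with invented names (count_and_free, destructor_calls_on_delete). POSIX states that pthread_key_delete invokes no destructor functions, so the caller remains responsible for releasing the value.

```c
#include <pthread.h>
#include <stdlib.h>

static int dtor_calls = 0;

static void count_and_free(void *ptr) {
    dtor_calls++;
    free(ptr);
}

/* Sets a value for a fresh key in the calling thread, deletes the key,
 * and reports how many destructor calls pthread_key_delete triggered. */
int destructor_calls_on_delete(void) {
    pthread_key_t key;
    if (pthread_key_create(&key, count_and_free) != 0)
        return -1;
    void *value = malloc(16);
    if (value == NULL) {
        pthread_key_delete(key);
        return -1;
    }
    pthread_setspecific(key, value);
    int before = dtor_calls;
    pthread_key_delete(key);
    /* The value is still ours to release: deleting the key ran no destructor. */
    free(value);
    return dtor_calls - before;
}
```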
So the question is, can we call mdb_env_open and mdb_env_close from
two different threads, leading to pthread_key_create and
pthread_key_delete from different threads, and expect correct
behavior?
The library's use of thread-specific data does not imply otherwise. However, you seem to be suggesting that there is a race between two different lines in the same function, mdb_env_close0, which can be the case only if that function is called in parallel by two different threads. The MDB docs say of mdb_env_close() that "Only a single thread may call this function." I would guess that they mean that to be scoped to a particular MDB environment. In any case, if you really have the race you think you have, then it seems to me that your program must be calling mdb_env_close() from multiple threads, contrary to specification.
So, as far as I know or can tell, the thread that calls mdb_env_close() does not need to be the same one that called mdb_env_open() (or mdb_env_create()), but it does need to be the only one that calls it.

What kind of functions cannot be used in the critical section of the spin lock?

I'm kind of confused about what sort of functions are not allowed in a spin lock's critical section.
In particular I'm confused about reentrant functions. I thought that all reentrant functions are unsafe to use in a spin lock's critical section, but it appears that functions like kfree and memcpy are ok to use.
So how do we know what functions are ok or not ok? I generally think anything that might block is unsafe, but don't all reentrant functions have the capacity/potential to block?
Also what is the role and relationship of the interrupt handler to spin locks?
You have reentrant backward: Re-entrant functions are the functions that are safe to use while a spinlock is held.
Many non-reentrant functions can be ok as well.
The primary concern is calling a function that directly or indirectly tries to acquire the same lock you're currently holding which leads to deadlock.
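Here is a user-space illustration of that concern. It uses POSIX pthread_spinlock_t rather than the kernel's spinlock API, and the helper names are hypothetical. The helper uses trylock so the demo reports the conflict instead of spinning forever, which is what a plain lock call would do here.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

static pthread_spinlock_t lock;

/* Hypothetical helper that needs the same lock as its caller.
 * With pthread_spin_lock() instead of trylock it would never return. */
static int helper_needs_lock(void) {
    int rc = pthread_spin_trylock(&lock);
    if (rc == 0)
        pthread_spin_unlock(&lock);
    return rc;
}

/* Calls the helper from inside the critical section and returns its result. */
int call_helper_while_locked(void) {
    pthread_spin_init(&lock, PTHREAD_PROCESS_PRIVATE);
    pthread_spin_lock(&lock);
    int rc = helper_needs_lock();   /* the lock is already held by us */
    pthread_spin_unlock(&lock);
    pthread_spin_destroy(&lock);
    return rc;
}
```

The helper gets EBUSY back, which is the polite version of a self-deadlock.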
Specific to kmalloc, though you did not specify this in your question:
This is related to why you've heard kmalloc may not be safe to use within a spinlock: with the default GFP_KERNEL flag the allocation may sleep, and code holding a spinlock must not sleep. Use the flag GFP_ATOMIC to tell kmalloc not to sleep:
GFP_ATOMIC The allocation is high priority and must not sleep. This
is the flag to use in interrupt handlers, in bottom halves, while
holding a spinlock, and in other situations where you cannot sleep.
http://books.gigatux.nl/mirror/kerneldevelopment/0672327201/ch11lev1sec4.html

Need to understand one usage of pthread_mutex_lock() and pthread_cond_wait() and pthread_cond_signal()

I need to understand one usage of pthread_mutex_lock() and pthread_cond_wait() and pthread_cond_signal().
I have seen a piece of code where a function, say CallANumber(), is invoked from main(). Inside CallANumber(), pthread_mutex_lock() is used along with pthread_cond_wait(), and the mutex is then released with pthread_mutex_unlock(). There is another function, say WaitForResponse(), in which pthread_mutex_lock() is called along with pthread_cond_signal(), followed by pthread_mutex_unlock().
But I have not found any pthread_create() call inside the source base.
Is it possible to call the pthread_mutex_lock/unlock() and pthread_cond_wait/signal() APIs without pthread_create() ever being called?
There are two reasons for using these functions in programs which are not multi-threaded:
The functions are called from generic code, perhaps in a library, and this library needs to perform synchronization in case the process is multi-threaded (which the library authors do not know). Without synchronization, the library might not work as expected in a multi-threaded program.
The synchronization happens across processes instead of threads, using process-shared mutexes and process-shared condition variables.

What happens if lua is interrupted by a signal?

Lua docs say:
The Lua library defines no global variables at all. It keeps all its state in the dynamic structure lua_State and a pointer to this
structure is passed as an argument to all functions inside Lua. This
implementation makes Lua reentrant and ready to be used in
multithreaded code.
But is it true? Has anyone tried to verify this? What if a signal is raised and caught by a custom handler during the execution of lua? That is, is lua itself (never mind the system calls it makes) truly reentrant?
EDIT:
A well-known problem in lua is the lack of a timer implementation. These can be implemented using POSIX timers, that raise a signal. But raising such a signal may interrupt the execution of lua itself. The canonical solution to solve this problem is the masking/unmasking of a signal, but if lua were truly re-entrant this would not be needed.
AFAICT, re-entrancy is a single-threaded concept, and somewhat independent from multi-threading. Multi-thread safety relates to data coherence when shared data is read and written concurrently, whereas re-entrancy relates to the coherence of a function's state before and after a signal, within one thread.
A function is either multi-thread safe, or it is not. There is no in-between. However, it is not so simple with regards to re-entrancy: there are conditions under which a function is re-entrant, and conditions under which it is not; for some functions, there are no conditions under which it is re-entrant. I'm not a computer scientist but my guess is that there are very few functions, if any, that would be re-entrant under all conditions. Like void f() {} would be one, but it's not very useful :)
The following are probably true:
1. A required condition for a function to be re-entrant is that it must not use any static or global data or data that can be set from outside itself (such as registers or DMA).
2. Another required condition for re-entrancy is that the function only call re-entrant functions. In this case the function is re-entrant with the sum of all conditions required for the called functions to be considered re-entrant. So if A calls B and C, and B is re-entrant if condition b is true, and C is re-entrant if condition c is true, then a necessary condition for A to be re-entrant is conditions b and c must be true.
3. A function that accepts at least one argument is only re-entrant if 1 and 2 are true and the signal handler does not call, directly or indirectly, the function with the same argument.
4. An API is re-entrant in the same manner as the totality of its functions. This means that there may be only a subset of the API that can be said to be re-entrant, under certain specific conditions (1-3), and other functions are not re-entrant. This does not mean the API is not re-entrant; just that a subset of it is re-entrant, under certain conditions.
If the above is correct, then you have to be more specific when asking (or stating) whether Lua is re-entrant, to ask which subset of Lua functions are known to be re-entrant, under what conditions. Apparently all Lua functions satisfy 1, but which ones satisfy 2? Almost all Lua API functions accept at least one argument, so under the condition that your signal handler does not call directly or indirectly the same Lua function with the same Lua state variable, you could say that Lua is re-entrant for those functions that don't call non-reentrant functions.
Update 1: why condition 3:
Consider
struct Foo { bool bar; };
void isr();
void f(Foo& foo) {
    if (foo.bar) {
        // do stuff
    }
    // signal happens here, calling isr()
    // modify foo
}
Foo* isrFoo;
void g() {
    Foo foo{};
    isrFoo = &foo;
    f(foo);
}
void isr() {
    f(*isrFoo);
}
Although f(Foo&) does not use globals or statics (although strictly speaking it doesn't know whether foo is a reference to such a variable), the object it receives can be shared with other code and hence can be modified in isr(), such that when f() resumes, foo is no longer the same as when it was interrupted. One could say that f() is re-entrant (in a single thread), but here isr() is interfering, making f() non-re-entrant in that particular case. Assuming that an object copy operation could be made atomic, f() could be made re-entrant even for this particular design of isr() if foo were copied into a local variable of f before being used, or if isr() made a local copy, or if foo were passed by value.
Update 2: russian roulette
Russian roulette is a game of chance. So no, re-entrancy is not a game of chance: given the above, the manual says basically that if your signal handler does not call (directly or indirectly) Lua C API functions, then you can consider the Lua C API functions re-entrant because of the way the API was designed and implemented.
For example if you have a timer that ticks (signals) every 100 ms, but the handler just sets a flag to true for "do something ASAP", and your code loops endlessly, calling a Lua function (via lua_pcall) at every iteration, to check the flag, you shouldn't have any problems: if the Lua function is interrupted by the timer before the flag is checked, the flag will get set, then upon return from signal the flag will be seen as true and your function will take action as designed.
However, if you are not careful, your Lua function (not the C API that calls it) may not be re-entrant, thus causing lua_pcall to not be re-entrant when calling your Lua function. For example if your Lua function (called via lua_pcall) checks the flag in two places:
function checkTimerFlagSet()
    if flag then --[[ ... ]] end
    -- ... do stuff ...
    if flag then --[[ ... ]] end
end
and the timer signal occurs between the two checks, then the flag could be seen as false before the signal and true after it, during the same function call, which could lead to inconsistent behavior of your Lua function. But this is merely rule #1 not being followed by your function (it has no choice, since your signal handler can only set a global variable), not by the Lua C API: this "bad" (i.e. non-reentrant) design of your Lua function is what caused one of the Lua C API functions (lua_pcall) to no longer be re-entrant. It is re-entrant otherwise.
It is true that lua keeps all its variables in lua_State. If a signal occurs, that signal will be handled in C. You cannot call lua safely from your signal handler, just as you can't call even some thread safe functions from a signal handler.
What the documentation is saying is that if you have different threads with different lua_State variables, they can each safely run lua without the need to synchronise between them.

Calling a standard library function in signal handler

Why is calling a standard library function inside a signal handler discouraged?
This is explained in the GNU LibC documentation.
If you call a function in the handler, make sure it is reentrant with respect to signals, or else make sure that the signal cannot interrupt a call to a related function.
And just in case, here's the Wikipedia page on reentrant functions.
A computer program or routine is described as reentrant if it can be safely called again before its previous invocation has been completed (i.e. it can be safely executed concurrently).
It's not only re-entrancy issues; depending on the signal being serviced, you also want to avoid inadvertent calls to malloc() (e.g. via asprintf()) and other variadic expansion (e.g. printf()).
It is all running fine and stuff, until you run into some mysterious bugs which are totally untraceable :)
man 7 signal (man 7 signal-safety on newer Linux systems) will give you a list of functions which are safe to call from a signal handler. The list is specified in POSIX as well.
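As an illustration of sticking to that list: write() is on it, printf() is not. The sketch below uses made-up names and a pipe so the result can be checked. The handler reports the signal with write() only, and saves and restores errno so the interrupted code does not see it change.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t notify_fd = -1;   /* write end of a pipe */

/* Uses only write(), which is async-signal-safe, instead of printf(). */
static void on_signal(int signo) {
    (void)signo;
    static const char msg[] = "got signal\n";
    int saved_errno = errno;
    if (notify_fd >= 0) {
        ssize_t r = write(notify_fd, msg, sizeof msg - 1);
        (void)r;
    }
    errno = saved_errno;
}

/* Installs the handler, raises SIGUSR1 once, and returns the number of
 * bytes the handler wrote into the pipe (or -1 on error). */
long run_safe_handler_demo(void) {
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    notify_fd = fds[1];

    struct sigaction sa;
    sa.sa_handler = on_signal;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0) {
        notify_fd = -1;
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    raise(SIGUSR1);

    char buf[64];
    long n = (long)read(fds[0], buf, sizeof buf);
    notify_fd = -1;
    close(fds[0]);
    close(fds[1]);
    return n;
}
```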
Because the library function may not be reentrant.

Resources