Lua docs say:
The Lua library defines no global variables at all. It keeps all its state in the dynamic structure lua_State and a pointer to this
structure is passed as an argument to all functions inside Lua. This
implementation makes Lua reentrant and ready to be used in
multithreaded code.
But is it true? Has anyone tried to verify this? What if a signal is raised and caught by a custom handler while Lua is executing? That is, is Lua itself (never mind the system calls it makes) truly reentrant?
EDIT:
A well-known gap in Lua is the lack of a timer implementation. Timers can be implemented using POSIX timers, which raise a signal. But such a signal may interrupt the execution of Lua itself. The canonical solution is to mask and unmask the signal, but if Lua were truly re-entrant this would not be needed.
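The masking approach mentioned above might look roughly like the following sketch. It assumes a single-threaded program (a threaded one would use pthread_sigmask instead) and a timer that delivers SIGALRM. The names run_with_sigalrm_blocked and bump are hypothetical, not part of Lua or POSIX; in a real program the critical function would be the code that calls into Lua.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Hypothetical helper: run `critical(arg)` with SIGALRM blocked, then
 * restore the previous signal mask. A SIGALRM that arrives in between
 * stays pending and is delivered once the mask is restored. */
int run_with_sigalrm_blocked(void (*critical)(void *), void *arg)
{
    sigset_t block, previous;

    sigemptyset(&block);
    sigaddset(&block, SIGALRM);
    if (sigprocmask(SIG_BLOCK, &block, &previous) != 0)
        return -1;

    critical(arg);                 /* cannot be interrupted by SIGALRM */

    return sigprocmask(SIG_SETMASK, &previous, NULL);
}

/* Hypothetical stand-in for the work that must not be interrupted. */
static void bump(void *arg)
{
    ++*(int *)arg;
}
```

A call such as run_with_sigalrm_blocked(bump, &counter) runs the function once and leaves the signal mask as it found it.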
AFAICT, re-entrancy is a single-threaded concept, somewhat independent of multi-threading. Multi-thread safety relates to data coherence when shared data is read and written concurrently, whereas re-entrancy relates to the coherence of a function's state before and after a signal, within one thread.
A function is either multi-thread safe, or it is not. There is no in-between. However, it is not so simple with regards to re-entrancy: there are conditions under which a function is re-entrant, and conditions under which it is not; for some functions, there are no conditions under which it is re-entrant. I'm not a computer scientist but my guess is that there are very few functions, if any, that would be re-entrant under all conditions. Like void f() {} would be one, but it's not very useful :)
The following are probably true:
1. A required condition for a function to be re-entrant is that it must not use any static or global data or data that can be set from outside itself (such as registers or DMA).
2. Another required condition for re-entrancy is that the function only call re-entrant functions. In this case the function is re-entrant with the sum of all conditions required for the called functions to be considered re-entrant. So if A calls B and C, and B is re-entrant if condition b is true, and C is re-entrant if condition c is true, then a necessary condition for A to be re-entrant is that conditions b and c are both true.
3. A function that accepts at least one argument is only re-entrant if 1 and 2 are true and the signal handler does not call, directly or indirectly, the function with the same argument.
An API is re-entrant in the same manner as the totality of its functions. This means that there may be only a subset of the API that can be said to be re-entrant, under certain specific conditions (1-3), and other functions are not re-entrant. This does not mean the API is not re-entrant; just that a subset of it is re-entrant, under certain conditions.
If the above is correct, then you have to be more specific when asking (or stating) whether Lua is re-entrant, to ask which subset of Lua functions are known to be re-entrant, under what conditions. Apparently all Lua functions satisfy 1, but which ones satisfy 2? Almost all Lua API functions accept at least one argument, so under the condition that your signal handler does not call directly or indirectly the same Lua function with the same Lua state variable, you could say that Lua is re-entrant for those functions that don't call non-reentrant functions.
Update 1: why condition 3:
Consider:
void f(const Foo& foo) {
    if (foo.bar) {
        // do stuff
    }
    // signal happens here, calling isr()
    // modify foo
}
Foo* isrFoo;
void g() {
    Foo foo;
    isrFoo = &foo;
    f(foo);
}
void isr() {
    f(*isrFoo);
}
Although f(const Foo&) does not use globals or statics (strictly speaking, it cannot know whether foo refers to such a variable), the object it receives can be shared with other code and hence can be modified in isr(), so that when f() resumes, foo is no longer the same as when f() was interrupted. One could say that f() is re-entrant (in a single thread) but here isr() is interfering, making f() non-re-entrant in that particular case. Assuming that an object copy operation could be made atomic, f() could be made re-entrant even for this particular design of isr() if foo were copied into a local variable of f before being used, or if isr() made a local copy, or if foo were passed by value.
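The effect of sharing versus copying can be imitated in plain C. This is only a simulation: an ordinary function call stands in for the point where a signal would arrive, and every name in it (struct foo, shared, fake_isr, read_twice_by_ref, read_twice_by_value) is made up for illustration. As noted above, a real handler would additionally require the copy itself to be atomic.

```c
struct foo {
    int bar;
};

/* Object shared between the "main" code and the simulated handler. */
static struct foo shared = { 1 };

/* Stand-in for isr(): called at the point where a signal could arrive. */
static void fake_isr(void)
{
    shared.bar = 100;
}

/* Reads the object through a pointer before and after the interruption,
 * so it can observe two different values during a single call. */
int read_twice_by_ref(const struct foo *f, void (*interrupt)(void))
{
    int before = f->bar;
    interrupt();
    return before + f->bar;
}

/* Works on a private copy, so the interruption cannot change what it sees. */
int read_twice_by_value(struct foo f, void (*interrupt)(void))
{
    int before = f.bar;
    interrupt();
    return before + f.bar;
}
```

With shared.bar reset to 1 before each call, the by-reference version returns 101 (1 + 100) while the by-value version returns 2, even though the shared object was modified in both cases.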
Update 2: russian roulette
Russian roulette is a game of chance. So no, re-entrancy is not a game of chance: given the above, the manual basically says that if your signal handler does not call (directly or indirectly) Lua C API functions, then you can consider the Lua C API functions re-entrant because of the way the API was designed and implemented.
For example if you have a timer that ticks (signals) every 100 ms, but the handler just sets a flag to true for "do something ASAP", and your code loops endlessly, calling a Lua function (via lua_pcall) at every iteration, to check the flag, you shouldn't have any problems: if the Lua function is interrupted by the timer before the flag is checked, the flag will get set, then upon return from signal the flag will be seen as true and your function will take action as designed.
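A minimal C sketch of the flag-setting design described above, with the Lua call left out. It assumes the timer delivers SIGALRM; timer_fired, on_timer, install_timer_handler and consume_timer_tick are hypothetical names. The handler does nothing but set a volatile sig_atomic_t flag, and the main loop polls it.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <string.h>

/* The only state shared between the handler and the main loop. */
static volatile sig_atomic_t timer_fired = 0;

/* Handler: set the flag and return, nothing else. */
static void on_timer(int signo)
{
    (void)signo;
    timer_fired = 1;
}

/* Install the handler for SIGALRM; returns 0 on success, -1 on error. */
int install_timer_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_timer;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGALRM, &sa, NULL);
}

/* Called from the main loop: returns 1 and clears the flag if at least one
 * tick arrived since the last call, 0 otherwise. Ticks that arrive close
 * together are coalesced into one. */
int consume_timer_tick(void)
{
    if (!timer_fired)
        return 0;
    timer_fired = 0;
    return 1;
}
```

In the design from the answer, the loop would call consume_timer_tick() (or hand the flag to the Lua function invoked through lua_pcall) once per iteration and act on the result in normal program flow.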
However, if you are not careful, your Lua function (not the C API that calls it) may not be re-entrant, thus causing lua_pcall to not be re-entrant when calling your Lua function. For example if your Lua function (called via lua_pcall) checks the flag in two places:
function checkTimerFlagSet()
    if flag then ... end
    ... do stuff ...
    if flag then ... end
end
and the timer signal occurs between the two checks, then the flag could be seen as false before the signal and true after it, during the same function call, which could lead to inconsistent behavior of your Lua function. But this is merely condition 1 not being followed by your function (it has no choice, since your signal handler can only set a global variable), not by the Lua C API: this "bad" (i.e. non-reentrant) design of your Lua function is what caused one of the Lua C API functions (lua_pcall) to no longer be re-entrant. It is re-entrant otherwise.
It is true that Lua keeps all its variables in lua_State. If a signal occurs, that signal will be handled in C. You cannot call Lua safely from your signal handler, just as you can't call even some thread-safe functions from a signal handler.
What the documentation is saying is that if you have different threads with different lua_State variables, they can each safely run Lua without the need to synchronise between them.
Related
Taken from: https://www.gnu.org/software/libc/manual/html_node/Nonreentrancy.html
For example, suppose that the signal handler uses gethostbyname. This function returns its value in a static object, reusing the same object each time. If the signal happens to arrive during a call to gethostbyname, or even after one (while the program is still using the value), it will clobber the value that the program asked for.
I fail to see how the above scenario is non-reentrant. It seems to me that gethostbyname is a (read-only) getter function that merely reads from memory (as opposed to modifying memory). Why is gethostbyname non-reentrant?
As the word says, reentrancy is the ability of a function to be called again while an earlier call is still in progress, whether from another thread or from a signal handler. The scenario you propose is exactly where reentrancy is exercised. Assume the function has some static or global variable (as the gethostbyname(3) function does). While one call is writing the return buffer for the structure, the other call can overwrite it and completely destroy what the first one wrote. When the interrupted instance of the function (not the interrupting one) gets control again, all its data has been overwritten and destroyed by the interrupting one.
A common solution to this problem with interrupts is to disable them while the function is executing, so that it cannot be interrupted by a new call to itself.
If two threads call the same piece of code, and all the parameters and local variables are stored on the stack, each thread has its own copy of the data, so there's no problem in calling both at the same time, as the data they touch is on different stacks. This does not hold for static variables, whether they have local scope, compilation-unit scope or global scope (the problem arises when the same piece of code is called twice, so everything one call has access to, the other has access to as well).
Static data, like buffers (look at the stdio buffered packages), generally means the routines will not be reentrant.
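The static-buffer problem can be shown without any networking. The sketch below uses two hypothetical functions, label_static and label_r, which only imitate the calling conventions of gethostbyname and of the caller-supplied-buffer style used by *_r functions. It shows why a "getter" that returns a pointer to static storage is still a writer.

```c
#include <stddef.h>
#include <stdio.h>

/* Non-reentrant style: the result lives in one static buffer that every
 * call overwrites, as with gethostbyname's static result object. */
const char *label_static(int id)
{
    static char buf[32];

    snprintf(buf, sizeof buf, "host-%d", id);
    return buf;
}

/* Reentrant style: the caller owns the storage, so two calls in progress
 * at the same time cannot clobber each other's result. */
char *label_r(int id, char *buf, size_t len)
{
    snprintf(buf, len, "host-%d", id);
    return buf;
}
```

If the program holds the pointer returned by label_static(1) and a signal handler then calls label_static(2), the first pointer now reads "host-2": both calls return the same address. With label_r, each caller passes its own buffer and both results survive.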
I've come across at_quick_exit and quick_exit while going over stdlib.h and looking for functions that I haven't implemented.
I don't understand the point of having these two functions. Do they have any practical usage?
Basically it exists in C because of C++. The relevant document from the WG14 C standard committee can be found here.
The document was adapted from the paper accepted by the C++ standard. The idea behind quick_exit is to exit the program without canceling all threads and without executing destructors of static objects. C doesn't have language support for such things as "destructors" at all, and the thread support library in C is implemented almost nowhere. The at_quick_exit and quick_exit functions have very little to no meaning at all in C.
In C there is a function _Exit that causes normal program termination and returns control to the host environment but, as opposed to exit(), is not required to flush open streams, write out buffered data, or close open files. Basically, at_quick_exit and quick_exit are facilities built to run custom user handlers and then call _Exit, while atexit is a facility to run custom handlers when exit() is called.
They essentially have no practical usage. The intent seems to be that a function that may have significant nontrivial atexit handlers could use quick_exit to exit with just a minimal subset of such handlers (that it defines by calling at_quick_exit) being called, under conditions where calling all the atexit handlers may not be safe. It may also be called from a signal handler, but it doesn't seem like there'd be anything meaningful you could do from the at_quick_exit handlers in that case.
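To make the distinction concrete, here is a small sketch that registers one handler of each kind. It assumes a C11 library that actually provides at_quick_exit and quick_exit (glibc does; some platforms do not). The names flush_log, full_cleanup and register_exit_handlers are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Minimal handler meant to run on quick_exit(). */
static void flush_log(void)
{
    fputs("at_quick_exit handler ran\n", stderr);
}

/* Heavier handler meant to run only on a normal exit(). */
static void full_cleanup(void)
{
    fputs("atexit handler ran\n", stderr);
}

/* Register both handlers; returns 0 on success, -1 if either
 * registration fails. */
int register_exit_handlers(void)
{
    if (atexit(full_cleanup) != 0)
        return -1;
    if (at_quick_exit(flush_log) != 0)
        return -1;
    return 0;
}
```

After register_exit_handlers(), calling exit(0) or returning from main should run only full_cleanup, while calling quick_exit(0) should run only flush_log before the process terminates as if by _Exit.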
It is said that you should only call asynchronous-safe functions inside a signal handler. My question is, what constitutes asynchronous-safeness? A function which is both reentrant and thread-safe is asynchronous-safe, I guess? Or not?
Re-entrancy and thread safety have little or nothing to do with this. What matters are the side effects, the state, and the interruptibility of those functions.
asynchronous-safe function [GNU Pth]
A function is asynchronous-safe,
or asynchronous-signal safe, if it can be called safely and without
side effects from within a signal handler context. That is, it must be
able to be interrupted at any point to run linearly out of sequence
without causing an inconsistent state. It must also function properly
when global data might itself be in an inconsistent state. Some
asynchronous-safe operations are listed here:
call the signal() function to reinstall a signal handler
unconditionally modify a volatile sig_atomic_t variable (as modification to this type is atomic)
call the _Exit() function to immediately terminate program execution
invoke an asynchronous-safe function, as specified by your implementation
Few functions are portably asynchronous-safe. If a function performs any other
operations, it is probably not portably asynchronous-safe.
A rule of thumb is this - only signal some condition variable from signal handler (such as futex/pthread condition, wake up epoll loop etc.).
UPDATE:
As EmployedRussian suggested, even calling pthread_cond_signal is a bad idea. I've checked the source code of a recent eglibc, and it contains a lock/unlock pair, which introduces the possibility of a deadlock. This leaves us with few options for signalling other threads:
Using eventfd.
Changing a global atomic variable and hoping that SA_RESTART is not set and that the other threads will check the atomic.
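A portable relative of the eventfd option is the classic self-pipe trick, sketched below. Note the swap: this uses pipe() rather than eventfd, which is Linux-specific, but the idea is the same. The handler only calls write(), which POSIX lists as async-signal-safe, and the rest of the program watches the read end. All names (wake_fds, on_signal, wakeup_init, wakeup_poll) are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static int wake_fds[2] = { -1, -1 };   /* [0] = read end, [1] = write end */

/* Handler: write one byte carrying the signal number, preserving errno. */
static void on_signal(int signo)
{
    int saved_errno = errno;
    unsigned char byte = (unsigned char)signo;
    ssize_t n = write(wake_fds[1], &byte, 1);

    (void)n;                 /* a full pipe just means a wakeup is pending */
    errno = saved_errno;
}

/* Create the pipe and install the handler; returns 0 on success. */
int wakeup_init(int signo)
{
    struct sigaction sa;

    if (pipe(wake_fds) != 0)
        return -1;
    /* Non-blocking on both ends so neither side can stall. */
    if (fcntl(wake_fds[0], F_SETFL, O_NONBLOCK) == -1 ||
        fcntl(wake_fds[1], F_SETFL, O_NONBLOCK) == -1)
        return -1;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, NULL);
}

/* Returns the signal number of one queued wakeup, 0 if none is pending,
 * or -1 on error. */
int wakeup_poll(void)
{
    unsigned char byte;
    ssize_t n = read(wake_fds[0], &byte, 1);

    if (n == 1)
        return byte;
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return 0;
    return -1;
}
```

In a real event loop the read end wake_fds[0] would be added to the select/poll/epoll set instead of being polled by hand, so a signal wakes the loop like any other I/O event.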
For your own code, yes, re-entrant and thread-safe are the characteristics you need, as, depending on how you set up your signal handling mechanism, your signal handler may itself be interrupted by another signal. In general, try to do as little work as possible inside the signal handler. Setting flags to trigger special code in your normal program flow is probably all you should be doing.
For functions in the OS that you might call, check out man 7 signal for a list of what is safe to call. Note that malloc() and free() are not on the list. The pthread synchronization APIs are not on the list either, but I would think that some would have to be safe to call, so you can set a global flag safely in a signal handler.
In UNIX at least: I'm aware that C/C++ can register a number of functions to be called at the exit of main - exit handlers. The functions are called in reverse order of registration, and they are registered using:
int atexit(void (*func) (void));
I'm having trouble determining how this would be useful though. The functions are void/void and global, so they are unlikely to have access to many variables around the program unless the variables are also globals. Can someone let me know the kinds of things you would do with exit handlers?
Also, do exit handlers work the same way on non-UNIX platforms since they're part of an ANSI C specification?
You can perform cleanup for global objects in an atexit handler:
static my_type *my_object;
static void my_object_release(void) { free(my_object); }
my_type *get_my_object_instance(void)
{
    if (!my_object)
    {
        my_object = malloc(sizeof(my_type));
        ...
        atexit(my_object_release);
    }
    return my_object;
}
If you want to be able to close over some variables in an atexit-like handler, you can devise your own data structure containing cleanup function/parameter pairs, and register a single atexit handler calling all the said functions with their corresponding arguments.
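One possible shape for such a structure is a fixed-size table of function/argument pairs drained by a single atexit handler. This is a sketch under assumptions: the capacity MAX_CLEANUPS is arbitrary, the names atexit_with_arg and run_cleanups are invented, and the table is not thread-safe.

```c
#include <stdlib.h>

#define MAX_CLEANUPS 32   /* arbitrary capacity chosen for this sketch */

struct cleanup {
    void (*fn)(void *);   /* cleanup function */
    void *arg;            /* argument it is called with */
};

static struct cleanup cleanups[MAX_CLEANUPS];
static size_t cleanup_count = 0;
static int handler_registered = 0;

/* The single real atexit handler: runs the registered pairs in reverse
 * order of registration, mirroring atexit's own ordering. */
static void run_cleanups(void)
{
    while (cleanup_count > 0) {
        struct cleanup c = cleanups[--cleanup_count];
        c.fn(c.arg);
    }
}

/* Register fn(arg) to run at exit; returns 0 on success, -1 if the table
 * is full or atexit itself fails. */
int atexit_with_arg(void (*fn)(void *), void *arg)
{
    if (cleanup_count == MAX_CLEANUPS)
        return -1;
    if (!handler_registered) {
        if (atexit(run_cleanups) != 0)
            return -1;
        handler_registered = 1;
    }
    cleanups[cleanup_count].fn = fn;
    cleanups[cleanup_count].arg = arg;
    cleanup_count++;
    return 0;
}
```

Since free already has the signature void (void *), a call like atexit_with_arg(free, buffer) releases a heap buffer at exit without needing a global variable for it.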
An exit handler allows a library to do shutdown cleanup (and thus cleanup of global data structures) without the main program being aware of that need. Two examples of things I've done in an exit handler:
restoring tty flags
closing a network connection correctly so that the peer didn't have to wait for a timeout
You can probably think of other uses.
The most obvious problem that an atexit handler solves, is tidy up for global objects. This is a C feature and of course C doesn't have automatic deallocation like C++. If you have access to the implementation of main you can write your own such code, but otherwise atexit can be helpful.
Read this blog post on Start and Termination in C++:
When a program is terminating it needs to do some finishing touches
like saving data to a file that will be used in the next session. In
this light each program has a particular set of things to do depending
on the purpose of the program (when closing). Any of such things done
is done by one of the functions whose pointer would be argument to the
atexit function.
The purpose of the atexit function is to register (record in memory)
the functions for these finishing touches. When the atexit function
executes using any of the pointers to these functions as argument the
pointed function is registered. This has to be done before the C++
program reaches its termination phase.
Read more: http://www.bukisa.com/articles/356786_start-and-termination-in-c#ixzz1WdWVl4TF
What is the difference between a re-entrant function and a thread safe function?
Re-entrant means no global state (local only).
Thread safe means it is not possible for 2 (or more) threads to conflict with each other (by writing conflicting values).
A thread-safe function can be called simultaneously from multiple
threads, even when the invocations use shared data, because all
references to the shared data are serialized.
A reentrant function can
also be called simultaneously from multiple threads, but only if each
invocation uses its own data.
Hence, a thread-safe function is always reentrant, but a reentrant
function is not always thread-safe.
The difference is easier to grasp with an example:
A class is said to be reentrant if its member functions can be called
safely from multiple threads, as long as each thread uses a different
instance of the class. The class is thread-safe if its member
functions can be called safely from multiple threads, even if all the
threads use the same instance of the class.
Source: Qt
Did you check the wiki article on the subject? It explains it well, so please see it for a full discussion.
A few relevant bits from the article:
In computing, a computer program or subroutine is called reentrant if it can be interrupted in the middle of its execution, and then be safely called again ("re-entered") before its previous invocations complete execution. The interruption could be caused by an internal action such as a jump or call, or by an external action such as a hardware interrupt or signal. Once the reentered invocation completes, the previous invocations will resume correct execution.
and
This definition of reentrancy differs from that of thread-safety in multi-threaded environments. A reentrant subroutine can achieve thread-safety, but being reentrant alone might not be sufficient to be thread-safe in all situations. Conversely, thread-safe code does not necessarily have to be reentrant (see below for examples).
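As a rough illustration of the last point, the sketch below contrasts a function that is thread-safe but not reentrant with one that is reentrant but thread-safe only when each caller uses its own data. The names next_id_locked and next_id_r are hypothetical, and the mutex is a default (non-recursive) POSIX mutex.

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long shared_counter = 0;

/* Thread-safe: access to the shared counter is serialized by the mutex.
 * Not reentrant: if a signal handler calls this while the interrupted
 * code already holds the lock, the handler blocks forever. */
long next_id_locked(void)
{
    long id;

    pthread_mutex_lock(&lock);
    id = ++shared_counter;
    pthread_mutex_unlock(&lock);
    return id;
}

/* Reentrant: touches only the caller's data, no locks, no statics.
 * Thread-safe only as long as each thread passes its own counter. */
long next_id_r(long *counter)
{
    return ++*counter;
}
```

This matches the Qt wording in one direction (next_id_r is safe across threads only when every invocation uses its own data) and the Wikipedia caveat in the other (next_id_locked is thread-safe yet unsafe to re-enter from a signal handler).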