Is hsearch_r thread-safe? - c

Can I use hcreate_r, hsearch_r and hdestroy_r in a thread-safe manner?
Do I have to wrap all calls to it with a mutex lock?

Quoting the manpage of HSEARCH(3):
The hcreate_r(), hsearch_r(), and hdestroy_r() functions are thread-safe.
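So no, you do not need to wrap the calls with any sort of locking, as long as each thread works on its own struct hsearch_data table. The glibc documentation annotates these functions as MT-Safe with a race on the table argument, so if several threads share one table, access to that table still has to be synchronized.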
In general, functions suffixed with _r tend to be re-entrant versions of the same function without the _r suffix (such as strtok_r). Their re-entrant nature (usually) makes them intrinsically thread-safe.
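To make the per-table point concrete, here is a minimal sketch assuming glibc (the _r variants are GNU extensions). The helper name private_table_roundtrip is hypothetical; it keeps the table on its own stack, so no other thread can reach it and no lock is involved.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE   /* hcreate_r, hsearch_r, hdestroy_r are GNU extensions */
#endif
#include <assert.h>
#include <search.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: builds a private table, stores one key/value pair,
 * looks it up again, and returns the stored value (or -1 on failure).
 * The table lives on this call's stack, so it is never shared. */
long private_table_roundtrip(const char *key, long value)
{
    struct hsearch_data table;
    ENTRY *found = NULL;
    long result = -1;

    assert(key != NULL);
    memset(&table, 0, sizeof table);      /* must be zeroed before hcreate_r */
    if (!hcreate_r(16, &table))
        return -1;

    ENTRY item = { .key = (char *)key, .data = (void *)(intptr_t)value };
    if (hsearch_r(item, ENTER, &found, &table)) {
        ENTRY probe = { .key = (char *)key, .data = NULL };
        if (hsearch_r(probe, FIND, &found, &table))
            result = (long)(intptr_t)found->data;
    }
    hdestroy_r(&table);
    return result;
}
```

Calling private_table_roundtrip("alpha", 42) from any number of threads at once should be fine, because each call owns its table. A table shared between threads would need a mutex around every hsearch_r call that touches it.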

Related

Is stat(2) thread safe?

Many functions of the C library are clearly marked as thread-safe or not thread-safe. For example, when I look at the manual of gmtime(3) there is a nice table that shows which of these functions are thread-safe and which aren't.
Looking at the manual page of the stat(2) function, it doesn't say one way or the other. Are functions supposed to be thread safe unless we are told otherwise?
Reading up on the POSIX Safety Concepts, I could not find a clear statement that a function not marked as unsafe is safe. Maybe I missed a sentence somewhere?
The POSIX page on Thread Safety says that all functions are thread-safe except the ones listed there. stat() is not in the list, nor are any of the variants (lstat(), fstatat(), fstat()). So it should be thread-safe.
The gmtime routines return a pointer to static storage, which means the result can be overwritten by other calls.
stat() does not return a pointer; you supply the structure for it to fill in, so another call cannot overwrite your result.
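The difference between the two calling conventions can be sketched as follows. The helper file_mtime_utc is a hypothetical name; it uses stat() with a caller-supplied struct stat and gmtime_r() with a caller-supplied struct tm, so no static buffer is involved anywhere.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for gmtime_r */
#endif
#include <assert.h>
#include <sys/stat.h>
#include <time.h>

/* Hypothetical helper: writes the file's modification time (UTC) into *out.
 * Returns 0 on success, -1 on failure. Only caller-owned storage is used. */
int file_mtime_utc(const char *path, struct tm *out)
{
    struct stat st;                       /* private to this call */

    assert(path != NULL && out != NULL);
    if (stat(path, &st) != 0)
        return -1;
    /* gmtime_r fills *out instead of returning a pointer to shared
     * static storage, which is what plain gmtime may do. */
    return gmtime_r(&st.st_mtime, out) ? 0 : -1;
}
```

Had this used plain gmtime(), two threads calling it at the same time could see each other's results through the shared static struct tm.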

Where can I find the list of non-reentrant functions provided in gnu libc?

I am now porting a single-threaded library to support multiple threads, and I need the complete list of functions that use local static or global variables.
Any information is appreciated.
Check the manual page for each function you use ... the non-thread-safe ones will be identified as such, and the manual page will mention a thread safe version when there is one (e.g., readdir_r). You could extract the list by running a script over the man pages.
Edit: Although my answer has been accepted, I fear that it is inaccurate and possibly dangerous. For example, while strerror_r mentions that it is a thread safe version of strerror, strerror itself says nothing about thread safety ... what it says instead is "the string might be overwritten", which merely implies that it isn't thread-safe. So you need to search for at least "might be overwritten" as well as "thread", but there's no guarantee that even that will be complete.
It's always a good idea to know whether a particular function is reentrant or not, but you must also consider the situation where several reentrant functions are called from a shared piece of code by multiple threads, which can also lead to problems when shared data is involved.
So, if you have any data shared between threads, the data must be "protected" regardless of the fact that the functions being called are reentrant.
Consider the following function:
void yourFunc(CommonObject *o)
{
    /* This function is NOT thread safe */
    reentrant_func1(o->propertyA);
    reentrant_func2(o->propertyA);
}
If this function is not mutex protected, you will get undesired behavior in a multithreaded application, regardless of the fact that reentrant_func1 and reentrant_func2 are reentrant.
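A mutex-protected variant might look like the sketch below. The CommonObject layout, the bodies of the two helpers, and common_object_init are assumptions made up for illustration (the original does not show them), and the helpers take a pointer here so that they visibly modify the shared field.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical stand-in for the CommonObject type used above. */
typedef struct {
    pthread_mutex_t lock;   /* protects propertyA */
    int propertyA;
} CommonObject;

/* Two reentrant helpers: they touch only the storage passed to them. */
static void reentrant_func1(int *value) { *value += 1; }
static void reentrant_func2(int *value) { *value *= 2; }

void common_object_init(CommonObject *o, int initial)
{
    assert(o != NULL);
    pthread_mutex_init(&o->lock, NULL);
    o->propertyA = initial;
}

/* The lock turns the pair of calls into one indivisible step, as seen by
 * every other thread that also goes through yourFunc(). */
void yourFunc(CommonObject *o)
{
    assert(o != NULL);
    pthread_mutex_lock(&o->lock);
    reentrant_func1(&o->propertyA);
    reentrant_func2(&o->propertyA);
    pthread_mutex_unlock(&o->lock);
}
```

The lock only helps if every access to propertyA takes it; code that reads or writes the field directly bypasses the protection.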

Is reentrancy a requirement for getpwnam_r()?

getpwnam_r() is reentrant according to a number of manpages. However, the standard only states:
The getpwnam_r() function is thread-safe and returns values in a user-supplied buffer instead of possibly using a static data area that may be overwritten by each call.
I am confused. Must an NSS module's ...getpwnam_r() function be reentrant, or is being thread-safe enough?
Well, as you note the standard requires that the function must be thread-safe. That doesn't prevent an implementation from providing a stricter guarantee.
IOW, portable software cannot assume that getpwnam_r is reentrant. But, if you care only about some specific platform which guarantees that it's reentrant, then presumably you can assume that.

Are posix regcomp and regexec threadsafe? In specific, on GNU libc?

Two separate questions here really: Can I use regexes in a multithreaded program without locking and, if so, can I use the same regex_t at the same time in multiple threads? I can't find an answer on Google or the manpages.
http://www.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html
2.9.1 Thread-Safety
All functions defined by this volume of POSIX.1-2008 shall be thread-safe, except that the following functions need not be thread-safe.
...
regexec and regcomp are not in that list, so they are required to be thread-safe.
See also: http://www.opengroup.org/onlinepubs/9699919799/functions/regcomp.html
Part of the rationale text reads:
The interface is defined so that the matched substrings rm_sp and rm_ep are in a separate regmatch_t structure instead of in regex_t. This allows a single compiled RE to be used simultaneously in several contexts; in main() and a signal handler, perhaps, or in multiple threads of lightweight processes.
Can I use regexes in a multithreaded program without locking
Different ones, yes.
can I use the same regex_t at the same time in multiple threads?
In general: If you plan on doing so, you will have to do the locking around the functions, since few data structures do the locking for you.
regexec: Since regexec however takes a const regex_t, executing regexec seems safe for concurrent execution without locking. (After all, this is POSIX.1-2001, where stupid stuff like static buffers as used in the early BSD APIs usually don't occur anymore.)
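Here is a small sketch of the usage pattern the rationale describes: one compiled regex_t that is only read, and a regmatch_t that is private to each call. The helper name find_match is hypothetical, and whether concurrent regexec calls on one regex_t are actually safe on a given libc is something to confirm for that implementation rather than take from this example.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: searches text with an already-compiled pattern.
 * The regex_t is passed as const and only read; match offsets go into
 * a regmatch_t on this call's stack. Returns 1 on match, 0 otherwise. */
int find_match(const regex_t *re, const char *text, long *start, long *end)
{
    regmatch_t m[1];

    assert(re != NULL && text != NULL && start != NULL && end != NULL);
    if (regexec(re, text, 1, m, 0) != 0)
        return 0;
    *start = (long)m[0].rm_so;
    *end = (long)m[0].rm_eo;
    return 1;
}
```

The idea would be to call regcomp() once before starting the threads, let each thread call find_match() with the same pointer, and call regfree() after all of them have finished. regcomp() and regfree() write to the regex_t, so they must not overlap with any regexec() on it.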

Is malloc thread-safe?

Is the malloc() function re-entrant?
Question: "is malloc reentrant"?
Answer: no, it is not. Here is one definition of what makes a routine reentrant.
None of the common versions of malloc allow you to re-enter it (e.g. from a signal handler). Note that a reentrant routine may not use locks, and almost all malloc versions in existence do use locks (which makes them thread-safe), or global/static variables (which makes them thread-unsafe and non-reentrant).
All the answers so far answer "is malloc thread-safe?", which is an entirely different question. To that question the answer is it depends on your runtime library, and possibly on the compiler flags you use. On any modern UNIX, you'll get a thread-safe malloc by default. On Windows, use /MT, /MTd, /MD or /MDd flags to get thread-safe runtime library.
I read somewhere that if you compile with -pthread, malloc becomes thread-safe. I'm pretty sure it's implementation-dependent though, since malloc is ANSI C and threads are not.
If we are talking gcc:
Compile and link with -pthread and malloc() will be thread-safe, on x86 and AMD64.
http://groups.google.com/group/comp.lang.c.moderated/browse_thread/thread/2431a99b9bdcef11/ea800579e40f7fa4
Another opinion, more insightful
{malloc, calloc, realloc, free, posix_memalign} of glibc-2.2+ are thread safe
http://linux.derkeiler.com/Newsgroups/comp.os.linux.development.apps/2005-07/0323.html
This is quite an old question, and I want to update it to reflect the current state of things.
Yes, currently malloc() is thread-safe.
From the GNU C Library Reference Manual of glibc-2.20 [released 2014-09-07]:
void * malloc (size_t size)
Preliminary: MT-Safe | ...
...
1.2.2.1 POSIX Safety Concepts:
... MT-Safe or Thread-Safe functions are safe to call in the presence
of other threads. MT, in MT-Safe, stands for Multi Thread.
Being MT-Safe does not imply a function is atomic, nor that it uses
any of the memory synchronization mechanisms POSIX exposes to users.
It is even possible that calling MT-Safe functions in sequence does
not yield an MT-Safe combination. For example, having a thread call
two MT-Safe functions one right after the other does not guarantee
behavior equivalent to atomic execution of a combination of both
functions, since concurrent calls in other threads may interfere in a
destructive way.
Whole-program optimizations that could inline functions across library
interfaces may expose unsafe reordering, and so performing inlining
across the GNU C Library interface is not recommended. The documented
MT-Safety status is not guaranteed under whole-program optimization.
However, functions defined in user-visible headers are designed to be
safe for inlining.
Yes, under POSIX.1-2008 malloc is thread-safe.
2.9.1 Thread-Safety
All functions defined by this volume of POSIX.1-2008 shall be thread-safe, except that the following functions need not be thread-safe.
[ a list of functions that does not contain malloc ]
Here is an excerpt from malloc.c of glibc :
Thread-safety: thread-safe unless NO_THREADS is defined
Assuming NO_THREADS is not defined by default, malloc is thread-safe, at least on Linux.
If you are working with GLIBC, the answer is: Yes, BUT.
Specifically, yes, BUT, please, please be aware that while malloc and free are thread-safe, the debugging functions are not.
Specifically, the extremely useful mtrace(), mcheck(), and mprobe() functions are not thread-safe. In one of the shortest, straightest answers you will ever see from a GNU project, this is explained here:
https://sourceware.org/bugzilla/show_bug.cgi?id=9939
You will need to consider alternate techniques, such as ElectricFence, valgrind, dmalloc, etc.
So, if you mean, "are the malloc() and free() functions threadsafe", the answer is yes. But if you mean, "is the entire malloc/free suite threadsafe", the answer is NO.
Short answer: yes, as of C11, which is the first version of the C standard that includes the concept of threads, malloc and friends are required to be thread-safe. Many operating systems that included both threads and a C runtime made this guarantee long before the C standard did, but I'm not prepared to swear to all. However, malloc and friends are not and never have been required to be reentrant.
That means, it is safe to call malloc and free from multiple threads simultaneously and not worry about locking, as long as you aren't breaking any of the other rules of memory allocation (e.g. call free once and only once on each pointer returned by malloc). But it is not safe to call these functions from a signal handler that might have interrupted a call to malloc or free in the thread handling the signal. Sometimes, using functionality beyond ISO C, you can guarantee that the thread handling the signal did not interrupt a call to malloc or free, e.g. with sigprocmask and sigpause, but try not to do that unless you have no other option, because it's hard to get perfectly right.
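The usual way to stay out of trouble is to do no allocation at all in the handler and only set a flag. A minimal sketch, assuming a POSIX system with sigaction; the names got_signal, install_flag_handler and consume_signal_flag are made up for this example:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for sigaction */
#endif
#include <assert.h>
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t got_signal = 0;

/* The handler only assigns to a volatile sig_atomic_t object:
 * no malloc, no free, no stdio. */
static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Hypothetical helper: installs the flag-setting handler for signo.
 * Returns 0 on success, -1 on failure (as sigaction does). */
int install_flag_handler(int signo)
{
    struct sigaction sa;

    assert(signo > 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Called from normal program flow, where calling malloc/free is fine.
 * Returns 1 if a signal arrived since the last call, else 0. */
int consume_signal_flag(void)
{
    int seen = got_signal;
    got_signal = 0;
    return seen;
}
```

The main loop polls consume_signal_flag() and does any work that needs memory allocation there, outside the handler. The read-then-clear in consume_signal_flag() is not atomic, so a signal arriving between the two statements can be missed; that is acceptable for a "something happened" flag, but not for counting signals.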
Long answer with citations: The C standard added a concept of threads in the 2011 revision (link is to document N1570, which is the closest approximation to the official text of the 2011 standard that is publicly available at no charge). In that revision, section 7.1.4 paragraph 5 states:
Unless explicitly stated otherwise in the detailed descriptions that follow, library functions shall prevent data races as follows: A library function shall not directly or indirectly access objects accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function's arguments. A library function shall not directly or indirectly modify objects accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function's non-const arguments. Implementations may share their own internal objects between threads if the objects are not visible to users and are protected against data races.
[footnote: This means, for example, that an implementation is not permitted to use a static object for internal purposes without synchronization because it could cause a data race even in programs that do not explicitly share objects between threads. Similarly, an implementation of memcpy is not permitted to copy bytes beyond the specified length of the destination object and then restore the original values because it could cause a data race if the program shared those bytes between threads.]
As I understand it, this is a long-winded way of saying that the library functions defined by the C standard are required to be thread-safe (in the usual sense: you can call them from multiple threads simultaneously, without doing any locking yourself, as long as they don't end up clashing on the data passed as arguments) unless the documentation for a specific function specifically says it isn't.
Then, 7.22.3p2 confirms that malloc, calloc, realloc, aligned_alloc, and free in particular are thread-safe:
For purposes of determining the existence of a data race, memory allocation functions behave as though they accessed only memory locations accessible through their arguments and not other static duration storage. These functions may, however, visibly modify the storage that they allocate or deallocate. A call to free or realloc that deallocates a region p of memory synchronizes with any allocation call that allocates all or part of the region p. This synchronization occurs after any access of p by the deallocating function, and before any such access by the allocating function.
Contrast what it says about strtok, which is not and never has been thread-safe, in 7.24.5.8p6:
The strtok function is not required to avoid data races with other calls to the strtok function.
[footnote: The strtok_s function can be used instead to avoid data races.]
(comment on the footnote: don't use strtok_s, use strsep.)
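For what it's worth, a strsep-based tokenizer could look like the sketch below. strsep is a BSD extension that glibc also provides; it is not part of ISO C or POSIX. split_fields is a hypothetical helper name. All parsing state lives in a local cursor variable, which is why there is nothing for concurrent callers to race on, provided each works on its own buffer.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE   /* strsep is a BSD extension, not ISO C */
#endif
#include <assert.h>
#include <string.h>

/* Hypothetical helper: splits buf in place on any character in delims and
 * stores up to max token pointers in out. Returns the number of tokens.
 * Unlike strtok, the position is kept in a local variable, not in
 * hidden static storage. */
size_t split_fields(char *buf, const char *delims, char **out, size_t max)
{
    size_t n = 0;
    char *cursor = buf;
    char *tok;

    assert(buf != NULL && delims != NULL && out != NULL);
    while (n < max && (tok = strsep(&cursor, delims)) != NULL)
        out[n++] = tok;
    return n;
}
```

Note that strsep returns empty fields, so "a,b,,c" yields four tokens, whereas strtok and strtok_r would skip the empty one and yield three.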
Older versions of the C standard said nothing whatsoever about thread safety. However, they did say something about reentrancy, because signals have always been part of the C standard. And this is what they said, going back to the original 1989 ANSI C standard (this document has nigh-identical wording to, but very different section numbering from, the ISO C standard that came out the following year):
If [a] signal occurs other than as the result of calling the abort
or raise function, the behavior is undefined if the signal handler
calls any function in the standard library other than the signal
function itself or refers to any object with static storage duration
other than by assigning a value to a static storage duration variable
of type volatile sig_atomic_t . Furthermore, if such a call to the
signal function results in a SIG_ERR return, the value of errno is
indeterminate.
Which is a long-winded way of saying that C library functions are not required to be reentrant as a general rule. Very similar wording still appears in C11, 7.14.1.1p5:
If [a] signal occurs other than as the result of calling the abort or raise function, the behavior is undefined if the signal handler refers to any object with static or thread storage duration that is not a lock-free atomic object other than by assigning a value to an object declared as volatile sig_atomic_t, or the signal handler calls any function in the standard library other than the abort function, the _Exit function, the quick_exit function, or the signal function with the first argument equal to the signal number corresponding to the signal that caused the invocation of the handler. Furthermore, if such a call to the signal function results in a SIG_ERR return, the value of errno is indeterminate.
[footnote: If any signal is generated by an asynchronous signal handler, the behavior is undefined.]
POSIX requires a much longer, but still short compared to the overall size of the C library, list of functions to be safely callable from an "asynchronous signal handler", and also defines in more detail the circumstances under which a signal might "occur other than as the result of calling the abort or raise function." If you're doing anything nontrivial with signals, you are probably writing code intended to be run on an OS with the Unix nature (as opposed to Windows, MVS, or something embedded that probably doesn't have a complete hosted implementation of C in the first place), and you should familiarize yourself with the POSIX requirements for them, as well as the ISO C requirements.
I suggest reading §31.1 Thread Safety (and Reentrancy Revisited) of the book The Linux Programming Interface. It explains the difference between thread safety and reentrancy, and uses malloc as an example.
Excerpt:
A function is said to be thread-safe if it can safely be invoked by
multiple threads at the same time; put conversely, if a function is
not thread-safe, then we can’t call it from one thread while it is
being executed in another thread.
....
This function illustrates the typical reason that a function is not
thread-safe: it employs global or static variables that are shared by all threads.
...
Although the use of critical sections to implement thread safety is a significant
improvement over the use of per-function mutexes, it is still somewhat inefficient
because there is a cost to locking and unlocking a mutex. A reentrant function
achieves thread safety without the use of mutexes. It does this by avoiding the use
of global and static variables.
...
However, not all functions can
be made reentrant. The usual reasons are the following:
By their nature, some functions must access global data structures. The functions in the malloc library provide a good example. These functions maintain a
global linked list of free blocks on the heap. The functions of the malloc library
are made thread-safe through the use of mutexes.
....
Definitely worth a read.
And to answer your question, malloc is thread safe but not reentrant.
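To illustrate "thread safe but not reentrant", here is a toy allocator that guards its global state with a mutex, in the spirit of the excerpt. It is a deliberately simplified bump allocator with no free operation, not how glibc's malloc works; toy_alloc, arena and arena_lock are invented names.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Shared global state, as in the excerpt's description of malloc. */
static _Alignas(max_align_t) unsigned char arena[4096];
static size_t arena_used = 0;
static pthread_mutex_t arena_lock = PTHREAD_MUTEX_INITIALIZER;

/* Toy bump allocator: hands out consecutive chunks of the arena.
 * Returns NULL when size is 0 or the arena is exhausted. */
void *toy_alloc(size_t size)
{
    const size_t align = _Alignof(max_align_t);
    void *p = NULL;

    if (size == 0 || size > sizeof arena)
        return NULL;
    size = (size + align - 1) / align * align;   /* round up to alignment */

    pthread_mutex_lock(&arena_lock);             /* serializes all callers */
    if (size <= sizeof arena - arena_used) {
        p = arena + arena_used;
        arena_used += size;
    }
    pthread_mutex_unlock(&arena_lock);
    return p;
}
```

Two threads can call toy_alloc() at the same time and the mutex keeps arena_used consistent, so it is thread-safe. It is not reentrant: if a signal handler called toy_alloc() while the interrupted thread was holding arena_lock, the handler could block forever waiting for a lock that its own thread can never release.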
It depends on which implementation of the C runtime library you're using. If you're using MSVC, for example, there's a compiler option which lets you specify which version of the library you want to build with (i.e. a run-time library that supports multi-threading by being thread-safe, or one that does not).
No, it is not thread-safe. There may actually be a malloc_lock() and malloc_unlock() function available in your C library. I know that these exist for the Newlib library. I had to use this to implement a mutex for my processor, which is multi-threaded in hardware.
malloc and free are not reentrant, because they use a static data structure which records what memory blocks are free. As a result, no library functions that allocate or free memory are reentrant.
No, it is not.
Web archive link (original has gone dead)

Resources