Is malloc thread-safe? - c

Is the malloc() function re-entrant?

Question: "is malloc reentrant"?
Answer: no, it is not. Here is one definition of what makes a routine reentrant.
None of the common versions of malloc allow you to re-enter it (e.g. from a signal handler). Note that a reentrant routine may not use locks, and almost all malloc versions in existence do use locks (which makes them thread-safe), or global/static variables (which makes them thread-unsafe and non-reentrant).
All the answers so far answer "is malloc thread-safe?", which is an entirely different question. The answer to that question is: it depends on your runtime library, and possibly on the compiler flags you use. On any modern UNIX, you'll get a thread-safe malloc by default. On Windows, use the /MT, /MTd, /MD or /MDd flags to get a thread-safe runtime library.

I read somewhere that if you compile with -pthread, malloc becomes thread-safe. I'm pretty sure it's implementation-dependent though, since malloc is ANSI C and threads are not.
If we are talking gcc:
Compile and link with -pthread and malloc() will be thread-safe, on x86 and AMD64.
http://groups.google.com/group/comp.lang.c.moderated/browse_thread/thread/2431a99b9bdcef11/ea800579e40f7fa4
Another opinion, more insightful
{malloc, calloc, realloc, free, posix_memalign} of glibc-2.2+ are thread safe
http://linux.derkeiler.com/Newsgroups/comp.os.linux.development.apps/2005-07/0323.html

This is quite an old question, and I want to bring it up to date with the current state of things.
Yes, currently malloc() is thread-safe.
From the GNU C Library Reference Manual of glibc-2.20 [released 2014-09-07]:
void * malloc (size_t size)
Preliminary: MT-Safe | ...
...
1.2.2.1 POSIX Safety Concepts:
... MT-Safe or Thread-Safe functions are safe to call in the presence
of other threads. MT, in MT-Safe, stands for Multi Thread.
Being MT-Safe does not imply a function is atomic, nor that it uses
any of the memory synchronization mechanisms POSIX exposes to users.
It is even possible that calling MT-Safe functions in sequence does
not yield an MT-Safe combination. For example, having a thread call
two MT-Safe functions one right after the other does not guarantee
behavior equivalent to atomic execution of a combination of both
functions, since concurrent calls in other threads may interfere in a
destructive way.
Whole-program optimizations that could inline functions across library
interfaces may expose unsafe reordering, and so performing inlining
across the GNU C Library interface is not recommended. The documented
MT-Safety status is not guaranteed under whole-program optimization.
However, functions defined in user-visible headers are designed to be
safe for inlining.

Yes, under POSIX.1-2008 malloc is thread-safe.
2.9.1 Thread-Safety
All functions defined by this volume of POSIX.1-2008 shall be thread-safe, except that the following functions need not be thread-safe.
[ a list of functions that does not contain malloc ]

Here is an excerpt from malloc.c of glibc :
Thread-safety: thread-safe unless NO_THREADS is defined
Assuming NO_THREADS is not defined by default, malloc is thread-safe, at least on Linux.

If you are working with GLIBC, the answer is: Yes, BUT.
Specifically, yes, BUT, please, please be aware that while malloc and free are thread-safe, the debugging functions are not.
Specifically, the extremely useful mtrace(), mcheck(), and mprobe() functions are not thread-safe. In one of the shortest, straightest answers you will ever see from a GNU project, this is explained here:
https://sourceware.org/bugzilla/show_bug.cgi?id=9939
You will need to consider alternate techniques, such as ElectricFence, valgrind, dmalloc, etc.
So, if you mean, "are the malloc() and free() functions threadsafe", the answer is yes. But if you mean, "is the entire malloc/free suite threadsafe", the answer is NO.

Short answer: yes, as of C11, which is the first version of the C standard that includes the concept of threads, malloc and friends are required to be thread-safe. Many operating systems that included both threads and a C runtime made this guarantee long before the C standard did, but I'm not prepared to swear to all. However, malloc and friends are not and never have been required to be reentrant.
That means, it is safe to call malloc and free from multiple threads simultaneously and not worry about locking, as long as you aren't breaking any of the other rules of memory allocation (e.g. call free once and only once on each pointer returned by malloc). But it is not safe to call these functions from a signal handler that might have interrupted a call to malloc or free in the thread handling the signal. Sometimes, using functionality beyond ISO C, you can guarantee that the thread handling the signal did not interrupt a call to malloc or free, e.g. with sigprocmask and sigpause, but try not to do that unless you have no other option, because it's hard to get perfectly right.
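As a minimal sketch of the first point, assuming a POSIX threads environment (the names alloc_worker and run_alloc_workers are made up for this illustration): several threads call malloc and free concurrently with no user-level locking, and each block is freed exactly once by the thread that allocated it.

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

#define NUM_THREADS 4
#define ITERATIONS  10000

/* Each worker allocates, touches, and frees its own blocks.
   No user-level lock is taken around malloc() or free(). */
static void *alloc_worker(void *arg)
{
    long *failures = arg;          /* per-thread counter, not shared */
    for (int i = 0; i < ITERATIONS; i++) {
        size_t n = 16 + (size_t)(i % 64);
        char *p = malloc(n);
        if (p == NULL) {
            (*failures)++;
            continue;
        }
        memset(p, 0xAB, n);        /* the block is private to this thread */
        free(p);                   /* freed exactly once */
    }
    return NULL;
}

/* Runs the workers and returns the total number of failed allocations,
   or -1 if not every thread could be created. */
long run_alloc_workers(void)
{
    pthread_t tid[NUM_THREADS];
    long failures[NUM_THREADS] = {0};
    int started = 0;
    long total = 0;

    while (started < NUM_THREADS &&
           pthread_create(&tid[started], NULL, alloc_worker,
                          &failures[started]) == 0)
        started++;

    for (int t = 0; t < started; t++) {
        pthread_join(tid[t], NULL);
        total += failures[t];
    }
    return started == NUM_THREADS ? total : -1;
}
```

On older toolchains this would typically be compiled and linked with -pthread. The sketch only demonstrates that the calls need no extra locking; it says nothing about calling malloc from a signal handler.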
Long answer with citations: The C standard added a concept of threads in the 2011 revision (link is to document N1570, which is the closest approximation to the official text of the 2011 standard that is publicly available at no charge). In that revision, section 7.1.4 paragraph 5 states:
Unless explicitly stated otherwise in the detailed descriptions that follow, library functions shall prevent data races as follows: A library function shall not directly or indirectly access objects accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function's arguments. A library function shall not directly or indirectly modify objects accessible by threads other than the current thread unless the objects are accessed directly or indirectly via the function's non-const arguments. Implementations may share their own internal objects between threads if the objects are not visible to users and are protected against data races.
[footnote: This means, for example, that an implementation is not permitted to use a static object for internal purposes without synchronization because it could cause a data race even in programs that do not explicitly share objects between threads. Similarly, an implementation of memcpy is not permitted to copy bytes beyond the specified length of the destination object and then restore the original values because it could cause a data race if the program shared those bytes between threads.]
As I understand it, this is a long-winded way of saying that the library functions defined by the C standard are required to be thread-safe (in the usual sense: you can call them from multiple threads simultaneously, without doing any locking yourself, as long as they don't end up clashing on the data passed as arguments) unless the documentation for a specific function specifically says it isn't.
Then, 7.22.3p2 confirms that malloc, calloc, realloc, aligned_alloc, and free in particular are thread-safe:
For purposes of determining the existence of a data race, memory allocation functions behave as though they accessed only memory locations accessible through their arguments and not other static duration storage. These functions may, however, visibly modify the storage that they allocate or deallocate. A call to free or realloc that deallocates a region p of memory synchronizes with any allocation call that allocates all or part of the region p. This synchronization occurs after any access of p by the deallocating function, and before any such access by the allocating function.
Contrast what it says about strtok, which is not and never has been thread-safe, in 7.24.5.8p6:
The strtok function is not required to avoid data races with other calls to the strtok function.
[footnote: The strtok_s function can be used instead to avoid data races.]
(comment on the footnote: don't use strtok_s, use strsep.)
Older versions of the C standard said nothing whatsoever about thread safety. However, they did say something about reentrancy, because signals have always been part of the C standard. And this is what they said, going back to the original 1989 ANSI C standard (this document has nigh-identical wording to, but very different section numbering from, the ISO C standard that came out the following year):
If [a] signal occurs other than as the result of calling the abort
or raise function, the behavior is undefined if the signal handler
calls any function in the standard library other than the signal
function itself or refers to any object with static storage duration
other than by assigning a value to a static storage duration variable
of type volatile sig_atomic_t . Furthermore, if such a call to the
signal function results in a SIG_ERR return, the value of errno is
indeterminate.
Which is a long-winded way of saying that C library functions are not required to be reentrant as a general rule. Very similar wording still appears in C11, 7.14.1.1p5:
If [a] signal occurs other than as the result of calling the abort or raise function, the behavior is undefined if the signal handler refers to any object with static or thread storage duration that is not a lock-free atomic object other than by assigning a value to an object declared as volatile sig_atomic_t, or the signal handler calls any function in the standard library other than the abort function, the _Exit function, the quick_exit function, or the signal function with the first argument equal to the signal number corresponding to the signal that caused the invocation of the handler. Furthermore, if such a call to the signal function results in a SIG_ERR return, the value of errno is indeterminate.
[footnote: If any signal is generated by an asynchronous signal handler, the behavior is undefined.]
POSIX requires a much longer, but still short compared to the overall size of the C library, list of functions to be safely callable from an "asynchronous signal handler", and also defines in more detail the circumstances under which a signal might "occur other than as the result of calling the abort or raise function." If you're doing anything nontrivial with signals, you are probably writing code intended to be run on an OS with the Unix nature (as opposed to Windows, MVS, or something embedded that probably doesn't have a complete hosted implementation of C in the first place), and you should familiarize yourself with the POSIX requirements for them, as well as the ISO C requirements.
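One common way to stay within these rules, sketched here under the assumption that the program can poll a flag from its normal control flow: the handler only assigns to a volatile sig_atomic_t object, and any allocation happens later, outside the handler. The names got_signal, on_signal and handle_pending_signal are invented for this example, and raise() merely stands in for a signal that would normally arrive asynchronously.

```c
#include <signal.h>
#include <stdlib.h>

/* Set by the handler, read by normal code. */
static volatile sig_atomic_t got_signal = 0;

/* The handler only assigns to a volatile sig_atomic_t object.
   It does not call malloc(), free(), or any other library function. */
static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Returns 1 if a signal was seen and handled, 0 if none was pending,
   -1 on error. */
int handle_pending_signal(void)
{
    int status = 0;

    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;

    raise(SIGINT);   /* stands in for an asynchronous signal in this sketch */

    if (got_signal) {
        got_signal = 0;
        char *buf = malloc(64);    /* safe here: we are not in the handler */
        status = (buf != NULL) ? 1 : -1;
        free(buf);
    }

    signal(SIGINT, SIG_DFL);       /* restore the default disposition */
    return status;
}
```

In a real program the check of got_signal would sit in the main loop, so the allocation can never interrupt another malloc or free call on the same thread.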

I suggest reading
§31.1 Thread Safety (and Reentrancy Revisited)
of the book The Linux Programming Interface, it explains the difference between thread safety and reentrancy, as well as malloc.
Excerpt:
A function is said to be thread-safe if it can safely be invoked by
multiple threads at the same time; put conversely, if a function is
not thread-safe, then we can’t call it from one thread while it is
being executed in another thread.
....
This function illustrates the typical reason that a function is not
thread-safe: it employs global or static variables that are shared by all threads.
...
Although the use of critical sections to implement thread safety is a significant
improvement over the use of per-function mutexes, it is still somewhat inefficient
because there is a cost to locking and unlocking a mutex. A reentrant function
achieves thread safety without the use of mutexes. It does this by avoiding the use
of global and static variables.
...
However, not all functions can
be made reentrant. The usual reasons are the following:
By their nature, some functions must access global data structures. The functions in the malloc library provide a good example. These functions maintain a
global linked list of free blocks on the heap. The functions of the malloc library
are made thread-safe through the use of mutexes.
....
Definitely worth a read.
And to answer your question, malloc is thread safe but not reentrant.
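To make the distinction concrete, here is a small sketch (not taken from the book; next_id_locked and next_id_r are hypothetical names). The first function is thread-safe because a mutex guards its static state, in the same spirit as the malloc library. The second is reentrant because all of its state is supplied by the caller.

```c
#include <pthread.h>

/* Thread-safe but not reentrant: shared static state guarded by a mutex. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static unsigned long counter = 0;

unsigned long next_id_locked(void)
{
    pthread_mutex_lock(&counter_lock);
    unsigned long id = ++counter;
    pthread_mutex_unlock(&counter_lock);
    return id;
}

/* Reentrant: no global or static data, all state lives in an object
   supplied by the caller. */
unsigned long next_id_r(unsigned long *state)
{
    return ++*state;
}
```

If a signal handler interrupted next_id_locked while the mutex was held and called it again on the same thread, the second call would block on a lock its own thread already owns. That is the same hazard that makes a lock-based malloc unsafe to call from a signal handler.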

It depends on which implementation of the C runtime library you're using. If you're using MSVC, for example, there's a compiler option that lets you specify which version of the library you want to build with (i.e. a run-time library that supports multi-threading by being thread-safe, or not).

No, it is not thread-safe. There may actually be a malloc_lock() and malloc_unlock() function available in your C library. I know that these exist for the Newlib library. I had to use this to implement a mutex for my processor, which is multi-threaded in hardware.

malloc and free are not reentrant, because they use a static data structure which records what memory blocks are free. As a result, no library functions that allocate or free memory are reentrant.

No, it is not.
Web archive link (original has gone dead)

Related

Is stat(2) thread safe?

Many functions of the C library are clearly marked as thread-safe or not thread-safe. For example, when I look at the manual of gmtime(3) there is a nice table that shows which of these functions are thread-safe and which aren't.
Looking at the manual page of the stat(2) function, it doesn't say one way or the other. Are functions supposed to be thread safe unless we are told otherwise?
Reading up the POSIX Safety Concept did not really clearly state that a function not marked as unsafe is safe. Maybe I missed a sentence somewhere?
The POSIX page on Thread Safety says that all functions are thread-safe except the ones listed there. stat() is not in the list, nor are any of the variants (lstat(), fstatat(), fstat()). So it should be thread-safe.
The gmtime routines return a static pointer, which means it can be overwritten by other calls.
stat() does not return a pointer; you supply it with the structure to fill in. Therefore the result cannot be overwritten by other calls.
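The same caller-supplied-buffer idea is what distinguishes gmtime_r from gmtime. A minimal sketch, assuming a POSIX system (utc_year is a made-up helper name):

```c
#define _POSIX_C_SOURCE 200809L
#include <time.h>

/* Converts t to broken-down UTC time in caller-owned storage and
   stores the calendar year in *year_out.
   Returns 0 on success, -1 on failure. */
int utc_year(time_t t, int *year_out)
{
    struct tm result;              /* lives on this thread's stack */
    if (gmtime_r(&t, &result) == NULL)
        return -1;
    *year_out = result.tm_year + 1900;
    return 0;
}
```

Because the struct tm belongs to the caller, a concurrent call in another thread has nothing shared to overwrite, which is the same reason stat() with its caller-provided struct stat is not affected.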

Microsoft C Run Time function implementation - now and before

I am trying to write some portable code, and I've been wondering how Microsoft implemented old C runtime routines like gmtime or fopen, which return a pointer, as opposed to today's gmtime_s or fopen_s, which require an object to be passed in and return an errno-style status code (I guess).
One way would be to create a static (better than global) object inside such routines and return a pointer to it, but if one object is currently using this static pointer and another object invokes that routine, the first object would get a changed buffer, which is not good.
Furthermore, I doubt that such routines use dynamic memory, because that would lead to memory leaks.
As with other Microsoft stuff, the implementation is not open, so I can't take a peek. Any suggestions?
Well, first, such globals and statics cannot be used anyway because of thread-safety.
The use of dynamic memory, or arrays, or arrays of handles, or other such combos DO leak resources if the programmer misuses them. On non-trivial OS, such resources are linked to the process and are released upon process termination, so it is a serious problem for the app, but not for the OS.
Regarding gmtime, you are correct; it could have operated upon a variable that has static storage duration (which is the same storage duration as variables declared "globally", btw... There is no "global" in C). Historically speaking, you should probably assume this is the case, because C doesn't require that there be any support for multithreading. If you're referring to an era where there was decent support for multithreading, it's probable that gmtime might return something that has thread specific storage duration, instead, as the MSDN documentation for gmtime says gmtime and other similar functions "... all use one common tm structure per thread for the conversion."
However, fopen is a function that creates resources, and as a result it's reasonable to expect that every return value will be unique (unless it's an erroneous return value).
Indeed, fopen does constitute dynamic management; you are expected to call fclose to close the FILE once you're done with it... If you forget to close a file every now and then, there is no need to panic, as the C standard requires that the program close all FILEs that are still open upon program termination. This implies that the program keeps track of all of your FILEs behind the scenes.
However, it would obviously be a bad practice to repeatedly leak file descriptors, over and over again, constantly, for a long period of time.
I'm not sure about the Visual Studio specifics, but these function libraries are typically implemented with an opaque type, which is why they return a pointer and why you can't know the contents of the FILE struct.
That means there will either be a static memory pool or a call to malloc inside the function. There are no guarantees that the C library functions are re-entrant.
Calling fopen without having a corresponding fclose might indeed create a memory leak: at any rate you have a "resource leak". Therefore, make sure that you always call fclose.
As for the implementation details: you can't have the Visual Studio source code, but you could download

Are functions in the C standard library thread safe?

Where can I get a definitive answer, whether my memcpy (using the eglibc implementation that comes with Ubuntu) is thread safe? - Honestly, I really did not find a clear YES or NO in the docs.
By the way, by "thread safe" I mean it is safe to use memcpy concurrently whenever it would be safe to copy the data byte for byte concurrently. This should be possible at least if read-only data are copied to regions that do not overlap.
Ideally I would like to see something like the lists at the bottom of this page in the ARM compiler docs.
You can find that list here, at chapter 2.9.1 Thread-Safety : http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_09_01
That is, this is a list of functions that POSIX does not require to be thread safe. All other functions are required to be thread safe. POSIX includes the standard C library and the typical "unix" interfaces. (Full list here, http://pubs.opengroup.org/onlinepubs/9699919799/functions/contents.html)
memcpy() is specified by posix, but not part of the list in 2.9.1, and can therefore be considered thread safe.
The various environments on Linux at least try to implement POSIX to the best of their abilities. The functions on Linux/glibc might be thread-safe even if POSIX doesn't require them to be, though this is rarely documented. For functions and libraries beyond what POSIX covers, you are left with what their authors have documented...
From what I can tell, POSIX equates thread safety with reentrancy, and guarantees there are no internal data races. You, however, are responsible for the possible external data races, such as protecting yourself from calling e.g. memcpy() with memory that might be updated concurrently.
It depends on the function, and how you use it.
Take memcpy, for example: it is generally thread safe if you copy data where both source and destination are private to a single thread. If you write to data that can be read from or written to by another thread, it's no longer thread safe and you have to protect the access.
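A sketch of the safe case, assuming POSIX threads (copy_region, copy_in_two_threads and the buffer sizes are invented for illustration): two threads read the same read-only source and each writes to its own non-overlapping half of the destination, so no locking is needed around memcpy.

```c
#include <pthread.h>
#include <string.h>

#define HALF 512

static const char source[2 * HALF] = "read-only input shared by both threads";
static char dest[2 * HALF];

struct copy_job {
    size_t offset;   /* start of this thread's private region */
    size_t length;
};

static void *copy_region(void *arg)
{
    const struct copy_job *job = arg;
    /* Both threads read source, but each writes a disjoint part of dest. */
    memcpy(dest + job->offset, source + job->offset, job->length);
    return NULL;
}

/* Returns 0 if dest matches source after both copies, -1 if a thread
   could not be created, and another nonzero value on a mismatch. */
int copy_in_two_threads(void)
{
    struct copy_job jobs[2] = { {0, HALF}, {HALF, HALF} };
    pthread_t tid[2];
    int started = 0;

    while (started < 2 &&
           pthread_create(&tid[started], NULL, copy_region,
                          &jobs[started]) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    if (started != 2)
        return -1;
    return memcmp(dest, source, sizeof dest);
}
```

If the two regions overlapped, or if another thread modified source during the copy, the program would need its own synchronization; memcpy itself provides none.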
If a glibc function is not thread-safe then the man page will say so, and there will (most likely) be a thread safe variant also documented.
See, for example, man strtok:
SYNOPSIS
#include <string.h>
char *strtok(char *str, const char *delim);
char *strtok_r(char *str, const char *delim, char **saveptr);
The _r (for "reentrant") is the thread-safe variant.
Unfortunately, the man pages do not make a habit of stating that a function is thread safe, but only mention thread-safety when it is an issue.
As with all functions, if you give it a pointer to a shared resource then it will become thread-unsafe. It is up to you to handle locking.
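A short sketch of how the _r variant is typically used (count_tokens is a hypothetical helper): the parsing position is kept in a caller-owned saveptr rather than in hidden static storage, so calls working on different buffers do not interfere with each other.

```c
#define _POSIX_C_SOURCE 200809L
#include <string.h>

/* Counts the tokens in text, which is modified in place.
   The parsing state lives in the local saveptr, not in static storage. */
size_t count_tokens(char *text, const char *delims)
{
    char *saveptr = NULL;
    size_t count = 0;
    for (char *tok = strtok_r(text, delims, &saveptr);
         tok != NULL;
         tok = strtok_r(NULL, delims, &saveptr))
        count++;
    return count;
}
```

Note that the buffer itself is still modified, so two threads must not tokenize the same string at the same time without their own locking.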

Where can I find the list of non-reentrant functions provided in gnu libc?

I am now porting a single-threaded library to support multiple threads, and I need the whole list of functions that use local static or global variables.
Any information is appreciated.
Check the manual page for each function you use ... the non-thread-safe ones will be identified as such, and the manual page will mention a thread safe version when there is one (e.g., readdir_r). You could extract the list by running a script over the man pages.
Edit: Although my answer has been accepted, I fear that it is inaccurate and possibly dangerous. For example, while strerror_r mentions that it is a thread safe version of strerror, strerror itself says nothing about thread safety ... what it says instead is "the string might be overwritten", which merely implies that it isn't thread-safe. So you need to search for at least "might be overwritten" as well as "thread", but there's no guarantee that even that will be complete.
It's always a good idea to know whether a particular function is reentrant or not, but you must also consider the situation where you call several reentrant functions from a shared piece of code from multiple threads, which could also lead to problems when using shared data.
So, if you have any data shared between threads, the data must be "protected" regardless of the fact that the functions being called are reentrant.
Consider the following function:
void yourFunc(CommonObject *o)
{
    /* This function is NOT thread safe */
    reentrant_func1(o->propertyA);
    reentrant_func2(o->propertyA);
}
If this function is not mutex protected, you will get undesired behavior in a multithreaded application, regardless of the fact that reentrant_func1 and reentrant_func2 are reentrant.
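One possible way to protect it, sketched under several assumptions: CommonObject carries its own pthread mutex, propertyA is an int, and the two reentrant functions are trivial stand-ins that return a new value which is stored back into the object. None of these details come from the original snippet.

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;   /* guards propertyA (assumed layout) */
    int propertyA;
} CommonObject;

/* Hypothetical stand-ins for reentrant_func1 and reentrant_func2:
   they touch only their arguments. */
static int reentrant_func1(int value) { return value + 1; }
static int reentrant_func2(int value) { return value * 2; }

/* The two calls now form one critical section, so no other thread
   can observe or change propertyA between them. */
void yourFunc(CommonObject *o)
{
    pthread_mutex_lock(&o->lock);
    o->propertyA = reentrant_func1(o->propertyA);
    o->propertyA = reentrant_func2(o->propertyA);
    pthread_mutex_unlock(&o->lock);
}
```

The mutex has to be initialized before first use, for example with pthread_mutex_init(&o->lock, NULL), and every other piece of code that touches propertyA must take the same lock.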

Is getpwnam_r() reentrant a requirement?

getpwnam_r() is reentrant according to a number of manpages. However, the standard only states:
The getpwnam_r() function is thread-safe and returns values in a user-supplied buffer instead of possibly using a static data area that may be overwritten by each call.
I am confused. Must an NSS module's ...getpwnam_r() function be reentrant? Or is being thread-safe enough?
Well, as you note the standard requires that the function must be thread-safe. That doesn't prevent an implementation from providing a stricter guarantee.
IOW, portable software cannot assume that getpwnam_r is reentrant. But, if you care only about some specific platform which guarantees that it's reentrant, then presumably you can assume that.

Resources