I have some code that moves bytes in a buffer using memmove(). The buffer is accessed by multiple threads. I get very weird behavior; sometimes the buffer isn't what it should be, and I was wondering whether memmove() and/or malloc() are thread safe. I'm working on iOS (in case this is platform dependent).
In an implementation that provides threads, malloc will normally be thread safe (i.e., it will take steps to ensure the heap doesn't get corrupted, even if malloc gets called from multiple threads). The exact way it does that varies: some implementations use a single heap with internal synchronization to guard against corruption. Others use multiple heaps, so different threads can allocate memory simultaneously without colliding.
memmove will normally behave just like assignments in your own code: if you're sharing a buffer across threads, it's your responsibility to synchronize access to that data.
You should be using a mutex (NSLock) as a protective barrier around accessing your buffer. Take a look at Synchronization in Apple's Threading Programming Guide.
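As a minimal sketch of that advice, here is what guarding the buffer with a pthread mutex might look like (the buffer size, the `shift_buffer` name, and its parameters are all hypothetical, chosen just for illustration; on iOS an NSLock would serve the same role):

```c
#include <pthread.h>
#include <string.h>

/* Hypothetical shared buffer and a mutex guarding it. */
static char buffer[256];
static pthread_mutex_t buffer_lock = PTHREAD_MUTEX_INITIALIZER;

/* Shift the first `len` bytes of the buffer to `offset`, holding the
 * lock so no other thread can observe a half-moved state. */
void shift_buffer(size_t offset, size_t len)
{
    pthread_mutex_lock(&buffer_lock);
    memmove(buffer + offset, buffer, len);  /* memmove handles overlap */
    pthread_mutex_unlock(&buffer_lock);
}
```

Every reader and writer of the buffer must take the same lock; locking only in the writer doesn't help.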
Malloc may be thread-safe. The standard doesn't require it, but many C compilers are used in systems whose applications require thread safety, and your particular compiler's library may be thread safe, or offer a thread safe option. I don't know about iOS.
Memmove (or any other kind of block move) is not thread safe, any more than an assignment statement is thread safe.
Since the current C standard does not specify threads, it has nothing to say about thread safety. Whenever you have threads, you're dealing with a system that has placed further requirements, beyond the basic C language standard's requirements, on how the standard library functions behave. I'm not sure what requirements iOS makes, but POSIX and Windows both require malloc to be thread-safe, and I'd find it hard to believe any system designed after the mid 90s would not make this requirement.
Note that the C1x standard (since published as C11) specifies threads, and in an implementation that has threads, malloc is required to be thread-safe.
No, since the standard C library doesn't have a concept of threads, the functions it defines can't be.
Why is malloc() considered a standard C library function and not a system call? It seems like the OS is responsible for handling all memory allocation requests.
It would certainly be possible to implement malloc and free as system calls, but it's rarely if ever done that way.
System calls are calls into the OS kernel. For example, on POSIX systems (Linux, UNIX, ...), read and write are system calls. When a C program calls read, it's probably calling a wrapper that does whatever is needed to make a request to the kernel and then return the result to the caller.
It turns out that the most efficient way to do memory management is to use lower-level system calls (see brk and sbrk) to expand the current process's data segment, and then use library calls (malloc, free, etc.) to manage memory within that segment. That management doesn't require any interaction with the kernel; it's all just pointer manipulation performed within the current process. The malloc function will invoke a system call such as brk or sbrk if it needs more memory than is currently available, but many malloc calls won't require any interaction with the kernel at all.
The above is fairly specific to Linux/POSIX/UNIX systems. The details will be a bit different for Windows for example, but the overall design is likely to be similar.
Note that some C standard library functions are typically implemented directly as system calls. time is one example (but as Nick ODell points out in a comment, a time call can often be performed without interacting with the kernel).
It seems like the OS is responsible for handling all memory allocation requests.
Well, both yes and no
It actually depends more on your specific system than it depend on C.
Most OSes allocate memory in chunks of some size, typically called a page. Page sizes differ between systems, and a specific system may support several page sizes. 4K is a typical page size on many systems, but much larger "huge pages" may also be supported.
But yes... at the end of the day there is only one entity that can allocate memory: the OS. Unless you are on bare metal, where other code can handle it - if that's even supported.
Why is malloc() considered a standard C library function and not a system call?
The short answer is: Because malloc isn't an OS/system call. Period.
To elaborate a bit more: one malloc call may lead to a system call, but the next malloc may not.
For instance: you request 100 bytes using malloc. malloc may decide to call the OS. The OS gives you 4K. On your next malloc you request 500 bytes. Then the "layer in between" can just hand out the 500 bytes from the chunk already provided by the previous syscall.
So no... memory allocation via malloc may not lead to any syscall for allocation of more memory.
It's all very dependent on your specific system. And the C standard doesn't care.
But malloc is not a syscall. malloc uses other syscalls when needed.
It seems like the OS is responsible for handling all memory allocation requests.
For performance reasons, it's not a good idea to ask the OS for memory every time the program needs memory. There are a few reasons for this:
The OS manages memory in units called pages. Pages are typically 4096 bytes long. (But some architectures or operating systems use larger pages.) The OS can't allocate memory to a process in a chunk smaller than a page.
Imagine you need 10 bytes to store a string. It would be very wasteful to allocate 4096 bytes and only use the first 10. A memory allocator can ask the OS for a page, and slice that page into smaller allocations.
A system call requires a context switch. A context switch is expensive (~100 ns on x86 systems) relative to calling a function in the same program. Again, it is better to ask for a larger chunk of memory and re-use it for many allocations.
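To make the page-slicing idea above concrete, here is a toy "bump" allocator that hands out small allocations from one large chunk without any further system calls. It is purely illustrative (real allocators also support freeing, resize the arena, and would obtain the chunk from sbrk() or mmap() rather than a static array):

```c
#include <stddef.h>

#define PAGE_SIZE 4096

/* Pretend this array came from a single sbrk()/mmap() call. */
static unsigned char page[PAGE_SIZE];
static size_t used = 0;

/* Hand out `size` bytes from the chunk; no system call involved. */
void *bump_alloc(size_t size)
{
    size = (size + 15) & ~(size_t)15;  /* keep allocations 16-byte aligned */
    if (used + size > PAGE_SIZE)
        return NULL;  /* a real malloc would ask the OS for another chunk here */
    void *p = page + used;
    used += size;
    return p;
}
```

Many small requests are satisfied from the same 4096-byte chunk, which is exactly why most malloc calls never reach the kernel.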
Why is malloc() considered a library call and not a system call?
For some library calls, like read(), the implementation in the library is very simple: it calls the system call of the same name. One call to the library function read() produces one system call to read(). It's reasonable to describe read() as a system call, because all the work is being done in the kernel.
The story with malloc() is more complicated. There's no system call called malloc(), and the library call malloc() will actually use the system calls sbrk(), brk(), or mmap(), depending on the size of your allocation and the implementation you're using. Much of the time, it makes no system call at all!
There are many different choices in how to implement malloc(). For that reason, you'll see many different competing implementations, such as jemalloc, or tcmalloc.
Why is malloc() considered a standard C library function and not a system call?
Because it's part of the C standard library.
It seems like the OS is responsible for handling all memory allocation requests.
It's not. An operating system typically allocates some memory space for a given process, but how the memory is used after that is up to the process. Using the standard library for things like memory allocation insulates your code from the details of any given operating system, which makes your code a lot more portable. A given implementation of malloc might ultimately make a system call to obtain memory, but whether it does or doesn't or does some of the time is an implementation detail.
Although this question has been asked in different ways on SO, I will ask it from a pthreads perspective, to learn which tools provide synchronization.
We know that each thread has its own stack but shares the heap and global data. Since the heap is shared, I am confused about how, and with which synchronization tool, to protect the full heap.
There are two possibilities - either the functions your system provides to deal with the heap (malloc, free, etc.) are thread-safe or they're not.
If they are, no problem - you don't have to do anything.
If they're not, you'll need to write a wrapper function for each one that you want to use and lock appropriately. pthread_mutex_* calls seem appropriate to me.
I have implemented two applications that share data using the POSIX shared memory API (i.e. shm_open). One process updates data stored in the shared memory segment and another process reads it. I want to synchronize access to the shared memory region using some sort of mutex or semaphore. What is the most efficient way of doing this? Some mechanisms I am considering are:
A POSIX mutex stored in the shared memory segment (Setting the PTHREAD_PROCESS_SHARED attribute would be required)
Creating a System V semaphore using semget
Rather than a System V semaphore, I would go with a POSIX named semaphore using sem_open(), etc.
Might as well make this an answer.
You can use sem_init with pshared true to create a POSIX semaphore in your shared memory space. I have used this successfully in the past.
As for whether this is faster or slower than a shared mutex and condition variable, only profiling can tell you. On Linux I suspect they are all pretty similar since they rely on the "futex" machinery.
If efficiency is important, I would go with process-shared mutexes and condition variables.
AFAIR, each operation on a semaphore requires a syscall, so an uncontended mutex should be faster than a semaphore [ab]used in a mutex-like manner.
First, really benchmark to find out whether performance is important at all; the cost of these constructs is often overestimated. If access to the control structure is not of the same order of magnitude as the writes themselves, just take whichever construct is semantically best for your use case. That will usually be the case if you write some 100 bytes per access to the control structure.
Otherwise, if the control structure is the bottleneck, you should perhaps avoid using one. C11 has the new concept of _Atomic types and operations that can be used where there are races in access to data. C11 is not yet widely implemented, but probably all modern compilers already have extensions implementing these features.
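To illustrate the _Atomic suggestion, a small sketch in which two threads increment a shared counter without any mutex (the `count_with_two_threads` helper and iteration count are invented for the example; C11 atomics via <stdatomic.h>, threads via pthreads):

```c
#include <pthread.h>
#include <stdatomic.h>

static _Atomic int counter = 0;

/* Each thread bumps the shared counter 100000 times; the atomic
 * increment needs no lock and, on the fast path, no system call. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++)
        atomic_fetch_add(&counter, 1);
    return NULL;
}

int count_with_two_threads(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return atomic_load(&counter);
}
```

With a plain (non-atomic) int the two threads would race and the final count would usually fall short of 200000.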
Is sprintf thread safe?
// Global log buffer
char logBuffer[20];

void logStatus(char *status, int length)
{
    snprintf(logBuffer, 19, status);
    printf("%s\n", logBuffer);
}
The thread safety of this function totally depends upon the thread safety of snprintf/sprintf .
Update:
Thanks for your answers.
I don't mind if the actual contents get messed up, but I want to confirm that sprintf would not cause memory corruption or a buffer overflow beyond 20 bytes in this case, when multiple threads are trying to write to logBuffer.
There is no problem using snprintf() in multiple threads. But here you are writing to a shared string buffer, which I assume is shared across threads.
So your use of this function would not be thread safe.
Your question has an incorrect premise. Even if sprintf itself can be safely called from multiple threads at the same time (as I sure hope it can), your code is not protecting your global variable. The standard library can't possibly help you there.
You have several problems with your code.
Your usage of snprintf is very suspicious. Don't use it just to copy a string. In general, don't pass dynamically allocated strings with arbitrary content as the format to any of the printf functions: they interpret the contents, and if there is anything in them that resembles a %-format, you are doomed.
Don't use static buffers as you do. This is certainly neither thread safe nor re-entrant.
Either use printf with an appropriate format directly, or replace the call with puts.
Then, Linux adheres to the POSIX standard, which requires that the standard IO functions are thread safe.
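Putting those suggestions together, one possible safer version of the question's function (a sketch, not the only fix: it uses a per-call stack buffer instead of the global one, and "%s" as the format so status is treated as data; the `format_status` helper is invented for the example):

```c
#include <stdio.h>

/* Copy status into a caller-supplied buffer, truncating to fit.
 * Using "%s" (not status itself) as the format avoids format-string bugs. */
static void format_status(char *buf, size_t buflen, const char *status)
{
    snprintf(buf, buflen, "%s", status);
}

void logStatus(const char *status)
{
    char buf[20];  /* per-call buffer: no shared state to race on */
    format_status(buf, sizeof buf, status);
    puts(buf);     /* POSIX requires stdio streams to lock internally */
}
```

Since nothing here writes to shared memory, concurrent calls can at worst interleave whole output lines, never corrupt a buffer.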
Regarding your update about not worrying if the logBuffer content get garbled:
I'm not sure why you want to avoid making your function completely thread safe by using a locally allocated buffer or some synchronization mechanism, but if you want to know what POSIX has to say about it, here you go (http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap04.html#tag_04_11):
Applications shall ensure that access to any memory location by more than one thread of control (threads or processes) is restricted such that no thread of control can read or modify a memory location while another thread of control may be modifying it. Such access is restricted using functions that synchronize thread execution and also synchronize memory with respect to other threads. [followed by a list of functions which provide synchronization]
So, POSIX says that your program needs to make sure multiple threads won't be modifying logBuffer concurrently (or modifying logBuffer in one thread while reading it in another). If you don't hold to that, there's no promise that the worst that will happen is garbled data in logBuffer. There's simply no promise made at all about what the results will be. I don't know if Linux might document a more specific behavior, but I doubt it does.
"There is no problem using snprintf() in multiple threads."
Not true.
Not true, at least in case of POSIX functions.
None of the standard vararg functions is mt-safe: this includes the whole printf() family, and every other variadic function as well.
sprintf(), for example, is "MT-Safe locale|AS-Unsafe heap|AC-Unsafe mem", which means that it can fail if the locale is changed asynchronously or if asynchronous cancellation of threads is used. In other words, special attention must be paid when using such functions in an MT environment.
va_arg is not mt-safe: "MT-Safe race:ap|AS-Safe|AC-Unsafe corrupt", which means that interlocking is needed.
Additionally, as should be obvious, even a totally mt-safe function can be used in an unsafe way; this happens, for example, when two or more threads operate on the same data/memory areas.
It's not thread safe, since the buffer you sprintf into is shared between all threads.
"Do you have a reference which says that they are not thread safe? When I Google, it seems that they are"
My previous answer to this question was removed/deleted, so I'll try again with a different approach:
AC (asynchronous cancellation of threads): this is obviously a case where almost all "apparently MT-safe" code can fail, simply because the thread is interrupted at a random point in time, so none of the synchronization methods is guaranteed to work correctly (i.e., no form of mutex can really be guaranteed to work).
Threads can use the same malloc() arena, which means that if one thread fails (i.e., damages the malloc arena), then all consecutive calls to malloc() can cause critical errors. This of course depends on system configuration, but it also means that nobody should assume malformed memory (de)allocations are safe.
Since all systems provide the option to use different locale settings, it is obvious that an asynchronous change to the "locale" settings can cause errors...
As far as I know, each thread gets a distinct stack when it is created by the operating system. I wonder whether each thread also has its own distinct heap.
No. All threads share a common heap.
Each thread has a private stack, which it can quickly add and remove items from. This makes stack based memory fast, but if you use too much stack memory, as occurs in infinite recursion, you will get a stack overflow.
Since all threads share the same heap, access to the allocator/deallocator must be synchronized. There are various methods and libraries for avoiding allocator contention.
Some languages allow you to create private pools of memory, or individual heaps, which you can assign to a single thread.
By default, C has only a single heap.
That said, some thread-aware allocators will partition the heap so that each thread has its own area to allocate from. The idea is that this makes the heap scale better.
One example of such a heap is Hoard.
Depends on the OS. The standard C runtime on Windows and Unices uses a shared heap across threads. This means locking every malloc/free.
On Symbian, for example, each thread comes with its own heap, although threads can share pointers to data allocated in any heap. Symbian's design is better in my opinion since it not only eliminates the need for locking during alloc/free, but also encourages clean specification of data ownership among threads. Also in that case when a thread dies, it takes all the objects it allocated along with it - i.e. it cannot leak objects that it has allocated, which is an important property to have in mobile devices with constrained memory.
Erlang also follows a similar design where a "process" acts as a unit of garbage collection. All data is communicated between processes by copying, except for binary blobs which are reference counted (I think).
Each thread has its own (call) stack.
Each thread shares the same heap.
It depends on what exactly you mean when saying "heap".
All threads share the address space, so heap-allocated objects are accessible from all threads. Technically, stacks are shared as well in this sense, i.e. nothing prevents you from accessing other thread's stack (though it would almost never make any sense to do so).
On the other hand, there are heap structures used to allocate memory. That is where all the bookkeeping for heap memory allocation is done. These structures are sophisticatedly organized to minimize contention between the threads - so some threads might share a heap structure (an arena), and some might use distinct arenas.
See the following thread for an excellent explanation of the details: How does malloc work in a multithreaded environment?
Typically, threads share the heap and other resources, however there are thread-like constructions that don't. Among these thread-like constructions are Erlang's lightweight processes, and UNIX's full-on processes (created with a call to fork()). You might also be working on multi-machine concurrency, in which case your inter-thread communication options are considerably more limited.
Generally speaking, all threads use the same address space and therefore usually have just one heap.
However, it can be a bit more complicated. You might be looking for Thread Local Storage (TLS), but it stores single values only.
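A quick sketch of what TLS gives you in portable C: C11's _Thread_local (many compilers also accept __thread as an extension) makes each thread see its own copy of a variable. The `run_tls_demo` helper and the loop count are invented for the example:

```c
#include <pthread.h>
#include <stdint.h>

/* Each thread gets its own copy of this variable. */
static _Thread_local int tls_counter = 0;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 1000; i++)
        tls_counter++;            /* touches only this thread's copy */
    return (void *)(intptr_t)tls_counter;
}

/* Run one worker thread and return its final counter. The calling
 * thread's copy of tls_counter is left untouched. */
long run_tls_demo(void)
{
    pthread_t t;
    void *result;
    pthread_create(&t, NULL, worker, NULL);
    pthread_join(t, &result);
    return (long)(intptr_t)result;
}
```

Note this is still a single value per thread, not a per-thread heap; heap allocations made by the thread remain visible to every other thread.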
Windows-Specific:
TLS-space can be allocated using TlsAlloc and freed using TlsFree (Overview here). Again, it's not a heap, just DWORDs.
Strangely, Windows supports multiple heaps per process. One can store a heap's handle in TLS; then you would have something like a "thread-local heap". However, only the handle is private: the other threads can still access the heap's memory through pointers, since it's all the same address space.
EDIT: Some memory allocators (specifically jemalloc on FreeBSD) use TLS to assign "arenas" to threads. This is done to optimize allocation for multiple cores by reducing synchronization overhead.
On the FreeRTOS operating system, tasks (threads) share the same heap, but each has its own stack. This comes in very handy on low-power, low-RAM architectures, because the same pool of memory can be accessed/shared by several threads. But there is a small catch: the developer needs to keep in mind that malloc and free must be synchronized, which is why some kind of synchronization/lock, for example a semaphore or a mutex, is necessary when allocating or freeing memory on the heap.