Does a page fault cause a thread context switch on Linux?

If a thread suffers a major fault while trying to read from an address, and the data must be swapped in from "disk", does Linux take advantage of that to run another waiting thread, if there is one?
From what I've read online, the answer is yes. But I haven't seen anything conclusive.

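That depends on the scheduler you use. In general, the answer is yes: a thread that takes a major fault is put to sleep until the I/O completes, and the scheduler runs another runnable thread in the meantime, unless the disk operation is sufficiently fast or the kernel has another reason not to switch to a different thread.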
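If you want some evidence from your own machine, one rough approach is to compare the major-fault counter with the context-switch counters before and after touching memory that has to come from disk. The sketch below assumes Linux and getrusage(); the struct and helper names (fault_snapshot, take_snapshot, print_delta) are made up for illustration. On Linux a thread that sleeps waiting for I/O is normally counted under ru_nvcsw, so a rise in both ru_majflt and ru_nvcsw is consistent with the answer above, though it is not proof of cause and effect.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Hypothetical helper type: the counters relevant to the question. */
struct fault_snapshot {
    long major_faults;    /* page faults that needed I/O (ru_majflt) */
    long voluntary_cs;    /* switches because the process blocked (ru_nvcsw) */
    long involuntary_cs;  /* switches because it was preempted (ru_nivcsw) */
};

/* Read the current counters for the whole process. */
struct fault_snapshot take_snapshot(void)
{
    struct rusage ru;
    int rc = getrusage(RUSAGE_SELF, &ru);
    assert(rc == 0);
    (void)rc;

    struct fault_snapshot s = { ru.ru_majflt, ru.ru_nvcsw, ru.ru_nivcsw };
    return s;
}

/* Print how much each counter grew between two snapshots. */
void print_delta(struct fault_snapshot before, struct fault_snapshot after)
{
    printf("major faults:        +%ld\n", after.major_faults - before.major_faults);
    printf("voluntary switches:  +%ld\n", after.voluntary_cs - before.voluntary_cs);
    printf("involuntary switches: +%ld\n", after.involuntary_cs - before.involuntary_cs);
}
```

Take one snapshot, read through a large file-backed or swapped-out mapping, take another snapshot and call print_delta() on the pair.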


Calling convention which only allows one instance of a function at a time

Say I have multiple threads and all threads call the same function at approximately the same time.
Is there a calling convention which would only allow one instance of the function at any time? What I mean is that the function called by the second thread would only start after the function called by the first thread had returned.
Or are these calling conventions compiler specific? I don't have a whole lot of experience using them.
(Skip to the bottom if you don't care about the threading mumbo-jumbo)
As mentioned before, this is not a "calling convention" but a general problem of computing: concurrency. And the particular case where two or more threads can enter a shared zone at a time, and have a different outcome, is called a race condition (and also extends to/from electronics, and other areas).
The hard thing about threading is that computing is such a deterministic affair, but when threading gets involved, it adds a degree of uncertainty, which varies per platform/OS.
A one-thread affair guarantees that all tasks are always done in the same order, but when you've got multiple threads, the order depends on how fast each one can complete a task and on other applications wanting to use the CPU, so the underlying hardware and system load affect the results.
There's no single "sure-fire way to do threading"; instead there are techniques, tools and libraries to deal with individual cases.
Locking in
The most well known technique is using semaphores (or locks), and the most well known semaphore is the mutex one, which only allows one thread at a time to access a shared space, by having a sort of "flag" that is raised once a thread has entered.
if (locked == NO) {
    locked = YES;   // not atomic with the test above
    // Do ya' thing
    locked = NO;
}
The code above looks like it could work, but it does not guard against the case where both threads pass the if () before either one sets the variable (which threads can easily do). So there's hardware support for this kind of operation that guarantees only one thread can perform it at a time: the test-and-set operation, which checks the variable and, if it is available, sets it in a single atomic step. (Here's the x86 instruction from the instruction set)
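In portable C you don't have to write the instruction yourself: C11's atomic_flag gives you a test-and-set. Here is a minimal sketch of a spinlock built on it, assuming a C11 compiler with <stdatomic.h> and pthreads for the demo threads; the names spin_lock, spin_unlock, add_many and run_two_threads are made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;
static long shared_counter = 0;

void spin_lock(void)
{
    /* test_and_set stores "true" and returns the previous value in one
     * indivisible step, so only one thread can ever see "false" here. */
    while (atomic_flag_test_and_set_explicit(&lock_flag, memory_order_acquire))
        ; /* busy-wait until the holder clears the flag */
}

void spin_unlock(void)
{
    atomic_flag_clear_explicit(&lock_flag, memory_order_release);
}

/* Thread body: increment the shared counter n times under the lock. */
void *add_many(void *arg)
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        spin_lock();
        shared_counter++;
        spin_unlock();
    }
    return NULL;
}

/* Demo driver: two threads each add n; returns the final counter. */
long run_two_threads(long n)
{
    pthread_t a, b;
    shared_counter = 0;

    int rc = pthread_create(&a, NULL, add_many, &n);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, add_many, &n);
    assert(rc == 0);
    (void)rc;

    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_counter;
}
```

A spinlock like this burns CPU while waiting, so it is only a teaching aid here; for real code a pthread mutex is usually the better default.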
In the same vein as locks and semaphores, there's also the read-write lock, which allows multiple readers or one writer, especially useful for things with low volatility. And there are many other variations, some that limit access to X threads at a time and whatnot.
But overall, locks are lame, since they are basically forcing serialisation of multi-threading, where threads actually need to get stuck trying to get a lock (or just testing it and leaving). Kinda defeats the purpose of having multiple threads, doesn't it?
The best solution in terms of threading is to minimise the amount of shared space that threads need to use, possibly eliminating it completely. Maybe use rwlocks when volatility is low, try to have "try and leave" kinds of threads that check whether the lock is free and go away if it isn't, etc.
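The "try and leave" idea maps onto pthread_mutex_trylock(), which returns EBUSY right away instead of blocking. A small sketch, assuming pthreads; stats_lock, stats_updates and try_update_stats are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

pthread_mutex_t stats_lock = PTHREAD_MUTEX_INITIALIZER;
long stats_updates = 0;

/* Returns true if the update ran, false if the lock was busy and we left. */
bool try_update_stats(void)
{
    int rc = pthread_mutex_trylock(&stats_lock);
    if (rc == EBUSY)
        return false;       /* someone else is in there: go do something else */
    assert(rc == 0);

    stats_updates++;        /* the protected work */
    pthread_mutex_unlock(&stats_lock);
    return true;
}
```

Whether skipping the work is acceptable depends entirely on the task; it fits things like optional statistics or cache refreshes, not updates that must happen.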
As my OS teacher once said (in Zen-like fashion): "The best kind of locking is the one you can avoid".
Thread Pools
Now, threading is hard, no way around it, that's why there are patterns to deal with such kind of problems, and the Thread Pool Pattern is a popular one, at least in iOS since the introduction of Grand Central Dispatch (GCD).
Instead of having a bunch of threads running amok and getting enqueued all over the place, let's have a set of threads, waiting for tasks in a "pool", and having queues of things to do, ideally, tasks that shouldn't overlap each other.
Now, the thread pool pattern doesn't solve the problems discussed before, but it changes the paradigm to make it easier to deal with, mentally. Instead of having to think about "threads that need to execute such and such", you just switch the focus to "tasks that need to be executed", and the matter of which thread is doing it becomes irrelevant.
Again, pools won't solve all your problems, but they will make them easier to understand. And easier to understand may lead to better solutions.
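To make the "tasks, not threads" idea concrete, here is a deliberately tiny pool in plain C with pthreads: a fixed set of workers pulling tasks from one bounded queue. This is a sketch under my own assumptions (fixed sizes, one queue, no priorities), not how GCD or any particular library does it; every name in it (struct pool, pool_start, pool_submit, pool_stop, add_to_total, pool_total) is invented for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define POOL_THREADS 4
#define QUEUE_CAP    64

typedef void (*task_fn)(void *);
struct task { task_fn fn; void *arg; };

struct pool {
    pthread_t       workers[POOL_THREADS];
    struct task     queue[QUEUE_CAP];   /* circular buffer of pending tasks */
    int             head, count;
    bool            stopping;
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
};

static void *worker_main(void *p)
{
    struct pool *pool = p;
    for (;;) {
        pthread_mutex_lock(&pool->lock);
        while (pool->count == 0 && !pool->stopping)
            pthread_cond_wait(&pool->not_empty, &pool->lock);
        if (pool->count == 0) {             /* stopping and queue drained */
            pthread_mutex_unlock(&pool->lock);
            return NULL;
        }
        struct task t = pool->queue[pool->head];
        pool->head = (pool->head + 1) % QUEUE_CAP;
        pool->count--;
        pthread_mutex_unlock(&pool->lock);
        t.fn(t.arg);                        /* run the task outside the lock */
    }
}

void pool_start(struct pool *pool)
{
    pool->head = pool->count = 0;
    pool->stopping = false;
    pthread_mutex_init(&pool->lock, NULL);
    pthread_cond_init(&pool->not_empty, NULL);
    for (int i = 0; i < POOL_THREADS; i++) {
        int rc = pthread_create(&pool->workers[i], NULL, worker_main, pool);
        assert(rc == 0);
        (void)rc;
    }
}

/* Returns false if the queue is full or the pool is shutting down. */
bool pool_submit(struct pool *pool, task_fn fn, void *arg)
{
    bool ok = false;
    pthread_mutex_lock(&pool->lock);
    if (!pool->stopping && pool->count < QUEUE_CAP) {
        int tail = (pool->head + pool->count) % QUEUE_CAP;
        pool->queue[tail].fn = fn;
        pool->queue[tail].arg = arg;
        pool->count++;
        ok = true;
        pthread_cond_signal(&pool->not_empty);
    }
    pthread_mutex_unlock(&pool->lock);
    return ok;
}

/* Lets the workers finish everything already queued, then joins them. */
void pool_stop(struct pool *pool)
{
    pthread_mutex_lock(&pool->lock);
    pool->stopping = true;
    pthread_cond_broadcast(&pool->not_empty);
    pthread_mutex_unlock(&pool->lock);
    for (int i = 0; i < POOL_THREADS; i++)
        pthread_join(pool->workers[i], NULL);
    pthread_mutex_destroy(&pool->lock);
    pthread_cond_destroy(&pool->not_empty);
}

/* Example task: add the pointed-to value to a shared total. */
atomic_long pool_total;
void add_to_total(void *arg)
{
    atomic_fetch_add(&pool_total, *(long *)arg);
}
```

The caller only ever says "run this task"; which of the four workers picks it up is not its concern. Note that the queue itself is still a shared zone protected by a mutex, which is the point made above: the pool reorganises the problem, it doesn't remove it.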
All the theoretical things mentioned above are already implemented at the POSIX level (semaphore.h, pthread.h, etc.; pthreads has a very nice set of r/w locking functions), so try reading about them.
(Edit: I thought this thread was about Obj-C, not plain C, edited out all the Foundation and GCD stuff)
Calling convention defines how stack & registers are used to implement function calls. Because each thread has its own stack & registers, synchronising threads and calling convention are separate things.
To prevent multiple threads from executing the same code at the same time, you need a mutex. In your example of a function, you'd typically put the mutex lock and unlock inside the function's code, around the statements you don't want your threads to be executing at the same time.
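For example, with POSIX threads it could look roughly like this. It's only a sketch: take_ticket, ticket_lock and the small driver around it are names I made up, and I'm assuming pthreads is what your platform offers.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t ticket_lock = PTHREAD_MUTEX_INITIALIZER;
static int next_ticket = 0;

/* Only one thread at a time runs the statements between lock and unlock;
 * a second caller blocks in pthread_mutex_lock() until the first unlocks. */
int take_ticket(void)
{
    pthread_mutex_lock(&ticket_lock);
    int mine = next_ticket;
    next_ticket = mine + 1;
    pthread_mutex_unlock(&ticket_lock);
    return mine;
}

/* Thread body for the demo: call take_ticket() a given number of times. */
static void *ticket_worker(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        take_ticket();
    return NULL;
}

/* Demo driver: returns how many tickets were handed out in total. */
int tickets_after(int threads, int per_thread)
{
    pthread_t tids[16];
    assert(threads >= 0 && threads <= 16);

    for (int i = 0; i < threads; i++) {
        int rc = pthread_create(&tids[i], NULL, ticket_worker, &per_thread);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < threads; i++)
        pthread_join(tids[i], NULL);
    return next_ticket;
}
```

Without the lock, two threads could read the same next_ticket value and hand out the same ticket twice.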
In general terms: Plain code, including function calls, does not know about threads, the operating system does. By using a mutex you tap into the system that manages the running of threads. More details are just a Google search away.
Note that C11, the new C standard revision, does include multi-threading support. But this does not change the general concept; it simply means that you can use C library functions instead of operating system specific ones.

Making process survive failure in its thread

I'm writing an app that has many independent threads. Since I'm doing quite low-level, dangerous stuff there, threads may fail (SIGSEGV, SIGBUS, SIGFPE), but they should not kill the whole process. Is there a proper way to do this?
Currently I intercept the aforementioned signals and call pthread_exit(NULL) in their signal handler. It seems to work, but since pthread_exit is not an async-signal-safe function I'm a bit concerned about this solution.
I know that splitting this app into multiple processes would solve the problem, but in this case it's not a feasible option.
EDIT: I'm aware of all the Bad Things™ that can happen (I'm experienced in low-level system and kernel programming) due to ignoring SIGSEGV/SIGBUS/SIGFPE, so please try to answer my particular question instead of giving me lessons about reliability.
The PROPER way to do this is to let the whole process die, and start another one. You don't explain WHY this isn't appropriate, but in essence, that's the only way that is completely safe against various nasty corner cases (which may or may not apply in your situation).
I'm not aware of any method that is 100% safe that doesn't involve letting the whole process die. (Note also that sometimes just the act of continuing from this sort of error is "undefined behaviour" - it doesn't mean that you are definitely going to fall over, just that it MAY be a problem).
It's of course possible that someone knows of some clever trick that works, but I'm pretty certain that the only 100% guaranteed method is to kill the entire process.
Low-latency code design involves a careful "be aware of the system you run on" type of coding and deployment. That means, for example, that standard IPC mechanisms (say, using SysV msgsnd/msgrcv to pass messages between processes, or pthread_cond_wait/pthread_cond_signal on the PThreads side) as well as ordinary locking primitives (adaptive mutexes) are to be considered rather slow ... because they involve something that takes thousands of CPU cycles ... namely, context switches.
Instead, use "hot-hot" handoff mechanisms such as the disruptor pattern - both producers as well as consumers spin in tight loops permanently polling a single or at worst a small number of atomically-updated memory locations that say where the next item-to-be-processed is found and/or to mark a processed item complete. Bind all producers / consumers to separate CPU cores so that they will never context switch.
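To give an idea of what such a spinning handoff looks like, here is a stripped-down single-producer/single-consumer ring buffer using C11 atomics. It is a simplified illustration of the polling idea, not the actual disruptor implementation, and it leaves out CPU pinning and cache-line padding; all names (struct ring, ring_push, ring_pop, transfer_sum) are invented for the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define RING_SIZE 1024  /* power of two, so wrapping is a cheap mask */

struct ring {
    long slots[RING_SIZE];
    _Atomic size_t head;  /* next slot to write; advanced only by the producer */
    _Atomic size_t tail;  /* next slot to read; advanced only by the consumer */
};

/* Producer side: returns false if the ring is full. */
bool ring_push(struct ring *r, long v)
{
    size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
    size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
    if (head - tail == RING_SIZE)
        return false;
    r->slots[head & (RING_SIZE - 1)] = v;
    /* release: the slot write above becomes visible before the new head */
    atomic_store_explicit(&r->head, head + 1, memory_order_release);
    return true;
}

/* Consumer side: returns false if the ring is empty. */
bool ring_pop(struct ring *r, long *out)
{
    size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
    size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
    if (head == tail)
        return false;
    *out = r->slots[tail & (RING_SIZE - 1)];
    atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
    return true;
}

struct producer_args { struct ring *r; long n; };

/* Producer thread: pushes 1..n, spinning whenever the ring is full. */
static void *producer_main(void *p)
{
    struct producer_args *a = p;
    for (long i = 1; i <= a->n; i++)
        while (!ring_push(a->r, i))
            ; /* spin: no system call, no sleep */
    return NULL;
}

/* Demo: the calling thread consumes n items by polling; returns their sum. */
long transfer_sum(long n)
{
    static struct ring r;  /* static storage: indices start at zero */
    struct producer_args args = { &r, n };
    pthread_t t;
    int rc = pthread_create(&t, NULL, producer_main, &args);
    assert(rc == 0);
    (void)rc;

    long sum = 0, v;
    for (long got = 0; got < n; )
        if (ring_pop(&r, &v)) { sum += v; got++; }

    pthread_join(t, NULL);
    return sum;
}
```

Neither side ever makes a system call on the hot path; the cost is that both threads occupy a core full-time, which is why each one wants a dedicated core.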
In this type of usecase, whether you use separate threads (and get the memory sharing implicitly by virtue of all threads sharing the same address space) or separate processes (and get the memory sharing explicitly by using shared memory for the data-to-be-processed as well as the queue mgmt "metadata") makes very little difference because TLBs and data caches are "always hot" (you never context switch).
If your "processors" are unstable and/or have no guaranteed completion time, you need to add a "reaper" mechanism anyway to deal with failed / timed out messages, but such garbage collection mechanisms necessarily introduce jitter (latency spikes). That's because you need a system call to determine whether a specific thread or process has exited, and system call latency is a few microseconds even in the best case.
From my point of view, you're trying to mix oil and water here; you're required to use library code not specifically written for use in low-latency deployments / library code not under your control, combined with the requirement to do message dispatch with nanosec latencies. There is no way to make e.g. pthread_cond_signal() give you nsec latency because it must do a system call to wake the target up, and that takes longer.
If your "handler code" relies on the "rich" environment, and a huge amount of "state" is shared between these and the main program ... it sounds a bit like saying "I need to make a steam-driven airplane break the sound barrier"...

get_user_pages_fast() from kernel thread

I need to call get_user_pages_fast() from a kernel thread. But get_user_pages_fast() uses current->mm internally, which is set to NULL for a kernel thread. Is there any way to get around this? The kernel thread in question is working on behalf of another process, say x; would it be fine to just set current->mm to x->mm and invoke get_user_pages_fast()?
[EDIT 1]: I verified this and it seems to be working. I am still concerned if it could break in some cases. Any insight is welcome. Thanks.
Your "hack" will indeed work, but let's take a step back and understand what the idea of it is:
When you are in a kernel thread, (And I am talking about a pure kernel thread (child of kthreadd), not a user thread executing in kernel mode, as would be the case of servicing a syscall), there is no user memory to speak of. This is why current->mm is null: There is no "current" user space memory.
When you assign x->mm to current->mm you are "cheating" by annexing the process memory space of the innocent x to be your own. As a consequence, any allocation you perform will be charged to x, and will be visible to x (it is, after all, part of its memory space). Also, there might be internal kernel checks on current->mm which might be tricked, leading to your kernel mode thread being treated by the kernel as if it were a user mode thread (though arguably other checks rely on KERNEL_DS/USER_DS, which you're not modifying). Still, a concern. This will break if x ever dies (hey - nobody's immortal), and will likely cause an oops, if not a panic altogether.
You haven't said WHY you need to get user pages - if the case is that you know x is alive and you are doing this as part of, say, IPC/shmem, I can see a reason for that. If that is the case, you might want to provide some API for the process in question to "register" with the kernel thread. Otherwise, your solution works, but is.. well, not as neat as it could be.
I'm not convinced this is totally safe. The _fast part of get_user_pages_fast means that acquiring mm->mmap_sem is not required, and part of the reason that works is because it is assumed that we are running within the process itself (so eg the current->mm can't go away completely). Since you're running in another thread, you're susceptible to races if the real process ever does something that changes its mapping.
I guess the question is why can't you just use get_user_pages instead?

Reader-Writer using semaphores and shared memory in C

I'm trying to make a simple reader/writer program using POSIX named semaphores. It's working, but on some systems it halts immediately on the first semaphore and that's it ... I'm really desperate by now. Can anyone help, please? It's working fine on my system, so I can't track the problem with ltrace. (Sorry for the comments, I'm from the Czech Republic.)
POSIX semaphores are not well suited for application code since they are interruptible: sem_wait() can return early with errno set to EINTR when a signal is delivered. Basically any sort of IO to your processes can mess up your signalling. Please have a look at this post.
So you'd have to be really careful to interpret all error returns from the sem_ functions properly. In the code that you posted there is no such thing.
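The usual minimum is to wrap sem_wait() so that an interruption is retried rather than treated as "I got the semaphore". A small sketch; sem_wait_retry is a name I made up, and it works the same for named semaphores (from sem_open) and unnamed ones.

```c
#include <assert.h>
#include <errno.h>
#include <semaphore.h>

/* Wait on the semaphore, restarting if a signal interrupts the wait.
 * Returns 0 once the semaphore has been taken, or -1 with errno set
 * for any error other than EINTR. */
int sem_wait_retry(sem_t *sem)
{
    int rc;
    do {
        rc = sem_wait(sem);
    } while (rc == -1 && errno == EINTR);
    return rc;
}
```

Callers still need to check for -1: carrying on after a failed wait means entering the critical section without holding the semaphore.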
If your implementation of POSIX supports them, just use rwlocks, they are made for this, are much higher level and don't encounter that difficulty.
In computer science, the readers-writers problems are examples of a common computing problem in concurrency. There are at least three variations of the problems, which deal with situations in which many threads try to access the same shared memory at one time. Some threads may read and some may write, with the constraint that no process may access the share for either reading or writing, while another process is in the act of writing to it. (In particular, it is allowed for two or more readers to access the share at the same time.) A readers-writer lock is a data structure that solves one or more of the readers-writers problems.
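For threads inside one process, the pthread rwlock API covers this directly. Below is a minimal sketch assuming POSIX 2008 pthreads; config_lock, config_value and the three functions are hypothetical names. Sharing an rwlock between separate processes, as in the question, would additionally need the lock placed in shared memory and initialised with a process-shared attribute, which this sketch does not show.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose the rwlock API under strict -std */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>

pthread_rwlock_t config_lock = PTHREAD_RWLOCK_INITIALIZER;
int config_value = 0;

/* Any number of readers may be inside at the same time. */
int read_config(void)
{
    pthread_rwlock_rdlock(&config_lock);
    int v = config_value;
    pthread_rwlock_unlock(&config_lock);
    return v;
}

/* A writer gets in only when no reader and no other writer holds the lock. */
void write_config(int v)
{
    int rc = pthread_rwlock_wrlock(&config_lock);
    assert(rc == 0);
    (void)rc;
    config_value = v;
    pthread_rwlock_unlock(&config_lock);
}

/* Demonstrates the rule: while one read lock is held, a second reader is
 * admitted but a writer is refused. Returns 1 if both observations hold. */
int readers_share_writers_wait(void)
{
    pthread_rwlock_rdlock(&config_lock);
    int second_reader = pthread_rwlock_tryrdlock(&config_lock);
    int writer = pthread_rwlock_trywrlock(&config_lock);
    int ok = (second_reader == 0 && writer == EBUSY);

    if (second_reader == 0)
        pthread_rwlock_unlock(&config_lock);  /* release the second read lock */
    pthread_rwlock_unlock(&config_lock);      /* release the first read lock */
    return ok;
}
```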

pthread_mutex_lock locks, but no owner is set

I've been working on this one for a few days -
As a background, I'm working on taking a single-threaded C program and making it multi-threaded. I have recently discovered a new deadlock case, but when I look at the mutex in gdb I see that
__lock=2 yet __owner=0
This is not a recursive mutex. Has anyone seen this? The program I'm working on is a daemon and this case only happens after executing at a high-throughput rate for over 20 minutes (approximately) and then relaxing the load. If you have any ideas I'd be grateful.
Edit - I neglected to mention that all of my other threads are idle at this time.
This is to be expected. A normal (non-recursive, non-errorchecking) mutex has no need to store its owner, and some time can be saved skipping the step of looking up the caller's thread id. (This makes little difference on x86 but can be a huge difference on platforms like MIPS with broken ABIs, where there is no thread register and getting the thread id incurs a fault into kernelspace.)
The deadlock you're seeing is almost certainly due either to the thread trying to lock a mutex it already holds, or to an actual logic error where two or more threads are each waiting for mutexes the other holds.
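One way to test the first theory is to switch the suspect mutex to the error-checking type while debugging: that type does track its owner, and locking it again from the same thread returns EDEADLK instead of hanging. A sketch, assuming POSIX 2008 pthreads; init_checked_mutex and relock_result are names invented for this example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L  /* expose PTHREAD_MUTEX_ERRORCHECK */
#endif
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Initialise m as an error-checking mutex; returns 0 on success. */
int init_checked_mutex(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_ERRORCHECK);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Locks the same mutex twice from one thread and returns what the
 * second attempt reported. */
int relock_result(void)
{
    pthread_mutex_t m;
    int rc = init_checked_mutex(&m);
    assert(rc == 0);
    rc = pthread_mutex_lock(&m);
    assert(rc == 0);
    (void)rc;

    int second = pthread_mutex_lock(&m);  /* reports an error instead of hanging */

    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return second;
}
```

If every pthread_mutex_lock() return value in the daemon is checked and logged while running with this mutex type, a self-relock shows up as an error code rather than a silent hang. It costs a little extra per lock, so it is something to enable for the investigation rather than permanently.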
As far as I can tell, this is due to a limitation of the pthread library. Whenever I have found parts of the code that use excessive locking and unlocking and heavily stressed that section of the code, I have had this kind of failure. I have solved them by re-writing these sections to minimize their locking, which is easier code to maintain (less error checking when re-acquiring potentially freed objects) and eliminates some overhead.
I just fixed the issue I was having - stack corruption caused the mutex.__data.__lock value to get set to some ridiculous number (4 billion-ish) just prior to attempting the pthread_mutex_lock call. See if you can set a breakpoint, or print debugging info on the value of __lock just prior to performing the lock operation, and I'm willing to bet it's invalid right before the deadlock occurs.
