I have a hard time understanding how wait queues work in the Linux kernel. I tried to implement a simple example, but without success. I know that when using wait_event_interruptible(w, condition), the process is put to sleep (TASK_INTERRUPTIBLE) until the condition evaluates to true.
I don't understand how these two things work together:
The condition is checked each time the waitqueue wq is woken up.
wake_up has to be called after changing any variable that could change the result of the wait condition.
How exactly can we wake up the process after it has been put to sleep?
This code goes to sleep and never wakes up.
#include <linux/init.h>
#include <linux/module.h>
#include <linux/sched.h>
// Declare the wait queue with a macro
DECLARE_WAIT_QUEUE_HEAD(wait_queue_t);
// Variables
int i = 5;
static void change_val(int *i)
{
*i = 4;
}
static int __init sys_module_init(void)
{
wait_event_interruptible(wait_queue_t, (i == 4)); // Sleeping until "i" is equal to 4
change_val(&i); // Changing the value of "i"
wake_up(&wait_queue_t); // Waking up the wait_queue
return 0;
}
// ... other kernel module code
I want to write a C program that runs for a specified amount of seconds
say 10 seconds and then exits. The code should set up an interrupt to go
off after a specified amount of time has elapsed.
Here is my attempt. But I am not sure if SIGALRM is the correct way to do it.
Can SIGALRM be called an interrupt?
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <stdlib.h>
void handler()
{
_exit(0);
}
int main()
{
signal(SIGALRM, handler);
alarm(10);
for (;;); /* You can assume that for(;;); is just a dummy code. The main idea is to insert something into code. Whatever code it may be so that it stops after 10 seconds – */
return 0;
}
Any suggestions/alternatives/better way to achieve this?
The distinction between "signal" and "interrupt" is not fully clear-cut. Signals can interrupt system calls, so a signal is an interrupt in this sense. But a signal is not a hardware interrupt. When you use an operating system, normal programs often don't have direct access to hardware interrupts.
Calling _exit from the signal handler might be problematic if your program needs to finish a task or to clean up something.
I suggest implementing a graceful end by setting a flag. Additionally, I suggest using sigaction instead of signal, because the semantics of signal, and of signal handlers set up with it, are implementation-dependent.
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <stdlib.h>
#include <string.h> /* memset */
static volatile sig_atomic_t timeout = 0;
void handler(int sig)
{
(void) sig;
timeout = 1;
}
int main(void)
{
struct sigaction act;
memset(&act, 0, sizeof(act));
act.sa_handler = handler;
if(sigaction(SIGALRM, &act, NULL) < 0) /* sigaction takes a pointer to the struct */
{
// handle error
}
alarm(10);
while(!timeout /* and maybe other conditions */)
{
// do something, handle error return codes and errno (EINTR)
// check terminate flag as necessary
}
// clean up if necessary
return 0;
}
Explanation (as requested in a comment)
static volatile sig_atomic_t timeout = 0;
sig_atomic_t is a type that guarantees atomic access even in the presence of asynchronous interrupts made by signals. That means an access to the variable cannot be interrupted in between, i.e. the software will never see a partially modified value. (see https://en.cppreference.com/w/c/program/sig_atomic_t)
volatile informs the compiler not to optimize access to the variable. This is necessary because the signal handler may modify the value while the main function is running the loop that is intended to check the flag. Otherwise the compiler might optimize the access out of the loop condition and do it only once before the loop because the variable is never modified inside the loop. (see https://en.cppreference.com/w/c/language/volatile)
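To make the "handle error return codes and errno (EINTR)" comment in the loop more concrete, here is a minimal sketch. It assumes the handler is installed without SA_RESTART, so a blocking read() fails with EINTR when the alarm fires. The names read_with_timeout and on_alarm are made up for this illustration, and the sketch does not close the small window in which the alarm could fire just before read() is entered (a robust version would need something like pselect()).

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t timed_out = 0;

static void on_alarm(int sig)
{
    (void) sig;
    timed_out = 1;
}

/*
 * Hypothetical helper: blocks in read() on fd for at most `seconds`.
 * Returns 1 if the alarm cut the read short, 0 if data or EOF arrived
 * first, and -1 on any other error.
 */
int read_with_timeout(int fd, unsigned seconds)
{
    struct sigaction act;
    char buf[64];
    ssize_t n;

    memset(&act, 0, sizeof(act));
    act.sa_handler = on_alarm;
    sigemptyset(&act.sa_mask);
    /* sa_flags stays 0: without SA_RESTART, read() fails with EINTR */
    if (sigaction(SIGALRM, &act, NULL) < 0)
        return -1;

    timed_out = 0;
    alarm(seconds);
    for (;;) {
        n = read(fd, buf, sizeof(buf));
        if (n >= 0) {
            alarm(0);           /* cancel the pending alarm */
            return 0;
        }
        if (errno != EINTR) {
            alarm(0);
            return -1;
        }
        if (timed_out)
            return 1;           /* interrupted by our SIGALRM */
        /* EINTR caused by some other signal: retry the read */
    }
}
```

Calling read_with_timeout() on the read end of an empty pipe with a one-second timeout should return 1 after about a second, because nothing ever arrives and the handler sets the flag.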
I'm porting software from an embedded computer to a Linux machine (Ubuntu 14.04 or Raspbian on a Raspberry Pi).
The original program was using setjmp/longjmp to handle timeout and CTRL+C event. It was running on a Microcontroller with a single main (one thread).
I'm trying to have a similar behaviour while using threads (pthreads).
The idea is that I want either a timeout or a CTRL+C to restart an infinite loop.
The original code was doing something like the code below. I don't mind replacing setjmp/longjmp with something else (e.g. try/catch, signals, pthread_kill, a condition variable, etc.).
Any idea how to implement similar behavior with C/C++ ?
Here is the code which seems to partially work and is probably not recommended/broken:
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>
#include <pthread.h>
#include <setjmp.h>
// Define
#define TICK_NS_TIME (10000000) // 0.01 sec = 10 ms (100 times per second)
#define NS_PER_SEC (1000000000) // Nano sec per second.
#define TICK_PER_SEC (NS_PER_SEC/TICK_NS_TIME) // Number of tick per second (Ex:100)
#define TIMEOUT_COUNT (30*TICK_PER_SEC) // 30 seconds timeout (with 100 tick per second)
// Env set/long jmp
#define ENV_SZ (2)
#define ENV_TIMEOUT (0)
#define ENV_CTRLC (1)
static jmp_buf env[ENV_SZ];
// Variables
int timeout_val;
// sig handler.
void signal_handler(int signo)
{
pthread_t self = pthread_self();
printf("Thread %lu in signal handler\n", (long)self);
if (signo == SIGINT) {
longjmp(env[ENV_CTRLC], 1); // Q?: Is it in the same thread ? (Never, Always, Sometimes?)
}
else
{
printf("Other signal received..quitting."); // Ex: kill -9 pid
exit(0);
}
}
// thread timer function
void* timer_function(void* in_param)
{
// Loop approx 100x per second.
for (;;) {
nanosleep((const struct timespec[]){{0, TICK_NS_TIME }}, NULL); // Sleep 10 ms seconds.
if (timeout_val) {
if (!--timeout_val) {
longjmp(env[ENV_TIMEOUT], 1); // longjmp when timer reaches 0. (Q?: Is this valid with multithread?)
}
}
}
}
// main
int main(int argc, char **argv)
{
int i;
int val;
struct sigaction actions;
pthread_t thread;
setvbuf (stdout, NULL, _IONBF, 0); // Make sure stdout is not buffered (ex:printf, etc.)
printf("[Program started]\r\n");
memset(&actions, 0, sizeof(actions));
sigemptyset(&actions.sa_mask);
actions.sa_flags = 0;
actions.sa_handler = signal_handler;
val = sigaction(SIGINT, &actions, NULL);
pthread_create(&thread, NULL, timer_function, NULL); // timer thread for example
printf("[Timer thread started]\r\n");
// setting env.
val = setjmp(env[ENV_TIMEOUT]);
if (val!=0){ printf("[JMP TIMEOUT]\r\n"); }
val = setjmp(env[ENV_CTRLC]);
if (val!=0){ printf("[JMP CTRLC]\r\n"); }
// main loop
timeout_val = TIMEOUT_COUNT;
i = 0;
for (;;)
{
i++;
if (i > 10){ i = 0; printf("[%d]", timeout_val/TICK_PER_SEC); } // Number of seconds before time out.
sleep(1);
printf(".");
}
printf("Main completed\n");
return 0;
}
//Compile: g++ -pthread main.cpp -o main
Suggestions for an alternative implementation would be great, since I'm new to programming with threads!
setjmp() saves the information required to restore the calling environment. longjmp() can then restore this environment, but only within the same thread.
The C11 standard is explicit about the constraint of having the same thread:
7.13.2.1/2 If there has been no such invocation (i.e: of a previous setjmp), or if the invocation was from another thread of
execution, or if the function containing the invocation of the
setjmp macro has terminated execution in the interim, or if the
invocation of the setjmp macro was within the scope of an identifier
with variably modified type and execution has left that scope in the
interim, the behavior is undefined.
In fact, setjmp/longjmp are generally implemented by saving the stack pointer, so restoring it makes sense only in the same execution context.
Alternative
Unless I've missed something, you use the second thread only to act as a timer. You could instead get rid of your POSIX pthread, and use a timer signal activated with POSIX timer_create().
But be aware that using setjmp/longjmp from a signal handler (so already in your original code for CTRL+C) is tricky, as explained in this SO answer. So consider sigsetjmp/siglongjmp instead.
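As a rough illustration of the sigsetjmp/siglongjmp variant, here is a minimal single-threaded sketch. The function name run_with_restarts, the can_jump guard, and the use of raise(SIGINT) as a stand-in for the user pressing CTRL+C are all assumptions made for the example, not part of the original program.

```c
#define _POSIX_C_SOURCE 200809L

#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf restart_point;
static volatile sig_atomic_t can_jump = 0;

static void on_sigint(int sig)
{
    (void) sig;
    /* Only jump once restart_point has been filled in by sigsetjmp() */
    if (can_jump)
        siglongjmp(restart_point, 1);
}

/*
 * Hypothetical demo: restarts the "main loop" max_restarts times and
 * returns how many restarts actually happened (-1 on setup failure).
 */
int run_with_restarts(int max_restarts)
{
    struct sigaction act;
    volatile int restarts = 0;  /* volatile: must survive the jump */

    memset(&act, 0, sizeof(act));
    act.sa_handler = on_sigint;
    sigemptyset(&act.sa_mask);
    if (sigaction(SIGINT, &act, NULL) < 0)
        return -1;

    /* Second argument 1: save the signal mask so that siglongjmp()
     * restores it and SIGINT is unblocked again after the jump. */
    if (sigsetjmp(restart_point, 1) != 0)
        restarts++;             /* we got here via siglongjmp() */
    can_jump = 1;

    if (restarts < max_restarts)
        raise(SIGINT);          /* stand-in for CTRL+C in this sketch */

    can_jump = 0;
    return restarts;
}
```

The sigsetjmp call and the handler run in the same thread here, which is the only case the standard allows. In the threaded program, the jump target and the signal delivery would both have to be confined to one thread for this to stay valid.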
For the record: C or C++?
Your question is tagged C. But you mention c++ try and catch. So for the sake of completeness:
In C++, setjmp should be replaced by a try/catch and longjmp by throwing an exception. setjmp/longjmp are supported in C++ only if unwinding the stack wouldn't require invocation of any non-trivial destructor (see C++ standard, 18.10/4).
Exceptions are not propagated across threads unless they are caught and explicitly rethrown using std::rethrow_exception(). It's delicate, so refer to this SO question for additional details. But it's possible and could solve your issue.
I have written a sample linux device driver code which will create two kernel threads and each will increment a single global variable. I have used wait-queues to perform the task of incrementing the variable, and each thread will wait on the wait queue until a timer expires and each thread is woken up at random.
But the problem is that when I insert this module, the whole system freezes and I have to restart the machine. This happens every time I insert the module. I tried debugging the kthread code to see whether I am entering a deadlock by mistake, but I can't find anything wrong with the code.
Can anyone please tell me what I am doing wrong in the code that causes the hang?
#include <linux/module.h>
#include <linux/kernel.h>
#include <linux/init.h>
#include <linux/errno.h>
#include <linux/semaphore.h>
#include <linux/wait.h>
#include <linux/timer.h>
#include <linux/sched.h>
#include <linux/kthread.h>
spinlock_t my_si_lock;
pid_t kthread_pid1;
pid_t kthread_pid2 ;
static DECLARE_WAIT_QUEUE_HEAD(wqueue);
static struct timer_list my_timer;
int kthread_num;
/* the timer callback */
void my_timer_callback( unsigned long data ){
printk(KERN_INFO "my_timer_callback called (%ld).\n", jiffies );
if (waitqueue_active(&wqueue)) {
wake_up_interruptible(&wqueue);
}
}
/*Routine for the first thread */
static int kthread_routine_1(void *kthread_num)
{
//int num=(int)(*(int*)kthread_num);
int *num=(int *)kthread_num;
char kthread_name[15];
unsigned long flags;
DECLARE_WAITQUEUE(wait, current);
printk(KERN_INFO "Inside daemon_routine() %ld\n",current->pid);
allow_signal(SIGKILL);
allow_signal(SIGTERM);
do{
set_current_state(TASK_INTERRUPTIBLE);
add_wait_queue(&wqueue, &wait);
spin_lock_irqsave(&my_si_lock, flags);
printk(KERN_INFO "kernel_daemon [%d] incrementing the shared data=%d\n",current->pid,(*num)++);
spin_unlock_irqrestore(&my_si_lock, flags);
remove_wait_queue(&wqueue, &wait);
if (kthread_should_stop()) {
break;
}
}while(!signal_pending(current));
set_current_state(TASK_RUNNING);
return 0;
}
/*Routine for the second thread */
static int kthread_routine_2(void *kthread_num)
{
//int num=(int)(*(int*)kthread_num);
int *num=(int *)kthread_num;
char kthread_name[15];
unsigned long flags;
DECLARE_WAITQUEUE(wait, current);
printk(KERN_INFO "Inside daemon_routine() %ld\n",current->pid);
allow_signal(SIGKILL);
allow_signal(SIGTERM);
do{
set_current_state(TASK_INTERRUPTIBLE);
add_wait_queue(&wqueue, &wait);
spin_lock_irqsave(&my_si_lock, flags);
printk(KERN_INFO "kernel_daemon [%d] incrementing the shared data=%d\n",current->pid,(*num)++);
spin_unlock_irqrestore(&my_si_lock, flags);
remove_wait_queue(&wqueue, &wait);
if (kthread_should_stop()) {
break;
}
}while(!signal_pending(current));
set_current_state(TASK_RUNNING);
return 0;
}
static int __init signalexample_module_init(void)
{
int ret;
spin_lock_init(&my_si_lock);
init_waitqueue_head(&wqueue);
kthread_num=1;
printk(KERN_INFO "starting the first kernel thread with id ");
kthread_pid1 = kthread_run(kthread_routine_1,&kthread_num,"first_kthread");
printk(KERN_INFO "%ld \n",(long)kthread_pid1);
if(kthread_pid1< 0 ){
printk(KERN_ALERT "Kernel thread [1] creation failed\n");
return -1;
}
printk(KERN_INFO "starting the second kernel thread with id");
kthread_pid2 = kthread_run(kthread_routine_2,&kthread_num,"second_kthread");
printk(KERN_INFO "%ld \n",(long)kthread_pid2);
if(kthread_pid2 < 0 ){
printk(KERN_ALERT "Kernel thread [2] creation failed\n");
return -1;
}
setup_timer( &my_timer, my_timer_callback, 0 );
ret = mod_timer( &my_timer, jiffies + msecs_to_jiffies(2000) );
if (ret) {
printk("Error in mod_timer\n");
return -EINVAL;
}
return 0;
}
static void __exit signalexample_module_exit(void)
{
del_timer(&my_timer);
}
module_init(signalexample_module_init);
module_exit(signalexample_module_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Demonstrates use of kthread");
You need a call to schedule() in both of your thread functions:
/* In kernel thread function... */
set_current_state(TASK_INTERRUPTIBLE);
add_wait_queue(&wqueue, &wait);
schedule(); /* Add this call here */
spin_lock_irqsave(&my_si_lock, flags);
/* etc... */
Calling set_current_state(TASK_INTERRUPTIBLE) sets the state in the current process' task structure, which allows the scheduler to move the process off of the run queue once it sleeps. But then you have to tell the scheduler, "Okay, I've set a new state. Reschedule me now." You're missing this second step, so the changed flag won't take effect until the next time the scheduler decides to suspend your thread, and there's no way to know how soon that will happen, or which line of your code it's executing when it happens (except in the locked code - that shouldn't be interrupted).
I'm not really sure why it's causing your whole system to lock up, because your system's state is pretty unpredictable. Since the kernel threads weren't waiting for the timer to expire before grabbing locks and looping, I have no idea when you could expect the scheduler to actually take action on the new task struct states, and a lot of things could be happening in the meantime. Your threads are repeatedly calling add_wait_queue(&wqueue, &wait); and remove_wait_queue(&wqueue, &wait);, so who knows what state the wait queue is in by the time your timer callback fires. In fact, since the kernel threads are spinning, this code has a race condition:
if (waitqueue_active(&wqueue)) {
wake_up_interruptible(&wqueue);
}
It's possible that you have active tasks on the waitqueue when the if statement is executed, only to have them emptied out by the time wake_up_interruptible(&wqueue); is called.
By the way, I'm assuming your current goal of incrementing a global variable is just an exercise to learn waitqueues and sleep states. If you ever want to actually implement a shared counter, look at atomic operations instead, and you'll be able to dump the spinlock.
If you decide to keep the spinlock, you should switch to using the DEFINE_SPINLOCK() macro instead.
Also, as I mentioned in my comment, you should change your two kthread_pid variables to be of task_struct * type. You also need a call to kthread_stop(kthread_pid); in your __exit routine for each of the threads you start. kthread_should_stop() will never return true if you don't ever tell them to stop.
I have written the following code to understand event ordering using pthreads and a mutex. The main function creates two threads, which run the functions func1 and func2. func1 checks the value of count and conditionally waits for func2 to signal it. func2 increments count and, when count reaches 50000, signals func1.
Then func1 prints the value of count, which is (or should be) 50000 at that time.
But in the actual output, other values are printed along with 50000, and I can't see why. My understanding is that when func2 signals, func1 wakes up and continues after the pthread_cond_wait statement, so it should print only 50000. Please point out where I am wrong and what should be changed to get the correct output.
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
pthread_mutex_t evmutex;
pthread_cond_t evcond;
char a;
int count;
int N = 50000;
void *func1()
{
while(1)
{
pthread_mutex_lock(&evmutex);
if(count < N)
{
pthread_cond_wait(&evcond,&evmutex);
printf("%d\n",count);
count = 0;
}
pthread_mutex_unlock(&evmutex);
}
}
void *func2()
{
while(1)
{
pthread_mutex_lock(&evmutex);
count++;
if(count == N)
{
pthread_cond_signal(&evcond);
}
pthread_mutex_unlock(&evmutex);
}
}
int main ()
{
pthread_t ptd1,ptd2;
pthread_mutex_init(&evmutex,NULL);
pthread_cond_init(&evcond,NULL);
count = 0;
pthread_create(&ptd1,NULL,func1,NULL);
pthread_create(&ptd2,NULL,func2,NULL);
pthread_exit(NULL);
pthread_mutex_destroy(&evmutex);
pthread_cond_destroy(&evcond);
return 0;
}
You haven't synchronized with the producer, func2(): nothing tells it to wait until the consumer, func1(), has processed the condition.
Nothing stops the producer from signalling the condition, re-acquiring the mutex, and incrementing the counter again. pthread_cond_signal doesn't mean your producer will halt and wait for the consumer to process.
This means the producer might increment the counter many times before your consumer gets scheduled and wakes up to print the current number.
You'd need to add another condition variable which the producer waits for after it's incremented the counter to N, and have the consumer signal that when it has processed the counter.
In addition to that, you need to handle spurious wakeups, as the other answers mention.
pthread_cond_wait() can suffer from spurious wake-ups (POSIX explicitly allows them), and because of this, it's common practice to call it in a loop such as while (!condition) { pthread_cond_wait(...); } so that the condition is re-checked after every wake-up.
I found a good explanation of the problem and causes here: Why does pthread_cond_wait have spurious wakeups?
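The two suggestions above (a second condition variable for the handshake, and a while loop around each wait) could be combined roughly as follows. This is only a sketch under assumptions: all hs_* names and the run_handshake helper are invented for the example, and the loops are bounded by a round count so the function can return, unlike the infinite loops in the question.

```c
#include <pthread.h>

static pthread_mutex_t hs_mutex = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t hs_reached = PTHREAD_COND_INITIALIZER;  /* producer -> consumer */
static pthread_cond_t hs_consumed = PTHREAD_COND_INITIALIZER; /* consumer -> producer */
static int hs_count, hs_limit, hs_rounds, hs_stop, hs_bad_reads;

static void *hs_consumer(void *unused)
{
    (void) unused;
    for (int r = 0; r < hs_rounds; r++) {
        pthread_mutex_lock(&hs_mutex);
        /* Loop, not if: re-check the predicate after every wake-up */
        while (hs_count < hs_limit && !hs_stop)
            pthread_cond_wait(&hs_reached, &hs_mutex);
        if (hs_stop) {
            pthread_mutex_unlock(&hs_mutex);
            break;
        }
        if (hs_count != hs_limit)
            hs_bad_reads++;      /* would mean the producer ran ahead */
        hs_count = 0;
        pthread_cond_signal(&hs_consumed);
        pthread_mutex_unlock(&hs_mutex);
    }
    return NULL;
}

static void *hs_producer(void *unused)
{
    (void) unused;
    for (int r = 0; r < hs_rounds; r++) {
        for (int i = 0; i < hs_limit; i++) {
            pthread_mutex_lock(&hs_mutex);
            hs_count++;
            if (hs_count == hs_limit) {
                pthread_cond_signal(&hs_reached);
                /* Wait here until the consumer has reset the counter */
                while (hs_count == hs_limit)
                    pthread_cond_wait(&hs_consumed, &hs_mutex);
            }
            pthread_mutex_unlock(&hs_mutex);
        }
    }
    return NULL;
}

/* Returns how often the consumer saw a value other than n (-1 on error). */
int run_handshake(int rounds, int n)
{
    pthread_t consumer, producer;

    if (rounds < 0 || n < 1)
        return -1;
    hs_count = 0; hs_limit = n; hs_rounds = rounds;
    hs_stop = 0; hs_bad_reads = 0;

    if (pthread_create(&consumer, NULL, hs_consumer, NULL) != 0)
        return -1;
    if (pthread_create(&producer, NULL, hs_producer, NULL) != 0) {
        /* No producer will ever signal: release the consumer and bail out */
        pthread_mutex_lock(&hs_mutex);
        hs_stop = 1;
        pthread_cond_signal(&hs_reached);
        pthread_mutex_unlock(&hs_mutex);
        pthread_join(consumer, NULL);
        return -1;
    }
    pthread_join(producer, NULL);
    pthread_join(consumer, NULL);
    return hs_bad_reads;
}
```

With this handshake the producer cannot increment past the limit before the consumer has read and reset the counter, so a call such as run_handshake(5, 50000) is expected to return 0.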
I'm trying to develop a program to time limit the execution of a function. In the code below I have a function named Inc which does a lot of iterations (simulated by the infinite loop). The first part of each iteration is quite long, followed by a second part that should be pretty fast.
I don't mind preempting the execution while in the first part of the code, but I'd like to avoid the alarm going off while doing a write operation on the second part.
My first idea was to turn off the alarm before entering the 'safe region' saving the remaining time. Then after exiting, I would set the alarm up with the saved time. I don't know how to implement this. Could someone help me? Alternative methods are also welcome.
#include <pthread.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
pthread_t thread;
FILE *fout;
void *Inc(void *param){
int i;
long long int x = 0;
fout = fopen("file.txt", "w");
/* Large number of iterations */
while(1){
int k = 0;
for(i=0; i<5000000; i++)
k += (rand())%3;
x += k;
printf("%lld\n", x);
/* Enter Safe Region */
fprintf(fout, "%lld\n", x);
/* Exit Safe Region */
}
}
void Finish(int param){
pthread_cancel(thread);
fclose(fout);
}
main (){
pthread_attr_t attr;
void *status;
signal(SIGALRM, Finish);
alarm(10);
pthread_attr_init(&attr);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);
pthread_create(&thread, &attr, Inc, NULL);
pthread_attr_destroy(&attr);
pthread_join(thread, &status);
printf("Program Finished\n");
}
The obvious thing is to take a lock before calling pthread_cancel, and to hold the same lock across your "safe region".
Unfortunately, you can't wait on a mutex or a semaphore in a signal handler. But alarm signals aren't the only way to do something after 10 seconds - you could instead have your main thread go to sleep for 10 seconds, then wake up, take the lock, cancel the worker thread and then join it.
Of course this would mean that the main thread will sleep 10 seconds even if the worker thread finishes after 5 seconds. So instead of sleeping, have the main thread do a 10 second timed wait on a semaphore, which the worker thread posts when it finishes.
Like a sleep, a timed wait can complete early due to a signal, so be sure to retry the timed wait on EINTR. Your significant cases are EINTR (wait again), success (join the worker thread - no need to cancel since it has posted the semaphore), ETIMEDOUT (take the lock, cancel, join) and if you like, other errors. There are no other errors listed for sem_timedwait that should affect you, though.
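The timed wait with its EINTR retry might look roughly like the sketch below. The helper name wait_for_worker and its return convention are assumptions for illustration. The caller is expected to map 0 to "join", 1 to "take the lock, cancel, join", and -1 to an error path.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <semaphore.h>
#include <time.h>

/*
 * Hypothetical helper: waits up to `seconds` for the worker to post
 * `done`. Returns 0 if the semaphore was posted, 1 on timeout, and
 * -1 on any other error.
 */
int wait_for_worker(sem_t *done, time_t seconds)
{
    struct timespec deadline;

    /* sem_timedwait() takes an absolute CLOCK_REALTIME deadline */
    if (clock_gettime(CLOCK_REALTIME, &deadline) < 0)
        return -1;
    deadline.tv_sec += seconds;

    for (;;) {
        if (sem_timedwait(done, &deadline) == 0)
            return 0;           /* worker finished on its own */
        if (errno == ETIMEDOUT)
            return 1;           /* time is up */
        if (errno != EINTR)
            return -1;          /* unexpected failure */
        /* EINTR: a signal interrupted the wait, try again */
    }
}
```

Because the deadline is absolute, retrying after EINTR does not extend the total waiting time.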
Another idea is to block SIGALRM across your "safe region", which would be simpler except that (a) I can never remember how to safely do I/O with signals disabled, and (b) your worker thread could probably be running "simultaneously" with the signal handler (either truly simultaneous or apparently so due to pre-emption), meaning that the signal could be taken, then your worker disables signals and enters the critical region, then the signal handler cancels it. If nobody answers who can actually remember the details, here's some information on blocking signals.
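For reference, blocking SIGALRM around the write could be sketched as below, assuming pthread_sigmask() is used because the program is multithreaded. The helper name safe_write is invented for this example. Note that this only defers delivery to the calling thread, so on its own it does not address caveat (b): another thread that leaves SIGALRM unblocked can still receive the signal and cancel the worker.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stdio.h>

/*
 * Hypothetical helper: writes one value to `out` while SIGALRM is
 * blocked in the calling thread. Returns 0 on success, -1 on failure.
 */
int safe_write(FILE *out, long long value)
{
    sigset_t block, old;
    int rc;

    sigemptyset(&block);
    sigaddset(&block, SIGALRM);

    /* Enter safe region: SIGALRM stays pending for this thread */
    if (pthread_sigmask(SIG_BLOCK, &block, &old) != 0)
        return -1;

    rc = (fprintf(out, "%lld\n", value) < 0 || fflush(out) != 0) ? -1 : 0;

    /* Exit safe region: restore the previous mask; a pending SIGALRM
     * can be delivered to this thread from here on */
    if (pthread_sigmask(SIG_SETMASK, &old, NULL) != 0)
        return -1;
    return rc;
}
```

One way to make this approach work would be to block SIGALRM in every other thread as well, so that only the worker can ever receive it.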