I am just wondering if there is a way to kill a userspace program from a kernel module. I know that the kill command won't work, as it is a system call from userspace to kernel space.
This code will kill the calling process...
int signum = SIGKILL;
struct task_struct *task = current; /* the calling process */
struct siginfo info;
int ret;
memset(&info, 0, sizeof(struct siginfo));
info.si_signo = signum;
ret = send_sig_info(signum, &info, task);
if (ret < 0) {
    printk(KERN_INFO "error sending signal\n");
}
You can see how the OOM killer does it here...
http://lxr.free-electrons.com/source/mm/oom_kill.c?v=3.16#L516
If you know what syscall can be used by userspace to deliver signals, why can't you check how it is implemented? More importantly though, why do you think you need to send a signal in the first place? How do you determine what to signal in the first place?
Is this another beyond terrible college assignment?
Related
Would it be possible to wake up a thread that is waiting on a futex lock? I tried
using a signal mechanism but it does not seem to work. Are there any other approaches
I could try out? Below, I've added in an example that might be similar to what I'm
trying to achieve.
I have a thread A that acquires a futex lock "lockA" as follows :-
ret = syscall(__NR_futex, &lockA, FUTEX_LOCK_PI, 1, 0, NULL, 0);
I have a thread B that tries to acquire the futex lock "lockA", and blocks in the kernel,
as thread A has acquired the lock.
ret = syscall(__NR_futex, &lockA, FUTEX_LOCK_PI, 1, 0, NULL, 0);
If thread B does acquire lockA, another thread, thread C will know about it. If thread B
does not acquire the lock, thread C would like thread B to stop waiting for the lock, and
do something else.
So basically, at this point I'm trying to figure out if I can make thread C "signal" thread B
so that it won't block in the kernel anymore. In order to do that, I set a signal handler in
thread B as follows :-
struct sigaction act;
act.sa_handler = handler;
sigemptyset(&act.sa_mask);
act.sa_flags = 0;
act.sa_restorer = NULL;
sigaction(SIGSYS, &act, NULL);
...
...
void handler(int sig) {
    fprintf(stderr, "Inside the handler, outta the kernel\n");
}
From thread C I try to send the signal as :-
pthread_kill(tid_of_B, SIGSYS);
What am I doing wrong? Can thread B be woken up at all? If so, should I use another approach?
[EDIT]
Based on a comment below, I tried checking the return value from pthread_kill and realised that the call was not returning.
A few things.
You're using FUTEX_LOCK_PI which is not in the man page. I've just looked at the kernel source and a document and it appears this version is only for use inside the kernel itself. It's used to implement a "PI mutex" as a replacement for a kernel spinlock.
If you use a futex, you must implement the semantics on data in the address it points to.
Here is a crude, pseudocode, and possibly/probably wrong example:
int mysem = 1;
void
lock(void)
{
    // atomic_dec returns new value
    while (1) {
        if (atomic_dec(&mysem) == 0)
            break;
        futex(&mysem,FUTEX_WAIT,...)
    }
}
void
unlock(void)
{
    // non_atomic_swap returns old value
    if (non_atomic_swap(&mysem,1) != 0)
        futex(&mysem,FUTEX_WAKE,...)
}
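For reference, a runnable variant of the idea above might look like the sketch below. It is not the pseudocode's counter scheme: it swaps in the three-state lock word (0 unlocked, 1 locked, 2 locked with possible waiters) described in Ulrich Drepper's "Futexes Are Tricky". It assumes Linux, C11 atomics, and that atomic_int has the same layout as the 32-bit int the kernel expects. The names futex_mutex, futex_op, futex_mutex_lock and futex_mutex_unlock are made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <linux/futex.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Lock word: 0 = unlocked, 1 = locked, 2 = locked with possible waiters. */
typedef struct { atomic_int state; } futex_mutex;

/* Thin wrapper around the raw futex system call (glibc has no wrapper). */
static long futex_op(atomic_int *addr, int op, int val)
{
    return syscall(SYS_futex, addr, op, val, NULL, NULL, 0);
}

static void futex_mutex_lock(futex_mutex *m)
{
    int c = 0;

    /* Fast path: 0 -> 1 without entering the kernel. */
    if (atomic_compare_exchange_strong(&m->state, &c, 1))
        return;

    /* Slow path: mark the lock contended, then sleep while it stays 2. */
    if (c != 2)
        c = atomic_exchange(&m->state, 2);
    while (c != 0) {
        /* Returns early if the word is no longer 2 or a signal arrives;
         * either way the loop re-checks the lock word. */
        futex_op(&m->state, FUTEX_WAIT, 2);
        c = atomic_exchange(&m->state, 2);
    }
}

static void futex_mutex_unlock(futex_mutex *m)
{
    /* Old value 2 means someone may be sleeping: wake one waiter. */
    if (atomic_exchange(&m->state, 0) == 2)
        futex_op(&m->state, FUTEX_WAKE, 1);
}
```

A thread sleeping in FUTEX_WAIT comes back to user space with EINTR when a signal handler installed without SA_RESTART runs, so a variant of the lock loop could check a caller-supplied flag at that point and give up instead of retrying.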
I have a library that accesses a hardware resource (SPI) via a 3rd party library. My library, and in turn the SPI resource, is accessed by multiple processes so I need to lock the resource with semaphores, the lock functions are below:
static int spi_lock(void)
{
struct timespec ts;
if (clock_gettime(CLOCK_REALTIME, &ts) == -1)
{
syslog(LOG_ERR,"failed to read clock: %s\n", strerror(errno));
return 3;
}
ts.tv_sec += 5;
if (sem_timedwait(bcoms->spisem, &ts) == -1)
{
syslog(LOG_ERR,"timed out trying to acquire %s: %s\n", SPISEM, strerror(errno));
return 1;
}
return 0;
}
static int spi_unlock(void)
{
int ret = 1;
if (sem_post(bcoms->spisem))
{
syslog(LOG_ERR,"failed to release %s: %s\n", SPISEM, strerror(errno));
goto done;
}
ret = 0;
done:
return ret;
}
Now my problem is the library is used in a daemon and that daemon is stopped via a kill signal. Sometimes I get the kill signal while I am holding the semaphore lock and hence the servers cannot be restarted successfully because the lock is perpetually taken. To fix this I am trying to block the signals as shown below (I am waiting for hardware to test this on atm):
static int spi_lock(void)
{
sigset_t nset;
struct timespec ts;
sigfillset(&nset);
sigprocmask(SIG_BLOCK, &nset, NULL);
if (clock_gettime(CLOCK_REALTIME, &ts) == -1)
{
syslog(LOG_ERR,"failed to read clock: %s\n", strerror(errno));
return 3;
}
ts.tv_sec += 5; // 5 seconds to acquire the semaphore is HEAPS, so we better bloody get it !!!
if (sem_timedwait(bcoms->spisem, &ts) == -1)
{
syslog(LOG_ERR,"timed out trying to acquire %s: %s\n", SPISEM, strerror(errno));
return 1;
}
return 0;
}
static int spi_unlock(void)
{
sigset_t nset;
int ret = 1;
if (sem_post(bcoms->spisem))
{
syslog(LOG_ERR,"failed to release %s: %s\n", SPISEM, strerror(errno));
goto done;
}
sigfillset(&nset);
sigprocmask(SIG_UNBLOCK, &nset, NULL);
ret = 0;
done:
return ret;
}
But having read the man page for sigprocmask(), it says to use pthread_sigmask() in a multi-threaded program, and one of the servers I want to protect will be multi-threaded. What I don't understand is this: if I use pthread_sigmask() in the library, and the main parent thread spawns an SPI read thread that uses those locking functions in my library, the read thread will be protected, but can't the main thread still receive the kill signal and take down the daemon while I am holding the lock with signals disabled on the read thread, getting me nowhere? If so, is there a better solution to this locking problem?
Thanks.
Indeed you've analyzed the problem correctly - masking signals does not protect you. Masking signals is not the right tool to prevent process termination with shared data (like files or shared semaphores) in an inconsistent state.
What you probably should be doing, if you want to exit gracefully on certain signals, is having the program install signal handlers to catch the termination request and feed it into your normal program logic. There are several approaches you can use:
Send the termination request over a pipe to yourself. This works well if your program is structured around a poll loop that can wait for input on a pipe.
Use sem_post, the one async-signal-safe synchronization function, to report the signal to the rest of the program.
Start a dedicated signal-handling thread from the main thread then block all signals in the main thread (and, by inheritance, all other new threads). This thread can just do for(;;) pause(); and since pause is async-signal-safe, you can call any functions you want from the signal handlers -- including the pthread sync functions needed for synchronizing with other threads.
Note that this approach will still not be "perfect" since you can never catch or block SIGKILL. If a user decides to kill your process with SIGKILL (kill -9) then the semaphore can be left in a bad state and there's nothing you can do.
I don't think your approach will work. You can not block SIGKILL or SIGSTOP. Unless you are saying that the daemon is getting a different signal (like SIGHUP). But even then I think it's bad practice to block all signals from a library call. That can result in adverse effects on the calling application. For example, the application may be relying on particular signals and missing any such signals could cause it to function incorrectly.
As it turns out there probably isn't an easy way to solve your problem using semaphores. So an alternative approach is to use something like "flock" instead. That solves your problem because it is based on open file descriptors. If a process dies holding an flock the associated file descriptor will be automatically closed and hence will free the flock.
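A minimal sketch of the flock idea, assuming a lock file that every process using the SPI library can open. The helper names spi_flock_acquire and spi_flock_release, and whatever path is passed in, are hypothetical.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: open (creating if needed) a lock file and take an
 * exclusive flock() on it. Returns the descriptor, or -1 on failure. */
static int spi_flock_acquire(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT, 0666);

    if (fd == -1)
        return -1;
    if (flock(fd, LOCK_EX) == -1) {   /* blocks until the lock is free */
        close(fd);
        return -1;
    }
    return fd;
}

/* Release the lock explicitly; closing the descriptor drops it as well,
 * which is what happens automatically when the process dies. */
static int spi_flock_release(int fd)
{
    int ret = flock(fd, LOCK_UN);

    close(fd);
    return ret;
}
```

Unlike sem_timedwait, flock has no timeout. If the 5-second limit matters, one option is to poll with LOCK_EX | LOCK_NB, which fails immediately instead of blocking while another holder has the lock.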
I've written a program that uses SIGALRM and a signal handler.
I'm now trying to add this as a test module within the kernel.
I found that I had to replace a lot of the functions that libc provides with their underlying syscalls, for example timer_create with sys_timer_create, timer_settime with sys_timer_settime, and so on.
However, I'm having issues with sigaction.
Compiling the kernel throws the following error
arch/arm/mach-vexpress/cpufreq_test.c:157:2: error: implicit declaration of function 'sys_sigaction' [-Werror=implicit-function-declaration]
I've attached the relevant code block below
int estimate_from_cycles() {
timer_t timer;
struct itimerspec old;
struct sigaction sig_action;
struct sigevent sig_event;
sigset_t sig_mask;
memset(&sig_action, 0, sizeof(struct sigaction));
sig_action.sa_handler = alarm_handler;
sigemptyset(&sig_action.sa_mask);
VERBOSE("Blocking signal %d\n", SIGALRM);
sigemptyset(&sig_mask);
sigaddset(&sig_mask, SIGALRM);
if(sys_sigaction(SIGALRM, &sig_action, NULL)) {
ERROR("Could not assign sigaction\n");
return -1;
}
if (sigprocmask(SIG_SETMASK, &sig_mask, NULL) == -1) {
ERROR("sigprocmask failed\n");
return -1;
}
memset (&sig_event, 0, sizeof (struct sigevent));
sig_event.sigev_notify = SIGEV_SIGNAL;
sig_event.sigev_signo = SIGALRM;
sig_event.sigev_value.sival_ptr = &timer;
if (sys_timer_create(CLOCK_PROCESS_CPUTIME_ID, &sig_event, &timer)) {
ERROR("Could not create timer\n");
return -1;
}
if (sigprocmask(SIG_UNBLOCK, &sig_mask, NULL) == -1) {
ERROR("sigprocmask unblock failed\n");
return -1;
}
cycles = 0;
VERBOSE("Entering main loop\n");
if(sys_timer_settime(timer, 0, &time_period, &old)) {
ERROR("Could not set timer\n");
return -1;
}
while(1) {
ADD(CYCLES_REGISTER, 1);
}
return 0;
}
Is such an approach of taking user-space code and changing the calls alone sufficient to run the code in kernel-space?
Is such an approach of taking user-space code and changing the calls
alone sufficient to run the code in kernel-space?
Of course not! What you are doing is calling the implementation of a system call directly from kernel space, but there is no guarantee that the sys_ function has the same definition as the system call. The correct approach is to find the kernel routine that does what you need. Unless you are writing a driver or a kernel feature, you don't need to write kernel code. System calls should only be invoked from user space. Their main purpose is to offer a safe way to access low-level mechanisms offered by the operating system, such as the file system, sockets and so on.
Regarding signals: trying to use the signal system calls from kernel space in order to receive a signal is a TERRIBLE idea. A process sends a signal to another process, and signals are meant to be used in user space, that is, between user-space processes. Typically, when you send a signal to another process and the signal is not masked, the receiving process is interrupted and the signal handler is executed. Note that achieving this requires two switches between user space and kernel space.
However, the kernel has its own internal tasks, which have exactly the same structure as user-space tasks with some differences (e.g. memory mapping, parent process, etc.). Of course you cannot send a signal from a user process to a kernel thread (imagine what would happen if you sent a SIGKILL to a crucial component). Since kernel threads have the same structure as user-space threads, they can receive signals, but their default behaviour is to drop them unless specified otherwise.
I'd recommend changing your code to send a signal from kernel space to user space rather than trying to receive one. (How would you send a signal to kernel space? Which pid would you specify?) This may be a good starting point: http://people.ee.ethz.ch/~arkeller/linux/kernel_user_space_howto.html#toc6
You are having a problem with sys_sigaction because this is the old definition of the system call. The current definition is sys_rt_sigaction.
From the kernel source 3.12 :
#ifdef CONFIG_OLD_SIGACTION
asmlinkage long sys_sigaction(int, const struct old_sigaction __user *,
struct old_sigaction __user *);
#endif
#ifndef CONFIG_ODD_RT_SIGACTION
asmlinkage long sys_rt_sigaction(int,
const struct sigaction __user *,
struct sigaction __user *,
size_t);
#endif
BTW, you should not call any of them, they are meant to be called from user space.
You're working in kernel space so you should start thinking like you're working in kernel space instead of trying to port a userspace hack into the kernel. If you need to call the sys_* family of functions in kernel space, 99.95% of the time, you're already doing something very, very wrong.
Instead of while (1), have the loop break on a volatile variable, and start a thread that simply sleeps and changes the value of the variable when it finishes.
I.e.
void some_function(volatile int *condition) {
sleep(x);
*condition = 0;
}
volatile int condition = 1;
start_thread(some_function, &condition);
while(condition) {
ADD(CYCLES_REGISTER, 1);
}
However, what you're doing (I'm assuming you're trying to get the number of cycles the CPU is operating at) is inherently impossible on a preemptive kernel like Linux without a lot of hacking. If you keep interrupts on, your cycle count will be inaccurate since your kernel thread may be switched out at any time. If you turn interrupts off, other threads won't run and your code will just infinite loop and hang the kernel.
Are you sure you can't simply use the BogoMIPS value from the kernel? It is essentially what you're trying to measure, but the kernel does it very early in the boot process and does it right.
I have been pulling my hair out over a really strange issue. The kernel module is unable to send a signal to the user application (or the user app is unable to receive it) without a printk; I have to do a dummy printk before or after sending the signal.
Actually, it works great even with an empty printk. But I am trying to understand what's happening.
Any thoughts?
Here is what's happening:
A - kernel)
The char device module gets an interrupt.
It extracts the data and sends a signal to the user process.
/* have to do printk here */
It returns from the IRQ handler.
B - user)
Receives the signal.
Issues a system call and reads the data from the char device's buffer (copy_to_user).
kernel:
irqreturn_t irq_handler(int irq, void *dev_id) {
    struct task_struct *p;
    int i;
    for (i = 0; i < 32; i++)
        GPIOdata[i] = read_gpio_status(i);
    p = find_task_by_pid(processinfo.pid);
    if (NULL == p)
        return IRQ_HANDLED;
    send_sig(SIGUSR1, p, 0);
    /* have to add printk here */
    return IRQ_HANDLED;
}
user:
void signal_handler(int sig) {
char data[32];
ioctl(fd, READ_Data_from_Char_device, &data);
}
If you are using signal rather than sigaction to set the handler, remember that on some systems signal resets the handler to the default once a signal has been delivered. You should also mask the signal so that it will not interrupt your process while it is running inside the signal handler. I'm also not sure about calling ioctl inside a handler (see the signal(7) man page, section "Async-signal-safe functions").
Calls to printk might slow down execution of other operations (because they are blocked on I/O or buffering) around these calls, so they can make synchronization slower (thus any mistakes in synchronization may not occur).
I discovered an issue with thread implementation, that is strange to me. Maybe some of you can explain it to me, would be great.
I am working on something like a proxy: a program (running on different machines) that receives packets over eth0 and sends them through ath0 (wireless) to another machine, which is doing exactly the same thing. Actually, I am not at all sure what is causing my problem, because I am new to all of this, both Linux and C programming.
I start two threads,
one is listening (socket) on eth0 for incoming packets and sends it out through ath0 (also socket)
and the other thread is listening on ath0 and sends through eth0.
If I use threads, I get an error like that:
sh-2.05b# ./socketex
Failed to send network header packet.
: Interrupted system call
If I use fork(), the program works as expected.
Can someone explain that behaviour to me?
Just to show the sender implementation here comes its code snippet:
while(keep_going) {
memset(&buffer[0], '\0', sizeof(buffer));
recvlen = recvfrom(sockfd_in, buffer, BUFLEN, 0, (struct sockaddr *) &incoming, &ilen);
if(recvlen < 0) {
perror("something went wrong / incoming\n");
exit(-1);
}
strcpy(msg, buffer);
buflen = strlen(msg);
sentlen = ath_sendto(sfd, &btpinfo, &addrnwh, &nwh, buflen, msg, &selpv2, &depv);
if(sentlen == E_ERR) {
perror("Failed to send network header packet.\n");
exit(-1);
}
}
UPDATE: my main file, starting either threads or processes (fork)
int main(void) {
port_config pConfig;
memset(&pConfig, 0, sizeof(pConfig));
pConfig.inPort = 2002;
pConfig.outPort = 2003;
pid_t retval = fork();
if(retval == 0) {
// child process
pc2wsuThread((void *) &pConfig);
} else if (retval < 0) {
perror("fork not successful\n");
} else {
// parent process
wsu2pcThread((void *) &pConfig);
}
/*
wint8 rc1, rc2 = 0;
pthread_t pc2wsu;
pthread_t wsu2pc;
rc1 = pthread_create(&pc2wsu, NULL, pc2wsuThread, (void *) &pConfig);
rc2 = pthread_create(&wsu2pc, NULL, wsu2pcThread, (void *) &pConfig);
if(rc1) {
printf("error: pthread_create() is %d\n", rc1);
return(-1);
}
if(rc2) {
printf("error: pthread_create() is %d\n", rc2);
return(-1);
}
pthread_join(pc2wsu, NULL);
pthread_join(wsu2pc, NULL);
*/
return 0;
}
Does it help?
update 05/30/2011
-sh-2.05b# ./wsuproxy 192.168.1.100
mgmtsrvc
mgmtsrvc
Failed to send network header packet.
: Interrupted system call
13.254158,75.165482,DATAAAAAAmgmtsrvc
mgmtsrvc
mgmtsrvc
Still get the interrupted system call, as you can see above.
I blocked all signals as follows:
sigset_t signal_mask;
sigfillset(&signal_mask);
sigprocmask(SIG_BLOCK, &signal_mask, NULL);
The two threads are working on the same interfaces, but on different ports. The problem seems to appear still in the same place (please find it in the first code snippet). I can't go further and have not enough knowledge of how to solve that problem. Maybe some of you can help me here again.
Thanks in advance.
EINTR does not itself indicate an error. It means that your process received a signal while it was in the sendto syscall, and that syscall hadn't sent any data yet (that's important).
You could retry the send in this case, but a good thing would be to figure out what signal caused the interruption. If this is reproducible, try using strace.
If you're the one sending the signal, well, you know what to do :-)
Note that on Linux, you can receive EINTR on sendto (and some other functions) even if you haven't installed a handler yourself. This can happen if:
the process is stopped (via SIGSTOP for example) and restarted (with SIGCONT)
you have set a send timeout on the socket (via SO_SNDTIMEO)
See the signal(7) man page (at the very bottom) for more details.
So if you're "suspending" your service (or something else is), that EINTR is expected and you should restart the call.
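Restarting the call can be as simple as the sketch below. It uses plain sendto rather than the ath_sendto wrapper from the question (whose error reporting is not shown), and the helper name sendto_retry is made up for illustration.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: retry sendto() for as long as it fails with EINTR.
 * Any other error, or success, is returned to the caller unchanged. */
static ssize_t sendto_retry(int fd, const void *buf, size_t len, int flags,
                            const struct sockaddr *dest, socklen_t destlen)
{
    ssize_t n;

    do {
        n = sendto(fd, buf, len, flags, dest, destlen);
    } while (n == -1 && errno == EINTR);

    return n;
}
```

The same loop shape applies to recvfrom. It only hides the interruption, so it is still worth finding out which signal is arriving.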
Keep in mind, if you are using threads with signals, that a given signal, when delivered to the process, could be delivered to any thread whose signal mask is not blocking it. That means if you have blocked incoming signals in one thread and not in another, the non-blocking thread will receive the signal, and if there is no signal handler set up for the signal, you will end up with the default behavior of that signal for the entire process (i.e., all the threads, both signal-blocking and non-signal-blocking). For instance, if the default behavior of a signal is to terminate the process, one thread receiving that signal and executing its default behavior will terminate the entire process, for all the threads, even though some threads may have been masking the signal. Also, if you have two threads that are not blocking a signal, it is not deterministic which thread will handle it. Therefore mixing signals and threads is typically not a good idea, but there are exceptions to the rule.
One thing you can try, since the signal mask of a spawned thread is inherited from the creating thread, is a dedicated signal-handling ("daemon") thread: at the start of your program, block all incoming signals (or at least all non-important signals), and then spawn your threads. Those spawned threads will not be interrupted by any signals in the inherited blocked mask. If you need to handle some specific signals, you can still make those signals part of the blocked signal mask for the main thread, and then spawn your threads. But when you're spawning the threads, leave one thread (it could even be the main thread after it has spawned all the worker threads) as a "daemon" thread waiting for those specific incoming (and now blocked) signals using sigwait(). That thread will then dispatch whatever functions are necessary when a given signal is received by the process. This will prevent signals from interrupting system calls in your other worker threads, yet still allow you to handle signals.
The reason your forked version may not be having issues is because if a signal arrives at one parent process, it is not propagated to any child processes. So I would try, if you can, to see what signal it is that is terminating your system call, and in your threaded version, block that signal, and if you need to handle it, create a daemon-thread that will handle that signal's arrival, with the rest of the threads blocking that signal.
Finally, if you don't have access to any external libraries or debuggers, etc. to see what signals are arriving, you can set up a simple procedure for seeing what signals might be arriving. You can try this code:
#include <signal.h>
#include <stdio.h>
int main(void)
{
    //block all incoming signals; threads spawned later inherit this mask
    sigset_t signal_mask;
    int arrived_signal;
    sigfillset(&signal_mask);
    pthread_sigmask(SIG_BLOCK, &signal_mask, NULL);
    //... spawn your threads here ...
    //... now wait for signals to arrive and see what comes in ...
    while(1) //you can change this condition to whatever to exit the loop
    {
        sigwait(&signal_mask, &arrived_signal);
        switch(arrived_signal)
        {
            case SIGABRT: fprintf(stderr, "SIGABRT signal arrived\n"); break;
            case SIGALRM: fprintf(stderr, "SIGALRM signal arrived\n"); break;
            //continue for the rest of the signals defined in signal.h ...
            default: fprintf(stderr, "Unrecognized signal arrived\n"); break;
        }
    }
    //clean-up your threads and anything else needing clean-up
    return 0;
}