Is the data in siginfo trustworthy? - c

I've found that on Linux, by making my own call to the rt_sigqueueinfo syscall, I can put whatever I like in the si_uid and si_pid fields, and the call succeeds and happily delivers the incorrect values. Naturally, the uid restrictions on sending signals provide some protection against this kind of spoofing, but I'm worried it may be dangerous to rely on this information. Is there any good documentation on the topic I could read? Why does Linux allow the obviously incorrect behavior of letting the caller specify the siginfo parameters rather than generating them in kernelspace? It seems nonsensical, especially since extra syscalls (and thus performance cost) may be required in order to get the uid/pid in userspace.
Edit: Based on my reading of POSIX (emphasis added by me):
If si_code is SI_USER or SI_QUEUE, [XSI] or any value less than or equal to 0, then the signal was generated by a process and si_pid and si_uid shall be set to the process ID and the real user ID of the sender, respectively.
I believe this behavior by Linux is non-conformant and a serious bug.

That section of the POSIX page you quote also lists what si_code means, and here's the meaning:
SI_QUEUE
The signal was sent by the sigqueue() function.
That section goes on to say:
If the signal was not generated by one of the functions or events listed above, si_code shall be set either to one of the signal-specific values described in XBD <signal.h>, or to an implementation-defined value that is not equal to any of the values defined above.
Nothing is violated if only the sigqueue() function uses SI_QUEUE. Your scenario involves code other than the sigqueue() function using SI_QUEUE. The question is whether POSIX envisions an operating system enforcing that only a specified library function (as opposed to some function which is not a POSIX-defined library function) be permitted to make a system call with certain characteristics. I believe the answer is "no".
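For comparison, here is a minimal sketch of the conformant path: a process queues a signal to itself with the library sigqueue() and reads back the siginfo_t with sigwaitinfo(). The helper name queue_to_self is made up for illustration. On this path si_code is SI_QUEUE and si_pid/si_uid do describe the sender, which is the behaviour the quoted POSIX text has in mind.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: queue `signo` to ourselves with sigqueue() and fetch
 * the resulting siginfo_t synchronously.
 * Returns 0 on success, -1 on error. */
static int queue_to_self(int signo, int payload, siginfo_t *out)
{
    sigset_t set, old;
    union sigval value;
    int rc = -1;

    sigemptyset(&set);
    sigaddset(&set, signo);
    /* Block the signal so it stays pending instead of running a handler. */
    if (sigprocmask(SIG_BLOCK, &set, &old) != 0)
        return -1;

    memset(out, 0, sizeof *out);
    value.sival_int = payload;
    if (sigqueue(getpid(), signo, value) == 0 &&
        sigwaitinfo(&set, out) == signo)
        rc = 0;

    /* Restore the caller's signal mask; the signal was already consumed. */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return rc;
}
```

After a successful queue_to_self(SIGUSR1, 41, &info), the fields info.si_code, info.si_pid, info.si_uid and info.si_value.sival_int can be inspected directly.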
EDIT as of 2011-03-26, 14:00 PST:
This edit is in response to R..'s comment from eight hours ago, since the page wouldn't let me leave an adequately voluminous comment:
I think you're basically right. But either a system is POSIX compliant or it is not. If a non-library function does a syscall which results in a non-compliant combination of uid, pid, and 'si_code', then the second statement I quoted makes it clear that the call itself is not compliant. One can interpret this in two ways. One way is: "If a user breaks this rule, then he makes the system non-compliant." But you're right, I think that's silly. What good is a system when any nonprivileged user can make it noncompliant? The fix, as I see it, is somehow to have the system know that it's not the library 'sigqueue()' making the system call; the kernel itself should then set 'si_code' to something other than 'SI_QUEUE', and leave the uid and pid as you set them. In my opinion, you should raise this with the kernel folks. They may have difficulty, however; I don't know of any secure way for them to detect whether a syscall is made by a particular library function, seeing as how the library functions, almost by definition, are merely convenience wrappers around the syscalls. And that may be the position they take, which I know will be a disappointment.
(voluminous) EDIT as of 2011-03-26, 18:00 PST:
Again because of limitations on comment length.
This is in response to R..'s comment of about an hour ago.
I'm a little new to the syscall subject, so please bear with me.
By "the kernel sysqueue syscall", do you mean the `__NR_rt_sigqueueinfo' call? That's the only one that I found when I did this:
grep -Ri 'NR.*queue' /usr/include
If that's the case, I think I'm not understanding your original point. The kernel will let (non-root) me use SI_QUEUE with a faked pid and uid without error. If I have the sending side coded thus:
#include <sys/syscall.h>
#include <sys/types.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

int main(int argc,
         char **argv
        )
{
    long john_silver;
    siginfo_t my_siginfo;

    if(argc!=2)
    {
        fprintf(stderr,"missing pid argument\n");
        exit(1);
    }
    john_silver=strtol(argv[1],NULL,0);

    /* First event: an ordinary kill(); the kernel fills in si_pid and si_uid. */
    if(kill(john_silver,SIGUSR1))
    {
        fprintf(stderr,"kill() fail\n");
        exit(1);
    }
    sleep(1);

    /* Second event: raw syscall with SI_QUEUE and the true pid and uid. */
    memset(&my_siginfo,0,sizeof(my_siginfo));
    my_siginfo.si_signo=SIGUSR1;
    my_siginfo.si_code=SI_QUEUE;
    my_siginfo.si_pid=getpid();
    my_siginfo.si_uid=getuid();
    my_siginfo.si_value.sival_int=41;
    if(syscall(__NR_rt_sigqueueinfo,john_silver,SIGUSR1,&my_siginfo))
    {
        perror("syscall()");
        exit(1);
    }
    sleep(1);

    /* Third event: same syscall, but with a deliberately wrong pid and uid. */
    my_siginfo.si_signo=SIGUSR2;
    my_siginfo.si_code=SI_QUEUE;
    my_siginfo.si_pid=getpid()+1;
    my_siginfo.si_uid=getuid()+1;
    my_siginfo.si_value.sival_int=42;
    if(syscall(__NR_rt_sigqueueinfo,john_silver,SIGUSR2,&my_siginfo))
    {
        perror("syscall()");
        exit(1);
    }
    return 0;
} /* main() */
and the receiving side coded thus:
#include <sys/types.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Written by the handler, polled by main(). */
volatile sig_atomic_t signaled_flag=0;
siginfo_t received_information;

void
my_handler(int signal_number,
           siginfo_t *signal_information,
           void *we_ignore_this
          )
{
    memmove(&received_information,
            signal_information,
            sizeof(received_information)
           );
    signaled_flag=1;
} /* my_handler() */

/*--------------------------------------------------------------------------*/

int
main(void)
{
    pid_t myself;
    struct sigaction the_action;

    myself=getpid();
    printf("signal receiver is process %d\n",myself);
    the_action.sa_sigaction=my_handler;
    sigemptyset(&the_action.sa_mask);
    the_action.sa_flags=SA_SIGINFO;
    if(sigaction(SIGUSR1,&the_action,NULL))
    {
        fprintf(stderr,"sigaction(SIGUSR1) fail\n");
        exit(1);
    }
    if(sigaction(SIGUSR2,&the_action,NULL))
    {
        fprintf(stderr,"sigaction(SIGUSR2) fail\n");
        exit(1);
    }
    for(;;)
    {
        while(!signaled_flag)
        {
            sleep(1);
        }
        printf("si_signo: %d\n",received_information.si_signo);
        printf("si_pid : %d\n",received_information.si_pid );
        printf("si_uid : %d\n",received_information.si_uid );
        if(received_information.si_signo==SIGUSR2)
        {
            break;
        }
        signaled_flag=0;
    }
    return 0;
} /* main() */
I can then run (non-root) the receiving side thus:
wally:~/tmp/20110326$ receive
signal receiver is process 9023
si_signo: 10
si_pid : 9055
si_uid : 4000
si_signo: 10
si_pid : 9055
si_uid : 4000
si_signo: 12
si_pid : 9056
si_uid : 4001
wally:~/tmp/20110326$
And see this (non-root) on the send end:
wally:~/tmp/20110326$ send 9023
wally:~/tmp/20110326$
As you can see, the third event has spoofed pid and uid. Isn't that what you originally objected to? There's no EINVAL or EPERM in sight. I guess I'm confused.

I agree that si_uid and si_pid should be trustworthy, and if they are not it is a bug. However, this is only required if the signal is SIGCHLD generated by a state change of a child process, or if si_code is SI_USER or SI_QUEUE, or if the system supports the XSI option and si_code <= 0. Linux/glibc also pass si_uid and si_pid values in other cases; these are often not trustworthy but that is not a POSIX conformance issue.
Of course, for kill() the signal may not be queued in which case the siginfo_t does not provide any additional information.
The reason that rt_sigqueueinfo allows more than just SI_QUEUE is probably to allow implementing POSIX asynchronous I/O, message queues and per-process timers with minimal kernel support. Implementing these in userland requires the ability to send a signal with SI_ASYNCIO, SI_MESGQ and SI_TIMER respectively. I do not know how glibc allocates the resources to queue the signal beforehand; to me it looks like it does not and simply hopes rt_sigqueueinfo does not fail. POSIX clearly forbids discarding a timer expiration (async I/O completion, message arrival on a message queue) notification because too many signals are queued at the time of the expiration; the implementation should have rejected the creation or registration if there were insufficient resources. The objects have been defined carefully such that each I/O request, message queue or timer can have at most one signal in flight at a time.
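On the receiving side, one cautious approach on Linux is to decide how far to trust si_pid and si_uid from si_code. The sketch below assumes the restriction documented in rt_sigqueueinfo(2): a process signalling another process cannot supply a si_code greater than or equal to zero, so SI_USER can only come from the kernel's own kill() path, while negative codes such as SI_QUEUE may carry caller-supplied fields. The enum and function names are invented for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>

/* Hypothetical classification of the sender fields in a siginfo_t. */
enum sender_trust {
    SENDER_KERNEL_FILLED,    /* kill(2): kernel wrote si_pid and si_uid     */
    SENDER_CALLER_SUPPLIED,  /* negative si_code: sender may have lied      */
    SENDER_SIGNAL_SPECIFIC   /* positive si_code: meaning depends on signal */
};

/* Assumption: Linux semantics as described in rt_sigqueueinfo(2).
 * Negative codes are treated conservatively, even those the kernel
 * itself may generate. */
static enum sender_trust classify_sender(const siginfo_t *info)
{
    if (info->si_code == SI_USER)
        return SENDER_KERNEL_FILLED;
    if (info->si_code < 0)
        return SENDER_CALLER_SUPPLIED;
    return SENDER_SIGNAL_SPECIFIC;
}
```

A handler installed with SA_SIGINFO could call classify_sender() on its siginfo_t argument and ignore si_pid/si_uid unless the result is SENDER_KERNEL_FILLED.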

Related

How to overwrite(reset) the default behaviour of SIGUSR1?

I read that calling signal() again inside the signal handler function can override the default behaviour:
#include <stdio.h>
#include <signal.h>
#include <unistd.h>

void signalHandler();

int main(void) {
    signal(SIGUSR1, signalHandler);
    sleep(60);
    printf("I wake up");
    return 0;
}

void signalHandler() {
    signal(SIGUSR1, signalHandler); // I add this line to overwrite the default behaviour
    printf("I received the signal");
}
And I trigger it with another process
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    kill(5061, SIGUSR1); // 5061 is the receiver pid; kill(argv[1], SIGUSR1) doesn't work when I pass 5061 as a parameter
    puts("send the signal ");
    return 0;
}
The receiver process wakes up as soon as it receives the SIGUSR1 signal. How can I make the receiver continue sleeping even when it receives the signal from another process?
BTW, why does kill(5061, SIGUSR1) work (5061 is the receiver pid), while kill(argv[1], SIGUSR1) doesn't work when I pass 5061 as a parameter?
From the sleep(3) manual page:
Return Value
Zero if the requested time has elapsed, or the number of seconds left to sleep, if the call was interrupted by a signal handler.
So instead of just calling sleep you have to check the return value and call in a loop:
unsigned int sleep_time = 60;  /* sleep() takes and returns unsigned int */
while ((sleep_time = sleep(sleep_time)) > 0)
    ;
Your question (why kill(argv[1], SIGUSR1) does not work) actually is not related to signals, but to basic C programming. Please compile with gcc -Wall -g (i.e. all warnings and debugging information) on Linux, and improve your code till no warning is given.
Please read carefully the kill(2) man page (on Linux, after installing packages like manpages and manpages-dev and man-db on Ubuntu or Debian, you can type man 2 kill to read it on your computer). Read also carefully signal(2) and signal(7) man pages. Read these man pages several times.
Then, understand that the kill syscall is declared as
int kill(pid_t pid, int sig);
and since pid_t is some integral type, you need to convert argv[1] (which is a string, i.e. a char*) to some integer (kill(argv[1], SIGUSR1) should not even compile without at least a warning, since argv[1] is not an integer but a string). So please use:
kill((pid_t) atoi(argv[1]), SIGUSR1);
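Since atoi() cannot report errors, a slightly more defensive variant of the conversion above uses strtol(). This is only a sketch; the helper name parse_pid is made up, and rejecting values above INT_MAX is an assumption about how wide pid_t is.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical helper: convert a decimal string to a positive pid_t.
 * Returns 0 on success, -1 if the text is not a valid positive number. */
static int parse_pid(const char *text, pid_t *out)
{
    char *end;
    long value;

    errno = 0;
    value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0')
        return -1;   /* overflow, empty string, or trailing junk */
    if (value <= 0 || value > INT_MAX)
        return -1;   /* kill() gives 0 and negative pids a special meaning */
    *out = (pid_t)value;
    return 0;
}
```

The sender would then call kill(pid, SIGUSR1) only when parse_pid(argv[1], &pid) returns 0.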
The man page says also that you should use signal(SIGUSR1, SIG_DFL) to restore the default behavior, and signal(SIGUSR1, SIG_IGN) to ignore that signal.
Of course, you had better use sigaction(2), as the man page of signal(2) tells you.
Finally, please get into the habit of reading man pages, and spend some hours reading good books like Advanced Linux Programming and Advanced Unix Programming. They explain things much better than we can in a few minutes. If you are not familiar with C programming, also read a good book on it.
I haven't tried it before, but you could use sigaction(2) instead and set the SA_RESTART flag, this should make syscalls restartable.
Edit:
Actually this won't work, according to the man page of signal(7) not all system calls are restartable:
The sleep(3) function is also never restarted if interrupted by a
handler, but gives a success return: the number of seconds remaining
to sleep.
So you should call the sleep function again with the remaining time instead.

Signals - c99 vs gnu99

I have the following code. When I compile it with the GNU extensions (-std=gnu99), the program catches 5 SIGINTs before ending (which is what I would expect). When compiled without them (-std=c99), it ends after the second SIGINT (and only outputs one line).
What am I missing?
#include <signal.h>
#include <stdlib.h>
#include <stdio.h>

int int_stage = 0;
int got_signal = 0;

void sigint(int parameter)
{
    (void)parameter;
    got_signal = 1;
    int_stage++;
}

int main()
{
    signal(SIGINT,sigint);
    while(1)
    {
        if (got_signal)
        {
            got_signal = 0;
            puts("still alive");
            if (int_stage >= 5) exit(1);
        }
    }
    return 0;
}
Use sigaction(2) rather than signal(2).
The Linux man page has this, in particular, in the Portability section:
In the original UNIX systems, when a handler that was established using signal() was invoked by the delivery of a signal, the disposition of the signal would be reset to SIG_DFL, and the system did not block delivery of further instances of the signal. System V also provides these semantics for signal(). This was bad because the signal might be delivered again before the handler had a chance to reestablish itself. Furthermore, rapid deliveries of the same signal could result in recursive invocations of the handler.
BSD improved on this situation by changing the semantics of signal handling (but, unfortunately, silently changed the semantics when establishing a handler with signal()). On BSD, when a signal handler is invoked, the signal disposition is not reset, and further instances of the signal are blocked from being delivered while the handler is executing.
The situation on Linux is as follows:
The kernel's signal() system call provides System V semantics.
By default, in glibc 2 and later, the signal() wrapper function does not invoke the kernel system call. Instead, it calls sigaction(2) using flags that supply BSD semantics. This default behavior is provided as long as the _BSD_SOURCE feature test macro is defined. By default, _BSD_SOURCE is defined; it is also implicitly defined if one defines _GNU_SOURCE, and can of course be explicitly defined.
On glibc 2 and later, if the _BSD_SOURCE feature test macro is not defined, then signal() provides System V semantics. (The default implicit definition of _BSD_SOURCE is not provided if one invokes gcc(1) in one of its standard modes (-std=xxx or -ansi) or defines various other feature test macros such as _POSIX_SOURCE, _XOPEN_SOURCE, or _SVID_SOURCE; see feature_test_macros(7).)
Using -std=gnu99, you're getting BSD semantics. Using -std=c99, you're getting System V semantics. So the signal handler stays installed in one case (BSD), and the signal disposition is reset back to SIG_DFL in the other (System V).
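To take the compiler mode out of the picture, the handler can be installed with sigaction(2) directly. The sketch below is one way to do that; the helper and handler names are illustrative only. Defining _POSIX_C_SOURCE before any include is what keeps struct sigaction visible under -std=c99.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Counter updated from the handler; volatile sig_atomic_t is the type the
 * C standard allows a handler to write safely. */
static volatile sig_atomic_t caught = 0;

static void on_signal(int signo)
{
    (void)signo;
    caught++;
}

/* Hypothetical helper: install `handler` so that it stays installed after
 * each delivery (BSD-style), independent of -std=c99 or -std=gnu99. */
static int install_persistent_handler(int signo, void (*handler)(int))
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* no SA_RESETHAND: disposition is kept */
    return sigaction(signo, &sa, NULL);
}
```

In the program from the question, install_persistent_handler(SIGINT, sigint) would replace the signal(SIGINT,sigint) call.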
The problem is that, with these semantics, signal() also resets the signal disposition when the handler runs, so you have to install sigint as the signal handler again. From the manual:
In the original UNIX systems, when a handler that was established using signal() was invoked by the delivery of a signal, the disposition of the signal would be reset to SIG_DFL, and the system did not block delivery of further instances of the signal. System V also provides these semantics for signal(). This was bad because the signal might be delivered again before the handler had a chance to reestablish itself. Furthermore, rapid deliveries of the same signal could result in recursive invocations of the handler.
This is how to do it with the old, antiquated signal() call.
Note how int_stage and got_signal have to be volatile sig_atomic_t.
You can also only call async-signal-safe functions from a handler; the signal-safety(7) man page (signal(7) on older systems) has a list.
#include <signal.h>
#include <stdlib.h>
#include <stdio.h>

volatile sig_atomic_t int_stage = 0;
volatile sig_atomic_t got_signal = 0;

void sigint(int parameter)
{
    (void)parameter;
    got_signal = 1;
    int_stage++;
}

int main()
{
    signal(SIGINT,sigint);
    while(1)
    {
        if (got_signal)
        {
            /* Re-install the handler: System V semantics reset it to SIG_DFL. */
            signal(SIGINT,sigint);
            got_signal = 0;
            puts("still alive");
            if (int_stage >= 5) exit(1);
        }
    }
    return 0;
}
Please consider either using sigaction, or sigwait.
Sigaction would have practically the same idea, but no nonsense with re-initializing the signal handler. Sigwait would stop your thread until a signal is received. So, for sigwait, you can call any function or deal with any data. I can show you example code if you desire.
I agree with Ethan Steinberg - the "busy wait" is So Wrong...
But the problem is that you're failing to reset the signal handler. AFAIK, you must do this (call "signal(SIGINT,sigint)" again) whenever signal() gives you the old System V semantics, as it does here with -std=c99.

Linux futex syscall spurious wakes with return value 0?

I've run into an issue with the Linux futex syscall (FUTEX_WAIT operation) sometimes returning early seemingly without cause. The documentation specifies certain conditions that may cause it to return early (without a FUTEX_WAKE) but these all involve non-zero return values: EAGAIN if the value at the futex address does not match, ETIMEDOUT for timed waits that timeout, EINTR when interrupted by a (non-restarting) signal, etc. But I'm seeing a return value of 0. What, other than FUTEX_WAKE or the termination of a thread whose set_tid_address pointer points to the futex, could cause FUTEX_WAIT to return with a return value of 0?
In case it's useful, the particular futex I was waiting on is the thread tid address (set by the clone syscall with CLONE_CHILD_CLEARTID), and the thread had not terminated. My (apparently incorrect) assumption that the FUTEX_WAIT operation returning 0 could only happen when the thread terminated led to serious errors in program logic, which I've since fixed by looping and retrying even if it returns 0, but now I'm curious as to why it happened.
Here is a minimal test case:
#define _GNU_SOURCE
#include <sched.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/futex.h>
#include <signal.h>

static char stack[32768];
static int tid;

static int foo(void *p)
{
    syscall(SYS_getpid);
    syscall(SYS_getpid);
    syscall(SYS_exit, 0);
}

int main()
{
    int pid = getpid();
    for (;;) {
        int x = clone(foo, stack+sizeof stack,
                      CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND
                      |CLONE_THREAD|CLONE_SYSVSEM //|CLONE_SETTLS
                      |CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID
                      |CLONE_DETACHED,
                      0, &tid, 0, &tid);
        syscall(SYS_futex, &tid, FUTEX_WAIT, x, 0);
        /* Should fail... */
        syscall(SYS_tgkill, pid, tid, SIGKILL);
    }
}
Let it run for a while, and it should eventually terminate with Killed (SIGKILL), which is only possible if the thread still exists when the FUTEX_WAIT returns.
Before anyone goes assuming this is just the kernel waking the futex before it finishes destroying the thread (which might in fact be happening in my minimal test case here), please note that in my original code, I actually observed userspace code running in the thread well after FUTEX_WAIT returned.
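The retry loop mentioned in the question might look like the sketch below. The helper name is invented, and __atomic_load_n is a GCC/Clang builtin assumed to be available. The idea is to treat a 0 return from FUTEX_WAIT as a hint only and to re-read the futex word before concluding that the thread is gone, which is also what current futex(2) documentation recommends for wake-ups that may be spurious.

```c
#define _GNU_SOURCE
#include <linux/futex.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: block until the kernel clears *tidp, as it does on
 * thread exit when the thread was created with CLONE_CHILD_CLEARTID. */
static void wait_until_tid_cleared(int *tidp)
{
    int seen;

    /* Re-check the futex word on every iteration; never trust the
     * return value of FUTEX_WAIT on its own. */
    while ((seen = __atomic_load_n(tidp, __ATOMIC_ACQUIRE)) != 0) {
        /* May return because of a wake-up, a changed value (EAGAIN),
         * a signal (EINTR), or with 0 for no apparent reason. */
        syscall(SYS_futex, tidp, FUTEX_WAIT, seen, NULL, NULL, 0);
    }
}
```

In the test case above, wait_until_tid_cleared(&tid) would replace the single FUTEX_WAIT call.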
Could you be dealing with a race condition between whether the parent or child operations complete first? You can probably investigate this theory by putting small sleeps at the beginning of your foo() or immediately after the clone() to determine if a forced sequencing of events masks the issue. I don't recommend fixing anything in that manner, but it could be helpful to investigate. Maybe the futex isn't ready to be waited upon until the child gets further through its initialization, but the parent's clone has enough to return to the caller?
Specifically, the existence of the CLONE_VFORK option seems to imply that this is a dangerous scenario. You may need a bi-directional signaling mechanism such that the child signals the parent that it has gotten far enough that it is safe to wait for the child.

The Unreliable Signal API - Code doesnt work as expected

Basically, the expected output is that the program catches the keyboard interrupt (SIGINT) 5 times and exits the 6th time (if the 1st line of handler() is un-commented).
Now, if I comment that line out too, the behavior of the program doesn't change, even though I am using the unreliable API.
As I have used the signal() function, this is unreliable because after the first call to handler(), SIGINT should revert to its default behavior, that is, terminating a.out.
The program still quits after 5 ^C. Why?
The code works even without reinstating the handler. Why?
/* ursig1.c */
#include <stdio.h>
#include <signal.h>
#include <unistd.h>

static int count = 0;

void handler(int signo) {
    // signal(SIGINT,handler);       /* Re-instate handler */
    ++count;                         /* Increment count */
    write(1,"Got SIGINT\n",11);      /* Write message */
}

int
main(int argc,char **argv) {
    signal(SIGINT,handler);          /* Register function */
    while ( count < 5 ) {
        puts("Waiting for SIGINT..");
        sleep(4);                    /* Snooze */
    }
    puts("End.");
    return 0;
}
Have a read of the Linux man page for signal(2), under the section Portability, where it discusses the varying behaviour of signal(2) across the many different versions of Unix. In particular,
In the original Unix systems, when a handler that was established using signal() was invoked by the delivery of a signal, the disposition of the signal would be reset to SIG_DFL, and the system did not block delivery of further instances of the signal. System V also provides these semantics for signal().
This is the behaviour you are expecting, but it is not what Linux provides, as allowed by POSIX.1. You should be using sigaction(2) to install your signal handlers to get portable and defined behaviour.
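If the one-shot "unreliable" behaviour is actually wanted, it can be requested explicitly through sigaction(2) rather than left to whatever signal() happens to do. The following is a sketch with illustrative names; SA_RESETHAND resets the disposition to SIG_DFL when the handler is entered, and SA_NODEFER leaves the signal unblocked while the handler runs, which together approximate the System V semantics quoted above.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t hits = 0;

static void count_hit(int signo)
{
    (void)signo;
    hits++;
}

/* Hypothetical helper: install a handler that is used for one delivery
 * only, like signal() on the original UNIX systems. */
static int install_oneshot_handler(int signo, void (*handler)(int))
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESETHAND | SA_NODEFER;
    return sigaction(signo, &sa, NULL);
}

/* Returns 1 if `signo` currently has the default disposition,
 * 0 if not, and -1 on error. */
static int disposition_is_default(int signo)
{
    struct sigaction current;

    if (sigaction(signo, NULL, &current) != 0)
        return -1;
    return current.sa_handler == SIG_DFL;
}
```

With this installed for SIGINT, the first ^C runs the handler and the second one terminates the program, which is the behaviour the question expected from signal().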

Signal handler for SIGALRM does not work even if resetting in the handler

For the example code in section 10.6, the expected result is:
after several iterations, the static structure used by getpwnam will be corrupted, and the program will terminate with SIGSEGV signal.
But on my platform, Fedora 11, gcc (GCC) 4.4.0, the result is
[Langzi#Freedom apue]$ ./corrupt
in sig_alarm
I can see the output from sig_alarm only once, and then the program seems to hang for some reason; the process still exists and is still running.
But when I run the program under gdb, it seems OK: I see the output from sig_alarm at regular intervals.
And my manual says that the signal handler will be reset to SIG_DFL after the signal is handled, and that the system will not block the signal. So I re-install the signal handler at the beginning of my signal handler.
Maybe I should use sigaction instead, but I only want to know the reason for the difference between a normal run and a run under gdb.
Any advice and help will be appreciated.
following is my code:
#include "apue.h"
#include <pwd.h>

void sig_alarm(int signo);

int main()
{
    struct passwd *pwdptr;
    signal(SIGALRM, sig_alarm);
    alarm(1);
    for(;;) {
        if ((pwdptr = getpwnam("Zhijin")) == NULL)
            err_sys("getpwnam error");
        if (strcmp("Zhijin", pwdptr->pw_name) != 0) {
            printf("data corrupted, pw_name: %s\n", pwdptr->pw_name);
        }
    }
}

void sig_alarm(int signo)
{
    signal(SIGALRM, sig_alarm);
    struct passwd *rootptr;
    printf("in sig_alarm\n");
    if ((rootptr = getpwnam("root")) == NULL)
        err_sys("getpwnam error");
    alarm(1);
}
According to the standard, you're really not allowed to do much in a signal handler. All you are guaranteed to be able to do in the signal-handling function, without causing undefined behavior, is to call signal, and to assign a value to a volatile static object of the type sig_atomic_t.
The first few times I ran this program, on Ubuntu Linux, it looked like your call to alarm in the signal handler didn't work, so the loop in main just kept running after the first alarm. When I tried it later, the program ran the signal handler a few times, and then hung. All this is consistent with undefined behavior: the program fails, sometimes, and in various more or less interesting ways.
It is not uncommon for programs that have undefined behavior to work differently in the debugger. The debugger is a different environment, and your program and data could for example be laid out in memory in a different way, so errors can manifest themselves in a different way, or not at all.
I got the program to work by adding a variable:
volatile sig_atomic_t got_interrupt = 0;
And then I changed your signal handler to this very simple one:
void sig_alarm(int signo) {
    got_interrupt = 1;
}
And then I inserted the actual work into the infinite loop in main:
if (got_interrupt) {
    got_interrupt = 0;
    signal(SIGALRM, sig_alarm);
    struct passwd *rootptr;
    printf("in sig_alarm\n");
    if ((rootptr = getpwnam("root")) == NULL)
        perror("getpwnam error");
    alarm(1);
}
I think the "apue" you mention is the book "Advanced Programming in the UNIX Environment", which I don't have here, so I don't know if the purpose of this example is to show that you shouldn't mess around with things inside of a signal handler, or just that signals can cause problems by interrupting the normal work of the program.
According to the spec, the function getpwnam is not reentrant and is not guaranteed to be thread safe. Since you are accessing the structure in two different threads of control (signal handlers are effectively running in a different thread context), you are running into this issue. Whenever you have concurrent or parallel execution (as when using pthreads or when using a signal handler), you must be sure not to modify shared state (e.g. the structure owned by 'getpwnam'), and if you do, then appropriate locking/synchronization must be used.
Additionally, use of the signal function is discouraged in favor of the sigaction function. To ensure portable behavior when registering signal handlers, you should always use sigaction.
Using the sigaction function, you can set the SA_RESETHAND flag to have the disposition reset to the default when the handler is entered. You can also use the sigprocmask function to block and unblock the delivery of signals without modifying their handlers.
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <unistd.h>

void sigalrm_handler(int);

int main()
{
    signal(SIGALRM, sigalrm_handler);
    alarm(3);
    while(1)
    {
    }
    return 0;
}

void sigalrm_handler(int sign)
{
    printf("I am alive. Catch the sigalrm %d!\n",sign);
    alarm(3);
}
For example, my code runs in main doing nothing, and every 3 seconds my program says it is alive x)
I think that if you do as I did and call alarm with value 3 in the handler function, the problem is resolved :)

Resources