Why does the program not terminate after the signal handler changes the value? - c

I have a simple program that uses signal() with user-defined handlers.
#include <signal.h>
#include <stdio.h>
#include <zconf.h>
int x = 0;
int i = 3;
void catcher3(int signum) {
i = 1;
}
void catcher2(int signum) {
// Stuck in infinity loop here.
// Happens even with i == 0
if (i != 0) {
x = 5;
}
}
void catcher1(int signum) {
printf("i = %d\n", i);
i--;
if (i == 0) {
signal(SIGFPE, catcher2);
signal(SIGTERM, catcher3);
}
}
int main() {
signal(SIGFPE, catcher1);
x = 10 / x;
printf("Goodbye");
}
While I expect it to print:
3
2
1
Goodbye
It actually prints:
3
2
1
# Infinity loop within catcher2
My questions are:
When a user-defined handler such as catcher1 returns, at which point does execution resume? I expected the program to continue past the division, but it runs the signal handler again.
What causes the infinite loop?
How can I fix it?
Why doesn't sending SIGTERM make the program print "Goodbye"? (kill -s TERM <pid>)

As pointed out by AProgrammer, the program doesn't necessarily re-read x from memory after returning from the handler, even if x is marked volatile (which it should be anyway). This is because execution resumes at the faulting instruction, and the load from memory and the actual division can be separate instructions.
To get around this, you have to resume execution at a point before x is read from memory.
You can modify your program as follows:
#include <setjmp.h>
jmp_buf fpe;
volatile int x = 0; // Notice the volatile
volatile int i = 3;
void catcher2(int signum) {
if (i != 0) {
x = 5;
longjmp(fpe, 1); // jump back to the setjmp in main, before x is read
}
}
int main() {
signal(SIGFPE, catcher1);
setjmp(fpe);
x = 10 / x;
printf("Goodbye");
}
Rest of the functions can remain the same.
You should also not call printf from a signal handler. Instead, use write directly to print debug messages, for example:
write(1, "SIGNAL\n", sizeof("SIGNAL\n") - 1); // - 1 so the terminating NUL is not written
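If the handler needs to report a number (like the "i = %d" line in catcher1) without printf, the digits have to be produced by hand. The following is only a sketch: format_decimal and report_value are hypothetical helper names, not part of any library, and the fixed "i = " prefix is just taken from the question.

```c
#include <unistd.h>

/* Hypothetical helper: writes the decimal form of value into buf
 * (at most 11 characters for a 32-bit int, no terminating NUL)
 * and returns the number of characters produced. */
static size_t format_decimal(char buf[12], int value)
{
    char tmp[12];
    size_t n = 0, len = 0;
    /* Take the magnitude in unsigned arithmetic so INT_MIN is handled. */
    unsigned int magnitude =
        value < 0 ? 0u - (unsigned int)value : (unsigned int)value;

    do {                                  /* digits, least significant first */
        tmp[n++] = (char)('0' + magnitude % 10u);
        magnitude /= 10u;
    } while (magnitude != 0u);

    if (value < 0)
        buf[len++] = '-';
    while (n > 0)                         /* reverse into the output buffer */
        buf[len++] = tmp[--n];
    return len;
}

/* Hypothetical helper: writes "i = <value>\n" to fd using only write(),
 * which is on the POSIX list of async-signal-safe functions. */
ssize_t report_value(int fd, int value)
{
    char line[4 + 12 + 1] = "i = ";
    size_t len = 4;

    len += format_decimal(line + len, value);
    line[len++] = '\n';
    return write(fd, line, len);
}
```

Inside catcher1 this would be called as report_value(1, i) in place of the printf call.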

The handling of signals is complex and full of implementation-defined, unspecified and undefined behavior. If you want to be portable, there are in fact very few things that you can do: mostly reading and writing a volatile sig_atomic_t and calling _Exit. Depending on the signal number, it is often undefined behavior to leave the signal handler in any other way than calling _Exit.
In your case, I think SIGFPE is one of those signals for which returning normally from the signal handler is UB. The best outcome I can see is that the machine instruction which triggered the signal is restarted. Few architectures, and last I looked x86 was not one of them, provide a way to do 10/x without loading x into a register; that means that restarting the instruction will always raise the signal again, even if you modify x and x is a volatile sig_atomic_t.
Usually longjmp can also be used to leave a signal handler. @Bodo confirmed that by using setjmp and longjmp to restart the division, you can get the behavior you want.
Note: on Unix there is another set of functions (sigaction, sigsetjmp, siglongjmp and others) which is better to use. In fact, I don't recommend using anything else in any serious program.
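For reference, here is roughly what the sigaction/sigsetjmp/siglongjmp variant can look like. It is a sketch under a few assumptions: run_guarded, on_fpe, raise_fpe and no_fault are made-up names, and the fault is simulated with raise(SIGFPE) so that the example does not depend on the CPU trapping integer division (not every architecture does).

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf fpe_env;   /* context saved by sigsetjmp (name is an assumption) */

static void on_fpe(int signum)
{
    (void)signum;
    siglongjmp(fpe_env, 1);  /* leave the handler; never return to the fault */
}

/* Stand-ins for real work: one raises SIGFPE, one does not. */
static void raise_fpe(void) { raise(SIGFPE); }
static void no_fault(void)  { }

/* Runs fn() with a temporary SIGFPE handler installed.
 * Returns 1 if SIGFPE was caught, 0 if fn() returned normally,
 * and -1 if the handler could not be installed. */
int run_guarded(void (*fn)(void))
{
    struct sigaction act, old;
    int result;

    memset(&act, 0, sizeof act);
    sigemptyset(&act.sa_mask);
    act.sa_handler = on_fpe;
    if (sigaction(SIGFPE, &act, &old) == -1)
        return -1;

    /* Second argument 1: save the signal mask too, so that siglongjmp
     * restores it and SIGFPE is not left blocked after the jump. */
    if (sigsetjmp(fpe_env, 1) == 0) {
        fn();
        result = 0;
    } else {
        result = 1;
    }

    sigaction(SIGFPE, &old, NULL);   /* put the previous disposition back */
    return result;
}
```

Calling run_guarded(raise_fpe) should return 1 and run_guarded(no_fault) should return 0. In the question's program, the body of fn would be the division itself.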

Related

problem with setjmp and longjmp to switch between 2 functions

I am trying to implement code that continuously switches between the functions fun() and main(), which do nothing but print to the screen indefinitely. I am trying to switch using setjmp and longjmp, driven by the SIGALRM signal, in C.
But when I run it, it switches only once and then stops switching.
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <setjmp.h>
jmp_buf b1, b2;
int cur = 0;
void handlesig(int sig) {
if(!cur) {
cur = 1;
setjmp(b2);
longjmp(b1, 1);
}
else {
cur = 0;
setjmp(b1);
longjmp(b2, 1);
}
}
void fun() {
while(1) {
printf("I am in function fun()\n");
for(int x = 0; x < 100000000; x++);
}
}
int main() {
signal(SIGALRM, handlesig);
ualarm(900000, 900000); //send SIGALRM after each 900000 microseconds
if(!setjmp(b1))
fun(); //will be run when setjmp returns 0
while(1) {
printf("I am in function main()\n"); //will be run when setjmp returns 1
for(int x = 0; x < 100000000; x++);
}
return 0;
}
I don't understand what the problem with this code is.
Your program has undefined behavior because the lifetime of the block where setjmp was called on b1 or b2 in the signal handler ends as soon as longjmp is called (in the very next line). The next time you call longjmp trying to return to a jmp_buf that is no longer valid, the behavior is undefined, and this manifests as the state being utterly corrupted.
You can write a hack to work around this by using sigaltstack and SA_ONSTACK flag for the signal handler to have multiple stacks, so that even though the jmp_buf is formally invalid, it's in practice not clobbered. But this is not a valid program, just one which happens to work in practice on some systems (not all). Ultimately, there is no (valid/reliable) way to do what you're asking for with setjmp and longjmp; context switching requires a strictly stronger primitive than what they provide.
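One example of such a stronger primitive is the ucontext family (getcontext, makecontext, swapcontext), where each context gets its own stack. The sketch below assumes Linux/glibc (these functions were removed from newer POSIX editions), switches cooperatively rather than from a SIGALRM handler, and uses made-up names (run_switches, fun_steps, the 64 KiB stack size).

```c
#include <ucontext.h>

static ucontext_t main_ctx, fun_ctx;
static char fun_stack[64 * 1024];   /* separate stack for fun(); size is a guess */
static int fun_steps;               /* how many times fun() has run a step */

/* Runs on its own stack; yields back to the caller after every step. */
static void fun(void)
{
    for (;;) {
        fun_steps++;
        swapcontext(&fun_ctx, &main_ctx);   /* save fun, resume main */
    }
}

/* Switches into fun() `rounds` times and returns how many steps fun() ran,
 * or -1 if a context call failed. */
int run_switches(int rounds)
{
    fun_steps = 0;
    if (getcontext(&fun_ctx) == -1)
        return -1;
    fun_ctx.uc_stack.ss_sp = fun_stack;
    fun_ctx.uc_stack.ss_size = sizeof fun_stack;
    fun_ctx.uc_link = &main_ctx;            /* where to go if fun() ever returned */
    makecontext(&fun_ctx, fun, 0);

    for (int i = 0; i < rounds; i++) {
        /* Save main's state and resume fun() where it last yielded. */
        if (swapcontext(&main_ctx, &fun_ctx) == -1)
            return -1;
    }
    return fun_steps;
}
```

Because both contexts keep their own live stack, resuming either one is valid, which is exactly what the jmp_buf approach cannot guarantee. Driving the switch from a signal handler is a separate problem with its own caveats and is not attempted here.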

Raspberry Pi clean exit on CTRL+C in C

First of all, let me apologize as I can see that similar questions have been posted quite a few times in the past. However, as I am very unfamiliar with C, I need help confirming this.
I am trying to ensure that my program leaves the GPIO in a clean state if I interrupt it with CTRL+C. This is easily done in Python or Java, but C is proving a harder nut to crack for me, as I was led to believe that C has no try-catch-finally. Googling it, I found what I think may be the solution, but inexperienced as I am, I'm not sure it's done properly. Here is my code:
#include <stdio.h>
#include <wiringPi.h>
#include <signal.h>
void CleanGPIO() {
pinMode(1,INPUT);
}
int main()
{
wiringPiSetup();
signal(SIGINT, CleanGPIO);
pinMode(1, PWM_OUTPUT);
for (int i = 0; i < 1024; ++i) {
pwmWrite(1, i);
delay(1);
}
for (int i = 1023; i >= 0; --i) {
pwmWrite(1, i);
delay(1);
}
pinMode(1,INPUT);
return 0;
}
I have tested it and it works as intended (pin 1 is set as IN after I interrupt it with CTRL+C), but I'm concerned if this is the safe way to do it, and if there is a better solution available.
Calling any function that is not specified as async-signal-safe from a signal handler is undefined behaviour. I suppose there is no such guarantee for pinMode.
The proper way would be to set a volatile sig_atomic_t flag that you periodically check in your main loop.
volatile sig_atomic_t terminating = 0;
void terminate(int sign) {
signal(SIGINT, SIG_DFL);
terminating = 1;
}
int main() {
for (...) {
if (terminating) {
// cleanup
exit(1);
}
}
}
The call to signal inside the handler allows force-terminating the program with a second Ctrl+C in case proper cleanup takes too long or gets stuck for any reason.
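A more complete version of the flag approach might look like the following. This is a sketch, not tested on a Pi: install_sigint_flag, run_steps and demo_step are hypothetical names, demo_step stands in for the pwmWrite/delay calls and simulates Ctrl+C with raise(SIGINT), and SA_RESETHAND is used to get the same "second Ctrl+C kills the program" effect as calling signal(SIGINT, SIG_DFL) in the handler.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static volatile sig_atomic_t terminating = 0;

/* The handler only sets the flag; all real cleanup happens in normal code. */
static void on_sigint(int signum)
{
    (void)signum;
    terminating = 1;
}

/* Installs the handler. SA_RESETHAND restores the default action after the
 * first delivery, so a second Ctrl+C terminates the process immediately. */
int install_sigint_flag(void)
{
    struct sigaction act;

    memset(&act, 0, sizeof act);
    sigemptyset(&act.sa_mask);
    act.sa_handler = on_sigint;
    act.sa_flags = SA_RESETHAND;
    return sigaction(SIGINT, &act, NULL);
}

/* Stand-in for one unit of work (e.g. pwmWrite + delay); it simulates the
 * user pressing Ctrl+C during the third step. */
static void demo_step(int i)
{
    if (i == 2)
        raise(SIGINT);
}

/* Runs up to max_steps steps, stops early once the flag is set, then calls
 * cleanup (if any) from normal context. Returns the steps completed. */
int run_steps(int max_steps, void (*step)(int), void (*cleanup)(void))
{
    int done = 0;

    while (done < max_steps && !terminating) {
        step(done);
        done++;
    }
    if (cleanup)
        cleanup();
    return done;
}
```

In the GPIO program, cleanup would be a function that calls pinMode(1, INPUT), which is then executed outside the signal handler.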
Your solution is nearly right. You should also call exit in order to force the program to terminate (assuming you want to terminate immediately). The exit call takes a parameter which is the exit status to return to the caller (e.g., the shell). This should be non-zero for abnormal termination.
So, it should be:
void CleanGPIO() {
pinMode(1,INPUT);
exit(1);
}
If you don't want to exit from the handler but from main in a more controlled fashion you can set a flag instead and check the flag value inside the loops.

measuring time of signal

I need to measure the time it takes to handle an exception and invoke a signal handler 100,000 times. I need to use the signal() system call to register a handler function for SIGFPE, and then I need to cause a divide-by-zero error.
I only have a skeleton right now and am not sure how I should handle the signal. So far I plan on calling gettimeofday(), then entering a for loop that triggers the signal 100k times, then calling gettimeofday() again to stop timing, and finally averaging the total elapsed time over those 100k invocations.
#include <signal.h>
#include <sys/time.h>
void handle_sigfpe(int signum)
{
//unsure how to handle the signal to keep the loop running for 100k times
}
double time_in_milli (struct timeval t){ //for time conversion
return (((t.tv_sec*1000000+t.tv_usec)*1000)/1000000);
}
int main(int argc, char **argv)
{
int x =5;
int y = 0;
int z = 0;
signal(SIGFPE, handle_sigfpe);
z = x/y;
return 0;
}
Anyone have any clue on how I need to handle this signal? I am completely lost on this
A divide-by-zero exception invokes the handler that you installed. When the handler returns, the processor goes back to the division instruction and tries again. The result is an infinite loop. To prevent that, you can use the sigsetjmp and siglongjmp routines.
When you call sigsetjmp it returns 0. However, when siglongjmp is called, the program behaves as if sigsetjmp returns the value supplied by siglongjmp. So you can use an if statement to execute the division or skip the division based on the return value from sigsetjmp.
If that's too confusing, hopefully the following example will clear things up.
#include <stdio.h>
#include <signal.h>
#include <setjmp.h>
static volatile sig_atomic_t caught = 33;
static sigjmp_buf env;
void action( int unused )
{
caught = 42;
siglongjmp( env, 1 );
}
int main( void )
{
if ( signal( SIGFPE, action ) == SIG_ERR ) {
perror( "signal failed" );
return 1;
}
int x = 1;
int y = 0;
int z;
if ( sigsetjmp( env, 1 ) == 0 )
z = x / y;
printf( "%d\n", caught );
}
The global variable caught is used to indicate that the exception was caught. The initial value is 33. The signal handler sets it to 42. The printf at the end of the program displays the final value. It should print 42 to indicate that the signal was caught.
The global variable env is used by the sigsetjmp and siglongjmp functions to save the execution context: the registers (including the stack pointer) and, because the second argument to sigsetjmp is nonzero, the signal mask.
The if (sigsetjmp(env,1) == 0) will initially be true, and the division will be attempted. But when the handler is invoked, siglongjmp will make the program behave as if sigsetjmp returned 1, and the division will be skipped.
This allows the program to move past the division, and execute the printf at the end.
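Putting the same pattern into a timing loop could look roughly like this. Treat it as a sketch: average_delivery_ns and the other names are made up, it uses sigaction and clock_gettime rather than the signal()/gettimeofday() pair the assignment asks for, and it triggers the signal with raise(SIGFPE) so that it also runs on CPUs that do not trap integer division. Replace the raise call with the division to measure the real exception path.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <time.h>

static sigjmp_buf bench_env;
static volatile sig_atomic_t deliveries;   /* handler invocations so far */

static void bench_handler(int signum)
{
    (void)signum;
    deliveries++;
    siglongjmp(bench_env, 1);              /* skip past the faulting statement */
}

/* Delivers SIGFPE `iterations` times and returns the average cost of one
 * delivery in nanoseconds, or -1.0 on error. */
double average_delivery_ns(int iterations)
{
    struct sigaction act, old;
    struct timespec t0, t1;
    volatile int i;                        /* volatile: survives siglongjmp */

    if (iterations <= 0)
        return -1.0;
    memset(&act, 0, sizeof act);
    sigemptyset(&act.sa_mask);
    act.sa_handler = bench_handler;
    if (sigaction(SIGFPE, &act, &old) == -1)
        return -1.0;

    deliveries = 0;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    for (i = 0; i < iterations; i++) {
        /* Returns 0 when called directly, 1 when reached via siglongjmp. */
        if (sigsetjmp(bench_env, 1) == 0)
            raise(SIGFPE);                 /* assumption: stands in for x / y */
    }
    clock_gettime(CLOCK_MONOTONIC, &t1);
    sigaction(SIGFPE, &old, NULL);

    if (deliveries != iterations)
        return -1.0;
    return ((double)(t1.tv_sec - t0.tv_sec) * 1e9
            + (double)(t1.tv_nsec - t0.tv_nsec)) / iterations;
}
```

Note that the measured time includes the sigsetjmp call and the mask save/restore on every iteration, so it is an upper bound on the delivery cost rather than the delivery cost alone.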

C handle signal SIGFPE and continue execution

I am trying to handle a SIGFPE signal but my program just crashes or runs forever. I HAVE to use signal() and not the other ones like sigaction().
So in my code I have:
#include <stdio.h>
#include <signal.h>
void handler(int signum)
{
// Do stuff here then return to execution below
}
int main()
{
signal(SIGFPE, handler);
int i, j;
for(i = 0; i < 10; i++)
{
// Call signal handler for SIGFPE
j = i / 0;
}
printf("After for loop");
return 0;
}
Basically, I want to go into the handler every time there is a division by 0. It should do whatever it needs to inside the handler() function then continue the next iteration of the loop.
This should also work for other signals that need to be handled. Any help would be appreciated.
If you have to use signal to handle FPE or any other signal that you cause directly by invoking the CPU nonsense that causes it, it is only defined what happens if you either exit the program from the signal handler or use longjmp to get out.
Also note the exact placement of the restore functions, at the end of the computation branch but at the start of the handle branch.
Unfortunately, you can't use signal() like this at all; the second invocation causes the code to fall down. You must use sigaction if you intend to handle the signal more than once.
#include <stdio.h>
#include <signal.h>
#include <setjmp.h>
#include <string.h>
jmp_buf fpe;
void handler(int signum)
{
// Do stuff here then return to execution below
longjmp(fpe, 1);
}
int main()
{
volatile int i, j;
for(i = 0; i < 10; i++)
{
// Call signal handler for SIGFPE
struct sigaction act;
struct sigaction oldact;
memset(&act, 0, sizeof(act));
act.sa_handler = handler;
act.sa_flags = SA_NODEFER; /* SA_NOMASK is just an obsolete Linux synonym for SA_NODEFER */
sigaction(SIGFPE, &act, &oldact);
if (0 == setjmp(fpe))
{
j = i / 0;
sigaction(SIGFPE, &oldact, &act);
} else {
sigaction(SIGFPE, &oldact, &act);
/* handle SIGFPE */
}
}
printf("After for loop");
return 0;
}
Caveat: Sorry to rain on the parade, but you really don't want to do this.
It is perfectly valid to trap [externally generated] signals like SIGINT, SIGTERM, SIGHUP etc. to allow graceful cleanup and termination of a program that may have files open that are partially written to.
However, internally generated signals, such as SIGILL, SIGBUS, SIGSEGV and SIGFPE are very hard to recover from meaningfully. The first three are bugs--pure and simple. And, IMO, the SIGFPE is also a hard bug as well.
After such a signal, your program is in an unsafe and indeterminate state. Even trapping the signal and doing longjmp/siglongjmp doesn't fix this.
And, there is no way to tell exactly how bad the damage is. Or, how bad the damage will become if the program tries to proceed.
If you get SIGFPE, was it for a floating point calculation [which you might be able to smooth over]? Or was it for an integer divide-by-zero? What calculation was being done? And where? You don't know.
Trying to continue can sometimes cause 10x the damage because now the program is out of control. After recovery, the program may be okay, but it may not be. So, the reliability of the program after the event cannot be determined with any degree of certainty.
What were the events (i.e., the calculations) that led up to the SIGFPE? Maybe it's not merely a single divide, but the chain of calculations that led up to the value being zero. Where did these values go? Will these now suspect values be used by code after the recovery operation has taken place?
For example, the program might overwrite the wrong file because the failed calculation was somehow involved in selecting the file descriptor that a caller is going to use.
Or, you leak memory. Or, corrupt the heap. Or, was the error within the heap allocation code itself?
Consider the following function:
void
myfunc(char *file)
{
int fd;
fd = open(file,O_WRONLY);
while (1) {
// do stuff ...
// write to the file
write(fd,buf,len);
// do more stuff ...
// generate SIGFPE ...
x = y / z;
}
close(fd);
}
Even with a signal handler that does siglongjmp, the file that myfunc was writing to is now corrupted/truncated. And, the file descriptor won't be closed.
Or, what if myfunc was reading from the file and saving the data to some array. That array is only partially filled. Now, you get SIGFPE. This is intercepted by the signal handler which does siglongjmp.
One of the callers of myfunc does the sigsetjmp to "catch" this. But, what can it do? The caller has no idea how bad things are. It might assume that the buffer myfunc was reading into is fully formed and write it out to a different file. That other file has now become corrupted.
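If the goal is simply to survive a bad divisor, the less fragile route is usually to check the operands before dividing, so that SIGFPE is never raised in the first place. A minimal sketch, with checked_div as a hypothetical helper name:

```c
#include <limits.h>

/* Divides num by den and stores the quotient in *out.
 * Returns 0 on success, or -1 (leaving *out untouched) for the two cases
 * that are undefined for int division and trap on x86:
 * a zero divisor and INT_MIN / -1 (the quotient does not fit in an int). */
int checked_div(int num, int den, int *out)
{
    if (den == 0)
        return -1;
    if (num == INT_MIN && den == -1)
        return -1;
    *out = num / den;
    return 0;
}
```

The caller then handles the error at the point where it still knows what the calculation was for, which is precisely the information a signal handler does not have.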
UPDATE:
Oops, forgot to mention undefined behavior ...
Normally, we associate UB, such as writing past the end of an array, with a segfault [SIGSEGV]. But, what if it causes SIGFPE instead?
It's no longer just a "bad calculation" -- we're trapping [and ignoring] UB at the earliest detection point. If we do recovery, the next usage could be worse.
Here's an example:
// assume these are ordered in memory as if they were part of the same struct:
int x[10];
int y;
int z;
void
myfunc(void)
{
// initialize
y = 23;
z = 37;
// do stuff ...
// generate UB -- we run one past the end of x and zero out y
for (int i = 0; i <= 10; ++i)
x[i] = 0;
// do more stuff ...
// generate SIGFPE ...
z /= y;
// do stuff ...
// do something _really_ bad with y that causes a segfault or _worse_
// sends a space rocket off-course ...
}

Can we read and fault-inject another thread's program counter?

Assume that we have a single thread program and we hope to capture the value of program counter (PC) when a predefined interrupt occurs (like a timer interrupt).
It seems easy: we just write some assembly code using the special keyword __asm__ and pop the value from the top of the stack after skipping 4 bytes.
What about multithreaded programs?
How can we get the PC values of all threads from another thread that runs in the same process? (It seems hardly possible to get the values from a thread that runs on a separate core of a multi-core processor.)
(In multithreaded programs, every thread has its own stack and registers, too.)
I want to implement a saboteur thread.
In order to perform fault injection in the target multithreaded program, the fault model is SEU (single event upset), which means that an arbitrary bit in the program counter register is flipped at random, violating the correct program sequence. As a result, a control flow error (CFE) occurs.
Since our target program is multithreaded, we have to perform fault injection on the PC of every thread. This is the task of the saboteur thread: it must be able to obtain each thread's PC in order to perform fault injection.
Assume we have this code:
main ()
{
foo
}
void foo()
{
__asm__{
pop "%eax"
pop "%ebx" // now ebx holds program counter value (for main thread)
// here code injection like 00000111 XOR ebx for example
push ...
push ...
};
}
What if our program were multithreaded?
Does it mean that we have more than one stack?
When the OS performs a context switch, the stack and registers of the thread that was running are saved somewhere in memory. Does this mean that if we want the program counter values of those threads, we can find them in memory? Where? And is that possible at run time?
When you install a signal handler using sigaction() with SA_SIGINFO in the flags, the second parameter the signal handler gets is a pointer to a siginfo_t, and the third parameter is a pointer to a ucontext_t. In Linux, this structure contains, among other things, the set of register values at the moment the kernel interrupted the thread, including the program counter.
#define _POSIX_C_SOURCE 200809L
#define _GNU_SOURCE
#include <signal.h>
#include <ucontext.h>
#if defined(__x86_64__)
#define PROGCOUNTER(ctx) (((ucontext_t *)ctx)->uc_mcontext.gregs[REG_RIP])
#elif defined(__i386__)
#define PROGCOUNTER(ctx) (((ucontext_t *)ctx)->uc_mcontext.gregs[REG_EIP])
#else
#error Unsupported architecture.
#endif
void signal_handler(int signum, siginfo_t *info, void *context)
{
const size_t program_counter = PROGCOUNTER(context);
/* Do something ... */
}
As usual, printf() et al. are not async-signal safe, which means it is not safe to use them in a signal handler. If you wish to output the program counter to e.g. standard error, you should not use any of the standard I/O to print to stderr, and instead construct the string to be printed by hand, and use a loop to write() the contents of the string; for example,
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
static void wrerr(const char *p)
{
const int saved_errno = errno;
const char *q = p;
ssize_t n;
/* Nothing to print? */
if (!p || !*p)
return;
/* Find end of q. strlen() is not async-signal safe. */
while (*q) q++;
/* Write data from p to q. */
while (p < q) {
n = write(STDERR_FILENO, p, (size_t)(q - p));
if (n > 0)
p += n;
else
if (n != -1 || errno != EINTR)
break;
}
errno = saved_errno;
}
Note that you'll want to keep the value of errno unchanged in the signal handler, so that if interrupted after a failed library function, the interrupted thread still sees the correct errno value. (It's mostly a debugging issue, and "good form"; some idiots pooh-pooh this as "it does not happen often enough for me to worry about".)
Your program can examine the /proc/self/maps pseudofile (it is not a real file, but something that the kernel generates on the fly when the file is read) to see the memory regions used by the program, to determine whether the program was running a C library function (very common) or something else when the interrupt was delivered.
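As an illustration of that lookup, here is a sketch that scans /proc/self/maps for the region containing a given address. find_mapping is a made-up name, the code is Linux-specific, and because it uses stdio it must be called from normal context (for example after the handler has stored the program counter in a variable), not from inside the signal handler.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Looks up the /proc/self/maps entry whose address range contains addr.
 * On success returns 0 and, if line_out is non-NULL, copies the matching
 * line (range, permissions, offset, device, inode, path) into it.
 * Returns -1 if no region matches or the file cannot be opened. */
int find_mapping(uintptr_t addr, char *line_out, size_t line_len)
{
    char line[512];
    FILE *maps = fopen("/proc/self/maps", "r");

    if (!maps)
        return -1;
    while (fgets(line, sizeof line, maps)) {
        unsigned long start, end;

        /* Each entry begins with "start-end" in hexadecimal. */
        if (sscanf(line, "%lx-%lx", &start, &end) != 2)
            continue;
        if (addr >= start && addr < end) {
            if (line_out && line_len > 0) {
                strncpy(line_out, line, line_len - 1);
                line_out[line_len - 1] = '\0';
            }
            fclose(maps);
            return 0;
        }
    }
    fclose(maps);
    return -1;
}
```

The path at the end of the matching line (for example the C library's shared object, or the executable itself) tells you which module the interrupted code belonged to.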
If you wish to interrupt a specific thread in a multi-threaded program, just use pthread_kill(). Otherwise the signal is delivered to one of the threads that has not blocked the signal, more or less at random.
Here is an example program that was tested on x86-64 (AMD64) and x86, compiled with GCC-4.8.4 using -Wall -O2:
#define _POSIX_C_SOURCE 200809L
#define _GNU_SOURCE
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <signal.h>
#include <ucontext.h>
#include <time.h>
#include <stdio.h>
#if defined(__x86_64__)
#define PROGRAM_COUNTER(mctx) ((mctx).gregs[REG_RIP])
#define STACK_POINTER(mctx) ((mctx).gregs[REG_RSP])
#elif defined(__i386__)
#define PROGRAM_COUNTER(mctx) ((mctx).gregs[REG_EIP])
#define STACK_POINTER(mctx) ((mctx).gregs[REG_ESP])
#else
#error Unsupported hardware architecture.
#endif
#define MAX_SIGNALS 64
#define MCTX(ctx) (((ucontext_t *)ctx)->uc_mcontext)
static void wrerr(const char *p, const char *q)
{
while (p < q) {
ssize_t n = write(STDERR_FILENO, p, (size_t)(q - p));
if (n > 0)
p += n;
else
if (n != -1 || errno != EINTR)
break;
}
}
static const char hexc[16] = "0123456789abcdef";
static inline char *prehex(char *before, size_t value)
{
do {
*(--before) = hexc[value & 15];
value /= (size_t)16;
} while (value);
*(--before) = 'x';
*(--before) = '0';
return before;
}
static volatile sig_atomic_t done = 0;
static void handle_done(int signum)
{
done = signum;
}
static int install_done(const int signum)
{
struct sigaction act;
memset(&act, 0, sizeof act);
sigemptyset(&act.sa_mask);
act.sa_handler = handle_done;
act.sa_flags = 0;
if (sigaction(signum, &act, NULL) == -1)
return errno;
return 0;
}
static size_t jump_target[MAX_SIGNALS] = { 0 };
static size_t jump_stack[MAX_SIGNALS] = { 0 };
static void handle_jump(int signum, siginfo_t *info, void *context)
{
const int saved_errno = errno;
char buffer[128];
char *p = buffer + sizeof buffer;
*(--p) = '\n';
p = prehex(p, STACK_POINTER(MCTX(context)));
*(--p) = ' ';
*(--p) = 'k';
*(--p) = 'c';
*(--p) = 'a';
*(--p) = 't';
*(--p) = 's';
*(--p) = ' ';
*(--p) = ',';
p = prehex(p, PROGRAM_COUNTER(MCTX(context)));
*(--p) = ' ';
*(--p) = '#';
wrerr(p, buffer + sizeof buffer);
if (signum >= 0 && signum < MAX_SIGNALS) {
if (jump_target[signum])
PROGRAM_COUNTER(MCTX(context)) = jump_target[signum];
if (jump_stack[signum])
STACK_POINTER(MCTX(context)) = jump_stack[signum];
}
errno = saved_errno;
}
static int install_jump(const int signum, void *target, size_t stack)
{
struct sigaction act;
if (signum < 0 || signum >= MAX_SIGNALS)
return errno = EINVAL;
jump_target[signum] = (size_t)target;
jump_stack[signum] = (size_t)stack;
memset(&act, 0, sizeof act);
sigemptyset(&act.sa_mask);
act.sa_sigaction = handle_jump;
act.sa_flags = SA_SIGINFO;
if (sigaction(signum, &act, NULL) == -1)
return errno;
return 0;
}
int main(int argc, char *argv[])
{
const struct timespec sec = { .tv_sec = 1, .tv_nsec = 0L };
const int pid = (int)getpid();
ucontext_t ctx;
printf("Run\n");
printf("\tkill -KILL %d\n", pid);
printf("\tkill -TERM %d\n", pid);
printf("\tkill -HUP %d\n", pid);
printf("\tkill -INT %d\n", pid);
printf("or press Ctrl+C to stop this process, or\n");
printf("\tkill -USR1 %d\n", pid);
printf("\tkill -USR2 %d\n", pid);
printf("to send the respective signal to this process.\n");
fflush(stdout);
if (install_done(SIGTERM) ||
install_done(SIGHUP) ||
install_done(SIGINT) ) {
printf("Cannot install signal handlers: %s.\n", strerror(errno));
return EXIT_FAILURE;
}
getcontext(&ctx);
if (install_jump(SIGUSR1, &&usr1_target, STACK_POINTER(MCTX(&ctx))) ||
install_jump(SIGUSR2, &&usr2_target, STACK_POINTER(MCTX(&ctx))) ) {
printf("Cannot install signal handlers: %s.\n", strerror(errno));
return EXIT_FAILURE;
}
/* These are expressions that should evaluate to false, but the compiler
* should not be able to optimize them away. */
if (argv[0][1] == 'A') {
usr1_target:
fputs("USR1\n", stdout);
fflush(stdout);
}
if (argv[0][1] == 'B') {
usr2_target:
fputs("USR2\n", stdout);
fflush(stdout);
}
while (!done) {
putchar('.');
fflush(stdout);
nanosleep(&sec, NULL);
}
fputs("\nAll done.\n", stdout);
fflush(stdout);
return EXIT_SUCCESS;
}
If you save the above as example.c, you can compile it using
gcc -Wall -O2 example.c -o example
and run it
./example
Press Ctrl+C to exit the program. Copy the commands (for sending SIGUSR1 and SIGUSR2 signals), and run them from another window, and you'll see they modify the position for current execution. (The signals cause the program counter/instruction pointer to jump back, into an if clause that should never be executed otherwise.)
There are two sets of signal handlers. handle_done() just sets the done flag. handle_jump() outputs a message to standard error (using low-level I/O), and if specified, updates the program counter (instruction pointer) and stack pointer.
The stack pointer is the tricky part when creating an example program like this. It would be easy if we were satisfied with just crashing the program. However, an example is only useful if it works.
When we arbitrarily change the program counter/instruction pointer, and the interrupt was delivered when in a function call (most C library functions...), the return address is left on the stack. The kernel can deliver the interrupt at any point, so we cannot even assume that the interrupt was delivered when in a function call, either! So, to make sure the test program does not crash, I had to update the program counter/instruction pointer and stack pointer as a pair.
When a jump signal is received, the stack pointer is reset to a value I obtained using getcontext(). This is not guaranteed to be suitable for any jump location; it's just the best I could do for a minimal example. I definitely assume the jump labels are nearby, and not in subscopes where the compiler is likely to mess with the stack, mind you.
It is also important to keep in mind that because we are dealing with details left to the C compiler, we must conform to whatever binary code the compiler produces, not the other way around. For reliable manipulation of a process and its threads, ptrace() is a much better (and honestly, easier) interface. You just set up a parent process, and in the target traced child process, explicitly allow the tracing. I've shown examples here and here (both answers to the same question) on how to start, stop, and single-step individual threads in a target process. The hardest part is understanding the overall scheme, the concepts; the code itself is easier -- and much, much more robust than this signal-handler-context-manipulation way.
For self-introducing register errors (either to program counter/instruction pointer, or to any other register), with the assumption that most of the time that leads to the process crashing, this signal handler context manipulation should be sufficient.
No, it's not possible while a thread is executing. While a thread is executing, the current value of its program counter (EIP) is private to the CPU core it's running on. It's not available in memory anywhere.
It would be possible for an architecture to have special instructions to send inter-processor requests with queries about execution state, but x86 doesn't have this.
However, you can use ptrace system calls to do anything a debugger could; interrupt another thread and modify any of its state (general purpose registers, flags, program counter, etc. etc.) I can't give you an example, I just know that's the system call that debuggers use to modify the saved state of another thread / process. For example, this question asks about modifying another process's RIP using ptrace (for testing code-injection).
I'm not sure it's viable to ptrace one thread from another thread in the same process; your fault injector might work better as a separate process that interferes with the threads of another process.
Anyway, what will happen when you make a ptrace system call to modify something in another thread is that the CPU running your system call will send an inter-processor message to the kernel on the CPU running the other thread, which will interrupt the thread you want to mess with. Its state will be saved into memory by the kernel, where it can be modified by any CPU.
Once the other thread stops running, it isn't strongly associated with any CPU anymore. It will be cheaper to resume it on the CPU that already has hot caches for it, but that isn't guaranteed because that CPU could have started running any other thread once it was no longer busy running the thread you caused to be stopped.
Side note, not relevant to inter-thread fault injection:
Your C function for modifying EIP (foo()) is really ugly, BTW:
First of all, it's MSVC inline asm, so no Linux compiler will accept it (maybe icc?). Second, it only works with -fno-omit-frame-pointer, because it assumes that it's inside a function that has pushed %ebp.
It would be so much easier to just write the whole function in asm. In 64bit non-inline asm, you'd just write:
global fault_inject_program_counter
fault_inject_program_counter:
xor qword [rsp], 0b00000111
ret
and assemble that file separately with NASM or YASM, and link the .o with code that calls it. (I'm assuming you'd prefer Intel syntax, since you used MSVC-style asm {} instead of GNU C asm("pop ; ... ; "::: ); inline asm.)
An inline asm version might look like:
// This can't possibly work if inlined, or if compiled without -fno-omit-frame-pointer.
__attribute__((noinline)) void foo()
{
__asm__ volatile(
// "pop %eax\n\t"
// "pop %ebx\n\t" // now ebx holds the return address
// here code injection like 00000111 XOR ebx for example
// normal people would just write
"xorl $0b00000111, 4(%ebp)\n\t"
// to modify the return address in place, in a function with a frame pointer
// (after push %ebp / mov %esp,%ebp the return address is at 4(%ebp)).
// push ...
// push ...
);
}
