Canceling a long-running function using an ISR (C)

Is there a way of manipulating the stack from a timer ISR, so I can just throw away the topmost stack frame and force a long-running function to exit? (I am aware of losing any heap-allocated memory in this case.)
The target would probably be an ARM CPU.
Best Regards

Looks like you want something like setjmp/longjmp with longjmp called after ISR termination.
It is possible to alter the ISR return address in such a way that, instead of returning to the long-running function, longjmp is called with the right parameters, so the long-running function is aborted and control returns to the place where setjmp was called.
Another solution comes to mind: it may be easier to restore all the registers (stack pointer, PC, LR and others) in the ISR's stack frame to the values they had before the long-running function was called (using assembly). To do that, you need to save all the required values (again using assembly) before entering the long-running function.
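Here is a minimal Cortex-M sketch of the first idea. It assumes the long-running code runs on the process stack (PSP) and that no extended FPU frame was stacked; a real implementation would have to inspect EXC_RETURN to find the right frame. The handler name, the device header, and the work_remaining()/do_some_work() helpers are made up; __get_PSP() is the usual CMSIS intrinsic.

#include <setjmp.h>
#include <stdint.h>
#include "device.h"                      /* hypothetical CMSIS device header (provides __get_PSP()) */

extern int  work_remaining(void);        /* hypothetical */
extern void do_some_work(void);          /* hypothetical */

static jmp_buf abort_env;

void long_running_task(void)
{
    if (setjmp(abort_env) != 0)
        return;                          /* we land here when the ISR aborts us */

    while (work_remaining())
        do_some_work();                  /* may run for a long time */
}

/* Runs in thread context after the diverted exception return. */
static void do_abort(void)
{
    longjmp(abort_env, 1);
}

void TIMER_IRQHandler(void)              /* hypothetical timer ISR */
{
    /* Hardware stacked r0, r1, r2, r3, r12, lr, pc, xPSR on the PSP.
     * Overwrite the stacked PC (the 7th word) so that exception return
     * lands in do_abort() instead of back inside the loop. */
    uint32_t *frame = (uint32_t *)__get_PSP();
    frame[6] = (uint32_t)do_abort;
}

As noted in the question, anything the aborted function allocated on the heap is leaked; longjmp only unwinds the stack.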

I would recommend avoiding a long-running function. While it may work in the short term, as your code grows it could become problematic.
Instead, consider using a state machine, or a system of state machines, in your master loop, and using your ISR only to set a flag (see the sketch below). This will reduce timing issues and allow you to manage more tasks at once.
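A minimal sketch of that pattern, with made-up names: the ISR only sets a flag, and the master loop advances a small state machine one bounded step per tick.

#include <stdbool.h>

static volatile bool tick;               /* set by the ISR, consumed by the main loop */

void TIMER_IRQHandler(void)              /* hypothetical timer ISR: just set a flag */
{
    tick = true;
}

typedef enum { ST_MEASURE, ST_FILTER, ST_REPORT } state_t;

static void do_measure(void) { /* ... */ }
static void do_filter(void)  { /* ... */ }
static void do_report(void)  { /* ... */ }

int main(void)
{
    state_t state = ST_MEASURE;

    for (;;) {
        if (!tick)
            continue;                    /* nothing to do yet */
        tick = false;

        switch (state) {                 /* one short, bounded step per tick */
        case ST_MEASURE: do_measure(); state = ST_FILTER;  break;
        case ST_FILTER:  do_filter();  state = ST_REPORT;  break;
        case ST_REPORT:  do_report();  state = ST_MEASURE; break;
        }
        /* other state machines / tasks can be serviced here as well */
    }
}

Because no step runs for long, there is never a frame that needs to be forcibly discarded.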

That's possible in theory, but probably impossible to do reliably.
You could use the GCC builtins __builtin_frame_address and __builtin_return_address to restore the stack and return from the previous function, but doing so will corrupt the program's behavior. The function you forcibly return from has probably saved some registers on the stack, and it needs to restore them before returning. The problem is that there is no way I know of to locate or mimic that restore code. It certainly sits just before the function returns (and you can't even know where that is), and it could be 1, 2, or even 0 instructions. Even if you locate or mimic it, you can't really hardcode it, because it is likely to change whenever you change the function.
In conclusion, you may be able to do it with a couple of builtins and two or three inline assembly instructions, but you would have to tailor the hardcoded sequence to the specific function you want to abort, and change it whenever that function changes.
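For what it's worth, here is a tiny GCC/Clang snippet that only shows what those builtins give you; it does not, by itself, implement the forced early return discussed above.

#include <stdio.h>

void inspect_frame(void)
{
    /* Argument 0 means "this function's" frame / return address. */
    printf("frame address : %p\n", __builtin_frame_address(0));
    printf("return address: %p\n", __builtin_return_address(0));
}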

Why can't you just set a flag in your ISR that your function periodically checks to see whether it needs to exit? The reason I disapprove of the way you are trying to do it is that it is extremely dangerous to "kill" a function while it is in the middle of some operation. Unless you have a way to clean up absolutely everything after it (as when killing a process), there is no way you can do it safely. It is always better to signal the function through a flag or semaphore of some kind from the ISR and then let that function clean up after itself and exit normally, as in the sketch below.
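A sketch of that cooperative approach, with made-up names: the only thing the ISR does is set a volatile flag, and the long-running function polls it once per bounded unit of work and cleans up before returning.

#include <stdbool.h>
#include <stddef.h>

static volatile bool cancel_requested;   /* set by the ISR */

void TIMER_IRQHandler(void)              /* hypothetical timer ISR */
{
    cancel_requested = true;
}

extern void process_item(size_t i);      /* hypothetical unit of work */
extern void cleanup(void);               /* free buffers, reset peripherals, ... */

/* Returns 0 on completion, -1 if it was asked to stop early. */
int long_running_function(size_t n_items)
{
    for (size_t i = 0; i < n_items; ++i) {
        if (cancel_requested) {          /* polled once per work item */
            cleanup();                   /* the function tidies up after itself */
            return -1;
        }
        process_item(i);
    }
    return 0;
}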


cgo Interacting with C Library that uses Thread Local Storage

I'm in the midst of wrapping a C library with cgo to be usable by normal Go code.
My problem is that I'd like to propagate error strings up to the Go API, but the C library in question makes error strings available via thread-local storage; there's a global get_error() call that returns a pointer to thread local character data.
My original plan was to call into C via cgo, check if the call returned an error, and if so, wrap the error string using C.GoString to convert it from a raw character pointer into a Go string. It'd look something like C.GoString(C.get_error()).
The problem that I foresee here is that TLS in C works on the level of native OS threads, but in my understanding, the calling Go code will be coming from one of potentially N goroutines that are multiplexed across some number of underlying native threads in a thread pool managed by the Go scheduler.
What I'm afraid of is running into a situation where I call into the C routine, then after the C routine returns, but before I copy the error string, the Go scheduler decides to swap the current goroutine out for another one. When the original goroutine gets swapped back in, it could potentially be on a different native thread for all I know, but even if it gets swapped back onto the same thread, any goroutines that ran there in the intervening time could've changed the state of the TLS, causing me to load an error string for an unrelated call.
My questions are these:
Is this a reasonable concern? Am I misunderstanding something about the Go scheduler, or the way it interacts with cgo, that would cause this to not be an issue?
If this is a reasonable concern, how can I work around it?
cgo somehow manages to propagate errno values, which are also stored in TLS, back to the calling Go code, which makes me think there must be a safe way to do this.
I can't think of a way that the C code itself could get preempted by the Go scheduler, so should I introduce a wrapper C function and have it make the necessary call and then conditionally copy the error string before returning back up to Go land?
I'm interested in any solution that would allow me to propagate the error strings out to the rest of Go, but I'm hoping to avoid any solution that would require me to serialize accesses around the TLS, as adding a lock just to grab an error string seems greatly unfortunate to me.
Thanks in advance!
What I'm afraid of is running into a situation where I call into the C routine, then after the C routine returns, but before I copy the error string, the Go scheduler decides to swap the current goroutine out for another one. ...
Is this a reasonable concern?
Yes. The cgo "call C code" wrappers lock on to one POSIX / OS thread for the duration of each call, but the thread they lock is not fixed for all time; it does in fact bop around, as it were, to multiple different threads over time, as long as your goroutines are operating normally. (Since Go is cooperatively scheduled in the current implementations, you can, in some circumstances, be careful not to do anything that might let you switch underlying OS threads, but this is probably not a good plan.)
You can use runtime.LockOSThread here, but I think the best plan is otherwise:
how can I work around it?
Grab the error before Go resumes its normal scheduling algorithm (i.e., before unlocking the goroutine from the C / POSIX thread).
cgo somehow manages to propagate errno values ...
It grabs the errno value before unlocking the goroutine from the POSIX thread.
My original plan was to call into C via cgo, check if the call returned an error, and if so, wrap the error string using C.GoString to convert it from a raw character pointer into a Go string. It'd look something like C.GoString(C.get_error()).
If there is a variant of this that takes the error number (rather than fishing it out of a TLS variable), that plan should still work: just make sure that your C routines provide both the return value and the error number.
If not, write your own C wrapper, just as you suggested:
ftype wrapper_for_realfunc(char **errp, arg1type arg1, arg2type arg2) {
    ftype ret = realfunc(arg1, arg2);   /* ftype, argNtype, realfunc, IS_ERROR are placeholders */
    if (IS_ERROR(ret)) {
        *errp = get_error();            /* grab the TLS error pointer now, on this thread */
    } else {
        *errp = NULL;
    }
    return ret;
}
Now your Go wrapper simply calls the C wrapper, passing the address of an extra *C.char variable; the wrapper sets it to nil if there is no error, or to something on which you can use C.GoString if there is an error.
If that's not feasible for some reason, consider using runtime.LockOSThread and its counterpart, runtime.UnlockOSThread.

Practical Delimited Continuations in C / x64 ASM

I've looked at a paper called A Primer on Scheduling Fork-Join Parallelism with Work Stealing. I want to implement continuation stealing, where the rest of the code after calling spawn is eligible to be stolen. Here's the code from the paper:
1 e();
2 spawn f();
3 g();
4 sync;
5 h();
An important design choice is which branch to offer to thief threads.
Using Figure 1, the choices are:
Child Stealing:
f() is made available to thief threads.
The thread that executed e() executes g().
Continuation Stealing:
Also called “parent stealing”.
The thread that executed e() executes f().
The continuation (which will next call g()) becomes available to thief threads.
I hear that saving a continuation requires saving all the register sets (volatile, non-volatile, and FPU). In the fiber implementation I did, I ended up implementing child stealing. I have read about the (theoretical) negatives of child stealing (an unbounded number of runnable tasks; see the paper for more info), so I want to use continuations instead.
I'm thinking of two functions, shift and reset, where reset delimits the current continuation, and shift reifies the current continuation. Is what I'm asking even plausible in a C environment?
EDIT: I'm thinking of making reset save return address / NV GPRs for the current function call (= line 3), and making shift transfer control to the next continuation after returning a value to the caller of reset.
I've implemented work stealing for an HLL called PARLANSE, rather than C, on x86. PARLANSE is used daily to build production symbolic parallel programs at the million-line scale.
In general, you have to preserve the registers for both the continuation and the "child".
Consider that your compiler may see a computation in f(), see the same computation in g(), lift that computation to the point just before the spawn, and place the result in a register that both f() and g() use as an implied parameter.
Yes, this assumes a sophisticated compiler, but if you are using a stupid compiler that doesn't optimize, why are you trying to go parallel for speed?
Specifically, however, your compiler could arrange for the registers to be empty before the call to spawn if it understood what spawn means. Then neither the continuation nor the child has to preserve registers. (The PARLANSE compiler in fact does this.)
So how much has to be saved depends on how much your compiler is willing to help, and that depends on whether it knows what spawn really does.
Your friendly local C compiler likely doesn't know about your implementation of spawn. So either you do something to force a register flush (don't ask me; it's your compiler), or you put up with the fact that you personally don't know what's in the registers and have your implementation preserve them all to be safe.
If the amount of work spawned is significant, arguably it wouldn't matter if you saved all the registers. However, the x86 (and other modern architectures) seems to have an enormous amount of state, mostly in the vector registers, that might be in use; last time I looked it was well in excess of 500 bytes, roughly 100 writes to memory to save, and IMHO that's an excessive price. If you don't believe these registers are going to be passed from the parent thread to the spawned thread, then you can work on enforcing spawn with no registers.
If your spawn routine wakes up using a standard continuation mechanism you have invented, then you also have to worry about whether your continuations pass large register state. Same problem, same solutions as for spawn: the compiler has to help, or you personally have to intervene.
You'll find this a lot of fun.
[If you want to make it really interesting, try timeslicing the threads in case they go into a deep computation without an occasional yield, causing thread starvation. Now you surely have to save the entire state. I managed to get PARLANSE to implement spawning with no registers saved, yet have the timeslicing save/restore the full register state, by saving the full state on a time slice and continuing at a special place that refilled all the registers before passing control to the time-sliced PC location.]
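For a concrete, single-threaded illustration of what "capturing the continuation with full register state" costs, here is a sketch using POSIX ucontext (obsolescent in POSIX.1-2008 but still shipped by glibc); swapcontext saves the complete register context discussed above. The work queue, locking, and the actual thief thread are omitted, and all names are illustrative.

#include <stdio.h>
#include <ucontext.h>

static ucontext_t continuation;          /* "the rest after spawn": g(); sync; h(); */
static ucontext_t child_ctx;
static char child_stack[64 * 1024];

static void f(void) { puts("f (child) running"); }

static void run_child(void)
{
    f();
    setcontext(&continuation);           /* child done: resume the continuation */
}

int main(void)
{
    puts("e");

    /* spawn f(): capture the continuation and switch to the child. */
    getcontext(&child_ctx);
    child_ctx.uc_stack.ss_sp   = child_stack;
    child_ctx.uc_stack.ss_size = sizeof child_stack;
    child_ctx.uc_link = NULL;
    makecontext(&child_ctx, run_child, 0);

    swapcontext(&continuation, &child_ctx);   /* saves the full register context */

    /* In a real work-stealing runtime, a thief thread could resume here instead. */
    puts("g");
    puts("h");
    return 0;
}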

What do they mean by "scope of the program" in the C language?

When a variable is declared with the volatile keyword, its value may change at any moment from outside the scope of the program. What does that mean? Will it change outside the scope of the main function, or outside the scope of a globally declared function? What is the perspective in terms of embedded systems, if two or more events occur simultaneously?
volatile was originally intended for stuff like reading from a memory mapped hardware device; each time you read from something like a memory address mapped to a serial port it might have a new value, even if nothing in your program wrote to it. volatile makes it clear that the data there may change at any time, so it should be reread each time, rather than allowing the compiler to optimize it to a single read when it knows your program never changes it. Similar cases can occur even without hardware interference; asynchronous kernel callbacks may write back into user mode memory in a similar way, so reading the value afresh each time is sometimes necessary.
An optimizing compiler assumes there is only the context of a single thread of execution. Another context means anything the compiler can't see happening at the same time: hardware actions, interrupt handlers, or other threads or processes. Where your code accesses a global (program- or file-level) variable, the optimizer won't assume another context might change or read it unless you tell it so by using the volatile qualifier.
Take the case of a hardware register that is memory mapped and you read in a while loop waiting for it to change. Without volatile the compiler only sees your while loop reading the register and if you allow the compiler to optimize the code it will optimize away the multiple reads and never see a change in the register. This is what we normally want the optimizing compiler to do with variables that don't change in a loop.
A similar thing happens to memory mapped hardware registers you write to. If your program never reads from them the compiler could optimize away the write. Again this is what you want an optimizing compiler to do when you are not dealing with a memory location that is used by hardware or another context.
Interrupt handlers and forked threads are treated the same way as hardware: the optimizer doesn't assume they run at the same time as your code, so it will happily optimize away a load or store to a shared memory location unless you use volatile.
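A sketch of both cases described above; the register address is made up, so treat it as a placeholder for whatever the datasheet gives you.

#include <stdint.h>

#define STATUS_REG (*(volatile uint32_t *)0x40001000u)  /* hypothetical MMIO address */
#define READY_BIT  (1u << 0)

void wait_until_ready(void)
{
    /* volatile forces a fresh read of the register on every iteration
     * instead of letting the compiler hoist a single read out of the loop. */
    while ((STATUS_REG & READY_BIT) == 0) {
        /* spin */
    }
}

static volatile int data_ready;          /* set from an interrupt handler */

void wait_for_isr(void)
{
    /* Same idea for a flag shared with another context (an ISR here):
     * without volatile this loop could legally be reduced to one read. */
    while (!data_ready) {
        /* spin */
    }
}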

Embedded systems: can we use any function inside an ISR?

I stumbled upon this embedded-systems question: can we call a function inside an ISR?
I am working on an ARM Cortex-M4 and have called functions from ISRs many times without any fault.
I assume the behavior will be the same for other microcontrollers as well, or am I wrong?
Note: please ignore the fact that calling a function in an ISR would increase my ISR time and in turn the interrupt latency.
Generally, there is nothing stopping you from calling a function from an ISR. There are however some things to consider.
First of all, you should keep ISRs as short as possible. Even the function call overhead might be considered too much in some cases. So if you call functions from inside an ISR, it might be wise to inline those functions.
You must also ensure that the called function is either re-entrant or that it isn't called by other parts of the code except the ISR. If a non re-entrant function is called by the main program and your ISR both, then you'll get severe but subtle "race condition" bugs. (Just as you will if the main program and the ISR modify the same shared variable non-atomically, without semaphore guards.)
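For example, here is the kind of non-re-entrant function (made up, and the snprintf call is only for illustration) that becomes a hazard once both main() and an ISR call it, plus a re-entrant variant:

#include <stdio.h>

static char msg_buf[32];                 /* shared static state */

/* NOT re-entrant: if main() is in the middle of formatting a message and an
 * ISR calls this too, both calls clobber the same buffer. */
const char *format_msg(int code)
{
    snprintf(msg_buf, sizeof msg_buf, "code=%d", code);
    return msg_buf;
}

/* Re-entrant variant: all state lives in caller-supplied storage. */
const char *format_msg_r(char *buf, size_t len, int code)
{
    snprintf(buf, len, "code=%d", code);
    return buf;
}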
And finally, designing a system with interrupts, where you don't know if there are other interrupts in the system, is completely unprofessional. You must always consider the program's interrupt situation as whole when designing the individual interrupts. Otherwise the program will have non-existent real-time performance and no programmer involved in the project will actually know what the program is doing. And from the point where nobody knows what they are doing, bugs are guaranteed to follow.
Some RTOSes enforce a policy about which of their functions and macros can or can't be called from an ISR context, e.g. functions that may block on some shared resource. For example:
http://www.freertos.org/a00122.html
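As a hedged FreeRTOS sketch of that rule (the queue handle and IRQ name are made up): inside an ISR you use the *FromISR variants, which never block, rather than the ordinary blocking API.

#include <stdint.h>
#include "FreeRTOS.h"
#include "queue.h"

extern QueueHandle_t xEventQueue;        /* created elsewhere with xQueueCreate() */

void EXTI0_IRQHandler(void)              /* hypothetical Cortex-M IRQ handler */
{
    uint32_t event = 1;
    BaseType_t xHigherPriorityTaskWoken = pdFALSE;

    /* OK from an ISR: fails immediately if the queue is full, never blocks.
     * The plain xQueueSend() may block and must not be called here. */
    xQueueSendFromISR(xEventQueue, &event, &xHigherPriorityTaskWoken);

    portYIELD_FROM_ISR(xHigherPriorityTaskWoken);
}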

How can I allocate more stack in my program's death throes?

On a Posix system, I am coding a signal handler, using sigaction.
I would like to record some debug information, before calling exit(). This involves a few procedure calls.
If we have had a stack overflow, is there any way that I can make those function calls without messing things up further?
I know that after I do my debug stuff, I am going to call exit(), so we won't ever unwind the stack. Could I code a small assembler insert to set the stack pointer to the base of the stack?
Never mind that I am trashing it; it won't be needed later, and by trashing the start of the stack, I am not trashing beyond the end of it.
Has anyone done this, or an alternative, and shown it to work?
On POSIX, you can set up a separate stack for specific signal-handlers with sigaltstack(). The manpage on Linux for this function is very nice:
The most common usage of an alternate signal stack is to handle the SIGSEGV signal that is generated if the space available for the normal process stack is exhausted: in this case, a signal handler for SIGSEGV cannot be invoked on the process stack; if we wish to handle it, we must use an alternate signal stack.
One thing to keep in mind is that you need to use sigaction() rather than signal() to establish the relevant signal handler, but that's a good idea anyway. Also, the sa_flags in sigaction()'s struct sigaction need to contain SA_ONSTACK.
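A minimal sketch of that setup, with error handling kept simple and the handler restricted to async-signal-safe calls:

#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

static void segv_handler(int sig)
{
    (void)sig;
    /* Only async-signal-safe functions should be used here. */
    static const char msg[] = "caught SIGSEGV on the alternate stack\n";
    write(STDERR_FILENO, msg, sizeof msg - 1);
    _exit(EXIT_FAILURE);
}

int main(void)
{
    stack_t ss;
    struct sigaction sa;

    ss.ss_sp = malloc(SIGSTKSZ);         /* dedicated stack for the handler */
    ss.ss_size = SIGSTKSZ;
    ss.ss_flags = 0;
    if (ss.ss_sp == NULL || sigaltstack(&ss, NULL) == -1) {
        perror("sigaltstack");
        return 1;
    }

    sa.sa_handler = segv_handler;
    sa.sa_flags = SA_ONSTACK;            /* run the handler on the alternate stack */
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) == -1) {
        perror("sigaction");
        return 1;
    }

    /* ... the rest of the program; a stack overflow can now be handled ... */
    return 0;
}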
There is no requirement that your program use the call stack for its call frames, as opposed to a stack data structure. Since C is a Turing-complete programming language, you can rewrite any functionally recursive loop (i.e. a loop built from function invocations) as a procedural loop (for, while or do .. while), provided you introduce the appropriate data structures.
You might then find that growing a stack (the data structure, not the call stack) to several gigabytes with realloc is trivial on most laptops. As an added bonus, you'll no longer need to delve into non-portable hacks such as implementation-defined assembler notations or signals.
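For instance (purely illustrative types and names), a recursive tree walk becomes a loop whose stack is a realloc-grown array, so its depth is bounded by heap memory rather than by the call stack:

#include <stdlib.h>

struct node { int value; struct node *left, *right; };

long sum_tree(const struct node *root)
{
    size_t cap = 64, top = 0;
    const struct node **stack = malloc(cap * sizeof *stack);
    long sum = 0;

    if (stack == NULL)
        return 0;                        /* simplistic error handling */
    if (root != NULL)
        stack[top++] = root;

    while (top > 0) {
        const struct node *n = stack[--top];
        sum += n->value;

        if (top + 2 > cap) {             /* grow the data-structure stack */
            const struct node **bigger = realloc(stack, 2 * cap * sizeof *stack);
            if (bigger == NULL)
                break;                   /* out of memory: give up early */
            stack = bigger;
            cap *= 2;
        }
        if (n->left)  stack[top++] = n->left;
        if (n->right) stack[top++] = n->right;
    }
    free(stack);
    return sum;
}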
