In a signal handler under Linux, one has access to the saved context (all register values) of the suspended thread. These register values are obviously architecture dependent. For example, for a 64-bit PowerPC Little Endian (ppc64le) architecture, ucontext->uc_regs->gp_regs is an array that contains the values of the general purpose registers.
For certain architectures there are also defines (e.g., the REG_XXX defines for x86-64) which identify the purpose of the registers. For ppc64le such definitions are missing. How can I figure out which registers are which? The little IBM documentation available did not help...
I'm not aware of this being documented anywhere. However, setup_sigcontext for ppc64 fills in the gp_regs array from a struct pt_regs that forms part of the task state. Therefore, that struct can be taken as a guide for which registers are which. There is also a set of PT_Rxxx defines immediately below the definition of that struct, which confirms bits of the mapping that are not immediately obvious from the struct (e.g. general purpose register 1 is indeed in gp_regs[1]).
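As a rough sketch of how that mapping might be used from a handler (assuming Linux with glibc; saved_pc, on_signal, capture_pc_once and SAVED_PC_SUPPORTED are made-up names): the ppc64 branch indexes gp_regs with the PT_ defines from <asm/ptrace.h>, reaching the array through uc_mcontext, which is how I believe glibc exposes it on ppc64. The x86-64 and aarch64 branches are only there so the sketch builds on other machines. Treat it as an illustration, not a reference.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* must precede all includes; exposes REG_RIP on x86-64 */
#endif
#include <signal.h>
#include <stdint.h>
#include <string.h>
#include <ucontext.h>
#if defined(__powerpc64__)
#include <asm/ptrace.h>        /* PT_R0..PT_R31, PT_NIP, PT_LNK, ... index defines */
#endif

/* Saved program counter of the interrupted thread, or 0 when this
 * sketch does not know how to find it on the current architecture. */
static uintptr_t saved_pc(const ucontext_t *uc)
{
#if defined(__powerpc64__)
    /* Assumption: gp_regs mirrors struct pt_regs, so PT_NIP selects the
     * next-instruction pointer and PT_R1 would select the stack pointer. */
#define SAVED_PC_SUPPORTED 1
    return (uintptr_t)uc->uc_mcontext.gp_regs[PT_NIP];
#elif defined(__x86_64__) && defined(REG_RIP)
#define SAVED_PC_SUPPORTED 1
    return (uintptr_t)uc->uc_mcontext.gregs[REG_RIP];
#elif defined(__aarch64__)
#define SAVED_PC_SUPPORTED 1
    return (uintptr_t)uc->uc_mcontext.pc;
#else
#define SAVED_PC_SUPPORTED 0
    (void)uc;
    return 0;
#endif
}

static volatile uintptr_t last_pc;   /* written by the handler */

/* SA_SIGINFO handler: the third argument is the saved ucontext_t. */
static void on_signal(int sig, siginfo_t *info, void *ctx)
{
    (void)sig;
    (void)info;
    last_pc = saved_pc((const ucontext_t *)ctx);
}

/* Installs the handler, raises SIGUSR1 in the calling thread and
 * returns the program counter the handler captured (0 on failure). */
static uintptr_t capture_pc_once(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = on_signal;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) != 0)
        return 0;

    last_pc = 0;
    raise(SIGUSR1);              /* handler runs before raise() returns */
    return last_pc;
}
```

Calling capture_pc_once() from main should give a non-zero address on the architectures the sketch covers. Other registers can be read the same way by swapping PT_NIP for another PT_ index on ppc64.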
As stated, what software-visible processor state needs to go in a jmp_buf on an x86-64 processor when setjmp(jmp_buf env) is called? What processor state does not?
I have been reading a lot about setjmp and longjmp but couldn't find a clear answer to my question. I know it is implementation dependent but I would like to know for the x86_64 architecture.
From the following implementation
it seems that on an x86-64 machine all the callee saved registers (%r12-%r15, %rbp, %rbx) need to be saved as well as the stack pointer, program counter and all the saved arguments of the current environment. However I'm not sure about that, hope someone could clarify that for me.
For example, which x86-64 registers need to be saved? What about condition flags? For example, I think the floating point registers do not need to be saved because they don't contribute to the state of the program.
That's because of the calling convention. setjmp is a function-call that can return multiple times (the first time when you actually call it, later times when a child function calls longjmp), but it's still a function call. Like any function call, the compiler assumes that all call-clobbered registers have been clobbered, so longjmp doesn't need to restore them.
So yes, they're not part of the "program state" on a function call boundary because the compiler-generated asm is definitely not keeping any values in them.
You're looking at glibc's implementation for the x86-64 System V ABI, where all vector / x87 registers are call-clobbered and thus don't have to be saved.
In the Windows x86-64 calling convention, xmm6-15 are call-preserved (just the low 128 bits, not the upper portions of y/zmm6-15), and would have to be part of the jmp_buf.
i.e. it's not the CPU architecture that's relevant here, it's the software calling convention.
Besides the call-preserved registers, one key thing is that it's only legal to longjmp to a jmp_buf saved by a parent function, not from any arbitrary function after the function that called setjmp has returned.
If setjmp had to support that, it would have to save the entire stack frame, or actually (for the function to be able to return, and that parent to be able to return, etc.) the whole stack all the way up to the top. This is obviously insane, and thus it's clear why longjmp has that restriction of only being able to jump to parent / (great) grandparent functions, so it just has to restore the stack pointer to point at the still-existing stack frame and restore whatever local variables in that function might have been modified since setjmp.
(On C / C++ implementations on architectures / calling conventions that use something other than a normal call-stack, a similar argument about the jump-target function being able to return still applies.)
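To make the "jump back into a parent" rule concrete, here is a small sketch in plain ISO C. The names recover_point, worker and run_guarded are invented for the example. The child longjmps into a jmp_buf filled in by a caller that is still on the stack. The local that changes between setjmp and longjmp is volatile because it is not part of what longjmp restores.

```c
#include <setjmp.h>

static jmp_buf recover_point;   /* filled in by setjmp() in run_guarded() */

/* Child: bails out by jumping back into its still-active caller. */
static void worker(int should_fail)
{
    if (should_fail)
        longjmp(recover_point, 1);   /* does not return */
}

/* Parent: returns the number of steps completed (2) on the normal path,
 * or the negated count of steps finished before the jump. */
static int run_guarded(int should_fail)
{
    /* volatile: modified between setjmp and longjmp, so it has to live in
     * memory instead of a call-preserved register that longjmp rolls back. */
    volatile int steps = 0;

    if (setjmp(recover_point) != 0)
        return -steps;               /* second return, reached via longjmp */

    steps = 1;
    worker(should_fail);             /* may jump back to the setjmp above */
    steps = 2;
    return steps;
}
```

run_guarded(0) takes the normal path; run_guarded(1) comes back through the setjmp with steps still at 1. Calling longjmp(recover_point, ...) after run_guarded has returned would be the illegal case described above, since the frame it points into no longer exists.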
As the jmp_buf is the only place that can be used to restore processor state on a longjmp, it generally has to hold everything that is needed to restore the full state of the machine as it was when setjmp was called.
This obviously depends very much on the processor and the compiler (what exactly does it use of the CPU's features to store program state):
On an ideal pure-stack machine that holds CPU state nowhere but on the stack, that would be the stack pointer only. Outside of very old or purely academic implementations, such machines rarely exist. You could, however, write a compiler for a modern machine like an x86 that uses only the stack to store such information. For such a hypothetical compiler, saving just the stack pointer would suffice to restore program state.
On a more common, practical machine, this might be the stack pointer and the full set of registers used to store program status.
On some CPUs that store program status information in other places, for example in a zero page, and with compilers that make use of such CPU features, the jmp_buf would also need to store a copy of this zero page (certain 65xx CPUs or Atmel AVR MCUs and their compilers might use this feature).
I'm flipping through some C header files for a microcontroller, and I keep seeing register addresses initialized as vuint. I haven't come across this data type before, so I did a bit of searching, with no real results. The closest I got was from https://stackoverflow.com/a/12855989, which tells me that v stands for "volatile". So, I have volatile unsigned ints holding hardware register addresses. As in, I have a data type that explicitly states "This address is subject to change", representing registers that are hard-wired, and cannot change, like, ever. Is my understanding of vuint incorrect? If not, why are we representing addresses this way?
Memory mapped registers are set as volatile because the values in them can change for external reasons (hardware interrupt, etc...) that the compiler does not know about. This means that the compiler should avoid certain optimizations and ensure that reads to the address are actually made (rather than being optimized out for cached values, etc...).
Quick example: a memory-mapped register that contains some flags.
read flags
set bit in flags
interrupt sets another bit
<compiler optimizes and reuses the cached flags from before>
read flags <contains incorrect cached value>
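The sequence above could look roughly like this in C. An ordinary variable stands in for the hardware register so the sketch runs on a PC, and FLAG_READY, FLAG_IRQ and the function names are invented for the example. Because every access goes through a pointer-to-volatile, the last read has to be a real load.

```c
#include <stdint.h>

/* Stand-in for a flags register. On real hardware this would be a fixed
 * address taken from the datasheet, not a normal variable. */
static uint32_t fake_flags_reg;

#define FLAG_READY 0x1u   /* hypothetical bit set by the main code */
#define FLAG_IRQ   0x2u   /* hypothetical bit set by an interrupt  */

/* Plays the role of the interrupt that sets another bit. */
static void simulated_interrupt(volatile uint32_t *flags)
{
    *flags |= FLAG_IRQ;
}

/* Walks the read / set / interrupt / read sequence and returns what the
 * final read sees. */
static uint32_t read_after_interrupt(void)
{
    volatile uint32_t *flags = &fake_flags_reg;

    uint32_t before = *flags;        /* read flags */
    *flags = before | FLAG_READY;    /* set bit in flags */
    simulated_interrupt(flags);      /* interrupt sets another bit */
    return *flags;                   /* read flags: a fresh load, not `before` */
}
```

Without the volatile qualifier, and with a real asynchronous interrupt instead of a function call, the compiler would be free to reuse the value it already had in a register for the final read.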
I think you are misinterpreting the type. It is most likely a pointer to a volatile unsigned integer, indicating that the unsigned integer is volatile and not the pointer. This is typical when describing hardware registers via structs. Each of the struct members will be a volatile unsigned integer, and somewhere there will be a base address defined that indicates where the registers start in the memory map.
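A header following that pattern might look something like the sketch below. Everything in it is assumed for illustration: the vuint32 spelling, the uart_regs_t layout, CTRL_TX_START and the commented-out address. A static object replaces the real base address so it can be compiled and run on a host.

```c
#include <stddef.h>
#include <stdint.h>

typedef volatile uint32_t vuint32;   /* the "v" marks the data, not the address */

/* Register block of a hypothetical UART, one member per register. */
typedef struct {
    vuint32 ctrl;     /* offset 0 */
    vuint32 status;   /* offset 4 */
    vuint32 data;     /* offset 8 */
} uart_regs_t;

/* The struct layout has to match the memory map. */
_Static_assert(offsetof(uart_regs_t, data) == 8, "data register must sit at offset 8");

#define CTRL_TX_START 0x1u   /* hypothetical control bit */

/* On hardware the base would be a fixed address cast to a pointer, e.g.
 *   #define UART0 ((uart_regs_t *)0x40001000u)    (address invented)
 * Here a static object stands in for the peripheral. */
static uart_regs_t fake_uart0;
static uart_regs_t *const UART0 = &fake_uart0;   /* the pointer itself never changes */

/* Writes one byte to the data register, then requests transmission. */
static void uart_send(uart_regs_t *uart, uint8_t byte)
{
    uart->data = byte;             /* volatile store, always emitted */
    uart->ctrl |= CTRL_TX_START;   /* volatile load followed by volatile store */
}
```

The base pointer is const and not volatile, matching the point that the address is fixed; only the register contents behind it are treated as changeable.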
I'd like to be able to use something like this to make access to my ports clearer:
typedef struct {
    unsigned rfid_en: 1;
    unsigned lcd_en: 1;
    unsigned lcd_rs: 1;
    unsigned lcd_color: 3;
    unsigned unused: 2;
} portc_t;
extern volatile portc_t *portc;
But is it safe? It works for me, but...
1) Is there a chance of race conditions?
2) Does gcc generate read-modify-write cycles for code that modifies a single field?
3) Is there a safe way to update multiple fields?
4) Is the bit packing and order guaranteed? (I don't care about portability in this case, so gcc-specific options to make it Do What I Mean are fine.)
1) Handling race conditions must be done by operating-system-level calls (which will indeed use read-modify-writes); GCC won't do that.
2) Same as above, and no, GCC does not generate read-modify-write instructions for volatile. However, a CPU will normally do the write atomically (simply because it's one instruction). This holds if the bit-field stays within an int, for example, but it is CPU/implementation dependent: some guarantee this for values up to 8 bytes, while others only for values up to 4 bytes. Under that condition, bits can't get mixed up (i.e. a few bits written from one thread and others from another thread won't occur).
3) The only way to set multiple fields at the same time is to set the values in an intermediate variable and then assign this variable to the volatile.
4) The C standard specifies that bits are packed together (it seems that there might be exceptions when you start mixing types, but I've never seen that; everyone always uses unsigned ...).
Note: Defining something volatile does not cause a compiler to generate read-modify-writes. What volatile does is tell the compiler that an assignment to that pointer/address must always be made, and may not be optimised away.
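A sketch of the intermediate-variable approach, reusing the portc_t layout from the question. The fake_portc object and lcd_configure are invented so it can run on a host. Whether the compiler turns each struct copy into a single load or store depends on the compiler and target, so the generated assembly is worth checking. It also does not protect against an interrupt landing between the read and the write.

```c
typedef struct {
    unsigned rfid_en   : 1;
    unsigned lcd_en    : 1;
    unsigned lcd_rs    : 1;
    unsigned lcd_color : 3;
    unsigned unused    : 2;
} portc_t;

/* Stand-in for the real port; on the target, portc would point at the
 * port's fixed address instead. */
static portc_t fake_portc;
static volatile portc_t *const portc = &fake_portc;

/* Updates three LCD fields with one read and one write of the port. */
static void lcd_configure(unsigned rs, unsigned color)
{
    portc_t tmp = *portc;         /* copy the whole port into a local */

    tmp.lcd_en    = 1;            /* modify the local copy only */
    tmp.lcd_rs    = rs & 1u;
    tmp.lcd_color = color & 7u;

    *portc = tmp;                 /* write all fields back in one assignment */
}
```

Fields that the function does not touch, such as rfid_en, keep whatever value was read into tmp.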
Here's another post about the same subject matter. I found there to be quite a few other places where you can find more details.
The keyword volatile has nothing to do with race conditions, or what thread is accessing code. The keyword tells the compiler not to cache the value in registers. It tells the compiler to generate code so that every access goes to the location allocated to the variable, because each access may see a different value. This is the case with memory mapped peripherals. This doesn't help if your MPU has its own cache. There are usually special instructions or un-cached areas of the memory map to ensure the location, and not a cached copy, is read.
As for being thread safe, just remember that even a memory access may not be thread safe if it is done in two instructions. E.g. in 8051 assembler, you have to get a 16-bit value one byte at a time. The instruction sequence can be interrupted by an IRQ or another thread before the second byte is read or written, potentially corrupting the value.
The only thing that I know about the mechanism of how C passes values is that it is done either through a register or the stack.
Register or Stack? Exactly how?
Both. And the conventions will vary by platform.
On x86, values are usually passed by stack. On x64, passing by register is preferred.
In all cases, if you have too many parameters, some will have to be passed by stack.
Refer to x86 calling conventions
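For a concrete picture, take a function with eight integer arguments (sum8 is just a made-up example). The register assignments in the comment are the ones documented for the respective conventions; what a given compiler actually emits can be checked with an assembly listing (e.g. gcc -S).

```c
/* Where the arguments travel, by calling convention:
 *
 *   x86-64 System V (Linux, macOS): a..f in rdi, rsi, rdx, rcx, r8, r9;
 *                                   g and h on the stack.
 *   Windows x64:                    a..d in rcx, rdx, r8, r9;
 *                                   e..h on the stack.
 *   32-bit x86 cdecl:               all eight pushed on the stack.
 *
 * The C source is identical in every case; only the generated code differs. */
static long sum8(long a, long b, long c, long d,
                 long e, long f, long g, long h)
{
    return a + b + c + d + e + f + g + h;
}
```

A call such as sum8(1, 2, 3, 4, 5, 6, 7, 8) returns 36 regardless of which convention the compiler used to pass the values.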
Typically (some compilers will do it differently, as pointed out), arguments to normal function calls are passed on the stack. That is, there is usually a series of push instructions that just put the data onto the stack.
There are special cases such as system calls where parameters get passed via assembly instructions and registers. In hardware cases they are passed via registers or even certain interrupt signals which consequently write to registers.
On architectures with a large number of registers, such as some RISC and 64-bit architectures, they are usually passed via registers.
I'm using 32-bit microcontroller (STR91x). I'm concurrently accessing (from ISR and main loop) struct member of type enum. Access is limited to writing to that enum field in the ISR and checking in the main loop. Enum's underlying type is not larger than integer (32-bit).
I would like to make sure that I'm not missing anything and I can safely do it.
Provided that 32 bit reads and writes are atomic, which is almost certainly the case (you might want to make sure that your enum's word-aligned) then that which you've described will be just fine.
As paxdiablo & David Knell said, generally speaking this is fine. Even if your bus is < 32 bits, chances are the instruction's multiple bus cycles won't be interrupted, and you'll always read valid data.
What you stated, and what we all know, but it bears repeating, is that this is fine for a single-writer, N-reader situation. If you had more than one writer, all bets are off unless you have a construct to protect the data.
If you want to make sure, find the compiler switch that generates an assembly listing and examine the assembly for the write in the ISR and the read in the main loop. Even if you are not familiar with ARM assembly, you should be able to quickly and easily discern whether or not the reads and writes are atomic.
ARM supports 32-bit aligned reads that are atomic as far as interrupts are concerned. However, make sure your compiler doesn't try to cache the value in a register! Either mark it as a volatile, or use an explicit memory barrier - on GCC this can be done like so:
int tmp = yourvariable;
__sync_synchronize(yourvariable);
Note, however, that current versions of GCC perform a full memory barrier for __sync_synchronize, rather than just for the one variable, so volatile is probably better for your needs.
Further, note that your variable will be aligned automatically unless you are doing something Weird (i.e., explicitly specifying the location of the struct in memory, or requesting a packed struct). Unaligned variables on ARM cannot be read atomically, so make sure it's aligned, or disable interrupts while reading.
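Putting the volatile and alignment advice together, a sketch for the single-writer case could look like this. link_state_t, device_t and the function names are invented, and the ISR is an ordinary function here so the code runs on a host. The static assertions are only a compile-time sanity check on size and alignment; they say nothing about what the bus does.

```c
#include <stddef.h>

typedef enum { LINK_DOWN = 0, LINK_UP, LINK_ERROR } link_state_t;

typedef struct {
    unsigned packets;
    volatile link_state_t state;   /* written by the ISR, read by the main loop */
} device_t;

/* The enum should fit in one 32-bit access and sit on a natural boundary.
 * A packed struct could break the second check. */
_Static_assert(sizeof(link_state_t) <= 4, "state must fit in one 32-bit word");
_Static_assert(offsetof(device_t, state) % sizeof(link_state_t) == 0,
               "state must be naturally aligned");

static device_t dev;   /* statics start zeroed, so state begins as LINK_DOWN */

/* The single writer: on the target this would be the interrupt handler. */
static void link_isr(void)
{
    dev.state = LINK_UP;
}

/* The reader: volatile forces a fresh load on every call, so a polling
 * loop in main cannot be folded into a one-time read. */
static int link_is_up(void)
{
    return dev.state == LINK_UP;
}
```

This only covers one writer and any number of readers; with a second writer some form of locking or interrupt masking is needed.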
Well, it depends entirely on your hardware but I'd be surprised if an ISR could be interrupted by the main thread.
So probably the only thing you have to watch out for is if the main thread could be interrupted halfway through a read (so it may get part of the old value and part of the new).
It should be a simple matter of consulting the specs to ensure that interrupts are only processed between instructions (this is likely since the alternative would be very complex) and that your 32-bit load is a single instruction.
An aligned 32 bit access will generally be atomic (unless it were a particularly ludicrous compiler!).
However the rock-solid solution (and one generally applicable to non-32 bit targets too) is to simply disable the interrupt temporarily while accessing the data outside of the interrupt. The most robust way to do this is through an access function to statically scoped data rather than making the data global where you then have no single point of access and therefore no way of enforcing an atomic access mechanism when needed.
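A sketch of such an access function follows. irq_lock and irq_unlock are hypothetical placeholders for whatever the target provides to mask and restore interrupts; here they only count nesting so the example runs on a host. The 64-bit counter is chosen because a 32-bit core cannot read it in one access.

```c
#include <stdint.h>

/* Hypothetical port layer. On a real target these would disable
 * interrupts and restore the previous state. */
static int irq_lock_depth;
static void irq_lock(void)   { irq_lock_depth++; }
static void irq_unlock(void) { irq_lock_depth--; }

/* File-scope static: nothing outside this file can touch the counter,
 * so ticks_get() is the only way for other code to read it. */
static volatile uint64_t tick_count;

/* Called from the timer interrupt, where further interrupts are
 * assumed to be masked already. */
void tick_isr(void)
{
    tick_count++;
}

/* Called from the main loop: takes a consistent snapshot of the counter
 * while the interrupt cannot run. */
uint64_t ticks_get(void)
{
    irq_lock();
    uint64_t snapshot = tick_count;   /* may take two 32-bit loads */
    irq_unlock();
    return snapshot;
}
```

Because every read outside the interrupt goes through ticks_get(), the locking policy can later be changed in one place.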