One of my colleagues ran into some strange problems while programming an ATMega, related to accessing input/output ports.
After observing the problem and doing some research, I concluded that we should avoid accessing SFRs with operations which may compile to SBI or CBI instructions if we aim for safe, C-standard-compliant software. I would like to know whether this decision was right, that is, whether my concerns here are valid.
The datasheet of the Atmel processor is here, it's an ATMega16. I will refer to some pages of this document below.
I will refer to the C standard using the version found on this site under the WG14 N1256 link.
The SBI and CBI instructions of the processor operate at bit level, accessing only the bit in question. So they are not true Read-Modify-Write (R-M-W) instructions since, as I understand it, they do not perform a read of the targeted 8-bit SFR.
On page 50 of the above datasheet the first sentence begins "All AVR ports have true Read-Modify-Write functionality...", and it goes on to specify that this only applies to accesses with the SBI and CBI instructions, which technically are not R-M-W. The datasheet does not define what reading, for example, the PORTx registers is supposed to return (it does, however, indicate that they are readable). So I assumed that the result of reading these SFRs is undefined (they might return the last value written to them, the current input state, or something else).
On page 70 it lists some external interrupt flags. This is interesting because this is where the nature of the SBI and CBI instructions becomes important. The flags are set when an interrupt occurs, and they may be cleared by writing a one to them. So if SBI were a true R-M-W instruction, it would clear every flag that happens to be set, regardless of the bit specified in the opcode.
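To make the concern concrete, here is a minimal host-side sketch of a write-one-to-clear flag register. Everything in it (`sim_flags`, `flags_write`, `flags_raise` and the two `clear_flag_*` helpers) is a hypothetical model invented for illustration, not the ATmega16 register interface. It only shows why a byte-wide read-OR-write would clear every pending flag, while writing just the mask clears one.

```c
#include <stdint.h>

/* Hypothetical model of a write-one-to-clear flag register (not real hardware). */
static uint8_t sim_flags;

uint8_t flags_read(void) { return sim_flags; }

/* Hardware semantics being modelled: every 1 bit written clears that flag. */
void flags_write(uint8_t value) { sim_flags &= (uint8_t)~value; }

/* Stands in for the hardware raising flags when events occur. */
void flags_raise(uint8_t mask) { sim_flags |= mask; }

/* What a byte-wide "REG |= mask" amounts to: read, OR, write back. */
void clear_flag_rmw(uint8_t mask) {
    flags_write((uint8_t)(flags_read() | mask));
}

/* Writing only the mask touches only the requested flag. */
void clear_flag_direct(uint8_t mask) {
    flags_write(mask);
}
```

With three flags pending (0xE0 in this model), `clear_flag_rmw(0x80)` leaves no flag set, whereas `clear_flag_direct(0x80)` leaves the other two pending.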
And now let's get into the matters of C.
The compiler itself is truly irrelevant; the only important fact is that it might use the CBI and SBI instructions in certain situations, which I think makes it non-compliant.
In the above-mentioned C99 standard, the relevant parts are section 5.1.2.3 Program execution, points 2 and 3 (on page 13), and 6.7.3 Type qualifiers, point 6 (on page 109). The latter mentions that "What constitutes an access to an object that has volatile-qualified type is implementation-defined"; however, a few phrases earlier it requires that "any expression referring to such an object shall be evaluated strictly according to the rules of the abstract machine".
Also note that hardware ports such as that used in the example are declared volatile in the appropriate headers.
Example:
PORTA |= 1U << 6;
This is known to translate to an SBI. This implies that only a Write access happens on the volatile (PORTA) object. However if one would write:
var = 6;
...
PORTA |= 1U << var;
That would not translate to an SBI even though it will still only set one bit (since SBI has the bit to set encoded in the opcode). So this will expand to a true R-M-W sequence with a potentially different result than above (in the case of PORTA this is undefined behaviour as far as I could deduce from the datasheet).
By the C standard this behaviour might or might not be permitted. It is also messy in that two things happen here and get mixed together. One, the more apparent, is the lack of the Read access in one of the cases. The other, less apparent, is how the Write is performed.
If the compiled code omits the Read, it might fail to trigger hardware behaviour which is tied to such an access. However the AVR as far as I know has no such mechanism, so it might pass by the standard.
The Write is more interesting, however it also takes in the Read.
Omitting the Read in the case of using SBI implies that the affected SFRs must all work like latches (or any bit not working like that is tied to either 0 or 1), so the compiler can be sure of what it would read from them if it actually did the access. If this were not the case, the compiler would at least be buggy. By the way, this also clashes with the fact that the datasheet does not define what is read from the PORTx registers.
How the write is performed is also a source of inconsistency: the result is different depending on how the compiler compiles it (a CBI or SBI affecting only one bit, a byte write affecting all bits). So writing code to clear / set one bit might either "work" (as in not "accidentally" clearing interrupt flags), or not if the compiler produces a true R-M-W sequence instead.
Maybe these are technically permitted by the C standard (as "implementation defined" behaviour, with the compiler deducing in these cases that the Read access to the volatile object is not necessary), but at least I would consider it a buggy or inconsistent implementation.
Another example:
PORTA = PORTA | (1U << 6);
It is clearly visible that, to conform with the standard, a Read and then a Write of PORTA should normally be carried out. With SBI, however, the Read access is missing, although as above this may pass as a mix of implementation-defined behaviour and the compiler deducing that the Read is unnecessary here. (Or was my assumption wrong? That is, assuming a |= b to be identical to a = a | b?)
So based on these I concluded that we should avoid these types of code, as it is (or may be in the future) unclear how they behave depending on whether the compiler uses SBI or CBI, or a true R-M-W sequence.
To tell the truth, I mostly relied on various forum posts and the like to resolve this, not on analysing actual compiler output. It is not my project after all (and I am not at work now). From reading AVRFreaks, for example, I accepted that AVR-GCC would output these instructions in the above-mentioned situations, which alone may pose a problem even if we wouldn't observe this with the actual version we used. (However, I think my point stood in this case, as my suggestion to implement port accesses using shadow work variables fixed the problems my colleague observed.)
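For reference, the shadow-variable approach mentioned above might look roughly like the following sketch. The name `sim_porta` is a stand-in so the code runs on a host; on the target it would be the volatile PORTA register from the device header. The helper names are hypothetical, not the code my colleague actually used.

```c
#include <stdint.h>

/* Stand-in for the real PORTA register so the sketch runs on a host. */
volatile uint8_t sim_porta;

/* RAM copy of the intended output state (the "shadow"). */
static uint8_t porta_shadow;

void porta_set_bits(uint8_t mask) {
    porta_shadow |= mask;
    sim_porta = porta_shadow;   /* a single byte write, the port is never read */
}

void porta_clear_bits(uint8_t mask) {
    porta_shadow &= (uint8_t)~mask;
    sim_porta = porta_shadow;   /* a single byte write, the port is never read */
}
```

Because the port is only ever written, the outcome no longer depends on what a read of the register returns or on which instruction the compiler picks. If an interrupt handler also updates the same port, the shadow update itself would still need protection.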
Note: I edited the middle based on some research on the C (C99) standard.
Edit: Reading the AVR Libc FAQ I again found something which contradicts the automatic use of SBI or CBI. It is the last question & answer where it specifically states that since the ports are declared volatile the compiler can not optimize out the read access, according to the rules of the C language (as it phrases).
I also understand that it is very unlikely that this particular behaviour (that is using SBI or CBI) would directly introduce bugs, but by masking "bugs" it may introduce very nasty ones in the long run if someone accidentally generalizes based on this behaviour while not understanding the AVR at assembly level.
You should probably stop trying to apply the C memory model to I/O registers. They are not plain memory. In the case of PORTn registers, it is in fact irrelevant whether it is a single bit write or a R-M-W operation unless you're mixing in interrupts. If you do read-modify-write an interrupt may alter state in between, causing a race condition; but that would be exactly the same issue for memory. The advantage of the SBI/CBI instructions there is that they are atomic.
The PORTn registers are readable, and also drive the output buffers. They are not different functions on read and write (as on PIC), but a normal register. Newer PICs also have the output registers readable on LAT addresses, precisely so you won't need a shadow variable. Other SFRs such as PINn or interrupt flags have more complicated behaviour. On recent AVRs, writing to PINn instead toggles bits in PORTn, which again is useful for its fast and atomic operation. Writing 1s to interrupt flag registers clears them, again to prevent race conditions.
The point is, these features are in place to produce correct behaviour for hardware aware programs, even if some of it looks odd in C code (i.e. using reg=_BV(2); instead of reg&=~_BV(2);). Precise compliance with the C standard is an impractical goal when the code is by its very nature hardware specific (though semantic similarity does help, which the interrupt flag behaviour fails at). Wrapping the odd constructs in inline functions or macros with names that explain what they truly do is probably a good idea, or at least commenting what the effects are. A set of such I/O routines could also form the basis of a hardware abstraction layer that may help you port code.
Trying to interpret the C specification strictly here is also rather confusing, as it doesn't admit to addressing bits (which is what SBI and CBI do), and digging through my old (1992) copy finds that volatile accesses may result in several implementation defined behaviours, including the possibility of no accesses at all.
I'm working on a portable library for baremetal embedded applications.
Assume that I have a timer ISR that increments a counter and that, in the main loop, this counter is read with a load that is almost certainly not atomic.
I'm trying to ensure load consistency (i.e. that I'm not reading garbage because the load was interrupted and the value changed) without resorting to disabling interrupts. It does not matter if the value changed after reading the counter as long as the read value is proper. Does this do the trick?
uint32_t read(volatile uint32_t *var){
    uint32_t value;
    do { value = *var; } while(value != *var);
    return value;
}
It's highly unlikely that there's any sort of a portable solution for this, not least because plenty of C-only platforms are really C-only and use one-off compilers, i.e. nothing mainstream and modern-standards-compliant like gcc or clang. So if you're truly targeting entrenched C, then it's all quite platform-specific and not portable - to the point where "C99" support is a lost cause. The best you can expect for portable C code is ANSI C support - referring to the very first non-draft C standard published by ANSI. That is still, unfortunately, the common denominator - that major vendors get away with. I mean: Zilog somehow gets away with it, even if they are now but a division of Littelfuse, formerly a division of IXYS Semiconductor that Littelfuse had acquired.
For example, here are some compilers where there's only a platform-specific way of doing it:
Zilog eZ8 using a "recent" Zilog C compiler (anything 20 years old or less is OK): 8-bit value read-modify-write is atomic. 16-bit operations where the compiler generates word-aligned word instructions like LDWX, INCW, DECW are atomic as well. If the read-modify-write otherwise fits into 3 instructions or less, you'd prepend the operation with asm("\tATM");. Otherwise, you'd need to disable the interrupts: asm("\tPUSHF\n\tDI");, and subsequently re-enable them: asm("\tPOPF");.
Zilog ZNEO is a 16 bit platform with 32-bit registers, and read-modify-write accesses on registers are atomic, but memory read-modify-write round-trips via a register, usually, and takes 3 instructions - thus prepend the R-M-W operation with asm("\tATM").
Zilog Z80 and eZ80 require wrapping the code in asm("\tDI") and asm("\tEI"), although this is valid only when it's known that the interrupts are always enabled when your code runs. If they may not be enabled, then there's a problem since Z80 does not allow reading the state of IFF1 - the interrupt enable flip-flop. So you'd need to save a "shadow" of its state somewhere, and use that value to conditionally enable interrupts. Unfortunately, eZ80 does not provide an interrupt controller register that would allow access to IEF1 (eZ80 uses the IEFn nomenclature instead of IFFn) - so this architectural oversight is carried over from the venerable Z80 to the "modern" one.
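The "shadow" of the interrupt-enable state described for the Z80 and eZ80 could be sketched as below. `HW_EI` and `HW_DI` are hypothetical placeholders that default to no-ops so the sketch builds on a host; on a target they would expand to the asm("\tEI") and asm("\tDI") statements mentioned above. All function names are made up for illustration.

```c
#include <stdbool.h>

/* Hypothetical hooks: no-ops here, asm("\tEI") / asm("\tDI") on a real target. */
#ifndef HW_EI
#define HW_EI() ((void)0)
#endif
#ifndef HW_DI
#define HW_DI() ((void)0)
#endif

/* Software copy of the interrupt-enable flip-flop, which cannot be read back. */
static bool irq_shadow;

bool irq_is_enabled(void) { return irq_shadow; }

void irq_enable(void)  { irq_shadow = true;  HW_EI(); }
void irq_disable(void) { HW_DI(); irq_shadow = false; }

/* Enter a critical section; returns whether interrupts were enabled before. */
bool irq_save_and_disable(void) {
    bool was_enabled = irq_shadow;
    irq_disable();
    return was_enabled;
}

/* Leave a critical section; re-enable only if they were enabled on entry. */
void irq_restore(bool was_enabled) {
    if (was_enabled) {
        irq_enable();
    }
}
```

This only works if every piece of code changes the interrupt state through these functions, so that the shadow never goes stale.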
Those aren't necessarily the most popular platforms out there, and many people don't bother with Zilog compilers due to their fairly poor quality (low enough that yours truly had to write an eZ8-targeting compiler*). Yet such odd corners are the mainstay of C-only code bases, and library code has no choice but to accommodate this, if not directly then at least by providing macros that can be redefined with platform-specific magic.
E.g. you could provide empty-by-default macros MYLIB_BEGIN_ATOMIC(vector) and MYLIB_END_ATOMIC(vector) that would be used to wrap code that requires access atomic with respect to a given interrupt vector (or e.g. -1 if with respect to all interrupt vectors). Naturally, replace MYLIB_ with a "namespace" prefix specific to your library.
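A minimal sketch of such empty-by-default macros follows. `MYLIB_ALL_VECTORS`, `mylib_tick_isr` and `mylib_read_ticks` are hypothetical names introduced for the example; a port would predefine the two macros with whatever its platform needs (ATM, DI/EI, and so on) before this code is compiled.

```c
#include <stdint.h>

/* Empty by default; a platform port overrides these before including the library. */
#ifndef MYLIB_BEGIN_ATOMIC
#define MYLIB_BEGIN_ATOMIC(vector) ((void)(vector))
#endif
#ifndef MYLIB_END_ATOMIC
#define MYLIB_END_ATOMIC(vector) ((void)(vector))
#endif

/* Hypothetical convention: -1 means "atomic with respect to all vectors". */
#define MYLIB_ALL_VECTORS (-1)

static volatile uint32_t mylib_tick_count;

/* Would be called from the timer ISR on a real target. */
void mylib_tick_isr(void) {
    mylib_tick_count++;
}

/* Reads the counter inside the (possibly empty) atomic section. */
uint32_t mylib_read_ticks(void) {
    uint32_t value;
    MYLIB_BEGIN_ATOMIC(MYLIB_ALL_VECTORS);
    value = mylib_tick_count;
    MYLIB_END_ATOMIC(MYLIB_ALL_VECTORS);
    return value;
}
```

With the defaults the wrapper costs nothing, so platforms where the load is already atomic pay no penalty.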
To enable platform-specific optimizations such as ATM vs DI on "modern" Zilog platforms, an additional argument could be provided to the macro to separate the presumed "short" sequences that the compiler is apt to generate three-instruction sequences for vs. longer ones. Such micro-optimization requires usually an assembly output audit (easily automatable) to verify the assumption of the instruction sequence length, but at least the data to drive the decision would be available, and the user would have a choice of using it or ignoring it.
*If some lost soul wants to know anything bordering on the arcane re. eZ8 - ask away. I know entirely too much about that platform, in details so gory that even modern Hollywood CG and SFX would have a hard time reproducing the true depth of the experience on-screen. I'm also possibly the only one out there running the 20MHz eZ8 parts occasionally at 48MHz clock - as sure a sign of demonic possession as the multiverse allows. If you think it's outrageous that such depravity makes it into production hardware - I'm with you. Alas, business case is business case, laws of physics be damned.
Are you running on any systems where uint32_t is larger than what a single assembly instruction can read or write? If not, the access to memory should be a single instruction and therefore atomic (assuming the bus is also word sized...). You get in trouble when the compiler breaks it up into multiple smaller reads/writes. Otherwise, I've always had to resort to DI/EI. You could have the user configure your library so that it knows whether atomic instructions or a minimum 32-bit word size are available, to avoid interrupt twiddling. If you have these guarantees, you don't need the verification code.
To answer the question though, on a system that must split the read/writes, your code is not safe. Imagine a case where you read your value in correctly in the "do" part, but the value gets split during the "while" part check. Further, in an extreme case, this is an infinite loop. For complete safety, you'd need a retry count and error condition to prevent that. The loop case is extreme for sure, but I'd want it just in case. That of course makes the run time longer.
Let's show a failure case as an example. I will use 16-bit numbers on a machine that reads 8 bits at a time, to make it easier to follow:
Value to read from memory *var is 0x1234
Read 8-bit 0x12
*var becomes 0x5678
Read 8-bit 0x78 - value is now 0x1278 (invalid)
*var becomes 0x1234
Verification step reads 8-bit 0x12
*var becomes 0x5678
Verification reads 8-bit 0x78
Value confirmed correct 0x1278, but this is an error as *var was only 0x1234 and 0x5678.
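The sequence above can be replayed on a host with a small simulation. This is only a sketch: the byte-wise read and the scripted "ISR" (`sim_isr`, `read16_bytewise`, `run_failure_case`) are hypothetical stand-ins for an 8-bit machine being interrupted between byte loads, with the high byte assumed to be read first.

```c
#include <stdint.h>
#include <stddef.h>

static uint16_t sim_var;               /* the shared 16-bit variable */
static const uint16_t *sim_script;     /* values the "ISR" writes, in order */
static size_t sim_script_len;
static size_t sim_step;

/* Simulated interrupt: overwrites the variable with the next scripted value. */
static void sim_isr(void) {
    if (sim_step < sim_script_len) {
        sim_var = sim_script[sim_step++];
    }
}

/* 8-bit machine: high byte, then low byte, with an interrupt after each load. */
static uint16_t read16_bytewise(void) {
    uint8_t hi = (uint8_t)(sim_var >> 8);
    sim_isr();
    uint8_t lo = (uint8_t)(sim_var & 0xFF);
    sim_isr();
    return (uint16_t)((hi << 8) | lo);
}

/* Same structure as the read-until-equal loop from the question. */
uint16_t read_twice(void) {
    uint16_t value;
    do { value = read16_bytewise(); } while (value != read16_bytewise());
    return value;
}

/* Replays the steps listed above and returns what the loop accepts. */
uint16_t run_failure_case(void) {
    static const uint16_t script[] = { 0x5678, 0x1234, 0x5678, 0x1234 };
    sim_var = 0x1234;
    sim_script = script;
    sim_script_len = sizeof script / sizeof script[0];
    sim_step = 0;
    return read_twice();
}
```

In this replay `run_failure_case()` returns 0x1278, a value the variable never held.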
Another failure case would be when *var just happens to change at the same frequency as your code is running, which could lead to an infinite loop as each verification fails. Or even if it did break out eventually, this would be a very hard to track performance bug.
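A bounded variant with a retry count and an error return, as suggested above, might look like this sketch. The function name and signature are hypothetical. It only removes the infinite-loop risk; it does not fix the torn-read case shown earlier.

```c
#include <stdint.h>
#include <stdbool.h>

/* Tries at most max_tries times to obtain two identical consecutive reads.
 * Returns true and stores the value in *out on success, false otherwise. */
bool read_u32_bounded(const volatile uint32_t *var, unsigned max_tries, uint32_t *out) {
    for (unsigned i = 0; i < max_tries; i++) {
        uint32_t first = *var;
        uint32_t second = *var;
        if (first == second) {
            *out = first;
            return true;
        }
    }
    return false;   /* caller must handle the error, e.g. fall back to DI/EI */
}
```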
Let me explain what I mean by data consistency issue. Take following scenario for example
uint16 x,y;
x=0x01FF;
y=x;
Clearly, these variables are 16 bits wide, but if this code runs on an 8-bit CPU, the read and write operations are not atomic. An interrupt can therefore occur in between and change the value. This is one situation which MIGHT lead to data inconsistency.
Here's another example,
if(x>7) //x is global variable
{
    switch(x)
    {
        case 8://do something
            break;
        case 10://do something
            break;
        default: //do default
            break;
    }
}
In the above code excerpt, if an interrupt changes the value of x from 8 to 5 after the if statement but before the switch statement, we end up in the default case instead of case 8.
Please note, I'm looking for ways to detect such scenarios (but not solutions)
Are there any tools that can detect such issues in Embedded C?
It is possible for a static analysis tool that is context (thread/interrupt) aware to determine the use of shared data, and such a tool could recognise specific mechanisms to protect such data (or the lack thereof).
One such tool is Polyspace Code Prover; it is very expensive and very complex, and does a lot more besides that described above. Specifically to quote (elided) from the whitepaper here:
With abstract interpretation the following program elements are interpreted in new ways:
[...]
Any global shared data may change at any time in a multitask program, except when protection
mechanisms, such as memory locks or critical sections, have been applied
[...]
It may have improved in the long time since I used it, but one issue I had was that it worked on a lock-access-unlock idiom, where you specified to the tool what the lock/unlock calls or macros were. The problem with that is that the C++ project I worked on used a smarter method where a locking object (mutex, scheduler-lock or interrupt disable for example) locked on instantiation (in the constructor) and unlocked in the destructor so that it unlocked automatically when the object went out of scope (a lock by scope idiom). This meant that the unlock was implicit and invisible to Polyspace. It could however at least identify all the shared data.
Another issue with the tool is that you must specify all thread and interrupt entry points for concurrency analysis, and in my case these were private-virtual functions in task and interrupt classes, again making them invisible to Polyspace. This was solved by conditionally making the entry-points public for the abstract analysis only, but meant that the code being tested does not have the exact semantics of the code to be run.
Of course these are non-problems for C code, and in my experience Polyspace is much more successfully applied to C in any case; you are far less likely to be writing code in a style to suit the tool rather than the tool working with your existing code-base.
There are no such tools as far as I am aware. And that is probably because you can't detect them.
Pretty much every operation in your C code has the potential to get interrupted before it is finished. Less obvious than the 16 bit scenario, there is also this:
uint8_t a, b;
...
a = b;
There are no guarantees that this is atomic! The above assignment may well translate to multiple assembler instructions, such as 1) load b into a register, 2) store the register at the memory address of a. You can't know this unless you disassemble the compiled code and check.
This can create very subtle bugs. Any assumption of the kind "as long as I use 8 bit variables on my 8 bit CPU, I can't get interrupted" is naive. Even if such code would result in atomic operations on a given CPU, the code is non-portable.
The only reliable, fully-portable solution is to use some manner of semaphore. On embedded systems, this could be as simple as a bool variable. Another solution is to use inline assembler, but that can't be ported cross platform.
To solve this, C11 introduced the qualifier _Atomic to the language. However, C11 support among embedded systems compilers is still mediocre.
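Where a C11 toolchain is available, the 16-bit example could be written with `<stdatomic.h>` roughly as below. This is a sketch with hypothetical names (`shared_x`, `writer_set`, `reader_get`), and it assumes uint16_t has the size of unsigned short so that `ATOMIC_SHORT_LOCK_FREE` is the relevant macro.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Shared between an interrupt handler and the main loop in the intended use. */
static _Atomic uint16_t shared_x;

/* The store cannot be observed half-written by a reader using atomic_load. */
void writer_set(uint16_t v) {
    atomic_store(&shared_x, v);
}

uint16_t reader_get(void) {
    return atomic_load(&shared_x);
}

/* 2 means "always lock-free" for short-sized atomics (assumes uint16_t is short-sized). */
bool shared_x_always_lock_free(void) {
    return ATOMIC_SHORT_LOCK_FREE == 2;
}
```

If the type is not lock-free on a given target, the implementation may fall back to a lock, which may be unsuitable inside an interrupt handler, so the check is worth keeping.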
The title field isn't long enough to capture a detailed question, so for the record, my actual question defines "unreasonable" in a specific way:
Is it legal[1] for a C implementation to have an arithmetic right
shift operator that returns different results, over time, for the
identical argument values? That is, must >> be a true function?
Imagine you wanted to write portable code using right-shift >> on signed values in C. Unfortunately, for you, the most efficient implementation of some critical part of your algorithm is fastest when signed right shifts fill the sign bit (i.e., they are arithmetic right shifts). Now since this behavior is implementation defined you are kind of screwed if you want to write portable code that takes advantage of it.
Just reading the compiler's documentation (which it is required to provide in the standard, but may or may not actually be easy to access or even exist) is great if you know all the compilers you will run on and will ever run on, but since that is often impossible, you might look for a way to do this portably.
One way I thought of is to simply test the compiler's behavior at runtime: if it appears to implement arithmetic[2] right shift, use the optimized algorithm, but if not use a fallback that doesn't rely on it. Of course, just checking that say (short)0xFF00 >> 4 == 0xFFF0 isn't enough since it doesn't exclude that perhaps char or int values work differently, or even the weird case that it fills for some shift amounts or values but not for others[3].
Given that, a comprehensive approach would be to exhaustively check all shift values and amounts. For all 8-bit char inputs that's only 2^8 LHS values and 2^3 RHS values for a total of 2^11, while short (typically, but let's say int16_t if you want to be pedantic) totals only 2^20, which could be validated in a fraction of a second on modern hardware[4]. Doing all 2^37 32-bit combinations would take a few seconds on decent hardware, but is still possibly reasonable. 64-bit is out for the foreseeable future however.
Let's say you did that and found that the >> behavior exactly matches the desired arithmetic shift behavior. Are you safe to rely on it according to the standard: does the standard constrain the implementation not to change its behavior at runtime? That is, does the behavior have to be expressible as a function of its inputs, or can (short)0xFF00 >> 4 be 0xFFF0 one moment and then 0x0FF0 (or any other value) the next?
Now this question is mostly theoretical, but violating this probably isn't as crazy as it might seem, given the presence of hybrid architectures such as big.LITTLE that dynamically move processes from one CPU to another with possible small behavioral differences, or live VM migration between say chips manufactured by Intel and AMD with subtle (usually undocumented) instruction behavior differences.
[1] By "legal" I mean according to the C standard, not according to the laws of your country, of course.
[2] I.e., it replicates the sign bit for newly shifted in bits.
[3] That would be kind of insane, but not that insane: x86 has different behavior (overflow flag bit setting) for shifts of 1 rather than other shifts, and ARM masks the shift amount in a surprising way that a simple test probably wouldn't detect.
[4] At least anything better than a microcontroller. The inner validation loop is a few simple instructions, so a 1 GHz CPU could validate all ~1 million short values in ~1 ms at one instruction-per-cycle.
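The exhaustive 16-bit check described above could be sketched as follows. `ref_asr` and `shr_looks_arithmetic_int16` are hypothetical names; the reference result is built only from shifts of non-negative values, whose results the standard defines. Note that the int16_t operand is promoted to int before shifting, so this really probes int shifts over the int16_t value range.

```c
#include <stdint.h>

/* Reference arithmetic right shift (floor division by 2^s) that never
 * shifts a negative value. Requires s to be less than the width of int. */
int ref_asr(int v, unsigned s) {
    if (v >= 0) {
        return v >> s;
    }
    /* -(v + 1) is non-negative, so shifting it is fully defined. */
    return -((-(v + 1)) >> s) - 1;
}

/* Returns 1 if >> matched the reference for every int16_t value and every
 * shift amount 0..15 at the time of the check, 0 otherwise. */
int shr_looks_arithmetic_int16(void) {
    for (long v = INT16_MIN; v <= INT16_MAX; v++) {
        for (unsigned s = 0; s < 16; s++) {
            int16_t lhs = (int16_t)v;
            if ((lhs >> s) != ref_asr(lhs, s)) {
                return 0;
            }
        }
    }
    return 1;
}
```

Whether a passing result can be relied on afterwards is exactly what the question asks; the sketch only covers the testing half.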
Let's say you did that and found that the >> behavior exactly matches the desired arithmetic shift behavior. Are you safe to rely on it according to the standard: does the standard constrain the implementation not to change its behavior at runtime? That is, does the behavior have to be expressible as a function of its inputs, or can (short)0xFF00 >> 4 be 0xFFF0 one moment and then 0x0FF0 (or any other value) the next?
The standard does not place any requirements on the form or nature of the (required) definition of the behavior of right shifting a negative integer. In particular, it does not forbid the definition to be conditional on compile-time or run-time properties beyond the operands. Indeed, this is the case for implementation-defined behavior in general. The standard defines the term simply as
unspecified behavior where each implementation documents how the choice is made.
So, for example, an implementation might provide a macro, a global variable, or a function by which to select between arithmetic and logical shifts. Implementations might also define right-shifting a negative number to do less plausible, or even wildly implausible things.
Testing the behavior is a reasonable approach, but it gets you only a probabilistic answer. Nevertheless, in practice I think it's pretty safe to perform just a small number of tests -- maybe as few as one -- for each LHS data type of interest. It is extremely unlikely that you'll run into an implementation that implements anything other than standard arithmetic (always) or logical (always) right shift, or that you'll encounter one where the choice between arithmetic and logical shifting varies for different operands of the same (promoted) type.
The authors of the Standard made no effort to imagine and forbid every unreasonable way implementations might behave. In the days prior to the Standard, implementations almost always defined many behaviors based upon how the underlying platform worked, and the authors of the Standard almost certainly expected that most implementations would do likewise. Since the authors of the Standard didn't want to rule out the possibility that it might sometimes be useful for implementations to behave in other ways (e.g. to support code which targeted other platforms), however, they left many things pretty much wide open.
The Standard is somewhat vague as to the level of precision with which "Implementation Defined" behaviors need to be documented. I think the intention is that a person of reasonable intelligence reading the specification should be able to predict how a piece of code with Implementation-Defined behavior would behave, but I'm not sure that defining an operation as "yielding whatever happens to be in register 7" would be non-conforming.
From a practical perspective, what should matter is whether one is targeting quality implementations for platforms that have been developed within the last quarter century, or whether one is trying to force even deliberately-obtuse compilers to behave in controlled fashion. If the former, it may be worth
using a static assert to ensure the -1>>1==-1, but quality compilers for modern
hardware where that test passes are going to use arithmetic right shift
consistently. While targeting code to obtuse compilers might possibly have
some purpose, it's not generally possible to guard against all the ways that
a pathological-but-conforming compiler could sabotage one's code, and efforts
spent attempting to do so could often be more effectively spent on getting a
quality compiler.
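The static assert mentioned above could be sketched like this, assuming a C11 compiler for `_Static_assert`. `floor_div_pow2` is a hypothetical helper that relies on the check; the build simply fails on a compiler where the single probe does not sign-fill.

```c
#include <stdint.h>

/* Refuse to build where the one-value probe does not show sign-filling. */
_Static_assert((-1 >> 1) == -1,
               "signed >> does not appear to be an arithmetic shift here");

/* Divides by 2^s rounding toward negative infinity, relying on the check
 * above. s must be smaller than the width of the promoted operand. */
int32_t floor_div_pow2(int32_t v, unsigned s) {
    return v >> s;
}
```

This probes one value only, which matches the point made above: it guards against ordinary compilers with logical shifts, not against a pathological-but-conforming one.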
I am working with the registers of an ARM Cortex M3. In the documentation, some of the bits may be "reserved". It is unclear to me how I should deal with these reserved bits when writing on the registers.
Are these reserved bits even writeable? Should I be cautious to not touch them? Will something bad happen if I touch them?
This is a classic embedded-world problem: what to do with reserved bits! First, you should NOT write arbitrary values into them, lest your code become unportable. What happens when the architecture assigns a new meaning to the reserved bits in the future? Your code will break. So the best mantra when dealing with registers that have reserved bits is Read-Modify-Write, i.e. read the register contents, modify only the bits you want, and then write back the value so that the reserved bits are untouched (untouched does not mean that we don't write to them, but that we write back what was there before).
For example, say there is a register in which only the LSBit has meaning and all others are reserved. I would do this
ldr r0,=memoryAddress
ldr r1,[r0]
orr r1,r1,#1
str r1,[r0]
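The same read-modify-write could be expressed in C along these lines. The helper names are hypothetical, and the register is passed in as a pointer so the sketch can be exercised on a host with an ordinary variable instead of a real peripheral address.

```c
#include <stdint.h>

/* Sets the bits in mask and writes every other bit back as it was read. */
void reg_set_bits(volatile uint32_t *reg, uint32_t mask) {
    uint32_t v = *reg;   /* read: picks up the current reserved bits */
    v |= mask;           /* modify only the requested bits */
    *reg = v;            /* write: reserved bits get their old values */
}

/* Clears the bits in mask and writes every other bit back as it was read. */
void reg_clear_bits(volatile uint32_t *reg, uint32_t mask) {
    uint32_t v = *reg;
    v &= ~mask;
    *reg = v;
}
```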
If there is no other clue in the documentation, write a zero. You cannot avoid writing to a few reserved bits spread around in a 32-bit register.
Read-Modify-Write should work most of the time, however there are cases where reserved bits are undefined on read but must be written with a specific value. See this post from the LPC2000 group (the whole thread is quite interesting too). So, always check the docs carefully, and also any errata that's available. When in doubt or docs are unclear, don't hesitate to write to the manufacturer.
Ideally you should read-modify-write, but there is no guarantee of success; when you change to a newer chip with different bits, you are changing your code anyway. I have seen vendors where writing zeros to the reserved bits failed when they revved the chip and the code had to be touched. So there are no guarantees. The biggest clue is whether the vendor's code treats a register, or a set of registers, as clearly read-modify-write or clearly just a write. This could simply be different developers writing different sections of the example, or there could be a register in that peripheral that is sensitive, has an undocumented bit, and needs the read-modify-write.
On the chips that I work on I make sure that undocumented (to the customer), but not unused bits are marked in some way to stand out from other unused bits. We normally mark unused/reserved bits as zero, and these other bits get a name, and a must write this value marking. Not all vendors do this.
The bottom line is there is no guarantee, assume all documentation and example programs have bugs and you have to hack your way through to figure out what is right and what is wrong. No matter what path you take (read-modify-write, write zeros, etc) you will be wrong from time to time and have to re-do the code to match a hardware change. I strongly suggest that if a vendor has a chip id of some sort, that your software reads that ID and if it is an id that you have not tested your code against, declare a failure and not program that part. In production testing long before a customer sees the product, the part change will get detected and software will be involved in understanding the reason for the part change, the resolution being the alternate part is not compatible and rejected or the software changes, etc.
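The chip ID check suggested above could be as simple as the following sketch. The ID values and the function name are made up for illustration; real values and the way to read the ID come from the vendor's documentation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical IDs of the chip revisions the software was tested against. */
static const uint32_t tested_chip_ids[] = { 0x1000A001u, 0x1000A002u };

/* Returns true only for IDs in the tested list; the caller should declare
 * a failure and refuse to program the part for anything else. */
bool chip_id_is_tested(uint32_t id) {
    size_t count = sizeof tested_chip_ids / sizeof tested_chip_ids[0];
    for (size_t i = 0; i < count; i++) {
        if (tested_chip_ids[i] == id) {
            return true;
        }
    }
    return false;
}
```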
Reserved most of the time means that the bits aren't used in this chip, but they might be used on future devices (or another product line). (Most chip manufacturers produce one peripheral driver and use it for all their chips. This way it's mostly copy-paste work and there is less chance of errors.) Most of the time it doesn't matter if you write to reserved bits in peripheral registers, because there isn't any logic attached to them.
It is possible that if you write something to such a bit, it won't be stored, and the next time you read the register the bits seem unchanged.
I'm working on an embedded project (PowerPC target, Freescale Metrowerks Codewarrior compiler) where the registers are memory-mapped and defined in nice bitfields to make twiddling the individual bit flags easy.
At the moment, we are using this feature to clear interrupt flags and control data transfer. Although I haven't noticed any bugs yet, I was curious if this is safe. Is there some way to safely use bit fields, or do I need to wrap each in DISABLE_INTERRUPTS ... ENABLE_INTERRUPTS?
To clarify: the header supplied with the micro has fields like
union {
    vuint16_t R;
    struct {
        vuint16_t MTM:1; /* message buffer transmission mode */
        vuint16_t CHNLA:1; /* channel assignement */
        vuint16_t CHNLB:1; /* channel assignement */
        vuint16_t CCFE:1; /* cycle counter filter enable */
        vuint16_t CCFMSK:6; /* cycle counter filter mask */
        vuint16_t CCFVAL:6; /* cycle counter filter value */
    } B;
} MBCCFR;
I assume setting a bit in a bitfield is not atomic. Is this a correct assumption? What kind of code does the compiler actually generate for bitfields? Performing the mask myself using the R (raw) field might make it easier to remember that the operation is not atomic (it is easy to forget that an assignment like CAN_A.IMASK1.B.BUF00M = 1 isn't atomic).
Your advice is appreciated.
Atomicity depends on the target and the compiler. AVR-GCC, for example, tries to detect bit accesses and emit bit set or clear instructions if possible. Check the assembler output to be sure ...
EDIT: Here is a resource for atomic instructions on PowerPC directly from the horse's mouth:
http://www.ibm.com/developerworks/library/pa-atom/
It is correct to assume that setting bitfields is not atomic. The C standard isn't particularly clear on how bitfields should be implemented and various compilers go various ways on them.
If you really only care about your target architecture and compiler, disassemble some object code.
Generally, your code will achieve the desired result but be much less efficient than code using macros and shifts. That said, it's probably more readable to use your bit fields if you don't care about performance here.
You could always write a setter wrapper function for the bits that is atomic, if you're concerned about future coders (including yourself) being confused.
Yes, your assumption is correct, in the sense that you may not assume atomicity. On a specific platform you might get it as an extra, but you can't rely on it in any case.
Basically the compiler performs the masking and shifting for you. It might be able to take advantage of corner cases or special instructions. If you are interested in efficiency, look at the assembler that your compiler produces for such code; it is usually quite instructive. As a rule of thumb I'd say that modern compilers produce code that is as efficient as medium programming effort would be. Really deep bit twiddling for your specific compiler could perhaps gain you some cycles.
I think that using bitfields to model hardware registers is not a good idea.
So much about how bitfields are handled by a compiler is implementation-defined (including how fields that span byte or word boundaries are handled, endianness issues, and exactly how getting, setting and clearing bits is implemented). See C/C++: Force Bit Field Order and Alignment
To verify that register accesses are being handled how you might expect or need them to be handled, you would have to carefully study the compiler docs and/or look at the emitted code. I suppose that if the headers supplied with the microprocessor toolset use them, you can assume that most of my concerns are taken care of. However, I'd guess that atomic access isn't necessarily...
I think it's best to handle these types of bit-level accesses of hardware registers using functions (or macros, if you must) that perform explicit read/modify/write operations with the bit mask that you need, if that's what your processor requires.
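As a minimal sketch of such a function, assuming a 16-bit register like the MBCCFR from the question: the helper name reg16_write_field and the mask macros are made up for illustration, and the bit positions assume the compiler allocates bitfields MSB-first, which is exactly the kind of thing that has to be checked against the reference manual.

```c
#include <stdint.h>

/* Hypothetical layout, for illustration only: a 6-bit field in bits 5..0
 * and a 1-bit enable flag in bit 12. Take the real positions from the
 * chip's reference manual, not from this sketch. */
#define CCFVAL_SHIFT 0u
#define CCFVAL_MASK  (0x3Fu << CCFVAL_SHIFT)
#define CCFE_MASK    (1u << 12)

/* Replace the bits selected by mask with the matching bits of value and
 * leave all other bits as they were read. This is one volatile read and
 * one volatile write, so it is visibly NOT atomic: an interrupt can run
 * between the two accesses. */
static inline void reg16_write_field(volatile uint16_t *reg,
                                     uint16_t mask, uint16_t value)
{
    uint16_t tmp = *reg;                               /* read   */
    tmp = (uint16_t)((tmp & ~mask) | (value & mask));  /* modify */
    *reg = tmp;                                        /* write  */
}
```

A call would look like reg16_write_field(&MBCCFR.R, CCFVAL_MASK, 0x15u << CCFVAL_SHIFT). Because the read and the write are spelled out, it is harder to mistake this for an atomic operation than it is with a bitfield assignment.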
Those functions could be modified for architectures that support atomic bit-level accesses (such as the ARM Cortex M3's "bit-banding" addressing). I don't know if the PowerPC supports anything like this - the M3 is the only processor I've dealt with that supports it in a general fashion. And even the M3's bit-banding supports only 1-bit accesses; if you're dealing with a field that's 6 bits wide, you have to go back to the read/modify/write scenario.
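For reference, here is a rough sketch of how a bit-band alias address is computed on the Cortex-M3. The base addresses are the ones ARM documents for the peripheral bit-band region (0x40000000 to 0x400FFFFF, aliased from 0x42000000); the function name bitband_alias is made up for this example.

```c
#include <assert.h>
#include <stdint.h>

#define PERIPH_BASE    0x40000000u  /* start of peripheral bit-band region */
#define PERIPH_BB_BASE 0x42000000u  /* start of its alias region */

/* Return the alias word address for one bit of a word-aligned peripheral
 * register. Each bit in the bit-band region maps to one 32-bit word in
 * the alias region: alias = alias_base + byte_offset * 32 + bit * 4. */
static inline uint32_t bitband_alias(uint32_t reg_addr, unsigned bit)
{
    assert(reg_addr >= PERIPH_BASE && reg_addr < PERIPH_BASE + 0x100000u);
    assert(bit < 32u);
    return PERIPH_BB_BASE + (reg_addr - PERIPH_BASE) * 32u + bit * 4u;
}
```

On a real M3, writing 1 or 0 through a volatile uint32_t pointer to the returned address sets or clears just that one bit, with the read/modify/write done by the hardware. The function above only does the address arithmetic, so it can be checked on a host.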
It totally depends on the architecture and compiler whether the bitfield operations are atomic or not. My personal experience tells: don't use bitfields if you don't have to.
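I'm pretty sure that on PowerPC this is not atomic, but if your target is a single-core system then you can just do this:
/* Call from an ISR, or anywhere interrupts are already blocked. */
void update_reg_from_isr(volatile unsigned * reg_addr, unsigned set, unsigned clear, unsigned toggle) {
    unsigned reg = *reg_addr;   /* one read of the register */
    reg |= set;
    reg &= ~clear;
    reg ^= toggle;
    *reg_addr = reg;            /* one write back */
}
/* Call from normal (interruptible) code. interrupts_block() and
   interrupts_enable() stand for whatever your platform provides. */
void update_reg(volatile unsigned * reg_addr, unsigned set, unsigned clear, unsigned toggle) {
    interrupts_block();
    update_reg_from_isr(reg_addr, set, clear, toggle);
    interrupts_enable();
}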
I don't remember if PowerPC's interrupt handlers are interruptible, but if they are then you should just use the second version always.
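One caveat with always using update_reg: it re-enables interrupts unconditionally, even if they were already blocked when it was called. A common way around that is to save the previous interrupt state and restore it afterwards. Here is a rough sketch; irq_save_and_disable and irq_restore are hypothetical stand-ins (simulated with a plain flag so the example runs on a host), not real PowerPC or CodeWarrior calls.

```c
#include <stdbool.h>

/* Host-side stand-ins for the platform's interrupt control. On real
 * hardware these would read and write the CPU's interrupt-enable state. */
static bool irq_enabled = true;

static bool irq_save_and_disable(void)
{
    bool was_enabled = irq_enabled;
    irq_enabled = false;
    return was_enabled;
}

static void irq_restore(bool was_enabled)
{
    irq_enabled = was_enabled;
}

/* Same set/clear/toggle update as above, but safe to call whether or not
 * interrupts were already blocked: the previous state is put back. */
static void update_reg_nested(volatile unsigned *reg_addr, unsigned set,
                              unsigned clear, unsigned toggle)
{
    bool was_enabled = irq_save_and_disable();
    unsigned reg = *reg_addr;
    reg |= set;
    reg &= ~clear;
    reg ^= toggle;
    *reg_addr = reg;
    irq_restore(was_enabled);
}
```

With this shape the same function can be used from both ISR and non-ISR code on a single-core system.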
If your target is a multiprocessor system then you should make locks (spinlocks, which disable interrupts on the local processor and then wait for any other processors to finish with the lock) that protect access to things like hardware registers, and acquire the needed locks before you access the register, and then release the locks immediately after you have finished updating the register (or registers).
I read once how to implement locks on PowerPC: it involved telling the processor to watch the memory bus for a certain address while you did some operations, and then checking back at the end of those operations to see if the watched address had been written to by another core. If it hadn't, then your operation was successful; if it had, then you had to redo the operation. This was in a document written for compiler, library, and OS developers. I don't remember where I found it (probably somewhere on IBM.com) but a little hunting should turn it up. It probably also has info on how to do atomic bit twiddling.
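The PowerPC instructions behind that scheme are lwarx (load and reserve) and stwcx. (store conditional). The same "try, check, redo if someone else wrote" shape can be sketched portably with a C11 compare-and-swap loop. This is a swapped-in analog, not the PowerPC sequence itself; the function name atomic_update_bits is made up, and the sketch is meant for an ordinary shared word in memory (such as a lock or flag word). Whether it is appropriate for a memory-mapped device register depends on the hardware.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Atomically set and clear bits in a shared word, retrying if another
 * thread or core changed the word between our read and our write.
 * Returns the value the word held just before the update. */
static inline uint32_t atomic_update_bits(_Atomic uint32_t *word,
                                          uint32_t set, uint32_t clear)
{
    uint32_t old = atomic_load(word);
    uint32_t desired;
    do {
        desired = (old | set) & ~clear;
        /* On failure, old is refreshed with the current value and the
         * loop recomputes desired from it. */
    } while (!atomic_compare_exchange_weak(word, &old, desired));
    return old;
}
```

If only one of set or clear is needed, atomic_fetch_or and atomic_fetch_and from the same header do the job without an explicit loop.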