MISRA 11.3: cast from int to pointer - c

I have one question on how to solve MISRA 2004 11.3 violation.
The code is as follows:
void read(tm_uint8 *data)
{
    /* Nothing is returned, so the return type is void. */
    data[0] = *((tm_uint8*)0x00003DD2);
    data[1] = *((tm_uint8*)0x00003DD3);
    data[2] = *((tm_uint8*)0x00003DD4);
}
I want to read the values stored at these physical addresses into data. It compiles, but I get a MISRA violation for 11.3. I want to solve it. Can anyone help me with that?

The rationale behind this rule is that MISRA worries about misaligned access when casting from an integer to a pointer. In your case, I assume tm_uint8 is 1 byte, so alignment shouldn't be an issue here. In that case, the warning is simply a false positive and you can ignore it. This is an advisory rule, so you don't need to raise a deviation.
There is no other work-around, except never working with absolute addresses, which is most likely not an option here. As you can tell, this rule is very cumbersome when writing hardware-related code; there is just no way such code can follow the rule.
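If the cast has to stay, one common way to keep it manageable is to confine it to a single helper, so that the advisory violation is raised (and documented) in exactly one place. The following is only a minimal sketch under that assumption: hw_byte_at and hw_read_bytes are hypothetical names, uint8_t stands in for tm_uint8, and the volatile qualifier is my addition because the target is assumed to be a hardware location. It does not make the code compliant with the rule; it only localizes the cast.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the only place where an integer becomes a pointer.
 * The MISRA-C:2004 Rule 11.3 violation would be documented here, once. */
volatile const uint8_t *hw_byte_at(uintptr_t address)
{
    return (volatile const uint8_t *)address;
}

/* Reads `count` consecutive bytes starting at address `base` into `data`. */
void hw_read_bytes(uintptr_t base, uint8_t *data, size_t count)
{
    size_t i;

    assert(data != NULL);
    for (i = 0U; i < count; i++) {
        data[i] = *hw_byte_at(base + i);
    }
}
```

On the target from the question, the call would look like hw_read_bytes(0x00003DD2U, data, 3U). On a host, the same function can be exercised by passing the address of an ordinary array converted to uintptr_t.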

Note: MISRA-C:2004 Rule 11.3 is equivalent to MISRA C:2012 Rule 11.4
It is accepted that some MISRA C/C++ Rules may cause a violation to be raised, in situations where the method used is necessary.
Because of this, MISRA C provides a mechanism for Deviating a Rule - and this would be the appropriate route for you to follow... please do not try and find a way around the Rule using "clever" code!
As highlighted by this question, accessing specific memory (and/or I/O devices) is one particular case. In fact, one of the examples included in MISRA C:2012 shows this exact use case as being non-compliant with the Rule.
In 2016, the MISRA C Working Group published additional guidance on Compliance, including enhancing the Deviation process... this gives help on what is a justifiable Deviation - and accessing hardware is one of those!
In due course, it is planned to provide more "layered" guidance... but that will not be immediate.

Related

MISRA rule 18.3 preventing copying data to RAM

MISRA has a problem with the following code:
extern uint32_t __etext;
extern uint32_t __data_start__, __data_end__;
uint32_t* src = (uint32_t*)&__etext; // ROM location of code to be copied into ram
uint32_t* dst = (uint32_t*)&__data_start__; // Start of RAM data section
while (dst < &__data_end__) // Loop until we reach the end of the data section
{
*dst++ = *src++;
}
I am getting a violation for rule 18.3:
"The relational operators >, >=, < and <= shall not be applied to objects of pointer type except when they point to the same object."
The rationale behind the rule is that attempting to make comparisons between pointers will produce undefined behavior if the two pointers do not point to the same object.
Why is this incorrect code? This seems like pretty generic boot code which is doing the right thing.
MISRA C:2012 required Rule 18.3 is undecidable because it is often impossible to determine, statically, whether two pointers are pointing to the same object.
In the example cited, as long as you can demonstrate that the two pointers are, indeed, pointing to the same object (or block of memory) and that __data_end__ is higher up the memory map than __data_start__ then this is an easy documentation task.
Section 3.4 of MISRA Compliance applies; this appears to fall within Category (2). This is not the same as a formal deviation, but it does still need appropriate review and sign-off.
Whatever you do, do not change the code to try and create some clever mechanism that you claim is MISRA compliant!
Why is this incorrect code? This seems like pretty generic boot code which is doing the right thing.
Because the code invokes undefined behavior, as specified by the C standard (additive operators 6.5.6/8), MISRA or no MISRA. For this reason, the compiler might generate incorrect code.
Possible work-arounds:
Create a big array or struct object covering the whole area to copy from and the area to copy into.
Use a compiler with a known and documented non-standard extension which allows you to address absolute addresses regardless of what happens to be stored there. Then document it yourself in turn, in your MISRA documentation. (Most embedded compilers should support this. gcc might, but I don't know how without looking it up.)
Use integers instead of pointers and deviate from the MISRA rule regarding casting between integers and pointers, which is an impossible rule to follow in embedded systems anyway.
Other problems:
(Severe) You have missing volatile bugs all over the code, which could result in incorrect code generation.
(Severe) You aren't using const correctness for variables stored in ROM which is surely a bug.
(Severe) Using extern in a safety/mission-critical application is highly questionable practice.
(Minor) Identifiers starting with double underscore are reserved by the compiler. If this code is from your home-made CRT it might be ok, but not if this code is from some generic bootloader.
(Minor) *dst++ = *src++; violates MISRA C 13.3. You need to do *dst = *src; dst++; src++;.
With linker defined symbols like this, the only way to satisfy this rule (without breaking 18.2 instead) is to cast the pointers to integers before comparing them. I don't think this change would help readability, so I would suggest making an exemption (called a "deviation record" in MISRA) and disabling the rule for this line of code.
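The "use integers instead of pointers" work-around can be sketched as follows. This is an illustration under assumptions, not a drop-in replacement for the boot code: words_between and copy_words are hypothetical names, and the pointer-to-integer conversions at the call site still need the deviation mentioned above. The relational comparison is done on integer addresses, and the copy itself is driven by a word count, so no two pointers are ever compared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of whole 32-bit words in the address range [start, end).
 * Works on integer addresses, so no pointers are compared. */
size_t words_between(uintptr_t start, uintptr_t end)
{
    assert(end >= start);
    return (size_t)((end - start) / sizeof(uint32_t));
}

/* Copies `count` words from `src` to `dst`, one side effect per statement. */
void copy_words(const uint32_t *src, uint32_t *dst, size_t count)
{
    size_t i;

    for (i = 0U; i < count; i++) {
        dst[i] = src[i];
    }
}
```

With the linker symbols from the question, the count would be obtained as words_between((uintptr_t)&__data_start__, (uintptr_t)&__data_end__) and then passed to copy_words together with the source and destination pointers.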

Best way to avoid cast integer to pointer when dealing with memory access (MISRA 2008)

I have a bare-metal program(driver) which reads/writes some memory-mapped registers.
e.g:
void foo_read(uint64_t reg_base, uint32_t *out_value)
{
*out_value = READREG(reg_base + FOO_REG_OFFSET);
}
reg_base is the base address of the memory-mapped device (64-bit
address)
FOO_REG_OFFSET is the offset of the register (#define FOO_REG_OFFSET
0x00000123). Register "foo" is 32-bit "wide".
READREG is defined like this:
#define READREG(address) (*(uint32_t*)(address))
As you can guess, MISRA 2008 is not happy with the cast from unsigned long long to pointer (5-2-7/5-2-8 are reported). My question is: what is the best/appropriate way to access memory and get rid of MISRA warnings? I've tried to cast to uintptr_t before casting to pointer, but this didn't help.
Thank you.
OK, a few things here - first of all, your definition for READREG is missing a volatile -- it should be something like
#define READREG(address) (*(uint32_t volatile *)(address))
Secondly - and this is CPU-specific of course - generally speaking, reading a 32-bit value from an odd address (offset 0x123) won't work - at a minimum it will be slow (multiple bus cycles), and on many architectures, you'll hit a processor exception. (BTW, please note that pointer arithmetic doesn't come into play here, since the 2 values are added before being cast to a pointer.)
To answer your original question:
what is the best/appropriate way to access memory and get rid of MISRA
warnings
Well -- you are violating a MISRA rule (you have to in this case, we've all been there...) so you will get a warning.
So what you need to do is suppress the warning(s) in a disciplined, systematic and easily identifiable way. In my opinion, there is no better example and explanation of this than in the Quantum Platform (QP) Event-driven framework's code, which is open source. Specifically:
Check out the QP's MISRA Compliance matrix for examples of how this is handled -- for example, just search the PDF for the Q_UINT2PTR_CAST macro
Check out the QP's actual source code - for example, the macro that wraps/encapsulates such "int to ptr" casts (this way they are done in a manner that is easy to identify, and easy to change/suppress warnings for in a single place)
Lastly, check out the PC-Lint config file qpc.lnt, where you can see how/where the warnings are suppressed in a single place. This is explained in this app note, section 6.3:
6.3 Rule 5-2-8(req)
An object with integer type or pointer to void type shall not be
converted to an object with pointer type.
The QP/C++ applications might deviate from rule 5-2-8 when they need
to access specific hard-coded hardware addresses directly. The QP/C++
framework encapsulates this deviation in the macro Q_UINT2PTR_CAST().
The following code snippet provides a use case of this macro:
#define QK_ISR_EXIT() . . . \
*Q_UINT2PTR_CAST(uint32_t, 0xE000ED04U) = \
I don't have time to talk about MISRA warning suppressions, compliance issues, etc. but the above gives you everything you need.
P.S. Not sure which MISRA guidelines you are referring to -- for C, there are the 2004 & 2012 guidelines, and for C++, there are the 2008 guidelines (I know, it's almost 2017!)
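To make the two points above concrete (volatile access, and wrapping the cast in one identifiable macro), here is a minimal sketch. UINT2PTR_CAST, is_aligned32 and reg_read32 are hypothetical names chosen for illustration; this is not QP's actual definition of Q_UINT2PTR_CAST. The alignment check assumes a target that requires 4-byte alignment for 32-bit accesses, which is CPU-specific.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical wrapper: every integer-to-pointer cast goes through this
 * macro, so the warning can be reviewed and suppressed in one place. */
#define UINT2PTR_CAST(type_, uint_)  ((type_ *)(uintptr_t)(uint_))

/* Returns 1 if `address` is a multiple of 4, else 0. */
int is_aligned32(uintptr_t address)
{
    return (address % 4U) == 0U;
}

/* Volatile 32-bit read from an absolute address. */
uint32_t reg_read32(uintptr_t address)
{
    assert(is_aligned32(address));
    return *UINT2PTR_CAST(volatile uint32_t, address);
}
```

With this in place, an offset such as 0x123 fails the alignment assertion in debug builds instead of silently producing a slow or faulting access on the target.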

Issues covered by rule 3.1 of misra c 2004 "Implementation-defined behavior documented"

In this rule you have to go to ISO/IEC 9899:1990 Appendix G and study each case of Implementation defined behavior to document them.
It's a difficult task to determine what are the manual checks to do in the code.
Is there some kind of list of manual checks to do because of this rule?
MISRA-C is primarily concerned with avoiding unpredictable behavior in the C language, those “traps and pitfalls” (such as undefined and unspecified behavior) that all C developers should be aware of and that a compiler will not always warn you about. This includes implementation-defined behavior, where the C standard specifies that the behavior of certain constructs can vary between implementations. These tend to be less critical from a safety point of view, provided the compiler documentation describes its intended behavior as required by the standard.
That is, for each specific compiler the behavior is well-defined, but the concern is to assure the developers have verified this, including documenting language extensions, known bugs in the compiler (and build chain) and workarounds.
Although it is possible to manually check C code fully for MISRA-C compliance, it is not recommended. The guidelines were developed with static analysis tools in mind. Not all guidelines can be fully checked by tools, but the better MISRA-C tools (be careful in your evaluations, there are not many “good” ones) will at least assist by automatically identifying where code relies on implementation-specific behavior. This includes the checks required by Rule 3.1. Where implementation-defined behavior cannot be completely checked by a tool, a manual review will be required.
Also, if you are starting a new MISRA-C project, I highly recommend referring to MISRA-C:2012, even if you are required to be MISRA-C:2004 compliant. Having MISRA-C:2012 around helps, because it has clarified many of the guidelines, including additional rationale, explanations and examples. The standard (which can be obtained at misra-c.com ) lists the C90 and C99 implementation-defined behaviors that are considered to have the potential to cause unintended behavior. This may or may not overlap with guidelines that address implementation-defined behaviors that MISRA-C is specifically concerned about.
First of all, the standard definition of implementation-defined behavior is: specific behavior which the compiler must document. So you can always refer to the compiler documentation whenever there is a need to document how a certain implementation-defined behavior is implemented.
What's left to you to do then is to document where the code relies on implementation-defined behavior. This is preferably done in source code comments.
Off the top of my head, here are the most important things you need to look for in the code. The list does not include cases that are already covered by other MISRA rules (for example, signedness of char).
The size of all the integer types. The size of int being most important, as it determines which type that is given to integer literals, C "boolean" expressions, implicitly promoted integers etc etc.
Obscure integer formats that aren't standard two's complement.
Any reliance on endianness.
The enum type format.
The floating point format.
Pointer to integer conversions, in case they are obscure on the given system.
Behavior of function inlining and the register keyword, if these are used.
Alignment issues including struct padding. Reliance on the size of a struct/union.
#include paths, in case they are obscure. Particularly if they are absolute and not relative.
Bitwise operators mixed with signed types (in most cases this is a bug or design mistake).
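Some of the items in the list above can be turned from comments into checks, so that a port to a different compiler fails loudly instead of silently changing behavior. The sketch below is an illustration only: STATIC_CHECK, is_little_endian and check_byte_order are hypothetical names, and the particular assumptions checked (8-bit char, int of at least 32 bits, two's complement) are examples, not requirements from MISRA. The negative-array-size trick is used because it also works on compilers that predate C11's _Static_assert.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Compile-time check: the typedef gets a negative array size, and therefore
 * fails to compile, when `cond` is false. */
#define STATIC_CHECK(name, cond) \
    typedef char static_check_##name[(cond) ? 1 : -1]

STATIC_CHECK(char_is_8_bits, CHAR_BIT == 8);
STATIC_CHECK(int_is_at_least_32_bits, sizeof(int) >= 4U);
STATIC_CHECK(twos_complement, (-1 & 3) == 3);

/* Runtime probe of byte order: returns 1 on little-endian, 0 otherwise. */
int is_little_endian(void)
{
    const uint16_t probe = 0x0102U;
    const unsigned char *bytes = (const unsigned char *)&probe;

    return bytes[0] == 0x02U;
}

/* Start-up check; the expected byte order is a project-specific assumption. */
void check_byte_order(int expect_little_endian)
{
    assert(is_little_endian() == (expect_little_endian != 0));
}
```

The compile-time checks document the assumption next to the code that relies on it, which fits the advice to record reliance on implementation-defined behavior in the source itself.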

Misra C Rule 12.2 - false positive warning?

My CCS 6.1 ARM compiler (for LM3Sxxxx Stellaris) throws a warning :
"MISRA Rule 12.2. The value of an expression shall be the same under any order of evaluation that the standard permits"
for following code:
typedef struct {
...
uint32_t bufferCnt;
uint8_t buffer[100];
...
} DIAG_INTERFACE_T;
static DIAG_INTERFACE_T diagInterfaces[1];
...
DIAG_INTERFACE_T * diag = &diagInterfaces[0];
uint8_t data = 0;
diag->bufferCnt = 0;
diag->buffer[diag->bufferCnt++] = data; // line where warning is issued
...
I don't see a problem in my code. Is it false positive or my bug?
Put diag->bufferCnt++ in a separate statement (as it is also advised by Hans in OP comments) and the warning should not appear.
But regarding MISRA rule 12.2 I see no violation of 12.2 (there is a single sequence point in your statement and no unspecified behavior) in your program and I think it's a bug in your MISRA software.
For information there is also an advisory 12.13 rule in MISRA that says:
(MISRA-C:2004, 12.13) "The increment (++) and decrement (--) operators should not be mixed with other operators in an expression"
The problem with MISRA is that their use of terminology is far from perfect: for 12.13, while -> and = are C operators, the explanation then seems to talk only about arithmetic operators...
Although you don't indicate it, this is MISRA-C:2004 Rule 12.2, which is now MISRA-C:2012 Rule 13.2. As oauh says, this has nothing to do with "order of evaluation".
I highly recommend referring to MISRA-C:2012 even if you are required to be MISRA-C:2004 compliant. Having MISRA-C:2012 around helps because it has clarified many of the guidelines, including additional rationale, explanations and examples.
You should not rely solely on a compiler to check for MISRA-C compliance. It's nice to have, but a compiler's #1 goal is not to warn you about all the traps and pitfalls of the language; it is dedicated to taking advantage of the language (optimization). Compilers are not very precise either, as in this case. Also, there are many undefined behaviors across translation units that compilers cannot warn about. It's best to also use a dedicated MISRA static analysis tool, one that is not compiler-specific and that warns about all unpredictable constructs from the ISO C standard's point of view, not that of a particular implementation.
As oauh also said, this is a violation of MISRA-C:Rule 12.13, which is now MISRA-C:2012 Rule 13.3 which has been relaxed to permit ++ and -- to be mixed with other operators, provided that the ++ or -- is the only source of side-effects (in your case the assignment is also a side effect in C terminology).
The rule is not critical, i.e. the behavior is well defined, but the different values resulting from the prefix and postfix versions can cause confusion. It is therefore "advisory", meaning no formal deviation is required (again, a decent MISRA-C tool would allow you to suppress this particular violation).
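The suggestion to put the increment in a separate statement can be sketched like this. The struct is reduced to the two fields shown in the question (the elided members are left out), and diag_push, DIAG_BUFFER_SIZE and the bounds check are my additions for illustration, not part of the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DIAG_BUFFER_SIZE 100U

typedef struct {
    uint32_t bufferCnt;
    uint8_t buffer[DIAG_BUFFER_SIZE];
} DIAG_INTERFACE_T;

/* Appends one byte; returns 1 on success, 0 if the buffer is full.
 * The store and the increment are separate statements, so each
 * statement contains exactly one side effect. */
int diag_push(DIAG_INTERFACE_T *diag, uint8_t data)
{
    int stored = 0;

    assert(diag != NULL);
    if (diag->bufferCnt < DIAG_BUFFER_SIZE) {
        diag->buffer[diag->bufferCnt] = data;
        diag->bufferCnt++;
        stored = 1;
    }
    return stored;
}
```

The behavior is the same as diag->buffer[diag->bufferCnt++] = data; for a non-full buffer, but the order of the store and the increment is now visible in the source.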

Portability of using stddef.h's offsetof rather than rolling your own

This is a nitpicky-details question with three parts. The context is that I wish to persuade some folks that it is safe to use <stddef.h>'s definition of offsetof unconditionally rather than (under some circumstances) rolling their own. The program in question is written entirely in plain old C, so please ignore C++ entirely when answering.
Part 1: When used in the same manner as the standard offsetof, does the expansion of this macro provoke undefined behavior per C89, why or why not, and is it different in C99?
#define offset_of(tp, member) (((char*) &((tp*)0)->member) - (char*)0)
Note: All implementations of interest to the people whose program this is supersede the standard's rule that pointers may only be subtracted from each other when they point into the same array, by defining all pointers, regardless of type or value, to point into a single global address space. Therefore, please do not rely on that rule when arguing that this macro's expansion provokes undefined behavior.
Part 2: To the best of your knowledge, has there ever been a released, production C implementation that, when fed the expansion of the above macro, would (under some circumstances) behave differently than it would have if its offsetof macro had been used instead?
Part 3: To the best of your knowledge, what is the most recently released production C implementation that either did not provide stddef.h or did not provide a working definition of offsetof in that header? Did that implementation claim conformance with any version of the C standard?
For parts 2 and 3, please answer only if you can name a specific implementation and give the date it was released. Answers that state general characteristics of implementations that may qualify are not useful to me.
There is no way to write a portable offsetof macro. You must use the one provided by stddef.h.
Regarding your specific questions:
The macro invokes undefined behavior. You cannot subtract pointers except when they point into the same array.
The big difference in practical behavior is that the macro is not an integer constant expression, so it can't safely be used for static initializers, bitfield widths, etc. Also strict bounds-checking-type C implementations might completely break it.
There has never been any C standard that lacked stddef.h and offsetof. Pre-ANSI compilers might lack it, but they have much more fundamental problems that make them unusable for modern code (e.g. lack of void * and const).
Moreover, even if some theoretical compiler did lack stddef.h, you could just provide a drop-in replacement, just like the way people drop in stdint.h for use with MSVC...
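To illustrate the integer-constant-expression point, here is a small sketch that uses the standard offsetof in places where only a constant expression is allowed. The struct packet type and the names header_copy, payload_offset and payload_of are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

struct packet {
    unsigned char tag;
    unsigned long length;
    char payload[16];
};

/* offsetof yields an integer constant expression, so it is allowed
 * as the size of a file-scope array... */
char header_copy[offsetof(struct packet, payload)];

/* ...and in the initializer of an object with static storage duration. */
const size_t payload_offset = offsetof(struct packet, payload);

/* Recovers the address of the payload member from a pointer to the struct. */
char *payload_of(struct packet *p)
{
    assert(p != NULL);
    return (char *)p + offsetof(struct packet, payload);
}
```

Whether a hand-written pointer-subtraction macro is accepted in the same two file-scope positions depends on the compiler, which is exactly the portability problem described above.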
To answer #2: yes, gcc-4* (I'm currently looking at v4.3.4, released 4 Aug 2009, but it should hold true for all gcc-4 releases to date). The following definition is used in their stddef.h:
#define offsetof(TYPE, MEMBER) __builtin_offsetof (TYPE, MEMBER)
where __builtin_offsetof is a compiler builtin like sizeof (that is, it's not implemented as a macro or run-time function). Compiling the code:
#include <stddef.h>
struct testcase {
char array[256];
};
int main (void) {
char buffer[offsetof(struct testcase, array[0])];
return 0;
}
would result in an error using the expansion of the macro that you provided ("size of array ‘buffer’ is not an integral constant-expression") but would work when using the macro provided in stddef.h. Builds using gcc-3 used a macro similar to yours. I suppose that the gcc developers had many of the same concerns regarding undefined behavior, etc that have been expressed here, and created the compiler builtin as a safer alternative to attempting to generate the equivalent operation in C code.
Additional information:
A mailing list thread from the Linux kernel developer's list
GCC's documentation on offsetof
A sort-of-related question on this site
Regarding your other questions: I think R's answer and his subsequent comments do a good job of outlining the relevant sections of the standard as far as question #1 is concerned. As for your third question, I have not heard of a modern C compiler that does not have stddef.h. I certainly wouldn't consider any compiler lacking such a basic standard header as "production". Likewise, if their offsetof implementation didn't work, then the compiler still has work to do before it could be considered "production", just like if other things in stddef.h (like NULL) didn't work. A C compiler released prior to C's standardization might not have these things, but the ANSI C standard is over 20 years old so it's extremely unlikely that you'll encounter one of these.
The whole premise of this problem raises a question: If these people are convinced that they can't trust the version of offsetof that the compiler provides, then what can they trust? Do they trust that NULL is defined correctly? Do they trust that long int is no smaller than a regular int? Do they trust that memcpy works like it's supposed to? Do they roll their own versions of the rest of the C standard library functionality? One of the big reasons for having language standards is so that you can trust the compiler to do these things correctly. It seems silly to trust the compiler for everything else except offsetof.
Update: (in response to your comments)
I think my co-workers behave like yours do :-) Some of our older code still has custom macros defining NULL, VOID, and other things like that since "different compilers may implement them differently" (sigh). Some of this code was written back before C was standardized, and many older developers are still in that mindset even though the C standard clearly says otherwise.
Here's one thing you can do to both prove them wrong and make everyone happy at the same time:
#include <stddef.h>
#ifndef offsetof
#define offsetof(tp, member) (((char*) &((tp*)0)->member) - (char*)0)
#endif
In reality, they'll be using the version provided in stddef.h. The custom version will always be there, however, in case you run into a hypothetical compiler that doesn't define it.
Based on similar conversations that I've had over the years, I think the belief that offsetof isn't part of standard C comes from two places. First, it's a rarely used feature. Developers don't see it very often, so they forget that it even exists. Second, offsetof is not mentioned at all in Kernighan and Ritchie's seminal book "The C Programming Language" (even the most recent edition). The first edition of the book was the unofficial standard before C was standardized, and I often hear people mistakenly referring to that book as THE standard for the language. It's much easier to read than the official standard, so I don't know if I blame them for making it their first point of reference. Regardless of what they believe, however, the standard is clear that offsetof is part of ANSI C (see R's answer for a link).
Here's another way of looking at question #1. The ANSI C standard gives the following definition in section 4.1.5:
offsetof( type, member-designator)
which expands to an integral constant expression that has type size_t,
the value of which is the offset in bytes, to the structure member
(designated by member-designator ), from the beginning of its
structure (designated by type ).
Using the offsetof macro does not invoke undefined behavior. In fact, the behavior is all that the standard actually defines. It's up to the compiler writer to define the offsetof macro such that its behavior follows the standard. Whether it's implemented using a macro, a compiler builtin, or something else, ensuring that it behaves as expected requires the implementor to deeply understand the inner workings of the compiler and how it will interpret the code. The compiler may implement it using a macro like the idiomatic version you provided, but only because they know how the compiler will handle the non-standard code.
On the other hand, the macro expansion you provided indeed invokes undefined behavior. Since you don't know enough about the compiler to predict how it will process the code, you can't guarantee that particular implementation of offsetof will always work. Many people define their own version like that and don't run into problems, but that doesn't mean that the code is correct. Even if that's the way that a particular compiler happens to define offsetof, writing that code yourself invokes UB while using the provided offsetof macro does not.
Rolling your own macro for offsetof can't be done without invoking undefined behavior (ANSI C section A.6.2 "Undefined behavior", 27th bullet point). Using stddef.h's version of offsetof will always produce the behavior defined in the standard (assuming a standards-compliant compiler). I would advise against defining a custom version since it can cause portability problems, but if others can't be persuaded then the #ifndef offsetof snippet provided above may be an acceptable compromise.
(1) The undefined behavior is already there before you do the subtraction.
First of all, (tp*)0 is not what you think it is. It is a null
pointer, such a beast is not necessarily represented with all-zero
bit pattern.
Then the member operator -> is not simply an offset addition. On a CPU with segmented memory this might be a more complicated operation.
Taking the address with a & operation is UB if the expression is
not a valid object.
(2) For point 2, there are certainly still architectures out in the wild (embedded stuff) that use segmented memory. For point 3, the point that R makes about integer constant expressions has another drawback: if the code is badly optimized, the & operation might be done at runtime and signal an error.
(3) Never heard of such a thing, but this is probably not enough to convince your colleagues.
I believe that nearly every optimizing compiler has broken that macro at multiple points in time. Your coworkers have apparently been lucky enough not to have been hit by it.
What happens is that some junior compiler engineer decides that because the zero page is never mapped on their platform of choice, any time anyone does anything with a pointer to that page, that's undefined behavior and they can safely optimize away the whole expression. At that point, everyone's homebrew offsetof macros break until enough people scream about it, and those of us who were smart enough not to roll our own go happily about our business.
I don't know of any compiler where this is the behavior in the current released version, but I think I've seen it happen at some point with every compiler I've ever worked with.
