I'm using QAC and I get the below message for the respective source code line. How can I cast it in order for QAC to "understand" it ?
Compiler used is gcc - it doesn't warn about this issue, as it is set to "iso c99".
#define DIAGMGR_SIGNED_2_BYTES_178 ((s16)178)
sK = (s16)(sE1 / DIAGMGR_SIGNED_2_BYTES_178);
^
Result of signed division or remainder operation may be implementation
defined
.
A division ('/') or remainder ('%') operation is being performed in a
signed integer type and the result may be implementation-defined.
Message 3103 is generated for an integer division or remainder
operation in a signed type where:
One or both operands are non-constant and of signed integer type, or
Both operands are integer constant expressions, one of negative value and one of positive value
A signed integer division or remainder operation in which one operand
is positive and the other is negative may be performed in one of two
ways:
The division will round towards zero and any non-zero remainder will be a negative value
The division will round away from zero and any non-zero remainder will be a positive value In the ISO:C99 standard the first approach is
always used. In the ISO:C90 standard either approach may be used - the
result is implementation defined. For example:
/PRQA S 3120,3198,3408,3447 ++/
extern int r;
extern int si;
extern void foo(void)
{
r = -7 / 4; /* Message 3103 *//* Result is -1 in C99 but may be -2 in C90 */
r = -7 % 4; /* Message 3103 *//* Result is -3 in C99 but may be 1 in C90 */
si = si / r; /* Message 3103 */
}
You need to configure the tool so that it understands that your code is C99. In the old C90 standard, division with negative numbers could be implemented in two different ways, see this. This was a known "bug" in the C90 standard, which has been fixed since C99.
This is a standard warning for most static analysis tools, particularly if they are set to check for MISRA-C compliance. Both MISRA-C:2004 and 2012 require that the programmer is aware of this C standard "bug".
Work-arounds in C90:
If you know for certain that the operands aren't negative, simply cast them to unsigned type, or use unsigned type to begin with.
If you know that the operands might be negative:
If either operand is negative, set flags to indicate which one(s) it was.
Take the absolute values of both operands.
Perform division on absolute values.
Re-add sign to the numbers.
That's unfortunately the only portable work-around in C90. Alternatively you could add a static assertion to prevent the code from compiling on systems that truncate negative numbers downwards.
If you are using C99, no work-arounds are needed, as it always truncates towards zero. You can then safely disable the warning.
Related
The following macro is from an MCAL source of a microcontroller and it converts timer ticks to milliseconds.
#define TICKS2MS(x) ( (uint64) (((((uint64)(x)) * 1) + 0) / 100000) )
Could you please help me understand the significance of multiplying by 1 and adding 0?
The multiplication and addition are in fact pointless, as is the outer cast.
Both operators perform the usual arithmetic conversions on both operands.
For the multiplication, the left operand has type uint64 (as a result of the cast) and the right operand has type int. Since uint64 is the larger type it will be the type of the result. The operand 1 does not change value as a result of the conversion, so in multiplying by 1 the result has the same type and value as (uint64)(x).
Similarly for the addition, the operands are of type uint64 and int respectively, meaning the resulting type is uint64, and 0 does not change value after the conversion. So by adding 0 the result has the same type and value as (uint64)(x) * 1 which has the same type and value as (uint64)(x).
The cast at the end is also superfluous, as the casted expression already has type uint64. As above, the division operator performs the usual arithmetic conversions on its operands so dividing a uint64 by an int results in a uint64.
So the above macro is equivalent to:
#define TICKS2MS(x) ((uint64)(x) / 100000)
I see the words MCAL, TICKS2MS(x) and uint64 (not the uint64_t from stdint.h) .. and it sounds to me like this is an AUTOSAR Environment. (You should have added the autosartag to the question)
The different explicit casts are then most likely due to MISRA-C checks. The MISRA-C check tools can be quite nitpicking about this.
If that macro TICK2MS(x) is part of an xxx_Cfg.h or xxx_PBCfg.h (where xxx is like Gpt, Mcu or Os), then it is most likely generated by the AUTOSAR/MCAL Driver configuration tool. And then, the factor, offset and divider could be part of adjusting to prescaler and maybe a TimerMaxValue (wrap around) or a correction due to odd prescaler/frequency to ticks constellation, in order to correct the failure. If the config is such that such feature is not used, the default values might be generated by the configuration tool.
Since the macro contains only constants, the compiler can optimize this into a simple ((uint64)(x) / 100000).
Context
We are porting C code that was originally compiled using an 8-bit C compiler for the PIC microcontroller. A common idiom that was used in order to prevent unsigned global variables (for example, error counters) from rolling over back to zero is the following:
if(~counter) counter++;
The bitwise operator here inverts all the bits and the statement is only true if counter is less than the maximum value. Importantly, this works regardless of the variable size.
Problem
We are now targeting a 32-bit ARM processor using GCC. We've noticed that the same code produces different results. So far as we can tell, it looks like the bitwise complement operation returns a value that is a different size than we would expect. To reproduce this, we compile, in GCC:
uint8_t i = 0;
int sz;
sz = sizeof(i);
printf("Size of variable: %d\n", sz); // Size of variable: 1
sz = sizeof(~i);
printf("Size of result: %d\n", sz); // Size of result: 4
In the first line of output, we get what we would expect: i is 1 byte. However, the bitwise complement of i is actually four bytes which causes a problem because comparisons with this now will not give the expected results. For example, if doing (where i is a properly-initialized uint8_t):
if(~i) i++;
we will see i "wrap around" from 0xFF back to 0x00. This behaviour is different in GCC compared with when it used to work as we intended in the previous compiler and 8-bit PIC microcontroller.
We are aware that we can resolve this by casting like so:
if((uint8_t)~i) i++;
or, by
if(i < 0xFF) i++;
however in both of these workarounds, the size of the variable must be known and is error-prone for the software developer. These kinds of upper bounds checks occur throughout the codebase. There are multiple sizes of variables (eg., uint16_t and unsigned char etc.) and changing these in an otherwise working codebase is not something we're looking forward to.
Question
Is our understanding of the problem correct, and are there options available to resolving this that do not require re-visiting each case where we've used this idiom? Is our assumption correct, that an operation like bitwise complement should return a result that is the same size as the operand? It seems like this would break, depending on processor architectures. I feel like I'm taking crazy pills and that C should be a bit more portable than this. Again, our understanding of this could be wrong.
On the surface this might not seem like a huge issue but this previously-working idiom is used in hundreds of locations and we're eager to understand this before proceeding with expensive changes.
Note: There is a seemingly similar but not exact duplicate question here: Bitwise operation on char gives 32 bit result
I didn't see the actual crux of the issue discussed there, namely, the result size of a bitwise complement being different than what's passed into the operator.
What you are seeing is the result of integer promotions. In most cases where an integer value is used in an expression, if the type of the value is smaller than int the value is promoted to int. This is documented in section 6.3.1.1p2 of the C standard:
The following may be used in an expression wherever an intor
unsigned int may be used
An object or expression with an integer type (other than intor unsigned int) whose integer conversion rank is less
than or equal to the rank of int and unsigned int.
A bit-field of type _Bool, int ,signed int, orunsigned int`.
If an int can represent all values of the original type (as
restricted by the width, for a bit-field), the value is
converted to an int; otherwise, it is converted to an
unsigned int. These are called the integer promotions. All
other types are unchanged by the integer promotions.
So if a variable has type uint8_t and the value 255, using any operator other than a cast or assignment on it will first convert it to type int with the value 255 before performing the operation. This is why sizeof(~i) gives you 4 instead of 1.
Section 6.5.3.3 describes that integer promotions apply to the ~ operator:
The result of the ~ operator is the bitwise complement of its
(promoted) operand (that is, each bit in the result is set if and only
if the corresponding bit in the converted operand is not set). The
integer promotions are performed on the operand, and the
result has the promoted type. If the promoted type is an unsigned
type, the expression ~E is equivalent to the maximum value
representable in that type minus E.
So assuming a 32 bit int, if counter has the 8 bit value 0xff it is converted to the 32 bit value 0x000000ff, and applying ~ to it gives you 0xffffff00.
Probably the simplest way to handle this is without having to know the type is to check if the value is 0 after incrementing, and if so decrement it.
if (!++counter) counter--;
The wraparound of unsigned integers works in both directions, so decrementing a value of 0 gives you the largest positive value.
in sizeof(i); you request the size of the variable i, so 1
in sizeof(~i); you request the size of the type of the expression, which is an int, in your case 4
To use
if(~i)
to know if i does not value 255 (in your case with an the uint8_t) is not very readable, just do
if (i != 255)
and you will have a portable and readable code
There are multiple sizes of variables (eg., uint16_t and unsigned char etc.)
To manage any size of unsigned :
if (i != (((uintmax_t) 2 << (sizeof(i)*CHAR_BIT-1)) - 1))
The expression is constant, so computed at compile time.
#include <limits.h> for CHAR_BIT and #include <stdint.h> for uintmax_t
Here are several options for implementing “Add 1 to x but clamp at the maximum representable value,” given that x is some unsigned integer type:
Add one if and only if x is less than the maximum value representable in its type:
x += x < Maximum(x);
See the following item for the definition of Maximum. This method
stands a good chance of being optimized by a compiler to efficient
instructions such as a compare, some form of conditional set or move,
and an add.
Compare to the largest value of the type:
if (x < ((uintmax_t) 2u << sizeof x * CHAR_BIT - 1) - 1) ++x
(This calculates 2N, where N is the number of bits in x, by shifting 2 by N−1 bits. We do this instead of shifting 1 N bits because a shift by the number of bits in a type is not defined by the C standard. The CHAR_BIT macro may be unfamiliar to some; it is the number of bits in a byte, so sizeof x * CHAR_BIT is the number of bits in the type of x.)
This can be wrapped in a macro as desired for aesthetics and clarity:
#define Maximum(x) (((uintmax_t) 2u << sizeof (x) * CHAR_BIT - 1) - 1)
if (x < Maximum(x)) ++x;
Increment x and correct if it wraps to zero, using an if:
if (!++x) --x; // !++x is true if ++x wraps to zero.
Increment x and correct if it wraps to zero, using an expression:
++x; x -= !x;
This is is nominally branchless (sometimes beneficial for performance), but a compiler may implement it the same as above, using a branch if needed but possibly with unconditional instructions if the target architecture has suitable instructions.
A branchless option, using the above macro, is:
x += 1 - x/Maximum(x);
If x is the maximum of its type, this evaluates to x += 1-1. Otherwise, it is x += 1-0. However, division is somewhat slow on many architectures. A compiler may optimize this to instructions without division, depending on the compiler and the target architecture.
Before stdint.h the variable sizes can vary from compiler to compiler and the actual variable types in C are still int, long, etc and are still defined by the compiler author as to their size. Not some standard nor target specific assumptions. The author(s) then need to create stdint.h to map the two worlds, that is the purpose of stdint.h to map the uint_this that to int, long, short.
If you are porting code from another compiler and it uses char, short, int, long then you have to go through each type and do the port yourself, there is no way around it. And either you end up with the right size for the variable, the declaration changes but the code as written works....
if(~counter) counter++;
or...supply the mask or typecast directly
if((~counter)&0xFF) counter++;
if((uint_8)(~counter)) counter++;
At the end of the day if you want this code to work you have to port it to the new platform. Your choice as to how. Yes, you have to spend the time hit each case and do it right, otherwise you are going to keep coming back to this code which is even more expensive.
If you isolate the variable types on the code before porting and what size the variable types are, then isolate the variables that do this (should be easy to grep) and change their declarations using stdint.h definitions which hopefully won't change in the future, and you would be surprised but the wrong headers are used sometimes so even put checks in so you can sleep better at night
if(sizeof(uint_8)!=1) return(FAIL);
And while that style of coding works (if(~counter) counter++;), for portability desires now and in the future it is best to use a mask to specifically limit the size (and not rely on the declaration), do this when the code is written in the first place or just finish the port and then you won't have to re-port it again some other day. Or to make the code more readable then do the if x<0xFF then or x!=0xFF or something like that then the compiler can optimize it into the same code it would for any of these solutions, just makes it more readable and less risky...
Depends on how important the product is or how many times you want send out patches/updates or roll a truck or walk to the lab to fix the thing as to whether you try to find a quick solution or just touch the affected lines of code. if it is only a hundred or few that is not that big of a port.
6.5.3.3 Unary arithmetic operators
...
4 The result of the ~ operator is the bitwise complement of its (promoted) operand (that is,
each bit in the result is set if and only if the corresponding bit in the converted operand is
not set). The integer promotions are performed on the operand, and the result has the
promoted type. If the promoted type is an unsigned type, the expression ~E is equivalent
to the maximum value representable in that type minus E.
C 2011 Online Draft
The issue is that the operand of ~ is being promoted to int before the operator is applied.
Unfortunately, I don't think there's an easy way out of this. Writing
if ( counter + 1 ) counter++;
won't help because promotions apply there as well. The only thing I can suggest is creating some symbolic constants for the maximum value you want that object to represent and testing against that:
#define MAX_COUNTER 255
...
if ( counter < MAX_COUNTER-1 ) counter++;
There is some debate between my colleague and I about the U suffix after hexadecimally represented literals. Note, this is not a question about the meaning of this suffix or about what it does. I have found several of those topics here, but I have not found an answer to my question.
Some background information:
We're trying to come to a set of rules that we both agree on, to use that as our style from that point on. We have a copy of the 2004 Misra C rules and decided to use that as a starting point. We're not interested in being fully Misra C compliant; we're cherry picking the rules that we think will most increase efficiency and robustness.
Rule 10.6 from the aforementioned guidelines states:
A “U” suffix shall be applied to all constants of unsigned type.
I personally think this is a good rule. It takes little effort, looks better than explicit casts and more explicitly shows the intention of a constant. To me it makes sense to use it for all unsigned contants, not just numerics, since enforcing a rule doesn't happen by allowing exceptions, especially for a commonly used representation of constants.
My colleague, however, feels that the hexadecimal representation doesn't need the suffix. Mostly because we almost exclusively use it to set micro-controller registers, and signedness doesn't matter when setting registers to hex constants.
My Question
My question is not one about who is right or wrong. It is about finding out whether there are cases where the absence or presence of the suffix changes the outcome of an operation. Are there any such cases, or is it a matter of consistency?
Edit: for clarification; Specifically about setting micro-controller registers by assigning hexadecimal values to them. Would there be a case where the suffix could make a difference there? I feel like it wouldn't. As an example, the Freescale Processor Expert generates all register assignments as unsigned.
Appending a U suffix to all hexadecimal constants makes them unsigned as you already mentioned. This may have undesirable side-effects when these constants are used in operations along with signed values, especially comparisons.
Here is a pathological example:
#define MY_INT_MAX 0x7FFFFFFFU // blindly applying the rule
if (-1 < MY_INT_MAX) {
printf("OK\n");
} else {
printf("OOPS!\n");
}
The C rules for signed/unsigned conversions are precisely specified, but somewhat counter-intuitive so the above code will indeed print OOPS.
The MISRA-C rule is precise as it states A “U” suffix shall be applied to all constants of unsigned type. The word unsigned has far reaching consequences and indeed most constants should not really be considered unsigned.
Furthermore, the C Standard makes a subtile difference between decimal and hexadecimal constants:
A hexadecimal constant is considered unsigned if its value can be represented by the unsigned integer type and not the signed integer type of the same size for types int and larger.
This means that on 32-bit 2's complement systems, 2147483648 is a long or a long long whereas 0x80000000 is an unsigned int. Appending a U suffix may make this more explicit in this case but the real precaution to avoid potential problems is to mandate the compiler to reject signed/unsigned comparisons altogether: gcc -Wall -Wextra -Werror or clang -Weverything -Werror are life savers.
Here is how bad it can get:
if (-1 < 0x8000) {
printf("OK\n");
} else {
printf("OOPS!\n");
}
The above code should print OK on 32-bit systems and OOPS on 16-bit systems. To make things even worse, it is still quite common to see embedded projects use obsolete compilers which do not even implement the Standard semantics for this issue.
For your specific question, the defined values for micro-processor registers used specifically to set them via assignment (assuming these registers are memory-mapped), need not have the U suffix at all. The register lvalue should have an unsigned type and the hex value will be signed or unsigned depending on its value, but the operation will proceed the same. The opcode for setting a signed number or an unsigned number is the same on your target architecture and on any architectures I have ever seen.
With all integer-constants
Appending u/U insures the integer-constant will be some unsigned type.
Without a u/U
For a decimal-constant, the integer-constant will be some signed type.
For a hexadecimal/octal-constant, the integer-constant will be signed or unsigned type, depending of value and integer type ranges.
Note: All integer-constants have positive values.
// +-------- unary operator
// |+-+----- integer-constant
int x = -123;
absence or presence of the suffix changes the outcome of an operation?
When is this important?
With various expressions, the sign-ness and width of the math needs to be controlled and preferable not surprising.
// Examples: assume 32-bit `unsigned`, `long`, 64-bit `long long`
// Bad signed int overflow (UB)
unsigned a = 4000 * 1000 * 1000;
// OK
unsigned b = 4000u * 1000 * 1000;
// undefined behavior
unsigned c = 1 << 31
// OK
unsigned d = 1u << 31
printf("Size %zu\n", sizeof(0xFFFFFFFF)); // 8 type is `long long`
printf("Size %zu\n", sizeof(0xFFFFFFFFu)); // 4 type is `unsigned`
// 2 ** 63
long long e = -9223372036854775808; // C99: bad "9223372036854775808" not representable
long long f = -9223372036854775807 - 1; // ok
long long g = -9223372036854775808u; // implementation defined behavior **
some_unsigned_type h_max = -1; OK, max value for the target type.
some_unsigned_type i_max = -1u; OK, but not max value for wide unsigned types
// when negating a negative `int`
unsigned j = 0 - INT_MIN; // typically int overflow or UB
unsigned k = 0u - INT_MIN; // Never UB
** or an implementation-defined signal is raised.
For the specific question, which was loading register(s), then the U makes it an unsigned value, but whether the compiler treats the n-bit word pattern as a signed or unsigned value it will move the same bit pattern, assuming there isn't any size extension that would propagate an MSB. The difference that might matter is if the register load operation will set any processor condition flags based on a signed or unsigned loading. As an overall guide if the processor supports storing a constant to configuration register or a memory address then loading a peripheral register is unlikely to set the processor's NEG condition flag. Loading a general purpose register connected to an ALU, one that can be the target of an arithmetic operation like add increment or decrement, might set a negative flag on loading so that e.g. a trailing "branch (if) negative" opcode would execute the branch. You would want to check the processor's references to be sure. Small instruction set processors tend to have only a load register instruction, while larger instruction sets are more likely to have a load unsigned variant of the load instruction that doesn't set the NEG bit in the processor's flags, but again, check the processor's references. if you don't have access to the processor's errata (the boo-boo list) and need a specific flag state. All of this only tends to come up when an optimizing compiler re-aranges code with an inline assembly instruction and other uncommon situations. Examine the generate assembly code, turn off some or all compiler optimizations for the module when needed, etc.
According to this link, performing INT_MIN / -1 division operation will result in the program to terminate in i386 CPUs. My processor is of 32-bit architecture and I use GCC compiler. I have done the following experiments to check it.
int a = INT_MIN;
int b = -1;
int c = a / b;
printf("%d\n",c);
As per the information specified in the link mentioned above this program gets terminated throwing a Floating point exception. But it wasn't the same when I tried it in a different manner.
int c = INT_MIN / -1;
printf("%d\n",c);
The compiler threw the following warning after compiling this program.
iso.c: In function ‘main’:
iso.c:6:18: warning: integer overflow in expression [-Woverflow]
int c = INT_MIN / -1;
_____________^
But I got the output -2147483648. Once again I did more two experiments.
int a = INT_MIN;
int b = -1;
printf("%d\n",a / b);
This was a floating point exception.
printf("%d\n",INT_MIN / -1);
This threw the following compiler warning.
iso.c: In function ‘main’:
iso.c:6:24: warning: integer overflow in expression [-Woverflow]
printf("%d\n",INT_MIN / -1);
__________________^
And the output of this program was again -2147483648.
After doing all these experiments, I have noticed that result of division operation done directly on constants differs from the result of the division operation done on variables. So what exactly is making this difference?
Both results are acceptable according to standard. Draft n1256 for C99 says (emphasize mine):
6.5 Expressions...
5 If an exceptional condition occurs during the evaluation of an expression (that is, if the
result is not mathematically defined or not in the range of representable values for its
type), the behavior is undefined.
In 2's complement integer representation, INT_MIN/-1 is INT_MAX + 1 so the operation invokes Undefined Behaviour, so any result (including crash) is acceptable
As explained by #tilz0R in his comment, when the values are passed in variables, the operation is executed at run time and raises a SIGFPE signal. But when the operation only involves compile time constants, the operation is executed by the compiler at compiler time. In gcc implementation, the compiler protect itself against the error and simply uses its best represention for INT_MAX + 1. In a 32 bit 2's complement implementation, INT_MAX is 0x7fffffff, so INT_MAX + 1 is (after signed overflow) 0x80000000 or INT_MIN again (-2147483648)
For example, suppose you have these variables:
int i = 9;
int j = 7;
Depending on the implementation, the value of, (-i)/j, could be either –1 or –2. How is it possible to get these two different results?
Surprisingly the result is implementation defined in C89:
ANSI draft § 3.3.5
When integers are divided and the division is inexact, if both
operands are positive the result of the / operator is the largest
integer less than the algebraic quotient and the result of the %
operator is positive. If either operand is negative, whether the
result of the / operator is the largest integer less than the
algebraic quotient or the smallest integer greater than the algebraic
quotient is implementation-defined
However this was changed in C99
N1256
§ 6.5.5/6
When integers are divided, the result of the / operator is the algebraic quotient with any fractional part discarded*
With a footnote:
* This is often called "truncation toward zero"
To clarify, "implementation defined" means the implementation must decide which one, it doesn't mean sometimes you'll get one thing and sometimes you'll get another (unless the implementation defined it to do something really strange like that, I guess).
In C89, the result of division / can be truncated either way for negative operands. (In C99, the result will be truncated toward zero.)
The historical reason is explained in C99 Rationale:
Rationale for International Standard — Programming Languages — C §6.5.5 Multiplicative operators
In C89, division of integers involving negative operands could round upward or downward in an implementation-defined manner; the intent was to avoid incurring overhead in run-time code to check for special cases and enforce specific behavior. In Fortran, however, the result will always truncate toward zero, and the overhead seems to be acceptable to the numeric programming community. Therefore, C99 now requires similar behavior, which should facilitate porting of code from Fortran to C.