Integer conversion resulted in truncation - c

In embedded C, I just put a value of 500000 into a 16-bit slot. It's giving me the warning "Integer conversion resulted in truncation". In this event, does that mean the value is set to 65535, 41248 (which is the remainder of 500000/65536), or another value? Or is there not enough information given here to determine its value, and there are other factors at play? Please let me know if more info is needed.
(Sample code, in case it helps)
TA0CCR0 = 500000-1;
TA0CCTL1 = OUTMOD_7;
TA0CCR1 = 250000;
TA0CCTL2 = OUTMOD_7;
TA0CCR2 = 850;
TA0CTL = TASSEL__SMCLK | MC__UP | TACLR;

It depends on whether the variable is signed or unsigned.
Please see C18 §6.3.1.3
Signed and unsigned integers
1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.

The C language was designed to be "strongly typed, weakly checked" (direct quote from Dennis Ritchie, one of its authors), meaning that even though all type validation is done at compile time, the compiler will usually pick the path of least resistance when generating machine code. On most architectures, using a 16-bit type simply means that it will use 16-bit load and store instructions for that variable, which will automatically reduce its value modulo 65536. So while the compiler notices that you're trying to put a value that's larger than (2^16)-1 into a 16-bit integer, it also won't really do anything about it. So yes, your variable will contain the value 41248.
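A minimal standalone sketch (hypothetical test code, not the MSP430 register code above) that shows the wrap-around, assuming a platform where uint16_t exists and int is wider than 16 bits:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint16_t ccr0  = 500000u - 1;  /* 499999 mod 65536 = 41247 */
    uint16_t plain = 500000u;      /* 500000 mod 65536 = 41248 */

    printf("%u\n", (unsigned)ccr0);  /* prints 41247 */
    printf("%u\n", (unsigned)plain); /* prints 41248 */
    return 0;
}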

does that mean the value is set to 65535, 41248 (which is the remainder of 500000/65536), or another value?
With unsigned types, the value is wrapped (mod 65536).
With signed values, it is implementation defined by the compiler. Commonly, it is also wrapped. Robust portable code does not assume this.
Recommendation: Use unsigned math and types to achieve specified consistent results.

Related

Can I treat an `enum` variable as an `int` in C17?

TL;DR: Is it right to assume, given enum NAME {...};, that enum NAME n is the same as int n during execution? Can n be operated on as if it were a signed int, even though it is declared as enum NAME? The reason: I really want to use enum types for return flags, as a type 'closed' with respect to bit-operations.
For example: Let typedef enum FLAGS { F1 = 0x00000001, F2 = 0x00000002, F3 = 0x00000004 } FLAGS;
Then, FLAGS f = F1 | F2; assigns 3 to f, throwing no related errors or warnings. This and numerous other compiler-permitted usage scenarios, such as f++, make me think I could legit treat f as if it were a signed int. Compiler used: MSVC'19, 16.9.1, with the setting "C17 (2018) Standard (/std:c17)".
I searched the standard (the sketch here) and looked at other related questions, but found no mention of what I suspected (and wished) to be a "silent promotion" of enum NAME x to signed int x, even though the identifiers have that type. This leads me to believe that the way an enum behaves when assigned a value that isn't a member is implementation dependent. I'm asking, in part, in order to confirm or deny this claim.
C 2018 6.7.2.2 4 says:
Each enumerated type shall be compatible with char, a signed integer type, or an unsigned integer type. The choice of type is implementation-defined, but shall be capable of representing the values of all the members of the enumeration…
So the answer to “Can I treat an enum variable as an int in C17?” is no, as an object with enumerated type might be effectively a char or other integer type different from int.
However, it is effectively an integer type, so FLAGS f = F1 | F2; will work: The FLAGS type must be capable of representing its values F1 and F2, so whatever type is used for FLAGS must contain all the bits of F1 and of F2, so it contains all the bits of F1 | F2.
Technically, you could construct a trap representation by manipulating bits, so it is not guaranteed that the type is closed under bit operations. For example, if a C implementation used two's complement for 32-bit int but reserved the bit pattern 1000…0000 as a trap representation, then INT_MIN & -2 would be a trap representation. (INT_MIN would have the bit pattern 1000…0001, for −(2^31 − 1), and -2 would have the pattern 1111…1110.) This does not occur in C implementations without trap representations in their integer types.
We might question whether the fact that two types (an enumeration and its implementation-defined integer type) are compatible means we can use one as the other. Two types are compatible if they are the same (6.2.7 1), and the only things that can make types compatible but not the same involve qualifiers (like const) that are not an issue for this or involve other properties (such as array dimensions) that are not relevant to simple integer types.
This is in chapter 6.4.4.3 of the PDF you linked:
An identifier declared as an enumeration constant has type int.
Your thought of a promotion of enum NAME x to signed int x is not really true: it is the enumeration constants (the identifiers listed inside the braces) that have type int. The variable itself has the implementation-defined integer type chosen for the enumeration, and it is promoted to int in expressions.
Additionally, integer promotion takes place in integer operations.
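A short compilable sketch of the FLAGS example under discussion; the underlying type of FLAGS is implementation-defined, but the bitwise combination and the promotion in arithmetic behave as described:

#include <stdio.h>

typedef enum FLAGS { F1 = 0x00000001, F2 = 0x00000002, F3 = 0x00000004 } FLAGS;

int main(void)
{
    FLAGS f = F1 | F2;   /* F1 and F2 have type int; the value 3 is stored in f */

    /* In the expressions below f is promoted (or converted) to an ordinary
       integer type, so integer arithmetic applies as usual. */
    printf("f = %d, f & F2 = %d, sizeof f = %zu\n",
           (int)f, (int)(f & F2), sizeof f);
    return 0;
}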
EDIT
Some compilers are quite serious about the difference between enum and int, especially if they have an option to reduce the bit width to the smallest possible. For example, the one I'm using in a project at work automatically inserts checks on each usage of an enum value against the defined values. Additionally, IIRC, it rejects all implicit conversions; we need to cast explicitly, similar to:
FLAGS f = (FLAGS)((int)F1 | (int)F2);
But this is an extension of that particular compiler, invoked with specific safety options...

Bitwise operation results in unexpected variable size

Context
We are porting C code that was originally compiled using an 8-bit C compiler for the PIC microcontroller. A common idiom that was used in order to prevent unsigned global variables (for example, error counters) from rolling over back to zero is the following:
if(~counter) counter++;
The bitwise operator here inverts all the bits and the statement is only true if counter is less than the maximum value. Importantly, this works regardless of the variable size.
Problem
We are now targeting a 32-bit ARM processor using GCC. We've noticed that the same code produces different results. So far as we can tell, it looks like the bitwise complement operation returns a value that is a different size than we would expect. To reproduce this, we compile, in GCC:
uint8_t i = 0;
int sz;
sz = sizeof(i);
printf("Size of variable: %d\n", sz); // Size of variable: 1
sz = sizeof(~i);
printf("Size of result: %d\n", sz); // Size of result: 4
In the first line of output, we get what we would expect: i is 1 byte. However, the bitwise complement of i is actually four bytes which causes a problem because comparisons with this now will not give the expected results. For example, if doing (where i is a properly-initialized uint8_t):
if(~i) i++;
we will see i "wrap around" from 0xFF back to 0x00. This behaviour under GCC differs from the previous compiler on the 8-bit PIC microcontroller, where the code worked as we intended.
We are aware that we can resolve this by casting like so:
if((uint8_t)~i) i++;
or, by
if(i < 0xFF) i++;
however, in both of these workarounds the developer must know the size of the variable, which is error-prone. These kinds of upper-bound checks occur throughout the codebase. There are multiple sizes of variables (e.g., uint16_t and unsigned char etc.) and changing these in an otherwise working codebase is not something we're looking forward to.
Question
Is our understanding of the problem correct, and are there options available for resolving this that do not require re-visiting each case where we've used this idiom? Is our assumption correct, that an operation like bitwise complement should return a result that is the same size as the operand? It seems like this would break, depending on processor architectures. I feel like I'm taking crazy pills and that C should be a bit more portable than this. Again, our understanding of this could be wrong.
On the surface this might not seem like a huge issue but this previously-working idiom is used in hundreds of locations and we're eager to understand this before proceeding with expensive changes.
Note: There is a seemingly similar but not exact duplicate question here: Bitwise operation on char gives 32 bit result
I didn't see the actual crux of the issue discussed there, namely, the result size of a bitwise complement being different than what's passed into the operator.
What you are seeing is the result of integer promotions. In most cases where an integer value is used in an expression, if the type of the value is smaller than int the value is promoted to int. This is documented in section 6.3.1.1p2 of the C standard:
The following may be used in an expression wherever an int or unsigned int may be used:
- An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
- A bit-field of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.
So if a variable has type uint8_t and the value 255, using any operator other than a cast or assignment on it will first convert it to type int with the value 255 before performing the operation. This is why sizeof(~i) gives you 4 instead of 1.
Section 6.5.3.3 describes that integer promotions apply to the ~ operator:
The result of the ~ operator is the bitwise complement of its
(promoted) operand (that is, each bit in the result is set if and only
if the corresponding bit in the converted operand is not set). The
integer promotions are performed on the operand, and the
result has the promoted type. If the promoted type is an unsigned
type, the expression ~E is equivalent to the maximum value
representable in that type minus E.
So assuming a 32 bit int, if counter has the 8 bit value 0xff it is converted to the 32 bit value 0x000000ff, and applying ~ to it gives you 0xffffff00.
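A small sketch that makes the promotion visible (the hexadecimal values assume a 32-bit int, as above):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint8_t counter = 0xFF;

    /* counter is promoted to int (0x000000FF) before ~ is applied,
       so ~counter is 0xFFFFFF00 -- non-zero, hence "true". */
    printf("~counter          = 0x%08X\n", (unsigned)~counter);
    printf("truthy?           = %d\n", ~counter ? 1 : 0);          /* prints 1 */
    printf("(uint8_t)~counter = 0x%02X\n", (uint8_t)~counter);     /* prints 0x00 */
    return 0;
}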
Probably the simplest way to handle this without having to know the type is to check whether the value is 0 after incrementing, and if so decrement it.
if (!++counter) counter--;
The wraparound of unsigned integers works in both directions, so decrementing a value of 0 gives you the largest positive value.
In sizeof(i) you request the size of the variable i, so 1.
In sizeof(~i) you request the size of the type of the expression ~i, which is int, in your case 4.
To use
if(~i)
to check that i is not 255 (in your case, with a uint8_t) is not very readable; just do
if (i != 255)
and you will have portable and readable code.
There are multiple sizes of variables (eg., uint16_t and unsigned char etc.)
To manage any size of unsigned:
if (i != (((uintmax_t) 2 << (sizeof(i)*CHAR_BIT-1)) - 1))
The expression is constant, so computed at compile time.
#include <limits.h> for CHAR_BIT and #include <stdint.h> for uintmax_t
Here are several options for implementing “Add 1 to x but clamp at the maximum representable value,” given that x is some unsigned integer type:
Add one if and only if x is less than the maximum value representable in its type:
x += x < Maximum(x);
See the following item for the definition of Maximum. This method stands a good chance of being optimized by a compiler to efficient instructions such as a compare, some form of conditional set or move, and an add.
Compare to the largest value of the type:
if (x < ((uintmax_t) 2u << sizeof x * CHAR_BIT - 1) - 1) ++x;
(This calculates 2^N, where N is the number of bits in x, by shifting 2 by N−1 bits. We do this instead of shifting 1 by N bits because a shift by the number of bits in a type is not defined by the C standard. The CHAR_BIT macro may be unfamiliar to some; it is the number of bits in a byte, so sizeof x * CHAR_BIT is the number of bits in the type of x.)
This can be wrapped in a macro as desired for aesthetics and clarity:
#define Maximum(x) (((uintmax_t) 2u << sizeof (x) * CHAR_BIT - 1) - 1)
if (x < Maximum(x)) ++x;
Increment x and correct if it wraps to zero, using an if:
if (!++x) --x; // !++x is true if ++x wraps to zero.
Increment x and correct if it wraps to zero, using an expression:
++x; x -= !x;
This is nominally branchless (sometimes beneficial for performance), but a compiler may implement it the same as above, using a branch if needed but possibly with unconditional instructions if the target architecture has suitable instructions.
A branchless option, using the above macro, is:
x += 1 - x/Maximum(x);
If x is the maximum of its type, this evaluates to x += 1-1. Otherwise, it is x += 1-0. However, division is somewhat slow on many architectures. A compiler may optimize this to instructions without division, depending on the compiler and the target architecture.
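Putting the options together in one compilable sketch (the variable names are illustrative; Maximum is the macro from above):

#include <limits.h>
#include <stdint.h>
#include <stdio.h>

/* Largest value representable in the (unsigned) type of x, as defined above. */
#define Maximum(x) (((uintmax_t) 2u << sizeof (x) * CHAR_BIT - 1) - 1)

int main(void)
{
    uint8_t  a = 254;
    uint16_t b = 0xFFFF;

    a += a < Maximum(a);   /* 254 -> 255 */
    a += a < Maximum(a);   /* stays 255 (clamped) */

    if (!++b) --b;         /* wrapped to 0, corrected back to 0xFFFF */

    printf("a = %u, b = %u\n", (unsigned)a, (unsigned)b); /* a = 255, b = 65535 */
    return 0;
}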
Before stdint.h, variable sizes could vary from compiler to compiler: the actual variable types in C are still int, long, etc., and their sizes are still defined by the compiler author, not by some standard nor by target-specific assumptions. The author(s) then need to create stdint.h to map the two worlds; that is the purpose of stdint.h, to map the uintN_t types onto int, long, short.
If you are porting code from another compiler and it uses char, short, int, long, then you have to go through each type and do the port yourself; there is no way around it. Either you end up with the right size for the variable already, or the declaration changes but the code as written still works...
if(~counter) counter++;
or...supply the mask or typecast directly
if((~counter)&0xFF) counter++;
if((uint8_t)(~counter)) counter++;
At the end of the day, if you want this code to work you have to port it to the new platform. Your choice as to how. Yes, you have to spend the time to hit each case and do it right; otherwise you are going to keep coming back to this code, which is even more expensive.
Before porting, work out which variable types the code uses and what size they are, then isolate the variables that rely on this idiom (should be easy to grep) and change their declarations to stdint.h types, which hopefully won't change in the future. You would be surprised how often the wrong headers are used, so even put checks in so you can sleep better at night:
if(sizeof(uint8_t)!=1) return(FAIL);
And while that style of coding works (if(~counter) counter++;), for portability now and in the future it is best to use a mask to specifically limit the size (and not rely on the declaration). Do this when the code is written in the first place, or just finish the port, and then you won't have to re-port it again some other day. Or, to make the code more readable, write if (x < 0xFF) or (x != 0xFF) or something like that; the compiler can optimize it into the same code as any of these solutions, it just makes it more readable and less risky...
Whether you look for a quick solution or just touch the affected lines of code depends on how important the product is, and on how many times you want to send out patches/updates or roll a truck or walk to the lab to fix the thing. If it is only a hundred lines or fewer, that is not that big of a port.
6.5.3.3 Unary arithmetic operators
...
4 The result of the ~ operator is the bitwise complement of its (promoted) operand (that is,
each bit in the result is set if and only if the corresponding bit in the converted operand is
not set). The integer promotions are performed on the operand, and the result has the
promoted type. If the promoted type is an unsigned type, the expression ~E is equivalent
to the maximum value representable in that type minus E.
C 2011 Online Draft
The issue is that the operand of ~ is being promoted to int before the operator is applied.
Unfortunately, I don't think there's an easy way out of this. Writing
if ( counter + 1 ) counter++;
won't help because promotions apply there as well. The only thing I can suggest is creating some symbolic constants for the maximum value you want that object to represent and testing against that:
#define MAX_COUNTER 255
...
if ( counter < MAX_COUNTER ) counter++;

Why does (int) float == float.truncate instead of garbage (How does casting actually work?)

Going on understanding of these datatypes as primitives
(int) char and (char) int are interpretations of data. (int) c gives the integer value of that character, and (char) 14 gives you back the character encoded by 14.
I've always understood this as being a "memory parse", such that it just takes the value at that position and then applies a type filter to it.
Given that floating points are stored as some version of scientific notation, what is stored in memory should be garbage as an integer. Looking into this utility http://www.h-schmidt.net/FloatConverter/IEEE754.html it appears that the whole number portion is separated.
However, since this is in the higher portion of memory, how does the int cast know to "reformat"? Does the compiler identify that it was a float and apply special handling, or what's going on?
Your understanding of casts is completely wrong. Casts are nothing but explicit requests for a value conversion from one type to another. They do not reinterpret the representation of one type as if it had a different type. The source code:
float f = 42.5;
int x;
x = (int)f;
simply instructs the compiler to produce code that truncates the floating point value of the expression f to an integer and store the result in the object x.
I've always understood this as being a "memory parse", such that it just takes the value at that position and then applies a type filter to it.
That is an incorrect understanding.
The language specifies conversions between the fundamental arithmetic types. Look up "Usual Arithmetic Conversions" on the web; you will find a lot of links that describe them. For converting a floating-point type to an integral type, this is what the C99 Standard has to say:
6.3.1.4 Real floating and integer
1 When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.
float f = 4.5;
int i = (int)f; // i is 4
f = -6.3;
i = (int)f; // i is -6
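A sketch contrasting value conversion with raw reinterpretation of the same bytes; the memcpy trick below is only for illustration and assumes a 32-bit IEEE 754 float:

#include <stdio.h>
#include <string.h>
#include <stdint.h>

int main(void)
{
    float f = 42.5f;

    int converted = (int)f;        /* value conversion: truncates toward zero -> 42 */

    uint32_t raw;                  /* reinterpretation of the same 4 bytes */
    memcpy(&raw, &f, sizeof raw);  /* assumes float is 32 bits wide */

    printf("converted = %d\n", converted);  /* 42 */
    printf("raw bits  = 0x%08X\n", raw);    /* 0x422A0000 for IEEE 754 single */
    return 0;
}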

Initializing bit-fields

When you write
struct {
    unsigned a:3, b:2;
} x = {10, 11};
is x.b guaranteed to be 3 by ANSI C (C89)? I have read and reread the standard, but can't seem to find exactly that case.
For example, "result that cannot be represented by the
resulting unsigned integer type is reduced modulo the number that is
one greater than the largest value that can be represented by the
resulting unsigned integer type." speaks about computation, not about initialization. And moreover, bit-field is not really a type.
Also, (when speaking about unsigned t:4) "contains values in the range [0,15]", but it doesn't necessarily mean that initializer must be reduced modulo 16 to be mapped to [0,15].
Struct initialization is described in painstaking detail, but I really can't seem to find exactly that behavior. (Of course compilers do exactly that. And IBM documentation says "when you assign a value that is out of range to a bit field, the low-order bit pattern is preserved and the appropriate bits are assigned.", but I'd like to know if ANSI C standardizes that.)
"ANSI C"/C89 has been obsolete for 25 years. Therefore, my answer cites the current C standard ISO 9899:2011, also known as C11.
Pretty much everything related to bit-fields in the C standard is poorly defined. Typically, you will not find anything explicitly addressing the behavior of bit fields, but their behavior is rather specified implicitly, "between the lines". This is why you should avoid using bit fields.
However, I believe that this specific case is well-defined: it should work like any other integer initialization.
The detailed struct initialization rules you mention (6.7.9) show how the literal 11 in the initializer list is related to the variable b. Nothing strange with that. What then applies is "simple assignment", the same thing that would happen as if you wrote x.b = 11;.
When doing any kind of assignment or initialization in C, the right operand is converted to the type of the left operand. This is specified by C11 6.5.16:
In simple assignment (=), the value of the right operand is converted
to the type of the assignment expression and replaces the value stored
in the object designated by the left operand.
In your case, the literal 11 of type int is converted to a bit field of unsigned int:2.
Therefore, the rule you are looking for should be found in the chapter dealing with conversions (C11 6.3). What applies is what you already cited in your question, C11 6.3.1.3:
...if the new type is unsigned, the value is converted by repeatedly
adding or subtracting one more than the maximum value that can be
represented in the new type until the value is in the range of the new
type.
The maximum value of an unsigned int:2 is 3. One more than the maximum value is 3+1=4. The compiler should repeatedly subtract this from the value 11:
11 - (3+1) = 7 does not fit, subtract once more:
7 - (3+1) = 3 does fit, store value 3
But then of course, this is the very same thing as taking the 2 least significant bits of the decimal value 11 and storing them in the bit field.
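For illustration, a minimal sketch of the initialization under discussion; the printed values follow the conversion described above, which is what common compilers such as GCC produce:

#include <stdio.h>

int main(void)
{
    struct {
        unsigned a : 3, b : 2;
    } x = {10, 11};

    /* 10 mod 8 = 2 for the 3-bit field, 11 mod 4 = 3 for the 2-bit field */
    printf("x.a = %u, x.b = %u\n", (unsigned)x.a, (unsigned)x.b); /* x.a = 2, x.b = 3 */
    return 0;
}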
WRT "speaks about computation, not about initialization", the C89 standard explicitly applies the rules of assignment and conversion to initialization. It also says:
A bit-field is interpreted as an integral type consisting of the specified number of bits.
Given those, while a compiler warning would clearly be in order, it seems that throwing away upper-order bits is guaranteed by the standard.

Data types in C

A long double is known to use 80 bits.
2^80 = 1208925819614629174706176;
Why, when declaring a variable such as:
long double a = 1208925819614629174706175; // 2^80 - 1
I get a warning saying: Integer constant is too large for its type.
1208925819614629174706175 is an integer literal, not a double. Your program would happily convert it, but it would have to be a valid integer first. Instead, use a long double literal: 1208925819614629174706175.0L.
Firstly, it is not known how many bits a long double type is using. It depends on the implementation.
Secondly, just because some floating-point type uses some specific number of bits it does not mean that this type can precisely represent an integer value using all these bits (if that's what you want). Floating-point types are called floating-point types because they represent non-integer values, which normally implies a non-trivial internal representation. Due to specifics of that representation, only a portion of these bits can be used for the actual digits of the number. This means that your 2^80 - 1 number will get truncated/rounded in one way or another. So, regardless of how you do it, don't be surprised if the compiler warns you about the data loss.
Thirdly, as other answers have already noted, the constant you are using in the text of your program is an integral constant. The limitations imposed on that constant have nothing to do with floating-point types at all. Use a floating-point constant instead of an integral one.
The value 1208925819614629174706175 is first created as an integer constant (and it is too large for any integer type, hence the warning), and then converted to a long double when the assignment happens.
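A short sketch of the fix; note that even as a long double the value may be rounded, since the number of significand bits is implementation-defined (64 for the common x87 80-bit format):

#include <stdio.h>

int main(void)
{
    /* Floating-point constant (note the .0L suffix), so no integer-constant warning. */
    long double a = 1208925819614629174706175.0L;  /* 2^80 - 1 */

    printf("%.0Lf\n", a); /* may print 1208925819614629174706176 due to rounding */
    return 0;
}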
