c standard and bitshifts - c

This question was first inspired by the (unexpected) results of this code:
uint16_t t16 = 0;
uint8_t t8 = 0x80;
uint8_t t8_res;
t16 = (t8 << 1);
t8_res = (t8 << 1);
printf("t16: %x\n", t16); // Expect 0, get 0x100
printf(" t8: %x\n", t8_res); // Expect 0, get 0
But it turns out this makes sense:
6.5.7 Bitwise shift operators
Constraints
2 Each of the operands shall have integer type
Thus the originally confused line is equivalent to:
t16 = (uint16_t) (((int) t8) << 1);
A little non-intuitive IMHO, but at least well-defined.
Ok, great, but then we do:
{
uint64_t t64 = 1;
t64 <<= 31;
printf("t64: %lx\n", t64); // Expect 0x80000000, get 0x80000000
t64 <<= 31;
printf("t64: %lx\n", t64); // Expect 0x0, get 0x4000000000000000
}
// edit: following the same literal argument as above, the following should be equivalent:
t64 = (uint64_t) (((int) t64) << 31);
// hence my confusion / expectation [end_edit]
Now, we get the intuitive result, but not what would be derived from my (literal) reading of the standard. When / how does this "further automatic type promotion" take place? Or is there a limitation elsewhere that a type can never be demoted (that would make sense?), in that case, how do the promotion rules apply for:
uint32_t << uint64_t
Since the standard does say both arguments are promoted to int; should both arguments be promoted to the same type here?
// edit:
More specifically, what should be the result of:
uint32_t t32 = 1;
uint64_t t64_one = 1;
uint64_t t64_res;
t64_res = t32 << t64_one;
// end edit
The answer to the above question is resolved when we recognize that the spec does not demand a promotion to int specifically, rather to an integer type, which uint64_t qualifies as.
// CLARIFICATION EDIT:
Ok, but now I am confused again. Specifically, if uint8_t is an integer type, then why is it being promoted to int at all? It does not seem to be related to the constant int 1, as the following exercise demonstrates:
{
uint16_t t16 = 0;
uint8_t t8 = 0x80;
uint8_t t8_one = 1;
uint8_t t8_res;
t16 = (t8 << t8_one);
t8_res = (t8 << t8_one);
printf("t16: %x\n", t16);
printf(" t8: %x\n", t8_res);
}
t16: 100
t8: 0
Why is the (t8 << t8_one) expression being promoted if uint8_t is an integer type?
--
For reference, I'm working from ISO/IEC 9899:TC2, WG14/N1124 May 6, 2005. If that's out of date and someone could also provide a link to a more recent copy, that'd be appreciated as well.

I think the source of your confusion might be that the following two statements are not equivalent:
Each of the operands shall have integer type
Each of the operands shall have int type
uint64_t is an integer type.

The constraint in §6.5.7 that "Each of the operands shall have integer type." is a constraint that means you cannot use the bitwise shift operators on non-integer types like floating point values or pointers. It does not cause the effect you are noting.
The part that does cause the effect is in the next paragraph:
3. The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand.
The integer promotions are described in §6.3.1.1:
2. The following may be used in an expression wherever an int
or unsigned int may be used:
An object or expression with an integer type whose integer conversion rank is less than or equal to the rank of int and
unsigned int.
A bit-field of type _Bool, int, signed int, or unsigned int.
If an int can represent all values of the original type, the value
is converted to an int; otherwise, it is converted to an unsigned
int. These are called the integer promotions. All other types are
unchanged by the integer promotions.
uint8_t has a lesser rank than int, so the value is converted to an int (since we know that an int must be able to represent all the values of uint8_t, given the requirements on the ranges of those two types).
The ranking rules are complex, but they guarantee that a type with a higher rank cannot have a lesser precision. This means, in effect, that types cannot be "demoted" to a type with lesser precision by the integer promotions (it is possible for uint64_t to be promoted to int or unsigned int, but only if the range of the type is at least that of uint64_t).
In the case of uint32_t << uint64_t, the rule that kicks in is "The type of the result is that of the promoted left operand". So we have a few possibilities:
If int is at least 33 bits, then uint32_t will be promoted to int and the result will be int;
If int is less than 33 bits and unsigned int is at least 32 bits, then uint32_t will be promoted to unsigned int and the result will be unsigned int;
If unsigned int is less than 32 bits then uint32_t will be unchanged and the result will be uint32_t.
On today's common desktop and server implementations, int and unsigned int are usually 32 bits, and so the second possibility will occur (uint32_t is promoted to unsigned int). In the past it was common for int / unsigned int to be 16 bits, and the third possibility would occur (uint32_t left unpromoted).
The result of your example:
uint32_t t32 = 1;
uint64_t t64_one = 1;
uint64_t t64_res;
t64_res = t32 << t64_one;
will be the value 2 stored into t64_res. Note, though, that this result is not affected by the fact that the expression does not have type uint64_t. An example of an expression that would be affected is:
uint32_t t32 = 0xFF000;
uint64_t t64_shift = 16;
uint64_t t64_res;
t64_res = t32 << t64_shift;
The result here is 0xf0000000.
Note that although the details are fairly intricate, you can boil it all down to a fairly simple rule that you should keep in mind:
In C, arithmetic is never done in types narrower than int /
unsigned int.

You found the wrong rule in the standard :( The relevant rule is something like "the usual integer type promotions apply". This is what hits you in the first example. If an integer type like uint8_t has a rank that is smaller than that of int, it is promoted to int. uint64_t does not have a rank smaller than int or unsigned, so no promotion is performed and the << operator is applied to the uint64_t variable.
Edit: All integer types smaller than int are promoted for arithmetic. This is just a fact of life :) Whether or not uint32_t is promoted depends on the platform, because it might have the same rank as int or a higher one (not promoted) or a smaller rank (promoted).
Concerning the << operator, the type of the right operand is not really important; what counts for the number of bits is the left one (with the above rules). More important for the right one is its value. It mustn't be negative, and it mustn't be greater than or equal to the width of the (promoted) left operand.

Related

C uses different data type for arithmetic in the middle of an expression?

In Go (the language I'm most familiar with), the result of a mathematical operation is always the same data type as the operands, meaning if the operation overflows, the result will be incorrect. For example:
func main() {
var a byte = 100
var b byte = 9
var r byte = (a << b) >> b
fmt.Println(r)
}
This prints 0, as all the bits are shifted out of the bounds of a byte during the initial << 9 operation, then zeroes are shifted back in during the >> 9 operation.
However, this isn't the case in C:
#include <stdio.h>
int main(void) {
unsigned char a = 100;
unsigned char b = 9;
unsigned char r = (a << b) >> b;
printf("%d\n", r);
return 0;
}
This code prints 100. Although this yields the "correct" result, this is unexpected to me, as I'd only expect promotion if one of the operands were larger than a byte, but in this case all operands are bytes. It's as though the temporary variable holding the result of the << 9 operation is larger than the resulting variable, and is only downcast back to a byte after the full RHS is evaluated, and thus after the >> 9 operation restores the bits.
Obviously, if explicitly storing the result of the << 9 into a byte before continuing, you get the same result as in Go:
#include <stdio.h>
int main(void) {
unsigned char a = 100;
unsigned char b = 9;
unsigned char c = a << b;
unsigned char r = c >> b;
printf("%d\n", r);
return 0;
}
This isn't merely the case with bitwise operators. I've tested with multiplication/division too, and it demonstrates the same behaviour.
My question is: is this behaviour of C defined? If so, where? Does it actually use a specific data type for the interim values of a complex expression? Or is this actually undefined behaviour, like an incidental result of the operations being performed in a 32/64 bit CPU register before being saved back to memory?
C 2018 6.5.7 discusses the shift operators. Paragraph 3 says:
The integer promotions are performed on each of the operands…
6.3.1.1 2 specifies the integer promotions:
… If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.
Thus in a << b where a and b are unsigned char, a is promoted to int, which is at least 16 bits. (A C implementation may define unsigned char to be more than eight bits. It could be the same width as int. In this case, the integer promotions would not convert a or b.)
Note that if the integer promotions were not applied, the behavior of evaluating a << b with b equal to 9 would not be defined by the C standard, as the behavior of the shift operators is not defined for shift amounts greater than or equal to the width of the (promoted) left operand.
6.5.5 specifies the multiplicative operators. Paragraph 3 says:
The usual arithmetic conversions are performed on the operands.
6.3.1.8 specifies the usual arithmetic conversions:
… First, if the corresponding real type of either operand is long double, the other operand is converted, without change of type domain [complex or real], to a type whose corresponding real type is long double.
Otherwise, if the corresponding real type of either operand is double, the other operand is converted, without change of type domain, to a type whose corresponding real type is double.
Otherwise, if the corresponding real type of either operand is float, the other operand is converted, without change of type domain, to a type whose corresponding real type is float.
Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
Rank has a technical definition that largely corresponds to width (number of bits in an integer type).
Thus, in a * b where a and b are unsigned char, they are both promoted to int (with the caveat above about wide unsigned char) and no further conversions are necessary. If one operand were wider than int, say long long int, while the other is unsigned char then both operands would be converted to that wider type.
Welcome to integer promotions! One behavior of the C language (an often criticized one, I'd add) is that types like char and short are promoted to int before doing any arithmetic operation with them, and the result is also int. What does this mean?
#include <stdio.h>
unsigned char foo(unsigned char x) {
return (x << 4) >> 4;
}
int main(void) {
if (foo(0xFF) == 0x0F) {
printf("Yay!\n");
}
else {
printf("... hey, wait a minute!\n");
}
return 0;
}
Needless to say, the above code prints ... hey, wait a minute!. Let's discover why:
// this line of code:
return (x << 4) >> 4;
// is converted to this (because of integer promotion):
return ((int) x << 4) >> 4;
Therefore, this is what happens:
x is unsigned char (8-bit) and its value is 0xFF,
x << 4 needs to be executed, but first x is converted to int (32-bit),
x << 4 becomes 0x000000FF << 4, and the result 0x00000FF0 is also int,
0x00000FF0 >> 4 is executed, yielding 0x000000FF,
finally, 0x000000FF is converted to unsigned char (because that's the return value of foo()), so it becomes 0xFF,
and that's why foo(0xFF) yields 0xFF instead of 0x0F.
How to prevent this? Simple: convert the result of x << 4 to unsigned char. In the previous example, 0x00000FF0 would have become 0xF0.
unsigned char foo(unsigned char x) {
return ((unsigned char) (x << 4)) >> 4;
}
foo(0xFF) == 0x0F
NOTE: in the previous examples, it is assumed that unsigned char is 8 bits and int is 32 bits, but the examples work for basically any situation in which CHAR_BIT == 8 (because C17 requires that sizeof(int) * CHAR_BIT >= 16).
P.S.: this answer is not as exhaustive as the C official standard document, of course. But you can find all the (valid and defined) behavior of C described in the latest draft of the ISO/IEC 9899:2018 standard (a.k.a. C17/C18).

Integer division in C with unsigned short

I feel like the bloodiest beginner - Why does the following not work:
// declarations
unsigned short currentAddr= 0x0000;
unsigned short addr[20] = {1, 0};
// main
addr[1] = (~currentAddr)/2+1;
printf("addr[1] wert: %hu\n", addr[1]); // equals 1, expected 0x8000
addr[1] = ~currentAddr>>1;
printf("addr[1] wert: %hu\n", addr[1]); // equals 65535, expected 0x7FFF
In printf and also in my debugger's watchlist the value for addr[1] is not as expected. My aim is to have half the maximum of the variable, here 0x8000.
Info: I am doing ~currentAddr to get the max. 0xFFFF in case short is in a different length on my embedded platform than here on my PC.
cheers, Stefan
What went wrong
The integer promotions are performed on the operand of the unary ~.
On many systems int is larger than short. On such systems, for unsigned short currentAddr = 0, the value of currentAddr is first promoted to int in the expression ~currentAddr. Then ~currentAddr evaluates to -1 (assuming two's complement representation).
On some systems int and short may be the same size (though int must be at least as large as short); here currentAddr would instead be promoted to unsigned int since an int cannot hold all values of an unsigned integer type of the same size. In such a case, ~currentAddr would evaluate to UINT_MAX. For 16-bit int (short must be at least 16-bit, so here int and short would be the same size) the result of ~currentAddr would be 65,535.
The OP's system must have int larger than short. In the case of addr[1] = (~currentAddr)/2+1; this becomes addr[1] = (-1)/2+1; which evaluates to 1.
In the second case, addr[1] = ~currentAddr>>1; evaluates to addr[1] = (-1)>>1;. Here, the result of right-shifting a negative value is implementation-defined. In the present case, the result is either -1 (arithmetic shift, which most implementations use) or INT_MAX (logical shift); either way, it is converted to unsigned short in the assignment to addr[1], which takes the value USHRT_MAX in the conversion. This value is 65,535 on OP's system.
What to do about it
To obtain maximum and minimum values for standard integer types clearly and reliably, use the macros found in limits.h instead of attempting bit manipulations. This method will not disappoint:
#include <stdio.h>
#include <limits.h>
int main(void)
{
unsigned short val;
val = (USHRT_MAX / 2) + 1;
printf("(USHRT_MAX / 2) + 1: %#hx\n", val);
val = USHRT_MAX >> 1;
printf(" USHRT_MAX >> 1: %#hx\n", val);
return 0;
}
Program output:
(USHRT_MAX / 2) + 1: 0x8000
USHRT_MAX >> 1: 0x7fff
The problem lies here:
addr[1] = (~currentAddr)/2+1;
You expect ~currentAddr to be 0xFFFF, which is partially right. What you might have missed is the integer promotion rule: currentAddr is promoted to int first, so (with a 32-bit int) the result is 0xFFFFFFFF, which is the hexadecimal representation of -1.
Now, it is simple math:
(~currentAddr)/2+1 is (-1)/2+1, which is nothing but 0x01 or 1. When you do the shift ~currentAddr>>1, the result is again -1 on your platform.
From
My aim is to have half the maximum of the variable, here 0x8000
If I understand you correctly, what you are trying to do is get the value which is equal to (Maximum value of Unsigned short)/2. If it is so, the proper way of doing it will be using USHRT_MAX. Of course, you'll need to include limits.h file in your source code.
Update:
Referring to your comments on David's answer, the following changes work as expected. (You have tested this; I haven't.)
unsigned short c;
c = ~currentAddr;
unsigned short c_z = sizeof (c);
unsigned short ci;
ci = (c >> 1) + 1;
unsigned short ci_z = sizeof (ci);
addr[1] = ci;
Now, why this isn't promoted to integer as opposed to previous case
c = ~currentAddr;
It is promoted, but it yields the expected result because, as chux explained (which I couldn't have done), it is (temporarily) promoted to int during the operation, but converted to an unsigned short again when it is stored in the memory allocated to c.
The C standard answers the question:
From the C99 standard: 6.5.16.1 Simple assignment
In simple assignment (=), the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand.
In your case the right operand has type int after the integer promotions, so its value is converted to unsigned short, the type of the left operand.
Also, it says:
The type of an assignment expression is the type the left operand would have after lvalue conversion.
The same is specified by C11 6.5.16.1/2:
In simple assignment (=), the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand.
Try this yourself:
#include <stdio.h>
int main(void)
{
unsigned short c;
unsigned short currentAddr= 0x0000;
c = ~currentAddr;
printf("\n0x%x", c);
printf("\n0x%x", (~currentAddr));
return 0;
}
This should print:
0xffff
0xffffffff
addr[1] = (~currentAddr)/2+1;
Let us break it down: currentAddr is an unsigned short involved in a computation so the value/type is first promoted to int or unsigned. In C this is integer promotion.
If an int can represent all values of the original type ..., the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions. C11dr §6.3.1.1 2
When USHRT_MAX <= INT_MAX, (e.g. 16 bit short int/unsigned, 32-bit int/unsigned), code is like below. With currentAddr == 0 and typical 2's complement behavior, ~0 --> -1 and addr[1] --> 1.
int tmp = currentAddr;
addr[1] = (~tmp)/2+1;
When USHRT_MAX > INT_MAX, (e.g. 16 bit short int/unsigned, 16-bit int/unsigned), code is like below. With currentAddr == 0 and unsigned behavior, ~0 --> 0xFFFF and addr[1] --> 0x8000.
unsigned tmp = currentAddr;
addr[1] = (~tmp)/2+1;
My aim is to have half the maximum of the variable
The best way to get the maximum of an unsigned short is to use USHRT_MAX and skip the ~ code. It will work as expected regardless of the ranges of unsigned short, int, and unsigned. It also documents the intent of the code better.
#include <limits.h>
addr[1] = USHRT_MAX/2+1;
Because the number 2 is an int and int can hold every unsigned short value, the actual operation is addr[1] = (unsigned short)(((~(int)currentAddr)/2)+1)

kbuild C: ~ Operator Converts Unsigned to Signed? [duplicate]

Let say I have a 32-bit machine.
I know during integer promotion the expressions are converted to:
int if all values of the original type can be represented in int
unsigned otherwise
Could you please explain what will happen for the following expressions? And, in general, how does ranking work here?
First snippet:
int16_t x, pt;
int32_t speed;
uint16_t length;
x = (speed*pt)/length;
Second one:
x = pt + length;
#EDIT:
I found the following link that has described the issue very clearly:
Implicit type conversion.
Concretely, read the answer of Lundin, very helpful!
The integer promotion rule, correctly cited C11 6.3.1.1:
If an int can represent all values of the original type (as restricted
by the width, for a bit-field), the value is converted to an int;
otherwise, it is converted to an unsigned int. These are called the
integer promotions. All other types are unchanged by the integer
promotions.
Where "otherwise, it is converted to an unsigned int" is in practice only used in one particular special case, namely where the smaller integer type unsigned short has the same size as unsigned int. In that case it will remain unsigned.
Apart from that special case, all small integer types will always get promoted to (signed) int regardless of their signedness.
Assuming 32 bit int, then:
x = (speed*pt)/length;
speed is signed 32, it will not get promoted. pt will get integer promoted to int (signed 32). The result of speed*pt will have type int.
length will get integer promoted to int. The division will get carried out with operands of type int and the resulting type will be int.
The result will get converted to signed 16 as it is assigned to x (conversion to the type of the left operand during assignment).
x = pt + length; is similar, here both operands of + will get promoted to int before addition and the result will afterwards get converted to signed 16.
For details see Implicit type promotion rules.
The integer promotions are defined in 6.3.1.1; the usual arithmetic conversions, which apply them first, are defined in 6.3.1.8 Usual arithmetic conversions.
1. int16_t x, pt;
int32_t speed;
uint16_t length;
x = (speed*pt)/length;
2. x = pt + length;
Rank effectively means the number of bits of the type, as defined by the CAM in limits.h. The standard requires that types of lower rank in the CAM correspond to types of lower rank in the implementation.
For your code,
speed * pt
is multiplication between int32_t and int16_t, which means, it is transformed in
speed * (int16_t => int32_t) pt
and the result tmp1 will be int32_t.
Next, it will continue
tmp1_int32 / length
Length will be converted from uint16_t to int32_t, so it will compute tmp2 so:
tmp1_int32 / (uint16_t => int32_t) length
and the result tmp2 will be of type int32_t.
Next it will evaluate an assignment expression, left side of 16 bits and the right side of 32, so it will cut the result so:
x = (int32_t => int16_t) tmp2_int32
Your second case will be evaluated as
x = (int32_t => int16_t) ( (int16_t => int32_t) pt + (uint16_t => int32_t) length )
If both operands of an operator have a rank smaller than the rank of int, the CAM allows the operation to be carried out directly on the narrower types, provided it does not overflow, and the result to be converted to int afterwards.
In other words, it is possible to convert INT16+INT16 either into
INT16+INT16
or in
(int32_t => int16_t) ((int16_t => int32_t) INT16 + (int16_t => int32_t) INT16)
provided the addition can be done without overflow.

Why are integer types promoted during addition in C?

So we had a field issue, and after days of debugging, narrowed down the problem to this particular bit of code, where the processing in a while loop wasn't happening :
// heavily redacted code
// numberA and numberB are both of uint16_t
// Important stuff happens in that while loop
while ( numberA + 1 == numberB )
{
// some processing
}
This had run fine, until we hit the uint16 limit of 65535. Another bunch of print statements later, we discovered that numberA + 1 had a value of 65536, while numberB wrapped back to 0. This failed the check and no processing was done.
This got me curious, so I put together a quick C program (compiled with GCC 4.9.2) to check this:
#include <stdio.h>
#include <stdint.h>
int main()
{
uint16_t numberA, numberB;
numberA = 65535;
numberB = numberA + 1;
uint32_t numberC, numberD;
numberC = 4294967295;
numberD = numberC + 1;
printf("numberA = %d\n", numberA + 1);
printf("numberB = %d\n", numberB);
printf("numberC = %d\n", numberC + 1);
printf("numberD = %d\n", numberD);
return 0;
}
And the result was :
numberA = 65536
numberB = 0
numberC = 0
numberD = 0
So it appears that the result of numberA + 1 was promoted to uint32_t. Is this intended by the C language ? Or is this some compiler / hardware oddity?
So it appears that the result of numberA + 1 was promoted to uint32_t
The operands of the addition were promoted to int before the addition took place, and the result of the addition is of the same type as the effective operands (int).
Indeed, if int is 32-bit wide on your compilation platform (meaning that the type that represents uint16_t has lower “conversion rank” than int), then numberA + 1 is computed as an int addition between 1 and a promoted numberA as part of the integer promotion rules, 6.3.1.1:2 in the C11 standard:
The following may be used in an expression wherever an int or unsigned int may be used: […] An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.
[…]
If an int can represent all values of the original type […], the value is converted to an int
In your case, unsigned short which is in all likelihood what uint16_t is defined as on your platform, has all its values representable as elements of int, so the unsigned short value numberA gets promoted to int when it occurs in an arithmetic operation.
For arithmetic operators such as +, the usual arithmetic conversions are applied.
For integers, the first step of those conversions is called the integer promotions, and this promotes any value of type smaller than int to be an int.
The other steps don't apply to your example so I shall omit them for conciseness.
In the expression numberA + 1, the integer promotions are applied. 1 is already an int so it remains unchanged. numberA has type uint16_t which is narrower than int on your system, so numberA gets promoted to int.
The result of adding two ints is another int, and 65535 + 1 gives 65536 since you have 32-bit ints.
So your first printf outputs this result.
In the line:
numberB = numberA + 1;
the above logic still applies to the + operator, this is equivalent to:
numberB = 65536;
Since numberB has an unsigned type, uint16_t specifically, 65536 is reduced (mod 65536) which gives 0.
Note that your last two printf statements cause undefined behaviour; you must use %u for printing unsigned int. To cope with different sizes of int, you can use "%" PRIu32 to get the format specifier for uint32_t.
When the C language was being developed, it was desirable to minimize the number of kinds of arithmetic compilers had to deal with. Thus, most math operators (e.g. addition) supported only int+int, long+long, and double+double. While the language could have been simplified by omitting int+int (promoting everything to long instead), arithmetic on long values generally takes 2-4 times as much code as arithmetic on int values; since most programs are dominated by arithmetic on int types, that would have been very costly. Promoting float to double, by contrast, will in many cases save code, because it means that only two functions are needed to support float: convert to double, and convert from double. All other floating-point arithmetic operations need only support one floating-point type, and since floating-point math is often done by calling library routines the cost of calling a routine to add two double values is often the same as the cost to call a routine to add two float values.
Unfortunately, the C language became widespread on a variety of platforms before anyone really figured out what 0xFFFF + 1 should mean, and by that time there were already some compilers where the expression yielded 65536 and some where it yielded zero. Consequently, writers of standards have endeavored to write them in a fashion that would allow compilers to keep on doing whatever they were doing, but which was rather unhelpful from the standpoint of anyone hoping to write portable code. Thus, on platforms where int is 32 bits, 0xFFFF+1 will yield 65536, and on platforms where int is 16 bits, it will yield zero. If on some platform int happened to be 17 bits, 0xFFFF+1 would authorize the compiler to negate the laws of time and causality [btw, I don't know of any 17-bit platforms, but there are some 32-bit platforms where uint16_t x=0xFFFF; uint16_t y=x*x; will cause the compiler to garble the behavior of code which precedes it].
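The literal 1 is of type int, i.e. int32 in your case, so an operation on an int32 and an int16 gives an int32 result.
To get the result of numberA + 1 as uint16_t, cast the result of the addition, e.g. (uint16_t)(numberA + 1). Casting only the literal, as in numberA + (uint16_t)1, does not help, because both operands are promoted to int again before the addition.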

How is shift operator evaluated in C?

I recently noticed a (weird) behavior when I conducted operations using shift >> <<!
To explain it, let me write this small runnable code that does two operations which are supposed to be identical(In my understanding), but I'm surprised with different results!
#include <stdio.h>
int main(void) {
unsigned char a=0x05, b=0x05;
// first operation
a = ((a<<7)>>7);
// second operation
b <<= 7;
b >>= 7;
printf("a=%X b=%X\n", a, b);
return 0;
}
When run, a = 5 and b = 1. I expect them both to be equal to 1! Can someone kindly explain why I got such a result?
P.S: In my environment the size of unsigned char is 1 byte
In the first example:
a is converted to an int, shifted left, then right and then converted back to unsigned char.
This will obviously result in a=5.
In the second example:
b is converted to int, shifted left, then converted back to unsigned char.
b is converted to int, shifted right, then converted back to unsigned char.
The difference is that you lose information in the second example during the conversion to unsigned char
Detailed explanation of the things going on between the lines:
Case a:
In the expression a = ((a<<7)>>7);, a<<7 is evaluated first.
The C standard states that each operand of the shift operators is implicitly integer promoted, meaning that if they are of types bool, char, short etc (collectively the "small integer types"), they get promoted to an int.
This is standard practice for almost every operator in C. What makes the shift operators different from other operators is that they don't use the other kind of common, implicit promotion called "balancing". Instead, the result of a shift always has the type of the promoted left operand. In this case int.
So a gets promoted to type int, still containing the value 0x05. The 7 literal was already of type int so it doesn't get promoted.
When you left shift this int by 7, you get 0x0280. The result of the operation is of type int.
Note that int is a signed type, so had you kept shifting data further, into the sign bits, you would have invoked undefined behavior. Similarly, had either the left or the right operand been a negative value, you would also invoke undefined behavior.
You now have the expression a = 0x280 >> 7;. No promotions take place for the next shift operation, since both operands are already int.
The result is 5 and of the type int. You then convert this int to an unsigned char, which is fine, since the result is small enough to fit.
Case b:
b <<= 7; is equivalent to b = b << 7;.
As before, b gets promoted to an int. The result will again be 0x0280.
You then attempt to store this result in an unsigned char. It will not fit, so it will get truncated to only contain the least significant byte 0x80.
On the next line, b again gets promoted to an int, containing 0x80.
And then you shift 0x80 by 7, getting the result 1. This is of type int, but can fit in an unsigned char, so it will fit in b.
Good advice:
Never ever use bit-wise operators on signed integer types. This doesn't make any sense in 99% of the cases but can lead to various bugs and poorly defined behavior.
When using bit-wise operators, use the types in stdint.h rather than the primitive default types in C.
When using bit-wise operators, use explicit casts to the intended type, to prevent bugs and unintended type changes, but also to make it clear that you actually understand how implicit type promotions work, and that you didn't just get the code working by accident.
A better, safer way to write your program would have been:
#include <stdio.h>
#include <stdint.h>
int main(void) {
uint8_t a=0x05;
uint8_t b=0x05;
uint32_t tmp;
// first operation
tmp = (uint32_t)a << 7;
tmp = tmp >> 7;
a = (uint8_t)tmp;
// second operation
tmp = (uint32_t)b << 7;
tmp = tmp >> 7;
b = (uint8_t)tmp;
printf("a=%X b=%X\n", a, b);
return 0;
}
The shift operators perform the integer promotions on their operands, and in your code the resulting int is converted back to unsigned char like this:
// first operation
a = ((a<<7)>>7); // a = (unsigned char)((a<<7)>>7);
// second operation
b <<= 7; // b = (unsigned char) (b << 7);
b >>= 7; // b = (unsigned char) (b >> 7);
Quote from the N1570 draft (which became the standard of C11 later):
6.5.7 Bitwise shift operators:
Each of the operands shall have integer type.
The integer promotions are performed on each of the operands. The type of the result is that of the promoted left operand. If the value of the right operand is negative or is greater than or equal to the width of the promoted left operand, the behavior is undefined.
And it's supposed that in C99 and C90 there are similar statements.
