Store signed char inside unsigned int - c

I have an unsigned int that actually stores a signed int, and the signed int ranges from -128 to 127.
I would like to store this value back in the unsigned int so that I can simply
apply a mask 0xFF and get the signed char.
How do I do the conversion ?
i.e.
unsigned int foo = -100;
foo = (char)foo;
char bar = foo & 0xFF;
assert(bar == -100);

The & 0xFF operation will produce a value in the range 0 to 255. It's not possible to get a negative number this way. So, even if you use & 0xFF somewhere, you will still need to apply a conversion later to get to the range -128 to 127.
In your code:
char bar = foo & 0xFF;
there is an implicit conversion to char. This relies on implementation-defined behaviour, but it will work on all but the most esoteric systems. The most common implementation definition is the inverse of the conversion that applies when converting char to unsigned char.
(Your previous line foo = (char)foo; should be removed).
However,
char bar = foo;
would produce exactly the same effect (again, except for on those esoteric systems).

Since the value stored in unsigned int foo stays within the range -128 to 127, the implicit conversion will work in this case. But if unsigned int foo held a larger value, the high-order bytes would be lost when storing it in a char variable, and your program would produce unexpected results.

Answering for C,
If you have an unsigned int whose value was set by assignment of a value of type char (where char happens to be a signed type) or of type signed char, where the assigned value was negative, then the stored value is the arithmetic sum of the assigned negative value and one more than UINT_MAX. This will be far beyond the range of values representable by (signed) char on any C system I have ever encountered. If you convert that value back to (signed) char, whether implicitly or via a cast, "either the result is implementation-defined, or an implementation-defined signal is raised" (C2011, 6.3.1.3/3).
Converting back to the original char value in a way that avoids implementation-defined behavior is a bit tricky (but relying on implementation-defined behavior may afford much easier approaches). Certainly, masking off all but the 8 lowest-order value bits does not do the trick, as it always gives you a positive value. Also, it assumes that char is 8 bits wide, which, in principle, is not guaranteed. It does not necessarily even give you the correct bit pattern, as C permits negative integers to be represented in any of three different ways.
Here's an approach that will work on any conforming C system:
unsigned int foo = SOME_SIGNED_CHAR_VALUE;
signed char bar;
/* ... */
if (foo <= SCHAR_MAX) {
    /* foo's value is representable as a signed char */
    bar = foo;
} else {
    /* mask off the highest-order value bits to yield a value that fits in an int */
    int foo2 = foo & INT_MAX;
    /* reverse the conversion to unsigned int, as if unsigned int had the same
       number of value bits as int; the other bits are already accounted for */
    bar = (foo2 - INT_MAX) - 1;
}
That relies only on characteristics of integer representation and conversion that C itself defines.

Don't do it.
Casting to a smaller size may truncate the value. Casting between signed and unsigned may produce a wrong value (e.g. 255 -> -1).
If you have to make calculations with different data types, pick one common type, preferably a signed long int (32-bit), and check boundaries before casting down to a smaller size.
A signed type helps you detect underflow (e.g. when a result goes below 0), and long int (or simply int, the natural word length) suits 32-bit and 64-bit machines and is big enough for most purposes.
Also try to avoid mixed types in formulas, especially when they contain division (/).

Related

Can I overflow uint32_t in a temporary result?

Basically what happens on a 32bit system when I do this:
uint32_t test (void)
{
    uint32_t myInt;
    myInt = ((0xFFFFFFFF * 0xFFFFFFFF) % 256u);
    return myInt;
}
Let's assume that int is 32 bits.
0xFFFFFFFF will have type unsigned int. There are special rules that explain this, but because it is a hexadecimal constant and it doesn't fit in int, but does fit in unsigned int, it ends up as unsigned int.
0xFFFFFFFF * 0xFFFFFFFF will first go through the usual arithmetic conversions, but since both sides are unsigned int, nothing happens. The result of the multiplication is 0xfffffffe00000001, which is reduced to unsigned int modulo 2^32, resulting in the value 1 with type unsigned int.
(unsigned int)1 % 256u is equal to 1 and has type unsigned int. Usual arithmetic conversions apply here too, but again, both operands are unsigned int so nothing happens.
The result is converted to uint32_t, but it's already unsigned int which has the same range.
However, let's instead suppose that int is 64 bits.
0xFFFFFFFF will have type int.
0xFFFFFFFF * 0xFFFFFFFF will overflow! This is undefined behavior. At this point we stop trying to figure out what the program does, because it could do anything. Maybe the compiler would decide not to emit code for this function, or something equally absurd.
This would happen in a so-called "ILP64" or "SILP64" architecture. These architectures are rare but they do exist. We can avoid these portability problems by using 0xFFFFFFFFu.
Unsigned integer "overflow" means you can try to store a value greater than the range the type can hold, but it will wrap: the stored value is reduced modulo UINT32_MAX+1. That is also what happens in this case, provided you append the U or u suffix to the integer literals. Otherwise the literals may turn out to be signed (since you didn't specify anything), and the multiplication can then overflow as signed integer overflow, which is undefined behavior.
Back to the explanation: when you multiply the two values (ensuring they are unsigned), the result wraps modulo UINT32_MAX+1, meaning that if it is bigger than what uint32_t can hold, it is reduced modulo UINT32_MAX+1. Then the modulo operation with 256u is applied, and that result is stored in the uint32_t and returned from the function.

What does (int)(unsigned char)(x) do in C?

In ctype.h, line 20, __ismask is defined as:
#define __ismask(x) (_ctype[(int)(unsigned char)(x)])
What does (int)(unsigned char)(x) do? I guess it casts x to unsigned char (to retrieve the first byte only regardless of x), but then why is it cast to an int at the end?
(unsigned char)(x) effectively computes an unsigned char with the value of x % (UCHAR_MAX + 1). This has the effect of giving a positive value (between 0 and UCHAR_MAX). With most implementations UCHAR_MAX has a value of 255 (although the standard permits an unsigned char to support a larger range, such implementations are uncommon).
Since the result of (unsigned char)(x) is guaranteed to be in the range supported by an int, the conversion to int will not change value.
Net effect is the least significant byte, with a positive value.
Some compilers give a warning when using a char (signed or not) type as an array index. The conversion to int shuts the compiler up.
The unsigned char-cast is to make sure the value is within the range 0..255; the resulting value is then used as an index in the _ctype array, which is 256 bytes large (one entry per possible unsigned char value), see ctype.h in Linux.
A cast to unsigned char safely extracts the least significant CHAR_BIT bits of x, due to the wraparound properties of an unsigned type. (A cast to char could be undefined if char is a signed type on a platform: overflowing a signed type is undefined behaviour in C.) CHAR_BIT is usually 8.
The cast to int then converts the unsigned char. The standard guarantees that an int can always hold any value that unsigned char can take.
A better alternative, if you wanted to extract the 8 least significant bits would be to apply & 0xFF and cast that result to an unsigned type.
I think char is implementation dependent, either signed or unsigned. So you need to be explicit by writing unsigned char, in order not to cast to a negative number. Then cast to int.

Is arithmetic overflow equivalent to modulo operation?

I need to do modulo 256 arithmetic in C. So can I simply do
unsigned char i;
i++;
instead of
int i;
i=(i+1)%256;
No. There is nothing that guarantees that unsigned char has eight bits. Use uint8_t from <stdint.h>, and you'll be perfectly fine. This requires an implementation which supports stdint.h: any C99 compliant compiler does, but older compilers may not provide it.
Note: unsigned arithmetic never overflows, and behaves as "modulo 2^n". Signed arithmetic overflows with undefined behavior.
Yes, the behavior of both of your examples is the same. See C99 6.2.5 §9:
A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting unsigned integer type is
reduced modulo the number that is one greater than the largest value that can be
represented by the resulting type.
unsigned char c = UCHAR_MAX;
c++;
Basically yes, there is no overflow, but not because c is of an unsigned type. There is a hidden promotion of c to int here and an integer conversion from int to unsigned char and it is perfectly defined.
For example,
signed char c = SCHAR_MAX;
c++;
is also not undefined behavior, because it is actually equivalent to:
c = (int) c + 1;
and the conversion from int to signed char is implementation-defined here (see c99, 6.3.1.3p3 on integer conversions). To simplify CHAR_BIT == 8 is assumed.
For more information on the example above, I suggest to read this post:
"The Little C Function From Hell"
http://blog.regehr.org/archives/482
Very probably yes, but the reasons for it in this case are actually fairly complicated.
unsigned char i = 255;
i++;
The i++ is equivalent to i = i + 1.
(Well, almost. i++ yields the value of i before it was incremented, so it's really equivalent to (tmp=i; i = i + 1; tmp). But since the result is discarded in this case, that doesn't raise any additional issues.)
Since unsigned char is a narrow type, an unsigned char operand to the + operator is promoted to int (assuming int can hold all possible values in the range of unsigned char). So if i == 255, and UCHAR_MAX == 255, then the result of the addition is 256, and is of type (signed) int.
The assignment implicitly converts the value 256 from int back to unsigned char. Conversion to an unsigned type is well defined; the result is reduced modulo MAX+1, where MAX is the maximum value of the target unsigned type.
If i were declared as an unsigned int:
unsigned int i = UINT_MAX;
i++;
there would be no type conversion, but the semantics of the + operator for unsigned types also specify reduction modulo MAX+1.
Keep in mind that the value assigned to i is mathematically equivalent to (i+1) % (UCHAR_MAX+1). UCHAR_MAX is usually 255, and is guaranteed to be at least 255, but it can legally be bigger.
There could be an exotic system on which UCHAR_MAX is too big to be stored in a signed int object. This would require UCHAR_MAX > INT_MAX, which means the system would have to have at least 16-bit bytes. On such a system, the promotion would be from unsigned char to unsigned int. The final result would be the same. You're not likely to encounter such a system. I think there are C implementations for some DSPs that have bytes bigger than 8 bits. The number of bits in a byte is specified by CHAR_BIT, defined in <limits.h>.
CHAR_BIT > 8 does not necessarily imply UCHAR_MAX > INT_MAX. For example, you could have CHAR_BIT == 16 and sizeof (int) == 2 (i.e., 16-bit bytes and 32-bit ints).
There's another alternative that hasn't been mentioned, if you don't want to use another data type.
unsigned int i;
// ...
i = (i+1) & 0xFF; // 0xFF == 255
This works because the modulo element == 2^n, meaning the range will be [0, 2^n-1] and thus a bitmask will easily keep the value within your desired range. It's possible this method would not be much or any less efficient than the unsigned char/uint8_t version, either, depending on what magic your compiler does behind the scenes and how the targeted system handles non-word loads (for example, some RISC architectures require additional operations to load non-word-size values). This also assumes that your compiler won't detect the usage of power-of-two modulo arithmetic on unsigned values and substitute a bitmask for you, of course, as in cases like that the modulo usage would have greater semantic value (though using that as the basis for your decision is not exactly portable, of course).
An advantage of this method is that you can use it for powers of two that are not also the size of a data type, e.g.
i = (i+1) & 0x1FF; // i %= 512
i = (i+1) & 0x3FF; // i %= 1024
// etc.
This should work fine because it should just overflow back to 0. As was pointed out in a comment on a different answer, you should only do this when the value is unsigned, as you may get undefined behavior with a signed value.
It is probably best to leave this using modulo, however, because the code will be better understood by other people maintaining the code, and a smart compiler may be doing this optimization anyway, which may make it pointless in the first place. Besides, the performance difference will probably be so small that it wouldn't matter in the first place.
It will work if the number of bits used to represent the number equals the number of bits in the binary (unsigned) representation of the divisor (100000000) minus 1, which in this case is 9 - 1 = 8 (a char).

Regarding type safety when storing an unsigned char value in char variable

I have a char array holding several characters. I want to compare one of these characters with an unsigned char variable. For example:
char myarr[] = { 20, 14, 5, 6, 42 };
const unsigned char foobar = 133;
myarr[2] = foobar;
if (myarr[2] == foobar) {
    printf("You win a shmoo!\n");
}
Is this comparison type safe?
I know from the C99 standard that char, signed char, and unsigned char are three different types (section 6.2.5 paragraph 14).
Nevertheless, can I safely convert between unsigned char and char, and back, without losing precision and without risking undefined (or implementation-defined) behavior?
In section 6.2.5 paragraph 15:
The implementation shall define char to have the same range,
representation, and behavior as either signed char or unsigned char.
In section 6.3.1.3 paragraph 3:
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
I'm afraid that if char is defined as a signed char, then myarr[2] = foobar could result in an implementation-defined value that will not be converted correctly back to the original unsigned char value; for example, an implementation may always result in the value 42 regardless of the unsigned value involved.
Does this mean that it is not safe to store an unsigned value in a signed variable of the same type?
Also what is an implementation-defined signal; does this mean an implementation could simply end the program in this case?
In section 6.3.1.1 paragraph 1:
-- The rank of long long int shall be greater than the rank of long int, which shall be greater than the rank of int, which shall be greater than the rank of short int, which shall be greater than the rank of signed char.
-- The rank of any unsigned integer type shall equal the rank of the corresponding
signed integer type, if any.
In section 6.2.5 paragraph 8:
For any two integer types with the same signedness and different integer conversion rank
(see 6.3.1.1), the range of values of the type with smaller integer conversion rank is a
subrange of the values of the other type.
In section 6.3.1 paragraph 2:
If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int.
In section 6.3.1.8 paragraph 1:
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type.
The range of char is guaranteed to be the same range as that of signed char or unsigned char, which are both subranges of int and unsigned int respectively as a result of their smaller integer conversion rank.
Since integer promotion rules dictate that char, signed char, and unsigned char be promoted to at least int before being evaluated, does this mean that char could maintain its "signedness" throughout the comparison?
For example:
signed char foo = -1;
unsigned char bar = 255;
if (foo == bar) {
    printf("same\n");
}
Does foo == bar evaluate to a false value, even if -1 is equivalent to 255 when an explicit (unsigned char) cast is used?
UPDATE:
In section J.3.5 paragraph 1 regarding which cases result in implementation-defined values and behavior:
-- The result of, or the signal raised by, converting an integer to a signed integer type
when the value cannot be represented in an object of that type (6.3.1.3).
Does this mean that not even an explicit conversion is safe?
For example, could the following code result in implementation-defined behavior since char could be defined as a signed integer type:
char blah = (char)255;
My original post is rather broad and consists of many specific questions of which I should have given each its own page. However, I address and answer each question here so future visitors can grok the answers more easily.
Answer 1
Question:
Is this comparison type safe?
The comparison between myarr[2] and foobar in this particular case is safe since both variables hold unsigned values. In general, however, this is not true.
For example, suppose an implementation defines char to have the same behavior as signed char, and int is able to represent all values representable by unsigned char and signed char.
char foo = -25;
unsigned char bar = foo;
if (foo == bar) {
    printf("This line of text will not be printed.\n");
}
Although bar is set equal to foo, and the C99 standard guarantees that there is no loss of precision when converting from signed char to unsigned char (see Answer 2), the foo == bar conditional expression will evaluate false.
This is due to the nature of integer promotion as required by section 6.3.1 paragraph 2 of the C99 standard:
If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int.
Since in this implementation int can represent all values of both signed char and unsigned char, the values of both foo and bar are converted to type int before being evaluated. Thus the resulting conditional expression is -25 == 231 which evaluates to false.
Answer 2
Question:
Nevertheless, can I safely convert between unsigned char and char, and back, without losing precision and without risking undefined (or implementation-defined) behavior?
You can safely convert from char to unsigned char without losing precision (nor width nor information), but converting in the other direction -- unsigned char to char -- can lead to implementation-defined behavior.
The C99 standard makes certain guarantees which enable us to convert safely from char to unsigned char.
In section 6.2.5 paragraph 15:
The implementation shall define char to have the same range,
representation, and behavior as either signed char or unsigned char.
Here, we are guaranteed that char will have the same range, representation, and behavior as signed char or unsigned char. If the implementation chooses the unsigned char option, then the conversion from char to unsigned char is essentially that of unsigned char to unsigned char -- thus no width nor information is lost and there are no issues.
The conversion for the signed char option is not as intuitive, but is implicitly guaranteed to preserve precision.
In section 6.2.5 paragraph 6:
For each of the signed integer types, there is a corresponding (but different) unsigned
integer type (designated with the keyword unsigned) that uses the same amount of
storage (including sign information) and has the same alignment requirements.
In 6.2.6.1 paragraph 3:
Values stored in unsigned bit-fields and objects of type unsigned char shall be
represented using a pure binary notation.
In section 6.2.6.2 paragraph 2:
For signed integer types, the bits of the object representation shall be divided into three
groups: value bits, padding bits, and the sign bit. There need not be any padding bits; there shall be exactly one sign bit. Each bit that is a value bit shall have the same value as
the same bit in the object representation of the corresponding unsigned type (if there are
M value bits in the signed type and N in the unsigned type, then M <= N).
First, signed char is guaranteed to occupy the same amount of storage as an unsigned char, as are all signed integers in respect to their unsigned counterparts.
Second, unsigned char is guaranteed to have a pure binary representation (i.e. no padding bits and no sign bit).
signed char is required to have exactly one sign bit, and no more than the same number of value bits as unsigned char.
Given these three facts, we can prove via the pigeonhole principle that the signed char type has at most one fewer value bit than the unsigned char type, and that signed char can safely be converted to unsigned char with not only no loss of precision, but no loss of width or information as well:
unsigned char has storage size of N bits.
signed char must have the same storage size of N bits.
unsigned char has no padding or sign bits and therefore has N value bits
signed char can have at most N non-padding bits, and must allocate exactly one bit as the sign bit.
signed char can have at most N-1 value bits and exactly one sign bit
All signed char bits therefore match up one-to-one to the respective unsigned char value bits; in other words, for any given signed char value, there is a unique unsigned char representation.
/* binary representation prefix: 0b */
(signed char)(-25) = 0b11100111
(unsigned char)(231) = 0b11100111
Unfortunately, converting from unsigned char to char can lead to implementation-defined behavior. For example, if char is defined by the implementation to behave as signed char, then an unsigned char variable may hold a value that is outside the range of values representable by a signed char. In such cases, either the result is implementation-defined or an implementation-defined signal is raised.
In section 6.3.1.3 paragraph 3:
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
Answer 3
Question:
Does this mean that it is not safe to store an unsigned value in a signed variable of the same type?
Trying to convert an unsigned type value to a signed type value can result in implementation-defined behavior if the unsigned type value cannot be represented in the new signed type.
unsigned foo = UINT_MAX;
signed bar = foo; /* possible implementation-defined behavior */
In section 6.3.1.3 paragraph 3:
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
An implementation-defined result would be any value returned within the range of values representable by the new signed type. An implementation could theoretically return the same value consistently (e.g. 42) for these cases, and thus loss of information occurs -- i.e. there is no guarantee that converting from unsigned to signed and back to unsigned will yield the original unsigned value.
An implementation-defined signal is that which conforms to the rules laid out in section 7.14 of the C99 standard; an implementation is permitted to define additional conforming signals which are not explicitly enumerated by the C99 standard.
In this particular case, an implementation could theoretically raise the SIGTERM signal which requests the termination of the program. Thus, attempting to convert an unsigned type value to signed type could result in a program termination.
Answer 4
Question:
Does foo == bar evaluate to a false value, even if -1 is equivalent to 255 when an explicit (unsigned char) cast is used?
Consider the following code:
signed char foo = -1;
unsigned char bar = 255;
if ((unsigned char)foo == bar) {
    printf("same\n");
}
Although signed char and unsigned char values are promoted to at least int before the evaluation of a conditional expression, the explicit unsigned char cast will convert the signed char value to unsigned char before the integer promotions occur. Furthermore, converting to an unsigned value is well-defined in the C99 standard and does not lead to implementation-defined behavior.
In section 6.3.1.3 paragraph 2:
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.
Thus the conditional expression essentially becomes 255 == 255, which evaluates to true.
Answer 5
Questions:
Does this mean that not even an explicit conversion is safe?
In general, an explicit cast to char for a value outside the range of values representable by signed char can lead to implementation-defined behavior (see Answer 3). A conversion need not be implicit for section 6.3.1.3 paragraph 3 of the C99 standard to apply.
"does this mean that char could maintain its 'signedness' throughout the comparison?" yes; -1 as a signed char will be promoted to a signed int, which will retain its -1 value. As for the unsigned char, it will also keep its 255 value when being promoted, so yes, the comparison will be false. If you want it to evaluate to true, you will need an explicit cast.
It has to do with how the memory for the chars is stored. In an unsigned char, all 8 bits are used to represent the value, while a signed char uses only 7 bits for the number and the 8th bit to represent the sign.
For an example, lets take a simpler 3 bit value (I will call this new value type tinychar):
bits unsigned signed
000 0 0
001 1 1
010 2 2
011 3 3
100 4 -4
101 5 -3
110 6 -2
111 7 -1
By looking at this chart, you can see the difference in value between a signed and an unsigned tinychar based on how the bits are arranged. Up until you start getting into the negative range, the values are identical for both types. However, once you reach the point where the left-most bit changes to 1, the value suddenly becomes a negative for the signed. The way this works is if you reach the maximum positive value (3) and then add one more you end up with the maximum negative value (-4) and if you subtract one from 0 you will underflow and cause the signed tinychar to become -1 while an unsigned tinychar would become 7. You can also see the equivalence (==) between an unsigned 7 and the signed -1 tinychar because the bits are the same (111) for both.
Now if you expand this to have a total of 8 bits, you should see similar results.
I've tested your code, and it does not compare (signed char)-1 and (unsigned char)255 as equal.
You should convert the signed char into unsigned char first, because the unsigned type does not use the MSB as a sign bit in operations.
I have had bad experiences with using the signed char type for buffer operations; things like your problem then happen. Be sure you have turned on all warnings during compilation, and try to fix them.

How to cast or convert an unsigned int to int in C?

My apologies if the question seems weird. I'm debugging my code and this seems to be the problem, but I'm not sure.
Thanks!
It depends on what you want the behaviour to be. An int cannot hold many of the values that an unsigned int can.
You can cast as usual:
int signedInt = (int) myUnsigned;
but this will cause problems if the unsigned value is past the max int can hold. This means half of the possible unsigned values will result in erroneous behaviour unless you specifically watch out for it.
You should probably reexamine how you store values in the first place if you're having to convert for no good reason.
EDIT: As mentioned by ProdigySim in the comments, the maximum value is platform dependent. But you can access it with INT_MAX and UINT_MAX.
For the usual 4-byte types:
4 bytes = (4*8) bits = 32 bits
If all 32 bits are used, as in unsigned, the maximum value will be 2^32 - 1, or 4,294,967,295.
A signed int effectively sacrifices one bit for the sign, so the maximum value will be 2^31 - 1, or 2,147,483,647. Note that this is half of the other value.
Unsigned int can be converted to signed (or vice-versa) by simple expression as shown below :
unsigned int z;
int y=5;
z= (unsigned int)y;
Though not targeted to the question, you would like to read following links :
signed to unsigned conversion in C - is it always safe?
performance of unsigned vs signed integers
Unsigned and signed values in C
What type-conversions are happening?
IMHO this question is an evergreen. As stated in various answers, the assignment of an unsigned value that is not in the range [0, INT_MAX] is implementation-defined and might even raise a signal. If the unsigned value is considered to be a two's complement representation of a signed number, the probably most portable way is the one shown in the following code snippet:
#include <limits.h>
unsigned int u;
int i;
if (u <= (unsigned int)INT_MAX)
    i = (int)u; /*(1)*/
else if (u >= (unsigned int)INT_MIN)
    i = -(int)~u - 1; /*(2)*/
else
    i = INT_MIN; /*(3)*/
Branch (1) is obvious and cannot invoke overflow or traps, since it
is value-preserving.
Branch (2) goes through some pains to avoid signed integer overflow
by taking the one's complement of the value by bit-wise NOT, casts it
to 'int' (which cannot overflow now), negates the value and subtracts
one, which can also not overflow here.
Branch (3) provides the poison we have to take on one's complement or
sign/magnitude targets, because the signed integer representation
range is smaller than the two's complement representation range.
This is likely to boil down to a simple move on a two's complement target; at least I've observed such with GCC and CLANG. Also branch (3) is unreachable on such a target -- if one wants to limit the execution to two's complement targets, the code could be condensed to
#include <limits.h>
unsigned int u;
int i;
if (u <= (unsigned int)INT_MAX)
    i = (int)u; /*(1)*/
else
    i = -(int)~u - 1; /*(2)*/
The recipe works with any signed/unsigned type pair, and the code is best put into a macro or inline function so the compiler/optimizer can sort it out. (In which case rewriting the recipe with a ternary operator is helpful. But it's less readable and therefore not a good way to explain the strategy.)
And yes, some of the casts to 'unsigned int' are redundant, but:
-- they might help the casual reader
-- some compilers issue warnings on signed/unsigned compares, because the implicit cast causes some non-intuitive behavior by language design
If you have a variable unsigned int x;, you can convert it to an int using (int)x.
It's as simple as this:
unsigned int foo;
int bar = 10;
foo = (unsigned int)bar;
Or vice versa...
If an unsigned int and a (signed) int are used in the same expression, the signed int gets implicitly converted to unsigned. This is a rather dangerous feature of the C language, and one you therefore need to be aware of. It may or may not be the cause of your bug. If you want a more detailed answer, you'll have to post some code.
Some explanation from C++ Primer, 5th edition, page 35:
If we assign an out-of-range value to an object of unsigned type, the result is the remainder of the value modulo the number of values the target type can hold.
For example, an 8-bit unsigned char can hold values from 0 through 255, inclusive. If we assign a value outside the range, the compiler assigns the remainder of that value modulo 256.
unsigned char c = -1; // assuming 8-bit chars, c has value 255
If we assign an out-of-range value to an object of signed type, the result is undefined. The program might appear to work, it might crash, or it might produce garbage values.
Page 160:
If any operand is an unsigned type, the type to which the operands are converted depends on the relative sizes of the integral types on the machine.
...
When the signedness differs and the type of the unsigned operand is the same as or larger than that of the signed operand, the signed operand is converted to unsigned.
The remaining case is when the signed operand has a larger type than the unsigned operand. In this case, the result is machine dependent. If all values in the unsigned type fit in the larger type, then the unsigned operand is converted to the signed type. If the values don't fit, then the signed operand is converted to the unsigned type.
For example, if the operands are long and unsigned int, and int and long have the same size, the long will be converted to unsigned int. If the long type has more bits, then the unsigned int will be converted to long.
I found reading this book is very helpful.
