Is unsigned integer subtraction defined behavior? - c

I have come across code from someone who appears to believe there is a problem subtracting an unsigned integer from another integer of the same type when the result would be negative. So that code like this would be incorrect even if it happens to work on most architectures.
unsigned int To, Tf;
To = getcounter();
while (1) {
Tf = getcounter();
if ((Tf-To) >= TIME_LIMIT) {
break;
}
}
This is the only vaguely relevant quote from the C standard I could find.
A computation involving unsigned operands can never overflow, because a
result that cannot be represented by the resulting unsigned integer
type is reduced modulo the number that is one greater than the largest
value that can be represented by the resulting type.
I suppose one could take that quote to mean that when the right operand is larger the operation is adjusted to be meaningful in the context of modulo truncated numbers.
i.e.
0x0000 - 0x0001 == 0x 1 0000 - 0x0001 == 0xFFFF
as opposed to using the implementation dependent signed semantics:
0x0000 - 0x0001 == (unsigned)(0 + -1) == (0xFFFF but also 0xFFFE or 0x8001)
Which or what interpretation is right? Is it defined at all?

When you work with unsigned types, modular arithmetic (also known as "wrap around" behavior) is taking place. To understand this modular arithmetic, picture the face of a 12-hour clock:
9 + 4 = 1 (13 mod 12), so to the other direction it is: 1 - 4 = 9 (-3 mod 12). The same principle is applied while working with unsigned types. If the result type is unsigned, then modular arithmetic takes place.
Now look at the following operations storing the result as an unsigned int:
unsigned int five = 5, seven = 7;
unsigned int a = five - seven; // a = (-2 % 2^32) = 4294967294
int one = 1, six = 6;
unsigned int b = one - six; // b = (-5 % 2^32) = 4294967291
When you want to make sure that the result is signed, store it in a signed variable or cast it to a signed type. When you want the difference between two numbers and want to make sure that modular arithmetic is not applied, consider using the abs() function declared in stdlib.h:
int c = five - seven; // c = -2
int d = abs(five - seven); // d = 2
Be very careful, especially while writing conditions, because:
if (abs(five - seven) < seven) // = if (2 < 7)
// ...
if (five - seven < -1) // = if (4294967294 < 4294967295): true, but only because -1 is converted to unsigned int
// ...
if (one - six < 1) // = if (-5 < 1)
// ...
if ((int)(five - seven) < 1) // = if (-2 < 1)
// ...
but
if (five - seven < 1) // = if ((unsigned int)-2 < 1) = if (4294967294 < 1)
// ...
if (one - six < five) // = if ((unsigned int)-5 < 5) = if (4294967291 < 5)
// ...
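One way to get the magnitude of a difference between two unsigned values without going through abs() or a conversion to a signed type is to compare first and subtract the smaller value from the larger. A minimal sketch; the helper name abs_diff_u is hypothetical and not part of the answer above:

```c
/* Hypothetical helper: distance between two unsigned values.
   The larger operand is always the minuend, so no wrap-around occurs. */
unsigned int abs_diff_u(unsigned int a, unsigned int b)
{
    return (a > b) ? (a - b) : (b - a);
}
```

With five = 5 and seven = 7, abs_diff_u(five, seven) yields 2, and the result stays correct even when the distance is larger than INT_MAX.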

The result of a subtraction generating a negative number in an unsigned type is well-defined:
[...] A computation involving unsigned operands can never overflow,
because a result that cannot be represented by the resulting unsigned integer type is
reduced modulo the number that is one greater than the largest value that can be
represented by the resulting type.
(ISO/IEC 9899:1999 (E) §6.2.5/9)
As you can see, (unsigned)0 - (unsigned)1 equals -1 modulo UINT_MAX+1, or in other words, UINT_MAX.
Note that although it does say "A computation involving unsigned operands can never overflow", which might lead you to believe that it applies only for exceeding the upper limit, this is presented as a motivation for the actual binding part of the sentence: "a result that cannot be represented by the resulting unsigned integer type is
reduced modulo the number that is one greater than the largest value that can be
represented by the resulting type." This phrase is not restricted to overflow of the upper bound of the type, and applies equally to values too low to be represented.
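Applied to the counter loop in the question, this rule means the elapsed time comes out right even when the counter wraps between the two readings. A small sketch, assuming a free-running counter of type unsigned int that wraps at most once between readings (the function names are made up for illustration):

```c
#include <stdbool.h>

/* Ticks elapsed from `start` to `now` on a free-running unsigned counter.
   The subtraction is reduced modulo UINT_MAX + 1, so it stays correct
   across one wrap of the counter. */
unsigned int elapsed_ticks(unsigned int start, unsigned int now)
{
    return now - start;
}

/* True once at least `limit` ticks have passed since `start`. */
bool has_timed_out(unsigned int start, unsigned int now, unsigned int limit)
{
    return elapsed_ticks(start, now) >= limit;
}
```

For example, a start reading of UINT_MAX - 2 and a later reading of 5 give an elapsed count of 8.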

Well, the first interpretation is correct. However, your reasoning about the "signed semantics" in this context is wrong.
Again, your first interpretation is correct. Unsigned arithmetic follows the rules of modulo arithmetic, meaning that 0x0000 - 0x0001 evaluates to 0xFFFF for 16-bit unsigned types.
However, the second interpretation (the one based on "signed semantics") is also required to produce the same result. I.e. even if you evaluate 0 - 1 in the domain of signed type and obtain -1 as the intermediate result, this -1 is still required to produce 0xFFFF when later it gets converted to unsigned type. Even if some platform uses an exotic representation for signed integers (1's complement, signed magnitude), this platform is still required to apply rules of modulo arithmetic when converting signed integer values to unsigned ones.
For example, this evaluation
signed int a = 0, b = 1;
unsigned int c = a - b;
is still guaranteed to produce UINT_MAX in c, even if the platform is using an exotic representation for signed integers.

With unsigned numbers of type unsigned int or larger, in the absence of type conversions, a-b is defined as yielding the unsigned number which, when added to b, will yield a. Conversion of a negative number to unsigned is defined as yielding the number which, when added to the sign-reversed original number, will yield zero (so converting -5 to unsigned will yield a value which, when added to 5, will yield zero).
Note that unsigned numbers smaller than unsigned int may get promoted to type int before the subtraction, so the behavior of a-b will depend upon the size of int.
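The three statements above can be checked with a few one-line functions. This is only a sketch with made-up names; it assumes nothing beyond uint8_t being available:

```c
#include <stdint.h>

/* Unsigned subtraction: the result r satisfies r + b == a (mod UINT_MAX + 1). */
unsigned int sub_u(unsigned int a, unsigned int b)
{
    return a - b;
}

/* Signed-to-unsigned conversion is defined by value, not by bit pattern. */
unsigned int to_unsigned(int v)
{
    return (unsigned int)v;
}

/* uint8_t operands are promoted to int first, so this difference can be negative. */
int sub_u8(uint8_t a, uint8_t b)
{
    return a - b;
}
```

Here sub_u(5, 7) + 7 wraps back to 5, to_unsigned(-5) + 5 wraps to 0, and sub_u8(0, 1) is a plain -1 because the subtraction happens in int.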

Well, unsigned integer subtraction has defined behavior, but it is also a tricky thing. When you subtract two unsigned integers that are narrower than int, both operands are promoted to int and the result has type int unless you store or cast it to another type explicitly. In the latter case, for example int8_t result = a - b; (where a and b have type uint8_t), you can obtain very weird behavior: you may lose the transitivity property (i.e. if a > b and b > c, then a > c).
The loss of transitivity can break a tree-type data structure. Take care not to write a comparison function for sorting, searching, or tree building that uses unsigned integer subtraction to deduce which key is higher or lower.
See example below.
#include <stdint.h>
#include <stdio.h>
int main(void)
{
uint8_t a = 255;
uint8_t b = 100;
uint8_t c = 150;
printf("uint8_t a = %+d, b = %+d, c = %+d\n\n", a, b, c);
printf(" b - a = %+d\tpromotion to int type\n"
" (int8_t)(b - a) = %+d\n\n"
" b + a = %+d\tpromotion to int type\n"
"(uint8_t)(b + a) = %+d\tmodular arithmetic\n"
" b + a %% %d = %+d\n\n",
b - a, (int8_t)(b - a),
b + a, (uint8_t)(b + a),
UINT8_MAX + 1,
(b + a) % (UINT8_MAX + 1));
printf("c %s b (c - b = %d), b %s a (b - a = %d), AND c %s a (c - a = %d)\n",
(int8_t)(c - b) < 0 ? "<" : ">", (int8_t)(c - b),
(int8_t)(b - a) < 0 ? "<" : ">", (int8_t)(b - a),
(int8_t)(c - a) < 0 ? "<" : ">", (int8_t)(c - a));
}
$ ./a.out
uint8_t a = +255, b = +100, c = +150
b - a = -155 promotion to int type
(int8_t)(b - a) = +101
b + a = +355 promotion to int type
(uint8_t)(b + a) = +99 modular arithmetic
b + a % 256 = +99
c > b (c - b = 50), b > a (b - a = 101), AND c < a (c - a = -105)
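A comparison callback that compares the keys directly, instead of looking at the sign of a truncated difference, keeps the ordering transitive. A minimal sketch for qsort(); the names cmp_u8 and sort_u8 are hypothetical:

```c
#include <stdint.h>
#include <stdlib.h>

/* Comparison callback for qsort() over uint8_t keys.
   It compares directly instead of subtracting, so the ordering is transitive. */
int cmp_u8(const void *lhs, const void *rhs)
{
    uint8_t x = *(const uint8_t *)lhs;
    uint8_t y = *(const uint8_t *)rhs;
    return (x > y) - (x < y);   /* -1, 0 or +1 */
}

/* Sort an array of uint8_t keys in ascending order. */
void sort_u8(uint8_t *keys, size_t count)
{
    qsort(keys, count, sizeof keys[0], cmp_u8);
}
```

Sorting the three values from the example (255, 100, 150) this way gives 100, 150, 255.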

int d = abs(five - seven); // d = 2
abs() (std::abs in C++) is not "suitable" for unsigned integers; the difference needs a cast to a signed type first.

Related

Detecting if an unsigned integer overflow has occurred when adding two numbers

This is my implementation to detect if an unsigned int overflow has occurred when trying to add two numbers.
The max value of unsigned int (UINT_MAX) on my system is 4294967295.
int check_addition_overflow(unsigned int a, unsigned int b) {
if (a > 0 && b > (UINT_MAX - a)) {
printf("overflow has occurred\n");
}
return 0;
}
This seems to work with the values I've tried.
Any rogue cases? What do you think are the pros and cons?
You could use
if((a + b) < a)
The point is that if a + b overflows, the result wraps around and must be lower than a.
Consider the case with hypothetical bound range of 0 -> 9 (overflows at 10):
b can be 9 at the most. For any value a such that a + b >= 10, (a + 9) % 10 < a.
For any values a, b such that a + b < 10, since b is not negative, a + b >= a.
I believe OP was referring to carry-out, not overflow. Overflow occurs when the result of adding or subtracting two signed numbers doesn't fit into the type's width minus one bit (the sign bit). For example, if an integer type has 32 bits, then
adding 2147483647 (0x7FFFFFFF) and 1 gives us -2147483648 (0x80000000).
So the result fits into 32 bits and there is no carry-out. The true result should be 2147483648, but this doesn't fit into 31 bits. The CPU has no idea of signed/unsigned values, so it simply adds the bits together, where 0x7FFFFFFF + 1 = 0x80000000. So the carry out of bit #31 was added to bit #32 (1 + 0 = 1), which is actually the sign bit, and changed the result from + to -.
Since the sign changed, the CPU would set the overflow flag to 1 and the carry flag to 0.
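Both checks discussed in this thread can be written as functions that report the result instead of printing it. A sketch with hypothetical names: the first tests before adding, the second adds and then compares, which is fine because unsigned addition wraps by definition:

```c
#include <limits.h>
#include <stdbool.h>

/* Pre-check: true if a + b would exceed UINT_MAX. No addition is performed. */
bool add_would_wrap(unsigned int a, unsigned int b)
{
    return b > UINT_MAX - a;
}

/* Post-check: perform the (well-defined) wrapping addition, then compare. */
bool add_did_wrap(unsigned int a, unsigned int b)
{
    unsigned int sum = a + b;
    return sum < a;
}
```

The two functions should agree for every pair of inputs; the a > 0 test in the original code is not needed because UINT_MAX - 0 can never be exceeded by b.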

How to programmatically determine maximum and minimum limit of int data in C?

I am attempting exercise 2.1 of K&R. The exercise reads:
Write a program to determine the ranges of char, short, int, and long variables, both signed and unsigned, by printing appropriate values from standard headers and by direct computation. Harder if you compute them: determine the ranges of the various floating-point types.
Printing the values of constants in the standards headers is easy, just like this (only integer shown for example):
printf("Integral Ranges (from constants)\n");
printf("int max: %d\n", INT_MAX);
printf("int min: %d\n", INT_MIN);
printf("unsigned int max: %u\n", UINT_MAX);
However, I want to determine the limits programmatically.
I tried this code which seems like it should work but it actually goes into an infinite loop and gets stuck there:
printf("Integral Ranges (determined programmatically)\n");
int i_max = 0;
while ((i_max + 1) > i_max) {
++i_max;
}
printf("int max: %d\n", i_max);
Why is this getting stuck in a loop? It would seem that when an integer overflows it jumps from 2147483647 to -2147483648. The incremented value is obviously smaller than the previous value so the loop should end, but it doesn't.
Ok, I was about to write a comment but it got too long...
Are you allowed to use sizeof?
If true, then there is an easy way to find the max value for any type:
For example, I'll find the maximum value for an integer:
Definition: INT_MAX = (1 << 31) - 1 for 32-bit integer (2^31 - 1)
The previous definition overflows if we use integers to compute int max, so, it has to be adapted properly:
INT_MAX = (1 << 31) - 1
= ((1 << 30) * 2) - 1
= (((1 << 30) - 1) * 2 + 2) - 1
= (((1 << 30) - 1) * 2) + 1
And using sizeof:
INT_MAX = (((1 << (sizeof(int)*8 - 2)) - 1) * 2) + 1
You can do the same for any signed/unsigned type by just reading the rules for each type.
So it actually wasn't getting stuck in an infinite loop. C code is usually so fast that I assume it's broken if it doesn't complete immediately.
It did eventually return the correct answer after I let it run for about 10 seconds. Turns out that 2,147,483,647 increments takes quite a few cycles to complete.
I should also note that I compiled with cc -O0 to disable optimizations, so this wasn't the problem.
A faster solution might look something like this:
int i_max = 0;
int step_size = 256;
while ((i_max + step_size) > i_max) {
i_max += step_size;
}
while ((i_max + 1) > i_max) {
++i_max;
}
printf("int max: %d\n", i_max);
However, as signed overflow is undefined behavior, probably it is a terrible idea to ever try to programmatically guess this in practice. Better to use INT_MAX.
The simplest I could come up with is:
signed int max_signed_int = ~(1 << ((sizeof(int) * 8) -1));
signed int min_signed_int = (1 << ((sizeof(int) * 8) -1));
unsigned int max_unsigned_int = ~0U;
unsigned int min_unsigned_int = 0U;
In my system:
// max_signed_int = 2147483647
// min_signed_int = -2147483648
// max_unsigned_int = 4294967295
// min_unsigned_int = 0
Assuming a two's complement processor, use unsigned math:
unsigned ... smax, smin;
smax = ((unsigned ...)0 - (unsigned ...)1) / (unsigned ...) 2;
smin = ~smax;
As has been pointed out here in other solutions, trying to overflow an integer in C is undefined behaviour, but, at least in this case, I think you can get a valid answer, even from the U.B. thing:
The point is that if you increment a value and compare the new value with the last, you always get a greater value, except on an overflow (in this case you'll get a value lesser or equal; there are no greater values left, which is what an overflow means). So you can try at least:
int i_old = 0, i = 0;
while (++i > i_old)
i_old = i;
printf("MAX_INT guess: %d\n", i_old);
After this loop, you will have got the expected overflow, and i_old will store the last valid number. Of course, in case you go down, you'll have to use this snippet of code:
int i_old = 0, i = 0;
while (--i < i_old)
i_old = i;
printf("MIN_INT guess: %d\n", i_old);
Of course, U.B. can even mean that the program stops running (in this case, you'll have to add traces, to get at least the last value printed).
By the way, in the ancient times of K&R, integers used to be 16bit wide, a value easily accessible by counting up (easier than now, try 64bit integers overflow from 0 up)
I would use the properties of two's complement to compute the values.
unsigned int uint_max = ~0U;
signed int int_max = uint_max >> 1;
signed int int_min1 = (-int_max - 1);
signed int int_min2 = ~int_max;
2^3 is 1000. 2^3 - 1 is 0111. 2^4 - 1 is 1111.
w is the length in bits of your data type.
uint_max is 2^w - 1, or 111...111. This effect is achieved by using ~0U.
int_max is 2^(w-1) - 1, or 0111...111. This effect can be achieved by bitshifting uint_max 1 bit to the right. Since uint_max is an unsigned value, the >> operator applies a logical shift, which means it adds leading zeroes instead of extending the sign bit.
int_min is -2^(w-1), or 100...000. In two's complement, the most significant bit has a negative weight!
This is how to visualize the first expression for computing int_min1:
...
011...111 int_max +2^(w-1) - 1
100...000 (-int_max - 1) -2^(w-1) == -2^(w-1) + 1 - 1
100...001 -int_max -2^(w-1) + 1 == -(+2^(w-1) - 1)
...
Adding 1 would be moving down, and subtracting 1 would be moving up. First we negate int_max in order to generate a valid int value, then we subtract 1 to get int_min. We can't just negate (int_max + 1) because that would exceed int_max itself, the biggest int value.
Writing the constant out as -2147483648 does not help either: depending on which version of C or C++ you are using, the literal 2147483648 would either become a signed 64-bit integer, keeping the signedness but sacrificing the original bit width, or it would become an unsigned 32-bit integer, keeping the original bit width but sacrificing the signedness. We need to compute int_min programmatically in this roundabout way to keep it a valid int value.
If that's a bit (or byte) too complicated for you, you can just do ~int_max, observing that int_max is 011...111 and int_min is 100...000.
Keep in mind that these techniques I've mentioned here can be used for any bit width w of an integer data type. They can be used for char, short, int, long, and also long long. Keep in mind that integer literals are almost always 32-bits by default, so you may have to cast the 0U to the data type with the appropriate bit width before bitwise NOTing it. But other than that, these techniques are based on the fundamental mathematical principles of two's complement integer representation. That said, they won't work if your computer uses a different way of representing integers, for example ones' complement or most-significant sign-bit.
The assignment says that "printing appropriate values from standard headers" is allowed, and in the real world, that is what you would do. As your prof wrote, direct computation is harder, and why make things harder for its own sake when you're working on another interesting problem and you just want the result? Look up the constants in <limits.h>, for example, INT_MIN and INT_MAX.
Since this is homework and you want to solve it yourself, here are some hints.
The language standard technically allows any of three different representations for signed numbers: two's-complement, one's-complement and sign-and-magnitude. Sure, every computer made in the last fifty years has used two's-complement (with the partial exception of legacy code for certain Unisys mainframes), but if you really want to language-lawyer, you could compute the smallest number for each of the three possible representations and find the minimum by comparing them.
Attempting to find the answer by overflowing or underflowing a signed value does not work! This is undefined behavior! You may in theory, but not in practice, increment an unsigned value of the same width, convert to the corresponding signed type, and compare to the result of casting the previous or next unsigned value. For 32-bit long, this might just be tolerable; it will not scale to a machine where long is 64 bits wide.
You want to use the bitwise operators, particularly ~ and <<, to calculate the largest and smallest value for every type. Note: CHAR_BIT * sizeof(x) gives you the number of bits in x, and left-shifting 0x01UL by one fewer than that, then casting to the desired type, sets the highest bit.
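As a rough illustration of that hint, the following sketch computes the limits from the size of a type using only unsigned arithmetic, so nothing overflows. It assumes two's complement, no padding bits, and a type no wider than unsigned long long; the function names are invented for this example:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: largest value of a signed integer type that occupies
   `size` bytes. Assumes two's complement, no padding bits, and
   size <= sizeof(unsigned long long). Only unsigned arithmetic is used. */
unsigned long long signed_max_for_size(size_t size)
{
    unsigned int bits = (unsigned int)(size * CHAR_BIT);
    /* Set only the sign-bit position, then subtract one: 0111...1 */
    return (1ULL << (bits - 1u)) - 1ULL;
}

/* Largest value of an unsigned type of the same size: 1111...1 */
unsigned long long unsigned_max_for_size(size_t size)
{
    return (signed_max_for_size(size) << 1) | 1ULL;
}
```

Under the same assumptions, the smallest signed value is the negated maximum minus one, for example -INT_MAX - 1 for int.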
For floating-point values, the only portable way is to use the constants in <float.h> and <math.h>; floating-point values might or might not be able to represent positive and negative infinity, and are not constrained to use any particular format. That said, if your compiler supports the optional Annex F of the C11 standard, which specifies IEC 60559 floating-point arithmetic, then dividing a nonzero floating-point number by zero will be defined as producing infinity, which does allow you to "compute" infinity and negative infinity. If so, the implementation will #define __STDC_IEC_559__ as 1.
If you detect that infinity is not supported on your implementation, for instance by checking whether INFINITY and -INFINITY are infinities, you would want to use HUGE_VAL and -HUGE_VAL instead.
#include <stdio.h>
int main() {
int n = 1;
while(n>0) {
n=n<<1;
}
int int_min = n;
int int_max = -(n+1);
printf("int_min is: %d\n",int_min);
printf("int_max is: %d\n", int_max);
return 0;
}
unsigned long LMAX=(unsigned long)-1L;
long SLMAX=LMAX/2;
long SLMIN=-SLMAX-1;
If you don't have the L suffix just use a variable or cast to signed before casting to unsigned.
For long long:
unsigned long long LLMAX=(unsigned long long)-1LL;

Why is a modulo operation returning an unexpected value

Why is the following code printing 255?
#include <stdint.h>
#include <stdio.h>
int main(void) {
uint8_t i = 0;
i = (i - 1) % 16;
printf("i: %d\n", i);
return 0;
}
I assumed 15, although i - 1 evaluates to an integer.
Because of integer promotions in the C standard. Briefly: any type "smaller" than int is converted to int before usage. You cannot avoid this in general.
So what goes on: i is promoted to int. The expression is evaluated as int (the constants you use are int, too). The modulus is -1. This is then converted to uint8_t: 255 by the assignment.
For printf then i is integer-promoted to int (again): (int)255. However, this does no harm.
Note that in C89, for a < 0, a % b is not necessarily negative. It was implementation-defined and could have been 15. However, since C99, -1 % 16 is guaranteed to be -1 as the division has to yield the algebraic quotient.
If you want to make sure the modulus gives a positive result, you have to evaluate the whole expression unsigned by casting i:
i = ((unsigned)i - 1) % 16;
Recommendation: Enable compiler warnings. At least the conversion for the assignment should give a truncation warning.
This is because -1 % n would return -1 and NOT n - 1 [1]. Since i in this case is an unsigned 8-bit int, it becomes 255.
[1] See this question for more details on how modulo for negative integers works in C/C++.
This works (displays 15) with Microsoft C compiler (no stdint.h, so I used a typedef):
#include <stdio.h>
typedef unsigned char uint8_t;
int main(void) {
uint8_t i = 0;
i = (uint8_t)(i - 1) % 16;
printf("i: %d\n", i);
return 0;
}
The reason for the 255 is because (i - 1) is promoted to integer, and the integer division used for % in C rounds towards zero instead of negative infinity (rounding towards negative infinity is the way it's done in math, science, and other programming languages). So for C % is zero or has the same sign as the dividend (in this case -1%16 == -1), while in math modulo is zero or has the same sign as the divisor.
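If the mathematical behaviour is what you want, a small wrapper around % can fold negative remainders back into range. A minimal sketch, assuming a positive divisor; mod_floor and dec_mod16 are names made up for this example:

```c
#include <stdint.h>

/* Hypothetical helper: remainder in the range [0, n) for n > 0,
   matching the mathematical (floored) modulo rather than C's %. */
int mod_floor(int a, int n)
{
    int r = a % n;              /* since C99: r is zero or has the sign of a */
    return (r < 0) ? r + n : r;
}

/* Decrement a counter that cycles through 0..15. */
uint8_t dec_mod16(uint8_t i)
{
    /* i is promoted to int, so i - 1 may be -1 here. */
    return (uint8_t)mod_floor(i - 1, 16);
}
```

With this, decrementing 0 gives 15 rather than 255.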

summing unsigned and signed ints, same or different answer?

If I have the following code in C
int main()
{
int x = <a number>
int y = <a number>
unsigned int v = x;
unsigned int w = y;
int ssum = x + y;
unsigned int usum = v + w;
printf("%d\n", ssum);
printf("%u\n", usum);
if(ssum == usum){
printf("Same\n");
} else {
printf("Different\n");
}
return 0;
}
Which would print the most? Would it be equal since signed and unsigned would produce the same result? If you have a negative like -1, when it gets assigned to int x it becomes 0xFF, and if you want to do -1 + (-1), doing it the signed way gives -2 = 0xFE; since the unsigned variables would be set to 0xFF, adding them would still give 0xFE. The same holds true for 2 + (-3) or -2 + 3: in the end the hexadecimal values are identical. So in C is that what's looked at when it sees signedSum == unsignedSum? It doesn't care that one is actually a large number and the other is -2, as long as the 1's and 0's are the same?
Are there any values that would make this not true?
The examples you have given are incorrect in C. Also, converting between signed and unsigned types is not required to preserve bit patterns (the conversion is by value), although with some representations bit patterns are preserved.
There are circumstances where the result of operations will be the same, and circumstances where the result will differ.
If the (actual) sum of adding two ints would overflow an int
(i.e. value outside range that an int can represent) the result is
undefined behaviour. Anything can happen at that point (including
the program terminating abnormally) - subsequently converting to an unsigned doesn't change anything.
Converting an int with negative value to unsigned int uses modulo
arithmetic (modulo the maximum value that an unsigned can
represent, plus one). That is well defined by the standard, but
means -1 (type int) will convert to the maximum value that an
unsigned can represent (i.e. UINT_MAX, an implementation defined
value specified in <limits.h>).
Similarly, adding two variables of type unsigned int always uses
modulo arithmetic.
Because of things like this, your question "which would produce the most?" is meaningless.
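For the case where the signed sum does not overflow, the rules above imply that the comparison in the question comes out equal, because the signed result is converted to unsigned by value before comparing. A sketch under that stated precondition (the function name is hypothetical):

```c
#include <stdbool.h>

/* Compare a signed sum with the unsigned sum of the converted operands.
   Precondition (assumed, not checked): x + y does not overflow int. */
bool sums_agree(int x, int y)
{
    int ssum = x + y;
    unsigned int usum = (unsigned int)x + (unsigned int)y;
    /* ssum is converted to unsigned int (modulo UINT_MAX + 1) for the comparison. */
    return (unsigned int)ssum == usum;
}
```

Both sides are congruent to x + y modulo UINT_MAX + 1, so they match regardless of how negative numbers are represented. Once x + y overflows int, the behaviour is undefined and no such guarantee exists.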

How to sign extend a 9-bit value when converting from an 8-bit value?

I'm implementing a relative branching function in my simple VM.
Basically, I'm given an 8-bit relative value. I then shift this left by 1 bit to make it a 9-bit value. So, for instance, if you were to say "branch +127" this would really mean, 127 instructions, and thus would add 256 to the IP.
My current code looks like this:
uint8_t argument = 0xFF; //-1 or whatever
int16_t difference = argument << 1;
*ip += difference; //ip is a uint16_t
I don't believe difference will ever be detected as less than 0 with this, however. I'm rusty on how signed to unsigned works. Beyond that, I'm not sure the difference would be correctly subtracted from IP in the case argument is, say, -1 or -2 or something.
Basically, I'm wanting something that would satisfy these "tests"
//case 1
argument = -5
difference -> -10
ip = 20 -> 10 //ip starts at 20, but becomes 10 after applying difference
//case 2
argument = 127 (must fit in a byte)
difference -> 254
ip = 20 -> 274
Hopefully that makes it a bit more clear.
Anyway, how would I do this cheaply? I saw one "solution" to a similar problem, but it involved division. I'm working with slow embedded processors (assumed to be without efficient ways to multiply and divide), so that's a pretty big thing I'd like to avoid.
To clarify: you worry that left shifting a negative 8 bit number will make it appear like a positive nine bit number? Just pad the top 9 bits with the sign bit of the initial number before left shift:
diff = 0xFF;
int16 diff16=(diff + (diff & 0x80)*0x01FE) << 1;
Now your diff16 is signed 2*diff
As was pointed out by Richard J Ross III, you can avoid the multiplication (if that's expensive on your platform) with a conditional branch:
int16 diff16 = (diff + ((diff & 0x80)?0xFF00:0))<<1;
If you are worried about things staying in range and such ("undefined behavior"), you can do
int16 diff16 = diff;
diff16 = (diff16 | ((diff16 & 0x80)?0x7F00:0))<<1;
At no point does this produce numbers that are going out of range.
The cleanest solution, though, seems to be "cast and shift":
diff16 = (signed char)diff; // recognizes and preserves the sign of diff
diff16 = (short int)((unsigned short)diff16 << 1); // left shift, preserving sign
This produces the expected result, because the compiler automatically takes care of the sign bit (so no need for the mask) in the first line; and in the second line, it does a left shift on an unsigned int (for which overflow is well defined per the standard); the final cast back to short int ensures that the number is correctly interpreted as negative. I believe that in this form the construct is never "undefined".
All of my quotes come from the C standard, section 6.3.1.3. Unsigned to signed is well defined when the value is within range of the signed type:
1 When a value with integer type is converted to another integer type
other than _Bool, if the value can be represented by the new type, it
is unchanged.
Signed to unsigned is well defined:
2 Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
Unsigned to signed, when the value lies out of range isn't too well defined:
3 Otherwise, the new type is signed and the value cannot be
represented in it; either the result is implementation-defined or an
implementation-defined signal is raised.
Unfortunately, your question lies in the realm of point 3. C doesn't guarantee any implicit mechanism to convert out-of-range values, so you'll need to explicitly provide one. The first step is to decide which representation you intend to use: Ones' complement, two's complement or sign and magnitude
The representation you use will affect the translation algorithm you use. In the example below, I'll use two's complement: If the sign bit is 1 and the value bits are all 0, this corresponds to your lowest value. Your lowest value is another choice you must make: In the case of two's complement, it'd make sense to use either of INT16_MIN (-32768) or INT8_MIN (-128). In the case of the other two, it'd make sense to use INT16_MIN + 1 or INT8_MIN + 1 due to the presence of negative zeros, which should probably be translated to be indistinguishable from regular zeros. In this example, I'll use INT8_MIN, since it makes sense that (uint8_t) -1 should translate to -1 as an int16_t.
Separate the sign bit from the value bits. The value should be the absolute value, except in the case of a two's complement minimum value when sign will be 1 and the value will be 0. Of course, the sign bit can be wherever you like it to be, though it's conventional for it to rest at the far left hand side. Hence, shifting right 7 places obtains the conventional "sign" bit:
uint8_t sign = input >> 7;
uint8_t value = input & (UINT8_MAX >> 1);
int16_t result;
If the sign bit is 1, we'll call this a negative number and add to INT8_MIN to construct the sign so we don't end up in the same conundrum we started with, or worse: undefined behaviour (which is the fate of one of the other answers).
if (sign == 1) {
result = INT8_MIN + value;
}
else {
result = value;
}
This can be shortened to:
int16_t result = (input >> 7) ? INT8_MIN + (input & (UINT8_MAX >> 1)) : input;
... or, better yet:
int16_t result = input <= INT8_MAX ? input
: INT8_MIN + (int8_t)(input % (uint8_t) INT8_MIN);
The sign test now involves checking if it's in the positive range. If it is, the value remains unchanged. Otherwise, we use addition and modulo to produce the correct negative value. This is fairly consistent with the C standard's language above. It works well for two's complement, because int16_t and int8_t are guaranteed to use a two's complement representation internally. However, types like int aren't required to use a two's complement representation internally. When converting unsigned int to int for example, there needs to be another check, so that we're treating values less than or equal to INT_MAX as positive, and values greater than or equal to (unsigned int) INT_MIN as negative. Any other values need to be handled as errors; In this case I treat them as zeros.
/* Generate some random input */
srand(time(NULL));
unsigned int input = rand();
for (unsigned int x = UINT_MAX / ((unsigned int) RAND_MAX + 1); x > 1; x--) {
input *= (unsigned int) RAND_MAX + 1;
input += rand();
}
int result = /* Handle positives: */ input <= INT_MAX ? input
: /* Handle negatives: */ input >= (unsigned int) INT_MIN ? INT_MIN + (int)(input % (unsigned int) INT_MIN)
: /* Handle errors: */ 0;
If the offset is in the 2's complement representation, then
convert this
uint8_t argument = 0xFF; //-1
int16_t difference = argument << 1;
*ip += difference;
into this:
uint8_t argument = 0xFF; //-1
int8_t signed_argument;
signed_argument = argument; // this relies on implementation-defined
// conversion of unsigned to signed, usually it's
// just a bit-wise copy on 2's complement systems
// OR
// memcpy(&signed_argument, &argument, sizeof argument);
*ip += signed_argument + signed_argument;
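If you would rather not depend on the implementation-defined conversion at all, the sign can be recovered with a comparison and a subtraction, which also avoids multiplication and division. A minimal sketch, assuming the operand byte holds a two's complement value; branch_offset and apply_branch are hypothetical names:

```c
#include <stdint.h>

/* Interpret an 8-bit two's complement operand and double it.
   Uses only a comparison and int arithmetic; no out-of-range conversion. */
int16_t branch_offset(uint8_t argument)
{
    int value = argument;            /* 0..255 */
    if (value > INT8_MAX) {
        value -= 256;                /* map 128..255 to -128..-1 */
    }
    return (int16_t)(value + value); /* -256..254 always fits in int16_t */
}

/* Apply the offset to a 16-bit instruction pointer, wrapping modulo 65536. */
uint16_t apply_branch(uint16_t ip, uint8_t argument)
{
    return (uint16_t)(ip + branch_offset(argument));
}
```

For the two test cases in the question, an argument of 0xFB (-5) moves ip from 20 to 10, and an argument of 127 moves it from 20 to 274.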

Resources