Simple program convert int16_t array to uint16_t - c

I used the WinFilter program to generate a FIR filter in C, but I ran into an issue:
The program only provides a 16-bit signed array, and I need this vector to be unsigned. So I am looking for a simple way to remap the array values onto the corresponding unsigned values.
int16_t FIRCoef[Ntap] = {
-1029,
-1560,
-1188,
0,
1405,
2186,
1718,
0,
-2210,
-3647,
-3095,
0,
5160,
10947,
15482,
17197,
15482,
10947,
5160,
0,
-3095,
-3647,
-2210,
0,
1718,
2186,
1405,
0,
-1188,
-1560,
-1029,
0
};
uint16_t fir(uint16_t NewSample) {
static uint16_t x[Ntap]; //input samples
uint32_t y=0; //output sample
int n;
//shift the old samples
for(n=Ntap-1; n>0; n--)
x[n] = x[n-1];
//Calculate the new output
x[0] = NewSample;
for(n=0; n<Ntap; n++)
y += FIRCoef[n] * x[n]; // calculo da convolucao na amostra
// Calculation of the convolution in the sample
return y / DCgain;
}
I think that one solution should be like this:
uint16_t--------int16_t---------index
0 -32767 1
1 -32766 2
2 -32765 3
... ... ...
65535 32767 65535
any hint?

The value range for an int16_t is -32768 to 32767. Your question is unclear on this point, but it seems like you want simply to shift these values into the range of a uint16_t, 0 to 65535. That's reasonable, as the number of representable values for the two types is the same; it would be accomplished by adding the inverse of the minimum possible value of an int16_t to the input.
Of course, the Devil is in the details. When signed addition overflows, undefined behavior results. When an out-of-range value is converted to a signed integer type, the result is implementation-defined, or an implementation-defined signal is raised. It is desirable to avoid implementation-defined behavior and essential to avoid undefined behavior; that can be done in this case with just a little care:
uint16_t convert(int16_t in) {
return (uint16_t) 32768 + (uint16_t) in;
}
That reliably does the right thing on any conforming system that provides the uint16_t and int16_t types in the first place, because the conversion and the addition operate modulo one plus the maximum value of a uint16_t. The negative input values are converted to unsigned values in the upper half of the range of uint16_t, and the addition then rotates all the values, bringing those from the upper half of the range around to the lower half.
As for doing it to a whole array, if you want to rely only on well-defined C behavior (i.e. if you want a solution that strictly conforms to the standard) then you'll need to make a copy of the data. You could use a function such as the above to populate the copy.
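A minimal sketch of the copy-based approach, assuming a hypothetical helper named convert_array (not part of the original answer) that applies the conversion above element by element into a separate uint16_t buffer:

```c
#include <stddef.h>
#include <stdint.h>

/* Shift one int16_t value into the uint16_t range:
 * -32768 maps to 0 and 32767 maps to 65535. */
static uint16_t convert(int16_t in) {
    return (uint16_t)(32768u + (uint16_t)in);
}

/* Hypothetical helper: fill dst[0..n-1] with shifted copies of src[0..n-1].
 * The source array is left untouched. */
static void convert_array(const int16_t *src, uint16_t *dst, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] = convert(src[i]);
    }
}
```

Calling convert_array(FIRCoef, shifted, Ntap) with a uint16_t shifted[Ntap] buffer would produce the shifted copy; whether the shifted coefficients are then meaningful for the filter is a separate question.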

Related

How to programmatically determine maximum and minimum limit of int data in C?

I am attempting exercise 2.1 of K&R. The exercise reads:
Write a program to determine the ranges of char, short, int, and long variables, both signed and unsigned, by printing appropriate values from standard headers and by direct computation. Harder if you compute them: determine the ranges of the various floating-point types.
Printing the values of constants in the standards headers is easy, just like this (only integer shown for example):
printf("Integral Ranges (from constants)\n");
printf("int max: %d\n", INT_MAX);
printf("int min: %d\n", INT_MIN);
printf("unsigned int max: %u\n", UINT_MAX);
However, I want to determine the limits programmatically.
I tried this code which seems like it should work but it actually goes into an infinite loop and gets stuck there:
printf("Integral Ranges (determined programmatically)\n");
int i_max = 0;
while ((i_max + 1) > i_max) {
++i_max;
}
printf("int max: %d\n", i_max);
Why is this getting stuck in a loop? It would seem that when an integer overflows it jumps from 2147483647 to -2147483648. The incremented value is obviously smaller than the previous value so the loop should end, but it doesn't.
Ok, I was about to write a comment but it got too long...
Are you allowed to use sizeof?
If true, then there is an easy way to find the max value for any type:
For example, I'll find the maximum value for an integer:
Definition: INT_MAX = (1 << 31) - 1 for 32-bit integer (2^31 - 1)
The previous definition overflows if we use integers to compute int max, so, it has to be adapted properly:
INT_MAX = (1 << 31) - 1
= ((1 << 30) * 2) - 1
= (((1 << 30) - 1) * 2 + 2) - 1
= (((1 << 30) - 1) * 2) + 1
And using sizeof:
INT_MAX = (((1 << (sizeof(int)*8 - 2)) - 1) * 2) + 1
You can do the same for any signed/unsigned type by just reading the rules for each type.
So it actually wasn't getting stuck in an infinite loop. C code is usually so fast that I assume it's broken if it doesn't complete immediately.
It did eventually return the correct answer after I let it run for about 10 seconds. Turns out that 2,147,483,647 increments takes quite a few cycles to complete.
I should also note that I compiled with cc -O0 to disable optimizations, so this wasn't the problem.
A faster solution might look something like this:
int i_max = 0;
int step_size = 256;
while ((i_max + step_size) > i_max) {
i_max += step_size;
}
while ((i_max + 1) > i_max) {
++i_max;
}
printf("int max: %d\n", i_max);
However, as signed overflow is undefined behavior, probably it is a terrible idea to ever try to programmatically guess this in practice. Better to use INT_MAX.
The simplest I could come up with is:
signed int max_signed_int = ~(1 << ((sizeof(int) * 8) -1));
signed int min_signed_int = (1 << ((sizeof(int) * 8) -1));
unsigned int max_unsigned_int = ~0U;
unsigned int min_unsigned_int = 0U;
In my system:
// max_signed_int = 2147483647
// min_signed_int = -2147483648
// max_unsigned_int = 4294967295
// min_unsigned_int = 0
Assuming a two's complement processor, use unsigned math:
unsigned ... smax, smin;
smax = ((unsigned ...)0 - (unsigned ...)1) / (unsigned ...) 2;
smin = ~smax;
As has been pointed out in other solutions, trying to overflow an integer in C is undefined behaviour, but, at least in this case, I think you can get a valid answer, even from the U.B. thing:
The point is that if you increment a value and compare the new value with the last, you always get a greater value, except on an overflow (in that case you'll get a value less than or equal to the old one, because there are no greater values left). So you can at least try:
int i_old = 0, i = 0;
while (++i > i_old)
i_old = i;
printf("MAX_INT guess: %d\n", i_old);
After this loop, you will have got the expected overflow, and i_old will store the last valid number. Of course, in case you go down, you'll have to use this snippet of code:
int i_old = 0, i = 0;
while (--i < i_old)
i_old = i;
printf("MIN_INT guess: %d\n", i_old);
Of course, U.B. can even mean the program stops running (in that case, you'll have to add traces, to at least get the last value printed).
By the way, in the ancient times of K&R, integers used to be 16bit wide, a value easily accessible by counting up (easier than now, try 64bit integers overflow from 0 up)
I would use the properties of two's complement to compute the values.
unsigned int uint_max = ~0U;
signed int int_max = uint_max >> 1;
signed int int_min1 = (-int_max - 1);
signed int int_min2 = ~int_max;
2^3 is 1000. 2^3 - 1 is 0111. 2^4 - 1 is 1111.
w is the length in bits of your data type.
uint_max is 2^w - 1, or 111...111. This effect is achieved by using ~0U.
int_max is 2^(w-1) - 1, or 0111...111. This effect can be achieved by bitshifting uint_max 1 bit to the right. Since uint_max is an unsigned value, the >> operator performs a logical shift, meaning it shifts in leading zeroes instead of extending the sign bit.
int_min is -2^(w-1), or 100...000. In two's complement, the most significant bit has a negative weight!
This is how to visualize the first expression for computing int_min1:
...
011...111 int_max +2^(w-1) - 1
100...000 (-int_max - 1) -2^(w-1) == -2^(w-1) + 1 - 1
100...001 -int_max -2^(w-1) + 1 == -(+2^(w-1) - 1)
...
Adding 1 would be moving down, and subtracting 1 would be moving up. First we negate int_max in order to generate a valid int value, then we subtract 1 to get int_min. We can't just negate (int_max + 1) because that would exceed int_max itself, the biggest int value.
Depending on which version of C or C++ you are using, the expression -(int_max + 1) would either become a signed 64-bit integer, keeping the signedness but sacrificing the original bit width, or it would become an unsigned 32-bit integer, keeping the original bit width but sacrificing the signedness. We need to declare int_min programmatically in this roundabout way to keep it a valid int value.
If that's a bit (or byte) too complicated for you, you can just do ~int_max, observing that int_max is 011...111 and int_min is 100...000.
Keep in mind that these techniques I've mentioned here can be used for any bit width w of an integer data type. They can be used for char, short, int, long, and also long long. Keep in mind that integer literals are almost always 32-bits by default, so you may have to cast the 0U to the data type with the appropriate bit width before bitwise NOTing it. But other than that, these techniques are based on the fundamental mathematical principles of two's complement integer representation. That said, they won't work if your computer uses a different way of representing integers, for example ones' complement or most-significant sign-bit.
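As a rough sketch of the technique described above, the following computes the limits of int and long from all-ones unsigned values. The function names are made up for illustration, and the code assumes two's complement with no padding bits, as the answer itself does. The <limits.h> include is only there so the results can be compared against the header constants.

```c
#include <limits.h>

/* Assumes two's complement and no padding bits in the unsigned types. */
static int computed_int_max(void) {
    unsigned int umax = ~0U;       /* all bits set, i.e. UINT_MAX */
    return (int)(umax >> 1);       /* clear the top bit; the value fits in int */
}

static int computed_int_min(void) {
    return -computed_int_max() - 1;  /* never forms int_max + 1 */
}

static long computed_long_max(void) {
    unsigned long umax = ~0UL;     /* the UL suffix gives the literal the right width */
    return (long)(umax >> 1);
}

static long computed_long_min(void) {
    return -computed_long_max() - 1;
}
```

On a typical two's complement implementation these should agree with INT_MAX, INT_MIN, LONG_MAX and LONG_MIN.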
The assignment says that "printing appropriate values from standard headers" is allowed, and in the real world, that is what you would do. As your prof wrote, direct computation is harder, and why make things harder for its own sake when you're working on another interesting problem and you just want the result? Look up the constants in <limits.h>, for example, INT_MIN and INT_MAX.
Since this is homework and you want to solve it yourself, here are some hints.
The language standard technically allows any of three different representations for signed numbers: two's-complement, one's-complement and sign-and-magnitude. Sure, every computer made in the last fifty years has used two's-complement (with the partial exception of legacy code for certain Unisys mainframes), but if you really want to language-lawyer, you could compute the smallest number for each of the three possible representations and find the minimum by comparing them.
Attempting to find the answer by overflowing or underflowing a signed value does not work! This is undefined behavior! You may in theory, but not in practice, increment an unsigned value of the same width, convert to the corresponding signed type, and compare to the result of casting the previous or next unsigned value. For 32-bit long, this might just be tolerable; it will not scale to a machine where long is 64 bits wide.
You want to use the bitwise operators, particularly ~ and <<, to calculate the largest and smallest value for every type. Note: CHAR_BIT * sizeof(x) gives you the number of bits in x, and left-shifting 0x01UL by one fewer than that, then casting to the desired type, sets the highest bit.
For floating-point values, the only portable way is to use the constants in <float.h> and <math.h>; floating-point values might or might not be able to represent positive and negative infinity, and are not constrained to use any particular format. That said, if your compiler supports the optional Annex F of the C11 standard, which specifies IEC 60559 floating-point arithmetic, then dividing a nonzero floating-point number by zero will be defined as producing infinity, which does allow you to "compute" infinity and negative infinity. If so, the implementation will #define __STDC_IEC_559__ as 1.
If you detect that infinity is not supported on your implementation, for instance by checking whether INFINITY and -INFINITY are infinities, you would want to use HUGE_VAL and -HUGE_VAL instead.
#include <stdio.h>
int main() {
int n = 1;
while(n>0) {
n=n<<1;
}
int int_min = n;
int int_max = -(n+1);
printf("int_min is: %d\n",int_min);
printf("int_max is: %d\n", int_max);
return 0;
}
unsigned long LMAX=(unsigned long)-1L;
long SLMAX=LMAX/2;
long SLMIN=-SLMAX-1;
If you don't have the L suffix, just use a variable or cast to signed before casting to unsigned.
For long long:
unsigned long long LLMAX=(unsigned long long)-1LL;

how can split integers into bytes without using arithmetic in c?

I am implementing the four basic arithmetic functions (addition, subtraction, division, multiplication) in C.
The basic structure I have in mind is:
the program gets two operands from the user using scanf,
then splits these values into bytes and computes the result.
I've completed addition and subtraction,
but I forgot that I shouldn't use arithmetic operators,
so when splitting an integer into single bytes,
I wrote code like
while(quotient!=0){
bin[i]=quotient%2;
quotient=quotient/2;
i++;
}
But this uses arithmetic operators that I shouldn't use,
so I have to rewrite the splitting part,
and I really have no idea how to split an integer into single bytes without using
% or /.
To access the bytes of a variable type punning can be used.
According to the Standard C (C99 and C11), only unsigned char brings certainty to perform this operation in a safe way.
This could be done in the following way:
typedef unsigned int myint_t;
myint_t x = 1234;
union {
myint_t val;
unsigned char byte[sizeof(myint_t)];
} u;
Now you can access the bytes of x in this way:
u.val = x;
for (int j = 0; j < sizeof(myint_t); j++)
printf("%d ",u.byte[j]);
However, as WhozCrag has pointed out, there are issues with endianness.
It cannot be assumed that the bytes are in determined order.
So, before doing any computation with bytes, your program needs to check how the endianness works.
#include <limits.h> /* To use UCHAR_MAX */
unsigned long int ByteFactor = 1u + UCHAR_MAX; /* 256 almost everywhere */
u.val = 0;
for (int j = sizeof(myint_t) - 1; j >= 0 ; j--)
u.val = u.val * ByteFactor + j;
Now, when you print the values of u.byte[], you will see the order in which the bytes are arranged for the type myint_t.
The least significant byte will have value 0.
I assume 32 bit integers (if not the case then just change the sizes) there are more approaches:
BYTE pointer
#include<stdio.h>
int x; // your integer or whatever else data type
BYTE *p=(BYTE*)&x;
x=0x11223344;
printf("%x\n",p[0]);
printf("%x\n",p[1]);
printf("%x\n",p[2]);
printf("%x\n",p[3]);
just get the address of your data as BYTE pointer
and access the bytes directly via 1D array
union
#include<stdio.h>
union
{
int x; // your integer or whatever else data type
BYTE p[4];
} a;
a.x=0x11223344;
printf("%x\n",a.p[0]);
printf("%x\n",a.p[1]);
printf("%x\n",a.p[2]);
printf("%x\n",a.p[3]);
and access the bytes directly via 1D array
[notes]
if you do not have BYTE defined then change it for unsigned char
with ALU you can use not only %,/ but also >>,& which is way faster but still use arithmetics
now depending on the platform endianness the output can be 11,22,33,44 or 44,33,22,11 so you need to keep that in mind (especially for code used on multiple platforms)
you need to handle the sign of the number, for unsigned integers there is no problem
but for signed integers C typically uses two's complement, so it is better to separate the sign before splitting, like:
int s;
if (x<0) { s=-1; x=-x; } else s=+1;
// now split ...
[edit2] logical/bit operations
x<<n,x>>n - is bit shift left and right of x by n bits
x&y - is bitwise logical and (perform logical AND on each bit separately)
so when you have for example a 32 bit unsigned int (called DWORD) you can split it into BYTES like this:
DWORD x; // input 32 bit unsigned int
BYTE a0,a1,a2,a3; // output BYTES a0 is the least significant a3 is the most significant
x=0x11223344;
a0=DWORD((x )&255); // should be 0x44
a1=DWORD((x>> 8)&255); // should be 0x33
a2=DWORD((x>>16)&255); // should be 0x22
a3=DWORD((x>>24)&255); // should be 0x11
this approach is not affected by endianness
but it uses ALU
the point is shift the bits you want to position of 0..7 bit and mask out the rest
the &255 and DWORD() overtyping is not needed on all compilers but some do weird stuff without them especially on signed variables like char or int
x>>n is the same as x/(pow(2,n))=x/(1<<n)
x&((1<<n)-1) is the same as x%(pow(2,n))=x%(1<<n)
so (x>>8)=x/256 and (x&255)=x%256
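The shift-and-mask idea above can be sketched in plain C as follows. The function name split_u32 is hypothetical, and uint32_t and uint8_t from <stdint.h> stand in for DWORD and BYTE:

```c
#include <stdint.h>

/* Store the four bytes of x in out[], least significant byte first.
 * Only shifts and masks are applied to the value itself, so the result
 * does not depend on the platform's endianness. */
static void split_u32(uint32_t x, uint8_t out[4]) {
    for (int i = 0; i < 4; i++) {
        out[i] = (uint8_t)(x & 0xFFu);  /* keep the low 8 bits */
        x >>= 8;                        /* move the next byte into place */
    }
}
```

For x = 0x11223344 this fills out[] with 0x44, 0x33, 0x22, 0x11 in that order.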

Reading bits serially in C

I am trying to convert a bit of code from Python to C. I have got it all working other than the section below. All the variables have been defined as ints. I believe the problem has to do with pointers and addresses but I can't work it out.
for(j=0; j<12; j++)
{
digitalWrite(CLOCK, 1);
sleep(0.001);
bit = digitalRead(DATA_IN);
sleep(0.001);
digitalWrite(CLOCK, 0);
value = bit * 2 ** (12-j-1); // error
anip = anip + value;
printf("j:%i bit:%i value:%i anip:%i", j, bit, value, anip);
}
The error is invalid type argument of unary ‘*’ (have ‘int’)
C has no exponentiation operator, which is what I guess you do ** for.
You can use e.g. pow if it's okay to typecast the result from a floating point value back to an integer.
In C, 1<<i is the best way to raise 2 to the power of i.
Do not use ints for bit manipulation, because they vary in size by platform. Use uint32_t from <stdint.h>.
The sleep() function takes an integer argument and waits for the specified number of seconds. The argument 0.001 becomes 0, which is probably not what you want. Instead, try usleep(), which takes an argument that represents microseconds, so a 1 ms delay is usleep(1000).
The other answers will solve the generic problem of raising an arbitrary number to a power, or to a power of 2, but this is a very specific case.
The purpose of the loop is to read 12 bits serially from MSB to LSB and convert them into an integer. The implementation you've shown attempts to do this by reading a bit, shifting it to the correct position, and accumulating the result into anip. But there's an easier way:
anip = 0;
for (j=0; j<12; ++j) {
// Pulse the CLOCK line and read one bit, MSB first.
digitalWrite(CLOCK, 1);
usleep(1000); // 1000 microseconds = 1 ms, matching the intended 0.001 s
bit = digitalRead(DATA_IN);
usleep(1000);
digitalWrite(CLOCK, 0);
// Accumulate the bits.
anip <<= 1; // Shift to make room for the new bit.
anip += bit; // Add the new bit.
printf("j:%i bit:%i anip:%i", j, bit, anip);
}
As an example, suppose the first 4 bits are 1,0,0,1. Then the output will be as follows (anip is shown here in binary for clarity; %i actually prints it in decimal):
j:0 bit:1 anip:1
j:1 bit:0 anip:10
j:2 bit:0 anip:100
j:3 bit:1 anip:1001
When the loop completes, anip will contain the value of the entire sequence of bits. This is a fairly standard idiom for reading data serially.
Although the advice to use uint32_t is generally appropriate, the C standard defines int to be at least 16 bits, which is more than the 12 you need (including the sign bit, if anip is signed). Moreover, you're probably writing this for a specific platform and therefore aren't worried about portability.
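To try the shift-and-accumulate idiom without hardware, here is a small sketch in which a hypothetical callback stands in for digitalRead(DATA_IN). The names bit_source_fn, read_bits_msb_first and bit_array are invented for this example, and the clock pulsing and delays are left out:

```c
#include <stdint.h>

/* Hypothetical bit source standing in for digitalRead(DATA_IN); returns 0 or 1. */
typedef int (*bit_source_fn)(void *ctx);

/* Read nbits (at most 16) bits, most significant bit first, into one integer. */
static uint16_t read_bits_msb_first(bit_source_fn next_bit, void *ctx, int nbits) {
    uint16_t value = 0;
    for (int j = 0; j < nbits; j++) {
        /* Shift to make room, then place the new bit in the lowest position. */
        value = (uint16_t)((value << 1) | (next_bit(ctx) & 1));
    }
    return value;
}

/* Test double: replays bits from an array instead of reading a pin. */
struct bit_array {
    const int *bits;
    int pos;
};

static int bit_array_next(void *ctx) {
    struct bit_array *a = ctx;
    return a->bits[a->pos++];
}
```

Feeding it the four bits 1,0,0,1 yields 9, which is 1001 in binary.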

How to sign extend a 9-bit value when converting from an 8-bit value?

I'm implementing a relative branching function in my simple VM.
Basically, I'm given an 8-bit relative value. I then shift this left by 1 bit to make it a 9-bit value. So, for instance, if you were to say "branch +127" this would really mean 127 instructions, and thus would add 254 to the IP.
My current code looks like this:
uint8_t argument = 0xFF; //-1 or whatever
int16_t difference = argument << 1;
*ip += difference; //ip is a uint16_t
I don't believe difference will ever be detected as less than 0 with this, however. I'm rusty on how signed-to-unsigned conversion works. Beyond that, I'm not sure the difference would be correctly subtracted from IP in the case where argument is, say, -1 or -2.
Basically, I'm wanting something that would satisfy these "tests"
//case 1
argument = -5
difference -> -10
ip = 20 -> 10 //ip starts at 20, but becomes 10 after applying difference
//case 2
argument = 127 (must fit in a byte)
difference -> 254
ip = 20 -> 274
Hopefully that makes it a bit more clear.
Anyway, how would I do this cheaply? I saw one "solution" to a similar problem, but it involved division. I'm working with slow embedded processors (assumed to be without efficient ways to multiply and divide), so that's a pretty big thing I'd like to avoid.
To clarify: you worry that left shifting a negative 8 bit number will make it appear like a positive nine bit number? Just pad the top 9 bits with the sign bit of the initial number before left shift:
diff = 0xFF;
int16 diff16=(diff + (diff & 0x80)*0x01FE) << 1;
Now your diff16 is signed 2*diff
As was pointed out by Richard J Ross III, you can avoid the multiplication (if that's expensive on your platform) with a conditional branch:
int16 diff16 = (diff + ((diff & 0x80)?0xFF00:0))<<1;
If you are worried about things staying in range and such ("undefined behavior"), you can do
int16 diff16 = diff;
diff16 = (diff16 | ((diff16 & 0x80)?0x7F00:0))<<1;
At no point does this produce numbers that are going out of range.
The cleanest solution, though, seems to be "cast and shift":
diff16 = (signed char)diff; // recognizes and preserves the sign of diff
diff16 = (short int)((unsigned short)diff16 << 1); // left shift on the unsigned value, then convert back
This produces the expected result, because the compiler automatically takes care of the sign bit (so no need for the mask) in the first line; and in the second line, it does a left shift on an unsigned int (for which overflow is well defined per the standard); the final cast back to short int ensures that the number is correctly interpreted as negative. I believe that in this form the construct is never "undefined".
All of my quotes come from the C standard, section 6.3.1.3. Unsigned to signed is well defined when the value is within range of the signed type:
1 When a value with integer type is converted to another integer type
other than _Bool, if the value can be represented by the new type, it
is unchanged.
Signed to unsigned is well defined:
2 Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
Unsigned to signed, when the value lies out of range isn't too well defined:
3 Otherwise, the new type is signed and the value cannot be
represented in it; either the result is implementation-defined or an
implementation-defined signal is raised.
Unfortunately, your question lies in the realm of point 3. C doesn't guarantee any implicit mechanism to convert out-of-range values, so you'll need to explicitly provide one. The first step is to decide which representation you intend to use: Ones' complement, two's complement or sign and magnitude
The representation you use will affect the translation algorithm you use. In the example below, I'll use two's complement: If the sign bit is 1 and the value bits are all 0, this corresponds to your lowest value. Your lowest value is another choice you must make: In the case of two's complement, it'd make sense to use either of INT16_MIN (-32768) or INT8_MIN (-128). In the case of the other two, it'd make sense to use INT16_MIN + 1 or INT8_MIN + 1 due to the presence of negative zeros, which should probably be translated to be indistinguishable from regular zeros. In this example, I'll use INT8_MIN, since it makes sense that (uint8_t) -1 should translate to -1 as an int16_t.
Separate the sign bit from the value bits. The value should be the absolute value, except in the case of a two's complement minimum value when sign will be 1 and the value will be 0. Of course, the sign bit can be where-ever you like it to be, though it's conventional for it to rest at the far left hand side. Hence, shifting right 7 places obtains the conventional "sign" bit:
uint8_t sign = input >> 7;
uint8_t value = input & (UINT8_MAX >> 1);
int16_t result;
If the sign bit is 1, we'll call this a negative number and add to INT8_MIN to construct the sign so we don't end up in the same conundrum we started with, or worse: undefined behaviour (which is the fate of one of the other answers).
if (sign == 1) {
result = INT8_MIN + value;
}
else {
result = value;
}
This can be shortened to:
int16_t result = (input >> 7) ? INT8_MIN + (input & (UINT8_MAX >> 1)) : input;
... or, better yet:
int16_t result = input <= INT8_MAX ? input
: INT8_MIN + (int8_t)(input % (uint8_t) INT8_MIN);
The sign test now involves checking if it's in the positive range. If it is, the value remains unchanged. Otherwise, we use addition and modulo to produce the correct negative value. This is fairly consistent with the C standard's language above. It works well for two's complement, because int16_t and int8_t are guaranteed to use a two's complement representation internally. However, types like int aren't required to use a two's complement representation internally. When converting unsigned int to int for example, there needs to be another check, so that we're treating values less than or equal to INT_MAX as positive, and values greater than or equal to (unsigned int) INT_MIN as negative. Any other values need to be handled as errors; In this case I treat them as zeros.
/* Generate some random input */
srand(time(NULL));
unsigned int input = rand();
for (unsigned int x = UINT_MAX / ((unsigned int) RAND_MAX + 1); x > 1; x--) {
input *= (unsigned int) RAND_MAX + 1;
input += rand();
}
int result = /* Handle positives: */ input <= INT_MAX ? input
: /* Handle negatives: */ input >= (unsigned int) INT_MIN ? INT_MIN + (int)(input % (unsigned int) INT_MIN)
: /* Handle errors: */ 0;
If the offset is in the 2's complement representation, then
convert this
uint8_t argument = 0xFF; //-1
int16_t difference = argument << 1;
*ip += difference;
into this:
uint8_t argument = 0xFF; //-1
int8_t signed_argument;
signed_argument = argument; // this relies on implementation-defined
// conversion of unsigned to signed, usually it's
// just a bit-wise copy on 2's complement systems
// OR
// memcpy(&signed_argument, &argument, sizeof argument);
*ip += signed_argument + signed_argument;
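Putting the question's two test cases into code, here is one possible sketch that avoids both out-of-range conversions and shifts of negative values. The helper names branch_offset and apply_branch are hypothetical, and the argument is assumed to be an 8-bit two's complement offset:

```c
#include <stdint.h>

/* Interpret argument as an 8-bit two's complement offset and double it. */
static int16_t branch_offset(uint8_t argument) {
    int16_t value = argument;          /* 0..255 always fits in int16_t */
    if (value > INT8_MAX) {
        value -= 256;                  /* 128..255 becomes -128..-1 */
    }
    return (int16_t)(value + value);   /* doubled offset, range -256..254 */
}

/* Apply the doubled offset to a 16-bit instruction pointer.
 * The final conversion to uint16_t wraps modulo 65536, which is well defined. */
static uint16_t apply_branch(uint16_t ip, uint8_t argument) {
    return (uint16_t)(ip + branch_offset(argument));
}
```

With ip at 20, an argument of 0xFB (that is, -5) gives 10, and an argument of 127 gives 274. The only branch is a single comparison, and no multiplication or division is involved.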

How to subtract two unsigned ints with wrap around or overflow

There are two unsigned ints (x and y) that need to be subtracted. x is always larger than y. However, both x and y can wrap around; for example, if they were both bytes, after 0xff comes 0x00. The problem case is if x wraps around, while y does not. Now x appears to be smaller than y. Luckily, x will not wrap around twice (only once is guaranteed). Assuming bytes, x has wrapped and is now 0x2, whereas y has not and is 0xFE. The right answer of x - y is supposed to be 0x4.
Maybe,
( x > y) ? (x-y) : (x+0xff-y);
But I think there is another way, something involving two's complement? Also, in this embedded system, x and y are the largest unsigned int types, so adding 0xff... is not possible.
What is the best way to write the statement (target language is C)?
Assuming two unsigned integers:
If you know that one is supposed to be "larger" than the other, just subtract. It will work provided you haven't wrapped around more than once (obviously, if you have, you won't be able to tell).
If you don't know that one is larger than the other, subtract and cast the result to a signed int of the same width. It will work provided the difference between the two is in the range of the signed int (if not, you won't be able to tell).
To clarify: the scenario described by the original poster seems to be confusing people, but is typical of monotonically increasing fixed-width counters, such as hardware tick counters, or sequence numbers in protocols. The counter goes (e.g. for 8 bits) 0xfc, 0xfd, 0xfe, 0xff, 0x00, 0x01, 0x02, 0x03 etc., and you know that of the two values x and y that you have, x comes later. If x==0x02 and y==0xfe, the calculation x-y (as an 8-bit result) will give the correct answer of 4, assuming that subtraction of two n-bit values wraps modulo 2^n, which C99 guarantees for subtraction of unsigned values. (Note: the C standard does not guarantee this behaviour for subtraction of signed values.)
Here's a little more detail of why it 'just works' when you subtract the 'smaller' from the 'larger'.
A couple of things going into this…
1. In hardware, subtraction uses addition: The appropriate operand is simply negated before being added.
2. In two’s complement (which pretty much everything uses), an integer is negated by inverting all the bits then adding 1.
Hardware does this more efficiently than it sounds from the above description, but that’s the basic algorithm for subtraction (even when values are unsigned).
So, let's compute 2 – 250 using 8-bit unsigned integers. In binary we have
0 0 0 0 0 0 1 0
- 1 1 1 1 1 0 1 0
We negate the operand being subtracted and then add. Recall that to negate we invert all the bits then add 1. After inverting the bits of the second operand we have
0 0 0 0 0 1 0 1
Then after adding 1 we have
0 0 0 0 0 1 1 0
Now we perform addition...
0 0 0 0 0 0 1 0
+ 0 0 0 0 0 1 1 0
= 0 0 0 0 1 0 0 0 = 8, which is the result we wanted from 2 - 250
Maybe I don't understand, but what's wrong with:
unsigned r = x - y;
The question, as stated, is confusing. You said that you are subtracting unsigned values. If x is always larger than y, as you said, then x - y cannot possibly wrap around or overflow. So you just do x - y (if that's what you need) and that's it.
This is an efficient way to determine the amount of free space in a circular buffer or do sliding window flow control.
Use unsigned ints for head and tail - increment them and let them wrap!
Buffer length has to be a power of 2.
free = ((head - tail) & size_mask), where size_mask is 2^n-1 the buffer or window size.
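A small sketch of this bookkeeping, under assumptions the answer does not spell out: head counts items written, tail counts items read, both are free-running uint32_t counters, and the mask is applied only when indexing into storage. RING_SIZE and the function names are made up for illustration.

```c
#include <stdint.h>

#define RING_SIZE 16u                 /* must be a power of two */
#define RING_MASK (RING_SIZE - 1u)

/* Number of items currently stored. The subtraction wraps modulo 2^32,
 * so it stays correct after head has wrapped past tail. */
static uint32_t ring_used(uint32_t head, uint32_t tail) {
    return head - tail;
}

/* Remaining capacity. */
static uint32_t ring_free(uint32_t head, uint32_t tail) {
    return RING_SIZE - ring_used(head, tail);
}

/* Map a free-running counter to a slot in the storage array. */
static uint32_t ring_index(uint32_t counter) {
    return counter & RING_MASK;
}
```

For example, with tail at 0xFFFFFFFB and head already wrapped to 5, ring_used reports 10 items and ring_free reports 6.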
Just to put the already correct answer into code:
If you know that x is the smaller value, the following calculation just works:
int main()
{
uint8_t x = 0xff;
uint8_t y = x + 20;
uint8_t res = y - x;
printf("Expect 20: %d\n", res); // res is 20
return 0;
}
If you do not know which one is smaller:
int main()
{
uint8_t x = 0xff;
uint8_t y = x + 20;
int8_t res1 = (int8_t)x - y;
int8_t res2 = (int8_t)y - x;
printf("Expect -20 and 20: %d and %d\n", res1, res2);
return 0;
}
Where the difference must be inside the range of int8_t in this case.
The code experiment helped me to understand the solution better.
The problem should be stated as follows:
Let's assume the position (angle) of two pointers a and b of a clock is given by a uint8_t. The whole circumference is divided into the 256 values of a uint8_t. How can the smaller distance between the two pointers be calculated efficiently?
A solution is:
uint8_t smaller_distance = abs( (int8_t)( a - b ) );
I suspect there is nothing more efficient, as otherwise there would be something more efficient than abs().
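For comparison, here is a variant of the same idea that stays entirely in unsigned arithmetic and so does not rely on the implementation-defined conversion to int8_t. This is an alternative sketch, not the answer's method, and the name smaller_distance_u8 is hypothetical:

```c
#include <stdint.h>

/* Shorter way around a 256-step circle between positions a and b. */
static uint8_t smaller_distance_u8(uint8_t a, uint8_t b) {
    uint8_t d = (uint8_t)(a - b);      /* distance going one way, modulo 256 */
    if (d > 128) {
        d = (uint8_t)(256 - d);        /* the other way round is shorter */
    }
    return d;
}
```

Positions 250 and 2 are 8 steps apart in either argument order, and opposite positions such as 0 and 128 give 128.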
To echo everyone else replying, if you just subtract the two and interpret the result as unsigned you'll be fine.
Unless you have an explicit counterexample.
Your example of x = 0x2, y= 0x14 would not result in 0x4, it would result in 0xEE, unless you have more constraints on the math that are unstated.
Yet another answer, and hopefully easy to understand:
SUMMARY:
It's assumed the OP's x and y are assigned values from a counter, e.g., from a timer.
(x - y) will always give the value desired, even if the counter wraps.
This assumes the counter is incremented less than 2^N times between y and x,
for N-bit unsigned int's.
DESCRIPTION:
A counter variable is unsigned and it can wrap around.
A uint8 counter would have values:
0, 1, 2, ..., 255, 0, 1, 2, ..., 255, ...
The number of counter tics between two points can be calculated as shown below.
This assumes the counter is incremented less than 256 times, between y and x.
uint8 x, y, counter, counterTics;
<initalize the counter>
<do stuff while the counter increments>
y = counter;
<do stuff while the counter increments>
x = counter;
counterTics = x - y;
EXPLANATION:
For uint8, and the counter-tics from y to x is less than 256 (i.e., less than 2^8):
If (x >= y) then: the counter did not wrap, counterTics == x - y
If (x < y) then: the counter wrapped, counterTics == (256-y) + x
(256-y) is the number of tics before wrapping.
x is the number of tics after wrapping.
Note: if those calculations are made in the order shown, no negative numbers are involved.
This equation holds for both cases: counterTics == (256+x-y) mod 256
For uintN, where N is the number of bits:
counterTics == ((2^N)+x-y) mod (2^N)
The last equation also describes the result in C when subtracting unsigned int's, in general.
This is not to say the compiler or processor uses that equation when subtracting unsigned int's.
RATIONALE:
The explanation is consistent with what is described in this ACM paper:
"Understanding Integer Overflow in C/C++", by Dietz, et al.
HARDWARE INTEGER ARITHMETIC
When an n-bit addition or subtraction operation on unsigned or two’s complement integers overflows, the result “wraps around,” effectively subtracting 2^n from, or adding 2^n to, the true mathematical result. Equivalently, the result can be considered to occupy n+1 bits; the lower n bits are placed into the result register and the highest-order bit is placed into the processor’s carry flag.
INTEGER ARITHMETIC IN C AND C++
3.3. Unsigned Overflow
A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
Thus, the semantics for unsigned overflow in C/C++ are precisely the same as the semantics of processor-level unsigned overflow as described in Section 2. As shown in Table I, UINT_MAX+1 must evaluate to zero in a conforming C and C++ implementation.
Also, it's easy to write a C program to test that the cases shown work as described.
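One possible shape for such a test, with function names invented for this sketch. ticks_between covers the 8-bit counter from the description, and elapsed_ticks covers the full-width unsigned int case from the question:

```c
#include <stdint.h>

/* Ticks from y (earlier sample) to x (later sample) of an 8-bit counter.
 * x - y is computed in int after promotion; the conversion back to
 * uint8_t reduces it modulo 256. */
static uint8_t ticks_between(uint8_t y, uint8_t x) {
    return (uint8_t)(x - y);
}

/* Same idea at full width: unsigned subtraction wraps modulo UINT_MAX + 1,
 * so no wider type and no explicit correction term is needed. */
static unsigned elapsed_ticks(unsigned earlier, unsigned later) {
    return later - earlier;
}
```

With y = 0xFE and x = 0x02 the 8-bit version returns 4, matching the question's example, provided the counter advanced fewer than 256 times between the two samples.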

Resources