Right shift (Division) -> ROUND TOWARD ZERO - c

I am doing this:
value >> 3;
It always rounds toward negative infinity. How do I round toward zero with right-shift division?

Geez, the answers were pretty bad; you want to solve this without branching, but without breaking your positive numbers either.
Here it is:
(value + (int)((((unsigned)value) >> 31) * 7)) >> 3
The cast to (unsigned) is required to perform a logical shift and obtain just the sign bit (0 or 1). Multiplying that bit by 7, the divisor minus one, gives the bias that negative values need; the sum is then shifted as an (int) so that the final shift is an arithmetic right shift.
The code above assumes that your int data type is 32 bits wide; you should of course use fixed-width types such as int32_t in such cases.
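As a minimal sketch of the branch-free idea with fixed-width types: the name div8_toward_zero is made up for illustration, and the code assumes that >> on a negative int32_t is an arithmetic shift, which C leaves implementation-defined. Negative inputs get a bias of 7 (the divisor minus one) before the shift.

```c
#include <stdint.h>

/* Hypothetical helper: divide by 8, rounding toward zero, without branching.
   Assumes >> on a negative int32_t is an arithmetic shift (implementation-defined). */
int32_t div8_toward_zero(int32_t value)
{
    /* A logical shift of the unsigned copy isolates the sign bit: 0 or 1. */
    uint32_t sign = (uint32_t)value >> 31;
    /* The bias is 7 for negative inputs and 0 otherwise. */
    int32_t bias = (int32_t)(sign * 7u);
    return (value + bias) >> 3;
}
```

Under those assumptions, div8_toward_zero(-15) gives -1, whereas a plain -15 >> 3 gives -2.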

Do something conditionally, depending on whether your value is positive or negative.
if( value < 0 ) {
value = -((-value) >> 3);
}
else {
value = value >> 3;
}

Try the following expression instead:
(value < 0) ? -((-value) >> 3) : value >> 3;
That forces a negative number to be positive first so that the shift rounds toward zero, then changes the result back to negative.
This may cause issues for the minimum integer under two's complement notation (not ones' complement or sign/magnitude) but you could put a separate check in to catch that first.
Or (and this is probably preferable) you could just stop trying to divide by eight with a right shift altogether, instead choosing:
value = value / 8;
Then let your compiler choose the best way of doing that. You should be coding to specify intent rather than trying to optimise (needlessly, unless you have a truly brain-dead compiler).
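The "separate check" for the minimum integer mentioned above could look roughly like this. It is only a sketch: shift_div8_checked is a hypothetical name, and the INT_MIN case simply falls back to ordinary division, which is safe because only the negation overflows.

```c
#include <limits.h>

/* Hypothetical helper: shift-based divide by 8 that rounds toward zero.
   INT_MIN is handled first because -INT_MIN overflows in two's complement. */
int shift_div8_checked(int value)
{
    if (value == INT_MIN)
        return INT_MIN / 8;   /* plain division is fine; only -value is the problem */
    return (value < 0) ? -((-value) >> 3) : value >> 3;
}
```

For every input this should agree with value / 8, which is the simpler thing to write in the first place.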

I do this:
(value + 4) >> 3

Add 7 to the number if it is negative in order to round to zero.
This is how you do that:
int mask = x >> 31;
x = (x + (7 & mask)) >> 3;

You are encountering 'signed' shifting, when what you seem to want is unsigned shifting. Try casting it to unsigned first, like this
x = ((unsigned) x) >> 3;
.. or you could just use division.

Related

Divide a signed integer by a power of 2

I'm working on a way to divide a signed integer by a power of 2 using only binary operators (<< >> + ^ ~ & | !), and the result has to be rounded toward 0. I came across a solution to this problem, also on Stack Overflow, but I cannot understand why it works. Here's the solution:
int divideByPowerOf2(int x, int n)
{
return (x + ((x >> 31) & ((1 << n) + ~0))) >> n;
}
I understand the x >> 31 part (only add the next part if x is negative, because if it's positive x will automatically be rounded toward 0). But what's bothering me is the (1 << n) + ~0 part. How can it work?
Assuming two's complement, just bit-shifting the dividend is equivalent to a certain kind of division: not the conventional division, where we round the dividend to the next multiple of the divisor toward zero, but another kind, where we round the dividend toward negative infinity. I rediscovered that in Smalltalk; see http://smallissimo.blogspot.fr/2015/03/is-bitshift-equivalent-to-division-in.html.
For example, let's divide -126 by 8. Traditionally, we would write:
-126 = -15 * 8 - 6
But if we round toward negative infinity, we get a positive remainder and write:
-126 = -16 * 8 + 2
The bit shift performs the second operation, in terms of bit patterns (assuming an 8-bit int for the sake of brevity):
1000|0010 >> 3 = 1111|0000
1000|0010 = 1111|0000 * 0000|1000 + 0000|0010
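The two decompositions of -126 can be reproduced with a small sketch. The struct and function names are illustrative only, and floor_div8 assumes that >> on a negative int is an arithmetic shift.

```c
/* Quotient/remainder pair; the names here are illustrative only. */
struct qr { int q; int r; };

/* C's / and % truncate toward zero (guaranteed since C99). */
struct qr trunc_div8(int x)
{
    struct qr out = { x / 8, x % 8 };
    return out;
}

/* Shift-based division; assumes >> on a negative int is arithmetic,
   so the quotient is rounded toward negative infinity. */
struct qr floor_div8(int x)
{
    int q = x >> 3;
    struct qr out = { q, x - q * 8 };   /* remainder is always 0..7 */
    return out;
}
```

With x = -126, trunc_div8 yields quotient -15 and remainder -6, while floor_div8 yields quotient -16 and remainder 2.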
So what if we want the traditional division with quotient rounded toward zero and remainder of same sign as dividend? Simple, we just have to add 1 to the quotient - if and only if the dividend is negative and the division is inexact.
You saw that x>>31 corresponds to first condition, dividend is negative, assuming int has 32 bits.
The second term corresponds to the second condition, if division is inexact.
See how -1, -2, -4, ... are encoded in two's complement: 1111|1111, 1111|1110, 1111|1100. So the negation of the nth power of two has n trailing zeros.
When the dividend has n trailing zeros and we divide by 2^n, then no need to add 1 to final quotient. In any other case, we need to add 1.
What ((1 << n) + ~0) is doing is creating a mask with n trailing ones.
The n last bits don't really matter, because we are going to shift to the right and just throw them away. So, if the division is exact, the n trailing bits of dividend are zero, and we just add n 1s that will be skipped. On the contrary, if the division is inexact, then one or more of the n trailing bits of the dividend is 1, and we are sure to cause a carry to the n+1 bit position: that's how we add 1 to the quotient (we add 2^n to the dividend). Does that explain it a bit more?
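To make the mask-and-carry argument concrete, here is a sketch that builds the mask in unsigned arithmetic so that no 1 is ever shifted into the sign bit. The name div_pow2_toward_zero is hypothetical; the code assumes two's complement, an arithmetic >> on negative int, and n between 0 and the width of int minus 2.

```c
#include <limits.h>

/* Hypothetical variant of divideByPowerOf2 with an unsigned mask.
   Assumes two's complement, arithmetic >> on negative int,
   and 0 <= n <= (width of int) - 2. */
int div_pow2_toward_zero(int x, unsigned n)
{
    unsigned mask = (1u << n) - 1u;                      /* n trailing ones */
    int sign = x >> (int)(sizeof(int) * CHAR_BIT - 1);   /* 0, or -1 (all ones) */
    int bias = (int)(mask & (unsigned)sign);             /* mask if x < 0, else 0 */
    return (x + bias) >> n;   /* the carry out of the low n bits adds 1 to the quotient */
}
```

For x = -126 and n = 3 the bias is 7, so the shift sees -119 and returns -15; for x = -128 the low bits are all zero, no carry occurs, and the result is -16.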
This is "write-only code": instead of trying to understand the code, try to create it by yourself.
For example, let's divide a number by 8 (shift right by 3).
If the number is negative, the normal right-shift rounds in the wrong direction. Let's "fix" it by adding a number:
int divideBy8(int x)
{
if (x >= 0)
return x >> 3;
else
return (x + whatever) >> 3;
}
Here you can come up with a mathematical formula for whatever, or do some trial and error. Anyway, here whatever = 7:
int divideBy8(int x)
{
if (x >= 0)
return x >> 3;
else
return (x + 7) >> 3;
}
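The trial-and-error route can itself be sketched as a brute-force search. find_bias8 is a made-up name; it compares the shifted result against C's truncating division over a small range of negative inputs, again assuming an arithmetic right shift.

```c
/* Hypothetical search: find the smallest w in 0..7 such that
   (x + w) >> 3 equals x / 8 for every negative x in a small test range.
   Assumes >> on a negative int is an arithmetic shift. */
int find_bias8(void)
{
    for (int w = 0; w < 8; w++) {
        int ok = 1;
        for (int x = -64; x < 0; x++) {
            if (((x + w) >> 3) != x / 8) {
                ok = 0;
                break;
            }
        }
        if (ok)
            return w;
    }
    return -1;   /* no candidate worked */
}
```

Under that assumption the search settles on 7: every smaller candidate already fails for some x between -7 and -1.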
How to unify the two cases? You need to make an expression that looks like this:
(x + stuff) >> 3
where stuff is 7 for negative x, and 0 for positive x. The trick here is using x >> 31, which is a 32-bit number whose bits are equal to the sign-bit of x: all 0 or all 1. So stuff is
(x >> 31) & 7
Combining all these, and replacing 8 and 7 by the more general power of 2, you get the code you asked about.
Note: in the description above, I assume that int represents a 32-bit hardware register, and hardware uses two's complement representation to do right shift.
OP's reference is C# code, and there are many subtle differences that make it bad code in C, which is how this post is tagged.
int is not necessarily 32 bits wide, so hard-coding 31 (that is, assuming 32 bits) does not make for a robust solution.
In particular, (1 << n) + ~0 results in undefined behavior when n causes a bit to be shifted into the sign position. Not good coding.
Restricting code to only the "binary" operators << >> + ^ ~ & | ! encourages a coder to assume things about int that are neither portable nor compliant with the C spec. So OP's posted code does not "work" in general, although it may work in many common implementations.
OP's code fails when int is not two's complement, when it does not use the range [-2147483648 .. 2147483647], or when 1 << n does not behave as expected.
// weak code
int divideByPowerOf2(int x, int n) {
return (x + ((x >> 31) & ((1 << n) + ~0))) >> n;
}
A simple alternative follows, assuming the range of long long exceeds that of int. I doubt this meets every corner of OP's goals, but OP's stated goals encourage non-robust coding.
int divideByPowerOf2(int x, int n) {
long long ill = x;
if (x < 0) ill = -ill;
while (n--) ill >>= 1;
if (x < 0) ill = -ill;
return (int) ill;
}

How to find the nth bit of an integer in C

I've got an assignment where I need to convert from an 8 bit sign magnitude number to two's complement and then add those two numbers. I've got a relatively good idea as to how to do this, however I can't work out how to find the eighth bit of an integer such that I can tell what sign the number has.
The overall idea is this: if the sign bit is 0, just return the number, as it is already in two's complement. If it is 1, I want to set it to 0, then invert all bits with the ~ operator and add 1.
Thanks in advance
You can check if the high bit is set by creating a mask that has just that bit set and using a logical AND to see if the result is non-zero.
Once you know the high bit is set, you can convert to twos complement by flipping all bits and adding one.
uint8_t x = (some value)
if (x & (1 << 7)) {
printf("sign bit set\n");
x = (uint8_t)((~(x & (0x7F))) & 0xFF) + 1;
printf("converted value: %02X\n", x);
}
Then you can add this number to any other normally.
This assumes that your computer/compiler uses two's complement (almost certainly the case) and that you want the result to be in two's complement.
Use a uint8_t to hold the sign-and-magnitude number.
To check if a bit is set, use the bitwise AND operator &, together with a bit mask corresponding to the msb. To get a bit mask corresponding to bit n, left shift the value 1 n times. In C code:
#define SIGN (1 << 7)
uint8_t sm = ...;
if(sm & SIGN) // if non-zero, then the SIGN bit is set
{
}
else // it was zero, the SIGN bit is not set
{
}
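The same masking idea works for any bit position. A minimal sketch, with bit_is_set as a hypothetical helper name and n counted from 0 at the least significant bit:

```c
#include <stdint.h>

/* Hypothetical helper: returns 1 if bit n of value is set, else 0.
   n counts from 0 (least significant bit) and must be below 32. */
int bit_is_set(uint32_t value, unsigned n)
{
    uint32_t mask = UINT32_C(1) << n;   /* only bit n is set in the mask */
    return (value & mask) != 0;
}
```

The sign bit of an 8-bit sign-and-magnitude number is bit 7, so the check above corresponds to bit_is_set(sm, 7).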
To do the actual conversion, there are several ways. I simply would mask out and copy the relevant parts of the number, again with bitwise AND:
#define MAGNITUDE 0x7F
int8_t magnitude = sm & MAGNITUDE; // variable magnitude is two's compl.
EDIT complete solution (since someone already posted one):
#define SIGN (1 << 7)
#define MAGNITUDE 0x7F
uint8_t sm = ...;
int8_t twos_compl = sm & MAGNITUDE;
if(sm & SIGN) // if non-zero, then the SIGN bit is set
{
twos_compl = -twos_compl;
}
int8_t x = ...; // some other number in two's complement
int16_t result = twos_compl + x;
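Wrapped into functions, the conversion and the addition might look like the sketch below. sm_to_twos and sm_add are hypothetical names, not part of the original answer.

```c
#include <stdint.h>

/* Hypothetical helper: convert an 8-bit sign-and-magnitude value
   to an ordinary (two's complement) int8_t. */
int8_t sm_to_twos(uint8_t sm)
{
    int8_t magnitude = (int8_t)(sm & 0x7F);   /* low 7 bits, always 0..127 */
    return (sm & 0x80) ? (int8_t)-magnitude : magnitude;
}

/* Add two sign-and-magnitude inputs; int16_t can hold every possible sum. */
int16_t sm_add(uint8_t a, uint8_t b)
{
    return (int16_t)(sm_to_twos(a) + sm_to_twos(b));
}
```

For instance, 0x85 (sign bit set, magnitude 5) converts to -5, and the "negative zero" pattern 0x80 converts to 0.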
As a side note, be very careful when mixing the ~ operator with small integer types, because it performs an implicit integer promotion. For example, with uint8_t x = 1, the expression ~x gives you 0xFFFFFFFE (on a 32-bit system) and not 0xFE as you might expect.
For the above task, there is no need to use ~ at all.
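The promotion pitfall can be seen in a two-function sketch (both names are made up for illustration): the first returns what ~ actually produces after promotion to int, the second casts the result back down to 8 bits.

```c
#include <stdint.h>

/* ~ is applied after x has been promoted to int, so the result is a
   (negative) int, not an 8-bit value. */
int promoted_not(uint8_t x)
{
    return ~x;
}

/* Casting back to uint8_t keeps only the low 8 bits. */
uint8_t narrow_not(uint8_t x)
{
    return (uint8_t)~x;
}
```

With x = 1, promoted_not returns -2 (bit pattern 0xFFFFFFFE when int is 32 bits), while narrow_not returns 0xFE.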

How to sign extend a 9-bit value when converting from an 8-bit value?

I'm implementing a relative branching function in my simple VM.
Basically, I'm given an 8-bit relative value. I then shift this left by 1 bit to make it a 9-bit value. So, for instance, if you were to say "branch +127" this would really mean, 127 instructions, and thus would add 256 to the IP.
My current code looks like this:
uint8_t argument = 0xFF; //-1 or whatever
int16_t difference = argument << 1;
*ip += difference; //ip is a uint16_t
I don't believe difference will ever be detected as less than 0 with this, however. I'm rusty on how signed-to-unsigned conversion works. Beyond that, I'm not sure the difference would be correctly subtracted from IP in the case where argument is, say, -1 or -2.
Basically, I'm wanting something that would satisfy these "tests"
//case 1
argument = -5
difference -> -10
ip = 20 -> 10 //ip starts at 20, but becomes 10 after applying difference
//case 2
argument = 127 (must fit in a byte)
difference -> 254
ip = 20 -> 274
Hopefully that makes it a bit more clear.
Anyway, how would I do this cheaply? I saw one "solution" to a similar problem, but it involved division. I'm working with slow embedded processors (assumed to be without efficient ways to multiply and divide), so that's a pretty big thing I'd like to avoid.
To clarify: you worry that left shifting a negative 8 bit number will make it appear like a positive nine bit number? Just pad the top 9 bits with the sign bit of the initial number before left shift:
diff = 0xFF;
int16 diff16=(diff + (diff & 0x80)*0x01FE) << 1;
Now your diff16 is signed 2*diff
As was pointed out by Richard J Ross III, you can avoid the multiplication (if that's expensive on your platform) with a conditional branch:
int16 diff16 = (diff + ((diff & 0x80)?0xFF00:0))<<1;
If you are worried about things staying in range and such ("undefined behavior"), you can do
int16 diff16 = diff;
diff16 = (diff16 | ((diff16 & 0x80)?0x7F00:0))<<1;
At no point does this produce numbers that are going out of range.
The cleanest solution, though, seems to be "cast and shift":
diff16 = (signed char)diff; // recognizes and preserves the sign of diff
diff16 = (short int)(((unsigned short)diff16)<<1); // left shift, preserving sign
This produces the expected result because the compiler automatically takes care of the sign bit in the first line (so there is no need for the mask); in the second line, it does a left shift on an unsigned value (for which the result is well defined per the standard), and the final cast back to short int ensures that the number is correctly interpreted as negative. I believe that in this form the construct is never "undefined".
All of my quotes come from the C standard, section 6.3.1.3. Unsigned to signed is well defined when the value is within range of the signed type:
1 When a value with integer type is converted to another integer type
other than _Bool, if the value can be represented by the new type, it
is unchanged.
Signed to unsigned is well defined:
2 Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
Unsigned to signed, when the value lies out of range isn't too well defined:
3 Otherwise, the new type is signed and the value cannot be
represented in it; either the result is implementation-defined or an
implementation-defined signal is raised.
Unfortunately, your question lies in the realm of point 3. C doesn't guarantee any implicit mechanism to convert out-of-range values, so you'll need to explicitly provide one. The first step is to decide which representation you intend to use: ones' complement, two's complement or sign and magnitude.
The representation you use will affect the translation algorithm you use. In the example below, I'll use two's complement: If the sign bit is 1 and the value bits are all 0, this corresponds to your lowest value. Your lowest value is another choice you must make: In the case of two's complement, it'd make sense to use either of INT16_MIN (-32768) or INT8_MIN (-128). In the case of the other two, it'd make sense to use INT16_MIN - 1 or INT8_MIN - 1 due to the presence of negative zeros, which should probably be translated to be indistinguishable from regular zeros. In this example, I'll use INT8_MIN, since it makes sense that (uint8_t) -1 should translate to -1 as an int16_t.
Separate the sign bit from the value bits. The value should be the absolute value, except in the case of a two's complement minimum value when sign will be 1 and the value will be 0. Of course, the sign bit can be wherever you like it to be, though it's conventional for it to rest at the far left hand side. Hence, shifting right 7 places obtains the conventional "sign" bit:
uint8_t sign = input >> 7;
uint8_t value = input & (UINT8_MAX >> 1);
int16_t result;
If the sign bit is 1, we'll call this a negative number and add to INT8_MIN to construct the sign so we don't end up in the same conundrum we started with, or worse: undefined behaviour (which is the fate of one of the other answers).
if (sign == 1) {
result = INT8_MIN + value;
}
else {
result = value;
}
This can be shortened to:
int16_t result = (input >> 7) ? INT8_MIN + (input & (UINT8_MAX >> 1)) : input;
... or, better yet:
int16_t result = input <= INT8_MAX ? input
: INT8_MIN + (int8_t)(input % (uint8_t) INT8_MIN);
The sign test now involves checking if it's in the positive range. If it is, the value remains unchanged. Otherwise, we use addition and modulo to produce the correct negative value. This is fairly consistent with the C standard's language above. It works well for two's complement, because int16_t and int8_t are guaranteed to use a two's complement representation internally. However, types like int aren't required to use a two's complement representation internally. When converting unsigned int to int for example, there needs to be another check, so that we're treating values less than or equal to INT_MAX as positive, and values greater than or equal to (unsigned int) INT_MIN as negative. Any other values need to be handled as errors; In this case I treat them as zeros.
/* Generate some random input */
srand(time(NULL));
unsigned int input = rand();
for (unsigned int x = UINT_MAX / ((unsigned int) RAND_MAX + 1); x > 1; x--) {
input *= (unsigned int) RAND_MAX + 1;
input += rand();
}
int result = /* Handle positives: */ input <= INT_MAX ? input
: /* Handle negatives: */ input >= (unsigned int) INT_MIN ? INT_MIN + (int)(input % (unsigned int) INT_MIN)
: /* Handle errors: */ 0;
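For the 8-bit case in the question, the same idea fits in a small function. This is a sketch with a hypothetical name, u8_to_i16; it never performs an out-of-range conversion, because the negative half is built by adding to INT8_MIN.

```c
#include <stdint.h>

/* Hypothetical helper: interpret an 8-bit two's complement pattern held
   in a uint8_t as a signed value, avoiding out-of-range conversions. */
int16_t u8_to_i16(uint8_t input)
{
    if (input <= INT8_MAX)
        return (int16_t)input;                    /* 0..127 stay unchanged */
    return (int16_t)(INT8_MIN + (input % 128));   /* 128..255 map to -128..-1 */
}
```

So 0xFF maps to -1 and 0x80 maps to -128, while 0x7F stays 127.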
If the offset is in two's complement representation, then convert this:
uint8_t argument = 0xFF; //-1
int16_t difference = argument << 1;
*ip += difference;
into this:
uint8_t argument = 0xFF; //-1
int8_t signed_argument;
signed_argument = argument; // this relies on implementation-defined
// conversion of unsigned to signed, usually it's
// just a bit-wise copy on 2's complement systems
// OR
// memcpy(&signed_argument, &argument, sizeof argument);
*ip += signed_argument + signed_argument;
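The two test cases from the question can be checked with a sketch like the following. apply_branch is a hypothetical helper; it sign-extends by subtracting 256 rather than relying on an implementation-defined conversion, and it doubles the offset by adding it twice, so no multiply or divide is involved.

```c
#include <stdint.h>

/* Hypothetical helper: apply an 8-bit relative branch, counted in 2-byte
   instructions, to a 16-bit instruction pointer. */
uint16_t apply_branch(uint16_t ip, uint8_t argument)
{
    /* Sign-extend by hand: 128..255 become -128..-1. */
    int offset = (argument & 0x80) ? (int)argument - 256 : (int)argument;
    /* offset + offset doubles without a multiply; the conversion to
       uint16_t wraps modulo 65536, which is well defined. */
    return (uint16_t)(ip + offset + offset);
}
```

With ip = 20, an argument of 0xFB (-5) gives 10 and an argument of 127 gives 274, matching the cases listed in the question.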

C - Need to shift an unsigned int right one place at a time.

I need to shift an unsigned int to the right more than 32 times and still get a proper answer of zero instead of the random or original number. E.g 8 >> 40 should = 0 but it returns a random number.
I understand a loop that shifts one place right at a time would solve this problem as it would fill in zeros as it went. However my current code for this doesn't work for some reason. What am I doing wrong?
unsigned int shiftR(unsigned int a, unsigned int b) {
unsigned int i=0;
while (i < b) {
a >> 1;
i++;
}
return a;
}
This gives me a compile warning that it has no effect ( a >> 1;). How come?
Thanks!
You want to use a >>= 1; or a = a >> 1;. This is because a >> 1 shifts a to the right once and returns the result; it doesn't assign the result to a.
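With that fix applied, the loop version from the question could be sketched as below. The function is renamed shift_right_loop here, and the early exit once a reaches 0 is an addition of mine, not part of the original code.

```c
/* Shift a right by b places, one place at a time.
   Each single-place shift is well defined, so b may exceed the bit width. */
unsigned int shift_right_loop(unsigned int a, unsigned int b)
{
    /* Stop early once a is 0: further shifts cannot change it. */
    for (unsigned int i = 0; i < b && a != 0; i++)
        a >>= 1;
    return a;
}
```

shift_right_loop(8, 40) returns 0, because the value is already 0 after four shifts.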
I need to shift an unsigned int to the right more than 32 times and still get a proper answer of zero instead of the random or original number.
... Then do that?
unsigned int shiftR(unsigned int a, unsigned int b) {
return (b >= 32) ? 0 : a >> b;
}
Why complicate things?
As far as I remember C, you need to say a = a >> 1.
As others noted, you never re-assigned a a new value; the result of that statement was not used for anything, so the compiler strips it out. a>>=1 is what you wanted.
I'd like to add though that if you want your unsigned int to be 32-bit, then force it. Use the C99 stdint.h library and make it uint32_t - nice and unambiguous.
a loop that shifts one place right at a time
You must change a >> 1; to a >>= 1;.
E.g 8 >> 40 should = 0 but it returns a random number.
In C, it is undefined behavior to left or right shift an integer by a number of places greater than or equal to the bit width of the integer type[0]. Stepping on undefined behavior is very bad, because nothing could happen, something bad could happen immediately, something bad could happen at an unknown point in the future, or something bad could happen on a different platform or compiler.
The correct way to deal with this is to manually check whether you're shifting by 32 or more places, and then manually give a result of 0.
[0]: http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html . You should read the whole page, but the specific section is "Oversized Shift Amounts".
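A version of that manual check that does not hard-code 32 might look like this. It is a sketch: shift_right_safe is a hypothetical name, and it assumes unsigned int has no padding bits, so that its width is sizeof times CHAR_BIT.

```c
#include <limits.h>

/* Hypothetical helper: right shift that yields 0 for oversized shift counts.
   Assumes unsigned int has no padding bits. */
unsigned int shift_right_safe(unsigned int a, unsigned int b)
{
    const unsigned int width = (unsigned int)(sizeof a * CHAR_BIT);
    return (b >= width) ? 0u : a >> b;   /* never shifts by width or more */
}
```

shift_right_safe(8, 40) returns 0, and ordinary shift counts behave exactly like a >> b.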
a + 1 adds 1 to a but does not store the result anywhere, so the value of a is unmodified. To update a with the incremented value, you have to write a = a + 1 or a += 1. Similarly, to shift the integer value in a by 1 and then store the shifted value in a, you need to write a = a >> 1 or a >>= 1.
Because a >> 1 on its own does not modify the value of a, the compiler appropriately warns you that this statement has no effect: keeping the statement or removing it makes no difference, as it does not modify anything.
In your case you are shifting the value of a b times, so you can simply use a >>= b instead of iterating in a loop, as long as b is smaller than the bit width of a.

How can I check if a signed integer is positive?

Using bitwise operators and I suppose addition and subtraction, how can I check if a signed integer is positive (specifically, not negative and not zero)? I'm sure the answer to this is very simple, but it's just not coming to me.
If you really want an "is strictly positive" predicate for int n without using conditionals (assuming 2's complement):
-n will have the sign (top) bit set if n was strictly positive, and clear in all other cases except n == INT_MIN;
~n will have the sign bit set if n was strictly positive, or 0, and clear in all other cases including n == INT_MIN;
...so -n & ~n will have the sign bit set if n was strictly positive, and clear in all other cases.
Apply an unsigned shift to turn this into a 0 / 1 answer:
int strictly_positive = (unsigned)(-n & ~n) >> ((sizeof(int) * CHAR_BIT) - 1);
EDIT: as caf points out in the comments, -n causes an overflow when n == INT_MIN (still assuming 2's complement). The C standard allows the program to fail in this case (for example, you can enable traps for signed overflow using GCC with the -ftrapv option). Casting n to unsigned fixes the problem (unsigned arithmetic does not cause overflows). So an improvement would be:
unsigned u = (unsigned)n;
int strictly_positive = (-u & ~u) >> ((sizeof(int) * CHAR_BIT) - 1);
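Wrapped into a function, the improved expression can be tried on the edge cases. This is a sketch under the answer's own assumption of two's complement; the function name is chosen for illustration.

```c
#include <limits.h>

/* Returns 1 if n > 0 and 0 otherwise, without comparisons.
   Assumes a two's complement int with no padding bits. */
int strictly_positive(int n)
{
    unsigned u = (unsigned)n;   /* unsigned negation cannot overflow */
    /* -u has the top bit set for positive n; ~u has it set for n >= 0.
       The AND keeps the top bit only when n is strictly positive. */
    return (int)((-u & ~u) >> (sizeof(int) * CHAR_BIT - 1));
}
```

It returns 1 for 5 and INT_MAX, and 0 for 0, -5 and INT_MIN.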
Check the most significant bit. 0 is positive, 1 is negative.
If you can't use the obvious comparison operators, then you have to work harder:
int i = anyValue;
if (i && !(i & (1U << (sizeof(int) * CHAR_BIT - 1))))
/* I'm almost positive it is positive */
The first term checks that the value is not zero; the second checks that the value does not have the leading bit set. That should work for 2's-complement, 1's-complement or sign-magnitude integers.
Consider how the signedness is represented. Often it's done with two's-complement or with a simple sign bit - I think both of these could be checked with a simple logical and.
Check that is not 0 and the most significant bit is 0, something like:
int positive(int x) {
return x && !(x & 0x80000000);
}

Resources