Bit Operations - Indicating the Sign of a Signed Integer - c

Why does the following C code not work for returning -1 for negative numbers, 0 for 0s, and 1 for positive numbers?
(((x >> 31) ^ -1) + 1) | (!x ^ 1);
Specifically, when I pass in negative numbers, it returns 1. It seems like if I have a negative number, though (i.e., the least significant bit is a 1 after the 31-bit shift), XORing it with -1 will give me -2 (i.e., all 1s and a 0 in the least significant bit location), and adding 1 would make it -1.

According to the C99 standard, the result of x >> n if x is negative is implementation defined. So the reason you are having a problem depends on your compiler and architecture.
However, it's most likely that x is sign-extended when you shift it, i.e. the top bit is repeated to keep the sign the same as the operand. This is what happens with my compiler. So for any negative number, x >> 31 is -1, which makes ((x >> 31) ^ -1) + 1 evaluate to 1. Also, for any non-zero number !x is 0 (i.e. false). This applies assuming x is a 32-bit integer. If you make x an unsigned int, it should work, but consider the following alternative:
(x < 0) ? -1 : ((x > 0) ? 1 : 0)
which I think is a bit less cryptic.
And here is a program that you can use to see what your expression is doing
#include <stdio.h>
#define EVALUATE(x) printf("%s = %d\n", #x, x)
int main(int argc, char** argv)
{
    unsigned int x = 51;
    EVALUATE(x >> 31);
    EVALUATE(((x >> 31) ^ -1));
    EVALUATE(((x >> 31) ^ -1) + 1);
    EVALUATE(!x);
    EVALUATE(!x ^ 1);
    EVALUATE((((x >> 31) ^ -1) + 1) | (!x ^ 1));
    return 0;
}

>> will generally do arithmetic shift on signed data, so ((-1) >> 31) == (-1), contrary to your assumption. As pointed out by others, this is not guaranteed by the standard, but it is most likely true on modern systems. In any case, be careful with this type of bit twiddling. If portability is a concern or speed is not, you should do it a different way. See Is there a standard sign function (signum, sgn) in C/C++? for some ideas.
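For reference, the usual portable signum avoids the implementation-defined shift entirely. A minimal sketch (not taken from the question or the linked post; the function name is mine):
int sign(int x)
{
    return (x > 0) - (x < 0);   /* each comparison yields 0 or 1, so the result is -1, 0, or 1 */
}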

Related

Constant time string equality test return value

Looking for a constant time string equality test I found that most of them use bit trickery on the return value. For example this piece of code:
int ctiszero(const void* x, size_t n)
{
    volatile unsigned char r = 0;
    for (size_t i = 0; i < n; i += 1) {
        r |= ((unsigned char*)x)[i];
    }
    return 1 & ((r - 1) >> 8);
}
What is the purpose of return 1 & ((r - 1) >> 8);? Why not a simple return !r;?
As mentioned in one of my comments, this function checks if an array of arbitrary bytes is zero or not. If all bytes are zero then 1 will be returned, otherwise 0 will be returned.
If there is at least one non-zero byte, then r will be non-zero as well. Subtract 1 and you get a value that is zero or positive (r is promoted to int and was at most 255). Shift right by 8 and all of those bits are shifted out, leaving zero, which masked with 1 is still zero. Zero is returned.
If all the bytes are zero, then the value of r will be zero as well. But here comes the "magic": In the expression r - 1 the value of r undergoes what is called the usual arithmetic conversions, which promote r to an int. The value is still zero, but now it's a signed integer. Subtract 1 and you have -1, which with the usual two's complement notation is 0xffffffff. Shift it right by 8 (whether that is an arithmetic shift giving 0xffffffff again, or a logical shift giving 0x00ffffff, is implementation-defined) and masking with 1 results in 1. Which is returned.
With constant-time code, code that may branch (and thereby incur run-time timing differences), like return !r;, is typically avoided.
Note that a well optimized compiler may emit the exact same code for return 1 & ((r - 1) >> 8); as for return !r;. This exercise is therefore, at best, code to coax the compiler into emitting constant time code.
What about uncommon platforms?
return 1 & ((r - 1) >> 8); is well explained by @Some programmer dude's good answer when unsigned char is 8 bits and int uses 2's complement - something that is very common.
With 8-bit unsigned char, and r > 0, r-1 is non-negative and 1 & ((r - 1) >> 8) returns 0 even if int is 2's complement, 1's complement or sign-magnitude, 16-bit, 32-bit etc.
When r == 0, r-1 is -1. What 1 & ((r - 1) >> 8) returns is then implementation-defined behavior. It returns 1 with int as 2's complement or 1's complement, but 0 with sign-magnitude.
// fails with sign-magnitude (rare)
// fails when byte width > 8 (uncommon)
return 1 & ((r - 1) >> 8);
Small changes can fix the code to work as desired in more cases1. Also see @Eric Postpischil's comments.
By ensuring r - 1 is done using unsigned math, the int encoding is irrelevant.
// v--- add u v--- shift by byte width
return 1 & ((r - 1u) >> CHAR_BIT);
1 Somewhat rare: when unsigned char is the same size as unsigned, OP's code and this fix fail. If a wider integer type is available, the code could use that, e.g.: return 1 & ((r - 1LLU) >> CHAR_BIT);
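Putting the fix into context, here is a minimal compilable sketch (the function name, the test data and main are mine, not from the original answers):
#include <limits.h>
#include <stddef.h>
#include <stdio.h>

static int ctiszero_portable(const void* x, size_t n)
{
    volatile unsigned char r = 0;
    for (size_t i = 0; i < n; i += 1) {
        r |= ((const unsigned char*)x)[i];
    }
    /* 1u forces unsigned arithmetic; shifting by CHAR_BIT discards every value bit of r */
    return 1 & ((r - 1u) >> CHAR_BIT);
}

int main(void)
{
    unsigned char zeros[4] = {0, 0, 0, 0};
    unsigned char mixed[4] = {0, 0x80, 0, 0};
    printf("%d %d\n", ctiszero_portable(zeros, 4), ctiszero_portable(mixed, 4)); /* prints "1 0" */
    return 0;
}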
That's shorthand for r > 128 or zero. Which is to say, it's a non-ASCII character. If r's high bit is set subtracting 1 from it will leave the high bit set unless the high bit is the only bit set. Thus greater than 128 (0x80) and if r is zero, underflow will set the high bit.
The result of the for loop then is that if any bytes have the high bit set, or if all of the bytes are zero, 1 will be returned. But if all the non-zero bytes do not have the high bit set 0 will be returned.
Oddly, for a string of all 0x80 and 0x00 bytes 0 will still be returned. Not sure if that's a "feature" or not!

Is there any way to get around this compiler optimization in C?

I want to note that, as pointed out by Olaf, the compiler is not at fault.
Disclaimer: I'm not entirely sure this behavior is due to compiler optimization.
Anyways, in C I'm trying to determine whether the n-th bit (n should be between 0 and 7, inclusive) of an 8-bit byte is 1 or 0. I initially came up with this solution:
#include <stdint.h>
#include <stdbool.h>
bool one_or_zero( uint8_t t, uint8_t n ) // t is some byte, n signifies which bit
{
    return (t << (n - (n % 8) - 1)) >> 7;
}
Which, from my previous understanding, would do the following to a byte:
Suppose t = 5 and n = 2. Then the byte t can be represented as 0000 0101. I assumed that (t << (n - (n % 8) - 1)) would shift the bits of t so that t is 1010 0000. This assumption is only somewhat correct. I also assumed the next bit shift (>> 7) would shift the bits of t so that t is 0000 0001. This assumption is also only somewhat correct.
TL;DR: I thought the line return (t << (n - (n % 8) - 1)) >> 7; did this:
t is 0000 0101
The first bit shift occurs; t is now 1010 0000
The second bit shift occurs; t is now 0000 0001
t is returned as 0000 0001
Although I intend for that to happen, it does not. Instead, I have to write the following, to get my intended results:
bool one_or_zero( uint8_t t, uint8_t n ) // t is some byte, n signifies which bit
{
    uint8_t val = (t << (n - (n % 8) - 1));
    return val >> 7;
}
I know that adding uint8_t val isn't a massive performance drain. Still, I'd like to know two things:
Do I have to initialize another variable to do what I intend?
Why doesn't the one-liner do the same thing as the two-liner?
I'm under the impression that when the compiler optimizes my code, it smashes the two bit shifts together so only one occurs. This seems like a nice thing, but it doesn't "clear" the other bits as intended.
That code is very complicated just to check a bit in an integer. Try the standard method:
return (t & (1U << n)) != 0;
If you have to check that n is valid, add an assertion. Otherwise, masking (n & 7) or taking the modulus (n % 8) (the compiler will optimize this to the mask operation) will force the shift count into a valid range. As the whole pattern is recognized by many compilers, they might transform it into a single bit-test CPU instruction if available.
To avoid magic numbers, you should replace the modulus 8 with (sizeof(t) * CHAR_BIT). That adapts to whatever type t might have. The mask is always one less than the modulus.
Your code:
(n - (n % 8) - 1)
If n < 8 it yields a negative value (-1 precisely). Negative shifts present undefined behaviour, so anything can happen (watch out for nasal demons).
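A compilable sketch of the suggested approach (the function name is mine); the mask keeps the shift count in range without magic numbers, and an assert(n < CHAR_BIT * sizeof t) could be used instead if invalid n should be rejected rather than wrapped:
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

static bool bit_is_set(uint8_t t, unsigned n)
{
    n &= (CHAR_BIT * sizeof t) - 1;   /* force the shift count into 0..7 */
    return (t & (1U << n)) != 0;
}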
I believe you are the victim of integer promotion.
When you have an expression: x operator y there are a few things you should be aware of. The first is that the result (and in the process the other operand) of the expression is promoted to the "largest" type of the two operands.
In your example, this means the following:
(t << (n - (n % 8) - 1)) >> 7; The constant 8 is an int therefore n%8 is also an int.
(t << (n - (integer) - 1)) >> 7 (n - integer - 1) is also an integer, which means that the temporary value (t << integer) will be stored in an int. This means that you don't "cut off" the most significant bits like you intend, because the result is stored in (most likely) 32 bits, and not 8 like you presume.
If you on the other hand temporarily store the int result in an uint8_t you will correctly cut off the leading bits and get what you intend.
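To see the promotion at work, here is a small self-contained sketch (the values are mine, assuming the usual 8-bit bytes):
#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint8_t t = 0x85;                    /* 1000 0101 */
    int promoted = (t << 5) >> 7;        /* t << 5 happens in int: 0x10A0, so this is 33 */
    uint8_t val = (uint8_t)(t << 5);     /* narrowing keeps only the low byte: 0xA0 */
    int narrowed = val >> 7;             /* 0xA0 >> 7 == 1 */
    printf("promoted: %d, narrowed: %d\n", promoted, narrowed);
    return 0;
}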
Thus you can work around the problem by casting the intermediate result to uint8_t during the computation:
(uint8_t)(t << (n - (n % 8) - 1)) >> 7;
Or even better, use a mask like suggested in the answer by Olaf:
(t & ((uint8_t)1 << n)) != 0

Moving a "nibble" to the left using C

I've been working on this puzzle for a while. I'm trying to figure out how to rotate 4 bits in a number (x) around to the left (with wrapping) by n, where 0 <= n <= 31. The code will look like:
int moveNib(int x, int n) {
    //... some code here
}
The trick is that I can only use these operators:
~ & ^ | + << >>
and only a combination of at most 25 of them. I also cannot use if statements, loops, or function calls, and I may only use type int.
An example would be moveNib(0x87654321,1) = 0x76543218.
My attempt: I have figured out how to use a mask to store the bits, but I can't figure out how to move by an arbitrary number. Any help would be appreciated, thank you!
How about:
uint32_t moveNib(uint32_t x, int n) { return x<<(n<<2) | x>>((8-n)<<2); }
It uses <<2 to convert from nibbles to bits, and then shifts the bits by that much. To handle wraparound, we OR with a copy of the number which has been shifted by the opposite amount in the opposite direction. For example, with x=0x87654321 and n=1, the left part is shifted 4 bits to the left and becomes 0x76543210, and the right part is shifted 28 bits to the right and becomes 0x00000008, and when ORed together, the result is 0x76543218, as requested.
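A quick compile-and-run check of that version (the test scaffolding is mine; note that n = 0 would make the right-shift count 32, which is not defined - the later edits and the other answer deal with that):
#include <stdio.h>
#include <stdint.h>

static uint32_t moveNib(uint32_t x, int n) { return x<<(n<<2) | x>>((8-n)<<2); }

int main(void)
{
    printf("%08x\n", (unsigned)moveNib(0x87654321u, 1)); /* prints 76543218 */
    return 0;
}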
Edit: If - really isn't allowed, then this will get the same result (assuming an architecture with two's complement integers) without using it:
uint32_t moveNib(uint32_t x, int n) { return x<<(n<<2) | x>>((9+~n)<<2); }
Edit2: OK. Since you aren't allowed to use anything but int, how about this, then?
int moveNib(int x, int n) { return (x&0xffffffff)<<(n<<2) | (x&0xffffffff)>>((9+~n)<<2); }
The logic is the same as before, but we force the calculation to use unsigned integers by ANDing with 0xffffffff. All this assumes 32 bit integers, though. Is there anything else I have missed now?
Edit3: Here's one more version, which should be a bit more portable:
int moveNib(int x, int n) { return ((x|0u)<<((n&7)<<2) | (x|0u)>>((9+~(n&7))<<2))&0xffffffff; }
It caps n as suggested by chux, and uses |0u to convert to unsigned in order to avoid the sign bit duplication you get with signed integers. This works because (from the standard):
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Since int and 0u have the same rank, but 0u is unsigned, then the result is unsigned, even though ORing with 0 otherwise would be a null operation.
It then truncates the result to the range of a 32-bit int so that the function will still work if ints have more bits than this (though the rotation will still be performed on the lowest 32 bits in that case; a 64-bit version would replace 7 by 15, 9 by 17 and truncate using 0xffffffffffffffff).
This solution uses 12 operators (11 if you skip the truncation, 10 if you store n&7 in a variable).
To see what happens in detail here, let's go through it for the example you gave: x=0x87654321, n=1. x|0u results in the unsigned number 0x87654321u. (n&7)<<2=4, so we will shift 4 bits to the left, while (9+~(n&7))<<2=28, so we will shift 28 bits to the right. So putting this together, we will compute 0x87654321u<<4 | 0x87654321u >> 28. For 32-bit integers, this is 0x76543210|0x8=0x76543218. But for 64-bit integers it is 0x876543210|0x8=0x876543218, so in that case we need to truncate to 32 bits, which is what the final &0xffffffff does. If the integers are shorter than 32 bits, then this won't work, but your example in the question had 32 bits, so I assume the integer types are at least that long.
As a small side-note: If you allow one operator which is not on the list, the sizeof operator, then we can make a version that works with all the bits of a longer int automatically. Inspired by Aki, we get (using 16 operators (remember, sizeof is an operator in C)):
int moveNib(int x, int n) {
    int nbit = (n&((sizeof(int)<<1)+~0u))<<2;
    return (x|0u)<<nbit | (x|0u)>>((sizeof(int)<<3)+1u+~nbit);
}
Without the additional restrictions, the typical rotate_left operation (by 0 < n < 32) is trivial.
uint32_t X = (x << 4*n) | (x >> 4*(8-n));
Since we are talking about rotations, n < 0 is not a problem. Rotation right by 1 is the same as rotation left by 7 units, i.e. nn = n & 7 and we are through.
int nn = (n & 7) << 2; // Remove the multiplication
uint32_t X = (x << nn) | (x >> (32-nn));
When nn == 0, x would be shifted right by 32, which is undefined. That case can simply be replaced with x >> 0, i.e. no rotation at all: (x << 0) | (x >> 0) == x.
Replacing the subtraction with addition: a - b = a + (~b+1) and simplifying:
int nn = (n & 7) << 2;
int mm = (33 + ~nn) & 31;
uint32_t X = (x << nn) | (x >> mm); // when nn=0, also mm=0
Now the only problem is in shifting the signed int x right, which would duplicate the sign bit. That should be cured by a mask: (1 << nn) - 1 (written with the allowed operators as (1 << nn) + ~0).
int nn = (n & 7) << 2;
int mm = (33 + ~nn) & 31;
int result = (x << nn) | ((x >> mm) & ((1 << nn) + ~0));
At this point we have used just 12 of the allowed operations -- next we can start to dig into the problem of sizeof(int)...
int nn = (n & (sizeof(int)-1)) << 2; // etc.
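For completeness, here is the 12-operation version above assembled into a compilable sketch (the function name and test are mine, assuming a 32-bit int; the test value keeps the top bit clear so the signed left shift stays in range):
#include <stdio.h>

static int moveNibSigned(int x, int n)
{
    int nn = (n & 7) << 2;
    int mm = (33 + ~nn) & 31;
    return (x << nn) | ((x >> mm) & ((1 << nn) + ~0));
}

int main(void)
{
    printf("%08x\n", (unsigned)moveNibSigned(0x07654321, 1)); /* prints 76543210 */
    return 0;
}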

Tell if a 32 bit signed int is a power of 2

I need to determine if a signed 32-bit number is a power of two. So far I know the first thing to do is check if it's negative, since negative numbers cannot be powers of 2.
Then I need to see if the next numbers are valid, etc. So I was able to write it like this:
// Return 1 if x is a power of 2, and return 0 otherwise.
int func(int x)
{
    return ((x != 0) && ((x & (~x + 1)) == x));
}
But for my assignment I can only use 20 of these operators:
! ~ & ^ | + << >>
and NO equality statements or loops or casting or language constructs.
So I am trying to convert the equality parts, and I know that !(a^b) is the same as a == b, but I can't seem to figure it out completely. Any ideas on how to convert that to the allowed operators?
Tim's comment shamed me. Let me try to help you find the answer by yourself.
What does it mean, in terms of bit manipulation, for x to be a power of 2? It means that exactly one bit is set to 1. What trick will turn that bit to 0, and possibly some other bit to 1, so that & will give 0 - in a single expression? If you find out, you win.
Try these ideas:
~!!x+1 gives a mask: 0 if x==0 and -1 if x!=0.
(x&(~x+1))^x gives 0 if x has at most 1 bit set and nonzero otherwise, except when x is INT_MIN, in which case ~x+1 overflows and the result is undefined... You could perhaps split it into multiple parts with bitshifts to avoid this but then I think you'll exceed the operation limit.
You also want to check the sign bit, since negative values are not powers of two...
BTW, it sounds like your instructor is unaware that signed overflow is UB in C. He should be writing these problems for unsigned integers. Even if you want to treat the value semantically as if it were signed, you need unsigned arithmetic to do meaningful bitwise operations like this.
First, in your solution, it should be
return ((x > 0) && ((x & (~x + 1)) == x));
since negative numbers cannot be the power of 2.
According to your requirement, we need to convert ">", "&&", "==" into permitted operators.
First think of ">": an integer is > 0 when its sign bit is 0 and it is not 0; so we consider
~(x >> 31) & (x & ~0)
The expression above returns a non-zero number only if x is positive. Notice that ~0 = -1, which is 0xFFFFFFFF, so x & ~0 is simply x; combined with ~(x >> 31), the result is non-zero exactly when x is non-zero and its sign bit is clear.
Secondly we consider "&&". AND is pretty straightforward -- we only need 0x01 & 0x01 to return 1. So here we need to add !! in front of our first expression to change it to 0x01 whenever it is a nonzero number.
Finally, we consider "==". To test equality of A and B we only need to do
!(A ^ B)
So finally we have
return (!!(~(x >> 31) & (x & ~0))) & (!((x&(~x+1)) ^ x))
It seems that it's a homework problem. Please don't simply copy and paste. My answer is kind of awkward, it might be improved.
Think about this... any power of 2 minus 1 is a string of 0s followed by a string of 1s. You can implement minus one by x + ~0. Think about where the string of 1s starts in relation to the single 1 that would be in a power of 2.
int ispower2(int x)
{
    int ispositive = !((x >> 31) ^ 0) & !!(x ^ 0);
    int temp = !((x & (~x + 1)) ^ x);
    return temp & ispositive;
}
It is interesting and efficient to use bitwise operators in C to solve some problems. In this question, we need to deal with two checks:
the sign check. If negative, return 0; otherwise return 1;
!(x >> 31 & 0x1) & !(!x)
/* This op. extracts the sign bit of x. However, the >> in this case will be arithmetic, so the sign bit is copied into all the higher bits: for a negative int, x >> 31 is 0xFFFFFFFF (-); otherwise it is 0x00000000 (+). The AND with 0x1 reduces that to 0x1 (-) or 0x0 (+). The logical ! turns 0x1 (-) and 0x0 (+) into 0 or 1, respectively. The !(!x) makes sure 0 is not counted as a power(2). */
the isPower(2) check. If yes, return 1; otherwise 0.
!((x & (~x + 0x1)) ^ x)
/* This op. does the isPower(2) check. x & (~x + 0x1) returns x if and only if x is a power(2). For example: if x = 0x2 then ~x + 0x1 = 0xFFFFFFFE and x & (~x + 0x1) = 0x2; if x = 0x5 then ~x + 0x1 = 0xFFFFFFFB and x & (~x + 0x1) = 0x1. Therefore, 0x2 ^ 0x2 = 0, but 0x5 ^ 0x1 = 0x4. The ! op turns 0 and everything else into 1 and 0, respectively. */
The final AND between checks 1 and 2 generates the result of the isPower(2) function.
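Assembling the two checks into a compilable sketch (assuming a 32-bit int with arithmetic right shift; the function name and test values are mine):
#include <stdio.h>

static int isPower2(int x)
{
    return (!(x >> 31 & 0x1) & !(!x)) & !((x & (~x + 0x1)) ^ x);
}

int main(void)
{
    printf("%d %d %d %d\n", isPower2(1), isPower2(64), isPower2(0), isPower2(-8)); /* prints "1 1 0 0" */
    return 0;
}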

Logical NOT (!) operator won't work with bitwise statement

I am attempting to determine if I can compute the sum of two 32 bit integers without overflow, while making use of only certain bitwise and other operators. So, if the integers x and y can be added without overflow, the following code should return 1, and 0 otherwise.
(((((x >> 31) + (y >> 31)) & 2) >> 1))
However, it returns 0 when it should be 1 and vice versa. When I employ the logical NOT (!) operator, or bitwise XOR (^) with 0x1, it does not fix the issue.
!(((((x >> 31) + (y >> 31)) & 2) >> 1))
(((((x >> 31) + (y >> 31)) & 2) >> 1) ^ 0x1)
^ these don't work.
Thanks in advance.
This is a bit cleaner:
~(x & y) >> 31
Update
kriss's comment is correct. All this code does is check that the two MSBs are both set.
I was just looking at kriss' answer, and it occurred to me that the same thing can be done using only a single addition, plus bitwise operators, assuming unsigned ints.
((x & 0x7FFFFFFF) + (y & 0x7FFFFFFF)) & 0x80000000 & (x | y)
The first parenthesised section sets both MSBs to 0 and then adds the results. Any carry will end up in the MSB of the sum. The next bitmask isolates that carry. The final term checks for a set MSB on either x or y, which results in a carry overall. To meet the spec in the question, just do:
~(((x & 0x7FFFFFFF) + (y & 0x7FFFFFFF)) & 0x80000000 & (x | y)) >> 31
Let's suppose both numbers are unsigned integers. If you work with signed integers, it would be a little bit more tricky, as there are two ways to get overflow: either adding two large positives or adding two large negatives. Anyway, checking the most significant bits won't be enough; as addition propagates the carry bit, you must take it into account.
For unsigned integers, if you don't care to cheat an easy way is:
(x+y < x) || (x+y < y)
This works because unsigned overflow is well defined: the sum wraps around, so it comes out smaller than one of the operands.
You can also remark that for overflow to happen at least one of the two numbers must have its most significant bit set to 1. Hence something like this should work (beware, untested), but it's way more complicated than the other version.
/* both Most Significant bits are 1 */
(x&y&0x80000000)
/* x MSb is 1 and carry propagate */
||((x&0x80000000)&&(((x&0x7FFFFFFF)+y)&0x80000000))
/* y MSb is 1 and carry propagate */
||((y&0x80000000)&&(((y&0x7FFFFFFF)+x)&0x80000000))
The logical ! is working fine for me.
me@desktop:~$ cat > so.c
#include <stdio.h>
void main() {
    int y = 5;
    int x = 3;
    int t;
    t = (((((x >> 31) + (y >> 31)) & 2) >> 1));
    printf("%d\n", t);
    t = !(((((x >> 31) + (y >> 31)) & 2) >> 1));
    printf("%d\n", t);
}
^D
me@desktop:~$ gcc -o so so.c
me@desktop:~$ ./so
0
1
me@desktop:~$ uname -a
Linux desktop 2.6.32-23-generic #37-Ubuntu SMP Fri Jun 11 07:54:58 UTC 2010 i686 GNU/Linux
There is no simple bit-arithmetic-based test for overflow because addition involves carry. But there are simple tests for overflow that do not involve invoking overflow or unsigned integer wrapping, and they're even simpler than doing the addition then checking for overflow (which is of course undefined behavior for signed integers):
For unsigned integers x and y: (x<=UINT_MAX-y)
For signed integers, first check if they have opposite signs. If so, addition is automatically safe. If they're both positive, use (x<=INT_MAX-y). If they're both negative, use (x>=INT_MIN-y).
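Expressed as code, a minimal sketch of those checks (the function names are mine):
#include <limits.h>
#include <stdbool.h>

static bool can_add_unsigned(unsigned x, unsigned y)
{
    return x <= UINT_MAX - y;
}

static bool can_add_signed(int x, int y)
{
    if ((x >= 0) != (y >= 0))   /* opposite signs: addition is always safe */
        return true;
    if (x >= 0)                 /* both non-negative */
        return x <= INT_MAX - y;
    return x >= INT_MIN - y;    /* both negative */
}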
Are those signed integers by any chance? Your logic looks like it should be fine for unsigned integers (unsigned int) but not for regular ints, since in that case the shift will preserve the sign bit.
