Bitwise shifting question - C

If I have int temp = (1<<31)>>31, how come temp becomes -1?
How do I get around this problem?
Thanks

Ints are signed by default, which usually means that the high bit is reserved to indicate whether the integer is negative or not. Look up Two's complement for an explanation of how this works.
Here's the upshot:
[steven#sexy:~]% cat test.c
#include <stdint.h>
#include <stdio.h>
int main(int argc, char **argv) {
uint32_t uint;
int32_t sint;
int64_t slong;
uint = (((uint32_t)1)<<31) >> 31;
sint = (1<<31) >> 31;
slong = (INT64_C(1) << 31) >> 31; /* INT64_C keeps the shift in a 64-bit type even where long is 32 bits */
printf("signed 32 = %d, unsigned 32 = %u, signed 64 = %lld\n", sint, uint, (long long)slong);
}
[steven#sexy:~]% ./test
signed 32 = -1, unsigned 32 = 1, signed 64 = 1
Notice how you can avoid this problem either by using an "unsigned" int (allowing the use of all 32 bits), or by going to a larger type which you don't overflow.

In your case, the 1 in your expression is a signed type - so when you upshift it by 31, its sign changes. Then downshifting causes the sign bit to be duplicated, and you end up with a bit pattern of 0xffffffff.
You can fix it like this:
int temp = (1UL << 31) >> 31;
GCC warns about this kind of error if you have -Wall turned on.

int is signed.
What "problem"? What are you trying to do?
int i = (1<<31); // i = -2147483648 with a typical 32-bit int
i >> 31; // evaluates to -1 (sign bit is copied in)
unsigned int u = (1u<<31); // u = 2147483648
u >> 31; // evaluates to 1
PS: Ch is a nice command-line C interpreter for Windows that lets you try this sort of thing without compiling; it also gives you a Unix command shell. See http://www.drdobbs.com/184402054

When you do (1<<31), the MSB, which is the sign bit, is set (becomes 1). Then when you do the right shift, the value is sign-extended. Hence, you get -1. Solution: (1UL << 31) >> 31.

The bit that indicates the sign is set when you do such a left shift on a signed integer. Hence the result.
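The fix suggested in the answers above can be sketched as a pair of small helpers. This is only an illustration, assuming <stdint.h> is available; the function names shift_round_trip_unsigned and top_bit are made up for this example and do not come from the thread.

```c
#include <stdint.h>

/* Shift 1 up to bit 31 and back down again in an unsigned 32-bit type.
   No sign bit is involved, so the result is 1 rather than -1. */
uint32_t shift_round_trip_unsigned(void) {
    return (UINT32_C(1) << 31) >> 31;
}

/* Extract bit 31 of x as 0 or 1. Because x is unsigned, the right shift
   fills with zeros instead of copying the top bit. */
unsigned top_bit(uint32_t x) {
    return (unsigned)(x >> 31);
}
```

Calling top_bit(0x80000000u) gives 1, whereas the signed expression from the question ends up with every bit set.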

Related

Why left shift 24 bits changed the value of unsigned long in C?

I expect 0b11010010 << 24 to have the same value as 0b11010010000000000000000000000000.
I tested it in C, and 0b11010010 << 24 doesn't work as expected when the result is stored in an unsigned long.
Does anyone know why C's unsigned long behaves like this?
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
int main(){
unsigned long a = 0b11010010000000000000000000000000;
unsigned long b = 0b11010010 << 24;
bool isTheSame1 = a == b;
printf("isTheSame1 %d \n",isTheSame1);
bool isTheSame2 = 0b11010010000000000000000000000000 == (0b11010010 << 24);
printf("isTheSame2 %d",isTheSame2);
}
isTheSame1 should be 1, but it prints 0, as follows:
isTheSame1 0
isTheSame2 1
Compiled and executed by gcc main.c && ./a.out
gcc --version
Apple clang version 14.0.0 (clang-1400.0.29.202)
Target: x86_64-apple-darwin22.2.0
Thread model: posix
Updated
As Allan Wind pointed out, I added UL suffix and now it works as expected.
unsigned long a = 0b11010010000000000000000000000000UL;
unsigned long b = 0b11010010UL << 24;
bool isTheSame1 = a == b;
printf("isTheSame1 %d \n",isTheSame1);
bool isTheSame2 = 0b11010010000000000000000000000000UL == (0b11010010UL << 24);
printf("isTheSame2 %d",isTheSame2);
The constant 0b11010010 has type int which is signed. Assuming an int is 32 bits, the expression 0b11010010 << 24 will shift a "1" bit into the sign bit. Doing so triggers undefined behavior which is why you're getting strange results.
Add the UL suffix to the constant to give it type unsigned long, then the shift will work as expected.
unsigned long b = 0b11010010UL << 24;
You are doing a left shift of a signed value (see the good answer by @dbush).
In the absence of suffixes, numeric constants have type int or double:
b = 0b11010010 ; /* type int */
b = 1.0; /* type double */
If you want b in your example to be an unsigned long, use a suffix:
b = 0b11010010UL; /* type unsigned long */
or a cast:
b = (unsigned long)0b11010010; /* type unsigned long */
With a 32-bit (or smaller) int, 0b11010010 << 24 is undefined behavior (UB). It attempts to shift into the sign bit.
When int is 32-bit (common), this often results in a negative value corresponding to the bit pattern 11010010-00000000-00000000-00000000.
When a negative value is saved as an unsigned long, ULONG_MAX + 1 is added to it. With a 64-bit unsigned long the value has the bit pattern:
11111111-11111111-11111111-11111111-11010010-00000000-00000000-00000000
This large unsigned long is not equal to 0b11010010000000000000000000000000UL, hence the output "isTheSame1 0".
Had OP's long been 32-bit, it "might" have worked as OP intended, yet it would unfortunately still be relying on UB.
Appending an L
32-bit long: 0b11010010L << 24 suffers the same UB problem as above - yet might have "worked".
64-bit unsigned long: 0b11010010L is also long and 0b11010010L << 24 becomes the value 0b11010010000000000000000000000000, the same value as a.
Appending a U
32-bit unsigned: 0b11010010U << 24 becomes the value 0b11010010000000000000000000000000, the same value as a.
16-bit unsigned: 0b11010010U << 24 is undefined behavior as the shift is too great. Often the UB results in the same as 0b11010010U << (24-16), yet this is not reliably done.
Appending a UL
32 or 64-bit unsigned long: 0b11010010UL << 24 becomes the value 0b11010010000000000000000000000000, the same value as a.
Since the left hand side of the = of the below is unsigned long, better for the right hand side constant to be unsigned long.
unsigned long b = 0b11010010 << 24; // Original
unsigned long b = 0b11010010UL << 24; // Better
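The two effects described in the answers above (shifting in an unsigned type, and what happens when a negative 32-bit value is widened to a 64-bit unsigned type) can be sketched as follows. This is a minimal illustration rather than the posters' code; byte_to_top and widen_negative are hypothetical names, and fixed-width types from <stdint.h> are used so the result does not depend on the size of long.

```c
#include <stdint.h>

/* Shift a byte into bits 24..31. The operand is converted to unsigned long
   before the shift, so the sign bit of int is never touched. */
unsigned long byte_to_top(unsigned char byte) {
    return (unsigned long)byte << 24;
}

/* Convert a 32-bit signed value to a 64-bit unsigned one. For a negative
   input, 2^64 is added, which leaves the upper 32 bits all ones. */
uint64_t widen_negative(int32_t v) {
    return (uint64_t)v;
}
```

With these, byte_to_top(0xD2) yields 0xD2000000, while widening the negative 32-bit value that has the same bit pattern yields 0xFFFFFFFFD2000000, which is why the comparison in the question failed.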

Bizarre right bitshift inconsistency

I've been working with bits in C (running on ubuntu). In using two different ways to right shift an integer, I got oddly different outputs:
#include <stdio.h>
int main(){
int x = 0xfffffffe;
int a = x >> 16;
int b = 0xfffffffe >> 16;
printf("%X\n%X\n", a, b);
return 0;
}
I would think the output would be the same for each: FFFF, because the right four hex places (16 bits) are being rightshifted away. Instead, the output is:
FFFFFFFF
FFFF
What explains this behaviour?
When you say:
int x = 0xfffffffe;
That sets x to -2: 0xfffffffe does not fit in an int (the maximum value an int can hold here is 0x7FFFFFFF), so it wraps around during conversion. Right-shifting a negative number is implementation-defined, and here the sign bit is copied in from the left.
If you change those variables to unsigned int, it all works out.
#include <stdio.h>
int main(){
unsigned int x = 0xfffffffe;
unsigned int a = x >> 16;
unsigned int b = 0xfffffffe >> 16;
printf("%X\n%X\n", a, b);
return 0;
}
The behaviour you see here has to do with shifting on signed or unsigned integers which give different results.
Shifts on unsigned integers are logical. By contrast, right shifts on signed integers are usually arithmetic. EDIT: In C, right-shifting a negative signed value is implementation-defined, but an arithmetic shift is what compilers generally do.
Consequently,
int x = 0xfffffffe;
int a = x >> 16;
this part performs an arithmetic shift because x is signed. And because x is actually negative (-2 in two's complement), x is sign extended, so '1's are appended which results in 0xFFFFFFFF.
On the contrary,
int b = 0xfffffffe >> 16;
0xfffffffe is a literal that does not fit in an int, so it is interpreted as an unsigned int. Therefore a logical shift by 16 results in 0x0000FFFF, as expected.
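To make the difference between the two kinds of shift concrete, here is a small sketch. The function names are invented for this example, and both functions assume n is less than 32. The arithmetic version avoids relying on implementation-defined behavior by only ever right-shifting non-negative values.

```c
#include <stdint.h>

/* Logical right shift: zeros enter from the left. Assumes n < 32. */
uint32_t shift_right_logical(uint32_t x, unsigned n) {
    return x >> n;
}

/* Arithmetic right shift: copies of the sign bit enter from the left.
   For negative x, ~x is non-negative, so shifting it is well defined;
   inverting again restores the leading ones. Assumes n < 32. */
int32_t shift_right_arithmetic(int32_t x, unsigned n) {
    if (x >= 0) {
        return x >> n;
    }
    return ~(~x >> n);
}
```

With x = -2 (bit pattern 0xFFFFFFFE) and n = 16, the arithmetic version returns -1 (0xFFFFFFFF), while the logical version applied to 0xFFFFFFFE returns 0xFFFF, matching the two outputs in the question.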

C - three bytes into one signed int

I have a sensor which gives its output in three bytes. I read it like this:
unsigned char byte0,byte1,byte2;
byte0=readRegister(0x25);
byte1=readRegister(0x26);
byte2=readRegister(0x27);
Now I want these three bytes merged into one number:
int value;
value=byte0 + (byte1 << 8) + (byte2 << 16);
It gives me values from 0 to 16,777,215, but I'm expecting values from -8,388,608 to 8,388,607. I thought that int was already signed by its implementation. Even if I try to define it as signed int value; it still gives me only positive numbers. So I guess my question is how to convert int to its two's complement?
Thanks!
What you need to perform is called sign extension. You have 24 significant bits but want 32 significant bits (note that you assume int to be 32-bit wide, which is not always true; you'd better use type int32_t defined in stdint.h). Missing 8 top bits should be either all zeroes for positive values or all ones for negative. It is defined by the most significant bit of the 24 bit value.
int32_t value;
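uint8_t extension = (byte2 & 0x80) ? 0xff : 0x00; /* checks bit 7 */
/* assemble in an unsigned type so that no shift reaches a signed sign bit, then convert */
value = (int32_t)((uint32_t)byte0 | ((uint32_t)byte1 << 8) | ((uint32_t)byte2 << 16) | ((uint32_t)extension << 24));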
EDIT: Note that operands narrower than int are promoted to int before a shift, so byte2 << 16 is only well defined when int is wider than 16 bits. Casting to a wider fixed-width type first avoids depending on that.
#include <stdint.h>
uint8_t byte0,byte1,byte2;
int32_t answer;
// assuming reg 0x25 is the signed MSB of the number
// but you need to read unsigned for some reason
byte0=readRegister(0x25);
byte1=readRegister(0x26);
byte2=readRegister(0x27);
// so the trick is you need to get the byte to sign extend to 32 bits
// so force it signed then cast it up
answer = (int32_t)((int8_t)byte0); // this should sign extend the number
answer <<= 8;
answer |= (int32_t)byte1; // this should just make 8 bit field, not extended
answer <<= 8;
answer |= (int32_t)byte2;
This should also work
answer = (((int32_t)((int8_t)byte0))<<16) + (((int32_t)byte1)<< 8) + byte2;
I may be overly aggressive with parentheses but I never trust myself with shift operators :)
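As a branch-free alternative to the approaches above, the 24-bit value can be assembled in an unsigned type and then sign-extended with an XOR and a subtraction. This is only a sketch: sign_extend24 is a hypothetical helper name, and it assumes byte0 is the least significant byte, as in the question.

```c
#include <stdint.h>

/* Combine three bytes (b0 = least significant) into a signed 24-bit value
   stored in an int32_t. */
int32_t sign_extend24(uint8_t b0, uint8_t b1, uint8_t b2) {
    /* Assemble in an unsigned type so no shift touches a sign bit. */
    uint32_t raw = (uint32_t)b0 | ((uint32_t)b1 << 8) | ((uint32_t)b2 << 16);
    /* Flipping bit 23 and subtracting 2^23 maps 0x800000..0xFFFFFF onto
       the negative range and leaves 0..0x7FFFFF unchanged. */
    return (int32_t)(raw ^ UINT32_C(0x800000)) - INT32_C(0x800000);
}
```

For example, the bytes 0xFF, 0xFF, 0xFF produce -1, and 0x00, 0x00, 0x80 produce -8,388,608, the lower end of the range the question expects.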

Bit-shift not applying to a variable declaration-assignment one-liner

I'm seeing strange behavior when I try to apply a right bit-shift within a variable declaration/assignment:
unsigned int i = ~0 >> 1;
The result I'm getting is 0xffffffff, as if the >> 1 simply wasn't there. It seems to be something about the ~0, because if I instead do:
unsigned int i = 0xffffffff >> 1;
I get 0x7fffffff as expected. I thought I might be tripping over an operator precedence issue, so tried:
unsigned int i = (~0) >> 1;
but it made no difference. I could just perform the shift in a separate statement, like
unsigned int i = ~0;
i >>= 1;
but I'd like to know what's going on.
Update: Thanks to merlin2011 for pointing me towards an answer. It turns out it was performing an arithmetic shift because it was interpreting ~0 as a signed (negative) value. The simplest fix seems to be:
unsigned int i = ~0u >> 1;
Now I'm wondering why 0xffffffff wasn't also interpreted as a signed value.
This is how C treats signed values. An unsuffixed integer literal such as 0 has type int (on a typical 32-bit platform, a 32-bit signed int).
You may want to change it to:
unsigned int i = ~(unsigned int)0 >> 1;
The reason is that for a signed value, the compiler treats the >> operator as an arithmetic shift (or signed shift).
Or, more concisely (pointed out by M.M),
unsigned int i = ~0u >> 1;
Test:
printf("%x", i);
Result:
In unsigned int i = ~0;, ~0 is seen as a signed integer (the compiler should warn about that).
Try this instead:
unsigned int i = (unsigned int)~0 >> 1;
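The unsigned variants discussed above can be wrapped in two tiny functions to compare them. This is a sketch with made-up function names; it relies only on UINT_MAX from <limits.h>.

```c
#include <limits.h>

/* ~0u is an unsigned int with every bit set, so the right shift brings in
   a zero at the top instead of copying a sign bit. */
unsigned int all_ones_shifted(void) {
    return ~0u >> 1;
}

/* UINT_MAX is the same all-ones value, spelled with a standard macro. */
unsigned int uint_max_shifted(void) {
    return UINT_MAX >> 1;
}
```

Both functions return UINT_MAX / 2, which is 0x7fffffff when unsigned int is 32 bits wide.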

Sign extension from 16 to 32 bits in C

I have to do a sign extension for a 16-bit integer and for some reason, it seems not to be working properly. Could anyone please tell me where the bug is in the code? I've been working on it for hours.
int signExtension(int instr) {
int value = (0x0000FFFF & instr);
int mask = 0x00008000;
int sign = (mask & instr) >> 15;
if (sign == 1)
value += 0xFFFF0000;
return value;
}
The instruction (instr) is 32 bits and inside it I have a 16bit number.
What is wrong with:
int16_t s = -890;
int32_t i = s; //this does the job, doesn't it?
What's wrong with using the built-in types?
int32_t signExtension(int32_t instr) {
int16_t value = (int16_t)instr;
return (int32_t)value;
}
or better yet (this might generate a warning if passed a int32_t)
int32_t signExtension(int16_t instr) {
return (int32_t)instr;
}
or, for that matter, replace signExtension(value) with ((int32_t)(int16_t)value)
You need to include <stdint.h> for the int16_t and int32_t data types.
Just bumped into this looking for something else, maybe a bit late, but maybe it'll be useful for someone else. AFAIAC all C programmers should start off programming assembler.
Anyway sign extending is much easier than the proposals. Just make sure you are using signed variables and then use 2 shifts.
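int32_t value; // 32-bit storage (int32_t from <stdint.h>; long may be 64 bits)
value = 0xffff; // 16-bit two's complement -1, value is now 0x0000ffff
value = (int32_t)((uint32_t)value << 16) >> 16; // left shift done unsigned; value is now 0xffffffff
If the variable is signed, C compilers generally translate >> to an arithmetic shift right, which preserves the sign. Strictly speaking, right-shifting a negative value is implementation-defined in C, so check your compiler's documentation.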
So, assuming a 16-bit variable that starts off holding the 9-bit value 0x1ff, << 7 will SL (Shift Left) the value so it is now 0xff80, then >> 7 will ASR the value so it is now 0xffff.
If you really want to have fun with macros, then try something like this (the syntax works in GCC; I haven't tried it in MSVC).
#include <stdio.h>
#define INT8 signed char
#define INT16 signed short
#define INT32 signed long
#define INT64 signed long long
#define SIGN_EXTEND(to, from, value) ((INT##to)((INT##to)(((INT##to)(value)) << ((to) - (from))) >> ((to) - (from))))
int main(int argc, char *argv[], char *envp[])
{
INT16 value16 = 0x10f;
INT32 value32 = 0x10f;
printf("SIGN_EXTEND(8,3,6)=%i\n", SIGN_EXTEND(8,3,6));
printf("LITERAL SIGN_EXTEND(16,9,0x10f)=%i\n", SIGN_EXTEND(16,9,0x10f));
printf("16 BIT VARIABLE SIGN_EXTEND(16,9,0x10f)=%i\n", SIGN_EXTEND(16,9,value16));
printf("32 BIT VARIABLE SIGN_EXTEND(16,9,0x10f)=%i\n", SIGN_EXTEND(16,9,value32));
return 0;
}
This produces the following output:
SIGN_EXTEND(8,3,6)=-2
LITERAL SIGN_EXTEND(16,9,0x10f)=-241
16 BIT VARIABLE SIGN_EXTEND(16,9,0x10f)=-241
32 BIT VARIABLE SIGN_EXTEND(16,9,0x10f)=-241
Try:
int signExtension(int instr) {
int value = (0x0000FFFF & instr);
int mask = 0x00008000;
if (mask & instr) {
value += 0xFFFF0000;
}
return value;
}
People pointed out casting and a left shift followed by an arithmetic right shift. Another way that requires no branching:
(0xffff & n ^ 0x8000) - 0x8000
If the upper 16 bits are already zeroes:
(n ^ 0x8000) - 0x8000
• Community wiki as it's an idea from "The Aggregate Magic Algorithms, Sign Extension"
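The branch-free XOR-and-subtract idea above can be written out as functions. This is a sketch under a few assumptions: the names sign_extend16 and sign_extend_bits are invented here, the input is taken as uint32_t so that masking is done on an unsigned value, and the general version is limited to widths of 1 to 31 bits.

```c
#include <assert.h>
#include <stdint.h>

/* Sign-extend the low 16 bits of instr to 32 bits. */
int32_t sign_extend16(uint32_t instr) {
    /* (low16 ^ 0x8000) is at most 0xFFFF, so the cast is value-preserving. */
    return (int32_t)((instr & UINT32_C(0xFFFF)) ^ UINT32_C(0x8000)) - INT32_C(0x8000);
}

/* Sign-extend the low `bits` bits of value (1 <= bits <= 31). */
int32_t sign_extend_bits(uint32_t value, unsigned bits) {
    assert(bits >= 1 && bits <= 31);
    uint32_t sign = UINT32_C(1) << (bits - 1); /* the field's sign bit */
    uint32_t mask = (sign << 1) - 1;           /* the low `bits` bits */
    return (int32_t)((value & mask) ^ sign) - (int32_t)sign;
}
```

For instance, sign_extend16(0x1234FFFF) returns -1, and sign_extend_bits(0x10f, 9) returns -241, the same value the macro example above prints for a 9-bit field.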

Resources