I have the following C code which works:
int ex(unsigned int x) {
int mask = 0x55555555;
int a = ((x >> 0) & mask );
return a + ((x >> 1) & mask );
}
However, when I expand it to this, I get a different result:
int ex(unsigned int x) {
int mask = 0x55555555;
int a = ((x >> 0) & mask );
int b = ((x >> 1) & mask );
return a + b;
}
What is the reason for this difference?
EDIT:
Note, I'm compiling this for 32bit.
What is the reason for this difference?
The 1st snippet returns the result of adding two unsigneds with the result being (implicitly) converted to an int.
The 2nd snippet returns the result of adding two ints, which can overflow (undefined behavior for signed integers) when the sum exceeds INT_MAX.
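To make the difference concrete, here is a minimal sketch that keeps every operand unsigned, assuming a 32-bit unsigned int as in the question. The name ex_unsigned is mine, not from the original post. With x = 0xFFFFFFFF both halves are 0x55555555; their sum does not fit in a 32-bit int, while the unsigned sum is simply 0xAAAAAAAA.

```c
/* Sketch: same computation as ex(), but done entirely in unsigned
   arithmetic so the addition can never overflow a signed type.
   Assumes unsigned int is 32 bits wide. */
unsigned int ex_unsigned(unsigned int x)
{
    unsigned int mask = 0x55555555u;
    unsigned int a = x & mask;          /* even-numbered bits */
    unsigned int b = (x >> 1) & mask;   /* odd-numbered bits, moved down by one */
    return a + b;                       /* unsigned addition, well defined */
}
```

If the caller really needs an int, converting the final unsigned value is implementation-defined for large results, but it is not undefined the way signed overflow is.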
More on "The Usual Arithmetic Conversions":
Usual arithmetic conversions in C : Whats the rationale behind this particular rule
C usual arithmetic conversions
Implicit integer type conversion in C
Legal operators
! ~ & ^ | << >>
I have tried this code:
int lhs = ((x << 31)>>31);
int rhs = (x >> 31);
return (~(lhs ^ rhs));
But the output was not always as expected.
I don't know of any data type whose MSB is bit 30. However, 32-bit ints have their MSB in bit 31, so you might try using that value instead.
You should always use unsigned types when performing bit operations.
#include <stdint.h>
uint32_t lsb = ((uint32_t)(value & 1));
uint32_t msb = ((uint32_t)((value >> 31) & 1));
return !(msb ^ lsb);
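For completeness, the fragment above can be wrapped into a function like this. It is only a sketch: the function name msb_equals_lsb and the assumption that the task is "return 1 when the top and bottom bits match" are mine, since the original question is not fully shown.

```c
#include <stdint.h>

/* Returns 1 when bit 31 and bit 0 of value are equal, 0 otherwise. */
int msb_equals_lsb(uint32_t value)
{
    uint32_t lsb = value & 1u;           /* lowest bit */
    uint32_t msb = (value >> 31) & 1u;   /* highest bit; logical shift because value is unsigned */
    return !(msb ^ lsb);
}
```

Because value is unsigned, the right shift never copies the sign bit, which is the usual source of surprises when the same expression is written with int.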
I am trying to use Bitwise operators ( ! ~ & ^ | + << >> ) in C to achieve a multiplication of 4 while also correcting for Positive and Negative overflow by returning the Max value and Minimum value, respectively.
For example,
Function(0x10000000) = 0x40000000
Function(0x20000000) = 0x7FFFFFFF
Function(0x80000000) = 0x80000000
My primary method is checking the sign of the product to find out whether it changed unexpectedly.
int funcMultBy4(int x){
int signedBit=(x>>31);
int minValue= 1<<31;
int xtimes4= x<<2;
int maxValue= (x ^ xtimes4) >> 31;
int saturate= maxValue & (signedBit ^ ~minValue);
return saturate | (xtimes4 ^ ~maxValue) ;
}
Currently, when multiplying 0x7fffffff, I am getting -1 rather than an expected 0x7FFFFFFF. I understand there is probably a necessary shift by 1 somewhere, but am having trouble finding my error.
It is the ^ in the last line that needs to be & and overflow must be detected in both the first and second bit-shift.
This slight reorganisation of the function seems more intuitive to me:
int funcMultBy4(int x)
{
int signedBit = (x>>31);
int minValue = 1<<31;
int xtimes4 = x<<2;
int overflow = ((x ^ (x<<1)) | (x ^ (x<<2))) >> 31;
int saturate = (signedBit ^ ~minValue);
return (overflow & saturate) | (~overflow & xtimes4) ;
}
Of course, the code depends on the int size to be 32 bits. You may either use the fixed-width type int32_t or replace 31 by ((int)((sizeof(int)<<3)-1)) (could be defined in a macro).
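When checking a bit-twiddling version like the one above, it can help to have a plain reference to compare against. The following is a sketch for testing only: it uses a 64-bit intermediate and comparisons, so it does not respect the operator restrictions, and the name mult_by_4_reference is made up for this example.

```c
#include <stdint.h>

/* Reference saturating multiply-by-4, computed in a wider type so that
   nothing can overflow. Not restricted to the permitted operators. */
int32_t mult_by_4_reference(int32_t x)
{
    int64_t wide = (int64_t)x * 4;
    if (wide > INT32_MAX) return INT32_MAX;   /* positive overflow saturates */
    if (wide < INT32_MIN) return INT32_MIN;   /* negative overflow saturates */
    return (int32_t)wide;
}
```

It reproduces the three examples from the question, and for 0x7FFFFFFF it returns 0x7FFFFFFF.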
I can only use the operations ! ~ & ^ | + << >>, and I'm having trouble grasping overflow. I could use any tips or help!
It depends on whether the numbers are signed or unsigned.
If both operands are unsigned, overflow will wrap back around to 0.
If the operands are signed, overflow is undefined behavior according to the C standard; however, most implementations represent signed integers in 2's complement, so in practice positive overflow will usually wrap around to the negative side, and negative overflow will wrap around to the positive side.
In the case of unsigned overflow, the result will be less than at least one operand, so you can test for it this way:
if ((x + y < x) || (x + y < y)) {
printf("overflow\n");
}
In the signed case, you first need to check whether both are positive (and check for wraparound to the negative side) or both are negative (and check for wraparound to the positive side):
if ((x > 0) && (y > 0) && ((x + y < x) || (x + y < y))) {
printf("positive overflow\n");
}
if ((x < 0) && (y < 0) && ((x + y > x) || (x + y > y))) {
printf("negative overflow\n");
}
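As I mentioned before, overflow in the signed case is undefined behavior, and the above will only work if signed integers are represented as 2's complement and the implementation actually wraps; an optimizing compiler is allowed to assume the overflow never happens. In practice, however, 2's complement will typically be the case.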
This should give you an idea of how overflow works, although it doesn't use only the specific operators you mentioned. With this, you should be able to figure out how to use those other operators to achieve what the expressions above do.
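As one possible direction, the signed check can be expressed with sign bits only. This is a sketch assuming 32-bit values; the name add_overflows is mine. It does the addition in unsigned arithmetic (which wraps and is well defined) and uses casts, which are not in the list of permitted operators, so treat it as a guide rather than a finished solution.

```c
#include <stdint.h>

/* Returns 1 if a + b would overflow a 32-bit signed integer, else 0.
   Overflow happens exactly when both operands have the same sign and
   the wrapped sum has the opposite sign. */
int add_overflows(int32_t a, int32_t b)
{
    uint32_t ua = (uint32_t)a;
    uint32_t ub = (uint32_t)b;
    uint32_t sum = ua + ub;   /* wraps modulo 2^32, never undefined */
    /* Bit 31 of (ua ^ sum) is set when a and the sum differ in sign;
       likewise for b. Both set means overflow. */
    return (int)(((ua ^ sum) & (ub ^ sum)) >> 31);
}
```

The core expression uses only ^, &, + and >>, which is what makes it a reasonable starting point for the restricted-operator version.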
With signed integer math, unless you have access to the limits like INT_MAX INT_MIN, there is no answer that gets around undefined behavior.
#include <limits.h>
int is_overflow_add_signed(int a, int b) {
// This uses -, so does not meet OP's goal.
// Available as a guide
return (a < 0) ? (b < INT_MIN - a) : (b > INT_MAX - a);
}
With unsigned math, simply see if the result "wrapped" around.
int is_overflow_add_unsigned(unsigned a, unsigned b) {
return (a + b) < a;
}
As many people pointed out, this is not right for signed values,
so I changed it to handle unsigned values first.
You need to do the addition piece by piece.
Since you didn't tell us the data type, I assumed 4-byte unsigned data.
unsigned long x, y;
// x = ...
// y = ...
unsigned long first_byte_x = (x & 0xFF000000) >> 24;
unsigned long first_byte_y = (y & 0xFF000000) >> 24;
unsigned long other_bytes_x = x & 0x00FFFFFF;
unsigned long other_bytes_y = y & 0x00FFFFFF;
unsigned long other_bytes_sum = other_bytes_x + other_bytes_y;
unsigned long carry = (other_bytes_sum & 0xFF000000) >> 24;
unsigned long first_byte_sum = first_byte_x + first_byte_y + carry;
if (first_byte_sum > 0xFF) {
// overflow
} else {
// not overflow
}
If you can use mod (%), then it will be simpler.
*This looks like homework, so I hope you gave it enough thought before asking...
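The same piece-by-piece idea can be packaged as a function. This is a sketch assuming 32-bit unsigned operands (uint32_t instead of the unsigned long used above); the function name add_overflows_split is made up for illustration.

```c
#include <stdint.h>

/* Returns 1 if x + y does not fit in 32 bits, else 0.
   The low 24 bits and the top byte are added separately so that no
   intermediate sum can itself overflow. */
int add_overflows_split(uint32_t x, uint32_t y)
{
    uint32_t top_x = x >> 24;                                   /* top byte of x */
    uint32_t top_y = y >> 24;                                   /* top byte of y */
    uint32_t low_sum = (x & 0x00FFFFFFu) + (y & 0x00FFFFFFu);   /* at most 25 bits */
    uint32_t carry = low_sum >> 24;                             /* carry into the top byte */
    uint32_t top_sum = top_x + top_y + carry;                   /* at most 0x1FF */
    return top_sum > 0xFFu;
}
```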
This is a university question. Just to make sure :-) We need to implement (float)x
I have the following code which must convert integer x to its floating point binary representation stored in an unsigned integer.
unsigned float_i2f(int x) {
if (!x) return x;
/* get sign of x */
int sign = (x>>31) & 0x1;
/* absolute value of x */
int a = sign ? ~x + 1 : x;
/* calculate exponent */
int e = 0;
int t = a;
while(t != 1) {
/* divide by two until t is 1 */
t >>= 1;
e++;
};
/* calculate mantissa */
int m = a << (32 - e);
/* logical right shift */
m = (m >> 9) & ~(((0x1 << 31) >> 9 << 1));
/* add bias for 32bit float */
e += 127;
int res = sign << 31;
res |= (e << 23);
res |= m;
/* lots of printf */
return res;
}
One problem I encounter now is that when my integers are too big then my code fails. I have this control procedure implemented:
float f = (float)x;
unsigned int r;
memcpy(&r, &f, sizeof(unsigned int));
This of course always produces the correct output.
Now when I do some test runs, these are my outputs (GOAL is what it needs to be, result is what I got):
:!make && ./btest -f float_i2f -1 0x80004999
make: Nothing to be done for `all'.
Score Rating Errors Function
x: [-2147464807] 10000000000000000100100110011001
sign: 1
expone: 01001110100000000000000000000000
mantis: 00000000011111111111111101101100
result: 11001110111111111111111101101100
GOAL: 11001110111111111111111101101101
So in this case, a 1 is added as the LSB.
Next case:
:!make && ./btest -f float_i2f -1 0x80000001
make: Nothing to be done for `all'.
Score Rating Errors Function
x: [-2147483647] 10000000000000000000000000000001
sign: 1
expone: 01001110100000000000000000000000
mantis: 00000000011111111111111111111111
result: 11001110111111111111111111111111
GOAL: 11001111000000000000000000000000
Here 1 is added to the exponent while the mantissa is the complement of it.
I spent hours trying to look it up on the internet and in my books, but I can't find any references to this problem. I guess it has something to do with the fact that the mantissa is only 23 bits. But how do I have to handle it then?
EDIT: THIS PART IS OBSOLETE THANKS TO THE COMMENTS BELOW. int l must be unsigned l.
int x = 2147483647;
float f = (float)x;
int l = f;
printf("l: %d\n", l);
then l becomes -2147483648.
How can this happen? So C is doing the casting wrong?
Hope someone can help me here!
Thx
Markus
EDIT 2:
My updated code is now this:
unsigned float_i2f(int x) {
if (x == 0) return 0;
/* get sign of x */
int sign = (x>>31) & 0x1;
/* absolute value of x */
int a = sign ? ~x + 1 : x;
/* calculate exponent */
int e = 158;
int t = a;
while (!(t >> 31) & 0x1) {
t <<= 1;
e--;
};
/* calculate mantissa */
int m = (t >> 8) & ~(((0x1 << 31) >> 8 << 1));
m &= 0x7fffff;
int res = sign << 31;
res |= (e << 23);
res |= m;
return res;
}
I also figured out that the code works for all integers in the range -2^24, 2^24. Everything above/below sometimes works but mostly doesn't.
Something is missing, but I really have no idea what. Can anyone help me?
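Both failing cases above are consistent with missing rounding: once the magnitude needs more than 24 significant bits, the bits shifted out have to be rounded rather than dropped. Below is a sketch of how that could look, assuming IEEE 754 single precision with the default round-to-nearest, ties-to-even mode and a 32-bit int. It uses if, while and comparisons freely, so it ignores any operator restrictions of the assignment, and the name int_to_float_bits is mine.

```c
#include <stdint.h>

/* Returns the IEEE 754 single-precision bit pattern of (float)x,
   assuming round-to-nearest, ties-to-even. */
uint32_t int_to_float_bits(int32_t x)
{
    if (x == 0) return 0;
    uint32_t sign = (uint32_t)x & 0x80000000u;
    /* magnitude, computed in unsigned arithmetic so INT32_MIN works too */
    uint32_t a = sign ? 0u - (uint32_t)x : (uint32_t)x;
    uint32_t e = 158;                       /* 127 + 31: exponent if bit 31 is the leading 1 */
    while (!(a & 0x80000000u)) {            /* normalize: move the leading 1 to bit 31 */
        a <<= 1;
        e--;
    }
    uint32_t m = (a >> 8) & 0x7FFFFFu;      /* the 23 bits below the leading 1 */
    uint32_t rest = a & 0xFFu;              /* the 8 bits that do not fit */
    /* round up if more than half, or exactly half and m is odd */
    if (rest > 0x80u || (rest == 0x80u && (m & 1u))) {
        m++;
    }
    /* adding instead of OR-ing lets a mantissa overflow (m == 0x800000)
       carry into the exponent, as in the 0x80000001 case */
    return sign | ((e << 23) + m);
}
```

For the two inputs from the question this yields 0xCEFFFF6D and 0xCF000000, matching the GOAL lines.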
The value printed is legitimate, because it depends entirely on the underlying representation of the numbers being cast. Once you understand the binary representation of the number, the result is not surprising.
An implicit conversion is associated with the assignment operator (see C99 Standard 6.5.16). The C99 Standard goes on to say:
6.3.1.4 Real floating and integer
When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.
Your earlier example illustrates undefined behavior caused by converting a floating-point value that is outside the range of the destination type, for example a negative value converted to an unsigned type. A floating-point-to-integer conversion that stays in range is well defined.
The asserts in the following snippet ought to prevent any undefined behavior from occurring.
#include <assert.h>
#include <limits.h>
#include <math.h>
unsigned int convertFloatingPoint(double v) {
double d;
assert(isfinite(v));
d = trunc(v);
assert((d>=0.0) && (d<=(double)UINT_MAX));
return (unsigned int)d;
}
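The same kind of guard can be applied to the float-to-int case from the question, where (float)2147483647 rounds up to 2147483648.0f and no longer fits in an int. The sketch below assumes a 32-bit int and IEEE 754 float; the function name float_to_int_checked is hypothetical.

```c
/* Converts f to int only if the truncated value is representable.
   Returns 1 and stores the result in *out on success, 0 otherwise.
   Assumes int is 32 bits, so the valid range is [-2^31, 2^31). */
int float_to_int_checked(float f, int *out)
{
    /* Both bounds are exact powers of two, so they are exactly
       representable as float. The negated form also rejects NaN. */
    if (!(f >= -2147483648.0f && f < 2147483648.0f)) {
        return 0;
    }
    *out = (int)f;   /* in range, so the conversion is well defined */
    return 1;
}
```

Comparing against 2147483648.0f rather than (float)INT_MAX matters here, because INT_MAX itself is not representable as a float.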
Another way of doing the same thing is to create a union containing a 32-bit integer and a float. The int and float are now just different ways of looking at the same bit of memory:
union {
int myInt;
float myFloat;
} my_union;
my_union.myInt = 0xBFFFF2E5;
printf("float is %f\n", my_union.myFloat);
float is -1.999600
You are telling the compiler to take the number you have (large integer) and make it into a float, not to interpret the number AS float. To do that, you need to tell the compiler to read the number from that address in a different form, so this:
myFloat = *(float *)&myInt ;
That means, if we take it apart, starting from the right:
&myInt - the location in memory that holds your integer.
(float *) - really, I want the compiler use this as a pointer to float, not whatever the compiler thinks it may be.
* - read from the address of whatever is to the right.
myFloat = - set this variable to whatever is to the right.
So, you are telling the compiler: In the location of (myInt), there is a floating point number, now put that float into myFloat.
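A caveat: reading an int object through a float pointer runs into C's aliasing rules, so a copy with memcpy (as in the control procedure in the question) is the safer way to get the same reinterpretation. A minimal sketch, assuming float is a 32-bit IEEE 754 type; the name bits_to_float is made up for this example.

```c
#include <stdint.h>
#include <string.h>

/* Reinterprets a 32-bit pattern as a float by copying the bytes.
   Assumes sizeof(float) == 4 and IEEE 754 single precision. */
float bits_to_float(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);   /* copy the representation, no numeric conversion */
    return f;
}
```

Compilers typically turn this memcpy into a single register move, so there is usually no cost compared with the pointer cast.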
I've been working on this puzzle for a while. I'm trying to figure out how to rotate a number (x) to the left by n 4-bit nibbles (with wrapping), where 0 <= n <= 31. The code will look like:
moveNib(int x, int n){
//... some code here
}
The trick is that I can only use these operators:
~ & ^ | + << >>
and at most 25 of them in total. I also cannot use if statements, loops, or function calls, and I may only use type int.
An example would be moveNib(0x87654321,1) = 0x76543218.
My attempt: I have figured out how to use a mask to store the bits and all, but I can't figure out how to move by an arbitrary number. Any help would be appreciated, thank you!
How about:
uint32_t moveNib(uint32_t x, int n) { return x<<(n<<2) | x>>((8-n)<<2); }
It uses <<2 to convert from nibbles to bits, and then shifts the bits by that much. To handle wraparound, we OR by a copy of the number which has been shifted by the opposite amount in the opposite direction. For example, with x=0x87654321 and n=1, the left part is shifted 4 bits to the left and becomes 0x76543210, and the right part is shifted 28 bits to the right and becomes 0x00000008, and when ORed together, the result is 0x76543218, as requested.
Edit: If - really isn't allowed, then this will get the same result (assuming an architecture with two's complement integers) without using it:
uint32_t moveNib(uint32_t x, int n) { return x<<(n<<2) | x>>((9+~n)<<2); }
Edit2: OK. Since you aren't allowed to use anything but int, how about this, then?
int moveNib(int x, int n) { return (x&0xffffffff)<<(n<<2) | (x&0xffffffff)>>((9+~n)<<2); }
The logic is the same as before, but we force the calculation to use unsigned integers by ANDing with 0xffffffff. All this assumes 32 bit integers, though. Is there anything else I have missed now?
Edit3: Here's one more version, which should be a bit more portable:
int moveNib(int x, int n) { return ((x|0u)<<((n&7)<<2) | (x|0u)>>((9+~(n&7))<<2))&0xffffffff; }
It caps n as suggested by chux, and uses |0u to convert to unsigned in order to avoid the sign bit duplication you get with signed integers. This works because (from the standard):
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Since int and 0u have the same rank, but 0u is unsigned, then the result is unsigned, even though ORing with 0 otherwise would be a null operation.
It then truncates the result to the range of a 32-bit int so that the function will still work if ints have more bits than this (though the rotation will still be performed on the lowest 32 bits in that case). A 64-bit version would replace 7 by 15, 9 by 17 and truncate using 0xffffffffffffffff.
This solution uses 12 operators (11 if you skip the truncation, 10 if you store n&7 in a variable).
To see what happens in detail here, let's go through it for the example you gave: x=0x87654321, n=1. x|0u results in the unsigned number 0x87654321u. (n&7)<<2=4, so we will shift 4 bits to the left, while (9+~(n&7))<<2=28, so we will shift 28 bits to the right. So putting this together, we will compute 0x87654321u<<4 | 0x87654321u >> 28. For 32-bit integers, this is 0x76543210|0x8=0x76543218. But for 64-bit integers it is 0x876543210|0x8=0x876543218, so in that case we need to truncate to 32 bits, which is what the final &0xffffffff does. If the integers are shorter than 32 bits, then this won't work, but your example in the question had 32 bits, so I assume the integer types are at least that long.
As a small side-note: If you allow one operator which is not on the list, the sizeof operator, then we can make a version that works with all the bits of a longer int automatically. Inspired by Aki, we get (using 16 operators (remember, sizeof is an operator in C)):
int moveNib(int x, int n) {
int nbit = (n&((sizeof(int)<<1)+~0u))<<2;
return (x|0u)<<nbit | (x|0u)>>((sizeof(int)<<3)+1u+~nbit);
}
Without the additional restrictions, the typical rotate_left operation (by 0 < n < 32) is trivial.
uint32_t X = (x << 4*n) | (x >> 4*(8-n));
Since we are talking about rotations, n < 0 is not a problem: rotating right by 1 is the same as rotating left by 7 units. I.e. nn = n & 7; and we are through.
int nn = (n & 7) << 2; // Remove the multiplication
uint32_t X = (x << nn) | (x >> (32-nn));
When nn == 0, x would be shifted by 32, which is undefined. This can be replaced simply with x >> 0, i.e. no rotation at all. (x << 0) | (x >> 0) == x.
Replacing the subtraction with addition: a - b = a + (~b+1) and simplifying:
int nn = (n & 7) << 2;
int mm = (33 + ~nn) & 31;
uint32_t X = (x << nn) | (x >> mm); // when nn=0, also mm=0
Now the only problem is in shifting a signed int x right, which would duplicate the sign bit. That should be cured by a mask: (1 << nn) - 1
int nn = (n & 7) << 2;
int mm = (33 + ~nn) & 31;
int result = (x << nn) | ((x >> mm) & ((1 << nn) + ~0));
At this point we have used just 12 of the allowed operations -- next we can start to dig into the problem of sizeof(int)...
int nn = (n & ((sizeof(int)<<1)-1)) << 2; // sizeof(int)*2 nibbles per int; etc.
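Putting the steps above together for the unrestricted case gives something like the following. It is a sketch assuming a 32-bit value and unsigned types throughout (so it sidesteps the int-only restriction and uses -), and the name rotl_nibbles is mine.

```c
#include <stdint.h>

/* Rotates x left by n nibbles (4-bit groups), with wrapping. */
uint32_t rotl_nibbles(uint32_t x, unsigned n)
{
    unsigned nn = (n & 7u) << 2;      /* left shift in bits: 0, 4, ..., 28 */
    unsigned mm = (32u - nn) & 31u;   /* right shift; 0 when nn == 0, so we never shift by 32 */
    return (x << nn) | (x >> mm);
}
```

With x = 0x87654321 and n = 1 this gives 0x76543218, matching the example in the question, and n = 0 or n = 8 returns x unchanged.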