I'm trying to solve a problem where I have to multiply by 3/8 using only bit operations and then round towards zero.
So far I have this
((((x<<1)+x)>>3)+((x>>31)&1));
The idea behind it is that the first part takes x and shifts it left 1 and adds x to get the multiplied by 3 effect and then shifts right 3 to get the divide by 8 part. Then I would add 1 if it is negative by testing to see if the sign bit is 1 (1&1 = 1) or 0 (0&1 = 0). My code won't work though, the tests are off.
Any ideas what I am doing wrong?
The left shift that you use in effect shifts the individual bits as if the value were an unsigned integer, so you could wind up losing the sign bit. That is not what it sounds like you want. Try the multiplication to see what you should be getting, and compare it against the bit shift to see what you are getting.
x *= 3.0/8.0;
Note the manual entry below, which shows that if the sign bit is affected the result is undefined:
Left shift
The left-shift operator causes the bits in shift-expression to be
shifted to the left by the number of positions specified by
additive-expression. The bit positions that have been vacated by the
shift operation are zero-filled. A left shift is a logical shift (the
bits that are shifted off the end are discarded, including the sign
bit). For more information about the kinds of bitwise shifts, see
Bitwise shifts.
The following example shows left-shift operations using unsigned
numbers. The example shows what is happening to the bits by
representing the value as a bitset. For more information, see bitset
Class.
If you left-shift a signed number so that the sign bit is affected, the result is undefined. The following example shows what happens in
Visual C++ when a 1 bit is left-shifted into the sign bit position.
#include <iostream>
#include <bitset>
using namespace std;
int main() {
short short1 = 16384;
bitset<16> bitset1{short1};
cout << bitset1 << endl; // 0100000000000000
short short3 = short1 << 1;
bitset<16> bitset3{short3}; // 16384 left-shifted by 1 = -32768
cout << bitset3 << endl; // 1000000000000000
short short4 = short1 << 14;
bitset<16> bitset4{short4}; // 16384 left-shifted by 14 = 0
cout << bitset4 << endl; // 0000000000000000
}
You are overflowing your format by testing with the most negative number and then trying to multiply it to a larger (i.e., even more negative) number.
There are various ways to fix this.
Use something larger like int64.
Use a second value to hold the overflow.
Split the value in half and then compute it as a polynomial.
For the one test case, you could divide first and then multiply, and it would "work", but it would fail for all the cases where you then lose bits off the right side.
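As a rough sketch of the "use something larger" option: widening to 64 bits avoids the overflow, and adding a bias of 7 to negative products before the shift makes the result round toward zero instead of toward negative infinity. The name mul_3_8 is made up for illustration, and the code assumes that >> on a negative signed value is an arithmetic shift, which is implementation-defined in C.

```c
#include <stdint.h>

/* Hypothetical helper: computes x * 3 / 8, rounded toward zero. */
int32_t mul_3_8(int32_t x)
{
    /* 3*x always fits in 64 bits, so nothing is lost here. */
    int64_t p = (int64_t)x + (int64_t)x + (int64_t)x;

    /* Assumption: >> on a negative int64_t is an arithmetic shift,
       so (p >> 63) is all ones for negative p and zero otherwise. */
    int64_t bias = (p >> 63) & 7;

    /* Adding 7 before dividing by 8 turns floor rounding into
       truncation for negative values. */
    return (int32_t)((p + bias) >> 3);
}
```

For example, mul_3_8(-3) works on the product -9, adds the bias to get -2, and shifts that down to -1, which matches -9/8 truncated toward zero.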
I need an explanation of what exactly this operation means in the C language.
I know this is doing a bit shift to left by n, but I don't understand this code:
| (a >> (32 - n)).
This is full code below:
uint32_t rot_l(uint32_t a, uint8_t n)
{
return (a << n) | (a >> (32 - n));
}
Please help me understand this.
Given a sample 32 bit integer a:
11000000001111111110000000000000
a << n will shift the entire sequence to the left by n bits. Any bits that are shifted to the left of the first bit are removed. Any new bits added on the right are 0. So, say we shift this by n = 3, we'll get:
00000001111111110000000000000000
Then, a >> (32 - n) will shift a to the right by 32 - n. Note that 32 is the size in bits of a, so 32 - n will shift all the bits that didn't get truncated to the right. For n = 3 again, we'll get:
00000000000000000000000000000110
(the 110 is the 3 most significant bits of a)
Finally, | is the bitwise OR operator, which ORs each bit of the first result with the corresponding bit of the second.
00000001111111110000000000000000
00000000000000000000000000000110
================================ |
00000001111111110000000000000110
So what happens is, first the bits of a are shifted to the left by n. This results in the n most significant bits being truncated. Then these n most significant bits are shifted all the way to the right, to fill up the space that was originally filled with 0 from the left shift.
The result is then combined using the |. This simulates the entire string of bits in the integer being rotated to the left. This makes sense given the name of the function is rot_l :)
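One caveat the walkthrough does not cover: with n = 0 the expression a >> (32 - n) shifts by 32, which is undefined for a 32-bit type. A minimal sketch of a guarded variant follows; the name rot_l_safe is hypothetical and not part of the original code.

```c
#include <stdint.h>

/* Rotate a left by n bits; n is reduced modulo 32 first. */
uint32_t rot_l_safe(uint32_t a, uint8_t n)
{
    n &= 31;                 /* keep the count in the range 0..31 */
    if (n == 0)
        return a;            /* avoids the undefined shift by 32 */
    return (a << n) | (a >> (32 - n));
}
```

With the sample value above, rot_l_safe(0xC03FE000u, 3) yields 0x01FF0006, the same bit pattern as the worked example.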
So I am working on this method, but am restricted to using ONLY these operators:
<<, >>, !, ~, &, ^, |, +
I need to find if a given int parameter can be represented using 2's complement arithmetic in a given amount of bits.
Here is what I have so far:
int validRange(int val, int bits){
int minInRange = ~(1<<(bits + ~0 ))+1; //the smallest 2's comp value possible with this many bits
int maxInRange = (1<<(bits+~0))+~0; //largest 2's comp value possible
..........
}
This is what I have so far, and all I need to do now is figure out how to tell if minInRange <= val <=maxInRange. I wish I could use the greater than or less than operator, but we are not allowed. What is the bitwise way to check this?
Thanks for any help!
Two's complement negative numbers always have a '1' in their high bit.
You can convert from negative to positive (and vice versa) by converting from FF -> 00 -> 01. That is, invert the bits, add 1. (01 -> FE -> FF also works: invert the bits, add 1)
A positive number can be represented if the highest set bit in the number is within your range. (nbits - 1: 7 bits for an 8 bit signed char, etc.)
I'm not sure if your constraints allow you to use arrays. They would speed up some things but can be replaced with loops or if statements.
Anyway, if 1 << (NUM_INT_BITS-1) is set on your input, then it's negative.
Invert, add one.
Now, consider 0. Zero is a constant, and it's always the same no matter how many bits. But if you invert 0, you get "all the bits" which changes by architecture. So, ALL_BITS = ~0.
If you want to know if a positive number can be represented in 2 bits, check to see if any bits greater than or equal to bit 2 are set. Example:
two_bits = 0b00000011
any_other_bits = ~two_bits # Result: 0b11...11100
if positive_number & any_other_bits
this number is too fat for these bits!
But how do you know what ~two_bits should be? Well, it's "all set bits except the bottom however-many". And you can construct that by starting with "all set bits" and shifting them upwards (aka, "left") however-many places:
any_other_bits = ~0 << 2 # where "2" is the number of bits to check
All together now:
if (val & ((unsigned)INT_MAX + 1))
val = ~val + 1;
mask = ~0 << bits;
too_wide = val & mask;
return !too_wide;
To test if a number can be represented as an N-bit 2's complement number, simply test that either
the number bitwise-ANDed with the complement of a word with the low (N-1) bits set is equal to zero,
OR the high InputBitWidth-(N-1) bits of the number are all 1s.
mask=(1<<(bits-1))-1; return ( !(val&~mask) | !((val&~mask)^~mask) );
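The two conditions above can be sketched as a small function. This is only an illustration: the name fits_bits is made up, it assumes 1 <= bits <= 32 and a 32-bit int, and it works on an unsigned copy of the value so that none of the shifts or masks touch a signed sign bit. It uses - for readability; under the operator restriction in the question, x - 1 can be written as x + ~0.

```c
/* Returns 1 if val fits in a bits-wide two's complement field, else 0.
   Assumes 1 <= bits <= 32 and a 32-bit int. */
int fits_bits(int val, int bits)
{
    unsigned low  = (1u << (bits - 1)) - 1u; /* low (bits-1) bits set */
    unsigned high = ~low;                    /* field sign bit and above */
    unsigned u    = (unsigned)val;

    /* Non-negative and small enough: no high bits set.
       Negative and small enough: all high bits set. */
    return !(u & high) | !((u & high) ^ high);
}
```

For an 8-bit field this accepts -128 through 127 and rejects everything else.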
Is it possible to shift 0s as you can shift 1s in shifting operation in C?
Something like this
for (i=0; i<32; i++) {
if (data & 0x01) {
data |= (1<<i);
}
else {
data &=~ (0<<i);
}
}
I'm checking if some bits are set and depending on that I'm storing 0s or 1s in new variable, shifting every bit to left.
<< (not <) does not shift one bit. It shifts the whole number. When you say 1<<5, it is shifting 0000000000000001 five places left (given a 16-bit value), which gives 0000000000100000. 0<<5 is shifting 0000000000000000 five places left, which results in 0000000000000000 (i.e. same value). The other bits are not indeterminate: you can't shift a single bit (I assume you want something like ??????????1????? and ??????????0?????, but numbers don't work like that.)
This looks like a typical XY-problem. You likely want to use data to switch on or switch off a bit in something else? Ask about that. (EDIT: As said by Joachim Pileborg in question comments.)
You don't shift 0 or 1, you shift the bit values, be it either 0 or 1. In other words, you shift the bit positions, regardless of the value stored in them.
From the C11 standard, chapter §6.5.7, Bitwise shift operators:
The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros.
It is just like that, in case of 1s, the shifting is visible.
That said, < is not a bit-shift operator (as in your code), << is.
The good news is - you're way over-thinking it.
Take the whole number, say: 01011011.
Shift it left 1: 10110110.
The 'whole thing' shifts together; zeroes are shifted into the LSB. In the case of right-shifting, it's zeroes again for unsigned, or implementation-defined for signed numbers:
The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.
(§6.5.7)
Which means right-shifting a signed number may result in either an arithmetic shift, or a logical shift.
If you want to shift 1s in to the low end, instead of 0s, just OR with a bit-mask for the number of places you shifted.
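A minimal sketch of that last suggestion, using an unsigned type so the shift itself is fully defined. The function name shl_fill_ones is hypothetical, and the shift count is assumed to be less than 32.

```c
#include <stdint.h>

/* Shift x left by n (0 <= n < 32) and fill the vacated low bits with 1s. */
uint32_t shl_fill_ones(uint32_t x, unsigned n)
{
    uint32_t fill = ((uint32_t)1 << n) - 1u; /* n low bits set */
    return (x << n) | fill;
}
```

Shifting 01011011 left by one place this way gives 10110111 rather than 10110110.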
Shift operations are performed using a CPU register. The register consists of a number of bits (8, 16 and 32 are common, and you appear to have a 32-bit CPU), which in combination can be interpreted as a decimal value.
In your example you use the value 1. The C language allows the value one to be represented in several ways, all of which result in the same content of the CPU register:
decimal hexadecimal binary
1 0x00000001 0b00000000000000000000000000000001
3713883835 0xDD5D5EBB 0b11011101010111010101111010111011
(In all cases leading 0s can be omitted.)
In the binary representation, all of the CPU register bit values are specified. As you can see, this includes 0s as well as 1s. So when you place the value 1 in a register and perform a left shift using the << operator, the content of the entire register will be shifted left by one bit, with a 0 being placed in to the least significant bit.
In your code the value assigned to data will follow this pattern:
i = 0, data = 0b00000000000000000000000000000001
i = 1, data = 0b00000000000000000000000000000010
i = 2, data = 0b00000000000000000000000000000100
etc.
i = 30, data = 0b01000000000000000000000000000000
i = 31, data = 0b10000000000000000000000000000000
If you were to use a different value than 1 to shift, you would of course end up with a different pattern, but it would follow the same rules. I.e. all of the bits get shifted, regardless of whether they hold a 1 or a 0.
for(i=0;i<32;i++)
{
data = (165u << i); // 165 = 0xA5 = 0b10100101
}
Produces:
i = 0, data = 0b00000000000000000000000010100101
i = 1, data = 0b00000000000000000000000101001010
i = 2, data = 0b00000000000000000000001010010100
etc.
i = 28, data = 0b01010000000000000000000000000000
i = 29, data = 0b10100000000000000000000000000000
i = 30, data = 0b01000000000000000000000000000000
i = 31, data = 0b10000000000000000000000000000000
Note how the pattern disappears towards the end as the most significant bits get shifted out of the register.
The same rules hold for doing right shifts using the >> operator, except that the bits move from most significant towards least significant.
(Some CPUs also have Rotate and other wonderful bit manipulation instructions.)
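To see the patterns for yourself, something along these lines prints every shift of a value as 32 binary digits. It is only a sketch: to_binary and print_shifts are names invented here, and an unsigned 32-bit type is used so that bits shifted off the top are simply discarded.

```c
#include <stdint.h>
#include <stdio.h>

/* Writes the 32 bits of v, most significant bit first, into out. */
void to_binary(uint32_t v, char out[33])
{
    for (int i = 0; i < 32; ++i)
        out[i] = ((v >> (31 - i)) & 1u) ? '1' : '0';
    out[32] = '\0';
}

/* Prints pattern << i for every shift count from 0 to 31. */
void print_shifts(uint32_t pattern)
{
    char bits[33];
    for (unsigned i = 0; i < 32; ++i) {
        to_binary(pattern << i, bits);
        printf("i = %2u, data = 0b%s\n", i, bits);
    }
}
```

Calling print_shifts(1) or print_shifts(165) from main shows the whole register moving left one position per step.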
I am trying to write my own C floor function. I am stuck on this code detail. I would just like to know how I can zero out the bottom n bits of an unsigned int.
For example, to round 51.5 to 51.0, I need to zero out the bottom 18 bits, and keep the top 14. Since it's a floor function, I want to make a mask to zero out the bottom (23 minus exponent) bits from the float representation. I know how to make a mask for individual cases like that, but I'm not sure how to code it so that it will work for all. Please help.
A much simpler way is doing just this:
value = (value >> bits) << bits
because the shift left will fill it in with zeroes, not whatever was in there.
Shift 1 left by N bits. Subtract one. Invert the bits. AND the result with the number you need to mask.
1 << 13 = 00000000000000000010000000000000
-1 = 00000000000000000001111111111111
~ = 11111111111111111110000000000000
When you AND a value with this mask, a 1 in the mask will preserve the input bit, and a 0 in the mask will set the result bit to 0.
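Put together as a function, the mask approach might look like the sketch below. The name clear_low_bits is hypothetical; the guard for n >= 32 is there because shifting a 32-bit value by 32 or more is undefined in C.

```c
#include <stdint.h>

/* Returns value with its lowest n bits set to zero. */
uint32_t clear_low_bits(uint32_t value, unsigned n)
{
    if (n >= 32)
        return 0;                               /* every bit is cleared */
    uint32_t mask = ~(((uint32_t)1 << n) - 1u); /* 1s above bit n-1, 0s below */
    return value & mask;
}
```

Applied to the question's example, the float 51.5 has the bit pattern 0x424E0000, and clear_low_bits(0x424E0000u, 18) gives 0x424C0000, which is the pattern for 51.0.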
Can someone please explain this function to me?
A mask with the least significant n bits set to 1.
Ex:
n = 6 --> 0x2F, n = 17 --> 0x1FFFF // I don't get these at all, especially how n = 6 --> 0x2F
Also, what is a mask?
The usual way is to take a 1, and shift it left n bits. That will give you something like: 00100000. Then subtract one from that, which will clear the bit that's set, and set all the less significant bits, so in this case we'd get: 00011111.
A mask is normally used with bitwise operations, especially and. You'd use the mask above to get the 5 least significant bits by themselves, isolated from anything else that might be present. This is especially common when dealing with hardware that will often have a single hardware register containing bits representing a number of entirely separate, unrelated quantities and/or flags.
A mask is a common term for an integer value that is bit-wise ANDed, ORed, XORed, etc with another integer value.
For example, if you want to extract the 8 least significant bits of an int variable, you do variable & 0xFF. 0xFF is a mask.
Likewise if you want to set bits 0 and 8, you do variable | 0x101, where 0x101 is a mask.
Or if you want to invert the same bits, you do variable ^ 0x101, where 0x101 is a mask.
To generate a mask for your case you should exploit the simple mathematical fact that if you add 1 to your mask (the mask having all its least significant bits set to 1 and the rest to 0), you get a value that is a power of 2.
So, if you generate the closest power of 2, then you can subtract 1 from it to get the mask.
Positive powers of 2 are easily generated with the left shift << operator in C.
Hence, 1 << n yields 2^n. In binary it's 10...0 with n 0s.
(1 << n) - 1 will produce a mask with n lowest bits set to 1.
Now, you need to watch out for overflows in left shifts. In C (and in C++) you can't legally shift a variable left by as many bit positions as the variable has, so if ints are 32-bit, 1<<32 results in undefined behavior. Signed integer overflows should also be avoided, so you should use unsigned values, e.g. 1u << 31.
For both correctness and performance, the best way to accomplish this has changed since this question was asked back in 2012 due to the advent of BMI instructions in modern x86 processors, specifically BLSMSK.
Here's a good way of approaching this problem, while retaining backwards compatibility with older processors.
This method is correct, whereas the current top answers produce undefined behavior in edge cases.
Clang and GCC, when allowed to optimize using BMI instructions, will condense gen_mask() to just two ops. With supporting hardware, be sure to add compiler flags for BMI instructions:
-mbmi -mbmi2
#include <inttypes.h>
#include <stdio.h>
uint64_t gen_mask(const uint_fast8_t msb) {
const uint64_t src = (uint64_t)1 << msb;
return (src - 1) ^ src;
}
int main() {
uint_fast8_t msb;
for (msb = 0; msb < 64; ++msb) {
printf("%016" PRIx64 "\n", gen_mask(msb));
}
return 0;
}
First, for those who only want the code to create the mask:
uint64_t bits = 6;
uint64_t mask = ((uint64_t)1 << bits) - 1;
// Results in 0b111111 (or 0x03F)
Thanks to @Benni who asked about using bits = 64. If you need the code to support this value as well, you can use:
uint64_t bits = 6;
uint64_t mask = (bits < 64)
? ((uint64_t)1 << bits) - 1
: (uint64_t)0 - 1;
For those who want to know what a mask is:
A mask is usually a name for a value that we use to manipulate other values using bitwise operations such as AND, OR, XOR, etc.
Short masks are usually represented in binary, where we can explicitly see all the bits that are set to 1.
Longer masks are usually represented in hexadecimal, which is really easy to read once you get the hang of it.
You can read more about bitwise operations in C here.
I believe your first example should be 0x3f.
0x3f is hexadecimal notation for the number 63 which is 111111 in binary, so that last 6 bits (the least significant 6 bits) are set to 1.
The following little C program will calculate the correct mask:
#include <stdarg.h>
#include <stdio.h>
int mask_for_n_bits(int n)
{
int mask = 0;
for (int i = 0; i < n; ++i)
mask |= 1 << i;
return mask;
}
int main (int argc, char const *argv[])
{
printf("6: 0x%x\n17: 0x%x\n", mask_for_n_bits(6), mask_for_n_bits(17));
return 0;
}
0x2F is 0010 1111 in binary - this should be 0x3f, which is 0011 1111 in binary and which has the 6 least-significant bits set.
Similarly, 0x1FFFF is 0001 1111 1111 1111 1111 in binary, which has the 17 least-significant bits set.
A "mask" is a value that is intended to be combined with another value using a bitwise operator like &, | or ^ to individually set, unset, flip or leave unchanged the bits in that other value.
For example, if you combine the mask 0x3F with some value n using the & operator, the result will have zeroes in all but the 6 least significant bits, and those 6 bits will be copied unchanged from the value n.
In the case of an & mask, a binary 0 in the mask means "unconditionally set the result bit to 0" and a 1 means "set the result bit to the input value bit". For an | mask, a 0 in the mask sets the result bit to the input bit and a 1 unconditionally sets the result bit to 1, and for an ^ mask, a 0 sets the result bit to the input bit and a 1 sets the result bit to the complement of the input bit.
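The three kinds of mask can be tried out with a few one-line helpers. This is just an illustrative sketch; the function names are invented here and are not from any library.

```c
#include <stdint.h>

/* & mask: keeps the bits where mask is 1, clears the rest. */
uint32_t keep_bits(uint32_t value, uint32_t mask)   { return value & mask; }

/* | mask: forces the bits where mask is 1 to 1, leaves the rest alone. */
uint32_t set_bits(uint32_t value, uint32_t mask)    { return value | mask; }

/* ^ mask: flips the bits where mask is 1, leaves the rest alone. */
uint32_t toggle_bits(uint32_t value, uint32_t mask) { return value ^ mask; }
```

For instance, keep_bits(0xABCD, 0x3F) leaves 0x0D, set_bits(0xF0F0, 0x101) gives 0xF1F1, and toggle_bits(0xF1F1, 0x101) gives back 0xF0F0.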