MSB (leftmost 1-bit) index in C

How can I get the index of the most significant 1-bit of an unsigned integer (uint16_t)?
Example:
uint16_t x = 240; // 0000 0000 1111 0000
printf("ffs=%d", __builtin_ffs(x)); // ffs=5
There is a function (__builtin_ffs) that returns the 1-based index of the least significant 1-bit (LSB) of an integer.
I want the opposite: a function that returns 8 when applied to the example above.
Remark: I have tried building my own function, but I ran into problems with the datatype size, which depends on the compiler.

From the GCC manual at http://gcc.gnu.org/onlinedocs/gcc-4.1.2/gcc/Other-Builtins.html:
Built-in Function: int __builtin_clz (unsigned int x)
Returns the number of leading 0-bits in x, starting at the most significant bit position. If x is 0, the result is undefined.
So, highest set bit:
#define ONE_BASED_INDEX_OF_HIGHEST_SET_BIT(x) \
(CHAR_BIT * sizeof 1 - __builtin_clz(x)) // 1-based index!!
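Beware of x == 0 (the result is undefined), and of x < 0 when sizeof(x) < sizeof 0 (the value is sign-extended when promoted), though. CHAR_BIT comes from <limits.h>.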
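A minimal sketch of a guarded wrapper around the same idea, assuming GCC or Clang (for __builtin_clz). The name msb_index_u16 is hypothetical. It returns the 1-based index of the highest set bit, and 0 for an input of 0 so that the undefined case is never reached:

```c
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: 1-based index of the highest set bit, 0 if x == 0. */
int msb_index_u16(uint16_t x)
{
    if (x == 0)
        return 0; /* __builtin_clz(0) is undefined, so handle it here */
    /* x is widened to unsigned int, so count bits relative to that width */
    return (int)(CHAR_BIT * sizeof(unsigned int)) - __builtin_clz((unsigned int)x);
}
```

For the example in the question, msb_index_u16(240) gives 8.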

If I am reading this right (I'm a little rusty on this stuff), you can do this as follows:
int msb = 0;
while(x) { // while there are still bits
x >>= 1; // right-shift the argument
msb++; // each time we right shift the argument, increment msb
}
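The loop above can be wrapped in a function that does not depend on compiler builtins. This is only a sketch, and the name msb_index_portable is hypothetical. Because the parameter is a uint16_t, the result does not depend on the width of int:

```c
#include <stdint.h>

/* Hypothetical helper: 1-based index of the highest set bit, 0 if x == 0. */
int msb_index_portable(uint16_t x)
{
    int msb = 0;
    while (x) {   /* while there are still bits set */
        x >>= 1;  /* drop the lowest bit */
        msb++;    /* count how many shifts were needed */
    }
    return msb;
}
```

As with the builtin-based version, an input of 240 yields 8, and an input of 0 yields 0.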

Related

Negative numbers: How can I change the sign bit in a signed int to a 0?

I was thinking this would work, but it does not:
int a = -500;
a = a << 1;
a = (unsigned int)a >> 1;
//printf("%d",a) gives me "2147483148"
My thought was that the left-shift would remove the leftmost sign bit, so right-shifting it as an unsigned int would guarantee that it's a logical shift rather than arithmetic. Why is this incorrect?
Also:
int a = -500;
a = a << 1;
//printf("%d",a) gives me "-1000"
TL;DR: the easiest way is to use the abs function from <stdlib.h>. The rest of the answer involves the representation of negative numbers on a computer.
Negative integers are (almost always) represented in 2's complement form. (see note below)
The method of getting the negative of a number is:
Take the binary representation of the whole number (including leading zeroes for the data type, except the MSB which will serve as the sign bit).
Take the 1's complement of the above number.
Add 1 to the 1's complement.
Prefix a sign bit.
Using 500 as an example,
Take the binary representation of 500: _000 0001 1111 0100 (_ is a placeholder for the sign bit).
Take the 1's-complement / inverse of it: _111 1110 0000 1011
Add 1 to the 1's complement: _111 1110 0000 1011 + 1 = _111 1110 0000 1100. This is the same as 2147483148 that you obtained, when you replaced the sign-bit by zero.
Prefix 0 to show a positive number and 1 for a negative number: 1111 1110 0000 1100. (This will be different from 2147483148 above. The reason you got the above value is because you nuked the MSB).
Inverting the sign is a similar process. You get leading ones if you use 16-bit or 32-bit numbers leading to the large value that you see. The LSB should be the same in each case.
Note: there are machines with 1's complement representation, but they are a minority. The 2's complement is usually preferred because 0 has a single representation, i.e., -0 and 0 are both represented as all-zeroes in the 2's complement notation.
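The negation steps above can be sketched on a fixed-width unsigned type, where every operation is well defined. This assumes a 32-bit pattern rather than the 16-bit one used in the walkthrough, and the helper names are hypothetical:

```c
#include <stdint.h>

/* Hypothetical helper: two's complement negation of a 32-bit pattern. */
uint32_t twos_complement_negate(uint32_t x)
{
    uint32_t ones = ~x; /* take the 1's complement */
    return ones + 1u;   /* add 1 (wraps modulo 2^32) */
}

/* Hypothetical helper: clear the top (sign) bit of a 32-bit pattern. */
uint32_t clear_sign_bit(uint32_t bits)
{
    return bits & 0x7FFFFFFFu;
}
```

With these, twos_complement_negate(500) is 0xFFFFFE0C, the 32-bit pattern of -500, and clearing its top bit gives 0x7FFFFE0C, which is 2147483148 in decimal.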
Left-shifting negative integers invokes undefined behavior, so you can't do that. You could have used your code if you did a = (unsigned int)a << 1;. You'd get -500 = 0xFFFFFE0C, left-shifted by 1 = 0xFFFFFC18.
a = (unsigned int)a >> 1; does indeed guarantee logical shift, so you get 0x7FFFFE0C. This is decimal 2147483148.
But this is needlessly complex. The best and most portable way to change the sign bit is simply a = -a. Any other code or method is questionable.
If you however insist on bit-twiddling, you could also do something like
(int32_t)a & ~(1u << 31)
This is portable to 32 bit systems, since (int32_t) guarantees two's complement, but 1u << 31 assumes 32 bit int type.
Demo:
#include <stdio.h>
#include <stdint.h>
int main (void)
{
int a = -500;
a = (unsigned int)a << 1;
a = (unsigned int)a >> 1;
printf("%.8X = %d\n", a, a);
_Static_assert(sizeof(int)>=4, "Int must be at least 32 bits.");
a = -500;
a = (int32_t)a & ~(1u << 31);
printf("%.8X = %d\n", a, a);
return 0;
}
As you put in your "Also" section, after your first left shift of 1 bit, a DOES reflect -1000 as expected.
The issue is in your cast to unsigned int. As explained above, the negative number is represented as 2's complement, meaning the sign is determined by the left most bit (most significant bit). When cast to an unsigned int, that value no longer represents sign but increases the maximum value your int can take.
Assuming 32 bit ints, the MSB used to represent -2^31 (= -2147483648) and now represents positive 2147483648 in an unsigned int, for an increase of 2* 2147483648 = 4294967296. Add this to your original value of -1000 and you get 4294966296. Right shift divides this by 2 and you arrive at 2147483148.
Hoping this may be helpful: (modified printing func from Print an int in binary representation using C)
void int2bin(int a, char *buffer, int buf_size) {
buffer += (buf_size - 1);
for (int i = buf_size-1; i >= 0; i--) {
*buffer-- = (a & 1) + '0';
a >>= 1;
}
}
int main() {
int test = -500;
int bufSize = sizeof(int)*8 + 1;
char buf[bufSize];
buf[bufSize-1] = '\0';
int2bin(test, buf, bufSize-1);
printf("%i (%u): %s\n", test, (unsigned int)test, buf);
//Prints: -500 (4294966796): 11111111111111111111111000001100
test = test << 1;
int2bin(test, buf, bufSize-1);
printf("%i (%u): %s\n", test, (unsigned int)test, buf);
//Prints: -1000 (4294966296): 11111111111111111111110000011000
test = 500;
int2bin(test, buf, bufSize-1);
printf("%i (%u): %s\n", test, (unsigned int)test, buf);
//Prints: 500 (500): 00000000000000000000000111110100
return 0;
}

Mask and extract bits in C

I've been looking at posts about masks, but I still can't get my head around how to extract certain bits from a number in C.
Say if we have an integer number, 0001 1010 0100 1011, its hexadecimal representation is 0x1A4B, right? If I want to know the 5th to 7th number, which is 101 in this case, shall I use int mask= 0x0000 1110 0000 0000, int extract = mask&number?
Also, how can I check if it is 101? I guess == won't work here...
Masking is done by setting all the bits except the one(s) you want to 0. So let's say you have an 8-bit variable and you want to check if the 5th bit from the right is a 1. Let's say your variable is 00101100. To mask all the other bits we set all the bits except the 5th one to 0 using the & operator:
00101100 & 00010000
Now what this does is: for every bit except the 5th one, the bit from the byte on the right will be 0, so the result of the & operation will be 0. For the 5th bit, however, the value from the right byte is a 1, so the result will be whatever the value of the 5th bit from the left byte is, in this case 0.
Now to check this value you have to compare it with something. Comparing the result with zero tells you whether the bit is clear; comparing it with the byte on the right (the mask) tells you whether it is set. Here the bit is clear, so this comparison is true:
result = (00101100 & 00010000) == 00000000
To generalize this, you can retrieve any bit from the lefthand byte simply by left-shifting 00000001 until you get the bit you want. The following function achieves this:
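int getBit(char byte, int bitNum)
{
    // bitNum is 1-based, counted from the right; the result is non-zero if the bit is set
    return (byte & (0x1 << (bitNum - 1)));
}
The same approach works on variables of any size, whether it's 8, 16, 32 or 64 bits, provided the parameter and the constant being shifted have that wider type.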
Assuming the GCC extension 0b to define binary literals:
int number = 0b0001101001001011; /* 0x1A4B */
int mask = 0b0000111000000000; /* 0x0E00 */
/* &'ed: 0b0000101000000000; 0x0A00 */
int extract = mask & number; /* 0x0A00 */
if (extract == 0b0000101000000000)
/* Or if 0b is not available:
if (extract == 0x0a00 ) */
{
/* Success */
}
else
{
/* Failure */
}
You need to mask and shift. Either shift the value you are comparing to, or the value you are comparing. I find it easier to think about shifting the value you are comparing. So if you're trying to extract the 5th to 7th digits (from the left) of a 16-bit value, you shift right 9 positions (16-7) so that the 7th digit is now the rightmost, then apply 0x7 (111 in binary) as a mask to get only the rightmost three binary digits:
int i = 0x1A4B;
if (((i >> 9) & 0x07) == 0x05) { // 0x05 = 101 in binary
//do what you need to
}
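The shift-then-mask recipe can be generalized to any run of digits counted from the left of a 16-bit value. This is a sketch under that assumption, and the name field_from_left16 is hypothetical:

```c
#include <stdint.h>

/* Hypothetical helper: extract digits first..last (1-based, counted from the
   left) of a 16-bit value. Assumes 1 <= first <= last <= 16. */
unsigned field_from_left16(uint16_t value, unsigned first, unsigned last)
{
    unsigned width = last - first + 1;          /* number of digits wanted */
    unsigned shift = 16u - last;                /* brings digit `last` down to bit 0 */
    uint32_t mask = ((uint32_t)1 << width) - 1; /* `width` low bits set */
    return (unsigned)(((uint32_t)value >> shift) & mask);
}
```

For the number in the question, field_from_left16(0x1A4B, 5, 7) yields 0x5, i.e. 101 in binary.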
First, binary digits are (usually) counted from the right, so these are the 10th to 12th digits; otherwise, say the 5th to 7th most significant digits.
int mask = 0x0E00; // 0000 1110 0000 0000;
int extract = mask & number;
results in:
extract = 0000 1010 0000 0000
You can do
if (extract == 0x0A00 /*0000 1010 0000 0000*/){}
to test, or:
if (( extract >> 9 ) == 0x05){}
Both conditions evaluate to true with your sample number.
Usually with a mask you will find yourself testing a single digit. You could use a function like this to test it:
#include <stdbool.h>
#include <stdio.h>
// digit is 1-based and counted from the right
bool digit_value(unsigned int number, unsigned int digit)
{
    return ((1u << (digit - 1)) & number) != 0;
}
int main(void)
{
    unsigned int number = 0x1A4B;
    int should_be_three = 0;
    should_be_three += digit_value(number, 10);
    should_be_three += !digit_value(number, 11);
    should_be_three += digit_value(number, 12);
    printf("%s", (should_be_three == 3 ? "it worked" : "it didn't work"));
    return 0;
}
It may be simpler to check the bits one by one rather than all at once.
First, create a mask for each bit of interest (here the bits are counted from the right):
int fifthBitMask = 1 << 4;
int fifthBitResult = number & fifthBitMask;
int seventhBitMask = 1 << 6;
int seventhBitResult = number & seventhBitMask;
Now you can compare the results with zero or with the mask.
The comparison with zero can be left implicit, so a simple if is enough:
if (fifthBitResult && seventhBitResult)
{
//your code here
}
You can also compare with the masks. After the & operation, the result contains only bits that were set in the mask.
So it could look like this:
if (fifthBitResult == fifthBitMask && seventhBitResult == seventhBitMask)
{
// your code here
}
Since the test is "the result equals the mask", you can check both bits in one operation:
int mask = 0x5 << 4; // 0x5 is hex representation of 101b
int result = number & mask;
if (result == mask)
{
// your code here
}
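Note that comparing the result with a mask of 0x5 << 4 only confirms that the two outer bits are set; a field containing 111 would pass as well. A sketch that also requires the middle bit to be clear masks all three bits and compares against the pattern. The name has_pattern_101 is hypothetical:

```c
#include <stdbool.h>

/* Hypothetical helper: true if the three bits starting at bit `shift`
   (0-based, from the right) are exactly 101. Assumes shift + 3 does not
   exceed the width of unsigned int. */
bool has_pattern_101(unsigned number, unsigned shift)
{
    unsigned field   = 0x7u << shift; /* selects all three bits */
    unsigned pattern = 0x5u << shift; /* the value those bits must have */
    return (number & field) == pattern;
}
```

With this check, 0x50 at shift 4 matches, while 0x70 at shift 4 does not.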
shall I use int mask= 0x0000 1110 0000 0000, int extract = mask&number?
Yes, you can do this.
Also, how can I check if it is 101?
Sure, you can check this: the expected value is
0000 1010 0000 0000, which is 2560 as an int.
extract == 2560
First of all, your calculation for bits 7-6-5 (counted from the right, starting at 0) is incorrect. You stated it was 101, but it is 010 (for 0x1A4B).
Second of all, to get these bits (the value represented by these bits) you should do &0xE0.
int my_bits_from_5to7 = number & 0xE0;

How do I extract bits from 32 bit number

I do not have much knowledge of C and I'm stuck with a problem since one of my colleagues is on leave.
I have a 32-bit number and I have to extract bits from it. I did go through a few threads but I'm still not clear how to do so. I would be highly obliged if someone could help me.
Here is an example of what I need to do:
Assume hex number = 0xD7448EAB.
In binary = 1101 0111 0100 0100 1000 1110 1010 1011.
I need to extract the 16 bits, and output that value. I want bits 10 through 25.
The lower 10 bits (Decimal) are ignored. i.e., 10 1010 1011 are ignored.
And the upper 6 bits (Overflow) are ignored. i.e. 1101 01 are ignored.
The remaining 16 bits of data need to be the output, which is 11 0100 0100 1000 11.
This was an example but I will keep getting different hex numbers all the time and I need to extract the same bits as I explained.
How do I solve this?
Thank you.
For this example you would output 1101 0001 0010 0011, which is 0xD123, or 53,539 decimal.
You need masks to get the bits you want. Masks are numbers that you can use to sift through bits in the manner you want (keep bits, delete/clear bits, modify numbers etc). What you need to know are the AND, OR, XOR, NOT, and shifting operations. For what you need, you'll only need a couple.
You know shifting: x << y moves the bits of x by y positions to the left.
How to get x bits set to 1, in order: (1 << x) - 1
How to get x bits set to 1, in order, from position y to y + x - 1: ((1 << x) - 1) << y
The above is your mask for the bits you need. So for example if you want 16 bits of 0xD7448EAB, from 10 to 25, you'll need the above, for x = 16 and y = 10.
And now to get the bits you want, just AND your number 0xD7448EAB with the mask above and you'll get the masked 0xD7448EAB with only the bits you want. Later, if you want to go through each one, you'll need to shift your result by 10 to the right and process each bit at a time (at position 0).
The answer may be a bit longer, but it's better design than just hard coding with 0xff or whatever.
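The mask recipe above can be sketched as two small functions. This assumes 32-bit values and 0-based bit positions, and the names make_mask and extract_field are hypothetical:

```c
#include <stdint.h>

/* Hypothetical helper: mask with `count` consecutive 1-bits starting at bit
   `start` (0-based). Assumes count >= 1 and start + count <= 32. */
uint32_t make_mask(unsigned count, unsigned start)
{
    /* Shifting a 32-bit value by 32 is undefined, so treat the full width separately. */
    uint32_t ones = (count >= 32) ? UINT32_MAX : (((uint32_t)1 << count) - 1);
    return ones << start;
}

/* Hypothetical helper: keep `count` bits starting at `start`, moved down to bit 0. */
uint32_t extract_field(uint32_t value, unsigned count, unsigned start)
{
    return (value & make_mask(count, start)) >> start;
}
```

For the number in the question, make_mask(16, 10) is 0x03FFFC00 and extract_field(0xD7448EAB, 16, 10) is 0xD123.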
OK, here's how I wrote it:
#include <stdint.h>
#include <stdio.h>
int main(void) {
uint32_t in = 0xd7448eab;
uint16_t out = 0;
out = in >> 10; // Shift right 10 bits
out &= 0xffff; // Only lower 16 bits
printf("%x\n",out);
}
The in >> 10 shifts the number right 10 bits; the & 0xffff discards all bits except the lower 16 bits.
I want bits 10 through 25.
You can do this:
unsigned int number = 0xD7448EAB;
unsigned int value = (number & 0x3FFFC00) >> 10;
Or this:
unsigned int number = 0xD7448EAB;
unsigned int value = (number >> 10) & 0xFFFF;
I combined the top 2 answers above to write a C program that extracts the bits for any range of bits (not just 10 through 25) of a 32-bit unsigned int. The way the function works is that it returns bits lo to hi (inclusive) of num.
#include <stdio.h>
#include <stdint.h>
unsigned extract(unsigned num, unsigned hi, unsigned lo) {
    uint32_t range = (hi - lo + 1); //number of bits to be extracted
    //shifting a number by the number of bits it has is undefined behavior,
    //so we need a special case for extract(num, 31, 0)
    if(range == 32)
        return num;
    uint32_t result = 0;
    //following the rule above, ((1 << x) - 1) << y makes the mask;
    //1u keeps the shift unsigned, so range == 31 is well defined
    uint32_t mask = ((1u << range) - 1) << lo;
    //AND num and mask to get only the bits in our range
    result = num & mask;
    result = result >> lo; //gets rid of trailing 0s
    return result;
}
int main(void) {
    unsigned int num = 0xd7448eab;
    printf("0x%x\n", extract(num, 25, 10)); //hi = 25, lo = 10; prints 0xd123
}

C: bit operations on a variable-length bit string

I'm doing some bit operations on a variable-length bit string.
I defined a function setBits(char *res, int x, int y) that should work on that bit string passed by the *res variable, given a x and y (just to mention, I'm trying to implement something like a Bloom filter using 8 bits per x):
void setBits(char *res, int x, int y)
{
*res |= x << (y * 8);
}
E.g. given the following x-y-vectors {0,0} ; {0,1} ; {1,2} ; {2,3}, I expect a bit string like this (or vice-versa depending whether little- or big-endian, but that isn't important right now):
0000 0010 0000 0001 0000 0000 0000 0000
So the lowest 8 bits should come from {0,0}, the second 8 bits from {0,1}, the next 8 bits come from {1,2} and the last from {2,3}.
Unfortunately, and I don't seem to get the reason for that, setBits always returns only the last result (in this case i.e. the bit string from {2,3}). I debugged the code and realized that *res is always 0 - but why? What am I doing wrong? Is it that I chose char* that it doesn't work or am I completely missing something very stupid?
Assuming 8-bit chars, the maximum value you can store in *res is 0xff i.e. (1<<8)-1.
Consider what happens when you call setBits for x=1, y=1
x << (y * 8) == 1 << (1 * 8)
== 1 << 8
== 0x100
*res is an 8-bit value so can only store the bottom 8 bits of this calculation. For any non-zero value of y, the bits which can be stored in *res are guaranteed to be 0.
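One possible fix, sketched under the assumption that res points to an array with one byte per y slot: index the byte directly instead of shifting within a single char. The name set_bits and the len parameter are hypothetical additions, not part of the original code:

```c
#include <stddef.h>

/* Hypothetical variant of setBits: treat res as an array of len bytes and
   OR the low 8 bits of x into byte number y. */
void set_bits(unsigned char *res, size_t len, unsigned x, size_t y)
{
    if (y < len)                              /* ignore out-of-range slots */
        res[y] |= (unsigned char)(x & 0xFFu); /* one byte per y */
}
```

With a zeroed 4-byte buffer and the vectors {0,0}, {0,1}, {1,2}, {2,3} from the question, this leaves the bytes 0, 0, 1, 2 in slots 0 to 3.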

C bitwise shift

I suppose sizeof(char) is one byte. Then when I write following code,
#include<stdio.h>
int main(void)
{
char x = 10;
printf("%d", x<<5);
}
The output is 320
My question is, if char is one byte long and value is 10, it should be:
0000 1010
When I shift by 5, shouldn't it become:
0100 0001
so why is output 320 and not 65?
I am using gcc on Linux and checked that sizeof(char) = 1
In C, all intermediates that are smaller than int are automatically promoted to int.
Therefore, your char is being promoted to larger than 8 bits.
So your 0000 1010 is being shifted up by 5 bits to get 320. (nothing is shifted off the top)
If you want to rotate, you need to do two shifts and a mask:
unsigned char x = 10;
x = (x << 5) | (x >> 3);
x &= 0xff;
printf("%d", x);
It's possible to do it faster using inline assembly or if the compiler supports it, intrinsics.
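The rotate above can be generalized to any shift count. This is a plain C sketch, and the name rotl8 is hypothetical:

```c
#include <stdint.h>

/* Hypothetical helper: rotate an 8-bit value left by n positions. */
uint8_t rotl8(uint8_t v, unsigned n)
{
    n &= 7u; /* rotating by 8 is the same as rotating by 0 */
    /* v is promoted to int for the shifts; the cast back to uint8_t
       discards the bits that moved past bit 7 */
    return (uint8_t)((v << n) | (v >> (8u - n)));
}
```

For the value in the question, rotl8(10, 5) gives 65, which is the 0100 0001 the asker expected.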
Mysticial is right. If you do
char x = 10;
printf("%c", x << 5);
It prints "@", which, if you check your ASCII table, is 64 (320 truncated to 8 bits).
0000 1010 << 5 = 0001 0100 0000
You had overflow, but since it was promoted to an int, it just printed the number.
Because what you describe is a rotate, not a shift. 0 is always shifted in on left shifts.
