How to detect in C whether your machine is 32-bits - c

So I am revising for an exam and I got stuck in this problem:
2.67 ◆◆
You are given the task of writing a procedure int_size_is_32() that yields 1
when run on a machine for which an int is 32 bits, and yields 0 otherwise. You are
not allowed to use the sizeof operator. Here is a first attempt:
1 /* The following code does not run properly on some machines */
2 int bad_int_size_is_32() {
3 /* Set most significant bit (msb) of 32-bit machine */
4 int set_msb = 1 << 31;
5 /* Shift past msb of 32-bit word */
6 int beyond_msb = 1 << 32;
7
8 /* set_msb is nonzero when word size >= 32
9 beyond_msb is zero when word size <= 32 */
10 return set_msb && !beyond_msb;
11 }
When compiled and run on a 32-bit Sun SPARC, however, this procedure returns 0. The following compiler message gives us an indication of the problem: warning: left shift count >= width of type
A. In what way does our code fail to comply with the C standard?
B. Modify the code to run properly on any machine for which data type int is
at least 32 bits.
C. Modify the code to run properly on any machine for which data type int is
at least 16 bits.
__________ MY ANSWERS:
A: When we shift by 31 in line 4, we overflow, bec according to the unsigned integer standard, the maximum unsigned integer we can represent is 2^31-1
B: In line 4 1<<30
C: In line 4 1<<14 and in line 6 1<<16
Am I right? And if not why please? Thank you!
__________ Second tentative answer:
B: In line 4 (1<<31)>>1 and in line 6: int beyond_msb = set_msb+1; I think I might be right this time :)

A: When we shift by 31 in line 4, we overflow, bec according to the unsigned integer standard, the maximum unsigned integer we can represent is 2^31-1
The error is on line 6, not line 4. The compiler message explains exactly why: shifting by a number of bits greater than or equal to the width of the type is undefined behavior.
B: In line 4 1<<30
C: In line 4 1<<14 and in line 6 1<<16
Both of those changes will cause the error to not appear, but will also make the function give incorrect results. You will need to understand how the function works (and how it doesn't work) before you fix it.

First of all, shifting by 30 does not cause any overflow, since the largest shift you can make is w-1, where w is the word size.
So when w = 32 you can shift by up to 31.
Overflow occurs when you shift by 32 bits, because the lsb would move to the 33rd bit, which is out of bounds.
So the problem is in line 6, not line 4.
For B:
0xffffffff + 1
If int is 32 bits this evaluates to 0; if int is wider it gives a nonzero number.

There is absolutely no way to test the size of signed types in C at runtime. This is because overflow is undefined behavior; you cannot tell if overflow has happened. If you use unsigned int, you can just count how many times you can double a value that starts at 1 before the result becomes zero.
If you want to do the test at compile-time instead of runtime, this will work:
struct { int x:N; };
where N is replaced by successively larger values. The compiler is required to accept the program as long as N is no larger than the width of int, and reject it with a diagnostic/error when N is larger.

You should be able to comply with the C standard by breaking up the shifts left.
B -
Replace Line 6 with
int beyond_msb = (1 << 31) << 1;
C -
Replace Line 4 with
int set_msb = ((1 << 15) << 15) << 1 ;
Replace Line 6 with
int beyond_msb = ((1 << 15) << 15) << 2;
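Putting the split shifts together, a minimal sketch of the part C version might look like the following. It uses unsigned int instead of int, which is an assumption beyond the answer above: shifting a 1 into or past the sign bit of a signed int is not defined by the standard, whereas unsigned shifts simply discard the bits that fall off, as long as each individual shift count is smaller than the width.

```c
/* Sketch: returns 1 when int is exactly 32 bits wide, 0 otherwise.
   Assumes only that int (and therefore unsigned int) has at least 16 bits. */
int int_size_is_32(void) {
    /* Reach bit 31 in steps of at most 15, so that no single shift
       count can reach the width of a 16-bit type. */
    unsigned int set_msb = ((1u << 15) << 15) << 1;
    /* One more step pushes the bit out of a 32-bit word. */
    unsigned int beyond_msb = set_msb << 1;
    /* set_msb is nonzero when the width is >= 32;
       beyond_msb is zero when the width is <= 32. */
    return set_msb != 0 && beyond_msb == 0;
}
```

On a 16-bit int the bit is already lost at the second shift, so set_msb is 0 and the function returns 0 without any shift count ever reaching the width.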
Also, as an extension to the question, the following should satisfy both B and C and stay safe at run time: shift left one bit at a time until the value becomes all zeroes.
int int_size_is_32() {
//initialise our test variable; unsigned, because shifting a 1 into or past the sign bit of a signed int is undefined.
unsigned int x = 1;
//count for checking purposes
int count = 0;
//keep shifting left 1 bit until the 1-bit has been pushed off the left end of the type.
while ( x != 0 ) {
x <<= 1; //shift left
count++;
}
//a 32-bit value needs exactly 32 shifts to become zero
return (count==32);
}

Related

2^32 - 1 not part of uint32_t?

Here is the program whose compilation output makes me cry:
#include <inttypes.h>
int main()
{
uint32_t limit = (1 << 32) - 1; // 2^32 - 1 right?
}
and here is the compilation output:
~/workspace/CCode$ gcc uint32.c
uint32.c: In function ‘main’:
uint32.c:5:29: warning: left shift count >= width of type [-Wshift-count-overflow]
uint32_t limit = (1 << 32) - 1; // 2^32 - 1 right?
I thought that (1 << 32) - 1 equals to 2^32 - 1 and that unsigned integers on 32 bits range from 0 to 2^32 - 1, isnt it the case? Where did I go wrong?
The warning is correct, the highest bit in a 32bit number is the 31st bit (0 indexed) so the largest shift before overflow is 1 << 30 (30 because of the sign bit). Even though you are doing -1 at some point the result of 1 << 32 must be stored and it will be stored in an int (which in this case happens to be 32 bits). Hence you get the warning.
If you really need to get the max of the 32 bit unsigned int you should do it the neat way:
#include <stdint.h>
uint32_t limit = UINT32_MAX;
Or better yet, use the c++ limits header:
#include <limits>
auto limit = std::numeric_limits<uint32_t>::max();
You have two errors:
1 is of type int, so you are computing the initial value as an int, not as a uint32_t.
As the warning says, shift operators must have their shift argument be less than the width of the type. 1 << 32 is undefined behavior if int is 32 bits or less. (uint32_t)1 << 32 would be undefined as well.
(also, note that 1 << 31 would be undefined behavior as well, if int is 32 bits, because of overflow)
Since arithmetic is done modulo 2^32 anyways, an easier way to do this is just
uint32_t x = -1;
uint32_t y = (uint32_t)0 - 1; // this way avoids compiler warnings
The compiler is using int internally in your example when trying to calculate the target constant. Imagine that the compiler didn't have any optimization available and had to generate assembler for your shift. The number 32 would be too big for the 32-bit int shift instruction.
Also, if you want all bits set, use ~0
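As a rough illustration of the suggestions above, here are three ways to obtain 2^32 - 1 without ever shifting a 32-bit value by 32. The function names are made up for this sketch; UINT64_C comes from stdint.h.

```c
#include <stdint.h>

/* Unsigned arithmetic wraps modulo 2^32, so 0 - 1 gives the maximum. */
uint32_t max_by_wraparound(void) {
    return (uint32_t)0 - 1;
}

/* Invert all bits of a zero; the outer cast keeps the result 32 bits wide. */
uint32_t max_by_complement(void) {
    return (uint32_t)~(uint32_t)0;
}

/* Do the shift in a 64-bit type, where a shift count of 32 is valid. */
uint32_t max_by_wide_shift(void) {
    return (uint32_t)((UINT64_C(1) << 32) - 1);
}
```

All three should compare equal to UINT32_MAX, which remains the most readable choice when stdint.h is available.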

find ones position in 64 bit number

I'm trying to find the position of two 1's in a 64 bit number. In this case the ones are at the 0th and 63rd position. The code here returns 0 and 32, which is only half right. Why does this not work?
#include<stdio.h>
void main()
{
unsigned long long number=576460752303423489;
int i;
for (i=0; i<64; i++)
{
if ((number & (1 << i))==1)
{
printf("%d ",i);
}
}
}
There are two bugs on the line
if ((number & (1 << i))==1)
which should read
if (number & (1ull << i))
Changing 1 to 1ull means that the left shift is done on a value of type unsigned long long rather than int, and therefore the bitmask can actually reach positions 32 through 63. Removing the comparison to 1 is because the result of number & mask (where mask has only one bit set) is either mask or 0, and mask is only equal to 1 when i is 0.
However, when I make that change, the output for me is 0 59, which still isn't what you expected. The remaining problem is that 576460752303423489 (decimal) = 0800 0000 0000 0001 (hexadecimal). 0 59 is the correct output for that number. The number you wanted is 9223372036854775809 (decimal) = 8000 0000 0000 0001 (hex).
Incidentally, main is required to return int, not void, and needs an explicit return 0; as its last action (unless you are doing something more sophisticated with the return code). Yes, C99 lets you omit that. Do it anyway.
Because (1 << i) is a 32-bit int value on the platform you are compiling and running on. This then gets sign-extended to 64 bits for the & operation with the number value, resulting in bit 31 being duplicated into bits 32 through 63.
Also, you are comparing the result of the & to 1, which isn't correct. It will not be 0 if the bit is set, but it won't be 1.
Shifting a 32-bit int by 32 is undefined.
Also, your input number is incorrect. The bits set are at positions 0 and 59 (or 1 and 60 if you prefer to count starting at 1).
The fix is to use (1ull << i), or otherwise to right-shift the original value and & it with 1 (instead of left-shifting 1). And of course if you do left-shift 1 and & it with the original value, the result won't be 1 (except for bit 0), so you need to compare != 0 rather than == 1.
#include<stdio.h>
int main()
{
unsigned long long number = 576460752303423489;
int i;
for (i=0; i<64; i++)
{
if ((number & (1ULL << i))) //here
{
printf("%d ",i);
}
}
}
First is to use 1ULL to represent unsigned long long constant. Second is in the if statement, what you mean is not to compare with 1, that will only be true for the rightmost bit.
Output: 0 59
It's correct because 576460752303423489 is equal to 0x800000000000001
The problem could have been avoided in the first place by adopting the methodology of applying the >> operator to a variable, instead of a literal:
if ((variable >> other_variable) & 1)
...
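A minimal sketch of that right-shift approach, written as a function instead of a loop in main. The name find_set_bits and its parameters are hypothetical; the idea is only that the variable is shifted down and the lowest bit is tested, so the type of the literal 1 no longer matters.

```c
#include <limits.h>

/* Writes the positions of the set bits of value into positions[]
   (lowest position first, at most capacity entries) and returns
   how many set bits were found in total. */
int find_set_bits(unsigned long long value, int positions[], int capacity) {
    int width = (int)(sizeof value * CHAR_BIT);
    int found = 0;
    for (int i = 0; i < width; i++) {
        /* Shift the variable, not the constant: the shift is done in
           unsigned long long, so every position up to width-1 is reachable. */
        if ((value >> i) & 1) {
            if (found < capacity)
                positions[found] = i;
            found++;
        }
    }
    return found;
}
```

For 576460752303423489 this reports positions 0 and 59, and for 9223372036854775809 it reports 0 and 63.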
I know the question is old and already has multiple correct answers, and this should really be a comment, but it is a bit too long for one. I advise you to encapsulate the bit-checking logic in a macro and not to use the number 64 directly, but rather to calculate it. Take a look here for quite a comprehensive source of bit manipulation hacks.
#include<stdio.h>
#include<limits.h>
#define CHECK_BIT(var,pos) ((var) & (1ULL<<(pos)))
int main(void)
{
unsigned long long number=576460752303423489;
int pos=sizeof(unsigned long long)*CHAR_BIT;
while((pos--)>0) {
if(CHECK_BIT(number,pos))
printf("%d ",pos);
}
return(0);
}
Rather than resorting to bit manipulation, one can use compiler facilities to perform bit analysis tasks in the most efficient manner (using only a single CPU instruction in many cases).
For example, gcc and clang provide those handy routines:
__builtin_popcountll() - number of bits set in the 64b value
__builtin_clzll() - number of leading zeroes in the 64b value
__builtin_ctzll() - number of trailing zeroes in the 64b value
__builtin_ffsll() - one plus the bit index of the least significant set bit in the 64b value (0 if the value is zero)
Other compilers have similar mechanisms.
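A small sketch of how those builtins could answer the original question. This assumes gcc or clang (the builtins are compiler extensions, not standard C) and a 64-bit unsigned long long; the wrapper names are invented for the example.

```c
/* Position of the lowest set bit. x must be nonzero:
   __builtin_ctzll is undefined for 0. */
int lowest_set_bit(unsigned long long x) {
    return __builtin_ctzll(x);
}

/* Position of the highest set bit, assuming a 64-bit unsigned long long.
   x must be nonzero: __builtin_clzll is undefined for 0. */
int highest_set_bit(unsigned long long x) {
    return 63 - __builtin_clzll(x);
}

/* Number of set bits; defined for every input, including 0. */
int count_set_bits(unsigned long long x) {
    return __builtin_popcountll(x);
}
```

For the number from the question, 576460752303423489, these give 0, 59 and 2 respectively.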

Left shift operator in C

Consider:
#include <stdio.h>
#define macro(a) a=a<<4;
main()
{
int a = 0x59;
printf("%x", a);
printf("\n");
macro(a)
printf("%x", a);
}
For the above code, I am getting the below output:
59
590
Why am I not getting the below output as the left shift operation?
59
90
Left shifts do not truncate the number to fit the length of the original one. To get 90, use:
(a<<4) & 0xff
0x59 is an int and probably on your platform it has sizeof(int)==4. Then it's a 0x00000059. Left shifting it by 4 gives 0x00000590.
Also, form a good habit of using unsigned int types when dealing with bitwise operators, unless you know what you are doing. They have different behaviours in situations like a right shift.
You shifted a hexadecimal number by 4 places to left so you get 590, which is correct.
You had
000001011001
shifted to left by 4 bits
010110010000
is 590 in hexadecimal
10010000
is 90 in hexadecimal, so you might want to remove 0101 as is shown by phoeagon.
In your printf, if you change %x to %d, you get a = 89.
And after left shifting you will get a = 1424.
Generally for decimal (base 10) numbers
a = a<< n is a = a*2^n
a = a>> n is a = a/2^n
For hexadecimal (base 16) numbers, any shift by n (left or right), can be considered, as a corresponding shift of the digits of the binary equivalent. But this depends on sizeof(int), used for a given compiler.
You are using int, so you have:
000001011001
If you shift it by 4 to the left, you get
010110010000
If you want to keep only the lowest 8 bits, you don't have to use "int"; use unsigned char instead:
#include<stdio.h>
#define macro(a) a=a<<4;
main()
{
unsigned char a=0x59;
printf("%x",a);
printf("\n");
macro(a)
printf("%x",a);
}
If you still want to use int, but only keep the first 8 bits, you can use a mask:
#define macro(a) a=(a<<4) & 0xFF;
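The masking idea can also be sketched as a small function instead of a macro, which avoids the usual macro pitfalls. The name shift_left_8bit is hypothetical, and the unsigned parameter type follows the advice given earlier about bitwise operators.

```c
/* Shifts value left by n bits and keeps only the low 8 bits of the result.
   n must be smaller than the width of unsigned int. */
unsigned int shift_left_8bit(unsigned int value, unsigned int n) {
    return (value << n) & 0xFFu;
}
```

With value 0x59 and n 4 the intermediate result is 0x590, and the mask reduces it to 0x90.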
So could you please tell me what should I do so as to get the output
as 0x90? I need to shift the last 4 bits to the first 4 bits adding 0's
at the end
The only way to shift the last 4 bits 4 places to the left AND have them end up in the place of the first 4 bits is if your type has just 8 bits. Usually this is the case for unsigned char, not int. You will get 0x90 for
unsigned char a=0x59;
macro(a)
but when using int the result is 0x590
The error is not in the use of <<, it is in the choice of type (or a misuse of the macro?).

How to set the 513th bit of a char[1024] in C?

I was recently asked in an interview how to set the 513th bit of a char[1024] in C, but I'm unsure how to approach the problem. I saw How do you set, clear, and toggle a single bit?, but how do I choose the bit from such a large array?
int bitToSet = 513;
inArray[bitToSet / 8] |= (1 << (bitToSet % 8));
...making certain assumptions about character size and desired endianness.
EDIT: Okay, fine. You can replace 8 with CHAR_BIT if you want.
#include <limits.h>
int charContaining513thBit = 513 / CHAR_BIT;
int offsetOf513thBitInChar = 513 - charContaining513thBit*CHAR_BIT;
int bit513 = array[charContaining513thBit] >> offsetOf513thBitInChar & 1;
You have to know the width of characters (in bits) on your machine. For pretty much everyone, that's 8. You can use the constant CHAR_BIT from limits.h in a C program. You can then do some fairly simple math to find the offset of the bit (depending on how you count them).
Numbering bits from the left, with the 2⁷ bit in a[0] being bit 0, the 2⁰ bit being bit 7, and the 2⁷ bit in a[1] being bit 8, this gives:
offset = 513 / CHAR_BIT; /* using integer (truncating) math, of course */
bit = 513 % CHAR_BIT;
a[offset] |= (0x80>>bit);
There are many sane ways to number bits, here are two:
a[0] a[1]
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 This is the above
7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 This is |= (1<<bit)
You could also number from the other end of the array (treating it as one very large big-endian number).
Small optimization:
The / and % operators are rather slow, even on a lot of modern cpus, with modulus being slightly slower. I would replace them with the equivalent operations using bit shifting (and subtraction), which only works nicely when the second operand is a power of two, obviously.
x / 8 becomes x >> 3
x % 8 becomes x-((x>>3)<<3)
for this second operation, just reuse the result from the initial division.
Depending on the desired order (left to right versus right to left), it might change. But the general idea, assuming 8 bits per byte, would be to choose the byte as follows. This is expanded into lots of lines of code to hopefully show more clearly the intended steps (or perhaps it just obfuscates the intention):
int bitNum = 513;
int bytePos = bitNum / 8;
Then the bit position would be computed as:
int bitInByte = bitNum % 8;
Then set the bit (assuming the goal is to set it to 1 as opposed to clear or toggle it):
charArray[bytePos] |= ( 1 << bitInByte );
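The same steps can be wrapped into a pair of helpers. This is only a sketch under a few assumptions: bits are numbered from zero starting at the least significant bit of element 0, the buffer is unsigned char (to avoid sign issues with plain char), and the names set_bit and get_bit are invented here.

```c
#include <limits.h>
#include <stddef.h>

/* Sets the bit with the given zero-based index to 1. */
void set_bit(unsigned char *array, size_t bit_index) {
    size_t byte_pos = bit_index / CHAR_BIT;      /* which element */
    size_t bit_in_byte = bit_index % CHAR_BIT;   /* which bit inside it */
    array[byte_pos] |= (unsigned char)(1u << bit_in_byte);
}

/* Returns 1 if the bit with the given zero-based index is set, else 0. */
int get_bit(const unsigned char *array, size_t bit_index) {
    return (array[bit_index / CHAR_BIT] >> (bit_index % CHAR_BIT)) & 1;
}
```

Under this numbering the "513th bit" counted from one is index 512, so the call would be set_bit(buffer, 512); counted from zero it would be set_bit(buffer, 513).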
When you say 513th are you using index 0 or 1 for the 1st bit? If it's the former your post refers to the bit at index 512. I think the question is valid since everywhere else in C the first index is always 0.
BTW
static char chr[1024];
...
chr[512>>3]|=1<<(512&0x7);

How do the bit manipulations in this bit-sorting code work?

Jon Bentley, in Column 1 of his book Programming Pearls, introduces a technique for sorting a sequence of non-zero positive integers using bit vectors.
I have taken the program bitsort.c from here and pasted it below:
/* Copyright (C) 1999 Lucent Technologies */
/* From 'Programming Pearls' by Jon Bentley */
/* bitsort.c -- bitmap sort from Column 1
* Sort distinct integers in the range [0..N-1]
*/
#include <stdio.h>
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];
void set(int i)
{
int sh = i>>SHIFT;
a[i>>SHIFT] |= (1<<(i & MASK));
}
void clr(int i) { a[i>>SHIFT] &= ~(1<<(i & MASK)); }
int test(int i){ return a[i>>SHIFT] & (1<<(i & MASK)); }
int main()
{ int i;
for (i = 0; i < N; i++)
clr(i);
/*Replace above 2 lines with below 3 for word-parallel init
int top = 1 + N/BITSPERWORD;
for (i = 0; i < top; i++)
a[i] = 0;
*/
while (scanf("%d", &i) != EOF)
set(i);
for (i = 0; i < N; i++)
if (test(i))
printf("%d\n", i);
return 0;
}
I understand what the functions clr, set and test are doing and explain them below: ( please correct me if I am wrong here ).
clr clears the ith bit
set sets the ith bit
test returns the value at the ith bit
Now, I don't understand how the functions do what they do. I am unable to figure out all the bit manipulation happening in those three functions.
The first 3 constants are inter-related. BITSPERWORD is 32. This you'd want to set based on your compiler+architecture. SHIFT is 5, because 2^5 = 32. Finally, MASK is 0x1F which is 11111 in binary (ie: the bottom 5 bits are all set). Equivalently, MASK = BITSPERWORD - 1.
The bitset is conceptually just an array of bits. This implementation actually uses an array of ints, and assumes 32 bits per int. So whenever we want to set, clear or test (read) a bit we need to figure out two things:
which int (of the array) is it in
which of that int's bits are we talking about
Because we're assuming 32 bits per int, we can just divide by 32 (and truncate) to get the array index we want. Dividing by 32 (BITSPERWORD) is the same as shifting to the right by 5 (SHIFT). So that's what the a[i>>SHIFT] bit is about. You could also write this as a[i/BITSPERWORD] (and in fact, you'd probably get the same or very similar code assuming your compiler has a reasonable optimizer).
Now that we know which element of a we want, we need to figure out which bit. Really, we want the remainder. We could do this with i%BITSPERWORD, but it turns out that i&MASK is equivalent. This is because BITSPERWORD is a power of 2 (2^5 in this case) and MASK is the bottom 5 bits all set.
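To make the equivalence concrete, here is a small sketch that isolates the two index computations. The constants are the ones from bitsort.c; the helper names word_index and bit_in_word are made up for illustration, and i is assumed to be non-negative.

```c
#define BITSPERWORD 32
#define SHIFT 5      /* log2(BITSPERWORD) */
#define MASK 0x1F    /* BITSPERWORD - 1 */

/* Which int of the array holds bit i: same as i / BITSPERWORD. */
int word_index(int i) {
    return i >> SHIFT;
}

/* Which bit inside that int: same as i % BITSPERWORD. */
int bit_in_word(int i) {
    return i & MASK;
}
```

For example, i = 36 lands in word 1 at bit 4, because 36 = 1*32 + 4.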
Basically it is an optimized bucket sort:
reserve a bit array of length n bits.
clear the bit array (first for in main).
read the items one by one (they must all be distinct).
set the i'th bit in the bit array if the read number is i.
iterate the bit array.
if the bit is set then print the position.
Or in other words (for numbers below 10, sorting the 3 numbers 4, 6, 2):
start with an empty 10 bit array (aka one integer usually)
0000000000
read 4 and set the bit in the array..
0000100000
read 6 and set the bit in the array
0000101000
read 2 and set the bit in the array
0010101000
iterate the array and print every position in which the bits are set to one.
2, 4, 6
sorted.
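The walkthrough above fits in a few lines of C. This is a sketch and not Bentley's code: it handles only distinct values in the range 0 to 31 so that a single unsigned long can serve as the whole bit array, and the name bitmap_sort_small is hypothetical.

```c
#include <stddef.h>

/* Sorts n distinct values, each in the range [0, 32), by setting one bit
   per value and then reading the bits back in order.
   Writes the sorted values to output and returns how many were written. */
size_t bitmap_sort_small(const int *input, size_t n, int *output) {
    unsigned long bits = 0;   /* unsigned long has at least 32 bits */
    size_t written = 0;

    /* Insert phase: set bit i for every value i that is read. */
    for (size_t k = 0; k < n; k++)
        bits |= 1UL << input[k];

    /* Output phase: walk the positions in increasing order. */
    for (int value = 0; value < 32; value++)
        if ((bits >> value) & 1UL)
            output[written++] = value;

    return written;
}
```

Feeding it 4, 6, 2 yields 2, 4, 6, matching the steps shown above.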
Starting with set():
A right shift of 5 is the same as dividing by 32. It does that to find which int the bit is in.
MASK is 0x1f or 31. ANDing with the address gives the bit index within the int. It's the same as the remainder of dividing the address by 32.
Shifting 1 left by the bit index ("1<<(i & MASK)") results in an integer which has just 1 bit in the given position set.
ORing sets the bit.
The line "int sh = i>>SHIFT;" is a wasted line, because they didn't use sh again beneath it, and instead just repeated "i>>SHIFT"
clr() is basically the same as set, except instead of ORing with 1<<(i & MASK) to set the bit, it ANDs with the inverse to clear the bit. test() ANDs with 1<<(i & MASK) to test the bit.
The bitsort will also remove duplicates from the list, because it will only count up to 1 per integer. A sort that uses integers instead of bits to count more than 1 of each is called a counting sort.
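For comparison, a sort that keeps an integer count per value instead of a single bit preserves duplicates. A minimal sketch, assuming all values lie in the range 0 to RANGE-1; both RANGE and the function name are hypothetical choices for this example.

```c
#include <stddef.h>

#define RANGE 10   /* hypothetical: values must be in [0, RANGE) */

/* Sorts values in place by counting how often each value occurs. */
void counting_sort_small(int *values, size_t n) {
    size_t counts[RANGE] = {0};
    size_t out = 0;

    /* Count phase: one counter per possible value. */
    for (size_t k = 0; k < n; k++)
        counts[values[k]]++;

    /* Output phase: emit each value as many times as it was seen. */
    for (int v = 0; v < RANGE; v++)
        for (size_t c = 0; c < counts[v]; c++)
            values[out++] = v;
}
```

The price is memory: each slot is now a full integer instead of one bit.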
The bit magic is used as a special addressing scheme that works well with row sizes that are powers of two.
Try to understand it this way (note: I'd rather use bits-per-row than bits-per-word, since we're talking about a bit-matrix here):
// supposing an int of 1 bit would exist...
int1 bits[BITSPERROW * N]; // an array of N x BITSPERROW elements
// set bit at x,y:
int linear_address = y*BITSPERROW + x;
bits[linear_address] = 1; // or 0
// 0 1 2 3 4 5 6 7 8 9 10 11 ... 31
// . . . . . . . . . . . . .
// . . . . X . . . . . . . . -> x = 4, y = 1 => i = (1*32 + 4)
The statement linear_address = y*BITSPERROW + x also means that x = linear_address % BITSPERROW and y = linear_address / BITSPERROW.
When you optimize this in memory by using 1 word of 32 bits per row, you get the fact that a bit at column x can be set using
int bitrow = 0;
bitrow |= 1 << (x);
Now when we iterate over the bits, we have the linear address, but need to find the corresponding word.
int column = linear_address % BITSPERROW;
int bit_mask = 1 << column; // meaning for the xth column,
// you take 1 and shift that bit x times
int row = linear_address / BITSPERROW;
So to set the i'th bit, you can do this:
bits[ i / BITSPERROW ] |= 1 << ( i % BITSPERROW );
An extra gotcha is, that the modulo operator can be replaced by a logical AND, and the / operator can be replaced by a shift, too, if the second operand is a power of two.
a % BITSPERROW == a & ( BITSPERROW - 1 ) == a & MASK
a / BITSPERROW == a >> ( log2(BITSPERROW) ) == a >> SHIFT
This ultimately boils down to the very dense, yet hard-to-understand-for-the-bitfucker-agnostic notation
a[ i >> SHIFT ] |= ( 1 << (i&MASK) );
But I don't see the algorithm working for e.g. 40 bits per word.
Quoting excerpts from Bentley's original article in DDJ, this is what the code does at a high level:
/* phase 1: initialize set to empty */
for (i = 0; i < n; i++)
bit[i] = 0
/* phase 2: insert present elements */
for each i in the input file
bit[i] = 1
/* phase 3: write sorted output */
for (i = 0; i < n; i++)
if bit[i] == 1
write i on the output file
A few doubts:
1. Why is 32 bits needed?
2. Can we do this in Java by creating a HashMap with keys from 0000000 to 9999999
and values 0 or 1 based on the presence/absence of the bit? What are the implications
of such a program?
