How to set the 513th bit of a char[1024] in C? - c

I was recently asked in an interview how to set the 513th bit of a char[1024] in C, but I'm unsure how to approach the problem. I saw How do you set, clear, and toggle a single bit?, but how do I choose the bit from such a large array?

int bitToSet = 513;
inArray[bitToSet / 8] |= (1 << (bitToSet % 8));
...making certain assumptions about character size and desired endianness.
EDIT: Okay, fine. You can replace 8 with CHAR_BIT if you want.
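The one-liner above can be wrapped in small helpers. This is only a sketch: the names set_bit, clear_bit and test_bit are hypothetical, bit 0 is assumed to be the least-significant bit of the first byte, and unsigned char is used to sidestep sign issues with plain char.

```c
#include <limits.h>
#include <stddef.h>

/* Set bit number n (zero-based, LSB-first within each byte). */
static void set_bit(unsigned char *bits, size_t n)
{
    bits[n / CHAR_BIT] |= (unsigned char)(1u << (n % CHAR_BIT));
}

/* Clear bit number n, leaving every other bit untouched. */
static void clear_bit(unsigned char *bits, size_t n)
{
    bits[n / CHAR_BIT] &= (unsigned char)~(1u << (n % CHAR_BIT));
}

/* Return 1 if bit number n is set, 0 otherwise. */
static int test_bit(const unsigned char *bits, size_t n)
{
    return (bits[n / CHAR_BIT] >> (n % CHAR_BIT)) & 1u;
}
```

With 8-bit chars, set_bit(buf, 513) touches buf[64] and ORs in 0x02, since 513 / 8 is 64 and 513 % 8 is 1.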

#include <limits.h>
int charContaining513thBit = 513 / CHAR_BIT;
int offsetOf513thBitInChar = 513 - charContaining513thBit*CHAR_BIT;
int bit513 = array[charContaining513thBit] >> offsetOf513thBitInChar & 1;

You have to know the width of characters (in bits) on your machine. For pretty much everyone, that's 8. You can use the constant CHAR_BIT from limits.h in a C program. You can then do some fairly simple math to find the offset of the bit (depending on how you count them).
Numbering bits from the left, with the 2⁷ bit in a[0] being bit 0, the 2⁰ bit being bit 7, and the 2⁷ bit in a[1] being bit 8, this gives:
offset = 513 / CHAR_BIT; /* using integer (truncating) math, of course */
bit = 513 % CHAR_BIT;
a[offset] |= (0x80>>bit);
There are many sane ways to number bits, here are two:
a[0]                      a[1]
 0  1  2  3  4  5  6  7    8  9 10 11 12 13 14 15   This is the above
 7  6  5  4  3  2  1  0   15 14 13 12 11 10  9  8   This is |= (1<<bit)
You could also number from the other end of the array (treating it as one very large big-endian number).
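As a rough illustration of the left-to-right numbering, here is a sketch of a setter that treats the most-significant bit of a[0] as bit 0. The function name set_bit_msb_first is hypothetical, and the array is assumed to be unsigned char.

```c
#include <limits.h>
#include <stddef.h>

/* Bit 0 is the most-significant bit of a[0], bit CHAR_BIT is the
   most-significant bit of a[1], and so on. */
static void set_bit_msb_first(unsigned char *a, size_t n)
{
    a[n / CHAR_BIT] |= (unsigned char)(1u << (CHAR_BIT - 1 - n % CHAR_BIT));
}
```

With 8-bit chars, bit 513 lands in a[64] under either numbering, but this version ORs in 0x40, whereas the 1 << bit form ORs in 0x02.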

Small optimization:
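The / and % operators can be rather slow, even on a lot of modern CPUs, with modulus being slightly slower. They can be replaced with equivalent operations using bit shifting (and subtraction), which only works nicely when the second operand is a power of two and the value is non-negative. Note that optimizing compilers commonly make this substitution themselves when the divisor is a constant power of two.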
x / 8 becomes x >> 3
x % 8 becomes x-((x>>3)<<3)
for this second operation, just reuse the result from the initial division.
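A small sketch of the shift-and-mask form, assuming 8 bits per byte and an unsigned bit number. The helper names byte_index and bit_offset are made up for illustration, and the mask n & 7 is used as a shorter equivalent of the subtraction shown above.

```c
/* Split an unsigned bit number into a byte index and a bit offset.
   Valid only because 8 is a power of two. */
static unsigned byte_index(unsigned n)
{
    return n >> 3;      /* same as n / 8 for unsigned n */
}

static unsigned bit_offset(unsigned n)
{
    return n & 7u;      /* same as n % 8 for unsigned n */
}
```

For bit 513 this gives byte 64 and offset 1, matching 513 / 8 and 513 % 8.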

The details depend on the desired bit order (left to right versus right to left), but the general idea, assuming 8 bits per byte, is to first choose the byte. The steps are spread over several lines of code here to show the intent more clearly (or perhaps it just obfuscates the intention):
int bitNum = 513;
int bytePos = bitNum / 8;
Then the bit position would be computed as:
int bitInByte = bitNum % 8;
Then set the bit (assuming the goal is to set it to 1 as opposed to clear or toggle it):
charArray[bytePos] |= ( 1 << bitInByte );

When you say 513th are you using index 0 or 1 for the 1st bit? If it's the former your post refers to the bit at index 512. I think the question is valid since everywhere else in C the first index is always 0.
BTW
static char chr[1024];
...
chr[512>>3] |= 1<<(512&0x7);

Related

Creating a byte (8 bits) from 4 2 bits

I am trying to figure out a way to get as much out of the limited memory in my microcontroller (32kb) and am seeking suggestions or pointers to an algorithm that performs what I am attempting to do.
Some background: I am sending Manchester Encoded bits out a SPI (Serial Peripheral Interface) directly from DMA. As the smallest possible unit I can store data into DMA is a byte (8 bits), I am having to represent my 1's as 0b11110000 and my 0's as 0b00001111. This basically means that for every bit of information, I need to use a byte (8 bits) of memory. Which is very inefficient.
If I could reduce this, so that my 1's are represented as 0b10 and my 0's as 0b01, I'd only have to use a 1/4 of a byte (2 bits) for every 1 bit of memory, which is fine for my solution.
Now, if I could save to DMA in bits, this would not be a problem, but of course I need to work with bytes. So I know the solution to my problem involves collecting the 8 bits (or in my case, 4 2bits) and then storing to DMA as a byte.
Questions:
Is there a standard way to solve this problem?
How can I create an 8-bit number from a collection of four 2-bit numbers? I do not want the sum of these numbers, but the bit pattern you get when they are placed side by side.
For example: I have the following 4 2 bit numbers (keeping in mind that 0b10 represents 1 and 0b01 represents 0) (Also, the type these are stored in is open to the solution, as obviously there is no such thing as a 2 bit type)
Number1: 0b01 Number 2: 0b10 Number 3: 0b10 Number4: 0b01
And I want to create the following 8 bit number from these:
8 Bit Number: 0b01 10 10 01 or without the spaces 0b01101001 (0x69)
I am programming in c
It seems that you can pack four numbers a, b, c, d, each of which is either zero or one, like so:
64 * (a + 1) + 16 * (b + 1) + 4 * (c + 1) + (d + 1)
This is using the fact that x + 1 encodes your two-bit integer: 1 becomes 0b10, and 0 becomes 0b01.
It's Manchester encoding so 0b11110000 and 0b00001111 should be the only candidates. If so, then reduce the memory by a factor of 8.
uint8_t PackedByte = 0;
for (int i = 0; i < 8; i++) {
    PackedByte <<= 1;
    if (buf[i] == 0xF0) // 0b11110000
        PackedByte++;
}
On the other hand, if it's Manchester encoding and the encoding may not be perfect, then there are 3 possible results: 0, 1, or indeterminate.
uint8_t PackedByte = 0;
for (int i = 0; i < 8; i++) {
    int upper = BitCount(buf[i] >> 4);  // set bits in the high nibble
    int lower = BitCount(buf[i] & 0xF); // set bits in the low nibble
    PackedByte <<= 1;
    if (upper > lower)
        PackedByte++;
    else if (upper == lower)
        Handle_Indeterminate();
}
Various simplifications absent in the above, but shown for logic flow.
To get the number abcd from (a,b,c,d), shift each number into its place and OR them together:
(a<<6)|(b<<4)|(c<<2)|d
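Putting the encoding and the shifting together, a minimal sketch might look like the following. The names symbol and pack4 are hypothetical, and the first data bit is assumed to occupy the two most-significant bits of the byte, as in the 0x69 example from the question.

```c
#include <stdint.h>

/* Encode one data bit as a two-bit symbol: 1 -> 0b10, 0 -> 0b01. */
static uint8_t symbol(unsigned bit)
{
    return bit ? 0x2u : 0x1u;
}

/* Pack four data bits (a first, d last) into one byte of symbols. */
static uint8_t pack4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return (uint8_t)((symbol(a) << 6) | (symbol(b) << 4) |
                     (symbol(c) << 2) |  symbol(d));
}
```

For the data bits 0, 1, 1, 0 this yields 0b01101001, that is 0x69.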

How to detect in C whether your machine is 32-bits

So I am revising for an exam and I got stuck in this problem:
2.67 ◆◆
You are given the task of writing a procedure int_size_is_32() that yields 1
when run on a machine for which an int is 32 bits, and yields 0 otherwise. You are
not allowed to use the sizeof operator. Here is a first attempt:
1 /* The following code does not run properly on some machines */
2 int bad_int_size_is_32() {
3 /* Set most significant bit (msb) of 32-bit machine */
4 int set_msb = 1 << 31;
5 /* Shift past msb of 32-bit word */
6 int beyond_msb = 1 << 32;
7
8 /* set_msb is nonzero when word size >= 32
9 beyond_msb is zero when word size <= 32 */
10 return set_msb && !beyond_msb;
11 }
When compiled and run on a 32-bit SUN SPARC, however, this procedure returns 0. The following compiler message gives us an indication of the problem: warning: left shift count >= width of type
A. In what way does our code fail to comply with the C standard?
B. Modify the code to run properly on any machine for which data type int is
at least 32 bits.
C. Modify the code to run properly on any machine for which data type int is
at least 16 bits.
__________ MY ANSWERS:
A: When we shift by 31 in line 4, we overflow, because according to the unsigned integer standard, the maximum unsigned integer we can represent is 2^31-1
B: In line 4 1<<30
C: In line 4 1<<14 and in line 6 1<<16
Am I right? And if not why please? Thank you!
__________ Second tentative answer:
B: In line 4 (1<<31)>>1 and in line 6: int beyond_msb = set_msb+1; I think I might be right this time :)
A: When we shift by 31 in line 4, we overflow, because according to the unsigned integer standard, the maximum unsigned integer we can represent is 2^31-1
The error is on line 6, not line 4. The compiler message explains exactly why: shifting by a number of bits greater than or equal to the width of the type is undefined behavior.
B: In line 4 1<<30
C: In line 4 1<<14 and in line 6 1<<16
Both of those changes will cause the error to not appear, but will also make the function give incorrect results. You will need to understand how the function works (and how it doesn't work) before you fix it.
First, shifting by 30 will not cause any overflow, since the most you can shift by is w-1, where w is the word size.
So when w = 32 you can shift by up to 31.
Overflow occurs when you shift by 32 bits, as the lsb would move to a 33rd bit, which is out of bounds.
So the problem is in line 6, not line 4.
For B.
0xffffffff + 1
If the type is 32 bits wide, the result will be 0; otherwise it will be some nonzero number.
There is absolutely no way to test the size of signed types in C at runtime. This is because overflow is undefined behavior; you cannot tell if overflow has happened. If you use unsigned int, you can just count how many times you can double a value that starts at 1 before the result becomes zero.
If you want to do the test at compile-time instead of runtime, this will work:
struct { int x:N; };
where N is replaced by successively larger values. The compiler is required to accept the program as long as N is no larger than the width of int, and reject it with a diagnostic/error when N is larger.
You should be able to comply with the C standard by breaking up the shifts left.
B -
Replace Line 6 with
int beyond_msb = (1 << 31) << 1;
C -
Replace Line 4 with
int set_msb = ((1 << 15) << 15) << 1 ;
Replace Line 6 with
int beyond_msb = ((1 << 15) << 15) << 2;
Also, as an extension to the question, the following should satisfy both B and C while staying safe at runtime. It shifts left one bit at a time until the value becomes all zeroes.
int int_size_is_32() {
    // Initialise our test variable. It is unsigned so that shifting the bit out is well defined
    // (this assumes unsigned int has the same width as int).
    unsigned int x = 1;
    // Count of shifts performed.
    int count = 0;
    // Keep shifting left 1 bit until the 1-bit has been pushed off the left end of the type.
    while ( x != 0 ) {
        x <<= 1; // shift left
        count++;
    }
    // The 1-bit falls off after exactly as many shifts as the type has bits.
    return (count == 32);
}

Combining two integers with bit-shifting

I am writing a program, I have to store the distances between pairs of numbers in a hash table.
I will be given a Range R. Let's say the range is 5.
Now I have to find distances between the following pairs:
1 2
1 3
1 4
1 5
2 3
2 4
2 5
3 4
3 5
4 5
that is, the total number of pairs is R(R-1)/2, which is 10 for R = 5. I want to store it in a hash table. All these are unsigned integers. So there are 32 bits. My idea was that, I take an unsigned long (64 bits).
Let's say I need to hash the distance between 1 and 5. Now
long k = 1;
k = k<<31;
k+=5;
Since I have 64 bits, I am storing the first number in the first 31 bits and the second number in the second 31 bits. This guarantees unique keys which can then be used for hashing.
But when I do this:
long k = 2;
k << 31;
k+= 2;
The value of k becomes zero.
I am not able to wrap my head around this shifting concept.
Essentially what I am trying to achieve is that,
An unsigned long has | 32bits | 32 bits |
Store |1st integer|2nd integer|
How can I achieve this to get unique keys for each pair?
I am running the code on a 64 bit AMD Opteron processor. sizeof(ulong) returns 8. So it is 64 bits. Do I need a long long in such a case?
Also I need to know if this will create unique keys? From my understanding , it does seem to create unique keys. But I would like a confirmation.
Assuming you're using C or something that follows vaguely similar rules, your problem is primarily with types.
long k = 2; // This defines `k` as a long
k << 31; // This (sort of) shifts k left, but still as a 32-bit long.
What you almost certainly want/need to do is convert k to a long long before you shift it left, so you're shifting in a 64-bit word.
unsigned long first_number = 2;
unsigned long long both_numbers = (unsigned long long)first_number << 32;
unsigned long second_number = 5;
both_numbers |= second_number;
In this case, if (for example) you print out both_numbers, in hexadecimal, you should get 0x0000000200000005.
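The same idea can be written with fixed-width types so that it does not depend on how wide long happens to be. This is a sketch only; make_key, key_first and key_second are hypothetical names, and both inputs are assumed to fit in 32 bits.

```c
#include <stdint.h>

/* Combine two 32-bit values into one 64-bit key, first in the high half.
   The cast happens before the shift, so the shift is done in 64 bits. */
static uint64_t make_key(uint32_t first, uint32_t second)
{
    return ((uint64_t)first << 32) | second;
}

/* Recover the two halves of a key. */
static uint32_t key_first(uint64_t key)
{
    return (uint32_t)(key >> 32);
}

static uint32_t key_second(uint64_t key)
{
    return (uint32_t)(key & 0xFFFFFFFFu);
}
```

Because each input occupies its own 32-bit half, two different ordered pairs can never produce the same key; make_key(2, 5) gives 0x0000000200000005.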
The concept makes sense. As Oli has added, you want to shift by 32, not 31 - shifting by 31 will put it in the 31st bit, so if you shifted back to the right to try and get the first number you would end up with a bit missing, and the second number would be potentially huge because you could have put a 1 in the uppermost bit.
But if you want to do bit manipulation, I would do instead:
k = 1L << 32; // 1L so that the shift is done in long rather than int
k = k|5;
It really should produce the same result though. Are you sure that long is 64 bits on your machine? This is not always the case (although it usually is, I think). If long is actually 32 bits, 2<<31 will result in 0.
How large is R? You can get away with a 32 bit sized variable if R doesn't go past 65535...

A "dynamic bitfield" in C

In this question, assume all integers are unsigned for simplicity.
Suppose I would like to write 2 functions, pack and unpack, which let you pack integers of smaller width into, say, a 64-bit integer. However, the location and width of the integers is given at runtime, so I can't use C bitfields.
Quickest is to explain with an example. For simplicity, I'll illustrate with 8-bit integers:
* *
bit # 8 7 6 5 4 3 2 1
myint 0 1 1 0 0 0 1 1
Suppose I want to "unpack" at location 5, an integer of width 2. These are the two bits marked with an asterisk. The result of that operation should be 0b01. Similarly, If I unpack at location 2, of width 6, I would get 0b100011.
I can write the unpack function easily with a bitshift-left followed by a bitshift right.
But I can't think of a clear way to write an equivalent "pack" function, which will do the opposite.
Say given an integer 0b11, packing it into myint (from above) at location 5 and width 2 would yield
* *
bit # 8 7 6 5 4 3 2 1
myint 0 1 1 1 0 0 1 1
The best I came up with involves a lot of concatenating bit-strings with OR, << and >>. Before I implement and test it, maybe somebody sees a clever quick solution?
Off the top of my head, untested.
int pack(int oldPackedInteger, int bitOffset, int bitCount, int value) {
int mask = (1 << bitCount) -1;
mask <<= bitOffset;
oldPackedInteger &= ~mask;
oldPackedInteger |= value << bitOffset;
return oldPackedInteger;
}
In your example:
int value = 0x63;
value = pack(value, 4, 2, 0x3);
To write the value "3" at an offset of 4 (with two bits available) when 0x63 is the current value.
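For the 64-bit unsigned case described in the question, a matching pack/unpack pair might be sketched as follows. The helper low_mask is hypothetical, offsets are counted from the least-significant bit (as in the answer above, not the question's diagram), and callers are assumed to pass width in 1..64 with offset + width no larger than 64.

```c
#include <stdint.h>

/* Mask with the low `width` bits set; width must be in 1..64.
   The width == 64 case is handled separately because shifting a
   64-bit value by 64 is undefined. */
static uint64_t low_mask(unsigned width)
{
    return width >= 64 ? UINT64_MAX : ((UINT64_C(1) << width) - 1);
}

/* Extract `width` bits starting `offset` bits above the LSB. */
static uint64_t unpack(uint64_t packed, unsigned offset, unsigned width)
{
    return (packed >> offset) & low_mask(width);
}

/* Return `packed` with that field replaced by `value`.
   Bits of `value` that do not fit in the field are discarded. */
static uint64_t pack(uint64_t packed, unsigned offset, unsigned width,
                     uint64_t value)
{
    uint64_t mask = low_mask(width) << offset;
    return (packed & ~mask) | ((value << offset) & mask);
}
```

Masking the shifted value keeps an oversized value from spilling into neighbouring fields, which the int version above does not guard against.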

How do the bit manipulations in this bit-sorting code work?

Jon Bentley, in Column 1 of his book Programming Pearls, introduces a technique for sorting a sequence of non-zero positive integers using bit vectors.
I have taken the program bitsort.c from here and pasted it below:
/* Copyright (C) 1999 Lucent Technologies */
/* From 'Programming Pearls' by Jon Bentley */
/* bitsort.c -- bitmap sort from Column 1
* Sort distinct integers in the range [0..N-1]
*/
#include <stdio.h>
#define BITSPERWORD 32
#define SHIFT 5
#define MASK 0x1F
#define N 10000000
int a[1 + N/BITSPERWORD];
void set(int i)
{
int sh = i>>SHIFT;
a[i>>SHIFT] |= (1<<(i & MASK));
}
void clr(int i) { a[i>>SHIFT] &= ~(1<<(i & MASK)); }
int test(int i){ return a[i>>SHIFT] & (1<<(i & MASK)); }
int main()
{ int i;
for (i = 0; i < N; i++)
clr(i);
/*Replace above 2 lines with below 3 for word-parallel init
int top = 1 + N/BITSPERWORD;
for (i = 0; i < top; i++)
a[i] = 0;
*/
while (scanf("%d", &i) != EOF)
set(i);
for (i = 0; i < N; i++)
if (test(i))
printf("%d\n", i);
return 0;
}
I understand what the functions clr, set and test are doing and explain them below: ( please correct me if I am wrong here ).
clr clears the ith bit
set sets the ith bit
test returns the value at the ith bit
Now, I don't understand how the functions do what they do. I am unable to figure out all the bit manipulation happening in those three functions.
The first 3 constants are inter-related. BITSPERWORD is 32. This you'd want to set based on your compiler+architecture. SHIFT is 5, because 2^5 = 32. Finally, MASK is 0x1F which is 11111 in binary (ie: the bottom 5 bits are all set). Equivalently, MASK = BITSPERWORD - 1.
The bitset is conceptually just an array of bits. This implementation actually uses an array of ints, and assumes 32 bits per int. So whenever we want to set, clear or test (read) a bit we need to figure out two things:
which int (of the array) is it in
which of that int's bits are we talking about
Because we're assuming 32 bits per int, we can just divide by 32 (and truncate) to get the array index we want. Dividing by 32 (BITSPERWORD) is the same as shifting to the right by 5 (SHIFT). So that's what the a[i>>SHIFT] bit is about. You could also write this as a[i/BITSPERWORD] (and in fact, you'd probably get the same or very similar code assuming your compiler has a reasonable optimizer).
Now that we know which element of a we want, we need to figure out which bit. Really, we want the remainder. We could do this with i%BITSPERWORD, but it turns out that i&MASK is equivalent. This is because BITSPERWORD is a power of 2 (2^5 in this case) and MASK is the bottom 5 bits all set.
Basically it is an optimized bucket sort:
reserve a bit array of length n bits.
clear the bit array (first for in main).
read the items one by one (they must all be distinct).
set the i'th bit in the bit array if the read number is i.
iterate the bit array.
if the bit is set then print the position.
Or in other words (for N < 10, sorting the 3 numbers 4, 6, 2):
start with an empty 10 bit array (aka one integer usually)
0000000000
read 4 and set the bit in the array..
0000100000
read 6 and set the bit in the array
0000101000
read 2 and set the bit in the array
0010101000
iterate the array and print every position in which the bits are set to one.
2, 4, 6
sorted.
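The walkthrough above can be sketched as a small in-place sort. This is an illustration rather than Bentley's code: MAX_VALUE and bitmap_sort are hypothetical names, and the input values are assumed to be distinct and smaller than MAX_VALUE.

```c
#include <limits.h>
#include <stddef.h>

#define MAX_VALUE 64   /* hypothetical bound: values must lie in [0, MAX_VALUE) */

/* Sort n distinct values in place by marking them in a bitmap
   and then reading the set bits back in ascending order. */
static void bitmap_sort(unsigned *values, size_t n)
{
    unsigned char bits[(MAX_VALUE + CHAR_BIT - 1) / CHAR_BIT] = {0};
    size_t out = 0;

    /* Phase 1: set the bit corresponding to each input value. */
    for (size_t i = 0; i < n; i++)
        bits[values[i] / CHAR_BIT] |=
            (unsigned char)(1u << (values[i] % CHAR_BIT));

    /* Phase 2: walk the bitmap and write back every position that is set. */
    for (unsigned v = 0; v < MAX_VALUE; v++)
        if (bits[v / CHAR_BIT] & (1u << (v % CHAR_BIT)))
            values[out++] = v;
}
```

Called on the array {4, 6, 2}, it leaves the array as {2, 4, 6}.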
Starting with set():
A right shift of 5 is the same as dividing by 32. It does that to find which int the bit is in.
MASK is 0x1f or 31. ANDing with the address gives the bit index within the int. It's the same as the remainder of dividing the address by 32.
Shifting 1 left by the bit index ("1<<(i & MASK)") results in an integer which has just 1 bit in the given position set.
ORing sets the bit.
The line "int sh = i>>SHIFT;" is a wasted line, because they didn't use sh again beneath it, and instead just repeated "i>>SHIFT"
clr() is basically the same as set, except instead of ORing with 1<<(i & MASK) to set the bit, it ANDs with the inverse to clear the bit. test() ANDs with 1<<(i & MASK) to test the bit.
The bitsort will also remove duplicates from the list, because it will only count up to 1 per integer. A sort that uses integers instead of bits to count more than 1 of each is called a counting sort.
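To keep duplicates, each bit can be replaced by a counter. The following is a rough sketch of that variant, usually known as counting sort; RANGE and counting_sort are hypothetical names and every value is assumed to be smaller than RANGE.

```c
#include <stddef.h>

#define RANGE 16   /* hypothetical bound: values must lie in [0, RANGE) */

/* Sort n values in place, preserving duplicates, by counting
   how many times each value occurs. */
static void counting_sort(unsigned *values, size_t n)
{
    size_t counts[RANGE] = {0};
    size_t out = 0;

    /* Count the occurrences of every value. */
    for (size_t i = 0; i < n; i++)
        counts[values[i]]++;

    /* Write each value back as many times as it was seen. */
    for (unsigned v = 0; v < RANGE; v++)
        for (size_t c = 0; c < counts[v]; c++)
            values[out++] = v;
}
```

The price is memory: one counter per possible value instead of one bit.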
The bit magic is used as a special addressing scheme that works well with row sizes that are powers of two.
Try to understand it this way (note: I prefer bits-per-row to bits-per-word, since we're talking about a bit-matrix here):
// supposing an int of 1 bit would exist...
int1 bits[BITSPERROW * N]; // an array of N x BITSPERROW elements
// set bit at x,y:
int linear_address = y*BITSPERWORD + x;
bits[linear_address] = 1; // or 0
// 0 1 2 3 4 5 6 7 8 9 10 11 ... 31
// . . . . . . . . . . . . .
// . . . . X . . . . . . . . -> x = 4, y = 1 => i = (1*32 + 4)
The statement linear_address = y*BITSPERWORD + x also means that x = linear_address % BITSPERWORD and y = linear_address / BITSPERWORD.
When you optimize this in memory by using 1 word of 32 bits per row, you get the fact that a bit at column x can be set using
int bitrow = 0;
bitrow |= 1 << (x);
Now when we iterate over the bits, we have the linear address, but need to find the corresponding word.
int column = linear_address % BITSPERROW;
int bit_mask = 1 << column; // meaning for the xth column,
// you take 1 and shift that bit x times
int row = linear_address / BITSPERROW;
So to set the i'th bit, you can do this:
bits[ i / BITSPERROW ] |= 1 << ( i % BITSPERROW );
An extra gotcha is that the modulo operator can be replaced by a bitwise AND, and the / operator can be replaced by a shift, if the second operand is a power of two.
a % BITSPERROW == a & ( BITSPERROW - 1 ) == a & MASK
a / BITSPERROW == a >> ( log2(BITSPERROW) ) == a >> SHIFT
This ultimately boils down to the very dense notation, hard to read for anyone not used to bit twiddling:
a[ i >> SHIFT ] |= ( 1 << (i&MASK) );
But I don't see the algorithm working for e.g. 40 bits per word.
Quoting excerpts from Bentley's original article in DDJ, this is what the code does at a high level:
/* phase 1: initialize set to empty */
for (i = 0; i < n; i++)
bit[i] = 0
/* phase 2: insert present elements */
for each i in the input file
bit[i] = 1
/* phase 3: write sorted output */
for (i = 0; i < n; i++)
if bit[i] == 1
write i on the output file
A few doubts:
1. Why is a 32-bit word needed?
2. Can we do this in Java by creating a HashMap with keys from 0000000 to 9999999
and values 0 or 1 based on the presence/absence of the number? What are the implications
of such a program?
