I am writing a program, I have to store the distances between pairs of numbers in a hash table.
I will be given a Range R. Let's say the range is 5.
Now I have to find distances between the following pairs:
1 2
1 3
1 4
1 5
2 3
2 4
2 5
3 4
3 5
4 5
that is, the total number of pairs is (R^2/2 -R). I want to store it in a hash table. All these are unsigned integers. So there are 32 bits. My idea was that, I take an unsigned long (64 bits).
Let's say I need to hash the distance between 1 and 5. Now
long k = 1;
k = k<<31;
k+=5;
Since I have 64 bits, I am storing the first number in the first 31 bits and the second number in the second 31 bits. This guarantees unique keys which can then be used for hashing.
But when I do this:
long k = 2;
k << 31;
k+= 2;
The value of k becomes zero.
I am not able to wrap my head around this shifting concept.
Essentially what I am trying to achieve is that,
An unsigned long has | 32bits | 32 bits |
Store |1st integer|2nd integer|
How can I achieve this to get unique keys for each pair?
I am running the code on a 64 bit AMD Opteron processor. sizeof(ulong) returns 8. So it is 64 bits. Do I need a long long in such a case?
Also I need to know if this will create unique keys? From my understanding , it does seem to create unique keys. But I would like a confirmation.
Assuming you're using C or something that follows vaguely similar rules, your problem is primarily with types.
long k = 2; // This defines `k` a a long
k << 31; // This (sort of) shifts k left, but still as a 32-bit long.
What you almost certainly want/need to do is convert k to a long long before you shift it left, so you're shifting in a 64-bit word.
unsigned long first_number = 2;
unsigned long long both_numbers = (unsigned long long)first_number << 32;
unsigned long second_number = 5;
both_numbers |= second_number;
In this case, if (for example) you print out both_numbers, in hexadecimal, you should get 0x0000000200000005.
The concept makes sense. As Oli has added, you want to shift by 32, not 31 - shifting by 31 will put it in the 31st bit, so if you shifted back to the right to try and get the first number you would end up with a bit missing, and the second number would be potentially huge because you could have put a 1 in the uppermost bit.
But if you want to do bit manipulation, I would do instead:
k = 1 << 32;
k = k|5;
It really should produce the same result though. Are you sure that long is 64 bits on your machine? This is not always the case (although it usually is, I think). If long is actually 32 bits, 2<<31 will result in 0.
How large is R? You can get away with a 32 bit sized variable if R doesn't go past 65535...
Related
is it possible to divide for example an integer in n bits?
For example, since an int variable has a size of 32 bits (4 bytes) is it possible to divide the number in 4 "pieces" of 8 bits and put them in 4 other variables that have a size of 8 bits?
I solved using unsigned char *pointer pointing to the variable that I want to analyze bytes, something like this:
int x = 10;
unsigned char *p = (unsigned char *) &x;
//Since my cpu is little endian I'll print bytes from the end
for(int i = sizeof(int) - 1; i >= 0; i--)
//print hexadecimal bytes
printf("%.2x ", p[i]);
Yes, of course it is. But generally we just use bit operations directly on the bits (called bitops) using bitwise operators defined for all discrete integer types.
For instance, if you need to test the 5th least significant bit you can use x &= 1 << 4 to have x just to have the 5th bit set, and all others set to zero. Then you can use if (x) to test if it has been set; C doesn't use a boolean type but assumes that zero is false and any other value means true. If you store 1 << 4 into a constant then you have created a "(bit) mask" for that particular bit.
If you need a value 0 or 1 then you can use a shift the other way and use x = (x >> 4) & 1. This is all covered in most C books, so I'd implore you to read about these bit operations there.
There are many Q/A's here how to split integers into bytes, see e.g. here. In principle you can store those in a char, but if you may require integer operations then you can also split the int into multiple values. One problem with that is that an int is just defined to at least store values from -32768 to 32767. That means that the number of bytes in an int can be 2 bytes or more.
In principle it is also possible to use bit fields but I'd be hesitant to use those. With an int you will at least know that the bits will be stored in the least significant bits.
I'm hoping that somebody can give me an understanding of why the code works the way it does. I'm trying to wrap my head around things but am lost.
My professor has given us this code snippet which we have to use in order to generate random numbers in C. The snippet in question generates a 64-bit integer, and we have to adapt it to also generate 32-bit, 16-bit, and 8-bit integers. I'm completely lost on where to start, and I'm not necessarily asking for a solution, just on how the original snippet works, so that I can adapt it form there.
long long rand64()
{
int a, b;
long long r;
a = rand();
b = rand();
r = (long long)a;
r = (r << 31) | b;
return r;
}
Questions I have about this code are:
Why is it shifted 31 bits? I thought rand() generated a number between 0-32767 which is 16 bits, so wouldn't that be 48 bits?
Why do we say | (or) b on the second to last line?
I'm making the relatively safe assumption that, in your computer's C implementation, long long is a 64-bit data type.
The key here is that, since long long r is signed, any value with the highest bit set will be negative. Therefore, the code shifts r by 31 bits to avoid setting that bit.
The | is a logical bit operator which combines the two values by setting all of the bits in r which are set in b.
EDIT:
After reading some of the comments, I realized that my answer needs correction. rand() returns a value no more than RAND_MAX which is typically 2^31-1. Therefore, r is a 31-bit integer. If you shifted it 32 bits to the left, you'd guarantee that its 31st bit (0-up counting) would always be zero.
rand() generates a random value [0...RAND_MAX] of questionable repute - but let us set that reputation aside and assume rand() is good enough and it is a
Mersenne number (power-of-2 - 1).
Weakness to OP's code: If RAND_MAX == pow(2,31)-1, a common occurrence, then OP's rand64() only returns values [0...pow(2,62)). #Nate Eldredge
Instead, loop as many times as needed.
To find how many random bits are returned with each call, we need the log2(RAND_MAX + 1). This fortunately is easy with an awesome macro from Is there any way to compute the width of an integer type at compile-time?
#include <stdlib.h>
/* Number of bits in inttype_MAX, or in any (1<<k)-1 where 0 <= k < 2040 */
#define IMAX_BITS(m) ((m)/((m)%255+1) / 255%255*8 + 7-86/((m)%255+12))
#define RAND_MAX_BITWIDTH (IMAX_BITS(RAND_MAX))
Example: rand_ul() returns a random value in the [0...ULONG_MAX] range, be unsigned long 32-bit, 64-bit, etc.
unsigned long rand_ul(void) {
unsigned long r = 0;
for (int i=0; i<IMAX_BITS(ULONG_MAX); i += RAND_MAX_BITWIDTH) {
r <<= RAND_MAX_BITWIDTH;
r |= rand();
}
return r;
}
Disclaimer: I am asking these questions in relation to an assignment. The assignment itself calls for implementing a bitmap and doing some operations with that, but that is not what I am asking about. I just want to understand the concepts so I can try the implementation for myself.
I need help understanding bitmaps/bit arrays and bitwise operations. I understand the basics of binary and how left/right shift work, but I don't know exactly how that use is beneficial.
Basically, I need to implement a bitmap to store the results of a prime sieve (of Eratosthenes.) This is a small part of a larger assignment focused on different IPC methods, but to get to that part I need to get the sieve completed first. I've never had to use bitwise operations nor have I ever learned about bitmaps, so I'm kind of on my own to learn this.
From what I can tell, bitmaps are arrays of a bit of a certain size, right? By that I mean you could have an 8-bit array or a 32-bit array (in my case, I need to find the primes for a 32-bit unsigned int, so I'd need the 32-bit array.) So if this is an array of bits, 32 of them to be specific, then we're basically talking about a string of 32 1s and 0s. How does this translate into a list of primes? I figure that one method would evaluate the binary number and save it to a new array as decimal, so all the decimal primes exist in one array, but that seems like you're using too much data.
Do I have the gist of bitmaps? Or is there something I'm missing? I've tried reading about this around the internet but I can't find a source that makes it clear enough for me...
Suppose you have a list of primes: {3, 5, 7}. You can store these numbers as a character array: char c[] = {3, 5, 7} and this requires 3 bytes.
Instead lets use a single byte such that each set bit indicates that the number is in the set. For example, 01010100. If we can set the byte we want and later test it we can use this to store the same information in a single byte. To set it:
char b = 0;
// want to set `3` so shift 1 twice to the left
b = b | (1 << 2);
// also set `5`
b = b | (1 << 4);
// and 7
b = b | (1 << 6);
And to test these numbers:
// is 3 in the map:
if (b & (1 << 2)) {
// it is in...
You are going to need a lot more than 32 bits.
You want a sieve for up to 2^32 numbers, so you will need a bit for each one of those. Each bit will represent one number, and will be 0 if the number is prime and 1 if it is composite. (You can save one bit by noting that the first bit must be 2 as 1 is neither prime nor composite. It is easier to waste that one bit.)
2^32 = 4,294,967,296
Divide by 8
536,870,912 bytes, or 1/2 GB.
So you will want an array of 2^29 bytes, or 2^27 4-byte words, or whatever you decide is best, and also a method for manipulating the individual bits stored in the chars (ints) in the array.
It sounds like eventually, you are going to have several threads or processes operating on this shared memory.You may need to store it all in a file if you can't allocate all that memory to yourself.
Say you want to find the bit for x. Then let a = x / 8 and b = x - 8 * a. Then the bit is at arr[a] & (1 << b). (Avoid the modulus operator % wherever possible.)
//mark composite
a = x / 8;
b = x - 8 * a;
arr[a] |= 1 << b;
This sounds like a fun assignment!
A bitmap allows you to construct a large predicate function over the range of numbers you're interested in. If you just have a single 8-bit char, you can store Boolean values for each of the eight values. If you have 2 chars, it doubles your range.
So, say you have a bitmap that already has this information stored, your test function could look something like this:
bool num_in_bitmap (int num, char *bitmap, size_t sz) {
if (num/8 >= sz) return 0;
return (bitmap[num/8] >> (num%8)) & 1;
}
I'm completely stuck on how to do this homework problem and looking for a hint or two to keep me going. I'm limited to 20 operations (= doesn't count in this 20).
I'm supposed to fill in a function that looks like this:
/* Supposed to do x%(2^n).
For example: for x = 15 and n = 2, the result would be 3.
Additionally, if positive overflow occurs, the result should be the
maximum positive number, and if negative overflow occurs, the result
should be the most negative number.
*/
int remainder_power_of_2(int x, int n){
int twoToN = 1 << n;
/* Magic...? How can I do this without looping? We are assuming it is a
32 bit machine, and we can't use constants bigger than 8 bits
(0xFF is valid for example).
However, I can make a 32 bit number by ORing together a bunch of stuff.
Valid operations are: << >> + ~ ! | & ^
*/
return theAnswer;
}
I was thinking maybe I could shift the twoToN over left... until I somehow check (without if/else) that it is bigger than x, and then shift back to the right once... then xor it with x... and repeat? But I only have 20 operations!
Hint: In decadic system to do a modulo by power of 10, you just leave the last few digits and null the other. E.g. 12345 % 100 = 00045 = 45. Well, in computer numbers are binary. So you have to null the binary digits (bits). So look at various bit manipulation operators (&, |, ^) to do so.
Since binary is base 2, remainders mod 2^N are exactly represented by the rightmost bits of a value. For example, consider the following 32 bit integer:
00000000001101001101000110010101
This has the two's compliment value of 3461525. The remainder mod 2 is exactly the last bit (1). The remainder mod 4 (2^2) is exactly the last 2 bits (01). The remainder mod 8 (2^3) is exactly the last 3 bits (101). Generally, the remainder mod 2^N is exactly the last N bits.
In short, you need to be able to take your input number, and mask it somehow to get only the last few bits.
A tip: say you're using mod 64. The value of 64 in binary is:
00000000000000000000000001000000
The modulus you're interested in is the last 6 bits. I'll provide you a sequence of operations that can transform that number into a mask (but I'm not going to tell you what they are, you can figure them out yourself :D)
00000000000000000000000001000000 // starting value
11111111111111111111111110111111 // ???
11111111111111111111111111000000 // ???
00000000000000000000000000111111 // the mask you need
Each of those steps equates to exactly one operation that can be performed on an int type. Can you figure them out? Can you see how to simplify my steps? :D
Another hint:
00000000000000000000000001000000 // 64
11111111111111111111111111000000 // -64
Since your divisor is always power of two, it's easy.
uint32_t remainder(uint32_t number, uint32_t power)
{
power = 1 << power;
return (number & (power - 1));
}
Suppose you input number as 5 and divisor as 2
`00000000000000000000000000000101` number
AND
`00000000000000000000000000000001` divisor - 1
=
`00000000000000000000000000000001` remainder (what we expected)
Suppose you input number as 7 and divisor as 4
`00000000000000000000000000000111` number
AND
`00000000000000000000000000000011` divisor - 1
=
`00000000000000000000000000000011` remainder (what we expected)
This only works as long as divisor is a power of two (Except for divisor = 1), so use it carefully.
I was recently asked in an interview how to set the 513th bit of a char[1024] in C, but I'm unsure how to approach the problem. I saw How do you set, clear, and toggle a single bit?, but how do I choose the bit from such a large array?
int bitToSet = 513;
inArray[bitToSet / 8] |= (1 << (bitToSet % 8));
...making certain assumptions about character size and desired endianness.
EDIT: Okay, fine. You can replace 8 with CHAR_BIT if you want.
#include <limits.h>
int charContaining513thBit = 513 / CHAR_BIT;
int offsetOf513thBitInChar = 513 - charContaining513thBit*CHAR_BIT;
int bit513 = array[charContaining513thBit] >> offsetOf513thBitInChar & 1;
You have to know the width of characters (in bits) on your machine. For pretty much everyone, that's 8. You can use the constant CHAR_BIT from limits.h in a C program. You can then do some fairly simple math to find the offset of the bit (depending on how you count them).
Numbering bits from the left, with the 2⁷ bit in a[0] being bit 0, the 2⁰ bit being bit 7, and the 2⁷ bit in a[1] being bit 8, this gives:
offset = 513 / CHAR_BIT; /* using integer (truncating) math, of course */
bit = 513 % CHAR_BIT;
a[offset] |= (0x80>>bit)
There are many sane ways to number bits, here are two:
a[0] a[1]
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 This is the above
7 6 5 4 3 2 1 0 15 14 13 12 11 10 9 8 This is |= (1<<bit)
You could also number from the other end of the array (treating it as one very large big-endian number).
Small optimization:
The / and % operators are rather slow, even on a lot of modern cpus, with modulus being slightly slower. I would replace them with the equivalent operations using bit shifting (and subtraction), which only works nicely when the second operand is a power of two, obviously.
x / 8 becomes x >> 3
x % 8 becomes x-((x>>3)<<3)
for this second operation, just reuse the result from the initial division.
Depending on the desired order (left to right versus right to left), it might change. But the general idea assuming 8 bits per byte would be to choose the byte as. This is expanded into lots of lines of code to hopefully show more clearly the intended steps (or perhaps it just obfuscates the intention):
int bitNum = 513;
int bytePos = bitNum / 8;
Then the bit position would be computed as:
int bitInByte = bitNum % 8;
Then set the bit (assuming the goal is to set it to 1 as opposed to clear or toggle it):
charArray[bytePos] |= ( 1 << bitInByte );
When you say 513th are you using index 0 or 1 for the 1st bit? If it's the former your post refers to the bit at index 512. I think the question is valid since everywhere else in C the first index is always 0.
BTW
static char chr[1024];
...
chr[512>>3]=1<<(512&0x7);