How many 1-10 numbers can be stored in one byte?

I have many 1-10 numbers. In C++, is it possible to store more than two in a single byte?
I believe it's possible to store at least 2: a char is from 0-255. This means we can store a number from 0-9 and one from 10-100.
a) Is it possible to store more than 2, with some kind of bit manipulation?
b) What's the fastest way to do this?

There are 10 possible numbers from 1 to 10 (obvious, I know, but it must be said). A choice from 10 possible values requires log(10) / log(2) ~= 3.32 bits to encode. That means that the most you can store in 8 bits is two such choices.
But if you have a large number of them, you can store more than two per byte in aggregate. For example, in 32 bits you can store 9 numbers from 1 to 10 (requiring 29.9 bits), which is 2.25 per byte.
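A minimal sketch of that aggregate packing, assuming each value is first shifted from 1..10 down to 0..9 and the nine results are then treated as the base-10 digits of a single 32-bit number (this works because 10^9 < 2^32). The names pack9 and unpack9 are hypothetical, chosen only for illustration.

```c
#include <stdint.h>

/* Pack nine values, each assumed to be in 1..10, into one 32-bit word.
   Each value becomes a digit 0..9; values[0] ends up as the lowest digit. */
uint32_t pack9(const uint8_t values[9]) {
    uint32_t packed = 0;
    for (int i = 8; i >= 0; i--) {
        packed = packed * 10u + (uint32_t)(values[i] - 1);
    }
    return packed;
}

/* Recover the nine values by peeling off base-10 digits, lowest first. */
void unpack9(uint32_t packed, uint8_t values[9]) {
    for (int i = 0; i < 9; i++) {
        values[i] = (uint8_t)(packed % 10u + 1u);
        packed /= 10u;
    }
}
```

The trade-off is speed: every access needs divisions and modulo operations, so this layout is more attractive for storage or transmission than for hot loops.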

I think you're asking whether you can store 3 decimal digits, for example "7 and 5 and 8".
If so then the answer is, no: because to store 3 independent numbers you need to store any of 1000 values. One byte can store only 256 values.
The most compact/compressed storage format for your numbers is:
Subtract 1 from each number to convert it from "1 to 10" to decimal digit "0 to 9"
Combine the decimal digits and store them as an ordinary (unsigned) binary number
For example, "8 and 6 and 9" -> "7 and 5 and 8" -> "758" -> 0x2F6 -> 1011110110

First, if memory is not an issue, avoid packing altogether. Use a signed or unsigned char to store each single value.
If you want to save memory (for example, to transmit a data array over a network or to reduce file size), you can manipulate the individual bits of a byte using bit operators. For example, take values from 0 to 15, each of which fits into 4 bits. Then:
// values from 0 to 15
unsigned char v1 = 1, v2 = 15;
// pack two values into one byte
unsigned char elem = (v1 << 4) + v2; // shift v1 to left and add v2
// unpack values
v1 = elem >> 4; // shift to right
v2 = elem & 0x0F; // clear higher 4 bits
// of course, you are going to use an array of elems

Related

Long multiplication of a pair of uint64 values [duplicate]

How can I multiply a pair of uint64 values safely in order to get the result as a pair of LSB and MSB of the same type?
typedef struct uint128 {
    uint64 lsb;
    uint64 msb;
} uint128;
uint128 mul(uint64 x, uint64 y)
{
    uint128 z = {0, 0};
    z.lsb = x * y;
    if (z.lsb / x != y)
    {
        z.msb = ?
    }
    return z;
}
Am I computing the LSB correctly?
How can I compute the MSB correctly?
As said in the comments, the best solution would probably be to use a library that does this for you. But I will explain how you can do it without a library, because I think you asked in order to learn something. It is probably not a very efficient way, but it works.
When we were in school and had to multiply two numbers without a calculator, we multiplied two digits at a time, got a result of one or two digits, wrote the results down, and added them all up at the end. We split the multiplication up so that we only had to compute one single-digit multiplication at a time. A similar thing is possible with larger numbers on a CPU. There we do not use decimal digits; instead we use half of the register size as a digit. That way we can multiply two digits and get a two-digit result that still fits in one register. In decimal, 13*42 can be calculated as:
3* 2 = 0 6
10* 2 = 2 0
3*40 = 1 2 0
10*40 = 0 4 0 0
--------
0 5 4 6
A similar thing can be done with integers. To keep it simple, I multiply two 8-bit numbers into a 16-bit result on an 8-bit CPU, so I only multiply 4 bits by 4 bits at a time. Let's multiply 0x73 by 0x4F.
0x03*0x0F = 0x002D
0x70*0x0F = 0x0690
0x03*0x40 = 0x00C0
0x70*0x40 = 0x1C00
-------
0x237D
You basically create an array with 4 elements, in your case each of type uint32_t. Store or add the result of each single multiplication in the right element(s) of the array; if the result of a single multiplication is too large for a single element, store the higher bits in the next higher element. If an addition overflows, carry 1 into the next element. In the end you combine pairs of elements of the array, in your case into two uint64_t values.
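One possible way to write this down for 64-bit operands is sketched below. Instead of a literal 4-element array it keeps the four 32x32 partial products in separate variables and gathers the carries in a middle column, which amounts to the same bookkeeping. The type name u128 and the function name mul_64x64 are hypothetical, and uint64_t from <stdint.h> is assumed in place of the question's uint64.

```c
#include <stdint.h>

/* Hypothetical result type: low and high 64 bits of a 128-bit product. */
typedef struct {
    uint64_t lsb;
    uint64_t msb;
} u128;

u128 mul_64x64(uint64_t x, uint64_t y) {
    /* Split each operand into two 32-bit "digits". */
    uint64_t x_lo = x & 0xFFFFFFFFu, x_hi = x >> 32;
    uint64_t y_lo = y & 0xFFFFFFFFu, y_hi = y >> 32;

    /* Four partial products; each fits in 64 bits. */
    uint64_t ll = x_lo * y_lo;
    uint64_t lh = x_lo * y_hi;
    uint64_t hl = x_hi * y_lo;
    uint64_t hh = x_hi * y_hi;

    /* Middle column (bits 32..63) plus its carries; the sum stays below 2^34. */
    uint64_t mid = (ll >> 32) + (lh & 0xFFFFFFFFu) + (hl & 0xFFFFFFFFu);

    u128 z;
    z.lsb = (mid << 32) | (ll & 0xFFFFFFFFu);
    z.msb = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
    return z;
}
```

As a sanity check, multiplying 0xFFFFFFFFFFFFFFFF by itself should give lsb = 1 and msb = 0xFFFFFFFFFFFFFFFE, since (2^64 - 1)^2 = 2^128 - 2^65 + 1.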

How to get the bits of a hexadecimal number AES implementation

I have an sbox for an AES-type implementation like this:
int box[4][4] = {{0xA,0x3,0xC,0xB},
{0xE,0xF,0x2,0xE},
{0x6,0x4,0x0,0xF},
{0xC,0x4,0xF,0x3}};
I want to get the first 2 bits and the last 2 bits of a hexadecimal digit and then replace the digit with the entry at that position in the sbox. For example:
int x = 0xA; // A has the binary representation 1010
Here the row number would be the first 2 bits of A, "10", and the column number would be the last 2 bits of A, "10", so x would be looked up in the sbox and replaced with the entry at row 2, column 2 (0x0 in the table above).
How can I get the bits of A and use them to look up my sbox?
x = box[x & 3][(x >> 2) & 3]; will work assuming when you said "the row number would become the first 2 bits" you meant the two lower order bits of the four [i.e., the right two]; otherwise (when you said "first 2" you meant "left 2"), x = box[(x >> 2) & 3][x & 3]; is what you need.
In general, however, 2-dimensional array accesses can be slower than 1-dimensional ones, so I would use a 1D array instead and not isolate the two pairs of bits as separate indexes. Instead, use the low 4 bits of x as a single 1D index. Then there is no extra shifting and masking, and no multiplication and addition for the 2D offset address calculation.
If "first 2 bits" meant "rightmost 2 bits"...
int box[16] = {0xA,0xE,0x6,0xC, 0x3,0xF,0x4,0x4, 0xC,0x2,0x0,0xF, 0xB,0xE,0xF,0x3};
If "first 2 bits" meant "leftmost 2 bits"...
int box[16] = {0xA,0x3,0xC,0xB, 0xE,0xF,0x2,0xE, 0x6,0x4,0x0,0xF, 0xC,0x4,0xF,0x3};
Then, to use the box...
x = box[x & 0xF]; // use the bottom 4 bits as single index
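To check that the flattened table really matches the 2D lookup, here is a small sketch for the "leftmost 2 bits select the row" reading. The names box2d, box1d, lookup_2d and lookup_1d are hypothetical; the table values are the ones from the question.

```c
/* The question's table as a 4x4 array. */
static const int box2d[4][4] = {
    {0xA, 0x3, 0xC, 0xB},
    {0xE, 0xF, 0x2, 0xE},
    {0x6, 0x4, 0x0, 0xF},
    {0xC, 0x4, 0xF, 0x3}
};

/* The same entries flattened row by row. */
static const int box1d[16] = {
    0xA, 0x3, 0xC, 0xB,  0xE, 0xF, 0x2, 0xE,
    0x6, 0x4, 0x0, 0xF,  0xC, 0x4, 0xF, 0x3
};

/* Row from the upper two bits of the nibble, column from the lower two. */
int lookup_2d(int x) {
    return box2d[(x >> 2) & 3][x & 3];
}

/* Same result using the whole nibble as a single index. */
int lookup_1d(int x) {
    return box1d[x & 0xF];
}
```

With this table, both functions map 0xA (binary 1010, row 2, column 2) to 0x0 and 0x6 (binary 0110, row 1, column 2) to 0x2.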
Hope that helps :-)

c: bit reversal logic

I was looking at the below bit reversal code and just wondering how does one come up with these kind of things. (source : http://www.cl.cam.ac.uk/~am21/hakmemc.html)
/* reverse 8 bits (Schroeppel) */
unsigned reverse_8bits(unsigned41 a) {
return ((a * 0x000202020202) /* 5 copies in 40 bits */
& 0x010884422010) /* where bits coincide with reverse repeated base 2^10 */
/* PDP-10: 041(6 bits):020420420020(35 bits) */
% 1023; /* casting out 2^10 - 1's */
}
Can someone explain what the comment "where bits coincide with reverse repeated base 2^10" means?
Also, how does "% 1023" pull out the relevant bits? Is there a general idea behind this?
It is a very broad question you are asking.
Here is an explanation of what % 1023 might be about: you know how computing n % 9 is like summing the digits of the base-10 representation of n? For instance, 52 % 9 = 7 = 5 + 2.
The code in your question is doing the same thing with 1023 = 1024 - 1 instead of 9 = 10 - 1. It is using the operation % 1023 to gather multiple results that have been computed “independently” as 10-bit slices of a large number.
And this is the beginning of a clue as to how the constants 0x000202020202 and 0x010884422010 are chosen: they make wide integer operations operate as independent simpler operations on 10-bit slices of a large number.
Expanding on Pascal Cuoq's idea, here is an explanation.
The general idea is that, in any base, if a number is divided by (base-1), the remainder equals the sum of all the digits of the number (reduced again modulo base-1 if that sum is itself base-1 or larger).
For example, 34 when divided by 9 leaves 7 as remainder. This is because 34 can be written as 3 * 10 + 4
i.e. 34 = 3 * 10 + 4
= 3 * (9 +1) + 4
= 3 * 9 + (3 +4)
Now, 9 divides 3 * 9, leaving the remainder (3 + 4). This process can be extended to any base 'b', since (b^n - 1) is always divisible by (b-1).
Now, coming to the problem: if a number is represented in base 1024 and is divided by 1023, the remainder will be the sum of its digits.
To convert a binary number to base 1024, we can group its bits in tens, starting from the right, and treat each group as a single digit.
For example, to convert the binary number 0x010884422010 (0b10000100010000100010000100010000000010000) to base 1024, we can group it into 10-bit digits as follows:
(1) (0000100010) (0001000100) (0010001000) (0000010000) =
(0b0000000001)*1024^4 + (0b0000100010)*1024^3 + (0b0001000100)*1024^2 + (0b0010001000)*1024^1 + (0b0000010000)*1024^0
So, when this number is divided by 1023, the remainder will be the sum of
0b0000000001
+ 0b0000100010
+ 0b0001000100
+ 0b0010001000
+ 0b0000010000
--------------------
0b0011111111
If you observe the above digits closely, the '1' bits in each digit occupy complementary positions. So, when the digits are added together, the sum collects all 8 bits of the original number.
So, in the above code, "a * 0x000202020202" creates 5 copies of the byte "a". When the result is ANDed with 0x010884422010, we selectively choose 8 bits from the 5 copies of "a". When "% 1023" is applied, we pull all 8 bits together.
So, how does it actually reverse the bits? That is a bit clever. The idea is that the "1" bit in the digit 0b0000000001 is actually aligned with the MSB of the original byte. So, when you AND, you are actually ANDing the MSB of the original byte with the LSB of the magic number's digit. Similarly, the digit 0b0000100010 is aligned with the second and sixth bits from the MSB, and so on.
So, when you add all the digits of the masked number, the result is the reverse of the original byte.
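For experimenting with the trick on a modern machine, a sketch along these lines should work, assuming a 64-bit unsigned type stands in for the PDP-10 era unsigned41. The function name reverse_byte is hypothetical.

```c
#include <stdint.h>

/* Reverse the bits of one byte using the multiply / mask / modulo trick. */
uint8_t reverse_byte(uint8_t a) {
    /* Five copies of a, shifted left by 1, 9, 17, 25 and 33 bits. */
    uint64_t copies = (uint64_t)a * 0x0202020202ULL;
    /* Keep exactly one bit of a at each selected position. */
    uint64_t picked = copies & 0x010884422010ULL;
    /* Adding up the base-1024 digits lines the 8 bits up in reverse order. */
    return (uint8_t)(picked % 1023u);
}
```

For instance, reverse_byte(0x01) yields 0x80: the only surviving bit is bit 17, and 2^17 = 2^10 * 2^7 leaves remainder 2^7 when divided by 1023.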

Need help understanding bitmaps, bitwise operations, and C

Disclaimer: I am asking these questions in relation to an assignment. The assignment itself calls for implementing a bitmap and doing some operations with that, but that is not what I am asking about. I just want to understand the concepts so I can try the implementation for myself.
I need help understanding bitmaps/bit arrays and bitwise operations. I understand the basics of binary and how left/right shift work, but I don't know exactly how that use is beneficial.
Basically, I need to implement a bitmap to store the results of a prime sieve (of Eratosthenes.) This is a small part of a larger assignment focused on different IPC methods, but to get to that part I need to get the sieve completed first. I've never had to use bitwise operations nor have I ever learned about bitmaps, so I'm kind of on my own to learn this.
From what I can tell, bitmaps are arrays of a bit of a certain size, right? By that I mean you could have an 8-bit array or a 32-bit array (in my case, I need to find the primes for a 32-bit unsigned int, so I'd need the 32-bit array.) So if this is an array of bits, 32 of them to be specific, then we're basically talking about a string of 32 1s and 0s. How does this translate into a list of primes? I figure that one method would evaluate the binary number and save it to a new array as decimal, so all the decimal primes exist in one array, but that seems like you're using too much data.
Do I have the gist of bitmaps? Or is there something I'm missing? I've tried reading about this around the internet but I can't find a source that makes it clear enough for me...
Suppose you have a list of primes: {3, 5, 7}. You can store these numbers as a character array: char c[] = {3, 5, 7} and this requires 3 bytes.
Instead, let's use a single byte such that each set bit indicates that a number is in the set, with bit n-1 (counting from the right, starting at 0) standing for the number n. For example, 01010100. If we can set the bit we want and later test it, we can use this to store the same information in a single byte. To set it:
char b = 0;
// want to set `3` so shift 1 twice to the left
b = b | (1 << 2);
// also set `5`
b = b | (1 << 4);
// and 7
b = b | (1 << 6);
And to test these numbers:
// is 3 in the map:
if (b & (1 << 2)) {
// it is in...
}
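The set and test steps above can be wrapped in small helpers. This is only a sketch for the one-byte case, assuming the same convention that the number n (1 to 8) lives in bit n-1; the names set_add, set_remove and set_contains are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Return the set with number n (assumed 1..8) added: turn bit n-1 on. */
uint8_t set_add(uint8_t set, unsigned n) {
    return (uint8_t)(set | (1u << (n - 1)));
}

/* Return the set with number n removed: turn bit n-1 off. */
uint8_t set_remove(uint8_t set, unsigned n) {
    return (uint8_t)(set & ~(1u << (n - 1)));
}

/* True when bit n-1 is on, i.e. when n is in the set. */
bool set_contains(uint8_t set, unsigned n) {
    return ((set >> (n - 1)) & 1u) != 0;
}
```

Adding 3, 5 and 7 to an empty set produces 0x54, which is the 01010100 pattern from the example.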
You are going to need a lot more than 32 bits.
You want a sieve for up to 2^32 numbers, so you will need a bit for each one of those. Each bit will represent one number, and will be 0 if the number is prime and 1 if it is composite. (You can save one bit by letting the first bit represent 2, since 1 is neither prime nor composite, but it is easier to waste that one bit.)
2^32 = 4,294,967,296
Divide by 8
536,870,912 bytes, or 1/2 GB.
So you will want an array of 2^29 bytes, or 2^27 4-byte words, or whatever you decide is best, and also a method for manipulating the individual bits stored in the chars (ints) in the array.
It sounds like eventually you are going to have several threads or processes operating on this shared memory. You may need to store it all in a file if you can't allocate all that memory to yourself.
Say you want to find the bit for x. Then let a = x / 8 and b = x - 8 * a. Then the bit is at arr[a] & (1 << b). (Avoid the modulus operator % wherever possible.)
//mark composite
a = x / 8;
b = x - 8 * a;
arr[a] |= 1 << b;
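Putting the pieces together, a small single-threaded sieve over such a bitmap might look like the sketch below. It assumes the convention above (bit set means composite), uses x >> 3 and x & 7 as the shift-and-mask equivalents of x / 8 and x - 8 * (x / 8), and allocates with calloc so all bits start at 0. The names sieve_bitmap and is_prime are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Build a bitmap for 0..limit where a set bit means "not prime".
   Returns NULL if allocation fails; the caller must free() the result. */
uint8_t *sieve_bitmap(uint32_t limit) {
    size_t nbytes = (size_t)limit / 8 + 1;
    uint8_t *arr = calloc(nbytes, 1);
    if (arr == NULL) return NULL;

    arr[0] |= 0x03; /* mark 0 and 1 as non-prime */
    for (uint64_t p = 2; p * p <= limit; p++) {
        if (arr[p >> 3] & (1u << (p & 7))) continue; /* p already marked */
        for (uint64_t m = p * p; m <= limit; m += p) {
            arr[m >> 3] |= (uint8_t)(1u << (m & 7)); /* mark composite */
        }
    }
    return arr;
}

/* Nonzero when x (assumed <= limit) was left unmarked by the sieve. */
int is_prime(const uint8_t *arr, uint32_t x) {
    return !(arr[x >> 3] & (1u << (x & 7)));
}
```

The loop counters are 64-bit so that p * p and m += p cannot wrap around when limit is close to 2^32.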
This sounds like a fun assignment!
A bitmap allows you to construct a large predicate function over the range of numbers you're interested in. If you just have a single 8-bit char, you can store Boolean values for each of the eight values. If you have 2 chars, it doubles your range.
So, say you have a bitmap that already has this information stored, your test function could look something like this:
bool num_in_bitmap (int num, char *bitmap, size_t sz) {
    // reject negative numbers and numbers beyond the end of the bitmap
    if (num < 0 || (size_t)num / 8 >= sz) return false;
    return (bitmap[num / 8] >> (num % 8)) & 1;
}

Combining two integers with bit-shifting

I am writing a program, I have to store the distances between pairs of numbers in a hash table.
I will be given a Range R. Let's say the range is 5.
Now I have to find distances between the following pairs:
1 2
1 3
1 4
1 5
2 3
2 4
2 5
3 4
3 5
4 5
that is, the total number of pairs is R(R-1)/2, i.e. (R^2 - R)/2. I want to store them in a hash table. All of these are unsigned integers, so they have 32 bits each. My idea was to take an unsigned long (64 bits).
Let's say I need to hash the distance between 1 and 5. Now
long k = 1;
k = k<<31;
k+=5;
Since I have 64 bits, I am storing the first number in the first 31 bits and the second number in the second 31 bits. This guarantees unique keys which can then be used for hashing.
But when I do this:
long k = 2;
k << 31;
k+= 2;
The value of k becomes zero.
I am not able to wrap my head around this shifting concept.
Essentially what I am trying to achieve is that,
An unsigned long has | 32bits | 32 bits |
Store |1st integer|2nd integer|
How can I achieve this to get unique keys for each pair?
I am running the code on a 64 bit AMD Opteron processor. sizeof(ulong) returns 8. So it is 64 bits. Do I need a long long in such a case?
Also I need to know if this will create unique keys? From my understanding , it does seem to create unique keys. But I would like a confirmation.
Assuming you're using C or something that follows vaguely similar rules, your problem is primarily with types.
long k = 2; // This defines `k` as a long
k << 31; // This (sort of) shifts k left, but still as a 32-bit long.
What you almost certainly want/need to do is convert k to a long long before you shift it left, so you're shifting in a 64-bit word.
unsigned long first_number = 2;
unsigned long long both_numbers = (unsigned long long)first_number << 32;
unsigned long second_number = 5;
both_numbers |= second_number;
In this case, if (for example) you print out both_numbers, in hexadecimal, you should get 0x0000000200000005.
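The same idea can be written with fixed-width types so that it does not depend on how wide long happens to be. A minimal sketch, assuming both numbers fit in 32 bits; the names make_key, key_first and key_second are hypothetical.

```c
#include <stdint.h>

/* Combine two 32-bit numbers into one 64-bit key: first in the high half. */
uint64_t make_key(uint32_t first, uint32_t second) {
    /* Widen before shifting so the shift happens in 64 bits. */
    return ((uint64_t)first << 32) | second;
}

/* Recover the number stored in the high 32 bits. */
uint32_t key_first(uint64_t key) {
    return (uint32_t)(key >> 32);
}

/* Recover the number stored in the low 32 bits. */
uint32_t key_second(uint64_t key) {
    return (uint32_t)(key & 0xFFFFFFFFu);
}
```

Because both halves can be recovered exactly, two different ordered pairs can never produce the same key, which is the uniqueness the question asks about.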
The concept makes sense. As Oli has added, you want to shift by 32, not 31 - shifting by 31 will put it in the 31st bit, so if you shifted back to the right to try and get the first number you would end up with a bit missing, and the second number would be potentially huge because you could have put a 1 in the uppermost bit.
But if you want to do bit manipulation, I would do instead:
k = 1ULL << 32; // the ULL suffix makes the shift happen in 64 bits; a plain 1 is an int
k = k | 5;
It really should produce the same result though. Are you sure that long is 64 bits on your machine? This is not always the case: it is 64 bits on most 64-bit Unix-like systems, but only 32 bits on 64-bit Windows. If long is actually 32 bits, 2<<31 overflows and will typically result in 0.
How large is R? You can get away with a 32 bit sized variable if R doesn't go past 65535...
