Long multiplication of a pair of uint64 values [duplicate] - c

How can I multiply a pair of uint64 values safely in order to get the result as a pair of LSB and MSB of the same type?
typedef struct uint128 {
uint64 lsb;
uint64 msb;
};
uint128 mul(uint64 x, uint64 y)
{
uint128 z = {0, 0};
z.lsb = x * y;
if (z.lsb / x != y)
{
z.msb = ?
}
return z;
}
Am I computing the LSB correctly?
How can I compute the MSB correctly?

As said in the comments, the best solution would probably be to use a library that does this for you. But I will explain how you can do it without a library, because I think you asked in order to learn something. It is probably not a very efficient way, but it works.
When we were in school and had to multiply two numbers without a calculator, we multiplied two digits at a time, got a result of one or two digits, wrote it down, and in the end added everything up. We split the multiplication up so that we only ever had to calculate a single-digit multiplication at once. A similar thing is possible with larger numbers on a CPU. But there we do not use decimal digits; we use half of the register size as a digit. That way, we can multiply two digits and get a two-digit result that fits in one register. In decimal, 13*42 can be calculated as:
3* 2 = 0 6
10* 2 = 2 0
3*40 = 1 2 0
10*40 = 0 4 0 0
--------
0 5 4 6
A similar thing can be done with integers. To keep it simple, I multiply two 8-bit numbers into a 16-bit number on an 8-bit CPU; for that I only multiply 4 bits by 4 bits at a time. Let's multiply 0x73 by 0x4F.
0x03*0x0F = 0x002D
0x70*0x0F = 0x0690
0x03*0x40 = 0x00C0
0x70*0x40 = 0x1C00
-------
0x237D
You basically create an array with 4 elements; in your case each element has the type uint32_t. Store or add the result of each single multiplication in the right element(s) of the array. If the result of a single multiplication is too large for a single element, store the higher bits in the next higher element. If an addition overflows, carry 1 to the next element. In the end you can combine pairs of elements of the array, in your case into two uint64_t values.
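A minimal sketch of that scheme for the 64-bit case, assuming 32-bit "digits" and a hypothetical result type u128_parts (the names are mine, not from the question). Instead of a 4-element array it keeps the four partial products in separate variables and sums the middle column once, which avoids explicit carry loops:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result type: low and high 64-bit halves of the 128-bit product. */
typedef struct {
    uint64_t lsb;
    uint64_t msb;
} u128_parts;

/* Schoolbook multiplication with 32-bit digits. */
static u128_parts mul_64x64(uint64_t x, uint64_t y)
{
    const uint64_t MASK32 = 0xFFFFFFFFu;
    uint64_t x_lo = x & MASK32, x_hi = x >> 32;
    uint64_t y_lo = y & MASK32, y_hi = y >> 32;

    /* Four partial products; each one fits in 64 bits. */
    uint64_t p00 = x_lo * y_lo;   /* weight 2^0  */
    uint64_t p01 = x_lo * y_hi;   /* weight 2^32 */
    uint64_t p10 = x_hi * y_lo;   /* weight 2^32 */
    uint64_t p11 = x_hi * y_hi;   /* weight 2^64 */

    /* Sum of the 2^32 column: at most 3 * (2^32 - 1), so it cannot overflow. */
    uint64_t mid = (p00 >> 32) + (p01 & MASK32) + (p10 & MASK32);

    u128_parts z;
    z.lsb = (mid << 32) | (p00 & MASK32);
    z.msb = p11 + (p01 >> 32) + (p10 >> 32) + (mid >> 32);
    return z;
}

/* Small self-check: the 0x73 * 0x4F operands and the largest possible inputs. */
static void mul_64x64_self_check(void)
{
    u128_parts z = mul_64x64(0x73, 0x4F);
    assert(z.msb == 0 && z.lsb == 0x237D);
    z = mul_64x64(UINT64_MAX, UINT64_MAX);
    assert(z.msb == 0xFFFFFFFFFFFFFFFEULL && z.lsb == 1);
}
```

Note that the low half is simply x * y with wraparound, so the z.lsb = x * y line in the question is already correct; only the high half needs the partial products.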

Related

How many 1-10 number can be stored in one byte?

I have many 1-10 numbers. In C++, is it possible to store more than two in a single byte?
I believe it's possible to store at least 2: a char is from 0-255. This means we can store a number from 0-9 and one from 10-100.
a) Is it possible to store more than 2, with some kind of bit manipulation?
b) What's the fastest way to do this?
There are 10 possible numbers from 1 to 10 (obvious, I know, but it must be said). A choice from 10 possible values requires log(10) / log(2) ~= 3.32 bits to encode. That means that the most you can store in 8 bits is two such choices.
But if you have a large number of them, you can store more than two per byte in aggregate. For example, in 32 bits you can store 9 numbers from 1 to 10 (requiring 29.9 bits), which is 2.25 per byte.
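A rough sketch of the nine-values-per-32-bit idea, assuming the values are treated as base-10 digits after subtracting 1. The names pack9 and unpack9 are made up for illustration:

```c
#include <assert.h>
#include <stdint.h>

#define VALUES_PER_WORD 9  /* 10^9 < 2^32, so nine base-10 digits fit */

/* Pack nine values in the range 1..10 into one 32-bit word. */
static uint32_t pack9(const uint8_t values[VALUES_PER_WORD])
{
    uint32_t word = 0;
    /* Walk backwards so that values[0] ends up as the least significant digit. */
    for (int i = VALUES_PER_WORD - 1; i >= 0; --i) {
        assert(values[i] >= 1 && values[i] <= 10);
        word = word * 10u + (uint32_t)(values[i] - 1);  /* map 1..10 to 0..9 */
    }
    return word;
}

/* Inverse of pack9: peel off one base-10 digit per value. */
static void unpack9(uint32_t word, uint8_t values[VALUES_PER_WORD])
{
    for (int i = 0; i < VALUES_PER_WORD; ++i) {
        values[i] = (uint8_t)(word % 10u + 1u);
        word /= 10u;
    }
}
```

The price of the denser encoding is a division per value when unpacking, so it is a space-for-time trade compared with plain bit fields.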
I think you're asking whether you can store 3 decimal digits, for example "7 and 5 and 8".
If so then the answer is, no: because to store 3 independent numbers you need to store any of 1000 values. One byte can store only 256 values.
The most compact/compressed storage format for your numbers is:
Subtract 1 from each number to convert it from "1 to 10" to decimal digit "0 to 9"
Combine the decimal digits and store them as an ordinary (unsigned) binary number
For example, "8 and 6 and 9" -> "7 and 5 and 8" -> "758" -> 0x2F6 -> 1011110110
First, if memory is not an issue, avoid packing altogether. Use a signed or unsigned char to store a single value.
If you want to save memory (like transmission of data array over network or save file size), you can manipulate single bits of a byte using bit operators. For example, let's get values from 0 to 15 - it fits into 4 bits. Then
// values from 0 to 15
unsigned char v1 = 1, v2 = 15;
// pack two values into one byte
unsigned char elem = (v1 << 4) + v2; // shift v1 to left and add v2
// unpack values
v1 = elem >> 4; // shift to right
v2 = elem & 0x0F; // clear higher 4 bits
// of course, you are going to use an array of elems
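For the array case, a small sketch with hypothetical helpers nibble_set and nibble_get. It assumes the same layout as above: the first value of each pair goes in the high 4 bits, the second in the low 4 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store value (0..15) at logical index i; two values share each byte. */
static void nibble_set(uint8_t *buf, size_t i, uint8_t value)
{
    assert(value <= 15);
    unsigned shift = (i & 1u) ? 0u : 4u;   /* even index: high nibble */
    unsigned cleared = buf[i / 2] & ~(0x0Fu << shift);
    buf[i / 2] = (uint8_t)(cleared | ((unsigned)value << shift));
}

/* Read back the value stored at logical index i. */
static uint8_t nibble_get(const uint8_t *buf, size_t i)
{
    unsigned shift = (i & 1u) ? 0u : 4u;
    return (uint8_t)((buf[i / 2] >> shift) & 0x0Fu);
}
```

A buffer for n values needs (n + 1) / 2 bytes.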

c: bit reversal logic

I was looking at the below bit reversal code and just wondering how does one come up with these kind of things. (source : http://www.cl.cam.ac.uk/~am21/hakmemc.html)
/* reverse 8 bits (Schroeppel) */
unsigned reverse_8bits(unsigned41 a) {
return ((a * 0x000202020202) /* 5 copies in 40 bits */
& 0x010884422010) /* where bits coincide with reverse repeated base 2^10 */
/* PDP-10: 041(6 bits):020420420020(35 bits) */
% 1023; /* casting out 2^10 - 1's */
}
Can someone explain what the comment "where bits coincide with reverse repeated base 2^10" means?
Also, how does "% 1023" pull out the relevant bits? Is there a general idea behind this?
It is a very broad question you are asking.
Here is an explanation of what % 1023 might be about: you know how computing n % 9 is like summing the digits of the base-10 representation of n? For instance, 52 % 9 = 7 = 5 + 2.
The code in your question is doing the same thing with 1023 = 1024 - 1 instead of 9 = 10 - 1. It is using the operation % 1023 to gather multiple results that have been computed “independently” as 10-bit slices of a large number.
And this is the beginning of a clue as to how the constants 0x000202020202 and 0x010884422010 are chosen: they make wide integer operations operate as independent simpler operations on 10-bit slices of a large number.
Expanding on Pascal Cuoq's idea, here is an explanation.
The general idea is that, in any base, a number and the sum of its digits leave the same remainder when divided by (base-1). When the digit sum is smaller than (base-1), the remainder is exactly the digit sum.
For example, 34 when divided by 9 leaves 7 as remainder. This is because 34 can be written as 3 * 10 + 4
i.e. 34 = 3 * 10 + 4
= 3 * (9 +1) + 4
= 3 * 9 + (3 +4)
Now, 9 divides 3 * 9, leaving remainder (3 + 4). This process can be extended to any base 'b', since (b^n - 1) is always divisible by (b-1).
Now, coming to the problem: if a number is represented in base 1024 and divided by 1023, the remainder will be the sum of its digits (as long as that sum is below 1023).
To convert a binary number to base 1024, we can group its bits in tens, starting from the right, with each group forming a single digit.
For example, to convert the binary number 0x010884422010 (0b10000100010000100010000100010000000010000) to base 1024, we can group it into 10-bit digits as follows:
(1) (0000100010) (0001000100) (0010001000) (0000010000) =
(0b0000000001)*1024^4 + (0b0000100010)*1024^3 + (0b0001000100)*1024^2 + (0b0010001000)*1024^1 + (0b0000010000)*1024^0
So, when this number is divided by 1023, the remainder will be the sum of
0b0000000001
+ 0b0000100010
+ 0b0001000100
+ 0b0010001000
+ 0b0000010000
--------------------
0b0011111111
If you observe the above digits closely, the '1' bits in each digit occupy complementary positions. So, when added together, they pull out all 8 bits of the original number.
So, in the above code, "a * 0x000202020202" creates 5 copies of the byte "a". When the result is ANDed with 0x010884422010, we selectively choose 8 bits from the 5 copies of "a". When "% 1023" is applied, we pull all 8 bits together.
So, how does it actually reverse the bits? That is a bit clever. The idea is that the "1" bit in the digit 0b0000000001 is aligned with the MSB of the original byte. So, when you AND, you are actually ANDing the MSB of the original byte with the LSB of the magic-number digit. Similarly, the digit 0b0000100010 is aligned with the second and sixth bits from the MSB, and so on.
So, when you add all the base-1024 digits of the masked product, the resulting number is the reverse of the original byte.
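To try this out on a modern compiler, here is a sketch that swaps the PDP-10 style unsigned41 for uint64_t (my assumption; any type with at least 41 bits should do) and compares the trick against a plain loop. The function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply, mask, and reduce mod 1023, as explained above. */
static uint8_t reverse_8bits(uint8_t a)
{
    uint64_t copies = (uint64_t)a * 0x0202020202ULL;   /* five copies of a */
    uint64_t picked = copies & 0x010884422010ULL;      /* one bit of a per mask bit */
    return (uint8_t)(picked % 1023u);                  /* sum of the 10-bit digits */
}

/* Reference implementation: move one bit per iteration. */
static uint8_t reverse_8bits_naive(uint8_t a)
{
    uint8_t r = 0;
    for (int i = 0; i < 8; ++i) {
        r = (uint8_t)((r << 1) | ((a >> i) & 1u));
    }
    return r;
}

/* Compare both versions for every possible byte. */
static void reverse_self_check(void)
{
    for (unsigned a = 0; a < 256; ++a) {
        assert(reverse_8bits((uint8_t)a) == reverse_8bits_naive((uint8_t)a));
    }
}
```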

Homework - C bit puzzle - Perform % using C bit operations (no looping, conditionals, function calls, etc)

I'm completely stuck on how to do this homework problem and looking for a hint or two to keep me going. I'm limited to 20 operations (= doesn't count in this 20).
I'm supposed to fill in a function that looks like this:
/* Supposed to do x%(2^n).
For example: for x = 15 and n = 2, the result would be 3.
Additionally, if positive overflow occurs, the result should be the
maximum positive number, and if negative overflow occurs, the result
should be the most negative number.
*/
int remainder_power_of_2(int x, int n){
int twoToN = 1 << n;
/* Magic...? How can I do this without looping? We are assuming it is a
32 bit machine, and we can't use constants bigger than 8 bits
(0xFF is valid for example).
However, I can make a 32 bit number by ORing together a bunch of stuff.
Valid operations are: << >> + ~ ! | & ^
*/
return theAnswer;
}
I was thinking maybe I could shift the twoToN over left... until I somehow check (without if/else) that it is bigger than x, and then shift back to the right once... then xor it with x... and repeat? But I only have 20 operations!
Hint: In the decimal system, to compute a number modulo a power of 10, you just keep the last few digits and zero out the rest. E.g. 12345 % 100 = 00045 = 45. Well, in a computer, numbers are binary. So you have to zero out the binary digits (bits). Look at the various bit manipulation operators (&, |, ^) to do so.
Since binary is base 2, remainders mod 2^N are exactly represented by the rightmost bits of a value. For example, consider the following 32 bit integer:
00000000001101001101000110010101
This has the two's complement value of 3461525. The remainder mod 2 is exactly the last bit (1). The remainder mod 4 (2^2) is exactly the last 2 bits (01). The remainder mod 8 (2^3) is exactly the last 3 bits (101). Generally, the remainder mod 2^N is exactly the last N bits.
In short, you need to be able to take your input number, and mask it somehow to get only the last few bits.
A tip: say you're using mod 64. The value of 64 in binary is:
00000000000000000000000001000000
The modulus you're interested in is the last 6 bits. I'll provide you a sequence of operations that can transform that number into a mask (but I'm not going to tell you what they are, you can figure them out yourself :D)
00000000000000000000000001000000 // starting value
11111111111111111111111110111111 // ???
11111111111111111111111111000000 // ???
00000000000000000000000000111111 // the mask you need
Each of those steps equates to exactly one operation that can be performed on an int type. Can you figure them out? Can you see how to simplify my steps? :D
Another hint:
00000000000000000000000001000000 // 64
11111111111111111111111111000000 // -64
Since your divisor is always power of two, it's easy.
uint32_t remainder(uint32_t number, uint32_t power)
{
power = 1 << power;
return (number & (power - 1));
}
Suppose the number is 5 and the divisor is 2 (that is, power = 1)
`00000000000000000000000000000101` number
AND
`00000000000000000000000000000001` divisor - 1
=
`00000000000000000000000000000001` remainder (what we expected)
Suppose the number is 7 and the divisor is 4 (that is, power = 2)
`00000000000000000000000000000111` number
AND
`00000000000000000000000000000011` divisor - 1
=
`00000000000000000000000000000011` remainder (what we expected)
This only works when the divisor is a power of two, so use it carefully. (A divisor of 1 is fine too: the mask is then 0, and the remainder is always 0.)
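The homework function takes a signed int, where C's % gives a negative remainder for negative x, so plain masking is not quite enough there. Below is one possible sketch using only the operators listed in the question. It assumes a 32-bit int, an arithmetic right shift for negative values, and 0 <= n <= 30, and it does not attempt the "overflow" clause of the assignment, which I could not pin down. The name remainder_pow2 is mine.

```c
#include <assert.h>

/* x % (2^n) with the same sign convention as C's % operator. */
static int remainder_pow2(int x, int n)
{
    assert(n >= 0 && n <= 30);             /* sanity check only, not a counted operation */
    int mask = (1 << n) + ~0;              /* 2^n - 1, built without '-' */
    int low = x & mask;                    /* the n low bits of x */
    int sign = x >> 31;                    /* all ones if x < 0, else 0 (assumed) */
    int adjust = sign & ((!!low) << n);    /* 2^n if x < 0 and low != 0, else 0 */
    return low + (~adjust + 1);            /* low - adjust */
}
```

For x = -15 and n = 2 the low bits are 01, and subtracting 4 gives -3, which matches -15 % 4 in C.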

A "dynamic bitfield" in C

In this question, assume all integers are unsigned for simplicity.
Suppose I would like to write 2 functions, pack and unpack, which let you pack integers of smaller width into, say, a 64-bit integer. However, the location and width of the integers is given at runtime, so I can't use C bitfields.
Quickest is to explain with an example. For simplicity, I'll illustrate with 8-bit integers:
* *
bit # 8 7 6 5 4 3 2 1
myint 0 1 1 0 0 0 1 1
Suppose I want to "unpack" at location 5, an integer of width 2. These are the two bits marked with an asterisk. The result of that operation should be 0b01. Similarly, If I unpack at location 2, of width 6, I would get 0b100011.
I can write the unpack function easily with a bitshift-left followed by a bitshift right.
But I can't think of a clear way to write an equivalent "pack" function, which will do the opposite.
Say given an integer 0b11, packing it into myint (from above) at location 5 and width 2 would yield
* *
bit # 8 7 6 5 4 3 2 1
myint 0 1 1 1 0 0 1 1
Best I came up with involves a lot of concatenating bit-strings with OR, << and >>. Before I implement and test it, maybe somebody sees a clever quick solution?
Off the top of my head, untested.
int pack(int oldPackedInteger, int bitOffset, int bitCount, int value) {
int mask = (1 << bitCount) -1;
mask <<= bitOffset;
oldPackedInteger &= ~mask;
oldPackedInteger |= value << bitOffset;
return oldPackedInteger;
}
In your example:
int value = 0x63;
value = pack(value, 4, 2, 0x3);
To write the value "3" at an offset of 4 (with two bits available) when 0x63 is the current value.
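Since the question asks for unsigned 64-bit containers, here is a sketch of the same idea with uint64_t, plus the matching unpack. It assumes offsets are zero-based from the least significant bit, as in the call above. The helper names are made up, and the extra low_mask function only exists to avoid shifting by 64, which is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `width` bits set; handles width == 64 safely. */
static uint64_t low_mask(unsigned width)
{
    return (width >= 64) ? UINT64_MAX : ((UINT64_C(1) << width) - 1);
}

/* Read `width` bits starting at bit `offset` (0 = least significant). */
static uint64_t unpack_bits(uint64_t packed, unsigned offset, unsigned width)
{
    assert(width >= 1 && offset < 64 && width <= 64 - offset);
    return (packed >> offset) & low_mask(width);
}

/* Overwrite `width` bits at `offset` with `value`; excess bits of value are dropped. */
static uint64_t pack_bits(uint64_t packed, unsigned offset, unsigned width, uint64_t value)
{
    assert(width >= 1 && offset < 64 && width <= 64 - offset);
    uint64_t mask = low_mask(width) << offset;
    return (packed & ~mask) | ((value << offset) & mask);
}
```

With these, pack_bits(0x63, 4, 2, 0x3) yields 0x73, and unpack_bits(0x73, 4, 2) returns 3 again.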

How to subtract two unsigned ints with wrap around or overflow

There are two unsigned ints (x and y) that need to be subtracted. x is always larger than y. However, both x and y can wrap around; for example, if they were both bytes, after 0xff comes 0x00. The problem case is if x wraps around, while y does not. Now x appears to be smaller than y. Luckily, x will not wrap around twice (only once is guaranteed). Assuming bytes, x has wrapped and is now 0x2, whereas y has not and is 0xFE. The right answer of x - y is supposed to be 0x4.
Maybe,
( x > y) ? (x-y) : (x+0xff-y);
But I think there is another way, something involving two's complement? Also, in this embedded system, x and y are of the largest unsigned int type, so adding 0xff... is not possible.
What is the best way to write the statement (target language is C)?
Assuming two unsigned integers:
If you know that one is supposed to be "larger" than the other, just subtract. It will work provided you haven't wrapped around more than once (obviously, if you have, you won't be able to tell).
If you don't know that one is larger than the other, subtract and cast the result to a signed int of the same width. It will work provided the difference between the two is in the range of the signed int (if not, you won't be able to tell).
To clarify: the scenario described by the original poster seems to be confusing people, but is typical of monotonically increasing fixed-width counters, such as hardware tick counters, or sequence numbers in protocols. The counter goes (e.g. for 8 bits) 0xfc, 0xfd, 0xfe, 0xff, 0x00, 0x01, 0x02, 0x03 etc., and you know that of the two values x and y that you have, x comes later. If x==0x02 and y==0xfe, the calculation x-y (as an 8-bit result) will give the correct answer of 4, assuming that subtraction of two n-bit values wraps modulo 2^n - which C99 guarantees for subtraction of unsigned values. (Note: the C standard does not guarantee this behaviour for subtraction of signed values.)
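A short sketch of both cases, with hypothetical helper names. One thing to watch in C: operands narrower than int (such as uint8_t) are promoted to int before the subtraction, so the result has to be converted back to the narrow unsigned type to get the modulo-2^n wrap.

```c
#include <assert.h>
#include <stdint.h>

/* Ticks from `earlier` to `later` on an 8-bit counter that wrapped at most once. */
static uint8_t ticks_elapsed_u8(uint8_t later, uint8_t earlier)
{
    /* later - earlier is computed as int; the cast reduces it modulo 2^8. */
    return (uint8_t)(later - earlier);
}

/* Same idea for a 32-bit counter. */
static uint32_t ticks_elapsed_u32(uint32_t later, uint32_t earlier)
{
    return (uint32_t)(later - earlier);
}

/* When it is not known which value is later: signed difference in -128..127. */
static int8_t signed_delta_u8(uint8_t a, uint8_t b)
{
    uint8_t d = (uint8_t)(a - b);
    /* Map 128..255 to -128..-1 without relying on implementation-defined casts. */
    return (d < 128) ? (int8_t)d : (int8_t)((int)d - 256);
}

/* The example from the question: x = 0x02 came after y = 0xFE. */
static void ticks_self_check(void)
{
    assert(ticks_elapsed_u8(0x02, 0xFE) == 4);
    assert(signed_delta_u8(0xFE, 0x02) == -4);
}
```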
Here's a little more detail of why it 'just works' when you subtract the 'smaller' from the 'larger'.
A couple of things going into this…
1. In hardware, subtraction uses addition: The appropriate operand is simply negated before being added.
2. In two’s complement (which pretty much everything uses), an integer is negated by inverting all the bits then adding 1.
Hardware does this more efficiently than it sounds from the above description, but that’s the basic algorithm for subtraction (even when values are unsigned).
So, let's figure 2 – 250 using 8-bit unsigned integers. In binary we have
0 0 0 0 0 0 1 0
- 1 1 1 1 1 0 1 0
We negate the operand being subtracted and then add. Recall that to negate we invert all the bits then add 1. After inverting the bits of the second operand we have
0 0 0 0 0 1 0 1
Then after adding 1 we have
0 0 0 0 0 1 1 0
Now we perform addition...
0 0 0 0 0 0 1 0
+ 0 0 0 0 0 1 1 0
= 0 0 0 0 1 0 0 0 = 8, which is the result we wanted from 2 - 250
Maybe I don't understand, but what's wrong with:
unsigned r = x - y;
The question, as stated, is confusing. You said that you are subtracting unsigned values. If x is always larger than y, as you said, then x - y cannot possibly wrap around or overflow. So you just do x - y (if that's what you need) and that's it.
This is an efficient way to determine the amount of free space in a circular buffer or do sliding window flow control.
Use unsigned ints for head and tail - increment them and let them wrap!
Buffer length has to be a power of 2.
free = ((head - tail) & size_mask), where size_mask is 2^n - 1 and 2^n is the buffer or window size.
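A sketch of such a buffer with free-running indices. This is a variant of the formula above rather than a literal transcription: the difference head - tail is left unmasked, so that a full buffer (difference equal to the size) can be told apart from an empty one, and the mask is applied only when indexing. All names here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 16u                 /* must be a power of two */
#define RING_MASK (RING_SIZE - 1u)

typedef struct {
    uint8_t  data[RING_SIZE];
    uint32_t head;   /* total bytes written; free-running, allowed to wrap */
    uint32_t tail;   /* total bytes read; free-running, allowed to wrap */
} ring_t;

static void ring_init(ring_t *r)
{
    assert((RING_SIZE & RING_MASK) == 0);   /* power-of-two check */
    r->head = r->tail = 0;
}

/* Unsigned subtraction stays correct even after head has wrapped. */
static uint32_t ring_used(const ring_t *r) { return r->head - r->tail; }
static uint32_t ring_free(const ring_t *r) { return RING_SIZE - ring_used(r); }

/* Returns 1 on success, 0 if the buffer is full. */
static int ring_put(ring_t *r, uint8_t byte)
{
    if (ring_free(r) == 0) return 0;
    r->data[r->head & RING_MASK] = byte;
    r->head++;
    return 1;
}

/* Returns 1 on success, 0 if the buffer is empty. */
static int ring_get(ring_t *r, uint8_t *byte)
{
    if (ring_used(r) == 0) return 0;
    *byte = r->data[r->tail & RING_MASK];
    r->tail++;
    return 1;
}
```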
Just to put the already correct answer into code:
If you know that x is the smaller value, the following calculation just works:
#include <stdint.h>
#include <stdio.h>
int main(void)
{
uint8_t x = 0xff;
uint8_t y = x + 20;
uint8_t res = y - x;
printf("Expect 20: %d\n", res); // res is 20
return 0;
}
If you do not know which one is smaller:
#include <stdint.h>
#include <stdio.h>
int main(void)
{
uint8_t x = 0xff;
uint8_t y = x + 20;
int8_t res1 = (int8_t)(uint8_t)(x - y); // wrap to 8 bits, then reinterpret as signed
int8_t res2 = (int8_t)(uint8_t)(y - x);
printf("Expect -20 and 20: %d and %d\n", res1, res2);
return 0;
}
Where the difference must be inside the range of int8_t in this case.
The code experiment helped me to understand the solution better.
The problem should be stated as follows:
Let's assume the positions (angles) of two clock hands a and b are each given by a uint8_t. The whole circumference is divided into the 256 values of a uint8_t. How can the smaller distance between the two hands be calculated efficiently?
A solution is:
uint8_t smaller_distance = abs( (int8_t)( a - b ) );
I suspect there is nothing more efficient, as otherwise there would be something more efficient than abs().
To echo everyone else replying, if you just subtract the two and interpret the result as unsigned you'll be fine.
Unless you have an explicit counterexample.
Your example of x = 0x2, y= 0x14 would not result in 0x4, it would result in 0xEE, unless you have more constraints on the math that are unstated.
Yet another answer, and hopefully easy to understand:
SUMMARY:
It's assumed the OP's x and y are assigned values from a counter, e.g., from a timer.
(x - y) will always give the value desired, even if the counter wraps.
This assumes the counter is incremented less than 2^N times between y and x,
for N-bit unsigned int's.
DESCRIPTION:
A counter variable is unsigned and it can wrap around.
A uint8 counter would have values:
0, 1, 2, ..., 255, 0, 1, 2, ..., 255, ...
The number of counter tics between two points can be calculated as shown below.
This assumes the counter is incremented less than 256 times, between y and x.
uint8 x, y, counter, counterTics;
<initialize the counter>
<do stuff while the counter increments>
y = counter;
<do stuff while the counter increments>
x = counter;
counterTics = x - y;
EXPLANATION:
For uint8, when the number of counter tics from y to x is less than 256 (i.e., less than 2^8):
If (x >= y) then: the counter did not wrap, counterTics == x - y
If (x < y) then: the counter wrapped, counterTics == (256-y) + x
(256-y) is the number of tics before wrapping.
x is the number of tics after wrapping.
Note: if those calculations are made in the order shown, no negative numbers are involved.
This equation holds for both cases: counterTics == (256+x-y) mod 256
For uintN, where N is the number of bits:
counterTics == ((2^N)+x-y) mod (2^N)
The last equation also describes the result in C when subtracting unsigned int's, in general.
This is not to say the compiler or processor uses that equation when subtracting unsigned int's.
RATIONALE:
The explanation is consistent with what is described in this ACM paper:
"Understanding Integer Overflow in C/C++", by Dietz, et al.
HARDWARE INTEGER ARITHMETIC
When an n-bit addition or subtraction operation on unsigned or two’s complement integers overflows, the result “wraps around,” effectively subtracting 2^n from, or adding 2^n to, the true mathematical result. Equivalently, the result can be considered to occupy n+1 bits; the lower n bits are placed into the result register and the highest-order bit is placed into the processor’s carry flag.
INTEGER ARITHMETIC IN C AND C++
3.3. Unsigned Overflow
A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type.
Thus, the semantics for unsigned overflow in C/C++ are precisely the same as the semantics of processor-level unsigned overflow as described in Section 2. As shown in Table I, UINT_MAX+1 must evaluate to zero in a conforming C and C++ implementation.
Also, it's easy to write a C program to test that the cases shown work as described.
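One way such a test could look, as a sketch: it walks through all 65536 pairs of uint8 values and counts the pairs where the wrapped subtraction disagrees with (256 + x - y) mod 256. The function names are made up, and uint8_t from stdint.h stands in for the uint8 type used above.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Number of (x, y) pairs where the wrapped 8-bit subtraction differs
   from (256 + x - y) mod 256. */
static unsigned count_mismatches_u8(void)
{
    unsigned mismatches = 0;
    for (unsigned x = 0; x < 256; ++x) {
        for (unsigned y = 0; y < 256; ++y) {
            uint8_t wrapped = (uint8_t)((uint8_t)x - (uint8_t)y);
            unsigned expected = (256u + x - y) % 256u;
            if (wrapped != expected) {
                ++mismatches;
            }
        }
    }
    return mismatches;
}

/* Checks the two claims: the equation holds, and UINT_MAX + 1 wraps to zero. */
static void wrap_self_check(void)
{
    assert(count_mismatches_u8() == 0);
    assert(UINT_MAX + 1u == 0u);
}
```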
