Is it possible to optimize multiplication of an integer with -1/1 without using any multiplication and conditionals/branches?
Can it be done only with bitwise operations and integer addition?
Edit: The final goal is to optimize a scalar product of two integer vectors, where one of the vectors has only -1/1 values.
Most modern processors have an ALU with a fast multiplier (meaning it takes about the same time to add two numbers as to multiply them, give or take one CPU clock), so doing anything but for (i=0;i<VectorLength;++i) { p += (x[i] * y[i]) ; } isn't likely to help. However, try a simple if and see if that gives any benefits gained from the CPU's branch prediction:
for (i=0;i<VectorLength;++i) { p += (y[i]<0) ? -x[i] : x[i] ; }
In any case, if the CPU has fast multiply, doing any trick that involves more than one ALU operation (e.g., negation followed by addition, as in some of the examples given here) will more likely cause loss of performance compared to just one multiplication.
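To make the comparison concrete, here is a minimal sketch of the two loops as functions. The names dot_mul and dot_select are mine, not from the posts above, and the second function assumes every y[i] is exactly +1 or -1. Which one is faster is something to measure on the target CPU, not something this sketch settles.

```c
#include <stddef.h>

/* Plain multiply-accumulate over n elements. */
long dot_mul(const int *x, const int *y, size_t n)
{
    long p = 0;
    for (size_t i = 0; i < n; ++i)
        p += (long)x[i] * y[i];
    return p;
}

/* Negate-or-keep variant; assumes every y[i] is +1 or -1. */
long dot_select(const int *x, const int *y, size_t n)
{
    long p = 0;
    for (size_t i = 0; i < n; ++i)
        p += (y[i] < 0) ? -(long)x[i] : (long)x[i];
    return p;
}
```

Both functions should return the same value for any x as long as y holds only +1 and -1.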
Yes, for instance, the following function returns a*b, where b is either +1 or -1:
int mul(int a, int b)
{
    int c[3] = { -a, 0, +a };
    return c[1+b];
}
or if both a and b are restricted to +-1:
int mul(int a, int b)
{
    int c[5] = { +1, 0, -1, 0, +1 };
    return c[a+b+2];
}
Yet another variant without memory access (faster than the ones above):
int mul(int a, int b)
{
    return 1 - (signed)( (unsigned)(a+1) ^ (unsigned)(b+1) );
}
This answer works with any signed integer representation (sign-magnitude, ones' complement, two's complement) and does not cause any undefined behaviour.
However, I cannot guarantee that this will be faster than normal multiplication.
int Multiplication(int x, int PlusOrMinusOne)
{
    PlusOrMinusOne >>= 1; //becomes 0 or -1
    //optionally build 2's complement (invert all bits plus 1)
    return (x ^ PlusOrMinusOne) + (PlusOrMinusOne & 1);
}
Here is a nice resource for such tricks: Bit Twiddling Hacks.
Strictly speaking, no, because C allows for three different integer representations:
Signed magnitude
Ones' complement
Two’s complement
There is a proposal to strictly make it two's complement but until that makes it into the standard, you don't know what your negative numbers look like, so you can't really do too many bit hacks with them.
Less strictly speaking, most implementations use two's complement, so it is reasonably safe to use the hacks shown in other answers.
Assuming two's complement for integers:
int i1 = ...; // +1 or -1
int i2 = ...; // +1 or -1
unsigned u1 = i1 + 1; // 0 or 2
unsigned u2 = i2 + 1; // 0 or 2
unsigned u = u1 + u2; // 0 or 2 or 4: 0 and 4 need to become 1, 2 needs to become -1
u = (u & ~4); // 0 is 1 and 2 is -1
int i = u - 1; // -1 is 1 and 1 is -1
i = ~i + 1; // -1 is -1 and 1 is 1 :-)
The same as one-liner:
int i = ~(((i1 + i2 + 2) & ~4) - 1) + 1;
The following is true for i1 and i2 being either +1 or -1:
i == i1 * i2
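The claim is easy to check exhaustively, since there are only four input combinations. A small sketch, with the one-liner wrapped in a function whose name (sign_product) is my own choice:

```c
/* Product of two values that are each +1 or -1, using only +, -, & and ~.
   Assumes two's complement, as the answer above does. */
int sign_product(int i1, int i2)
{
    return ~(((i1 + i2 + 2) & ~4) - 1) + 1;
}
```

Calling it with (1, 1) and (-1, -1) should give 1, and with mixed signs -1.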
So I'm doing a practice Capture The Flag problem, The problem reads:
I was trying to implement RSA in C but I forgot the modulus. But I think this might be good enough already?
And then there is this code written in C attached to the question:
#include <stdio.h>
int main()
{
    unsigned long long int flag1 = <redacted>;
    flag1++;
    unsigned long long int flag2 = <redacted>;
    unsigned long long int ct1 = 1;
    unsigned long long int ct2 = 1;
    for (int i = 0; i<65537; i++)
    {
        ct1 = ct1 * flag1;
        ct2 = ct2 * flag2;
    }
    printf("%llu\n",ct1);
    printf("%llu\n",ct2);
}
/*OUTPUT:
7904812928421683021
16220282676865089917
*/
I want to get the flag, that is I want to find the values of flag1 and flag2 which outputted ct1(7904812928421683021) and ct2(16220282676865089917) after running in the program. Therefore I want to get the values of flag1 and flag 2 which is indicated by 'redacted' in the program.
I've done some research and found that 65537 is the public key exponent also in the question it states that they forgot the modulus. I've been sitting at this problem for hours now and still couldn't find anything particularly useful. I'm a beginner at cryptography so any help would be greatly appreciated.
If anyone of you could help it would mean a lot.
Thank you.
By appearances, unsigned long long int is 64 bits in the C implementation being used, so the arithmetic is performed modulo 2^64.
The loop that multiplies by flag1 or flag2 65537 times computes flag1^65537 modulo 2^64 and flag2^65537 modulo 2^64. We are given the results 7904812928421683021 and 16220282676865089917. To find flag1 and flag2 prior to the loop, we want to compute the inverse function.
By a generalization of Fermat’s little theorem, a^𝜑(n) ≡ 1 mod n for any a coprime to n, where 𝜑 is Euler’s totient function. (Both given results are odd, so the values being exponentiated are odd and therefore coprime to 2^64.) This means that exponentiating modulo n by 𝜑(n) is the same as exponentiating modulo n by 0, which means that exponentiation modulo n works modulo 𝜑(n) in the exponent. In other words, a^b ≡ a^c mod n if b ≡ c mod 𝜑(n).
The way this is useful to us is that when we have a^b mod n, if we can find a c such that bc ≡ 1 mod 𝜑(n), then we can compute (a^b)^c mod n = a^(bc) mod n = a^1 mod n = a mod n.
The Wikipedia page for Euler’s totient function tells us 𝜑(n) = n•product(1−1/p for prime p|n). (That product is the multiplication of 1−1/p for each prime p that divides n.) Since our n is 2^64, the only prime that divides it is 2, for which 1−1/2 = ½, so 𝜑(2^64) = 2^64•½ = 2^63.
Then we need to find the c such that bc ≡ 1 mod 𝜑(n) for b = 65537 and 𝜑(n) = 2^63. This can be done with the extended Euclidean algorithm. However, since we only need to do it once, and the numbers involved are large enough to be awkward to do in common C implementations, we can simply ask Wolfram Alpha for 65537^-1 mod 2^63, for which it tells us 9,223,090,566,172,966,913.
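If you would rather compute the inverse than look it up, one option is sketched below. Note that this is not the extended Euclidean algorithm mentioned above: it swaps in Newton (Hensel) iteration, which works here only because the modulus is a power of two and b is odd. The function name inverse_mod_2_63 is hypothetical.

```c
#include <stdint.h>

/* Inverse of an odd b modulo 2^63, via Newton (Hensel) iteration.
   A substitute for the extended Euclidean algorithm; requires b odd. */
uint64_t inverse_mod_2_63(uint64_t b)
{
    uint64_t c = b;               /* b*b is 1 modulo 8, so c is right to 3 bits */
    for (int i = 0; i < 5; ++i)   /* correct bits double: 6, 12, 24, 48, 96 */
        c *= 2 - b * c;           /* unsigned arithmetic wraps modulo 2^64 */
    return c & (UINT64_MAX >> 1); /* reduce the inverse mod 2^64 to mod 2^63 */
}
```

For b = 65537 this should reproduce the value quoted above, 9223090566172966913.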
Next, we need to be able to raise a number to the power of 9,223,090,566,172,966,913. As this would take too long to do by simple iterative multiplication, we can instead use the algorithm below:
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*  This routine computes x**e modulo 2^64 by multiplying a running product by
    each x**p such that p is a power of two whose corresponding bit is set in
    e.  Thus, if e is 19, which is 10011 in binary, the powers of two
    represented in it are 2^4, 2^1, and 2^0, so we multiply the running
    product by x^(2^0), x^(2^1), and x^(2^4), yielding x^(2^0 + 2^1 + 2^4) =
    x^19.
*/
static uint64_t pow_u64(uint64_t x, uint64_t e)
{
    uint64_t y = 1;     // Initialize running product to 1.
    while (e)           // Continue while bits remain in exponent.
    {
        if (e & 1)      // If current bit is set, multiply by power of x.
            y *= x;
        x *= x;         // Update to next x to a power-of-two.
        e >>= 1;        // Update e to move next bit into low position.
    }
    return y;
}

static void Do(uint64_t y)
{
    uint64_t c = 9223090566172966913u;
    printf("%" PRIu64 " ^ %" PRIu64 " = %" PRIu64 ".\n", y, c, pow_u64(y, c));
}

int main(void)
{
    Do(7904812928421683021u);
    Do(16220282676865089917u);
}
This produces the output:
7904812928421683021 ^ 9223090566172966913 = 7380380986431332173.
16220282676865089917 ^ 9223090566172966913 = 5716833052698820989.
from which we see the values of flag1 and flag2 before the loop are 7,380,380,986,431,332,173 and 5,716,833,052,698,820,989.
Since flag1 was incremented with ++ after it was initialized by <redacted>, we subtract 1 to get the initial value, 7,380,380,986,431,332,172. flag2 was directly initialized with 5,716,833,052,698,820,989.
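As a sanity check on the method, the sketch below encrypts a value the way the challenge does and then undoes it with the inverse exponent. The helper names are mine, and the round trip is only expected to hold for odd inputs, since even values are not coprime to 2^64.

```c
#include <stdint.h>

/* x**e modulo 2^64 by square-and-multiply (same idea as pow_u64 above). */
uint64_t pow_mod_2_64(uint64_t x, uint64_t e)
{
    uint64_t y = 1;
    for (; e; e >>= 1) {
        if (e & 1)
            y *= x;   /* wraps modulo 2^64 */
        x *= x;
    }
    return y;
}

/* Raise to 65537 as the challenge does, then apply the inverse exponent. */
uint64_t round_trip(uint64_t flag)
{
    uint64_t ct = pow_mod_2_64(flag, 65537);
    return pow_mod_2_64(ct, 9223090566172966913ULL);
}
```

For any odd flag, round_trip(flag) should return flag unchanged; an even input such as 2 collapses to 0 and cannot be recovered.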
I want to get the carry bit of adding two unsigned 64-bit integers in c.
I can use x86-64 asm if needed.
code:
#include <stdio.h>
typedef unsigned long long llu;
int main(void){
    llu a = -1, b = -1;
    int carry = /*carry of a+b*/;
    llu res = a+b;
    printf("a+b = %llu (because addition overflowed), carry bit = %d\n", res, carry);
    return 0;
}
As #EugeneSh. observes, the carry is either 0 or 1. Given that a and b both have the same unsigned type, their sum is well defined even if the arithmetic result exceeds the range of their type. Moreover, the (C) result of the sum will be less than both a and b when overflow occurs, and at least as large as both otherwise, so we can use the fact that C relational operations evaluate to either 0 or 1 to express the carry bit as
carry = (a + b) < a;
That does not require any headers, nor does it depend on a specific upper bound, or even on a and b having the same type. As long as both have unsigned types, it reports correctly on whether the sum overflows the wider of their types or unsigned int (whichever is wider), which is the same as their sum setting the carry bit. As a bonus, it is expressed in terms of the sum itself, which I think makes it clear what's being tested.
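Wrapped up as a function, the idea might look like the sketch below. The name add_carry_u64 and the out-parameter are my own choices, and uint64_t is assumed in place of unsigned long long.

```c
#include <stdint.h>

/* Adds a and b modulo 2^64 and stores the carry-out (0 or 1) in *carry. */
uint64_t add_carry_u64(uint64_t a, uint64_t b, unsigned *carry)
{
    uint64_t sum = a + b;   /* well defined: unsigned arithmetic wraps */
    *carry = sum < a;       /* a wrapped sum is smaller than either operand */
    return sum;
}
```

With a and b both equal to UINT64_MAX, as in the question, the sum wraps to UINT64_MAX - 1 and the carry is 1.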
Carry can be only 0 or 1. 1 if there was a wrapping-around and 0 otherwise.
The wrapping-around happens when a + b > ULLONG_MAX is true. Note that this is in mathematical terms, not in terms of C: if a + b actually overflows, the comparison will not work. Instead you want to rearrange it to be a > ULLONG_MAX - b. So the value of carry will be:
carry = a > ULLONG_MAX - b ? 1 : 0;
or any preferred style equivalent.
Don't forget to include limits.h.
A buddy of mine had these puzzles and this is one that is eluding me. Here is the problem: you are given a number and you want to return that number times 3 and divided by 16, rounding towards 0. Should be easy. The catch? You can only use the ! ~ & ^ | + << >> operators, and at most 12 of them in total.
int mult(int x){
//some code here...
return y;
}
My attempt at it has been:
int hold = x + x + x;
int hold1 = 8;
hold1 = hold1 & hold;
hold1 = hold1 >> 3;
hold = hold >> 4;
hold = hold + hold1;
return hold;
But that doesn't seem to be working. I think I have a problem of losing bits but I can't seem to come up with a way of saving them. Another perspective would be nice. Just to add, you also can only use variables of type int and no loops, if statements or function calls may be used.
Right now I have the number 0xfffffff. It is supposed to return 0x2ffffff but it is returning 0x3000000.
For this question you need to worry about the lost bits before your division (obviously).
Essentially, if it is negative then you want to add 15 after you multiply by 3. A simple if statement (using your operators) should suffice.
I am not going to give you the code but a step by step would look like,
x = x*3
get the sign and store it in variable foo.
have another variable hold x + 15;
Set up an if statement so that if x is negative it uses that added 15 and if not then it uses the regular number (times 3 which we did above).
Then divide by 16 which you already showed you know how to do. Good luck!
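For readers who want to see the steps spelled out, here is one possible sketch. It replaces the "if statement" with a mask, and it rests on assumptions the answer does not state: a 32-bit int, an arithmetic right shift for negative values, and 3*x not overflowing. The name mul3div16 is hypothetical.

```c
/* 3*x/16 rounded toward zero using only the + >> & operators.
   Assumptions: 32-bit int, arithmetic right shift of negative values,
   and x + x + x does not overflow. */
int mul3div16(int x)
{
    int x3 = x + x + x;           /* x * 3 */
    int bias = (x3 >> 31) & 15;   /* 15 if x3 is negative, 0 otherwise */
    return (x3 + bias) >> 4;      /* biased shift truncates toward zero */
}
```

This uses six operators, and mul3div16(0xfffffff) gives 0x2ffffff, the value the question expects.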
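This seems to work (as long as no overflow occurs), although for negative inputs it rounds toward negative infinity (like Math.floor below) rather than toward zero: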
((num<<2)+~num+1)>>4
Try this JavaScript code, run in console:
for (var num = -128; num <= 128; ++num) {
    var a = Math.floor(num * 3 / 16);
    var b = ((num<<2)+~num+1)>>4;
    console.log(
        "Input:", num,
        "Regular math:", a,
        "Bit math:", b,
        "Equal: ", a===b
    );
}
The Maths
When you divide a positive integer n by 16, you get a positive integer quotient k and a remainder c < 16:
(n/16) = k + (c/16).
(Or simply apply Euclidean division.) The question asks for multiplication by 3/16, so multiply by 3:
(n/16) * 3 = 3k + (c/16) * 3.
The number k is an integer, so the part 3k is still a whole number. However, int arithmetic rounds down, so the second term may lose precision if you divide first. And since c < 16, you can safely multiply first without overflowing (3c is at most 45, which fits in any int). So the algorithm design can be
(3n/16) = 3k + (3c/16).
The design
The integer k is simply n/16 rounded down towards 0. So k can be found by applying a single AND operation. Two further operations will give 3k. Operation count: 3.
The remainder c can also be found using an AND operation (with the missing bits). Multiplication by 3 uses two more operations. A shift finishes the division. Operation count: 4.
Adding them together gives you the final answer.
Total operation count: 8.
Negatives
The above algorithm uses shift operations. It may not work well on negatives. However, assuming two's complement, the sign of n is stored in a sign bit. It can be removed before applying the algorithm and reapplied on the answer.
To find and store the sign of n, a single AND is sufficient.
To remove this sign, OR can be used.
Apply the above algorithm.
To restore the sign bit, use a final OR operation on the algorithm output with the stored sign bit.
This brings the final operation count up to 11.
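What you can do is first divide by 4, then add the result to itself three times, then divide by 4 again:
3*x/16=(x/4+x/4+x/4)/4
With this logic the program can be written as follows. (Note that the first division discards the low two bits, so the result can be smaller in magnitude than the exact 3*x/16; for x = 7 it gives 0 instead of 1.)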
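#include <stdio.h>

int main(void)
{
    int x = 0xefffffff;
    int y;
    printf("%x", x);
    y = x & (0x80000000);              /* isolate the sign bit */
    y = y >> 31;                       /* all ones if x is negative, else 0 */
    x = (y & (~x + 1)) + (~y & (x));   /* absolute value of x */
    x = x >> 2;                        /* divide by 4 */
    x = x & (0x3fffffff);
    x = x + x + x;                     /* multiply by 3 */
    x = x >> 2;                        /* divide by 4 again */
    x = x & (0x3fffffff);
    x = (y & (~x + 1)) + (~y & (x));   /* restore the original sign */
    printf("\n%x %d", x, x);
    return 0;
}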
AND with 0x3fffffff to make the MSBs zero. It will even convert numbers to positive.
This uses the 2's complement of negative numbers. With direct methods of dividing there will be a loss of bit accuracy for negative numbers, so use this workaround of converting a negative number to a positive one and then performing the division operations.
Note that the C99 standard states in section 6.5.7 that a right shift of a negative signed integer yields an implementation-defined result. Under the provisions that int comprises 32 bits and that right shifting of signed integers maps to an arithmetic shift instruction, the following code works for all int inputs. A fully portable solution that also fulfills the requirements set out in the question may be possible, but I cannot think of one right now.
My basic idea is to split the number into high and low bits to prevent intermediate overflow. The high bits are divided by 16 first (this is an exact operation), then multiplied by three. The low bits are first multiplied by three, then divided by 16. Since arithmetic right shift rounds towards negative infinity instead of towards zero like integer division, a correction needs to be applied to the right shift for negative numbers. For a right shift by N, one needs to add 2^N - 1 prior to the shift if the number to be shifted is negative.
#include <stdio.h>
#include <stdlib.h>

int ref (int a)
{
    long long int t = ((long long int)a * 3) / 16;
    return (int)t;
}

int main (void)
{
    int a, t, r, c, res;
    a = 0;
    do {
        t = a >> 4;         /* high order bits */
        r = a & 0xf;        /* low order bits */
        c = (a >> 31) & 15; /* shift correction. Portable alternative: (a < 0) ? 15 : 0 */
        res = t + t + t + ((r + r + r + c) >> 4);
        if (res != ref(a)) {
            printf ("!!!! error a=%08x res=%08x ref=%08x\n", a, res, ref(a));
            return EXIT_FAILURE;
        }
        a++;
    } while (a);
    return EXIT_SUCCESS;
}
I found many posts about bitwise division and I completely understand most bitwise usage, but I can't think of how to do a specific division. I want to divide a given number (let's say 100) by all the multiples of 2 possible (ATTENTION: I don't want to divide by powers of 2 but by multiples!)
For example: 100/2, 100/4, 100/6, 100/8, 100/10...100/100
Also I know that because of using unsigned int the answers will be truncated, for example 100/52=1, but it doesn't really matter, because I can either skip those answers or print them, no problem. My concern is mostly how I can divide by 6 or 10, etc. (multiples of 2). There is no need for it to be done in C, because I can manage to transform any code you give me from Java to C.
Following the math shown for the accepted solution to the division by 3 question, you can derive a recurrence for the division algorithm:
To compute (int)(X / Y)
Let k be such that 2^k ≥ Y and 2^(k-1) < Y
(note, 2^k = (1 << k))
Let d = 2^k - Y
Then, if A = (int)(X / 2^k) and B = X % 2^k,
X = (1 << k) * A + B
= (1 << k) * A - Y * A + Y * A + B
= d * A + Y * A + B
= Y * A + (d * A + B)
Thus,
X/Y = A + (d * A + B)/Y
In other words,
If S(X, Y) := X/Y, then S(X, Y) := A + S(d * A + B, Y).
This recurrence can be implemented with a simple loop. The stopping condition for the loop is when the numerator falls below 2^k. The function divu implements the recurrence, using only bitwise operators and using unsigned types. Helper functions for the math operations are left unimplemented, but shouldn't be too hard (the linked answer provides a full add implementation already). The rs() function is for "right-shift", which does sign extension on the unsigned input. The function div is the actual API for int, and checks for divide by zero and negative y before delegating to divu. negate does 2's complement negation.
static unsigned divu (unsigned x, unsigned y) {
    unsigned k = 0;
    unsigned pow2 = 0;
    unsigned mask = 0;
    unsigned diff = 0;
    unsigned sum = 0;
    while ((1 << k) < y) k = add(k, 1);
    pow2 = (1 << k);
    mask = sub(pow2, 1);
    diff = sub(pow2, y);
    while (x >= pow2) {
        sum = add(sum, rs(x, k));
        x = add(mul(diff, rs(x, k)), (x & mask));
    }
    if (x >= y) sum = add(sum, 1);
    return sum;
}

int div (int x, int y) {
    assert(y);
    if (y > 0) return divu(x, y);
    return negate(divu(x, negate(y)));
}
This implementation depends on signed int using 2's complement. For maximal portability, div should convert negative arguments to 2's complement before calling divu. Then, it should convert the result from divu back from 2's complement to the native signed representation.
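To try the recurrence without writing the helpers first, here is a sketch in which add, sub, mul and rs are simply replaced by the ordinary C operators. That defeats the "bitwise only" goal but makes the control flow testable. The name divu_sketch is mine, and the code assumes 0 < y <= 2^31 so that the shift 1u << k stays in range for a 32-bit unsigned.

```c
/* Sketch of the recurrence S(X, Y) = A + S(d*A + B, Y), with the helper
   functions replaced by ordinary operators. Assumes 0 < y <= 2^31. */
unsigned divu_sketch(unsigned x, unsigned y)
{
    unsigned k = 0;
    while ((1u << k) < y)
        ++k;                       /* smallest k with 2^k >= y */
    unsigned pow2 = 1u << k;
    unsigned mask = pow2 - 1;
    unsigned diff = pow2 - y;      /* d = 2^k - Y */
    unsigned sum = 0;
    while (x >= pow2) {
        unsigned a = x >> k;       /* A = X / 2^k */
        sum += a;
        x = diff * a + (x & mask); /* next numerator: d*A + B */
    }
    if (x >= y)
        ++sum;                     /* leftover numerator lies in [y, 2^k) */
    return sum;
}
```

For example, divu_sketch(100, 6) walks the numerator through 100, 28, 10, 4 and returns 16.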
The following code works for positive numbers. When the dividend or the divisor or both are negative, have flags to change the sign of the answer appropriately.
int divi(long long m, long long n)
{
    if(m==0 || n==0 || m<n)
        return 0;
    long long a,b;
    int f=0;
    a=n;b=1;
    while(a<=m)
    {
        b = b<<1;
        a = a<<1;
        f=1;
    }
    if(f)
    {
        b = b>>1;
        a = a>>1;
    }
    b = b + divi(m-a,n);
    return b;
}
Use the operator / for integer division as much as you can.
For instance, when you want to divide 100 by 6 or 10 you should write 100/6 or 100/10.
When you mention bitwise division, do you mean (1) an implementation of operator / or (2) division by a power of two?
For (1), a processor should have an integer division unit. If not, the compiler should provide a good implementation.
For (2), you can use 100>>2 instead of 100/4. If the divisor is known at compile time, then a good compiler should automatically use the shift instruction.
Let us say we have x and y and both are signed integers in C, how do we find the most accurate mean value between the two?
I would prefer a solution that does not take advantage of any machine/compiler/toolchain specific workings.
The best I have come up with is: (a / 2) + (b / 2) + !!(a % 2) * !!(b % 2). Is there a solution that is more accurate? Faster? Simpler?
What if we know if one is larger than the other a priori?
Thanks.
D
Editor's Note: Please note that the OP expects answers that are not subject to integer overflow when input values are close to the maximum absolute bounds of the C int type. This was not stated in the original question, but is important when giving an answer.
Added 4 years after the accepted answer:
I would expect the function int average_int(int a, int b) to:
1. Work over the entire range of [INT_MIN..INT_MAX] for all combinations of a and b.
2. Have the same result as (a+b)/2, as if using wider math.
When a wider type int2x (twice the width of int) exists, #Santiago Alessandri's approach works well.
int avgSS(int a, int b) {
    return (int) ( ((int2x) a + b) / 2);
}
Otherwise a variation on #AProgrammer:
Note: wider math is not needed.
int avgC(int a, int b) {
    if ((a < 0) == (b < 0)) {  // a,b same sign
        return a/2 + b/2 + (a%2 + b%2)/2;
    }
    return (a+b)/2;
}
A solution with more tests, but without %
All below solutions "worked" to within 1 of (a+b)/2 when overflow did not occur, but I was hoping to find one that matched (a+b)/2 for all int.
#Santiago Alessandri Solution works as long as the range of int is narrower than the range of long long - which is usually the case.
((long long)a + (long long)b) / 2
#AProgrammer, the accepted answer, fails about 1/4 of the time to match (a+b)/2. Example inputs like a == 1, b == -2
a/2 + b/2 + (a%2 + b%2)/2
#Guy Sirton, Solution fails about 1/8 of the time to match (a+b)/2. Example inputs like a == 1, b == 0
int sgeq = ((a<0)==(b<0));
int avg = ((!sgeq)*(a+b)+sgeq*(b-a))/2 + sgeq*a;
#R.., Solution fails about 1/4 of the time to match (a+b)/2. Example inputs like a == 1, b == 1
return (a-(a|b)+b)/2+(a|b)/2;
#MatthewD, now deleted solution fails about 5/6 of the time to match (a+b)/2. Example inputs like a == 1, b == -2
unsigned diff;
signed mean;
if (a > b) {
    diff = a - b;
    mean = b + (diff >> 1);
} else {
    diff = b - a;
    mean = a + (diff >> 1);
}
If (a^b)<0 (that is, a and b have opposite signs) you can just use (a+b)/2 without fear of overflow.
Otherwise, try (a-(a|b)+b)/2+(a|b)/2. -(a|b) is at least as large in magnitude as both a and b and has the opposite sign, so this avoids the overflow.
I did this quickly off the top of my head so there might be some stupid errors. Note that there are no machine-specific hacks here. All behavior is completely determined by the C standard and the fact that it requires twos-complement, ones-complement, or sign-magnitude representation of signed values and specifies that the bitwise operators work on the bit-by-bit representation. (Correction: nope, the relative magnitude of a|b depends on the representation...)
Edit: You could also use a+(b-a)/2 when they have the same sign. Note that this will give a bias towards a. You can reverse it and get a bias towards b. My solution above, on the other hand, gives bias towards zero if I'm not mistaken.
Another try: One standard approach is (a&b)+(a^b)/2. In twos complement it works regardless of the signs, but I believe it also works in ones complement or sign-magnitude if a and b have the same sign. Care to check it?
Edit: version fixed by #chux - Reinstate Monica:
if ((a < 0) == (b < 0)) {  // a,b same sign
    return a/2 + b/2 + (a%2 + b%2)/2;
} else {
    return (a+b)/2;
}
Original answer (I'd have deleted it if it hadn't been accepted).
a/2 + b/2 + (a%2 + b%2)/2
Seems the simplest one fitting the bill of no assumption on implementation characteristics (it has a dependency on C99, which specifies the result of / as "truncated toward 0", while it was implementation-dependent in C90).
It has the advantage of having no test (and thus no costly jumps) and all divisions/remainder are by 2 so the use of bit twiddling techniques by the compiler is possible.
For unsigned integers the average is the floor of (x+y)/2. But the same fails for signed integers: the formula fails for integers whose sum is an odd negative number, as the floor is then one less than the average truncated toward zero.
You can read up more at Hacker's Delight in section 2.5
The code to calculate average of 2 signed integers without overflow is
int t = (a & b) + ((a ^ b) >> 1);
unsigned t_u = (unsigned)t;
int avg = t + ( (t_u >> 31 ) & (a ^ b) );
I have checked its correctness using the Z3 SMT solver.
Just a few observations that may help:
"Most accurate" isn't necessarily unique with integers. E.g. for 1 and 4, 2 and 3 are an equally "most accurate" answer. Mathematically (not C integers):
(a+b)/2 = a+(b-a)/2 = b+(a-b)/2
Let's try breaking this down:
If sign(a)!=sign(b) then a+b will not overflow. This case can be determined by comparing the most significant bit in a two's complement representation.
If sign(a)==sign(b) then if a is greater than b, (a-b) will not overflow. Otherwise (b-a) will not overflow. EDIT: Actually neither will overflow.
What are you trying to optimize exactly? Different processor architectures may have different optimal solutions. For example, in your code replacing the multiplication with an AND may improve performance. Also in a two's complement architecture you can simply use (a & b & 1).
I'm just going to throw some code out, not looking too fast but perhaps someone can use and improve:
int sgeq = ((a<0)==(b<0));
int avg = ((!sgeq)*(a+b)+sgeq*(b-a))/2 + sgeq*a;
I would do this: convert both to long long (64-bit signed integers) and add them up, which won't overflow, and then divide the result by 2:
((long long)a + (long long)b) / 2
If you want the decimal part, store it as a double.
It is important to note that the result will fit in a 32 bit integer.
If you are using the highest-rank integer, then you can use:
((double)a + (double)b) / 2
This answer fits to any number of integers:
int[] array = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
decimal avg = 0;
for (int i = 0; i < array.Length; i++){
    avg = (array[i] - avg) / (i+1) + avg;
}
expects avg == 5.0 for this test