It seems it should be straightforward to shift/rotate a 256-bit value by n bits.
However, the programming language I'm using (Solidity) doesn't have any such operator (i.e. there's no shift or rotate operator)...
I have an unsigned, 256-bit integer (which is a Solidity type uint256).
I was wondering if I could somehow do a shift or rotate operation "manually"?
I mean, perform some series of multiplication (*), mod (%) or similar operations to give the desired shift and rotate? I know this could be very inefficient, but I only need to do this operation once or twice an hour so it doesn't matter in my use-case.
If there isn't a shift operator, then you will likely have to repeat a *2 step, or, slightly better, do it in a single multiplication:
val * 2^(number of shifts)
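For what it's worth, here is a minimal sketch of that idea in C, using uint64_t as a stand-in for Solidity's uint256 (an assumption on my part; for the real thing WIDTH would be 256 and the pow2 helper would be 2**n, with the same *, / and % arithmetic):

    #include <stdint.h>

    /* Sketch: shift and rotate expressed only with *, / and %.  uint64_t stands in
       for Solidity's uint256; WIDTH would be 256 there. */
    #define WIDTH 64

    static uint64_t pow2(unsigned n) {        /* 2^n for n < WIDTH */
        uint64_t p = 1;
        while (n--) p *= 2;
        return p;
    }

    static uint64_t shl(uint64_t x, unsigned n) {   /* x << n */
        if (n == 0) return x;
        if (n >= WIDTH) return 0;
        return (x % pow2(WIDTH - n)) * pow2(n);     /* drop the overflowing bits first */
    }

    static uint64_t shr(uint64_t x, unsigned n) {   /* x >> n */
        return (n >= WIDTH) ? 0 : x / pow2(n);
    }

    static uint64_t rotl(uint64_t x, unsigned n) {  /* rotate left by n */
        n %= WIDTH;
        if (n == 0) return x;
        /* low WIDTH-n bits move up by n; top n bits wrap around to the bottom */
        return (x % pow2(WIDTH - n)) * pow2(n) + x / pow2(WIDTH - n);
    }

The masking in shl is only there so the multiplication can never overflow, which matters if the target language checks arithmetic (recent Solidity versions do).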
My program makes heavy use of the operation n % 10. I know that the modulo operation can be done much faster when we have n % m where m is a power of 2, since it can be replaced by n & (m - 1). However, is there any faster way to calculate the modulus if the operand is 10?
In some cases n is a uint8_t and in other cases it is a uint32_t.
Because most modern processors can do multiplication much, much faster than division, it is often possible to speed up division and modulus operations where the divisor is a known small constant by replacing the division with one or two multiplications and a few other fast operations (such as shift and addition).
To do so requires computing at compile time some magic numbers dependent on the divisor; fortunately most modern compilers know how to do this, so you don't need to do anything to take advantage of it. Just let your compiler do the heavy lifting for you, as @chux suggests in an excellent answer.
You can help the compiler by using unsigned types; for some divisors, signed division and modulus are harder to replace.
The basic outline of the optimisation of modulus looks like this:
If you had exact arithmetic, you could replace x % p with p * ((x * (1/p)) % 1). For constant p, 1/p can be precomputed at compile time. The %1 operation simply consists of discarding the fraction part, which is just a right-shift. So that replaces a division with two multiplies, and if p only has a few bits set, the multiply by p might be further optimised into a few left-shifts.
We can do that computation with fixed-point arithmetic, taking advantage of the fact that most processors produce a double-sized result for integer multiplication. Since we don't care about the integer part of the inner multiplication and we know that the result of the outer multiplication must be less than p, we only need to reserve ceil(log2 p) bits for the integer part of the computation, leaving the rest of the bits for the fraction. That can give us enough precision to correctly handle the possible range of values of x, particularly if x has a limited range (e.g. uint8_t or even uint16_t). The key is finding a position for the fixed point which minimises the error in the representation of 1/p.
For many small values of p, that works. For others, there is an alternative (but slower) solution which involves estimating q = x/p using multiplication by the inverse, and then computing x - q * p. If the estimate of q can be guaranteed to be either correct or off by one in a known direction, we only need to correct the final computation by conditionally adding or subtracting p; that can be accomplished without a branch on many modern CPUs. (The direction of the error is known because it depends only on whether the approximation we chose for the inverse of the divisor was too small or too big.)
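As a concrete illustration of the fixed-point idea (my own sketch, not taken from the answer above, and only worth writing by hand if you've confirmed your compiler isn't already doing it), here it is in C with the divisor 10 hard-coded; the magic constants are reciprocal approximations of 1/10 with enough fraction bits for the stated input ranges:

    #include <stdint.h>

    /* Sketch: x % 10 via multiply-by-reciprocal.  A decent compiler generates
       equivalent code for plain `x % 10`, so treat this as illustration only. */

    /* Exact for x < 1024 (so certainly for uint8_t): 205/2048 approximates 1/10
       closely enough over that range. */
    static uint8_t mod10_u8(uint8_t x) {
        uint32_t q = ((uint32_t)x * 205) >> 11;   /* q == x / 10 */
        return (uint8_t)(x - q * 10);             /* r == x - 10*q */
    }

    /* Exact for all uint32_t: 0xCCCCCCCD / 2^35 approximates 1/10 closely enough
       over the full 32-bit range. */
    static uint32_t mod10_u32(uint32_t x) {
        uint32_t q = (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);  /* q == x / 10 */
        return x - q * 10;
    }

A quick exhaustive loop over the relevant input range is a cheap way to convince yourself the constants are right before trusting them.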
In the very specific case of x % 10 where x is a uint8_t, you might be able to do better than the above using a 256-byte lookup table. That would only be worthwhile if you were doing the modulus operation in a tight loop over a large number of values, and even then you'd want to profile carefully to verify that it is an improvement.
I doubt whether that's the best expenditure of your time; there are probably much more fruitful optimisation opportunities in your application.
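For completeness, the table idea mentioned above is tiny to sketch (again, only profiling can tell you whether it beats the multiply):

    #include <stdint.h>

    /* Sketch: 256-entry table for x % 10 when x is a uint8_t. */
    static uint8_t mod10_table[256];

    static void init_mod10_table(void) {
        for (int i = 0; i < 256; i++)
            mod10_table[i] = (uint8_t)(i % 10);
    }

    /* usage: r = mod10_table[x]; instead of r = x % 10; */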
However, is there any faster way to calculate the modulus if the operand is 10?
With a good compiler, no. The compiler would have already emitted good code. You can explore different optimization settings with the compiler.
OTOH, if you know of some restriction that the compiler cannot assume about n % 10, such as the values always being positive or confined to a sub-range, you might be able to out-optimize the compiler.
Such micro-optimisation is usually not an efficient use of a programmer's time.
Is there a way to implement the Left Arithmetic Shift and the Right Arithmetic Shift using only the operations AND, OR, NOT, and XOR?
In each of the operations AND, OR, NOT, and XOR, each bit in the result is solely a function of the one (NOT) or two (AND, OR, XOR) bits in the same position in the operands. In a shift by any amount other than zero, each bit in the result is a function of a bit in a different position in the operand being shifted. Therefore, it is not possible to compute a shift solely from AND, OR, NOT, and XOR.
Consider a = 0b0011.
Then we have ~a = 0b1100.
We also have a | ~a = 0b1111.
And also a & ~a = 0b0000.
You can manually check all possible combinations of &, ^, ~, and | to see that we can't make anything more than those four binary values. None of which are 0b0110 (what we want from left shift) or 0b0001 (what we want from right shift).
Since we found a value for which it can't be done, we know it can't be done in general.
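If you don't want to check the combinations by hand, a small brute-force program (my own sketch) confirms it: starting from a = 0b0011 and closing the set under ~, &, | and ^ over 4-bit values, nothing outside {0b0000, 0b0011, 0b1100, 0b1111} ever appears, so in particular 0b0110 is unreachable.

    #include <stdio.h>

    /* Brute force: close {a} under ~, &, |, ^ over 4-bit values and check whether
       a shifted value such as 0b0110 ever shows up. */
    int main(void) {
        int in_set[16] = {0};
        in_set[0x3] = 1;                        /* a = 0b0011 */

        int changed = 1;
        while (changed) {
            changed = 0;
            for (unsigned x = 0; x < 16; x++) {
                if (!in_set[x]) continue;
                unsigned nx = ~x & 0xF;         /* ~x, kept to 4 bits */
                if (!in_set[nx]) { in_set[nx] = 1; changed = 1; }
                for (unsigned y = 0; y < 16; y++) {
                    if (!in_set[y]) continue;
                    unsigned cand[3] = { x & y, x | y, x ^ y };
                    for (int k = 0; k < 3; k++)
                        if (!in_set[cand[k]]) { in_set[cand[k]] = 1; changed = 1; }
                }
            }
        }

        printf("reachable:");
        for (unsigned x = 0; x < 16; x++)
            if (in_set[x]) printf(" 0x%X", x);
        printf("\n0b0110 reachable: %s\n", in_set[0x6] ? "yes" : "no");
        return 0;
    }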
IT IS POSSIBLE
Basically, shifting is done in hardware with a barrel shifter, which in turn can be built from multiplexers. And you can create multiplexers easily using the logical operations that you mentioned. Here is a 2-to-1 MUX created with NAND gates: [image: Mux with NAND]
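As a software illustration of that MUX only (in a real barrel shifter the "shift by 2^k" inputs to each stage are just wiring, which is exactly the across-bit-positions step the previous answer says pure bitwise operators can't express), a 2-to-1 mux built from NAND alone might look like this sketch, where sel is all-zero or all-one bits:

    #include <stdint.h>

    /* Sketch: 2-to-1 MUX from NAND only, evaluated bitwise.
       sel must be all-zero bits (select a) or all-one bits (select b). */
    static uint32_t nand(uint32_t x, uint32_t y) { return ~(x & y); }

    static uint32_t mux2to1(uint32_t a, uint32_t b, uint32_t sel) {
        /* out = NAND(NAND(a, NOT sel), NAND(b, sel)), with NOT sel = NAND(sel, sel) */
        return nand(nand(a, nand(sel, sel)), nand(b, sel));
    }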
If I want to combine two numbers (Int, Long, ...) n1, n2 in a non-commutative way, p*n1 + n2 where p is an arbitrary prime seems a reasonable enough choice.
As many hashing options return a byte array, though, I am now trying to substitute the numbers with byte arrays.
Assume a,b:Array[Byte] are of the same length.
+ simply becomes an xor
but what should I use as a "Multiplication"?
p: Long is a(n arbitrary) prime; a: Array[Byte] is of arbitrary length.
I could, of course, convert a to a long, multiply, then convert the result back to an Array of Bytes. The problem with that is that I will need "p*a" to be of the same length as a for the subsequent xor to make sense. I could circumvent this by zero-extending the shorter of the two byte arrays, but then the byte arrays quickly grow in length.
I could, on the other hand, convert p to a byte array and xor it with a. Here, the issue is that then (p*(p*a+b)+c) becomes (a+b+c), which is commutative, which we don't want.
I could add p to every byte in the array (throwing away the overflow).
I could add p to every byte in the array (not throwing away the overflow).
I could circular shift a by some f(p) bits (and hope it doesn't end up becoming a again)
And I could think of a lot more nonsense. But what should I do? What actually makes sense?
If you want to mimic the original idea of multiplying by a prime, the obvious generalization is to do arithmetic in the Galois field GF(2^8) - see https://en.wikipedia.org/wiki/Finite_field_arithmetic and note that you can essentially use log and antilog tables of size 256 to replace multiplication with not much more than a table lookup - https://en.wikipedia.org/wiki/Finite_field_arithmetic#Implementation_tricks. Arithmetic over a finite field of any sort will have many of the nice properties of arithmetic modulo a prime - arithmetic modulo p is GF(p), or GF(p^1) if you prefer.
However, this is all rather untried and perhaps a little high-flown. Other options include checksum algorithms such as https://en.wikipedia.org/wiki/Adler-32, or - if you already have a hash algorithm that maps long strings into a short array of bytes - simply concatenating the two byte arrays to be combined and running the result through the hash algorithm again, perhaps with some padding before and after to give you some parameters you can play with if you need to vary or tune things.
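If you want to experiment with the GF(2^8) suggestion, a byte-by-byte multiply is short to write even without the log/antilog tables. A sketch in C; the reduction polynomial (0x11B, the one AES uses) is an arbitrary choice on my part, and any irreducible degree-8 polynomial would do:

    #include <stdint.h>

    /* Sketch: multiplication in GF(2^8) by shift-and-xor, reducing modulo
       x^8 + x^4 + x^3 + x + 1 (0x11B, the AES polynomial). */
    static uint8_t gf256_mul(uint8_t a, uint8_t b) {
        uint8_t product = 0;
        while (b) {
            if (b & 1)
                product ^= a;             /* "add" (xor) the current multiple of a */
            uint8_t carry = a & 0x80;
            a <<= 1;                      /* multiply a by x */
            if (carry)
                a ^= 0x1B;                /* reduce: xor in the polynomial */
            b >>= 1;
        }
        return product;
    }

Applied bytewise with a fixed constant p (anything other than 0 or 1), p*a xor b keeps the array length and stays non-commutative in the same way the Long version was.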
I am writing a CRC program in C. Basically, I am taking the input in binary as a char array, for both the dividend and the divisor. Now I want to perform the division operation on these two numbers. To do the arithmetic I will first convert the characters to integers, for example t[1] - '0'. Now how do I perform the bitwise modulo operation on these bits? Or, if anyone knows a better way to implement CRC on the sender and receiver side, please suggest it.
OK, I think my words are a bit confusing. I will try to explain what I want to do with a simple example: suppose the dividend entered is 11100101 and the divisor is 11011; then what should happen is
Observe in the image that, at each step of the division, the selected bits are XORed against the divisor rather than subtracted. I want the same thing to happen in my program as is illustrated in the image. How do I perform this kind of bit-by-bit division?
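Since the dividend and divisor are already character arrays of '0'/'1', one option is to do the long division directly on those characters. A rough sketch (hypothetical function name, fixed-size work buffer, and it assumes the dividend is at least as long as the divisor):

    #include <string.h>

    /* Sketch: CRC-style long division on '0'/'1' strings.
       dividend and divisor are NUL-terminated; rem receives the remainder,
       which has strlen(divisor) - 1 characters plus a NUL. */
    static void crc_divide(const char *dividend, const char *divisor, char *rem) {
        size_t n = strlen(dividend);
        size_t k = strlen(divisor);
        char work[256];                       /* assumes the dividend fits */
        strcpy(work, dividend);

        for (size_t i = 0; i + k <= n; i++) {
            if (work[i] == '1') {             /* divisor "goes into" this position */
                for (size_t j = 0; j < k; j++)
                    work[i + j] = (work[i + j] == divisor[j]) ? '0' : '1';  /* XOR */
            }
        }
        memcpy(rem, work + (n - (k - 1)), k - 1);   /* last k-1 bits are the remainder */
        rem[k - 1] = '\0';
    }

For the inputs 11100101 and 11011 this leaves the remainder 1011; for a real CRC you would first append strlen(divisor) - 1 zero bits to the message before dividing.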
I have a big number (an unsigned integer) stored in 2 variables (as you can see, the high and low parts of the number):
unsigned long long int high;
unsigned long long int low;
I know how to add or subtract another number of the same kind.
But I need to divide numbers of this kind. How do I do it? I know I can subtract N times, but maybe there are better solutions. ;-)
Language: C
Yes. It will involve shifts, and I don't recommend doing that in C. This is one of those rare examples where assembler can still prove its value, easily making things run hundreds of times faster (and I don't think I'm exaggerating).
I don't claim total correctness, but the following outline should get you going (a C sketch of it follows the outline):
(1) Initialize result to zero.
(2) Shift divisor as many bits as possible to the left, without letting it become greater than the dividend.
(3) Subtract shifted divisor from dividend and add one to result.
(4) Now shift the divisor to the right until, once again, it is no greater than the remaining dividend, and for each right-shift, left-shift the result by one bit. Go back to (3) unless the stopping condition is satisfied. (The stopping condition must be something like "the divisor has been shifted back past its original position", but I'm not certain about that.)
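Here is that outline turned into plain C, in the equivalent schoolbook bit-at-a-time form (shift the remainder left, bring down one dividend bit, subtract the divisor whenever it fits). It is a sketch aimed at correctness for the (high, low) representation, not at speed:

    #include <stdint.h>

    typedef struct { uint64_t hi, lo; } u128;   /* the (high, low) pair */

    static int u128_ge(u128 a, u128 b) {        /* a >= b ? */
        return a.hi > b.hi || (a.hi == b.hi && a.lo >= b.lo);
    }

    static u128 u128_sub(u128 a, u128 b) {      /* a - b, assuming a >= b */
        u128 r;
        r.lo = a.lo - b.lo;
        r.hi = a.hi - b.hi - (a.lo < b.lo);     /* borrow out of the low word */
        return r;
    }

    static u128 u128_shl1(u128 a) {             /* a << 1 */
        u128 r;
        r.hi = (a.hi << 1) | (a.lo >> 63);
        r.lo = a.lo << 1;
        return r;
    }

    /* q = n / d, *rem = n % d; d must be nonzero. */
    static u128 u128_div(u128 n, u128 d, u128 *rem) {
        u128 q = {0, 0}, r = {0, 0};
        for (int i = 127; i >= 0; i--) {
            r = u128_shl1(r);                              /* make room ...          */
            r.lo |= (i >= 64) ? (n.hi >> (i - 64)) & 1     /* ... and bring down      */
                              : (n.lo >> i) & 1;           /* bit i of the dividend   */
            if (u128_ge(r, d)) {                           /* divisor fits: subtract  */
                r = u128_sub(r, d);
                if (i >= 64) q.hi |= 1ULL << (i - 64);
                else         q.lo |= 1ULL << i;
            }
        }
        *rem = r;
        return q;
    }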
It really feels great to get back to some REAL programming problems :-)
Have you looked at any large-number libraries, such as GNU MP BigNum?
I know I can subtract N times, but maybe there are better solutions.
Subtracting N times may be slow when N is large.
Better (i.e. more complicated but faster) would be shift-and-subtract, using the algorithm you learned to do long division of decimal numbers in elementary school.
[There may also be 3rd-party library and/or compiler-specific support for such numbers.]
Hmm. I suppose if you have some headroom in "high", you could shift it all up one digit, divide high by the number, then add the remainder to the top remaining digit in low and divide low by the number, then shift everything back.
Here's another library doing 128 bit arithmetic. GnuCash: Math128.
Per my commenters below, my previous answer was stupid.
Quickly, my new answer would be that when I've tried to do this in the past, it almost always involved shifting, because it's the only operation that can be applied across multiple "words", if you will, and have it look the same as if it were one large word (with the exception of having to track carryover bits).
There are a couple different approaches to it, but I don't know of any better general direction than using shifts, unless your hardware has some special operations.
You could implement a "BigInt" type algorithm that does divisions on string arrays. Create 1 string array for each high,low pair and do the division. Store the result in another string array, then convert back to high,low integer pair.
Since the language is C, the array would probably be a character array. Consider it analogous to the "string array" I was mentioning above.
You can do addition and subtraction of arbitrarily large binary objects using assembler loops and the "add/subtract with carry" (adc/sbb) instructions. You can implement the other operations using them. I've never personally investigated doing anything beyond those two.
If your processor (or your C library) has a fast 64-bit divide, you can break the 128-bit divide into pieces (the same way you'd do a 32-bit divide on processors that had 16-bit divisions).
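To make the divide-in-pieces idea concrete, here is a sketch (hypothetical function name) for the common case where the divisor fits in 32 bits: treat the 128-bit value as four base-2^32 digits and do schoolbook division, one 64-by-32 divide per digit:

    #include <stdint.h>

    /* Sketch: (hi, lo) / d for a divisor d that fits in 32 bits, using only
       64-bit divides.  Processes the number as four base-2^32 digits, most
       significant first, carrying the remainder between digits. */
    static void div128_by_u32(uint64_t hi, uint64_t lo, uint32_t d,
                              uint64_t *q_hi, uint64_t *q_lo, uint32_t *rem) {
        uint32_t digit[4] = {
            (uint32_t)(hi >> 32), (uint32_t)hi,
            (uint32_t)(lo >> 32), (uint32_t)lo
        };
        uint32_t q[4];
        uint64_t r = 0;
        for (int i = 0; i < 4; i++) {
            uint64_t cur = (r << 32) | digit[i];   /* remainder so far, then next digit */
            q[i] = (uint32_t)(cur / d);            /* 64-by-32 divide, quotient < 2^32  */
            r = cur % d;
        }
        *q_hi = ((uint64_t)q[0] << 32) | q[1];
        *q_lo = ((uint64_t)q[2] << 32) | q[3];
        *rem = (uint32_t)r;
    }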
By the way, there are all sorts of tricks you can use if you know what typical values will be for the dividend and divisor. What is the source of these numbers? If a lot of your cases can be solved quickly, it might be OK the occasional case takes a long time.
Also, if you can find cases where an approximate answer is OK, that opens the door to a lot of speedy approximations.