How is a CRC32 checksum calculated? - c

Maybe I'm just not seeing it, but CRC32 seems either needlessly complicated, or insufficiently explained anywhere I could find on the web.
I understand that it is the remainder from a non-carry-based arithmetic division of the message value, divided by the (generator) polynomial, but the actual implementation of it escapes me.
I've read A Painless Guide To CRC Error Detection Algorithms, and I must say it was not painless. It goes over the theory rather well, but the author never gets to a simple "this is it." He does say what the parameters are for the standard CRC32 algorithm, but he neglects to lay out clearly how you get to it.
The part that gets me is when he says "this is it" and then adds on, "oh by the way, it can be reversed or started with different initial conditions," and doesn't give a clear answer of what the final way of calculating a CRC32 checksum given all of the changes he just added.
Is there a simpler explanation of how CRC32 is calculated?
I attempted to code in C how the table is formed:
for (i = 0; i < 256; i++)
{
    temp = i;
    for (j = 0; j < 8; j++)
    {
        if (temp & 1)
        {
            temp >>= 1;
            temp ^= 0xEDB88320;
        }
        else {temp >>= 1;}
    }
    testcrc[i] = temp;
}
but this seems to generate values inconsistent with values I have found elsewhere on the Internet. I could use the values I found online, but I want to understand how they were created.
Any help in clearing up these incredibly confusing numbers would be very appreciated.

The polynomial for CRC32 is:
x^32 + x^26 + x^23 + x^22 + x^16 + x^12 + x^11 + x^10 + x^8 + x^7 + x^5 + x^4 + x^2 + x + 1
Wikipedia
CRC calculation
Or in hex and binary:
0x 01 04 C1 1D B7
1 0000 0100 1100 0001 0001 1101 1011 0111
The highest term (x^32) is usually not explicitly written, so it can instead be represented in hex just as
0x 04 C1 1D B7
Feel free to count the 1s and 0s, but you'll find they match up with the polynomial, where 1 is bit 0 (or the first bit) and x is bit 1 (or the second bit).
Why this polynomial? Because there needs to be a standard given polynomial and the standard was set by IEEE 802.3. Also it is extremely difficult to find a polynomial that detects different bit errors effectively.
You can think of the CRC-32 as a series of "Binary Arithmetic with No Carries", or basically "XOR and shift operations". This is technically called Polynomial Arithmetic.
CRC primer, Chapter 5
To better understand it, think of this multiplication:
(x^3 + x^2 + x^0)(x^3 + x^1 + x^0)
= (x^6 + x^4 + x^3
+ x^5 + x^3 + x^2
+ x^3 + x^1 + x^0)
= x^6 + x^5 + x^4 + 3*x^3 + x^2 + x^1 + x^0
If we assume x is base 2 then we get:
x^7 + x^3 + x^2 + x^1 + x^0
CRC primer Chp.5
Why? Because 3x^3 is 11x^11 in binary (but we need only 1 or 0 per digit), so we carry over:
=1x^110 + 1x^101 + 1x^100 + 11x^11 + 1x^10 + 1x^1 + x^0
=1x^110 + 1x^101 + 1x^100 + 1x^100 + 1x^11 + 1x^10 + 1x^1 + x^0
=1x^110 + 1x^101 + 1x^101 + 1x^11 + 1x^10 + 1x^1 + x^0
=1x^110 + 1x^110 + 1x^11 + 1x^10 + 1x^1 + x^0
=1x^111 + 1x^11 + 1x^10 + 1x^1 + x^0
But mathematicians changed the rules so that it is mod 2. So basically any binary polynomial mod 2 is just addition without carry or XORs. So our original equation looks like:
=( 1x^110 + 1x^101 + 1x^100 + 11x^11 + 1x^10 + 1x^1 + x^0 ) MOD 2
=( 1x^110 + 1x^101 + 1x^100 + 1x^11 + 1x^10 + 1x^1 + x^0 )
= x^6 + x^5 + x^4 + x^3 + x^2 + x^1 + x^0 (the original product, with the coefficient 3 reduced mod 2 to 1)
I know this is a leap of faith, but this is beyond my capability as a line programmer. If you are a hard-core CS student or engineer, I challenge you to break this down. Everyone will benefit from this analysis.
So to work out a full example:
Original message : 1101011011
Polynomial of (W)idth 4 : 10011
Message after appending W zeros : 11010110110000
Now we divide the augmented Message by the Poly using CRC arithmetic. This is the same division as before:
            1100001010 = Quotient (nobody cares about the quotient)
       _______________
10011 ) 11010110110000 = Augmented message (1101011011 + 0000)
=Poly   10011,,.,,....
        -----,,.,,....
         10011,.,,....
         10011,.,,....
         -----,.,,....
          00001.,,....
          00000.,,....
          -----.,,....
           00010,,....
           00000,,....
           -----,,....
            00101,....
            00000,....
            -----,....
             01011....
             00000....
             -----....
              10110...
              10011...
              -----...
               01010..
               00000..
               -----..
                10100.
                10011.
                -----.
                 01110
                 00000
                 -----
                  1110 = Remainder = THE CHECKSUM!!!!
The division yields a quotient, which we throw away, and a remainder, which is the calculated checksum. This ends the calculation. Usually, the checksum is then appended to the message and the result transmitted. In this case the transmission would be: 11010110111110.
CRC primer, Chapter 7
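The long division above can be sketched in a few lines of C. This is only an illustration under stated assumptions: the function name crc_remainder and its parameters are hypothetical, the message is assumed to fit in a 64-bit integer together with the appended zeros, and poly is passed with its top bit included (10011 = 0x13 for width 4).

```c
#include <stdint.h>

/* Remainder of (message followed by `width` zero bits) divided by `poly`,
 * using XOR ("no carry") arithmetic.
 * msg_bits is the number of significant bits in message.
 * Assumes msg_bits + width <= 64. */
uint64_t crc_remainder(uint64_t message, int msg_bits, uint64_t poly, int width)
{
    uint64_t reg = message << width;            /* append `width` zero bits */

    /* Walk from the leading bit down to the last position where the
     * divisor still fits under the dividend. */
    for (int bit = msg_bits + width - 1; bit >= width; bit--) {
        if (reg & ((uint64_t)1 << bit))         /* leading bit is 1?        */
            reg ^= poly << (bit - width);       /* XOR the aligned divisor  */
    }
    return reg;                                 /* only the low `width` bits are left */
}
```

Calling crc_remainder(0x35B, 10, 0x13, 4), where 0x35B is 1101011011, should reproduce the remainder 1110 from the worked example.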
CRC32 works the same way, only with a 33-bit number (the 32-bit polynomial value plus the implicit leading x^32 bit) as your divisor and your entire stream as your dividend. Throw out the quotient and keep the remainder. Tack the remainder on the end of your message and you have a CRC32.
Average guy review:
QUOTIENT
----------
DIVISOR ) DIVIDEND
= REMAINDER
1. Take the first 32 bits.
2. Shift bits
3. If 32 bits are less than DIVISOR, go to step 2.
4. XOR 32 bits by DIVISOR. Go to step 2.
(Note that the stream has to be divisible by 32 bits or it should be padded. For example, an 8-bit ANSI stream would have to be padded. Also at the end of the stream, the division is halted.)

For IEEE 802.3 CRC-32, think of the entire message as a serial bit stream and append 32 zeros to the end of the message. Next, you MUST reverse the bits of EVERY byte of the message and take the 1's complement of the first 32 bits. Now divide by the CRC-32 polynomial, 0x104C11DB7. Finally, you must take the 1's complement of the 32-bit remainder of this division and bit-reverse each of the 4 bytes of the remainder. This becomes the 32-bit CRC that is appended to the end of the message.
The reason for this strange procedure is that the first Ethernet implementations would serialize the message one byte at a time and transmit the least significant bit of every byte first. The serial bit stream then went through a serial CRC-32 shift register computation, which was simply complemented and sent out on the wire after the message was completed. The reason for complementing the first 32 bits of the message is so that you don't get an all zero CRC even if the message was all zeros.
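A literal transcription of that procedure into C might look like the sketch below. The names reverse8, reverse32 and crc32_by_division are made up for illustration. One assumption to flag: the final step reverses the remainder as a single 32-bit word, which yields the conventional numeric CRC value rather than the byte order seen on the wire.

```c
#include <stddef.h>
#include <stdint.h>

/* Reverse the 8 bits of one byte. */
static uint8_t reverse8(uint8_t b)
{
    uint8_t r = 0;
    for (int i = 0; i < 8; i++) {
        r = (uint8_t)((r << 1) | (b & 1));
        b >>= 1;
    }
    return r;
}

/* Reverse the 32 bits of a word. */
static uint32_t reverse32(uint32_t v)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; i++) {
        r = (r << 1) | (v & 1);
        v >>= 1;
    }
    return r;
}

uint32_t crc32_by_division(const uint8_t *msg, size_t len)
{
    uint32_t reg = 0;                        /* running 32-bit remainder */

    /* The message is followed by 4 zero bytes (the 32 appended zeros). */
    for (size_t i = 0; i < len + 4; i++) {
        uint8_t byte = (i < len) ? reverse8(msg[i]) : 0;
        if (i < 4)
            byte ^= 0xFF;                    /* complement the first 32 bits */

        for (int bit = 7; bit >= 0; bit--) { /* feed the bits in, MSB first  */
            uint32_t top = reg >> 31;        /* bit about to be shifted out  */
            reg = (reg << 1) | ((byte >> bit) & 1u);
            if (top)
                reg ^= 0x04C11DB7u;          /* polynomial without the x^32 bit */
        }
    }
    return reverse32(~reg);                  /* complement, then bit-reverse */
}
```

For the three bytes 'a', 'b', 'c' this should return 0x352441C2, and for the ASCII string "123456789" the commonly quoted check value 0xCBF43926.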

I published a tutorial on CRC-32 hashes, here:
CRC-32 hash tutorial - AutoHotkey Community
In this example from it, I demonstrate how to calculate the CRC-32 hash for the 'ANSI' (1 byte per character) string 'abc':
calculate the CRC-32 hash for the 'ANSI' string 'abc':
inputs:
dividend: binary for 'abc': 0b011000010110001001100011 = 0x616263
polynomial: 0b100000100110000010001110110110111 = 0x104C11DB7
start with the 3 bytes 'abc':
61 62 63 (as hex)
01100001 01100010 01100011 (as bin)
reverse the bits in each byte:
10000110 01000110 11000110
append 32 0 bits:
10000110010001101100011000000000000000000000000000000000
XOR (exclusive or) the first 4 bytes with 0xFFFFFFFF:
(i.e. flip the first 32 bits:)
01111001101110010011100111111111000000000000000000000000
next we will perform 'CRC division':
a simple description of 'CRC division':
we put a 33-bit box around the start of a binary number,
start of process:
if the first bit is 1, we XOR the number with the polynomial,
if the first bit is 0, we do nothing,
we then move the 33-bit box right by 1 bit,
if we have reached the end of the number,
then the 33-bit box contains the 'remainder',
otherwise we go back to 'start of process'
note: every time we perform a XOR, the number begins with a 1 bit,
and the polynomial always begins with a 1 bit,
1 XORed with 1 gives 0, so the resulting number will always begin with a 0 bit
'CRC division':
'divide' by the polynomial 0x104C11DB7:
01111001101110010011100111111111000000000000000000000000
 100000100110000010001110110110111
 ---------------------------------
  111000100010010111111010010010110
  100000100110000010001110110110111
  ---------------------------------
   110000001000101011101001001000010
   100000100110000010001110110110111
   ---------------------------------
    100001011101010011001111111101010
    100000100110000010001110110110111
    ---------------------------------
         111101101000100000100101110100000
         100000100110000010001110110110111
         ---------------------------------
          111010011101000101010110000101110
          100000100110000010001110110110111
          ---------------------------------
           110101110110001110110001100110010
           100000100110000010001110110110111
           ---------------------------------
            101010100000011001111110100001010
            100000100110000010001110110110111
            ---------------------------------
              101000011001101111000001011110100
              100000100110000010001110110110111
              ---------------------------------
                100011111110110100111110100001100
                100000100110000010001110110110111
                ---------------------------------
                    110110001101101100000101110110000
                    100000100110000010001110110110111
                    ---------------------------------
                     101101010111011100010110000001110
                     100000100110000010001110110110111
                     ---------------------------------
                       110111000101111001100011011100100
                       100000100110000010001110110110111
                       ---------------------------------
                        10111100011111011101101101010011
we obtain the 32-bit remainder:
0b10111100011111011101101101010011 = 0xBC7DDB53
note: the remainder is a 32-bit number, it may start with a 1 bit or a 0 bit
XOR the remainder with 0xFFFFFFFF:
(i.e. flip the 32 bits:)
0b01000011100000100010010010101100 = 0x438224AC
reverse bits:
bit-reverse the 4 bytes (32 bits), treating them as one entity:
(e.g. 'abcdefgh ijklmnop qrstuvwx yzABCDEF'
to 'FEDCBAzy xwvutsrq ponmlkji hgfedcba':)
0b00110101001001000100000111000010 = 0x352441C2
thus the CRC-32 hash for the 'ANSI' string 'abc' is: 0x352441C2
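As a cross-check on the hand calculation, the same value can be obtained from the usual table-driven form. The sketch below is not taken from the tutorial: the function names are hypothetical, and the table loop is the one from the question with temp assumed to be an unsigned 32-bit integer. It uses the bit-reversed polynomial 0xEDB88320, so no explicit bit reversal of the data or the result is needed.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t crc_table[256];

/* Build the 256-entry table: entry i is the remainder of byte i after
 * eight shift/XOR steps with the reflected polynomial. */
void crc32_init_table(void)
{
    for (uint32_t i = 0; i < 256; i++) {
        uint32_t temp = i;
        for (int j = 0; j < 8; j++)
            temp = (temp & 1) ? (temp >> 1) ^ 0xEDB88320u : (temp >> 1);
        crc_table[i] = temp;
    }
}

/* Process one byte per step; crc32_init_table() must be called first. */
uint32_t crc32_table_driven(const uint8_t *msg, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;              /* initial value: all ones  */
    for (size_t i = 0; i < len; i++)
        crc = crc_table[(crc ^ msg[i]) & 0xFF] ^ (crc >> 8);
    return crc ^ 0xFFFFFFFFu;                /* final complement         */
}
```

After crc32_init_table(), crc_table[1] should be 0x77073096 (the value found in published tables), and crc32_table_driven applied to the bytes of 'abc' should give 0x352441C2 again.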

A CRC is pretty simple; you take a polynomial represented as bits and the data, and divide the polynomial into the data (or you represent the data as a polynomial and do the same thing). The remainder, which is between 0 and the polynomial, is the CRC. Your code is a bit hard to understand, partly because it's incomplete: temp and testcrc are not declared, so it's unclear what's being indexed, and how much data is running through the algorithm.
The way to understand CRCs is to try to compute a few using a short piece of data (16 bits or so) with a short polynomial -- 4 bits, perhaps. If you practice this way, you'll really understand how you might go about coding it.
If you're doing it frequently, a CRC is quite slow to compute in software. Hardware computation is much more efficient, and requires just a few gates.

In addition to the Wikipedia Cyclic redundancy check and Computation of CRC articles, I found a paper entitled Reversing CRC - Theory and Practice* to be a good reference.
There are essentially three approaches for computing a CRC: an algebraic approach, a bit-oriented approach, and a table-driven approach. In Reversing CRC - Theory and Practice*, each of these three algorithms/approaches is explained in theory accompanied in the APPENDIX by an implementation for the CRC32 in the C programming language.
* PDF Link
Reversing CRC – Theory and Practice.
HU Berlin Public Report
SAR-PR-2006-05
May 2006
Authors:
Martin Stigge, Henryk Plötz, Wolf Müller, Jens-Peter Redlich

Then there is always Rosetta Code, which shows CRC-32 implemented in dozens of computer languages and has links to many explanations and implementations: https://rosettacode.org/wiki/CRC-32

In order to reduce crc32 to taking the remainder you need to:
1. Reverse the bits of each byte
2. XOR the first four bytes with 0xFF (this is to avoid errors on the leading 0s)
3. Add padding at the end (this is to make the last 4 bytes take part in the hash)
4. Compute the remainder
5. Reverse the bits again
6. XOR the result again.
In code this is:
// uses the standard "math/bits" package; note that file is modified in place
func CRC32 (file []byte) uint32 {
    for i , v := range(file) {
        file[i] = bits.Reverse8(v)
    }
    for i := 0; i < 4; i++ {
        file[i] ^= 0xFF
    }
    // Add padding
    file = append(file, []byte{0, 0, 0, 0}...)
    newReminder := bits.Reverse32(reminderIEEE(file))
    return newReminder ^ 0xFFFFFFFF
}
where reminderIEEE is the pure remainder on GF(2)[x]

Related

c bitwise operation to match description

I'm supposed to match the worded descriptions to the bitwise operations. W is one less than the total bits in a's and b's data structure. So if a is 32 bits long, W is 31. Here are the worded descriptions:
1. One’s complement of a
2. a.
3. a&b.
4. a * 7.
5. a / 4 .
6. (a<0)?1:-1.
and here are the bitwise descriptions:
a. ~(~a | (b ^ (MIN_INT + MAX_INT)))
b. ((a^b)&~b)|(~(a^b)&b)
c. 1+(a<<3)+~a
d. (a<<4)+(a<<2)+(a<<1)
e. ((a<0)?(a+3):a)>>2
f. a ^ (MIN_INT + MAX_INT)
g. ~((a|(~a+1))>>W)&1
h. ~((a >> W) << 1)
i. a >> 2
I have a few of them solved namely:
a. ~(~a | (b ^ (MIN_INT + MAX_INT))) = a & b
b. ((a^b)&~b)|(~(a^b)&b) = a
c. 1+(a<<3)+~a = 7 * a
d. (a<<4)+(a<<2)+(a<<1) = 16*a + 4*a + 2*a = 22*a
e. ((a<0)?(a+3):a)>>2 = (a<0)?(a/4 + 3/4) : a/4 = a/4 + ((a<0)?(3/4):0)
f. a ^ (MIN_INT + MAX_INT) = ~a
i. a >> 2 = a/4
So basically all I need help with are g and h
g. ~((a|(~a+1))>>W)&1
h. ~((a >> W) << 1)
If you wouldn't mind could you also provide an explanation if you could?
I think this is what is going on with g:
g. ~((a|(~a+1))>>W)&1 = ~((a|(two's complement of a)) >>W)&1
= ~((a|sign of two's complement of a) &1 = ~(-a)&1
but this could be 1 or 0 so I don't think I did this right.
and for this one:
h. ~((a >> W) << 1) = ~((sign of a) << 1) = ~((sign of a)*2)
and I don't know where to go from there...
Thank you for your help!!!
For g, consider that (a|~a) sets all bits to 1, so:
~((a|~a) >> W) & 1
~(all_ones >> W) & 1
~1 & 1
0
The only way adding 1 to ~a could possibly affect this result is if the addition flipped the most significant bit of ~a (due to the right shift by W). That can only happen if a is 0 or 2^W. In the latter case, we will get the same result as above because the top bit of (a|X) will always be set. However, when a is 0 ~a+1 (0's twos complement) is also 0 and the final result of the entire expression will instead be 1.
Therefore, g is 1 when a is zero, otherwise it is 0 (i.e. - g is equivalent to the C expression a == 0). That seemingly doesn't match any of your worded descriptions. Indeed, I don't see how any expression (X & 1) possibly matches any of your worded descriptions. None of your worded descriptions matches an expression that evaluates to only 0 or 1 (for all values of a, b).
For h, consider that if a is negative, then its top most bit is set. Because a is signed, right shifting it 31 positions drags the sign bit across all 32 bits of a. Then left shifting it one position sets the least significant bit to 0. Complementing that yields 1. If a is non-negative, then its top most bit is 0 and right shifting that 31 positions yields 0. Left shifting that 1 position still yields 0. Complementing that yields all bits set, which is the 2's complement rep of -1. Therefore, h is equivalent to (a < 0 ? 1 : -1) or #6 of your worded descriptions.
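The two conclusions can be checked with a short C sketch. It rests on some assumptions: 32-bit operands (so W is 31), hypothetical function names expr_g and expr_h, and unsigned arithmetic in place of signed shifts, because shifting negative signed values is not fully defined by the C standard. The arithmetic right shift in h is emulated by filling with the sign bit.

```c
#include <stdint.h>

#define W 31   /* one less than the operand width, as in the question */

/* g: ~((a | (~a + 1)) >> W) & 1, evaluated on the unsigned bit pattern.
 * Only bit 0 survives the final "& 1", so a logical shift gives the same
 * answer as an arithmetic one here. */
uint32_t expr_g(int32_t a)
{
    uint32_t u = (uint32_t)a;
    return ~((u | (~u + 1u)) >> W) & 1u;
}

/* h: ~((a >> W) << 1), returned as a 32-bit pattern.
 * 0xFFFFFFFF is the two's complement representation of -1. */
uint32_t expr_h(int32_t a)
{
    uint32_t u = (uint32_t)a;
    uint32_t sign_fill = 0u - (u >> W);   /* all ones if a < 0, else zero */
    return ~(sign_fill << 1);
}
```

With these definitions, expr_g(a) should be 1 only for a == 0, and expr_h(a) should be 1 for negative a and 0xFFFFFFFF (that is, -1) otherwise.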

Bit hacking and modulo operation

While reading this: http://graphics.stanford.edu/~seander/bithacks.html#ReverseByteWith64BitsDiv
I came to the phrase:
The last step, which involves modulus division by 2^10 - 1, has the
effect of merging together each set of 10 bits (from positions 0-9,
10-19, 20-29, ...) in the 64-bit value.
(it is about reversing the bits in a number)...
so I did some calculations:
reverted = (input * 0x0202020202ULL & 0x010884422010ULL) % 1023;
b = 74          :                                  01001010
b
* 0x0202020202  :        1000000010000000100000001000000010
= 0x9494949494  : 01001010010010100100101001001010010010100
& 0x10884422010 : 10000100010000100010000100010000000010000
= 0x84000010    :          10000100000000000000000000010000
% 1023          :                                1111111111
= 82            :                                  01010010
Now, the only part which is somewhat unclear is the part where the big number modulo by 1023 (2^10 - 1) packs and gives me the inverted bits... I did not find any good doc about relationship between bit operations and the modulo operation (beside x % 2^n == x & (2^n - 1))) so maybe if someone would cast a light on this it would be very fruitful.
The modulo operation does not give you the inverted bits per se, it is just a binning operation.
First Line : word expansion
b * 0x0202020202 = 01001010 01001010 01001010 01001010 01001010 0
The multiplication operation has a convolution property, which means it replicates the input variable several times (5 here, since it's an 8-bit word).
Second Line : reversing bits
That's the trickiest part of the hack. You have to remember that we are working on an 8-bit word : b = abcdefgh, where [a-h] are either 1 or 0.
b * 0x0202020202 = abcdefghabcdefghabcdefghabcdefghabcdefgh0
& 0x10884422010  = a0000f000b0000g000c0000h000d00000000e0000
Last Line : word binning
Modulo has a peculiar property : 10 ≡ 1 (mod 9) so 100 ≡ 10*10 ≡ 10*1 (mod 9) ≡ 1 (mod 9).
More generally, for a base b, b ≡ 1 (mod b - 1) so for all number a ≡ sum(a_k*b^k) ≡ sum (a_k) (mod b - 1).
In the example, base = 1024 (10 bits) so
b ≡ a0000f000b0000g000c0000h000d00000000e0000
≡ a*base^4 + 0000f000b0*base^3 + 000g000c00*base^2 + 00h000d000*base +00000e0000
≡ a + 0000f000b0 + 000g000c00 + 00h000d000 + 00000e0000 (mod b - 1)
≡ 000000000a
+ 0000f000b0
+ 000g000c00
+ 00h000d000
+ 00000e0000 (mod b - 1)
≡ 00hgfedcba (mod b - 1) since there is no carry (no overlap)
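The three steps above (expand, select, bin) fit in one small C function. The constants are the ones quoted in the question; the function name reverse_byte is just a label chosen for this sketch.

```c
#include <stdint.h>

/* Reverse the bits of one byte with a multiply, a mask and a modulo. */
uint8_t reverse_byte(uint8_t b)
{
    uint64_t spread = (uint64_t)b * 0x0202020202ULL;  /* five shifted copies of b       */
    uint64_t picked = spread & 0x010884422010ULL;     /* keep one chosen bit per copy   */
    return (uint8_t)(picked % 1023);                  /* add up the 10-bit groups       */
}
```

For the input 74 (01001010) this should give 82 (01010010), matching the calculation in the question.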

Fastest computation of sum x^5 + x^4 + x^3...+x^0 (Bitwise possible ?) with x=16

For a tree layout that takes benefit of cache line prefetching (reading _next_ cacheline is cheap), I need to solve the address calculation in a really fast way. I was able to boil down the problem to:
newIndex = nowIndex + 1 + (localChildIndex*X)
X would be, for example: X = 4^5 + 4^4 + 4^3 + 4^2 + 4^0.
Note: 4 is the branching factor. In reality it will be 16, so a power of 2. This should be useful to use bitwise stuff?
It would be very bad if it would need a loop to calculate X (performancewise) and stuff like division/multiplication. This appeals to be an interesting problem which I wasn’t able to come up with some nice way of computing it.
Since its part of a tree traversal, 2 modes would be possible: absolute calculation, independent of prior calculations AND incremental calculation which starts with a high X being kept in a variable and then some minimal stuff done to it with every deeper level of the tree.
I hope I was able to make clear what the math should do. Not sure if there is a way to do this fast & without loop - but maybe somebody can come up with a really smart solution. I would like to thank everybody for their help - StackOverflow have been a great teacher to me in the past and I hope to be able to give back more in the future, as my knowledge increases.
I'll answer this in increasing complexity and generality.
If x is fixed to 16 then just use a constant value 1118481. Hooray! (Name it, using magical numbers is bad practice)
If you have a few cases known at compile time use a few constants or even defines, for example:
#define X_2 63
#define X_4 1365
#define X_8 37449
#define X_16 1118481
...
If you have several cases known at execution time initialize and use a lookup table indexed with the exponent.
int _X[MAX_EXPONENT]; // note: give it a more meaningful name :)
Initialize it and then just access with the known exponent of 2^exp at execution time.
newIndex = nowIndex + 1 + (localChildIndex*_X[exp]);
How are these values precalculated, or how to calculate them efficiently on the fly:
The sum X = x^n + x^(n - 1) + ... + x^1 + x^0 is a geometric series and its finite sum is:
X = x^n + x^(n - 1) + ... + x^1 + x^0 = (1-x^(n + 1))/(1-x)
About the bitwise operations, as Oli Charlesworth has stated if x is a power of 2 (in binary 0..010..0) x^n is also a power of 2, and the sum of different powers of two is equivalent to the OR operation. Thus we could make an expression like:
Let exp be the exponent so that x = 2^exp. (For 16, exp = 4). Then,
X = x^5 + ... + x^1 + x^0
X = (2^exp)^5 + ... + (2^exp)^1 + 1
X = 2^(exp*5) + ... + 2^(exp*1) + 1
now using bitwise, 2^n = 1<<n
X = 1<<(exp*5) | ... | 1<<exp | 1
In C:
int X;
int exp = 4; //for x == 16
X = 1 << (exp*5) | 1 << (exp*4) | 1 << (exp*3) | 1 << (exp*2) | 1 << (exp*1) | 1;
And finally, I can't resist saying: if your expression were more complex and you had to evaluate an arbitrary polynomial a_n*x^n + ... + a_1*x^1 + a_0 in x, then instead of implementing the obvious loop, a faster way to evaluate the polynomial is to use Horner's rule.
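For reference, Horner's rule can be sketched in C as follows. The function name horner and the convention that coeffs[0] holds the highest-order coefficient are choices made for this example only.

```c
#include <stddef.h>
#include <stdint.h>

/* Evaluate a_n*x^n + ... + a_1*x + a_0 with one multiply and one add per
 * coefficient. coeffs[0] is a_n, coeffs[count - 1] is a_0. */
uint64_t horner(const uint64_t *coeffs, size_t count, uint64_t x)
{
    uint64_t acc = 0;
    for (size_t i = 0; i < count; i++)
        acc = acc * x + coeffs[i];
    return acc;
}
```

With six coefficients all equal to 1 and x = 16, this should return 1118481, the same constant given earlier for x^5 + ... + x^0. When x is a power of two, the multiplication by x can be replaced by a left shift.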

Can anyone explain why '>>2' shift means 'divided by 4' in C codes?

I know and understand the result.
For example:
7 (decimal) = 00000111 (binary)
and 7 >> 2 = 00000001 (binary)
00000001 (binary) is same as 7 / 4 = 1
So 7 >> 2 = 7 / 4
But I'd like to know how this logic was created.
Can anyone elaborate on this logic?
(Maybe it just popped up in a genius' head?)
And are there any other similar logics like this ?
It didn't "pop-up" in a genius' head. Right shifting binary numbers would divide a number by 2 and left shifting the numbers would multiply it by 2. This is because 10 is 2 in binary. Multiplying a number by 10(be it binary or decimal or hexadecimal) appends a 0 to the number(which is effectively left shifting). Similarly, dividing by 10(or 2) removes a binary digit from the number(effectively right shifting). This is how the logic really works.
There are plenty of such bit-twiddlery(a word I invented a minute ago) in computer world.
http://graphics.stanford.edu/~seander/bithacks.html Here is for the starters.
This is my favorite book: http://www.amazon.com/Hackers-Delight-Edition-Henry-Warren/dp/0321842685/ref=dp_ob_image_bk on bit-twiddlery.
It is actually defined that way in the C standard.
From section 6.5.7:
The result of E1 >> E2 is E1 right-shifted E2 bit positions. [...]
the value of the result is the integral part of the quotient of E1 / 2^E2
On most architectures, x >> 2 is only equal to x / 4 for non-negative numbers. For negative numbers, it usually rounds the opposite direction.
Compilers have always been able to optimize x / 4 into x >> 2. This technique is called "strength reduction", and even the oldest compilers can do this. So there is no benefit to writing x / 4 as x >> 2.
Elaborating on Aniket Inge's answer:
Number: 307_10 = 100110011_2
How multiplying by 10 works in the decimal system:
10 * (307_10)
= 10 * (3*10^2 + 7*10^0)
= 3*10^(2+1) + 7*10^(0+1)
= 3*10^3 + 7*10^1
= 3070_10
= 307_10 << 1
Similarly, multiplying by 2 in binary:
2 * (100110011_2)
= 2 * (1*2^8 + 1*2^5 + 1*2^4 + 1*2^1 + 1*2^0)
= 1*2^(8+1) + 1*2^(5+1) + 1*2^(4+1) + 1*2^(1+1) + 1*2^(0+1)
= 1*2^9 + 1*2^6 + 1*2^5 + 1*2^2 + 1*2^1
= 1001100110_2
= 100110011_2 << 1
I think you are confused by the "2" in:
7 >> 2
and are thinking it should divide by 2.
The "2" here means shift the number ("7" in this case) "2" bit positions to the right.
Shifting a number "1"bit position to the right will have the effect of dividing by 2:
8 >> 1 = 4 // In binary: (00001000) >> 1 = (00000100)
and shifting a number "2"bit positions to the right will have the effect of dividing by 4:
8 >> 2 = 2 // In binary: (00001000) >> 2 = (00000010)
It's inherent in the binary number system used in computers.
a similar logic is --- left shifting 'n' times means multiplying by 2^n.
An easy way to see why it works, is to look at the familiar decimal ten-based number system, 050 is fifty, shift it to the right, it becomes 005, five, equivalent to dividing it by 10. The same thing with shifting left, 050 becomes 500, five hundred, equivalent to multiplying it by 10.
All the other numeral systems work the same way.
They do that because shifting is more efficient than actual division. You're just moving all the digits to the right or left, logically multiplying/dividing by 2 per shift.
If you're wondering why 7/4 = 1, that's because the rest of the result (3/4) is truncated off so that it's an integer.
Just my two cents: I did not see any mention of the fact that shifting right does not always produce the same results as dividing by 2. Since right shifting rounds toward negative infinity and integer division rounds to zero, some values (like -1 in two's complement) will just not work as expected when divided.
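A small C sketch of that rounding difference follows. One caveat: in C, right-shifting a negative signed value is implementation-defined, so the sketch does not rely on it. Instead it shows a hypothetical helper, floor_div4, that rounds toward negative infinity the way an arithmetic shift typically does, next to the built-in division, which truncates toward zero (C99 and later).

```c
#include <stdint.h>

/* Divide by 4, rounding toward negative infinity, without shifting a
 * negative value. */
int32_t floor_div4(int32_t x)
{
    int32_t q = x / 4;      /* truncates toward zero                    */
    if (x % 4 < 0)          /* negative input with a nonzero remainder  */
        q -= 1;             /* step down to the floor                   */
    return q;
}
```

For non-negative inputs both agree (7 / 4 and floor_div4(7) are both 1), while for -7 the division gives -1 and floor_div4 gives -2, which is what -7 >> 2 usually produces on two's complement machines with an arithmetic shift.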
It's because >> and << operators are shifting the binary data.
Binary value 1000 is the double of binary value 0100
Binary value 0010 is the quarter of binary value 1000
You can call it an idea of a genius mind or just the need of the computer language.
To my belief, a Computer as a device never divides or multiplies numbers, rather it only has a logic of adding or simply shifting the bits from here to there. You can make an algorithm work by telling your computer to multiply, subtract them up, but when the logic reaches for actual processing, your results will be either an outcome of shifting of bits or just adding of bits.
You can simply think that for getting the result of a number being divided by 4, the computer actually right shifts the bits to two places, and gives the result:
7 in 8-bit binary = 00000111
Shift Right 2 places = 00000001 // (Which is for sure equal to Decimal 1)
Further examples:
//-- We can divide 9 by four by Right Shifting 2 places
9 in 8-bit binary = 00001001
Shift right 2 places: 00000010 // (Which is equal to 9/4 or Decimal 2)
A person with deep knowledge of assembly language programming can explain it with more examples. If you want to know the actual sense behind all this, I guess you need to study bit level arithmetic and assembly language of computer.

mathematical equation for AND bitwise operation?

In a shift left operation for example,
5 << 1 = 10
10 << 1 = 20
then a mathematical equation can be made,
n << 1 = n * 2.
If there is an equation for a shift left operation,
then is it possible that there is also a
mathematical equation for
an AND operation?
or any other bitwise operators?
There is no straightforward single operation that maps to every bitwise operation. However, they can all be simulated through iterative means (or one really long formula).
(a & b)
can be done with:
(((a/1 % 2) * (b/1 % 2)) * 1) +
(((a/2 % 2) * (b/2 % 2)) * 2) +
(((a/4 % 2) * (b/4 % 2)) * 4) +
...
(((a/n % 2) * (b/n % 2)) * n)
Where n is 2 to the number of bits that A and B are composed minus one. This assumes integer division (remainder is discarded).
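Written as a loop, the long formula above might look like this in C. It is only a sketch for 32-bit unsigned operands, and the name and_by_arithmetic is invented here; it uses nothing but integer division, modulo, multiplication and addition.

```c
#include <stdint.h>

/* Bitwise AND of two 32-bit values using only /, %, * and +. */
uint32_t and_by_arithmetic(uint32_t a, uint32_t b)
{
    uint32_t result = 0;
    uint32_t weight = 1;                 /* 1, 2, 4, ..., 2^31 */

    for (int i = 0; i < 32; i++) {
        /* (a / weight) % 2 extracts bit i; the product of the two bits
         * is 1 only when both are 1. */
        result += ((a / weight) % 2) * ((b / weight) % 2) * weight;
        weight *= 2;                     /* wraps to 0 after the last pass, where it is no longer used */
    }
    return result;
}
```

For example, and_by_arithmetic(12, 10) should give 8, the same as 12 & 10.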
That depends on what you mean by "mathematical equation". There is no easy arithmetic one.
If you look at it from a formal number-theoretic standpoint you can describe bitwise "and" (and "or" and "xor") using only addition, multiplication and -- and this is a rather big "and" from the lay perspective -- first-order predicate logic. But that is most certainly not what you meant, not least because these tools are enough to describe anything a computer can do at all.
Except for specific circumstances, it is not possible to describe bitwise operations in other mathematical operations.
An AND operation with 2^n - 1 is the same as a modulus operation with 2^n. An AND operation with the inverse of 2^n - 1 can be seen as a division by 2^n, a truncation, and a multiplication by the same.
It depends on what you mean by “mathematical”. If you are looking for simple school algebra, then answer is no. But mathematics is not sacred — mathematicians define new operations and concepts all the time.
For example, you can represent 32-bit numbers as vectors of 32 booleans, and then define “AND” operation on them which does standard boolean “and” between their corresponding elements.
Yes,they are sums. Consider for a binary word of length n. It can be written as the following;
A=a0*2^0+a1*2^1+a2*2^2....an*2^n. Where an is an element of {0,1}
Therefore if an is a bit in A and bn is a bit in B, then;
AandB=a0*b0*2^0+a1*b1*2^1...an*bn*2^n
similarly
AxorB=(a0+b0)mod2*2^0+(a1+b1)mod2*2^1...+(an+bn)mod2*2^n
Consider now the identity;
Axor1=notA
We now have the three operators we need (Bitwise AND,Bitwise XOR and Bitwise NOT)
From these two we can make anything we want.
For example, bitwise OR
not[(notA)and(notB)]=not[not(AorB)]=AorB
Its not guaranteed to be pretty though.
In response to the comment regarding mod 2 arithmetic not being very basic, that's true in a sense. However, while it's common because of the prevalence of computers nowadays, the entire subject we are touching on here is not particularly "basic". The OP has grasped something fundamental. There are finite algebraic structures studied in the mathematical field known as "Abstract Algebra", such as addition and multiplication modulo n (where n is some number such as 2, 8 or 2^32). There are other structures using binary operations (addition is a binary operation: it takes two operands and produces a result, as do multiplication and xor) such as xor, and, bit shifts etc., that are "isomorphic" to addition and multiplication over the integers mod n. That means they act the same way: they are associative, distributive etc. (although they may or may not be commutative; think of matrix multiplication). It's hard to tell someone where to start looking for more information. I guess the best way would be to start with a book on formal mathematics (mathematical proofs). You need that to understand any advanced mathematics text. Then a text on abstract algebra. If you're a computer science major you will get a lot of this in your classes. If you're a mathematics major, you will study these things in depth all in good time. If you're a history major, I'm not knocking history, I'm a History Channel junkie, but you should switch majors because you're wasting your talents!
Here is a proof that for 2-bit bitwise operations you cannot describe & with
just + - and * (check this, just came up with it now, so, who knows):
The question is, can we find a polynomial
x & y == P(x, y)
where
P(x, y) = a0_0 + a1_0*x + a0_1*y + a2_0*x^2 + ...
Here's what it would have to look like:
     0  1  2  3
   ------------
0 |  0  0  0  0
1 |  0  1  0  1
2 |  0  0  2  2
3 |  0  1  2  3
First, clearly a0_0 == 0. Next you can see that if P is
rewritten:
                      |------- Q(x, y) --------|
P(x, y) = xy*R(x,y) + a1_0*x + a0_1*y + ...
And y is held 0, while x varies over 0, 1, 2, 3; then Q(x, y) must be 0 for
each of those values. Likewise if x is held 0 and y varied. So Q(x, y)
may be set to 0 without loss of generality.
But now, since P(2, 2) = 2, yet 2 * 2 == 0, the polynomial P cannot
exist.
And, I think this would generalize to more bits, too.
So the answer is, if you're looking for just +, * and -, no you can't do
it.
