Flipping bytes, doing arithmetic and flipping them back again - c

I have a programming/math related question regarding converting between big endian and little endian and doing arithmetic.
Assume we have two integers in little endian mode:
int a = 5;
int b = 6;
//a+b = 11
Let's flip the bytes and add them again:
int a = 1280;
int b = 1536;
//a+b = 2816
Now if we flip the byte order of 2816 we get 11. So essentially we can do arithmetic computation between little endian and big endian and once converted they represent the same number?
Does this have a theory/name behind it in the computer science world?

It doesn't work if the addition involves carrying since carrying propagates right-to-left. Swapping digits doesn't mean carrying switches direction, so any bytes that overflow into the next byte will be different.
Let's look at an example in hex, pretending that endianness means each 4-bit nibble is swapped:
int a = 0x68;
int b = 0x0B;
//a+b: 0x73
int a = 0x86;
int b = 0xB0;
//a+b: 0x136
0x8 + 0xB is 0x13. That 1 is carried and adds onto the 6 in the first sum. But in the second sum it's not carried right and added to the 6; it's carried left and overflows into the third hex digit.
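To see the break concretely, here is a minimal C sketch (swap16 is a hypothetical helper, and 16-bit values are assumed): swapping commutes with addition only while no byte-level carry occurs.

#include <stdio.h>
#include <stdint.h>

/* Hypothetical helper: swap the two bytes of a 16-bit value. */
static uint16_t swap16(uint16_t v) { return (uint16_t)((v << 8) | (v >> 8)); }

int main(void) {
    uint16_t a = 5, b = 6;
    /* No carry crosses a byte boundary: swapping commutes with addition. */
    printf("%u\n", (unsigned)swap16((uint16_t)(swap16(a) + swap16(b)))); /* 11 */

    uint16_t c = 0x00FF, d = 0x0001;
    /* 0xFF + 0x01 carries into the next byte: the trick breaks. */
    printf("0x%04X vs 0x%04X\n",
           (unsigned)swap16((uint16_t)(swap16(c) + swap16(d))),
           (unsigned)(uint16_t)(c + d)); /* 0x0000 vs 0x0100 */
    return 0;
}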

First, it should be noted that your assumption that int in C has 16 bits is wrong. On most modern systems int is a 32-bit type, so if we reverse (not flip, which typically means taking the complement) the bytes of 5 we get 83886080 (0x05000000), not 1280 (0x0500). See:
Is the size of C "int" 2 bytes or 4 bytes?
What does the C++ standard state the size of int, long type to be?
Also note that you should write in hex to make it easier to understand because computers don't work in decimal:
int16_t a = 0x0005;
int16_t b = 0x0006;
// a+b = 0x000B
int16_t a = 0x0500; // 1280
int16_t b = 0x0600; // 1536
//a+b = 0x0B00
OK, now, as others said, ntohl(htonl(5) + htonl(6)) happens to be the same as 5 + 6 just because you picked numbers small enough that their byte-reversed sums don't overflow into the next byte. Choose larger numbers and you'll see the difference right away.
However, that property does hold in ones' complement on systems where values are stored in two smaller parts, as in this case.
In ones' complement one does arithmetic with end-around carry, propagating the carry out back to the carry in. That makes ones' complement arithmetic endian-independent when there is only one internal "carry break" (i.e. the stored value is broken into two separate chunks), because of the circular carry.
Suppose we have xxyy and zztt; then xxyy + zztt is done like this:

          carry
     xx ◄────── yy
   + zz       + tt
   ───────────────
     aa         bb
     │          ↑
     └──────────┘
    end-around carry
When we reverse the chunks, yyxx + ttzz is carried the same way. Because xx, yy, zz, tt are chunks of bits of any length, it works for PDP's mixed endian, or when you store a 32-bit number in two 16-bit parts, a 64-bit number in two 32-bit parts...
For example:
0x7896 + 0x6987 = 0xE21D
0x9678 + 0x8769 = 0x11DE1 → 0x1DE1 + 1 = 0x1DE2
0x2345 + 0x9ABC = 0xBE01
0x4523 + 0xBC9A = 0x101BD → 0x01BD + 1 = 0x01BE
0xABCD + 0xBCDE = 0x168AB → 0x68AB + 1 = 0x68AC
0xCDAB + 0xDEBC = 0x1AC67 → 0xAC67 + 1 = 0xAC68
Or John Kugelman's example above: 0x68 + 0x0B = 0x73; 0x86 + 0xB0 = 0x136 → 0x36 + 1 = 0x37
The end-around carry is one of the reasons ones' complement was chosen for the TCP checksum: it lets you calculate the sum at higher precision easily. 16-bit CPUs can work in 16-bit units as normal, but 32- and 64-bit CPUs can add 32- and 64-bit chunks in parallel without worrying about the carry, even when SIMD isn't available (the SWAR technique).
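Here is a minimal sketch of that end-around-carry addition in C (oc_add16 is a hypothetical helper in the style of the RFC 1071 Internet checksum); it reproduces the 0xABCD + 0xBCDE example above and is immune to byte swapping:

#include <stdio.h>
#include <stdint.h>

/* Ones' complement 16-bit addition: fold the carry out back into the sum. */
static uint16_t oc_add16(uint16_t a, uint16_t b) {
    uint32_t sum = (uint32_t)a + b;
    return (uint16_t)((sum & 0xFFFF) + (sum >> 16)); /* end-around carry */
}

static uint16_t swap16(uint16_t v) { return (uint16_t)((v << 8) | (v >> 8)); }

int main(void) {
    uint16_t a = 0xABCD, b = 0xBCDE;
    printf("0x%04X\n", (unsigned)oc_add16(a, b));                         /* 0x68AC */
    printf("0x%04X\n", (unsigned)swap16(oc_add16(swap16(a), swap16(b)))); /* 0x68AC */
    return 0;
}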

This only appears to work because you happened to pick numbers that are small enough so that they as well as their sum fit into one byte. As long as everything going on in your number stays within its respective byte, you can obviously shuffle and deshuffle your bytes however you want, it won't make a difference. If you pick larger numbers, e.g., 1234 and 4321, you will notice that it won't work anymore. In fact, you will most likely end up invoking undefined behavior because your int will overflow…
Apart from all that, you will almost certainly want to read this: https://commandcenter.blogspot.com/2012/04/byte-order-fallacy.html


Read Specific values from given address?

Given the 16-bit address (memory address, not value) 0x1144, the 16 bits are divided as follows:
Type = Bit 15 - Bit 16
Module = Bit 9 - Bit 14
Group = Bit 1 - Bit 8
Read and print the values in the following variables.
uint16_t Type;
uint16_t Module;
uint16_t Group;
How do I read and print these values using C?
I tried with
uint16_t *ptr = 0x1144;
Type = *ptr >> 14;
Module = *ptr << 2;
Module = Module >> 10;
Group = *ptr << 8;
Group = Group >> 8;
Is this correct?
You can use bit fields in C, which are quite commonly used to address individual bit positions in HW registers. Just model your bit positions in a struct as below (this matches the layout on my little-endian machine):
typedef struct {
    uint16_t group  : 8;
    uint16_t module : 6;
    uint16_t type   : 2;
} nwGroup;
All you need to do is cast the address containing your 16 bit value to this struct type and you can access the fields individually after that.
uint16_t *val = (uint16_t *)0x1144;
nwGroup *ptr = (nwGroup*)(val);
printf("NW group: %d\n", ptr->group);
(Note: We usually number things starting from 0 in computer science, so I've interpreted your requirements from a zero-based mindset.)
Use a combination of masking and shifting. For example, if you want to be able to recover the NetworkModule value, create a mask that has 1-bits in the positions you want, and 0-bits everywhere else:
#define NetworkModuleMask 0x3F00 // that is, 0011 1111 0000 0000 (bits 8-13)
Now you can use that to mask out the unwanted bits using bitwise AND:
int address = 0x1144;
int networkModule = address & NetworkModuleMask;
Then, to interpret the value as a number, you'll want to shift it right by 8 bits:
#define NetworkModulePosition 8
networkModule = networkModule >> NetworkModulePosition;
You can use a similar process to construct an address using component values: shift each part into position and then bitwise OR it into the address.
You can also approach the problem arithmetically, using division and modulo with powers of 2. Dividing an integer by a power of 2 is the same as shifting it to the right by some number of bits, and taking it modulo a power of 2 is the same as keeping only some number of low bits (clearing everything above them), so you end up doing pretty much what we did above. For example, your NetworkGroup value is the low 8 bits of the address, so you can recover it by taking the address mod 2^8, or 256. The NetworkType is the highest 2 bits, and you can recover that by dividing the address by 2^14, or 16384.
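As a small sketch, here are both approaches applied to 0x1144, assuming the zero-based layout above (Group bits 0-7, Module bits 8-13, Type bits 14-15):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint16_t address = 0x1144;

    /* shift-and-mask extraction */
    uint16_t group  = address & 0x00FF;        /* bits 0-7  */
    uint16_t module = (address & 0x3F00) >> 8; /* bits 8-13 */
    uint16_t type   = address >> 14;           /* bits 14-15 */

    /* arithmetic equivalents: modulo keeps low bits, division shifts right */
    uint16_t group2 = address % 256;   /* address mod 2^8  */
    uint16_t type2  = address / 16384; /* address div 2^14 */

    printf("group=0x%02X module=0x%02X type=0x%02X\n",
           (unsigned)group, (unsigned)module, (unsigned)type);
    printf("group2=0x%02X type2=0x%02X\n", (unsigned)group2, (unsigned)type2);
    return 0;
}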

32-bit multiplication through 16-bit shifting

I am writing a soft-multiplication function call using shifting and addition. The existing function call goes like this:
unsigned long __mulsi3 (unsigned long a, unsigned long b) {
    unsigned long answer = 0;
    while (b) {
        if (b & 1) {
            answer += a;
        }
        a <<= 1;
        b >>= 1;
    }
    return answer;
}
Although my hardware does not have a multiplier, I have a hard shifter. The shifter is able to shift up to 16 bits at one time.
I want to make full use of my 16-bit shifter. Any suggestions on how I can adapt the code above to reflect my hardware's capabilities? The given code shifts only 1 bit per iteration.
The 16-bit shifter can shift 32-bit unsigned long values up to 16 places at a time, and unsigned long is 32 bits on this platform.
The ability to shift multiple bits is not going to help much, unless you have a hardware multiply, say 8-bit x 8-bit, or you can afford some RAM/ROM to do (say) a 4-bit by 4-bit multiply by lookup.
The straightforward shift and add (as you are doing) can be helped by swapping the arguments so that the multiplier is the smaller.
If your machine is faster doing 16-bit things in general, then treating your 32-bit 'a' as 'a1:a0' 16 bits at a time, and similarly 'b', you just might be able to save some cycles. Your result is only 32 bits, so you don't need to do 'a1 * b1' -- though one or both of those may be zero, so the win may not be big! Also, you only need the least-significant 16 bits of 'a0 * b1', so that can be done entirely in 16 bits -- but if b1 (assuming b <= a) is generally zero, this is not a big win either. For 'a * b0', you need a 32-bit 'a' and 32-bit adds into 'answer', but your multiplier is 16 bits only... which may or may not help.
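As a rough sketch of that decomposition (assuming a 32-bit unsigned long; mulu16 is a hypothetical soft multiply returning the full 32-bit product of two 16-bit halves):

#include <stdio.h>
#include <stdint.h>

/* Shift-and-add on 16-bit operands, accumulating a 32-bit product. */
static uint32_t mulu16(uint16_t a, uint16_t b) {
    uint32_t acc = 0, aa = a;
    while (b) {
        if (b & 1) acc += aa;
        aa <<= 1;
        b >>= 1;
    }
    return acc;
}

unsigned long mul32(unsigned long a, unsigned long b) {
    uint16_t a1 = (uint16_t)(a >> 16), a0 = (uint16_t)a;
    uint16_t b1 = (uint16_t)(b >> 16), b0 = (uint16_t)b;

    /* a1*b1 would land entirely above bit 31, so it is never computed;
       only the low 16 bits of the two cross terms survive the << 16. */
    uint32_t result = mulu16(a0, b0);
    result += (uint32_t)(uint16_t)mulu16(a0, b1) << 16;
    result += (uint32_t)(uint16_t)mulu16(a1, b0) << 16;
    return result;
}

int main(void) {
    printf("%lu\n", mul32(123456789UL, 987654321UL)); /* product mod 2^32 */
    return 0;
}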
Skipping runs of multiplier zeros could help -- depending on processor and any properties of the multiplier.
FWIW: doing the magic 'a1*b1', '(a1-a0)*(b0-b1)', 'a0*b0' and combining the results by shifts, adds and subtracts is, in my small experience, an absolute nightmare... the signs of '(a1-a0)', '(b0-b1)' and their product have to be respected, which makes a bit of a mess of what looks like a cute trick. By the time you have finished with that and the adds and subtracts, you have to have a mighty slow multiply to make it all worthwhile! When multiplying very, very long integers this may help... but there the memory issues may dominate... when I tried it, it was something of a disappointment.
Having 16-bit shifts can give you a minor speed enhancement using the following approach:
(U1 * P + U0) * (V1 * P + V0) =
= U1 * V1 * P * P + U1 * V0 * P + U0 * V1 * P + U0 * V0 =
= U1 * V1 * (P*P + P) + (U1-U0) * (V0-V1) * P + U0 * V0 * (1+P)
provided P is a convenient power of 2 (for example, 2^16 or 2^32), so that multiplying by it is a fast shift. This reduces 4 multiplications of smaller numbers to 3 and, applied recursively, gives O(N^1.58) instead of O(N^2) for very long numbers.
This method is named Karatsuba's multiplication. There are more advanced versions of the same idea as well.
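A quick numeric check of the identity (with the (1+P) term) at P = 2^8, using the host multiplier just to verify the algebra:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t U = 0x1234, V = 0xABCD, P = 1u << 8;
    uint32_t U1 = U >> 8, U0 = U & 0xFF;
    uint32_t V1 = V >> 8, V0 = V & 0xFF;

    /* the middle term can be negative, so compute it signed */
    int32_t mid = ((int32_t)U1 - (int32_t)U0) * ((int32_t)V0 - (int32_t)V1);

    int64_t prod = (int64_t)(U1 * V1) * (P * P + P)
                 + (int64_t)mid * P
                 + (int64_t)(U0 * V0) * (1 + P);

    printf("0x%08llX vs 0x%08llX\n", (unsigned long long)prod,
           (unsigned long long)U * V); /* both print 0x0C374FA4 */
    return 0;
}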
For small numbers (e.g. 8 by 8 bits), the following method is fast, if you have enough fast ROM:
a * b = square(a+b)/4 - square(a-b)/4
If you tabulate int(square(x)/4), you'll need 1022 bytes for unsigned multiplication and 510 bytes for a signed one.
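Here is a sketch of the unsigned 8x8 case with a 511-entry table of int(square(x)/4); since a+b and a-b always have the same parity, the two truncations cancel exactly:

#include <stdio.h>
#include <stdint.h>

static uint16_t sq4[511]; /* sq4[x] = x*x/4 for x = 0..510 */

static void init_sq4(void) {
    for (uint32_t x = 0; x < 511; x++)
        sq4[x] = (uint16_t)(x * x / 4);
}

/* a*b = square(a+b)/4 - square(|a-b|)/4 */
static uint16_t mul8(uint8_t a, uint8_t b) {
    return (uint16_t)(sq4[a + b] - sq4[a > b ? a - b : b - a]);
}

int main(void) {
    init_sq4();
    printf("%u\n", (unsigned)mul8(37, 113)); /* 4181 */
    return 0;
}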
The basic approach is (assuming shifting by 1) :-
Shift the top 16 bits
Set the bottom bit of the top 16 bits to the top bit of the bottom 16 bits
Shift the bottom 16 bits
Depends a bit on your hardware...
but you could try :-
assuming unsigned long is 32 bits
assuming Big Endian
then :-
union Data32
{
    unsigned long l;
    unsigned short s[2];
};

unsigned long shiftleft32(unsigned long valueToShift, unsigned short bitsToShift)
{
    union Data32 u;
    u.l = valueToShift;
    u.s[0] <<= bitsToShift;
    u.s[0] |= (unsigned short)(u.s[1] >> (16 - bitsToShift));
    u.s[1] <<= bitsToShift;
    return u.l;
}
then do the same in reverse for shifting right
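For reference, the mirrored version might look like this (same assumptions: big-endian, 32-bit unsigned long, the Data32 union above, and 0 < bitsToShift < 16):

unsigned long shiftright32(unsigned long valueToShift, unsigned short bitsToShift)
{
    union Data32 u;
    u.l = valueToShift;
    u.s[1] >>= bitsToShift;                                  /* low half down   */
    u.s[1] |= (unsigned short)(u.s[0] << (16 - bitsToShift)); /* bits from high  */
    u.s[0] >>= bitsToShift;                                  /* high half down  */
    return u.l;
}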
The code above multiplies in the traditional way, the way we learnt in primary school:
EX:
     0101
   * 0111
   ------
     0101
    0101.
   0101..
   ------
   100011
Of course you cannot approach it like that if you have neither a multiply operator nor a 1-bit shifter! You can, though, do it in other ways, for example with a loop:
unsigned long _mult(unsigned long a, unsigned long b)
{
    unsigned long res = 0;
    while (a > 0)
    {
        res += b;
        a--;
    }
    return res;
}
It is costly, but it serves your needs; anyway, you can think about other approaches if you have more constraints (like computation time...).

converting little endian hex to big endian decimal in C

I am trying to understand and implement a simple file system based on FAT12. I am currently looking at the following snippet of code and it's driving me crazy:
int getTotalSize(char * mmap)
{
    int *tmp1 = malloc(sizeof(int));
    int *tmp2 = malloc(sizeof(int));
    int retVal;

    *tmp1 = mmap[19];
    *tmp2 = mmap[20];
    printf("%d and %d read\n", *tmp1, *tmp2);
    retVal = *tmp1 + ((*tmp2) << 8);
    free(tmp1);
    free(tmp2);
    return retVal;
}
From what I've read so far, the FAT12 format stores integers in little-endian format, and the code above is getting the size of the file system, which is stored in the 19th and 20th bytes of the boot sector.
However, I don't understand why retVal = *tmp1+((*tmp2)<<8); works. Is the bitwise <<8 converting the second byte to decimal? Or to big-endian format?
Why is it only doing it to the second byte and not the first one?
The bytes in question are (in little-endian format):
40 0B
I tried converting them manually by switching the order first to
0B 40
and then converting from hex to decimal, and I get the right output. I just don't understand how adding the first byte to the shifted second byte does the same thing?
Thanks
The use of malloc() here is seriously facepalm-inducing. Utterly unnecessary, and a serious "code smell" (makes me doubt the overall quality of the code). Also, mmap clearly should be unsigned char (or, even better, uint8_t).
That said, the code you're asking about is pretty straight-forward.
Given two byte-sized values a and b, there are two ways of combining them into a 16-bit value (which is what the code is doing): you can either consider a to be the least-significant byte, or b.
Using boxes, the 16-bit value can look either like this:
+---+---+
| a | b |
+---+---+
or like this, if you instead consider b to be the most significant byte:
+---+---+
| b | a |
+---+---+
The way to combine the lsb and the msb into a 16-bit value is simply:
result = (msb * 256) + lsb;
UPDATE: The 256 comes from the fact that that's the "worth" of each successively more significant byte in a multibyte number. Compare it to the role of 10 in a decimal number (to combine two single-digit decimal numbers c and d you would use result = 10 * c + d).
Consider msb = 0x01 and lsb = 0x00, then the above would be:
result = 0x1 * 256 + 0 = 256 = 0x0100
You can see that the msb byte ended up in the upper part of the 16-bit value, just as expected.
Your code is using << 8 to do a bitwise shift to the left, which is the same as multiplying by 2^8, i.e. 256.
Note that result above is a value, i.e. not a byte buffer in memory, so its endianness doesn't matter.
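Applied to the question's two bytes (0x40 at offset 19, 0x0B at offset 20), a tiny standalone version of the same combination:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint8_t lsb = 0x40, msb = 0x0B; /* bytes 19 and 20 of the boot sector */
    uint16_t result = (uint16_t)((msb << 8) | lsb);
    printf("%u (0x%04X)\n", (unsigned)result, (unsigned)result); /* 2880 (0x0B40) */
    return 0;
}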
I see no problem combining individual digits or bytes into larger integers.
Let's do decimal with 2 digits: 1 (least significant) and 2 (most significant):
1 + 2 * 10 = 21 (10 is the system base)
Let's now do base-256 with 2 digits: 0x40 (least significant) and 0x0B (most significant):
0x40 + 0x0B * 0x100 = 0x0B40 (0x100=256 is the system base)
The problem, however, is likely lying somewhere else, in how 12-bit integers are stored in FAT12.
A 12-bit integer occupies 1.5 8-bit bytes. And in 3 bytes you have 2 12-bit integers.
Suppose, you have 0x12, 0x34, 0x56 as those 3 bytes.
In order to extract the first integer you only need take the first byte (0x12) and the 4 least significant bits of the second (0x04) and combine them like this:
0x12 + ((0x34 & 0x0F) << 8) == 0x412
In order to extract the second integer you need to take the 4 most significant bits of the second byte (0x03) and the third byte (0x56) and combine them like this:
(0x56 << 4) + (0x34 >> 4) == 0x563
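In code, the two extractions described above might look like this:

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint8_t bytes[3] = {0x12, 0x34, 0x56}; /* two packed 12-bit entries */

    uint16_t first  = (uint16_t)(bytes[0] | ((bytes[1] & 0x0F) << 8)); /* 0x412 */
    uint16_t second = (uint16_t)((bytes[1] >> 4) | (bytes[2] << 4));   /* 0x563 */

    printf("0x%03X 0x%03X\n", (unsigned)first, (unsigned)second);
    return 0;
}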
If you read the official Microsoft's document on FAT (look up fatgen103 online), you'll find all the FAT relevant formulas/pseudo code.
The << operator is the left shift operator. It takes the value to the left of the operator and shifts it by the number on the right side of the operator.
So in your case, it shifts the value of *tmp2 eight bits to the left and combines it with the value of *tmp1 to generate a 16-bit value from two eight-bit values.
For example, let's say you have the integer 1. This is, in 16-bit binary, 0000000000000001. If you shift it left by eight bits, you end up with the binary value 0000000100000000, i.e. 256 in decimal.
The presentation (i.e. binary, decimal or hexadecimal) has nothing to do with it. All integers are stored the same way on the computer.

Overflow of 32 bit variable

Currently I am implementing an equation (2^A)[X + Y*(2^B)] in one of my applications.
The issue is with the overflow of 32 bit value and I cannot use 64 bit data type.
Suppose B = 3 and Y = 805306367: Y*(2^B) overflows the 32-bit range, but when X = -2147483648 the final result comes back into the 32-bit range.
So I want to store the intermediate result of Y*(2^B). Can anyone suggest a solution for this? A and B have values from -15 to 15, and X and Y can have values from -2147483648 to 2147483647.
Output can range from 0...4294967295.
If the number is too big for a 32-bit variable, then you either use more bits (either by storing it in a bigger variable, or by using multiple variables) or you give up precision and store it in a float. Since Y can be INT_MAX, by definition you can't multiply it by a number greater than 1 and still have it fit in a 32-bit int.
I'd use a loop, instead of multiplication, in this case. Something like this:
int newX = X;
int poweredB = (1 << B); // 2^B

for (int i = 0; i < poweredB; ++i)
{
    newX += Y; // or change X directly, if you will not need it later
}

int result = (1 << A) * newX;
But note: this will work only in some situations - only if you have a guarantee that the result will not overflow. In your case, when Y is a large positive and X a large negative number ("large" - argh, this is too subjective), this will definitely work. But if X is large positive and Y is large positive - there will be overflow again. And not only in that case, but in many others.
Based on the values for A and B in the assignment I suppose the expected solution would involve the following (a small sketch follows these steps):
The following are best done unsigned, so store the signs of X and Y and operate on their absolute values
Store X and Y in two variables each, one holding the high 16 bits, the other holding the low 16 bits
something like
int hiY = Y & 0xFFFF0000;
int loY = Y & 0x0000FFFF;
Shift the high parts right so that all the variables have their high 16 bits zero
Y*(2^B) is actually a shift of Y to the left by B bits. It is equivalent to shifting the high and low parts left by B bits and, since you've shifted the high part down, both operations will fit inside their 32-bit integers
Process X similarly in the high and low parts
Keeping track of the signs of X and Y calculate X + Y*(2^B) for the high and low parts
Again shift both the high and low results by A bits
Join the high and low parts into the final result
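Here is a minimal sketch of the middle steps for the Y*(2^B) part, with an unsigned Y and B >= 0 (signs and negative shifts would be handled as described in the steps; uint64_t appears only to check the recombination):

#include <stdio.h>
#include <stdint.h>

int main(void) {
    uint32_t Y = 805306367u; /* the value from the question */
    int B = 3;

    uint32_t hiY = (Y >> 16) << B;     /* high half, shifted down then left */
    uint32_t loY = (Y & 0xFFFFu) << B; /* low half, shifted left */

    /* recombine the halves; neither shift above could overflow 32 bits */
    uint64_t recombined = ((uint64_t)hiY << 16) + loY;
    uint64_t reference  = (uint64_t)Y << B;

    printf("%llu vs %llu\n", (unsigned long long)recombined,
           (unsigned long long)reference); /* 6442450936 vs 6442450936 */
    return 0;
}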
If you can't use 64-bits because your local C does not support them rather than some other overriding reason, you might consider The GNU Multiple Precision Arithmetic Library at http://gmplib.org/

how is data stored at bit level according to "Endianness"?

I read about Endianness and understood squat...
so I wrote this
main()
{
    int k = 0xA5B9BF9F;
    BYTE *b = (BYTE*)&k; // value at *b is 9F
    b++;                 // value at *b is BF
    b++;                 // value at *b is B9
    b++;                 // value at *b is A5
}
k was equal to A5 B9 BF 9F, and the byte-pointer "walk" output was 9F BF B9 A5, so I get it: bytes are stored backwards... OK.
So now I wondered how it is stored at the BIT level...
I mean, is "9F" (1001 1111) stored as "F9" (1111 1001)?
so I wrote this
int _tmain(int argc, _TCHAR* argv[])
{
    int k = 0xA5B9BF9F;
    void *ptr = &k;
    bool temp = TRUE;

    cout << "ready or not here I come \n" << endl;

    for (int i = 0; i < 32; i++)
    {
        temp = *((bool*)ptr + i);
        if (temp)
            cout << "1 ";
        if (!temp)
            cout << "0 ";
        if (i == 7 || i == 15 || i == 23)
            cout << " - ";
    }
}
I get some random output; even for numbers like 32 I don't get anything sensible.
Why?
Just for completeness, machines are described in terms of both byte order and bit order.
The Intel x86 is called Consistent Little Endian because it stores multi-byte values in LSB to MSB order as the memory address increases. Its bit numbering convention is b0 = 2^0 and b31 = 2^31.
The Motorola 68000 is called Inconsistent Big Endian because it stores multi-byte values in MSB to LSB order as the memory address increases. Its bit numbering convention is b0 = 2^0 and b31 = 2^31 (the same as Intel, which is why it is called 'Inconsistent' Big Endian).
The 32-bit IBM/Motorola PowerPC is called Consistent Big Endian because it stores multi-byte values in MSB to LSB order as memory address increases. Its bit numbering convention is b0 = 2^31 and b31 = 2^0.
Under normal high level language use the bit order is generally transparent to the developer. When writing in assembly language or working with the hardware, the bit numbering does come into play.
Endianness, as you discovered by your experiment refers to the order that bytes are stored in an object.
Bits do not get stored differently, they're always 8 bits, and always "human readable" (high->low).
Now that we've discussed that you don't need your code... About your code:
for (int i = 0; i < 32; i++)
{
    temp = *((bool*)ptr + i);
    ...
}
This isn't doing what you think it's doing. You're iterating over 0-31, the number of bits in a word - good. But your temp assignment is all wrong :)
It's important to note that a bool* is the same size as an int* is the same size as a BigStruct*. All pointers on the same machine are the same size - 32bits on a 32bit machine, 64bits on a 64bit machine.
ptr + i is adding i bytes to the ptr address. When i>3, you're reading a whole new word... this could possibly cause a segfault.
What you want to use is bit masks. Something like this should work:
for (int i = 0; i < 32; i++) {
    unsigned int mask = 1u << i;
    bool bit_is_one = *static_cast<unsigned int*>(ptr) & mask;
    ...
}
Your machine almost certainly can't address individual bits of memory, so the layout of bits inside a byte is meaningless. Endianness refers only to the ordering of bytes inside multibyte objects.
To make your second program make sense (though there isn't really any reason to, since it won't give you any meaningful results) you need to learn about the bitwise operators - particularly & for this application.
Byte Endianness
On different machines this code may give different results:
union endian_example {
    unsigned long u;
    unsigned char a[sizeof(unsigned long)];
} x;

x.u = 0x0a0b0c0d;

int i;
for (i = 0; i < sizeof(unsigned long); i++) {
    printf("%u\n", (unsigned)x.a[i]);
}
This is because different machines are free to store values in any byte order they wish. This is fairly arbitrary. There is no backwards or forwards in the grand scheme of things.
Bit Endianness
Usually you don't ever have to worry about bit endianness. The most common way to access individual bits is with shifts (>>, <<), but those are really tied to values, not bytes or bits. They perform an arithmetic operation on a value. That value is stored in bits (which are in bytes).
Where you may run into a problem in C with bit endianness is if you ever use a bit field. This is a rarely used (for this reason and a few others) "feature" of C that allows you to tell the compiler how many bits a member of a struct will use.
struct thing {
    unsigned y:1; // y will be one bit and can have the values 0 and 1
    signed   z:1; // z can only have the values 0 and -1
    unsigned a:2; // a can be 0, 1, 2, or 3
    unsigned b:4; // b is just here to take up the rest of the byte
};
In this, the bit endianness is compiler-dependent. Should y be the most or least significant bit in a thing? Who knows? If you care about the bit ordering (describing things like the layout of an IPv4 packet header, control registers of a device, or just a storage format in a file) then you probably don't want to worry about some different compiler doing this the wrong way. Also, compilers aren't always as smart about how they work with bit fields as one would hope.
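If you're curious what your compiler actually did, a little probe like this shows where y landed (the printed byte is implementation-defined: 0x01 if y is the least significant bit, 0x80 if it is the most significant):

#include <stdio.h>
#include <string.h>

struct thing {
    unsigned y:1;
    signed   z:1;
    unsigned a:2;
    unsigned b:4;
};

int main(void) {
    struct thing t;
    memset(&t, 0, sizeof t);
    t.y = 1;

    unsigned char raw;
    memcpy(&raw, &t, 1); /* dump the byte the bit fields live in */
    printf("0x%02X\n", raw);
    return 0;
}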
This line here:
temp = *( (bool*)ptr + i );
... when you do pointer arithmetic like this, the compiler moves the pointer on by the number you added times the sizeof the thing you are pointing to. Because you are casting your void* to a bool*, the compiler will be moving the pointer along by the size of one bool - typically one byte - so you'll be stepping through whole bytes (not bits) and quickly reading memory from further along than you thought.
You can't address the individual bits in a byte, so it's almost meaningless to ask which way round they are stored. (Your machine can store them whichever way it wants and you won't be able to tell). The only time you might care about it is when you come to actually spit bits out over a physical interface like I2C or RS232 or similar, where you have to actually spit the bits out one-by-one. Even then, though, the protocol would define which order to spit the bits out in, and the device driver code would have to translate between "an int with value 0xAABBCCDD" and "a bit sequence 11100011... [whatever] in protocol order".
