Convert 4 bytes char to int32 in C - c

I first convert an int32 number to char[4] array, then convert the array back to int32 by (int *), but the number isn't the same as before:
unsigned int num = 2130706432;
unsigned int x;
unsigned char a[4];
a[0] = (num>>24) & 0xFF;
a[1] = (num>>16) & 0xFF;
a[2] = (num>>8) & 0xFF;
a[3] = num & 0xFF;
x = *(int *)a;
printf("%d\n", x);
The output is 127. And if I set num = 127, the output is 2130706432.
Does anyone have ideas?

Reverse the order of the a[] indexes, e.g., a[0] -> a[3].
I think you have the endianness in reverse.
Try this:
a[3] = (num>>24) & 0xFF;
a[2] = (num>>16) & 0xFF;
a[1] = (num>>8) & 0xFF;
a[0] = num & 0xFF;

To see what happens use
printf("%x\n", ...);
to print both input and output number.
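Endian-independent way (the casts keep a[0] << 24 from overflowing a signed int):
x = ((unsigned int)a[0] << 24) | ((unsigned int)a[1] << 16) | ((unsigned int)a[2] << 8) | a[3];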

With the bytes stored most significant first, this line is never going to give back the original value on a little-endian machine:
x = *(int *)a;
You need to unpack the data before you print out the value.

Your code a[0] = (num>>24) & 0xFF; takes the most significant 8 bits from num and sticks them in the first byte of a. On little-endian machines the first byte holds the least significant bits. That means that on little-endian machines, this code takes the most significant 8 bits and stores them in the place where the least significant bits go, changing the value.
2130706432 is 0x7F000000 in hex, and 127 is 0x0000007F.
Also, x = *(int *)a; results in undefined behavior. Consider hardware where reading an int from an improperly aligned address causes a bus error. If a doesn't happen to be aligned properly for an int then the program would crash.
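A correct approach to interpreting the bytes as an int would be memcpy(&x, a, sizeof x); (declared in <string.h>). It copies the bytes in memory order, so the value you get still depends on the machine's endianness.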
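To put the answers above side by side, here is a minimal sketch, not a definitive implementation. The function names pack_be32, unpack_be32 and load_native32 are made up for illustration, and uint32_t is used in place of unsigned int so the width is fixed at 32 bits.

```c
#include <stdint.h>
#include <string.h>

/* Store a 32-bit value most significant byte first (big-endian layout). */
static void pack_be32(uint32_t num, unsigned char a[4])
{
    a[0] = (num >> 24) & 0xFFu;
    a[1] = (num >> 16) & 0xFFu;
    a[2] = (num >> 8) & 0xFFu;
    a[3] = num & 0xFFu;
}

/* Rebuild the value with shifts; the result does not depend on host byte order. */
static uint32_t unpack_be32(const unsigned char a[4])
{
    return ((uint32_t)a[0] << 24) | ((uint32_t)a[1] << 16) |
           ((uint32_t)a[2] << 8)  |  (uint32_t)a[3];
}

/* Copy the bytes as they sit in memory; the result depends on host byte order. */
static uint32_t load_native32(const unsigned char a[4])
{
    uint32_t x;
    memcpy(&x, a, sizeof x);
    return x;
}
```

After pack_be32(2130706432u, a), unpack_be32(a) gives 2130706432 on any host, while load_native32(a) gives 127 on a little-endian host and 2130706432 on a big-endian one.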

Related

How to convert a char pointer array into a int pointer array in C

I have a char pointer to 8 bytes that hold 2 integers. How do I view it through an int pointer so that the 1st integer is in array[0] and the 2nd integer is in array[1]?
The code I made so far:
char * wirte_buff= (char*) malloc(8*sizeof(char*));
int i, j;
i = 16;
j = 18;
/*separates integer i and integer j into 4-bytes each*/
for(n=0; n<=3; n++){
wirte_buff[n] = (i >> 8*(3-n)) & 0xFF;
wirte_buff[4+n] = (j >> 8*(3-n)) & 0xFF;
}
int* intArray = (int*) wirte_buff; //puts char pointer to
printf("intArray[0] value is %d \n", intArray[0]);
printf("intArray[1] value is %d \n", intArray[1]);
I expected the result to be 16 and 18, but I unexpectedly got 268435456 and 301989888.
Assuming you are aware of the strict aliasing rule violation, your code would generate the result you expect in a big endian architecture, in which the four bytes composing an integer are stored starting from the most significant byte:
------------------------------------------------------------------------------
| byte3 (bit 24:31) | byte2 (bit 16:23) | byte1 (bit 8:15) | byte0 (bit 0:7) |
------------------------------------------------------------------------------
But you are apparently running your code in a little endian architecture machine:
------------------------------------------------------------------------------
| byte0 (bit 0:7) | byte1 (bit 8:15) | byte2 (bit 16:23) | byte3 (bit 24:31) |
------------------------------------------------------------------------------
So, in order to lay out your integer correctly in the char array, you need the following:
The byte 0 of i, that is i >> (8 * 0), is at index 0 of wirte_buff array
The byte 1 of i, that is i >> (8 * 1), is at index 1 of wirte_buff array
The byte 2 of i, that is i >> (8 * 2), is at index 2 of wirte_buff array
The byte 3 of i, that is i >> (8 * 3), is at index 3 of wirte_buff array
This translates to
wirte_buff[n] = (i >> 8*(n)) & 0xFF;
and the same, of course, for j:
wirte_buff[4+n] = (j >> 8*(n)) & 0xFF;
This code is wrong in many ways.
char * wirte_buff= (char*) malloc(8*sizeof(char*)); requests 8 * sizeof(char*) bytes, that is, room for eight pointers (typically 32 or 64 bytes), when only 8 bytes are needed. The buffer is large enough, so this is wasteful rather than fatal, but the size should be 8 * sizeof(char), or simply 8.
i >> ... etc performs bitwise operations on a signed type, which is best avoided. If the value is negative, right-shifting it gives implementation-defined results.
Should you convert the int value into char, then char has implementation-defined signedness so you don't know if you end up with a negative value or possibly an overflow/underflow.
Should you avoid that as well, you can't read a char back through another type with (int*) wirte_buff; ... intArray[0] because these are not compatible types. You might read misaligned data. You will also violate strict pointer aliasing, see What is the strict aliasing rule?
There is no expected behavior of the posted code and I doubt you can salvage it. You will have to re-write this from scratch and especially avoid all the fishy conversions.
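One way such a rewrite could look, as a minimal sketch under a few assumptions: the names write_pair and read_pair are hypothetical, uint32_t replaces int so the width is fixed, and memcpy replaces both the shifting loop and the (int*) cast. The bytes stay in the host's native order, so this only suits data that is written and read on the same machine.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Copy two 32-bit values into a freshly allocated 8-byte buffer.
   Returns NULL if allocation fails; the caller frees the buffer. */
static unsigned char *write_pair(uint32_t i, uint32_t j)
{
    unsigned char *buf = malloc(2 * sizeof(uint32_t));
    if (buf == NULL)
        return NULL;
    memcpy(buf, &i, sizeof i);            /* bytes 0..3 hold i */
    memcpy(buf + sizeof i, &j, sizeof j); /* bytes 4..7 hold j */
    return buf;
}

/* Copy the two values back out into out[0] and out[1]. */
static void read_pair(const unsigned char *buf, uint32_t out[2])
{
    memcpy(out, buf, 2 * sizeof(uint32_t));
}
```

With write_pair(16, 18) followed by read_pair, out[0] is 16 and out[1] is 18, with no aliasing or alignment problem because no pointer is ever cast to a different type.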

Bit Shifting - Finding nth byte in a number [duplicate]

I know you can get the first byte by using
int x = number & ((1<<8)-1);
or
int x = number & 0xFF;
But I don't know how to get the nth byte of an integer.
For example, 1234 is 00000000 00000000 00000100 11010010 as 32bit integer
How can I get all of those bytes? first one would be 210, second would be 4 and the last two would be 0.
int x = (number >> (8*n)) & 0xff;
where n is 0 for the first byte, 1 for the second byte, etc.
For the (n+1)th byte in whatever order they appear in memory (which is also least- to most- significant on little-endian machines like x86):
int x = ((unsigned char *)(&number))[n];
For the (n+1)th byte from least to most significant on big-endian machines:
int x = ((unsigned char *)(&number))[sizeof(int) - 1 - n];
For the (n+1)th byte from least to most significant (any endian):
int x = ((unsigned int)number >> (n << 3)) & 0xff;
Of course, these all assume that n < sizeof(int), and that number is an int.
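The endian-independent variant can be wrapped in a small helper that also checks the n < sizeof(int) assumption. This is only a sketch; the name nth_byte and the -1 error return are choices made for illustration, and it assumes 8-bit bytes.

```c
#include <stddef.h>

/* Return byte n of number, counting from the least significant byte (n = 0).
   Returns -1 if n is out of range for an int. */
static int nth_byte(int number, size_t n)
{
    if (n >= sizeof(int))
        return -1;
    /* Convert to unsigned first so the shift is well defined for negative input. */
    return (int)(((unsigned int)number >> (n * 8)) & 0xFFu);
}
```

For the example in the question, nth_byte(1234, 0) is 210, nth_byte(1234, 1) is 4, and the remaining bytes are 0.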
int nth = (number >> (n * 8)) & 0xFF;
Carry it into the lowest byte and take it in the "familiar" manner.
If you are wanting a byte, wouldn't the better solution be:
unsigned char x = (unsigned char)(number >> (8 * n));
This way, you are returning and dealing with a byte (unsigned char, since C has no built-in byte type) instead of an int, so we are using less memory, and we don't have to do the binary and operation & 0xff just to mask the result down to a byte. I also saw that the person asking the question used an int in their example, but that doesn't make it right.
I know this question was asked a long time ago, but I just ran into this problem, and I think that this is a better solution regardless.
#include <stddef.h>
#include <stdint.h>

// In-place attempt; it would have been better to swap the higher and lower bytes directly.
uint32_t reverseBytes(uint32_t value) {
    uint32_t temp;
    size_t size = sizeof(uint32_t);
    for (size_t i = 0; i < size/2; i++) {
        // save low byte i
        temp = (value >> (8*i)) & 0xffu;
        // copy the matching high byte into the low byte position
        value = (value & ~((uint32_t)0xff << (8*i))) | ((value & ((uint32_t)0xff << (8*(size-i-1)))) >> (8*(size-2*i-1)));
        // move the saved low byte into the high byte position
        value = (value & ~((uint32_t)0xff << (8*(size-i-1)))) | (temp << (8*(size-i-1)));
    }
    return value;
}
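For comparison, the same reversal can be done without a loop by moving each byte straight to its mirrored position. This is a sketch of a different technique from the one above, and the name reverse_bytes32 is made up for illustration.

```c
#include <stdint.h>

/* Reverse the four bytes of a 32-bit value with one mask and shift per byte. */
static uint32_t reverse_bytes32(uint32_t value)
{
    return ((value & 0x000000FFu) << 24) |  /* byte 0 -> byte 3 */
           ((value & 0x0000FF00u) << 8)  |  /* byte 1 -> byte 2 */
           ((value & 0x00FF0000u) >> 8)  |  /* byte 2 -> byte 1 */
           ((value & 0xFF000000u) >> 24);   /* byte 3 -> byte 0 */
}
```

For instance, reverse_bytes32(0x7F000000) is 0x0000007F, and applying the function twice returns the original value.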

What does a[0] = addr & 0xff do?

I'm currently learning from the book "The Shellcoder's Handbook". I have a strong understanding of C, but recently I came across a piece of code that I can't grasp.
Here is the piece of code:
char a[4];
unsigned int addr = 0x0806d3b0;
a[0] = addr & 0xff;
a[1] = (addr & 0xff00) >> 8;
a[2] = (addr & 0xff0000) >> 16;
a[3] = (addr) >> 24;
So the question is: what does this do? What is addr & 0xff (and the three lines below it), and what does >> 8 do to it (I know that it divides by 2 eight times)?
Ps: don't hesitate to tell me if you have ideas for the tags that I should use.
The variable addr is 32 bits of data, while each element in the array a is 8 bits. What the code does is copy the 32 bits of addr into the array a, one byte at a time.
Let's take this line:
a[1] = (addr & 0xff00) >> 8;
And then do it step by step.
addr & 0xff00 This gets the bits 8 to 15 of the value in addr, the result after the operation is 0x0000d300.
>> 8 This shifts the bits to the right, so 0x0000d300 becomes 0x000000d3.
Assign the resulting value of the mask and shift to a[1].
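The two steps can be checked in isolation. The helper names below are hypothetical and exist only to make each step visible; uint32_t stands in for unsigned int.

```c
#include <stdint.h>

/* Step 1: keep bits 8 to 15 and zero everything else. */
static uint32_t keep_bits_8_to_15(uint32_t addr)
{
    return addr & 0xff00u;
}

/* Step 2: move the surviving bits down into positions 0 to 7.
   For an unsigned value this equals dividing by 2 eight times, i.e. by 256. */
static uint32_t shift_down_8(uint32_t masked)
{
    return masked >> 8;
}
```

With addr = 0x0806d3b0, the first step yields 0x0000d300 and the second yields 0xd3, which is the value stored in a[1].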
The code is trying to enforce endianness on the data input. Specifically, it is trying to enforce little-endian behavior on the data. Here is the explanation:
a[0] = addr & 0xff; /* gets the LSB 0xb0 */
a[1] = (addr & 0xff00) >> 8; /* gets the 2nd LSB 0xd3 */
a[2] = (addr & 0xff0000) >> 16; /* gets 2nd MSB 0x06 */
a[3] = (addr) >> 24; /* gets the MSB 0x08 */
So basically, the code is masking and separating out every byte of data and storing it in the array "a" in the little endian format.
unsigned char a[4]; /* I think using unsigned char is better in this case */
unsigned int addr = 0x0806d3b0;
a[0] = addr & 0xff; /* get the least significant byte 0xb0 */
a[1] = (addr & 0xff00) >> 8; /* get the second least significant byte 0xd3 */
a[2] = (addr & 0xff0000) >> 16; /* get the second most significant byte 0x06 */
a[3] = (addr) >> 24; /* get the most significant byte 0x08 */
Apparently, the code isolates the individual bytes from addr to store them in the array a so they can be indexed. The first line
a[0] = addr & 0xff;
extracts the least significant byte by using 0xff as a bit mask; the subsequent lines do the same for the higher bytes, but in addition shift the result to the rightmost position. Finally, in the last line
a[3] = (addr) >> 24;
no masking is necessary, as all unnecessary information is discarded by the shift.
The code is effectively storing a 32-bit address in an array of 4 chars. As you may know, a char is one byte (8 bits). It first copies the lowest byte of the address, then shifts and copies the second byte, then shifts again, and so on. You get the gist.
It enforces endianness, and stores the integer in little-endian format in a.
See the illustration on wikipedia.
Also, why not visualize the bit-shifting results:
char a[4];
unsigned int addr = 0x0806d3b0;
a[0] = addr & 0xff;
a[1] = (addr & 0xff00) >> 8;
a[2] = (addr & 0xff0000) >> 16;
a[3] = (addr) >> 24;
int i = 0;
for( ; i < 4; i++ )
{
printf( "a[%d] = %02x\t", i, (unsigned char)a[i] );
}
printf("\n" );
Output:
a[0] = b0 a[1] = d3 a[2] = 06 a[3] = 08
In addition to the multiple answers given, the code has some flaws that need to be fixed to make the code portable. In particular, the char type is very dangerous to use for storing values, because of its implementation-defined signedness. Very classic C bug. If the code was taken from a book, then you should read that book sceptically.
While we are at it, we can also tidy up the code, make it overly explicit to avoid potential future maintenance bugs, remove some implicit type promotions of integer literals etc.
#include <stdint.h>
uint8_t a[4];
uint32_t addr = 0x0806d3b0UL;
a[0] = addr & 0xFFu;
a[1] = (addr >> 8) & 0xFFu;
a[2] = (addr >> 16) & 0xFFu;
a[3] = (addr >> 24) & 0xFFu;
The masks & 0xFFu are strictly speaking not needed, but they might save you from some false positive compiler warnings about wrong integer types. Alternatively, each shift result could be cast to uint8_t and that would have been fine too.
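To illustrate why a signed char is risky here, consider reading back the byte 0xb0. On a two's complement platform where char is signed, that bit pattern is the value -80. The sketch below uses signed char explicitly so it behaves the same everywhere; the function names are made up for illustration.

```c
#include <stdint.h>

/* Widen a signed byte directly: negative values sign-extend. */
static uint32_t widen_signed(signed char c)
{
    return (uint32_t)c;
}

/* Go through unsigned char first so only the low 8 bits survive. */
static uint32_t widen_unsigned(signed char c)
{
    return (uint32_t)(unsigned char)c;
}
```

widen_signed(-80) is 0xFFFFFFB0, while widen_unsigned(-80) is 0x000000B0. This is also why the printing example above casts a[i] to unsigned char before passing it to printf.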

How to write a 24 bit message after reading from a 4-byte integer on a big endian machine (C)?

I am constructing a message to send a 24-bit number over the network.
For little endian machines, the code is (ptr is the pointer to the message buffer):
*ptr++ = (num >> 16) & 0xFF;
*ptr++ = (num >> 8) & 0xFF;
*ptr++ = (num) & 0xFF;
(So if num0, num1, num2 and num3 are the individual bytes making up num, the message would be encoded as num2|num1|num0.)
What should be the code for encoding num2|num1|num0 on a big endian machine?
The question here is, in what byte order shall the message be sent/constructed ? Because whether you are on a little or big endian machine doesn't matter with respect to num, as you're already dividing num into individual bytes in an endian-agnostic way.
The code you've posted stores 24 bits of num in big endian (aka network byte order). So if that's what you want, you're already done. If you want to store it in little endian instead, just reverse the order:
*ptr++ = (num) & 0xFF;
*ptr++ = (num >> 8) & 0xFF;
*ptr++ = (num >> 16) & 0xFF;
Your code is portable regardless of endianness. The shift operators >> and << work with the values, not with the representation.
In the receiving machine, regardless of endian-ness, if you receive them in same order as they are stored in ptr, assemble them like this:
num = (ptr[0] << 16) + (ptr[1] << 8) + (ptr[2]);
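Putting the sender and receiver sides together, a round trip could be sketched like this. The names put_be24 and get_be24 are hypothetical, and the buffer is assumed to be unsigned char so that no byte is sign-extended when it is read back.

```c
#include <stdint.h>

/* Write the low 24 bits of num, most significant byte first.
   Returns the pointer advanced past the three bytes written. */
static unsigned char *put_be24(unsigned char *ptr, uint32_t num)
{
    *ptr++ = (num >> 16) & 0xFFu;
    *ptr++ = (num >> 8) & 0xFFu;
    *ptr++ = num & 0xFFu;
    return ptr;
}

/* Reassemble a 24-bit value from three bytes in the same order. */
static uint32_t get_be24(const unsigned char *ptr)
{
    return ((uint32_t)ptr[0] << 16) | ((uint32_t)ptr[1] << 8) | (uint32_t)ptr[2];
}
```

Encoding 0x12ABCDEF writes the bytes AB CD EF, and decoding them gives 0xABCDEF on both little-endian and big-endian hosts, because only values are shifted and memory layout is never inspected.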
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char** argv) {
    int a, b;
    a = 0x0f000000; // a 32-bit value with bits set above bit 23
    printf("before = %d\n", a);
    b = a & (~0xff000000); // clear the top 8 bits so only the low 24 bits remain in b
    printf("After = %d\n", b);
    return (EXIT_SUCCESS);
}
Here a holds a 32-bit value, while b keeps only its low 24 bits, counted from the least significant bit. That doesn't depend on endianness, because bitwise operators work on values, not on the memory representation.
So you can use
num = num & (~0xff000000);
to keep only the low 24 bits of the value.

Resources