I need to convert 4-byte data in the format below back to the original int value. I cannot change the assignment of the int to 4 bytes shown below.
int main() {
//code
int num = 1000;
char a[4];
a[0] = ( char )(num>>24) ;
a[1] = ( char )(num>>16) ;
a[2] = ( char )(num>>8) ;
a[3] = ( char )num ;
printf("Original number is:%d\n", (a[0] << 24 | a[1] << 16 | a[2] << 8 | a[3] ) );
return 0;
}
I was expecting the output to be 1000, but the output is 768. How do we restore the original number from the byte array above? Is this an endianness issue?
a[0] = ( char )(num>>24) ;
That works “okay” in this example. However, in situations where num is negative, the result is implementation-defined (C 2018 6.5.7 5).
In the remaining assignments to a[1], a[2], and a[3], values that may exceed the range of char will be converted to char automatically. If char is signed, the results of these conversions are implementation-defined or a signal is raised (6.3.1.3 3). So that is a problem we will have to fix, below.
First, for num = 1000, let’s suppose that −24 is stored in a[3]. This is the result we would get by taking the low eight bits of 1000 and putting them in an eight-bit two’s complement char, which is likely what your implementation uses. Then, we have a[0] = 0, a[1] = 0, a[2] = 3, and a[3] = −24.
Now let’s consider a[0] << 24 | a[1] << 16 | a[2] << 8 | a[3].
a[0] << 24 and a[1] << 16 both yield 0. a[2] << 8 is 3 << 8, which produces 768, or 300 in hexadecimal. a[3] is −24. While a[3] is a char, it is promoted to an int when used in an expression (6.3.1.1 2). Still assuming your C implementation uses two’s complement, the binary for −24 is 11111111111111111111111111101000, or ffffffe8 in hexadecimal.
When we bitwise OR 300 and ffffffe8, the result is ffffffe8, which, in a 32-bit two’s complement int, is −24.
The easiest way to fix this is to change char a[4]; to unsigned char a[4];. That avoids any negative char values.
However, to make your code completely work for any value of int (assuming it is four bytes and two’s complement), we need to make some other changes:
unsigned char a[4];
/* Convert the signed num to unsigned before shifting.
Shifts of unsigned values are better defined than shifts
of signed values.
*/
a[0] = (unsigned) num >> 24;
a[1] = (unsigned) num >> 16;
a[2] = (unsigned) num >> 8;
a[3] = (unsigned) num;
/* The cast in the last assignment is not really needed since
we are assigning to an unsigned char, and it will be converted
as desired, but we keep it for uniformity.
*/
// Reconstruct the value using all unsigned values.
unsigned u = (unsigned) a[0] << 24 | (unsigned) a[1] << 16 | (unsigned) a[2] << 8 | a[3];
/* Copy the bits into an int. (Include <string.h> to get memcpy.)
Note: It is easy to go from signed to unsigned because the C standard
completely defines that conversion. For unsigned to signed, the
conversion is not completely defined, so we have to use some indirect
method to get the bits into an int.
*/
int i;
memcpy(&i, &u, sizeof i);
printf("Original number: %d.\n", i);
We need to use an unsigned value to reconstruct the bits because C’s shift operators are not well defined for signed values, especially when we want to shift a bit into the sign bit. Once we have the bits in the unsigned object, we can copy them into an int.
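Putting the pieces above together, a complete program might look like this (a minimal sketch, assuming the 32-bit two's-complement int stated above):
#include <stdio.h>
#include <string.h>
int main(void) {
    int num = 1000;   /* also try negative values such as -1000 */
    unsigned char a[4];
    /* Pack: shift the unsigned image of num, so the shifts are fully defined. */
    a[0] = (unsigned) num >> 24;
    a[1] = (unsigned) num >> 16;
    a[2] = (unsigned) num >> 8;
    a[3] = (unsigned) num;
    /* Unpack into an unsigned, then copy the bits into an int. */
    unsigned u = (unsigned) a[0] << 24 | (unsigned) a[1] << 16 | (unsigned) a[2] << 8 | a[3];
    int i;
    memcpy(&i, &u, sizeof i);
    printf("Original number: %d\n", i);
    return 0;
}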
I want to compose the number 0xAAEFCDAB from individual bytes. Everything goes well up to the fourth byte, and then for some reason an extra 4 bytes are added to it. What am I doing wrong?
#include <stdio.h>
int main(void) {
unsigned long int a = 0;
a = a | ((0xAB) << 0);
printf("%lX\n", a);
a = a | ((0xCD) << 8);
printf("%lX\n", a);
a = a | ((0xEF) << 16);
printf("%lX\n", a);
a = a | ((0xAA) << 24);
printf("%lX\n", a);
return 0;
}
Output:
Constants in C are actually typed, which might not be obvious at first, and the default type for an integer constant is int, which is a signed 32-bit integer (it depends on the platform, but it probably is in your case).
In signed numbers, the highest bit describes the sign of the number: 1 is negative and 0 is positive (for more details you can read about two's complement).
When you perform the operation 0xAA << 24 it results in a 32-bit signed value of 0xAA000000, which is equal to 10101010 00000000 00000000 00000000 in binary (strictly speaking, shifting a 1 into the sign bit of an int is not defined by the standard, but compilers commonly just produce this bit pattern). As you can see, the highest bit is set to 1, which means that the entire 32-bit signed number is actually negative.
In order to perform the | OR operation between a (which is a 64-bit unsigned number) and this 32-bit signed number, some type conversions must be performed. The widening is performed first, and the 32-bit signed value 0xAA000000 is converted to the 64-bit signed value 0xFFFFFFFFAA000000, according to the rules of the two's complement system. This is a 64-bit signed number which has the same numerical value as the 32-bit one before conversion.
Afterwards, type conversion is performed from 64-bit signed to 64-bit unsigned value in order to OR the value with a. This fills the top bits with ones and results in the value you see on the screen.
In order to force your constants to be a different type than 32-bit signed int, you may use suffixes such as u and l, as shown in the website I linked in the beginning of my answer. In your case, a ul suffix should work best, indicating an unsigned long value (64 bits here). Your lines of code which OR constants into your a variable would then look similar to this:
a = a | ((0xAAul) << 24);
Alternatively, if you want to limit yourself to 4 bytes only, a 32-bit unsigned int is enough to hold them. In that case, I suggest you change your a variable type to unsigned int and use the u suffix for your constants. Do not forget to change the printf formats to reflect the type change. The resulting code looks like this:
#include <stdio.h>
int main(void) {
unsigned int a = 0;
a = a | ((0xABu) << 0);
printf("%X\n", a);
a = a | ((0xCDu) << 8);
printf("%X\n", a);
a = a | ((0xEFu) << 16);
printf("%X\n", a);
a = a | ((0xAAu) << 24);
printf("%X\n", a);
return 0;
}
My last suggestion is to not use the default int and long types when portability and size in bits are important to you. These types are not guaranteed to have the same number of bits on all platforms. Instead use the types defined in the <stdint.h> header file, in your case probably either uint64_t or uint32_t. These two are guaranteed to be unsigned integers (their signed counterparts omit the 'u': int64_t and int32_t) while being 64-bit and 32-bit in size respectively on all platforms. For pros and cons of using them instead of the traditional int and long types I refer you to this Stack Overflow answer.
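For example (just a sketch; UINT32_C comes from <stdint.h> and PRIX32 from <inttypes.h>, which includes <stdint.h>):
#include <stdio.h>
#include <inttypes.h>
int main(void) {
    uint32_t a = 0;
    a |= UINT32_C(0xAB) << 0;
    a |= UINT32_C(0xCD) << 8;
    a |= UINT32_C(0xEF) << 16;
    a |= UINT32_C(0xAA) << 24;
    printf("%" PRIX32 "\n", a);   /* prints AAEFCDAB */
    return 0;
}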
a = a | ((0xAA) << 24);
((0xAA) << 24) is a negative number (the constant has type int), and it is then sign extended to the width of unsigned long, which adds the 0xFFFFFFFF you see at the beginning.
You need to tell the compiler that you want an unsigned number.
a = a | ((0xAAU) << 24);
int main(void) {
unsigned long int a = 0;
a = a | ((0xAB) << 0);
printf("%lX\n", a);
a = a | ((0xCD) << 8);
printf("%lX\n", a);
a = a | ((0xEF) << 16);
printf("%lX\n", a);
a = a | ((0xAAUL) << 24);
printf("%lX\n", a);
printf("%d\n", ((0xAA) << 24));
return 0;
}
https://gcc.godbolt.org/z/fjv19bKGc
0xAA has type int, so the shift is done in a signed type. After 0xAA << 24 the high bit of the 32-bit result is 1 (0xAA = 10101010b ends up in the top byte), so the value is negative, and it is sign extended to 0xFFFFFFFFAA000000 when it is widened before being ORed into a.
You need to cast 0xAA to an unsigned value before bit shifting it, so it gets zero extended instead.
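For example (a sketch of that suggestion; any sufficiently wide unsigned type, or the u/ul suffixes from the other answers, works just as well):
a = a | ((unsigned long)0xAA << 24);   /* unsigned shift, so no sign extension when widening */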
I have an 8-byte char pointer that has 2 integers stored inside it. How do I store it in an int array pointer so that the 1st integer is in array[0] and the 2nd integer is in array[1]?
The code I made so far:
char * wirte_buff= (char*) malloc(8*sizeof(char*));
int i, j;
i = 16;
j = 18;
/*separates integer i and integer j into 4-bytes each*/
for(n=0; n<=3; n++){
wirte_buff[n] = (i >> 8*(3-n)) & 0xFF;
wirte_buff[4+n] = (j >> 8*(3-n)) & 0xFF;
}
int* intArray = (int*) wirte_buff; //puts char pointer to
printf("intArray[0] value is %d \n", intArray[0]);
printf("intArray[1] value is %d \n", intArray[1]);
When I did this, the expected result was 16 and 18, but I unexpectedly got 268435456 and 301989888.
Assuming you are aware of the strict aliasing rule violation, your code would generate the result you expect in a big endian architecture, in which the four bytes composing an integer are stored starting from the most significant byte:
------------------------------------------------------------------------------
| byte3 (bit 24:31) | byte2 (bit 16:23) | byte1 (bit 8:15) | byte0 (bit 0:7) |
------------------------------------------------------------------------------
But you are apparently running your code in a little endian architecture machine:
------------------------------------------------------------------------------
| byte0 (bit 0:7) | byte1 (bit 8:15) | byte2 (bit 16:23) | byte3 (bit 24:31) |
------------------------------------------------------------------------------
So, in order to lay out your integer in the char array, you need the following:
The byte 0 of i, that is i >> (8 * 0), is at index 0 of wirte_buff array
The byte 1 of i, that is i >> (8 * 1), is at index 1 of wirte_buff array
The byte 2 of i, that is i >> (8 * 2), is at index 2 of wirte_buff array
The byte 3 of i, that is i >> (8 * 3), is at index 3 of wirte_buff array
This translates to
wirte_buff[n] = (i >> 8*(n)) & 0xFF;
and the same, of course, for j:
wirte_buff[4+n] = (j >> 8*(n)) & 0xFF;
This code is wrong in many ways.
char * wirte_buff= (char*) malloc(8*sizeof(char*)); allocates room for 8 char* (pointers), not the 8 bytes of data you actually want; the size expression is wrong even though the over-allocation happens to hide it.
i >> ... etc. performs bitwise operations on a signed type, which is risky: if the value is negative, you end up with implementation-defined results.
Should you then convert the int value into char, char has implementation-defined signedness, so you don't know whether you end up with a negative value or an implementation-defined out-of-range conversion.
Should you avoid that as well, you can't read a char back through another type with (int*) wirte_buff; ... intArray[0] because these are not compatible types. You might read misaligned data. You will also violate strict pointer aliasing, see What is the strict aliasing rule?
There is no expected behavior of the posted code and I doubt you can salvage it. You will have to re-write this from scratch and especially avoid all the fishy conversions.
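For what it's worth, one possible from-scratch version (only a sketch, keeping your variable names, and using unsigned arithmetic plus an explicit unpack instead of the pointer cast) could look like this:
#include <stdio.h>
#include <stdlib.h>
int main(void) {
    unsigned char *wirte_buff = malloc(8);   /* 8 bytes, not 8 pointers */
    if (wirte_buff == NULL)
        return 1;
    unsigned int i = 16, j = 18;
    /* Separate i and j into 4 bytes each, most significant byte first. */
    for (int n = 0; n <= 3; n++) {
        wirte_buff[n]     = (i >> 8 * (3 - n)) & 0xFF;
        wirte_buff[4 + n] = (j >> 8 * (3 - n)) & 0xFF;
    }
    /* Rebuild the values with shifts instead of reading through an int pointer. */
    unsigned int intArray[2] = {0, 0};
    for (int n = 0; n <= 3; n++) {
        intArray[0] = intArray[0] << 8 | wirte_buff[n];
        intArray[1] = intArray[1] << 8 | wirte_buff[4 + n];
    }
    printf("intArray[0] value is %u \n", intArray[0]);
    printf("intArray[1] value is %u \n", intArray[1]);
    free(wirte_buff);
    return 0;
}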
Can anyone help me explain the difference between unsigned char and char in an XOR operation?
#include <stdio.h>
int main() {
char a[2] = { 0x56, 0xa5 }; // a[0] 0101 0110
// a[1] 1010 0101
a[0] = a[0] ^ a[1]; // a[0] 1111 0011 f3
printf("%02x", a[0]);
puts("");
unsigned char b[2] = { 0x56, 0xa5 }; // b[0] 0101 0110
// b[1] 1010 0101
b[0] = b[0] ^ b[1]; // b[0] 1111 0011 f3
printf("%02x", b[0]);
puts("");
}
result:
fffffff3
f3
[Finished in 0.0s]
Another example:
#include <stdio.h>
int main() {
char a[2] = { 0x01, 0x0a };
a[0] = a[0] ^ a[1];
printf("%02x", a[0]);
puts("");
unsigned char b[2] = { 0x01, 0x0a };
b[0] = b[0] ^ b[1];
printf("%02x", b[0]);
puts("");
}
result:
0b
0b
[Finished in 0.0s]
In the first case, your code
printf("%02x", a[0]);
Passes a char value to the variadic function printf. The char value is promoted to int type and passed as such. The value of a[0] is -13 because the char type happens to be signed by default in your environment. The value is preserved by the promotion to int, and printf receives it as an int.
The format %02x expects an unsigned int value. printf was passed an int value, an incorrect type invoking undefined behavior. Since int and unsigned int use the same parameter passing convention on your platform, this negative value of -13 is interpreted as an unsigned int with the same bit pattern, with value 0xFFFFFFF3 because the int on your platform has 32 bits and negative values are represented in 2s complement. The string produced by printf is fffffff3. This behavior is not actually guaranteed by the C Standard.
In the second example, b[0] is an unsigned char with value 243 (0xf3). Its promotion to int preserves the value and the int passed to printf is 243. The same undefined behavior is invoked as printf is passed an int instead of an unsigned int. In your particular case, the conversion performed by printf yields the same value, which printed as hex with at least 2 digits padded with leading 0s gives f3.
To avoid this ambiguity, you should either cast the operand to unsigned char (and then to unsigned to match %x):
printf("%02x", (unsigned)(unsigned char)a[0]);
Or specify its actual type as unsigned char:
printf("%02hhx", (unsigned char)a[0]);
(Type char is signed, sizeof(int) is 4, 8 bits per byte.)
Both operands, a[0] and a[1], are promoted to int because of the integer promotions:
a[0]^a[1];
Because a[1] is a signed type, char, the number 0xa5 actually represents a negative value of -91. The representation of the value -91 in type int is 0xffffffa5.
So the calculation becomes:
0x00000056 ^ 0xffffffa5
or in decimal:
86 ^ -91
The result of that operation is 0xfffffff3.
The unsigned char version of this calculation doesn't have this 'problem'.
I first convert an int32 number to a char[4] array, then convert the array back to int32 with (int *), but the number isn't the same as before:
unsigned int num = 2130706432;
unsigned int x;
unsigned char a[4];
a[0] = (num>>24) & 0xFF;
a[1] = (num>>16) & 0xFF;
a[2] = (num>>8) & 0xFF;
a[3] = num & 0xFF;
x = *(int *)a;
printf("%d\n", x);
the output is 127. And if I set num = 127, the output is 2130706432.
Does anyone have ideas?
Reverse the order of the a[] indexes, e.g., a[0] -> a[3]
I think you have the endianness in reverse.
Try this:
a[3] = (num>>24) & 0xFF;
a[2] = (num>>16) & 0xFF;
a[1] = (num>>8) & 0xFF;
a[0] = num & 0xFF;
To see what happens use
printf("%x\n", ...);
to print both the input and the output numbers.
Endian-independent way:
x = (a[0] << 24) | (a[1] << 16) | (a[2] << 8) | a[3];
This line is never going to work correctly on a little-endian machine:
x = *(int *)a;
You need to unpack the data before you print out the value.
Your code a[0] = (num>>24) & 0xFF; takes the most significant 8 bits from num and sticks them in the first byte of a. On little-endian machines the first byte of an int holds the least significant bits. That means that on little-endian machines, this code takes the most significant 8 bits and stores them where the least significant bits go, changing the value.
2130706432 is 0x7F000000 in hex, and 127 is 0x0000007F.
Also, x = *(int *)a; results in undefined behavior. Consider hardware where reading an int from an improperly aligned address causes a bus error. If a doesn't happen to be aligned properly for an int then the program would crash.
A correct approach to interpreting the bytes as an int would be memcpy(&x, a, sizeof x); (include <string.h> for memcpy).
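To illustrate both points with the values from the question (a sketch; the shift-based unpack is the endian-independent one from the other answer):
#include <stdio.h>
#include <string.h>
int main(void) {
    unsigned int num = 2130706432;   /* 0x7F000000 */
    unsigned char a[4];
    a[0] = (num >> 24) & 0xFF;
    a[1] = (num >> 16) & 0xFF;
    a[2] = (num >> 8) & 0xFF;
    a[3] = num & 0xFF;
    /* Unpacking with shifts does not depend on the machine's byte order. */
    unsigned int x = (unsigned int)a[0] << 24 | (unsigned int)a[1] << 16
                   | (unsigned int)a[2] << 8  | a[3];
    printf("unpacked: %u\n", x);           /* 2130706432 */
    /* memcpy reinterprets the raw bytes without the alignment/aliasing problems
       of *(int *)a, but on a little-endian machine it still gives the swapped value. */
    unsigned int y;
    memcpy(&y, a, sizeof y);
    printf("raw bytes as int: %u\n", y);   /* 127 on a little-endian machine */
    return 0;
}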
I have an unsigned char array with 2 elements that represents a signed integer. How can I convert these 2 bytes into a signed integer?
Edit: The unsigned char array is in little endian
For maximum safety, use
int i = *(signed char *)(&c[0]);
i *= 1 << CHAR_BIT;
i |= c[1];
for big endian. Swap c[0] and c[1] for little endian.
(Explanation: we interpret the byte at c[0] as a signed char, then arithmetically left shift it in a portable way (CHAR_BIT is defined in <limits.h>), then OR in c[1].)
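Wrapped up as a function it might look like this (a sketch; the function name is mine, and the bytes are taken in the little-endian order from the edit):
#include <stdio.h>
#include <limits.h>
int from_bytes_le(const unsigned char c[2]) {
    int i = *(const signed char *)&c[1];   /* high byte, read as signed */
    i *= 1 << CHAR_BIT;                    /* portable arithmetic left shift */
    i |= c[0];                             /* bring in the low byte */
    return i;
}
int main(void) {
    unsigned char c[2] = { 0x18, 0xFC };   /* -1000 stored little-endian */
    printf("%d\n", from_bytes_le(c));      /* prints -1000 */
    return 0;
}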
wrap them up in a union:
union {
unsigned char a[2];
int16_t smt;
} number;
Now, after filling the array, you can read the value as number.smt.
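A minimal usage sketch (int16_t needs <stdint.h>; like the union trick itself, the result depends on the machine's byte order):
#include <stdio.h>
#include <stdint.h>
int main(void) {
    union {
        unsigned char a[2];
        int16_t smt;
    } number;
    number.a[0] = 0x18;   /* low byte first, matching a little-endian machine */
    number.a[1] = 0xFC;
    printf("%d\n", (int)number.smt);   /* -1000 on a little-endian machine */
    return 0;
}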
It depends on endianness.
Something for big endian:
unsigned char x[2];
short y = (x[0] << 8) | x[1];
Something for little endian:
unsigned char x[2];
short y = (x[1] << 8) | x[0];
The portable solution:
unsigned char c[2];
long tmp;
int result;
tmp = (long)c[0] << 8 | c[1];
if (tmp < 32768)
result = tmp;
else
result = tmp - 65536;
This assumes that the bytes in the array represent a 16 bit, two's complement, big endian signed integer. If they are a little endian integer, just swap c[1] and c[0].
(In the highly unlikely case that it is ones' complement, use 65535 instead of 65536 as the value to subtract. Sign-magnitude is left as an exercise for the reader ;)
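As a usage sketch (the function name is mine), with c[0] and c[1] swapped for the little-endian layout mentioned in the edit:
#include <stdio.h>
int from_le16(const unsigned char c[2]) {
    long tmp = (long)c[1] << 8 | c[0];   /* c[1] is the high byte here */
    return tmp < 32768 ? (int)tmp : (int)(tmp - 65536);
}
int main(void) {
    unsigned char c[2] = { 0x18, 0xFC };   /* 0xFC18 = -1000, little-endian */
    printf("%d\n", from_le16(c));          /* prints -1000 */
    return 0;
}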