Can anyone explain the difference between unsigned char and char in an XOR operation?
#include <stdio.h>
int main() {
char a[2] = { 0x56, 0xa5 }; // a[0] 0101 0110
// a[1] 1010 0101
a[0] = a[0] ^ a[1]; // a[0] 1111 0011 f3
printf("%02x", a[0]);
puts("");
unsigned char b[2] = { 0x56, 0xa5 }; // b[0] 0101 0110
// b[1] 1010 0101
b[0] = b[0] ^ b[1]; // b[0] 1111 0011 f3
printf("%02x", b[0]);
puts("");
}
result:
fffffff3
f3
Another example:
#include <stdio.h>
int main() {
char a[2] = { 0x01, 0x0a };
a[0] = a[0] ^ a[1];
printf("%02x", a[0]);
puts("");
unsigned char b[2] = { 0x01, 0x0a };
b[0] = b[0] ^ b[1];
printf("%02x", b[0]);
puts("");
}
result:
0b
0b
In the first case, your code
printf("%02x", a[0]);
passes a char value to the variadic function printf. The char value is promoted to int and passed as such. The value of a[0] is -13 because the char type happens to be signed by default in your environment. The promotion to int preserves the value, and printf receives it as an int.
The format %02x expects an unsigned int value. printf was passed an int instead, an incorrect type, which invokes undefined behavior. Since int and unsigned int use the same parameter passing convention on your platform, the negative value -13 is interpreted as an unsigned int with the same bit pattern, 0xFFFFFFF3, because int on your platform has 32 bits and negative values are represented in two's complement. The string produced by printf is fffffff3. This behavior is not guaranteed by the C Standard.
In the second example, b[0] is an unsigned char with value 243 (0xf3). Its promotion to int preserves the value and the int passed to printf is 243. The same undefined behavior is invoked as printf is passed an int instead of an unsigned int. In your particular case, the conversion performed by printf yields the same value, which printed as hex with at least 2 digits padded with leading 0s gives f3.
To avoid this ambiguity, you should either cast the operand as unsigned char:
printf("%02x", (unsigned)(unsigned char)a[0]);
Or specify its actual type as unsigned char:
printf("%02hhx", (unsigned char)a[0]);
(Type char is signed, sizeof(int) is 4, 8 bits per byte.)
Both operands of ^ are promoted to int because of the integer promotions:
a[0]^a[1];
Because a[1] is a signed type, char, the number 0xa5 actually represents a negative value of -91. The representation of the value -91 in type int is 0xffffffa5.
So the calculation becomes:
0x00000056 ^ 0xffffffa5
or in decimal:
86 ^ -91
The result of that operation is 0xfffffff3.
The unsigned char version of this calculation doesn't have this 'problem'.
I accidentally used "%d" to print an unsigned integer using an online compiler. I thought errors would pop up, but my program runs successfully. It's good that my code works, but I don't understand why.
#include <stdio.h>
int main() {
unsigned int x = 1;
printf( "%d", x);
return 0;
}
The value of the "unsigned integer" was small enough that the MSB (most significant bit) was not set. If it were, printf() would have treated the value as a "negative signed integer" value.
#include <stdio.h>
#include <stdint.h>
int main() {
uint32_t x = 0x5;
uint32_t y = 0xC0000000;
printf( "%d %u %d\n", x, y, y );
return 0;
}
5 3221225472 -1073741824
You can see the difference.
Modern compilers can check printf format specifiers against the types of the arguments that follow, so the online compiler may or may not have been configured to report this type mismatch as a warning. This is something you may want to look into.
Refer to the printf() manual, which says:
A character that specifies the type of conversion to be applied. The
conversion specifiers and their meanings are:
d, i
The int argument is
converted to signed decimal notation. The precision, if any, gives the
minimum number of digits that must appear; if the converted value
requires fewer digits, it is padded on the left with zeros. The
default precision is 1. When 0 is printed with an explicit precision
0, the output is empty.
So even if the argument holds an unsigned value, printf interprets it as a signed int and prints it accordingly. See the following code example:
#include <stdio.h>
int main(){
signed int x1 = -2147483648;
unsigned int x2 = -2147483648;
unsigned long long x3 = -2147483648;
printf("signed int x1 = %d\n", x1);
printf("unsigned int x2 = %d\n", x2);
printf("signed long long x3 = %d\n", x3);
}
and this is the output:
signed int x1 = -2147483648
unsigned int x2 = -2147483648
signed long long x3 = -2147483648
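So whatever the type of the variable, when you specify %d as the format specifier, printf interprets the argument as a signed int and prints it that way. Strictly speaking, passing an argument whose type does not match the specifier is undefined behavior, so this output is not guaranteed, least of all for a wider type such as unsigned long long.
In the case of unsigned char, for example: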
#include <stdio.h>
int main(){
unsigned char s = -10;
printf("s = %d",s);
}
the output is :
s = 246
since the binary representation of unsigned char s = -10 is:
1111 0110
where the MSB is 1, but when it is promoted to int, the new representation is:
0000 0000 0000 0000 0000 0000 1111 0110
so the MSB, which indicates whether the number is positive or negative, is no longer 1.
I need to convert 4 bytes of data in the format below back to the original int value. I cannot change the assignment of the int to the 4 bytes below.
#include <stdio.h>
int main() {
//code
int num = 1000;
char a[4];
a[0] = ( char )(num>>24) ;
a[1] = ( char )(num>>16) ;
a[2] = ( char )(num>>8) ;
a[3] = ( char )num ;
printf("Original number is:%d\n", (a[0] << 24 | a[1] << 16 | a[2] << 8 | a[3] ) );
return 0;
}
I was expecting the output to be 1000, but the output is 768. How do we restore the original number from the above byte array? Is this an endianness issue?
a[0] = ( char )(num>>24) ;
That works “okay” in this example. However, in situations where num is negative, the result is implementation-defined (C 2018 6.5.7 5).
In the remaining assignments to a[1], a[2], and a[3], values that may exceed the range of char will be converted to char automatically. If char is signed, the results of these conversions are implementation-defined or a signal is raised (6.3.1.3 3). So that is a problem we will have to fix, below.
First, for num = 1000, let’s suppose that −24 is stored in a[3]. This is the result we would get by taking the low eight bits of 1000 and putting them in an eight-bit two’s complement char, which is likely what your implementation uses. Then, we have a[0] = 0, a[1] = 0, a[2] = 3, and a[3] = −24.
Now let’s consider a[0] << 24 | a[1] << 16 | a[2] << 8 | a[3].
a[0] << 24 and a[1] << 16 both yield 0. a[2] << 8 is 3 << 8, which produces 768, or 300 in hexadecimal. a[3] is −24. While a[3] is a char, it is promoted to an int when used in an expression (6.3.1.1 2). Still assuming your C implementation uses two’s complement, the binary for −24 is 11111111111111111111111111101000, or ffffffe8 in hexadecimal.
When we bitwise OR 300 and ffffffe8, the result is ffffffe8, which, in a 32-bit two’s complement int, is −24.
The easiest way to fix this is to change char a[4]; to unsigned char a[4];. That avoids any negative char values.
However, to make your code completely work for any value of int (assuming it is four bytes and two’s complement), we need to make some other changes:
unsigned char a[4];
/* Convert the signed num to unsigned before shifting.
Shifts of unsigned values are better defined than shifts
of signed values.
*/
a[0] = (unsigned) num >> 24;
a[1] = (unsigned) num >> 16;
a[2] = (unsigned) num >> 8;
a[3] = (unsigned) num;
/* The cast in the last assignment is not really needed since
we are assigning to an unsigned char, and it will be converted
as desired, but we keep it for uniformity.
*/
// Reconstruct the value using all unsigned values.
unsigned u = (unsigned) a[0] << 24 | (unsigned) a[1] << 16 | (unsigned) a[2] << 8 | a[3];
/* Copy the bits into an int. (Include <string.h> to get memcpy.)
Note: It is easy to go from signed to unsigned because the C standard
completely defines that conversion. For unsigned to signed, the
conversion is not completely defined, so we have to use some indirect
method to get the bits into an int.
*/
int i;
memcpy(&i, &u, sizeof i);
printf("Original number: %d.\n", i);
We need to use an unsigned value to reconstruct the bits because C’s shift operators are not well defined for signed values, especially when we want to shift a bit into the sign bit. Once we have the bits in the unsigned object, we can copy them into an int.
I'm confused. Why does a give me 0xFFFFFFA0 while b gives me 0xA0 in this program? It's weird.
#include <stdio.h>
int main()
{
char a = 0xA0;
int b = 0xA0;
printf("a = %x\n", a);
printf("b = %x\n", b);
}
In char a = 0xA0;, the type of a is signed by default (on this platform). With any signed data type, whether char or int, you must be careful about the sign bit: if it is set, the number is negative and is stored in two's complement form.
char a = 0xA0; /* only 1 byte for a but since sign bit is set,
it gets copied into remaining bytes also */
a => 1010 0000
|
this sign bit gets copied
1111 1111 1111 1111 1111 1111 1010 0000
f f f f f f A 0
In the case of int b = 0xA0;, the sign bit (bit 31) is 0, so whatever it contains, i.e. 0xA0, is printed.
Let's take this step-by-step.
char a = 0xA0;
0xA0 is an integer constant with a value of 160 and type int.
In OP's case, a char is encoded like a signed char with an 8-bit range. 160 is more than the maximum 8-bit char value, and assigning an out-of-range value to a signed integer type is implementation-defined behavior. In OP's case, the value "wrapped around" and a took on the value 160 - 256, or -96.
// Try
printf("a = %d\n", a);
With printf() (a variadic function), char a is passed to the ... part and so goes through the usual integer promotions to an int and retains the same value.
printf("a = %x\n", a);
// is just like
printf("a = %x\n", -96);
printf("a = %x\n", a);
With printf(), "%x" expects an unsigned or an int with a value in the non-negative range. An int with the value -96 is neither, and so the behavior is undefined.
A typical outcome of the undefined behavior in this case is to interpret the passed bit pattern as an unsigned. The bit pattern of int -96, as a 32-bit int, is 0xFFFFFFA0.
Moral of the story:
Enable all compiler warnings. A good compiler would warn about both char a = 0xA0; and printf("a = %x\n", a);
Do not rely on undefined behavior.
I have the following code:
char temp[] = { 0xAE, 0xFF };
printf("%X\n", temp[0]);
Why is the output FFFFFFAE, not just AE?
I tried
printf("%X\n", 0b10101110);
And output is correct: AE.
Suggestions?
The answer you're getting, FFFFFFAE, is a result of the char data type being signed. If you check the value, you'll notice that it's equal to -82, where -82 + 256 = 174, or 0xAE in hexadecimal.
The reason you get the correct output when you print 0b10101110 or even 174 is that you're using the literal values directly, and they have type int. In your example you first store the 0xAE value in a signed char, where the value wraps modulo 256 into the range -128 to 127, if you want to think of it that way.
So in other words:
assigned = stored in signed char = printed with %X
0 = 0 = 0x00
127 = 127 = 0x7F
128 = -128 = 0xFFFFFF80
129 = -127 = 0xFFFFFF81
174 = -82 = 0xFFFFFFAE
255 = -1 = 0xFFFFFFFF
256 = 0 = 0x00
To fix this "problem", you could declare the same array you initially did, just make sure to use an unsigned char type array and your values should print as you expect.
#include <stdio.h>
#include <stdlib.h>
int main()
{
unsigned char temp[] = { 0xAE, 0xFF };
printf("%X\n", temp[0]);
printf("%d\n\n", temp[0]);
printf("%X\n", temp[1]);
printf("%d\n\n", temp[1]);
return EXIT_SUCCESS;
}
Output:
AE
174
FF
255
https://linux.die.net/man/3/printf
According to the man page, %x or %X accepts an unsigned int. Thus it will typically read 4 bytes from the stack (on a platform with a 32-bit int).
In any case, on most architectures you can't pass a parameter that is smaller than a word (i.e. int or long), so in your case it will be converted to int.
In the first case, you're passing a char, so it is converted to int. Both are signed, so the conversion sign-extends the value, which is why you see the leading FFs.
In your second example, you're actually passing an int all the way, so no conversion is performed.
If you'd try:
printf("%X\n", (char) 0b10101110);
You'd see that FFFFFFAE will be printed.
When you pass a data type smaller than int (as char is) to a variadic function (as printf(3) is), the argument is promoted to int, whether it is signed or unsigned, as long as int can represent all of its values (which is true for char on typical platforms). What you observe is sign extension: because the most significant bit of the char variable is set, it is replicated into the three bytes needed to complete an int.
To solve this and to have the data in 8 bits, you have two possibilities:
Allow your signed char to convert to an int (with sign extension), then mask off bits 8 and above.
printf("%X\n", (int) my_char & 0xff);
Declare your variable as unsigned, so it is promoted to an int with a non-negative value and no sign extension takes place.
unsigned char my_char;
...
printf("%X\n", my_char);
This code causes undefined behaviour. The argument to %X must have type unsigned int, but you supply char.
Undefined behaviour means that anything can happen; including, but not limited to, extra F's appearing in the output.
In the following code:
#include <stdio.h>
signed char a= 0x80;
unsigned char b= 0x01;
int main (void)
{
if(b*a>1)
printf("promoted\n");
else if (b*a<1)
printf("why doesnt promotion work?");
while(1);
}
I expected "promoted" to be printed, but it isn't. Why?
If I change the datatypes to signed and unsigned int, and have a as a negative number, e.g. 0x80000000, and b as a positive number, 0x01, "promoted" gets printed as expected.
Please help me understand what the problem is!
You've just been caught by the messy type-promotion rules of C.
In C, operands of integer type narrower than int are automatically promoted to int.
So you have:
0x80 * 0x01 = -128 * 1
0x80 gets sign-extended to type int:
0xffffff80 * 0x00000001 = -128 * 1 = -128
So the result is -128 and thus is less than 1.
When you use type int and unsigned int, the int operand is converted to unsigned int (the usual arithmetic conversions). 0x80000000 * 0x01 = 0x80000000, which as an unsigned integer is bigger than 1.
So here's the side-by-side comparison of the type promotion that's taking place:
(signed char) * (unsigned char) -> int
(signed int ) * (unsigned int ) -> unsigned int
(signed char)0x80 * (unsigned char)0x01 -> (int) 0xffffff80
(signed int )0x80000000 * (unsigned int )0x01 -> (unsigned int)0x80000000
(int) 0xffffff80 is negative -> prints "why doesnt promotion work?"
(unsigned int)0x80000000 is positive -> prints "promoted"
Here's a reference to the type-promotion rules of C.
The reason printf("promoted\n"); never runs is that b*a is always equal to -128, which is less than 1:
a b
0x80 * 0x01 = -128 * 1