So I'm trying to interpret the following output:
short int v = -12345;
unsigned short uv = (unsigned short) v;
printf("v = %d, uv = %u\n", v, uv);
Output:
v = -12345, uv = 53191
So the question is: why is this exact output generated when this program is run on a two's complement machine?
What operations lead to this result when casting the value to unsigned short?
My answer assumes 16-bit two's complement arithmetic.
To find the value of -12345, take 12345, complement it, and add 1.
12345 is 0x3039, which is 0011000000111001 in binary.
Complementing means changing all the 1's to 0's and all the 0's to 1's:
1100111111000110, which is 0xcfc6, or 53190.
Add one: 53191.
So internally, -12345 is represented by 0xcfc7 = 53191.
But if you interpret it as an unsigned number, it's obviously just 53191. (And when you assign a signed value to an unsigned integer of the same size, what typically ends up happening is that you assign the exact bit pattern, without converting anything. Later, however, you will typically interpret that value differently, such as when you print it with %u.)
Another, perhaps easier way to think about this is that 16-bit arithmetic "wraps around" at 2^16 = 65536. So you can think of 65536 as being another name for 0 (just like 0:00 and 24:00 are both names for midnight). So -12345 is 65536 - 12345 = 53191.
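Here's a minimal sketch that makes both views visible at once, assuming 16-bit short and two's complement (it prints the shared bit pattern in hex):
#include <stdio.h>

int main(void)
{
    short v = -12345;
    unsigned short uv = (unsigned short) v;
    printf("bits = 0x%hx\n", uv);                  // 0xcfc7
    printf("v = %d, uv = %u\n", v, uv);            // -12345, 53191
    printf("65536 - 12345 = %d\n", 65536 - 12345); // 53191
    return 0;
}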
The conversion rule defined by the C standard for converting a signed integer to an unsigned integer requires repeatedly adding TYPE_MAX + 1 to the value until it is in range.
From 6.3.1.3 Signed and unsigned integers:
Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
With USHRT_MAX at 65535, a single addition suffices: 65535 + 1 + -12345 = 53191.
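As a sketch, the rule can even be applied literally with a loop (assuming a 16-bit unsigned short, so USHRT_MAX + 1 is 65536):
#include <limits.h>
#include <stdio.h>

int main(void)
{
    long value = -12345;
    while (value < 0)                  // "repeatedly adding ... until in range"
        value += (long) USHRT_MAX + 1; // one addition of 65536 is enough here
    printf("%ld\n", value);            // 53191
    return 0;
}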
The output seen does not depend on two's complement, nor on whether int is 16 or 32 bits. It is entirely defined and would be the same even on a rare ones' complement or sign-magnitude machine. The result does, however, depend on unsigned short being 16 bits.
-12345 is within the minimum required range of a short, so there is no issue with that assignment. When a short is passed as a ... argument, it goes through the usual promotion to int with no change in value, since all short values are in the range of int. "%d" expects an int, so the output is "-12345".
short int v = -12345;
printf("%d\n", v); // output "-12345\n"
Assigning a negative number to an unsigned type is well defined. With a 16-bit unsigned short, the value of uv is -12345 plus the minimum number of multiples of USHRT_MAX + 1 (65536 in this case) needed to bring it into range, for a final value of 53191.
When an unsigned short is passed as a ... argument, the value is converted to int or unsigned, whichever type contains the entire range of unsigned short. In either case, the value does not change. "%u" matches an unsigned. It also matches an int whose value is expressible as either an int or an unsigned.
short int v = -12345;
unsigned short uv = (unsigned short) v;
printf("%u\n", v); // output "53191\n"
What operations lead to this result when casting the value to unsigned short?
The casting did not affect the final outcome. The same result would have occurred without the cast. The cast may be useful to quiet warnings.
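For instance, in a sketch like the following, both assignments are defined to produce the same value; the cast only suppresses a possible conversion warning:
short v = -12345;
unsigned short a = (unsigned short) v; // explicit cast: a == 53191
unsigned short b = v;                  // same conversion applies: b == 53191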
Related
I have a strange problem. I have a variable whose actual value is only negative (only negative integers are generated for this variable). But in the legacy code, an old colleague of mine used uint16 instead of signed int to store the values of that variable. Now if I want to print the actual negative value of that variable, how can I do that (what format specifier to use)? For example, if the actual value is -75, when I print it using %d it gives me a 5-digit positive value (I think it's because of two's complement). I want to print it as 75 or -75.
If a 16-bit negative integer was stored in a uint16_t called x, then the original value may be calculated as x-65536.
This can be printed with any of¹:
printf("%ld", x-65536L);
printf("%d", (int) (x-65536));
int y = x-65536;
printf("%d", y);
Subtracting 65536 works because:
Per C 2018 6.5.16.1 2, the value of the right operand (a negative 16-bit integer) is converted to the type of the assignment expression (which is essentially the type of the left operand).
Per 6.3.1.3 2, the conversion to an unsigned integer operates by adding or subtracting “one more than the maximum value that can be represented in the new type”. For uint16_t, one more than its maximum is 65536.
With a 16-bit negative integer, adding 65536 once brings it into the range of a uint16_t.
Therefore, subtracting 65536 restores the original value.
Footnote
¹ 65536 will be long or int according to whether int is 16 bits or more, so these statements are careful to handle the type correctly. The first uses 65536L to ensure the type is long. The rest convert the value to int. This is safe because, although the type of x-65536 could be long, its value fits in an int (unless you are executing in a C implementation that limits int to −32767 to +32767 and the original value may be −32768, in which case you should stick to the first option).
If an int is 32 bits on your system, then the correct format for printing a 16-bit int is %hu for unsigned and %hd for signed.
Examine the output of:
uint16_t n = (uint16_t)-75;
printf("%d\n", (int)n); // Output: 65461
printf("%hu\n", n); // Output: 65461
printf("%hd\n", n); // Output: -75
#include <inttypes.h>
uint16_t foo = -75;
printf("==> %" PRId16 " <==\n", foo); // type mismatch, undefined behavior
Assuming your friend has somehow correctly put the bit representation for the signed integer into the unsigned integer, the only standard-compliant way to extract it back would be to use a union, as:
typedef union {
    uint16_t input;
    int16_t output;
} aliaser;
Now, in your code you can do:
aliaser x;
x.input = foo;
printf("%d", x.output);
Go ahead with the union, as explained by another answer. Just wanted to say that if you are using GCC, there's a very cool feature that allows you to do that sort of "bitwise" casting without writing much code:
printf("%d", ((union {uint16_t in; int16_t out}) foo).out);
See https://gcc.gnu.org/onlinedocs/gcc-4.4.1/gcc/Cast-to-Union.html.
If u is an object of unsigned integer type and a negative number whose magnitude is within range of u's type is stored into it, storing -u to an object of u's type will leave it holding the magnitude of that negative number. This behavior does not depend upon how u is represented. For example, if u and v are 16-bit unsigned short, but int is 32 bits, then storing -60000 into u will leave it holding 5536 [the implementation will behave as though it adds 65536 to the value stored until it's within range of unsigned short]. Evaluating -u will yield -5536, and storing -5536 into v will leave it holding 60000.
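A sketch of that example, assuming 16-bit unsigned short and 32-bit int:
#include <stdio.h>

int main(void)
{
    unsigned short u = -60000; // 65536 added once: u holds 5536
    int neg = -u;              // u promotes to the int 5536, so neg is -5536
    unsigned short v = neg;    // 65536 added once: v holds 60000
    printf("u = %u, -u = %d, v = %u\n", (unsigned) u, neg, (unsigned) v);
    return 0;
}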
A short int in C contains 16 bits, and the first bit represents whether the value is negative or positive. I have a C program that is as follows:
int main() {
    short int v;
    unsigned short int uv;
    v = -20000;
    uv = v;
    printf("\nuv = %hu\n", uv);
    return 0;
}
Since the value of v is negative, I know the first bit of the variable is a 1. So I expect the output of the program to be uv = 52,768 because 20,000 + (2^15) = 52,768.
Instead I am getting uv = 45536 as the output. What part of my logic is incorrect?
The behavior you're seeing can be explained by the conversion rules of C:
6.3.1.3 Signed and unsigned integers
1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
(This quote is from C99.)
-20000 can't be represented by an unsigned short because it's negative. The target type is unsigned, so the value is converted by repeatedly adding 65536 (which is USHRT_MAX + 1) until it is in range: -20000 + 65536 is exactly 45536.
Note that this behavior is mandated by the C standard and has nothing to do with how negative numbers are actually represented in memory (in particular, it works the same way even on machines using sign/magnitude or ones' complement).
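A short sketch confirming the arithmetic (note that no bit-level reasoning is needed):
#include <stdio.h>

int main(void)
{
    short v = -20000;
    unsigned short uv = v;          // 6.3.1.3: -20000 + 65536
    printf("uv = %hu\n", uv);       // 45536
    printf("%d\n", -20000 + 65536); // 45536, the same pure arithmetic
    return 0;
}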
I am trying to convert 65529 from an unsigned int to a signed int. I tried doing a cast like this:
unsigned int x = 65529;
int y = (int) x;
But y is still returning 65529 when it should return -7. Why is that?
It seems like you are expecting int and unsigned int to be a 16-bit integer. That's apparently not the case. Most likely, it's a 32-bit integer - which is large enough to avoid the wrap-around that you're expecting.
Note that there is no fully C-compliant way to do this because casting between signed/unsigned for values out of range is implementation-defined. But this will still work in most cases:
unsigned int x = 65529;
int y = (short) x; // If short is a 16-bit integer.
or alternatively:
unsigned int x = 65529;
int y = (int16_t) x; // This is defined in <stdint.h>
I know it's an old question, but it's a good one, so how about this?
unsigned short int x = 65529U;
short int y = *(short int*)&x;
printf("%d\n", y);
This works because we are casting the address of x to the signed version of its type, which is permitted by the C standard. Not all type punning like this (most of it, in fact) is legal. The standard says this:
An object shall have its stored value accessed only by an lvalue that has one of the following types:
the declared type of the object,
a qualified version of the declared type of the object,
a type that is the signed or unsigned type corresponding to the declared type of the object,
a type that is the signed or unsigned type corresponding to a qualified version of the declared type of the object,
an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union),
a character type.
So, alas, since we are accessing the bits of x as if they were a signed short (via the pointer), the actual conversion operation is replaced by reading what appears to be just a negative signed short, and the conversion takes place without issue. However, it's possible for this to misbehave on a ones' complement machine, but those are so rare and so obsolete that I wouldn't even bother looking out for them.
@Mysticial got it. A short is usually 16-bit and will illustrate the answer:
#include <stdio.h>

int main()
{
    unsigned int x = 65529;
    int y = (int) x;
    printf("%d\n", y);

    unsigned short z = 65529;
    short zz = (short) z;
    printf("%d\n", zz);
    return 0;
}
65529
-7
Press any key to continue . . .
A little more detail. It's all about how signed numbers are stored in memory. Do a search for twos-complement notation for more detail, but here are the basics.
So let's look at 65529 decimal. It can be represented as FFF9h in hexadecimal. We can also represent that in binary as:
11111111 11111001
When we execute short zz = (short)z;, the compiler interprets that bit pattern as a signed value. In twos-complement notation, the top bit signifies whether a signed value is positive or negative. In this case, you can see the top bit is a 1, so it is treated as a negative number. That's why it prints out -7.
For an unsigned short, we don't care about sign since it's unsigned. So when we print it out using %d, we use all 16 bits, so it's interpreted as 65529.
To understand why, you need to know that the CPU represents signed numbers using two's complement (maybe not all CPUs, but most do). Using int8_t from <stdint.h> in place of the non-standard byte:
int8_t n = 1; // 0000 0001 = 1
n = ~n + 1;   // 1111 1110 + 0000 0001 = 1111 1111 = -1
And also, the types int and unsigned int can be of different sizes depending on your CPU. When doing specific stuff like this:
#include <stdint.h>
int8_t ibyte;
uint8_t ubyte;
int16_t iword;
//......
The representations of the values 65529u and -7 are identical for 16-bit ints. Only the interpretation of the bits is different.
For larger ints and these values, you need to sign-extend; one way is with logical operations:
int y = (int)(x | 0xffff0000u); // assumes 16-to-32 extension, x > 32767
If speed is not an issue, or if divide is fast on your processor:
int y = ((int)(x * 65536u)) / 65536;
The multiply shifts left 16 bits (again, assuming 16-to-32 extension), and the divide shifts right while maintaining the sign.
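Here is a sketch of both approaches side by side, assuming 32-bit int and x holding a 16-bit value with bit 15 set (the conversions back to int are implementation-defined, but behave as shown on two's complement machines):
#include <stdio.h>

int main(void)
{
    unsigned int x = 65529;
    int y1 = (int)(x | 0xffff0000u);      // OR in the sign bits
    int y2 = ((int)(x * 65536u)) / 65536; // shift up, then signed shift down
    printf("%d %d\n", y1, y2);            // -7 -7
    return 0;
}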
You are expecting that your int type is 16 bits wide, in which case you'd indeed get a negative value. But most likely it's 32 bits wide, so a signed int can represent 65529 just fine. You can check this by printing sizeof(int).
To answer the question posted in the comment above - try something like this:
unsigned short int x = 65529U;
short int y = (short int)x;
printf("%d\n", y);
or
unsigned short int x = 65529U;
short int y = 0;
memcpy(&y, &x, sizeof(short int)); // needs <string.h>
printf("%d\n", y);
Since unsigned values are used to represent positive numbers, the conversion can be done by setting the most significant bit to 0, so that a program will not interpret the result as a two's complement value. One caveat is that this loses information for numbers near the max of the unsigned type.
template <typename TUnsigned, typename TSigned>
TSigned UnsignedToSigned(TUnsigned val)
{
    return val & ~(1 << ((sizeof(TUnsigned) * 8) - 1));
}
I know this is an old question, but I think the responders may have misinterpreted it. I think what was intended was to convert a 16-bit sequence received as an unsigned integer (technically, an unsigned short) into a signed integer. This might happen (it recently did to me) when you need to convert something received from a network from network byte order to host byte order. In that case, use a union:
unsigned short value_from_network;
unsigned short host_val = ntohs(value_from_network); // ntohs is from <arpa/inet.h>
// Now suppose host_val is 65529.
union SignedUnsigned {
    short s_int;
    unsigned short us_int;
};
union SignedUnsigned su;
su.us_int = host_val;
short minus_seven = su.s_int;
And now minus_seven has the value -7.
#include <stdio.h>

void fun3(int a, int b, int c)
{
    printf("%d \n", a+b+c);
}

void fun2(int x, int y)
{
    fun3(0x33333333, 0x30303030, 0x31313131);
    printf("%d \n", x+y);
}

void fun1(int x)
{
    fun2(0x22222222, 0x20202020);
    printf("%d \n", x);
}

int main(void)
{
    fun1(0x1111111);
}
I'm going through the above program to study stack corruption. I am getting output from the above program with some undesired values. All I could understand is that if the added value goes beyond 0xFFFFFFFF, then a small negative integer becomes the largest value, say -1 becomes 0xFFFFFFFF. Any insights on this?
EDIT (Corrections): I missed the point. My answer is right for constants, but the question involves function parameters, so what happens here is overflow of signed integer objects and, as correctly pointed out by @Cornstalks in his comment, this is undefined behaviour.
/EDIT
In fun1() you are using printf() in the wrong way.
You wrote "%d" to accept an int, but this is not true if your number is greater than INT_MAX.
You have to check the value of INT_MAX on your system.
If you write an integer constant in hexadecimal format, standard C (ISO C99 or C11) gives it the first type in which the value fits, following this order:
int, unsigned int, long int, unsigned long int, long long int,
unsigned long long int.
Thus, if you have a constant greater than INT_MAX (the maximum value in the range of int), your constant (if positive) has type unsigned int, but the directive %d expects a signed int value, so some negative number will be shown.
Worse, if your constant is greater than UINT_MAX (the maximum value in the range of unsigned int), then the type of the constant will be the first of long int, unsigned long int, long long int with precision strictly greater than that of unsigned int.
This implies that %d becomes a wrong directive.
If you cannot be completely sure about how big your values will be, you can cast to the biggest integer type:
printf("%lld", (long long int) 0x33333333333);
The directive %lld stands for long long int.
If you are always interested in positive values, use %llu and cast to unsigned long long int:
printf("%llu", (unsigned long long int) 0x33333333333);
In this way, you avoid any "funny" numbers, and you can show big numbers without losing any precision.
Remark: The constants INT_MAX, UINT_MAX, and the like, are in limits.h.
Important: this automatic sequence of types applies only to octal and hexadecimal constants. For decimal constants there is another rule:
int, long int, long long int.
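A small sketch of the consequence, on a typical system with 32-bit int and 64-bit long long (0x33333333333 exceeds UINT_MAX, so it gets a wider type):
#include <stdio.h>

int main(void)
{
    printf("%zu\n", sizeof 0x33333333333);       // typically 8: the constant is not an int
    printf("%lld\n", (long long) 0x33333333333); // 3518437208883, printed correctly
    return 0;
}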
To @Cornstalks' point: INT_MIN is 0x80000000, and (int)-1 is 0xFFFFFFFF in 2's complement (on a 32-bit system, anyway).
This allows the instruction set to do things in signed arithmetic like:
1 + -2 = -1
becomes (as signed shorts, for brevity)
0x0001 + 0xFFFE = 0xFFFF
... then:
1 + -1 = 0
is represented internally with overflow as
0x0001 + 0xFFFF = 0x0000
Also to @Cornstalks' point: the internal representation (as well as overflow addition) is an implementation detail. C implementations (and instruction sets) need not represent integers in 2's complement, so providing hex values for signed integer types may tie you to a subset of C implementations.
fun3 will attempt to print the value 0x94949494. This is greater than the max 4-byte integer value of 0x7FFFFFFF, so it will "overflow" and (on virtually every computer made today) produce (if I did my arithmetic correctly) the negative number -0x6B6B6B6C, which is -1802201964.
fun1 and fun2 should print the "expected" positive results.
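A sketch of that arithmetic done in unsigned (which wraps by definition, avoiding the signed-overflow undefined behavior), then mapped back, assuming 32-bit int and two's complement:
#include <stdio.h>

int main(void)
{
    unsigned int sum = 0x33333333u + 0x30303030u + 0x31313131u;
    printf("0x%x\n", sum);                             // 0x94949494
    printf("%lld\n", (long long) sum - 0x100000000LL); // -1802201964
    return 0;
}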