Printing the actual negative number stored inside unsigned int - c

I have a strange problem. I have a variable whose actual value is only negative(only negative integers are generated for this variable). But in the legacy code, an old colleague of mine used uint16 instead of signed int to store the values of that variable. Now if i wanted to print the actual negative value of that variable, how can i do that(what format specifier to us)? For example if actual value is -75, when i print using %d its giving me 5 digit positive value(I think its because of two's complement). I want to print it as 75 or -75.

If a 16-bit negative integer was stored in a uint16_t called x, then the original value may be calculated as x-65536.
This can be printed with any of1:
printf("%ld", x-65536L);
printf("%d", (int) (x-65536));
int y = x-65536;
printf("%d", y);
Subtracting 65536 works because:
Per C 2018 6.5.16.1 2, the value of the right operand (a negative 16-bit integer is converted to the type of the assignment expression (which is essentially the type of the left operand).
Per 6.3.1.3 2, the conversion to an unsigned integer operates by adding or subtracting “one more than the maximum value that can be represented in the new type”. For uint16_t, one more than its maximum is 65536.
With a 16-bit negative integer, adding 65536 once brings it into the range of a uint16_t.
Therefore, subtracting 65536 restores the original value.
Footnote
1 65536 will be long or int according to whether int is 16 bits or more, so these statements are careful to handle the type correctly. The first uses 65536L to ensure the type is long. The rest convert the value to int. This is safe because, although the type of x-65536 could be long, its value fits in an int—unless you are executing in a C implementation that limits int to −32767 to +32767, and the original value may be −32768, in which case you should stick to the first option.

If an int is 32 bits on your system, then the correct format for printing a 16-bit int is %hu for unsigned and %hd for signed.
Examine the output of:
uint16_t n = (uint16_t)-75;
printf("%d\n", (int)n); // Output: 65461
printf("%hu\n", n); // Output: 65461
printf("%hd\n", n); // Output: -75

#include <inttypes.h>
uint16_t foo = -75;
printf("==> %" PRId16 " <==\n", foo); // type mismatch, undefined behavior

Assuming your friend has somehow correctly put the bit representation for the signed integer into the unsigned integer, the only standard compliant way to extract it back would be to use a union as-
typedef union {
uint16_t input;
int16_t output;
} aliaser;
Now, in your code you can do -
aliaser x;
x.input = foo;
printf("%d", x.output);

Go ahead with the union, as explained by another answer. Just wanted to say that if you are using GCC, there's a very cool feature that allows you to do that sort of "bitwise" casting without writing much code:
printf("%d", ((union {uint16_t in; int16_t out}) foo).out);
See https://gcc.gnu.org/onlinedocs/gcc-4.4.1/gcc/Cast-to-Union.html.

If u is an object of unsigned integer type and a negative number whose magnitude is within range of u's type is stored into it, storing -u to an object of u's type will leave it holding the magnitude of that negative number. This behavior does not depend upon how u is represented. For example, if u and v are 16-bit unsigned short, but int is 32 bits, then storing -60000 into u will leave it holding 5536 [the implementation will behave as though it adds 65536 to the value stored until it's within range of unsigned short]. Evaluating -u will yield -5536, and storing -5536 into v will leave it holding 60000.

Related

Output of a short int and a unsigned short?

So I'm trying to interpret the following output:
short int v = -12345;
unsigned short uv = (unsigned short) v;
printf("v = %d, uv = %u\n", v, uv);
Output:
v = -12345
uv = 53191
So the question is: why is this exact output generated when this program is run on a two's complement machine?
What operations lead to this result when casting the value to unsigned short?
My answer assumes 16-bit two's complement arithmetic.
To find the value of -12345, take 12345, complement it, and add 1.
12345 is 0x3039 is 0011000000111001.
Complementing means changing all the 1's to 0's and all the 0's to 1's:
1100111111000110 is 0xcfc6 is 53190.
Add one: 53191.
So internally, -12345 is represented by 0xcfc7 = 53191.
But if you interpret it as an unsigned number, it's obviously just 53191. (And when you assign a signed value to an unsigned integer of the same size, what typically ends up happening is that you assign the exact bit pattern, without converting anything. Later, however, you will typically interpret that value differently, such as when you print it with %u.)
Another, perhaps easier way to think about this is that 16-bit arithmetic "wraps around" at 216 = 65536. So you can think of 65536 as being another name for 0 (just like 0:00 and 24:00 are both names for midnight). So -12345 is 65536 - 12345 = 53191.
The conversion rules, when converting signed integer to an unsigned integer, defined by C standard requires by repeatedly adding the TYPE_MAX + 1 to the value.
From 6.3.1.3 Signed and unsigned integers:
Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
If USHRT_MAX is 65535 and then adding 65535 + 1 + -12345 is 53191.
The output seen does not depend on 2's complement nor 16 or 32- bit int. The output seen is entirely defined and would be the same on a rare 1's complement or sign-magnitude machine. The result does depend on 16-bit unsigned short
-12345 is within the minimum range of a short, so no issues with that assignment. When a short is passed as a ... argument, is goes thought the usual promotion to an int with no change in value as all short are in the range of int. "%d" expects an int, so the output is "-12345"
short int v = -12345;
printf("%d\n", v); // output "-12345\n"
Assigning a negative number to a unsigned type is well defined. With a 16-bit unsigned short, the value of uv is -12345 plus the minimum multiples of USHRT_MAX+1 (65536 in this case) to a final value of 53191.
Passing an unsigned short as an ... argument, the value is converted to int or unsigned, whichever type contains the entire range of unsigned short. IAC, the values does not change. "%u" matches an unsigned. It also matches an int whose values are expressible as either an int or unsigned.
short int v = -12345;
unsigned short uv = (unsigned short) v;
printf("%u\n", v); // output "53191\n"
What operations lead to this result when casting the value to unsigned short?
The casting did not affect the final outcome. The same result would have occurred without the cast. The cast may be useful to quiet warnings.

signed and unsigned integer in C

I have wrote this program as an exercise to understand how the signed and unsigned integer
work in C.
This code should print simply -9 the addition of -4+-5 stored in variable c
#include <stdio.h>
int main (void) {
unsigned int a=-4;
unsigned int b=-5;
unsigned int c=a+b;
printf("result is %u\n",c);
return 0;
}
When this code run it give me an unexpected result 4294967287.
I also have cast c from unsigned to signed integer printf ("result is %u\n",(int)c);
but also doesn't work.
please someone give explanation why the program doesn't give the exact result?
if this is an exercise in c and signed vs unsigned you should start by thinking - what does this mean?
unsigned int a=-4;
should it even compile? It seems like a contradiction.
Use a debugger to inspect the memory stored at he location of a. Do you think it will be the same in this case?
int a=-4;
Does the compiler do different things when its asked to add unsigned x to unsigned y as opposed to signed x and signed y. Ask the compiler to show you the machine code it generated in each case, read up what the instructions do
Explore investigate verify, you have the opportunity to get really interesting insights into how computers really work
You expect this:
printf("result is %u\n",c);
to print -9. That's impossible. c is of type unsigned int, and %u prints a value of type unsigned int (so good work using the right format string for the argument). An unsigned int object cannot store a negative value.
Going back a few line in your program:
unsigned int a=-4;
4 is of type (signed) int, and has the obvious value. Applying unary - to that value yields an int value of -4.
So far, so good.
Now what happens when you store this negative int value in an unsigned int object?
It's converted.
The language specifies what happens when you convert a signed int value to unsigned int: the value is adjusted to it's within the range of unsigned int. If unsigned int is 32 bits, this is done by adding or subtracting 232 as many times as necessary. In this case, the result is -4 + 232, or 4294967292. (That number makes a bit more sense if you show it in hexadecimal: 0xfffffffc.)
(The generated code isn't really going to repeatedly add or subtract 232; it's going to do whatever it needs to do to get the same result. The cool thing about using two's-complement to represent signed integers is that it doesn't have to do anything. The int value -4 and the unsigned int value 4294967292 have exactly the same bit representation. The rules are defined in terms of values, but they're designed so that they can be easily implemented using bitwise operations.)
Similarly, c will have the value -5 + 232, or 4294967291.
Now you add them together. The mathematical result is 8589934583, but that won't fit in an unsigned int. Using rules similar to those for conversion, the result is reduced to a value that's within the range of unsigned int, yielding 4294967287 (or, in hex, 0xfffffff7).
You also tried a cast:
printf ("result is %u\n",(int)c);
Here you're passing an int argument to printf, but you've told it (by using %u) to expect an unsigned int. You've also tried to convert a value that's too big to fit in an int -- and the unsigned-to-signed conversion rules do not define the result of such a conversion when the value is out of range. So don't do that.
That answer is precisely correct for 32-bit ints.
unsigned int a = -4;
sets a to the bit pattern 0xFFFFFFFC, which, interpreted as unsigned, is 4294967292 (232 - 4). Likewise, b is set to 232 - 5. When you add the two, you get 0x1FFFFFFF7 (8589934583), which is wider than 32 bits, so the extra bits are dropped, leaving 4294967287, which, as it happens, is 232 - 9. So if you had done this calculation on signed ints, you would have gotten exactly the same bit patterns, but printf would have rendered the answer as -9.
Using google, one finds the answer in two seconds..
http://en.wikipedia.org/wiki/Signedness
For example, 0xFFFFFFFF gives −1, but 0xFFFFFFFFU gives 4,294,967,295
for 32-bit code
Therefore, your 4294967287 is expected in this case.
However, what exactly do you mean by "cast from unsigned to signed does not work?"

is it safe to subtract between unsigned integers?

Following C code displays the result correctly, -1.
#include <stdio.h>
main()
{
unsigned x = 1;
unsigned y=x-2;
printf("%d", y );
}
But in general, is it always safe to do subtraction involving
unsigned integers?
The reason I ask the question is that I want to do some conditioning
as follows:
unsigned x = 1; // x was defined by someone else as unsigned,
// which I had better not to change.
for (int i=-5; i<5; i++){
if (x+i<0) continue
f(x+i); // f is a function
}
Is it safe to do so?
How are unsigned integers and signed integers different in
representing integers? Thanks!
1: Yes, it is safe to subtract unsigned integers. The definition of arithmetic on unsigned integers includes that if an out-of-range value would be generated, then that value should be adjusted modulo the maximum value for the type, plus one. (This definition is equivalent to truncating high bits).
Your posted code has a bug though: printf("%d", y); causes undefined behaviour because %d expects an int, but you supplied unsigned int. Use %u to correct this.
2: When you write x+i, the i is converted to unsigned. The result of the whole expression is a well-defined unsigned value. Since an unsigned can never be negative, your test will always fail.
You also need to be careful using relational operators because the same implicit conversion will occur. Before I give you a fix for the code in section 2, what do you want to pass to f when x is UINT_MAX or close to it? What is the prototype of f ?
3: Unsigned integers use a "pure binary" representation.
Signed integers have three options. Two can be considered obsolete; the most common one is two's complement. All options require that a positive signed integer value has the same representation as the equivalent unsigned integer value. In two's complement, a negative signed integer is represented the same as the unsigned integer generated by adding UINT_MAX+1, etc.
If you want to inspect the representation, then do unsigned char *p = (unsigned char *)&x; printf("%02X%02X%02X%02X", p[0], p[1], p[2], p[3]);, depending on how many bytes are needed on your system.
Its always safe to subtract unsigned as in
unsigned x = 1;
unsigned y=x-2;
y will take on the value of -1 mod (UINT_MAX + 1) or UINT_MAX.
Is it always safe to do subtraction, addition, multiplication, involving unsigned integers - no UB. The answer will always be the expected mathematical result modded by UINT_MAX+1.
But do not do printf("%d", y ); - that is UB. Instead printf("%u", y);
C11 §6.2.5 9 "A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type."
When unsigned and int are used in +, the int is converted to an unsigned. So x+i has an unsigned result and never is that sum < 0. Safe, but now if (x+i<0) continue is pointless. f(x+i); is safe, but need to see f() prototype to best explain what may happen.
Unsigned integers are always 0 to power(2,N)-1 and have well defined "overflow" results. Signed integers are 2's complement, 1's complement, or sign-magnitude and have UB on overflow. Some compilers take advantage of that and assume it never occurs when making optimized code.
Rather than really answering your questions directly, which has already been done, I'll make some broader observations that really go to the heart of your questions.
The first is that using unsigned in loop bounds where there's any chance that a signed value might crop up will eventually bite you. I've done it a bunch of times over 20 years and it has ultimately bit me every time. I'm now generally opposed to using unsigned for values that will be used for arithmetic (as opposed to being used as bitmasks and such) without an excellent justification. I have seen it cause too many problems when used, usually with the simple and appealing rationale that “in theory, this value is non-negative and I should use the most restrictive type possible”.
I understand that x, in your example, was decided to be unsigned by someone else, and you can't change it, but you want to do something involving x over an interval potentially involving negative numbers.
The “right” way to do this, in my opinion, is first to assess the range of values that x may take. Suppose that the length of an int is 32 bits. Then the length of an unsigned int is the same. If it is guaranteed to be the case that x can never be larger than 2^31-1 (as it often is), then it is safe in principle to cast x to a signed equivalent and use that, i.e. do this:
int y = (int)x;
// Do your stuff with *y*
x = (unsigned)y;
If you have a long that is longer than unsigned, then even if x uses the full unsigned range, you can do this:
long y = (long)x;
// Do your stuff with *y*
x = (unsigned)y;
Now, the problem with either of these approaches is that before assigning back to x (e.g. x=(unsigned)y; in the immediately preceding example), you really must check that y is non-negative. However, these are exactly the cases where working with the unsigned x would have bitten you anyway, so there's no harm at all in something like:
long y = (long)x;
// Do your stuff with *y*
assert( y >= 0L );
x = (unsigned)y;
At least this way, you'll catch the problems and find a solution, rather than having a strange bug that takes hours to find because a loop bound is four billion unexpectedly.
No, it's not safe.
Integers usually are 4 bytes long, which equals to 32 bits. Their difference in representation is:
As far as signed integers is concerned, the most significant bit is used for sign, so they can represent values between -2^31 and 2^31 - 1
Unsigned integers don't use any bit for sign, so they represent values from 0 to 2^32 - 1.
Part 2 isn't safe either for the same reason as Part 1. As int and unsigned types represent integers in a different way, in this case where negative values are used in the calculations, you can't know what the result of x + i will be.
No, it's not safe. Trying to represent negative numbers with unsigned ints smells like bug. Also, you should use %u to print unsigned ints.
If we slightly modify your code to put %u in printf:
#include <stdio.h>
main()
{
unsigned x = 1;
unsigned y=x-2;
printf("%u", y );
}
The number printed is 4294967295
The reason the result is correct is because C doesn't do any overflow checks and you are printing it as a signed int (%d). This, however, does not mean it is safe practice. If you print it as it really is (%u) you won't get the correct answer.
An Unsigned integer type should be thought of not as representing a number, but as a member of something called an "abstract algebraic ring", specifically the equivalence class of integers congruent modulo (MAX_VALUE+1). For purposes of examples, I'll assume "unsigned int" is 16 bits for numerical brevity; the principles would be the same with 32 bits, but all the numbers would be bigger.
Without getting too deep into the abstract-algebraic nitty-gritty, when assigning a number to an unsigned type [abstract algebraic ring], zero maps to the ring's additive identity (so adding zero to a value yields that value), one means the ring's multiplicative identity (so multiplying a value by one yields that value). Adding a positive integer N to a value is equivalent to adding the multiplicative identity, N times; adding a negative integer -N, or subtracting a positive integer N, will yield the value which, when added to +N, would yield the original value.
Thus, assigning -1 to a 16-bit unsigned integer yields 65535, precisely because adding 1 to 65535 will yield 0. Likewise -2 yields 65534, etc.
Note that in an abstract algebraic sense, every integer can be uniquely assigned into to algebraic rings of the indicated form, and a ring member can be uniquely assigned into a smaller ring whose modulus is a factor of its own [e.g. a 16-bit unsigned integer maps uniquely to one 8-bit unsigned integer], but ring members are not uniquely convertible to larger rings or to integers. Unfortunately, C sometimes pretends that ring members are integers, and implicitly converts them; that can lead to some surprising behavior.
Subtracting a value, signed or unsigned, from an unsigned value which is no smaller than int, and no smaller than the value being subtracted, will yield a result according to the rules of algebraic rings, rather than the rules of integer arithmetic. Testing whether the result of such computation is less than zero will be meaningless, because ring values are never less than zero. If you want to operate on unsigned values as though they are numbers, you must first convert them to a type which can represent numbers (i.e. a signed integer type). If the unsigned type can be outside the range that is representable with the same-sized signed type, it will need to be upcast to a larger type.

Unpredictable output using structures in C

#include<stdio.h>
main()
{
struct value
{
int bit1 : 1;
int bit2 : 4;
int bit3 : 4;
}bit={1, 2, 2};
printf("%d %d %d\n",bit.bit1,bit.bit2,bit.bit3);
}
Output of this code is "-1 2 2"
Please clarify the logic behind this output.
Value of bit.bit2 and bit.bit3 is always same as the value assigned to it but bit.bit1 is changing with different integer values. why?
You should use unsigned int. The highest bit, defines whether a number is negative or positive if signed values are used. If you have only one bit, and this is 1, then it it is interpreted as a negative number as the highest bit is set.
If you set the other values to 15 you will also get a negative output.
You could modify the output by using %u in the printf command, but you would still have possibly unwanted effects when assigning and comparing it with other values.
int x : b ; means you are allocating only b bits of memory to x instead of the default sizeof(int) bytes. This kind of declaration is only possible inside a structure.
Range of signed integer in C is -2^(b-1) to 2^(b-1)-1. Where b is number of bits used to store the integer. In all the above cases overflow occurs. A good compiler should give you a warning about overflow.
A signed bit-field of size 1 accepts values in the range [-1 … 0]. This is a consequence of the general formula [-2^(N-1) … 2^(N-1)-1] for determining the range of values that can be stored in N bits with 2's complement representation for N=1.
If you expected bit1 to hold the values 0 or 1, you can declare it as unsigned int bit1 : 1;.
The standard does not specify whether int in bit-fields is signed or unsigned. Instead, it forces you to explicitly specify signedness.
A bit-field shall have a type that is a qualified or unqualified
version of _Bool, signed int, unsigned int, or some other
implementation-defined type.

short type variable automatically extended to integer type?

I wanna print the value of b[FFFC] like below,
short var = 0xFFFC;
printf("%d\n", b[var]);
But it actually print the value of b[FFFF FFFC].
Why does it happen ?
My computer is operated by Windows XP in 32-bit architecture.
short is a signed type. It's 16 bits on your implementation. 0xFFFC represents the integer constant 65,532, but when converted to a 16 bit signed value, this is resulting in -4.
So, your line short var = 0xFFFC; sets var to -4 (on your implementation).
0xFFFFFFFC is a 32 bit representation of -4. All that's happening is that your value is being converted from one type to a larger type, in order to use it as an array index. It retains its value, which is -4.
If you actually want to access the 65,533rd element of your array, then you should either:
use a larger type for var. int will suffice on 32 bit Windows, but in general size_t is an unsigned type which is guaranteed big enough for non-negative array indexes.
use an unsigned short, which just gives you enough room for this example, but will go wrong if you want to get another 4 steps forward.
In current compilers we can't use short (16 bit) if write short use 32 bit .
for example i compile same code with gcc4 in Ubuntu Linux 32 bit :
int main(int argc, char** argv)
{
short var = 0xFFFC;
printf("%x\n", var);
printf("%d\n", var);
return (EXIT_SUCCESS);
}
and output is :
fffffffc
-4
you can see cast short to 32bit normal and use sign extension in 2's complement
As a refresher on the C's data types available, have a look here.
There is a rule, and that concerns the usage of C, some datatypes are promoted to their integral type, for instance
char ch = '2';
int j = ch + 1;
Now look at the RHS (Right Hand Side) of the expression and notice that the ch will automatically get promoted as an int in order to produce the desired results on the LHS (LHS) of the expression. What would the value of j be? The ASCII code for '2' is 50 decimal or 0x32 hexadecimal, add 1 on to it and the value of j would be 51 decimal or 0x33 hexadecimal.
It is important to understand that rule and that explains why a data type would be 'promoted' to another data type.
What is the b? That is an array I presume that has 655532 elements correct?
Anyway, using a format specifier %d is for of type int, the value got promoted to an int, firstly, and secondly the array subscript is of type int, hence the usage of the short var got promoted and since the data size of an int is 4 bytes, it got promoted and hence you are seeing the rest of the value 0xFFFF 0xFFFC.
This is where the usage of casting comes in, to tell the compiler to cast a data type to another which explains in conjunction to Gregory Pakosz's answer above.
Hope this helps,
Best regards,
Tom.
use %hx or %hd instead to indicate that you have a short variable, e.g:
printf("short hex: %hx\n", var); /* tell printf that var is short and print out as hex */
EDIT: Uups, I got the question wrong. It was not about printf() as I thought. So this answer might be a little bit OT.
New: Because you are using var as an index to an array you should declare it as unsigned short (instead of short):
unsigned short var = 0xFFFC;
printf("%d\n", b[var]);
The 'short var' could be interpreted as a negative number.
To be more precise:
You are "underflowing" into the negative value range: Values in the range from 0x0000 upto 0x7FFF will be OK. But values from 0x8000 upto 0xFFFF will be negative.
Here are some examples of var used as an index to array b[]:
short var=0x0000;; // leads to b[0] => OK
short var=0x0001; // leads to b[1] => OK
short var=0x7FFF; // leads to b[32767] => OK
short var=0x8000; // leads to b[-32768] => Wrong
short var=0xFFFC; // leads to b[-4] => Wrong
short var=32767; // leads to the same as b[0x7FFF] => OK
short var=32768; // compile warning or error => overflow into 32bit range
You were expecting to store JUST a 16bit variable in a 32bit-aligned memory... you see, each memory address holds a whole 32bit word (hardware).
The extra FFFF comes from the fact that short is a signed value, and when assigned to int (at the printf call), it got signed-extended. When extending two-complements from 16 to 32bit, the extension is done by replicating the last N bit to all other M-N on it's left. Of course, you did not intend that.
So, in this case, you're interested in absolute array positions, so you should declare your indexer as unsigned.
In the subject of your question you have already guessed what is happening here: yes, a value of type short is "automatically extended" to a value of type int. This process is called integral promotion. That's how it always works in C language: every time you use an integral value smaller than int that value is always implicitly promoted to a value of type int (unsigned values can be promoted to unsigned int). The value itself does not change, of course, only the type of the value is changed. In your above example the 16-bit short value represented by pattern 0xFFFC is the same as 32-bit int value represented by pattern 0xFFFFFFFC, which is -4 in decimals. This, BTW, makes the rest of your question sound rather strange: promotion or not, your code is trying to access b[-4]. The promotion to int doesn't change anything.

Resources