Unpredictable output using structures in C

#include<stdio.h>
int main(void)
{
struct value
{
int bit1 : 1;
int bit2 : 4;
int bit3 : 4;
}bit={1, 2, 2};
printf("%d %d %d\n",bit.bit1,bit.bit2,bit.bit3);
}
Output of this code is "-1 2 2"
Please clarify the logic behind this output.
The values of bit.bit2 and bit.bit3 always match what was assigned to them, but bit.bit1 changes for different integer values. Why?

You should use unsigned int. With signed values, the highest bit determines whether a number is negative or positive. If you have only one bit and it is 1, it is interpreted as a negative number because the highest bit is set.
If you set the other values to 15 you will also get a negative output.
You could modify the output by using %u in the printf command, but you would still have possibly unwanted effects when assigning and comparing it with other values.

int x : b ; means you are allocating only b bits of memory to x instead of the default sizeof(int) bytes. This kind of declaration is only possible inside a structure.
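The range of a signed integer in C (with two's complement) is -2^(b-1) to 2^(b-1)-1, where b is the number of bits used to store the integer. Here overflow occurs only for bit1: a 1-bit signed field can hold just -1 and 0, so 1 does not fit, while 2 fits in the 4-bit fields (range -8 to 7). A good compiler should give you a warning about the overflow.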

A signed bit-field of size 1 accepts values in the range [-1 … 0]. This is a consequence of the general formula [-2^(N-1) … 2^(N-1)-1] for determining the range of values that can be stored in N bits with 2's complement representation for N=1.
If you expected bit1 to hold the values 0 or 1, you can declare it as unsigned int bit1 : 1;.

The standard leaves it implementation-defined whether a plain int bit-field is signed or unsigned, so you should specify the signedness explicitly if you depend on it.
A bit-field shall have a type that is a qualified or unqualified
version of _Bool, signed int, unsigned int, or some other
implementation-defined type.

Related

Why I can assign a negative value to an unsigned int data type?

I was doing some experiments in a code in order to prove the theory. This is my code:
#include <stdio.h>
int main(){
unsigned int x,y,z;
if(1){
x=-5;
y=5;
z=x+y;
printf("%i",z);
}
return 0;
}
As far as I know, the output should have been 10, but instead it prints 0. Why is this happening? Why can I assign a negative value to an unsigned int data type?

From section 6.5.16.1 of the C standard:
In simple assignment (=), the value of the right operand is converted
to the type of the assignment expression and replaces the value stored
in the object designated by the left operand.
From section 6.3.1.3 Signed and unsigned integers:
... Otherwise, if the new type is unsigned, the value is converted
by repeatedly adding or subtracting one more than the maximum value
that can be represented in the new type until the value is in the
range of the new type.
So x=-5 assigns UINT_MAX + 1 - 5 to x.

Signed to unsigned conversion happens as per "mathematic modulus by UINT_MAX+1", from C17 6.3.1.3:
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type
Now as it happens, on 2's complement computers this is the same as taking the binary representation of the signed number and treating it as an unsigned number. In your case -5 is represented as 0xFFFFFFFB in hexadecimal, so the unsigned number ends up as 4294967291. And 4294967291 + 5 creates an unsigned wrap-around from UINT_MAX = 4294967295 to 0 (which is well-defined, unlike signed overflow).
So -5 does not just discard the sign when converted to unsigned. If that's what you want to happen, use the abs() function from stdlib.h.

It's a basic characteristic of pretty much all integral types that they have a defined range of values they can hold. But then the question is, what happens if you try to set a value outside that range? For C's unsigned types, the answer is that they operate via modular arithmetic.
On a modern machine, type unsigned int probably has a range of 0 to 4294967295. Obviously -5 does not fit into that range. So modular arithmetic says that we add or subtract some multiple of 4294967296 until we get a value that is in the range. Now, -5 + 4294967296 is 4294967291, and that is in range, so that's the value which gets stored in your variable x.
So then x + y will be 4294967296, but that's not in the range of 0 to 4294967295, either. But if we subtract 4294967296, we get 0, and that's in range, so that's our answer.
And along the way we've discovered how two's complement arithmetic works. It turns out that, if we had declared x as a signed int and set it to -5, it would have ended up containing the same bit pattern as 4294967291. And as we've seen, 4294967291 is precisely the bit pattern we want to be able to add to 5 in order to get 0 (after wrapping around, that is). So 4294967291 is a great internal value to use for -5, since you obviously want -5 + 5 to be 0.

Why i can assign a negative value to an unsigned int data type?
Assigning an out-of-range value to an unsigned type is well defined.
But first, in trying to report the value, code invokes undefined behavior (UB) using a mismatched printf() specifier with unsigned:
// printf("%i",z); // Bad
printf("%u\n",z); // Good
Try printf("%u %u %u %u\n", x, y, z, UINT_MAX); to properly see all 4 values.
x=-5; assigns an out-of-range value to an unsigned. With unsigned types, the value is converted to an in-range value by adding/subtracting the max value of the type + 1 until in range. In this case x will have the value of UINT_MAX + 1 - 5.
y=5; is OK.
x+y will then incur unsigned math overflow. The sum is likewise converted to an in-range value.
x+y will have the value of (UINT_MAX + 1 - 5) + 5 --> (UINT_MAX + 1 - 5) + 5 - (UINT_MAX + 1) --> 0.

Printing the actual negative number stored inside unsigned int

I have a strange problem. I have a variable whose actual value is always negative (only negative integers are generated for this variable). But in the legacy code, an old colleague of mine used uint16 instead of a signed int to store the values of that variable. Now, if I want to print the actual negative value of that variable, how can I do that (what format specifier should I use)? For example, if the actual value is -75, printing it with %d gives me a 5-digit positive value (I think it's because of two's complement). I want to print it as 75 or -75.

If a 16-bit negative integer was stored in a uint16_t called x, then the original value may be calculated as x-65536.
This can be printed with any of the following (see footnote 1):
printf("%ld", x-65536L);
printf("%d", (int) (x-65536));
int y = x-65536;
printf("%d", y);
Subtracting 65536 works because:
Per C 2018 6.5.16.1 2, the value of the right operand (a negative 16-bit integer) is converted to the type of the assignment expression (which is essentially the type of the left operand).
Per 6.3.1.3 2, the conversion to an unsigned integer operates by adding or subtracting “one more than the maximum value that can be represented in the new type”. For uint16_t, one more than its maximum is 65536.
With a 16-bit negative integer, adding 65536 once brings it into the range of a uint16_t.
Therefore, subtracting 65536 restores the original value.
Footnote
1 65536 will be long or int according to whether int is 16 bits or more, so these statements are careful to handle the type correctly. The first uses 65536L to ensure the type is long. The rest convert the value to int. This is safe because, although the type of x-65536 could be long, its value fits in an int—unless you are executing in a C implementation that limits int to −32767 to +32767, and the original value may be −32768, in which case you should stick to the first option.

If an int is 32 bits on your system, then the correct format for printing a 16-bit int is %hu for unsigned and %hd for signed.
Examine the output of:
uint16_t n = (uint16_t)-75;
printf("%d\n", (int)n); // Output: 65461
printf("%hu\n", n); // Output: 65461
printf("%hd\n", n); // Output: -75

#include <inttypes.h>
uint16_t foo = -75;
printf("==> %" PRId16 " <==\n", foo); // type mismatch, undefined behavior

Assuming your friend has somehow correctly put the bit representation for the signed integer into the unsigned integer, the only standard compliant way to extract it back would be to use a union as-
typedef union {
uint16_t input;
int16_t output;
} aliaser;
Now, in your code you can do -
aliaser x;
x.input = foo;
printf("%d", x.output);

Go ahead with the union, as explained by another answer. Just wanted to say that if you are using GCC, there's a very cool feature that allows you to do that sort of "bitwise" casting without writing much code:
printf("%d", ((union {uint16_t in; int16_t out;}) foo).out);
See https://gcc.gnu.org/onlinedocs/gcc-4.4.1/gcc/Cast-to-Union.html.

If u is an object of unsigned integer type and a negative number whose magnitude is within range of u's type is stored into it, storing -u to an object of u's type will leave it holding the magnitude of that negative number. This behavior does not depend upon how u is represented. For example, if u and v are 16-bit unsigned short, but int is 32 bits, then storing -60000 into u will leave it holding 5536 [the implementation will behave as though it adds 65536 to the value stored until it's within range of unsigned short]. Evaluating -u will yield -5536, and storing -5536 into v will leave it holding 60000.

Output of a short int and an unsigned short?

So I'm trying to interpret the following output:
short int v = -12345;
unsigned short uv = (unsigned short) v;
printf("v = %d, uv = %u\n", v, uv);
Output:
v = -12345, uv = 53191
So the question is: why is this exact output generated when this program is run on a two's complement machine?
What operations lead to this result when casting the value to unsigned short?

My answer assumes 16-bit two's complement arithmetic.
To find the value of -12345, take 12345, complement it, and add 1.
12345 is 0x3039 is 0011000000111001.
Complementing means changing all the 1's to 0's and all the 0's to 1's:
1100111111000110 is 0xcfc6 is 53190.
Add one: 53191.
So internally, -12345 is represented by 0xcfc7 = 53191.
But if you interpret it as an unsigned number, it's obviously just 53191. (And when you assign a signed value to an unsigned integer of the same size, what typically ends up happening is that you assign the exact bit pattern, without converting anything. Later, however, you will typically interpret that value differently, such as when you print it with %u.)
Another, perhaps easier way to think about this is that 16-bit arithmetic "wraps around" at 2^16 = 65536. So you can think of 65536 as being another name for 0 (just like 0:00 and 24:00 are both names for midnight). So -12345 is 65536 - 12345 = 53191.

The C standard's rule for converting a signed integer to an unsigned type is to repeatedly add (or subtract) TYPE_MAX + 1 until the value is in range.
From 6.3.1.3 Signed and unsigned integers:
Otherwise, if the new type is unsigned, the value is converted by
repeatedly adding or subtracting one more than the maximum value that
can be represented in the new type until the value is in the range of
the new type.
If USHRT_MAX is 65535, then 65535 + 1 + (-12345) is 53191.

The output seen does not depend on 2's complement, nor on whether int is 16 or 32 bits. It is entirely defined and would be the same on a rare 1's complement or sign-magnitude machine. The result does depend on unsigned short being 16 bits.
-12345 is within the minimum range of a short, so there are no issues with that assignment. When a short is passed as a ... argument, it goes through the usual promotion to an int with no change in value, as all short values are in the range of int. "%d" expects an int, so the output is "-12345".
short int v = -12345;
printf("%d\n", v); // output "-12345\n"
Assigning a negative number to an unsigned type is well defined. With a 16-bit unsigned short, the value of uv is -12345 plus the smallest multiple of USHRT_MAX+1 (65536 in this case) that brings it into range, giving a final value of 53191.
When an unsigned short is passed as a ... argument, the value is converted to int or unsigned, whichever type contains the entire range of unsigned short. In any case, the value does not change. "%u" matches an unsigned. It also matches an int whose value is representable as both an int and an unsigned.
short int v = -12345;
unsigned short uv = (unsigned short) v;
printf("%u\n", uv); // output "53191\n"
What operations lead to this result when casting the value to unsigned short?
The casting did not affect the final outcome. The same result would have occurred without the cast. The cast may be useful to quiet warnings.

is it safe to subtract between unsigned integers?

Following C code displays the result correctly, -1.
#include <stdio.h>
int main(void)
{
unsigned x = 1;
unsigned y=x-2;
printf("%d", y );
}
But in general, is it always safe to do subtraction involving
unsigned integers?
The reason I ask the question is that I want to do some conditioning
as follows:
unsigned x = 1; // x was defined by someone else as unsigned,
// which I had better not to change.
for (int i=-5; i<5; i++){
if (x+i<0) continue;
f(x+i); // f is a function
}
Is it safe to do so?
How are unsigned integers and signed integers different in
representing integers? Thanks!

1: Yes, it is safe to subtract unsigned integers. The definition of arithmetic on unsigned integers includes that if an out-of-range value would be generated, then that value should be adjusted modulo the maximum value for the type, plus one. (This definition is equivalent to truncating high bits).
Your posted code has a bug though: printf("%d", y); causes undefined behaviour because %d expects an int, but you supplied unsigned int. Use %u to correct this.
2: When you write x+i, the i is converted to unsigned. The result of the whole expression is a well-defined unsigned value. Since an unsigned can never be negative, your test will always fail.
You also need to be careful using relational operators because the same implicit conversion will occur. Before I give you a fix for the code in section 2, what do you want to pass to f when x is UINT_MAX or close to it? What is the prototype of f ?
3: Unsigned integers use a "pure binary" representation.
Signed integers have three options. Two can be considered obsolete; the most common one is two's complement. All options require that a positive signed integer value has the same representation as the equivalent unsigned integer value. In two's complement, a negative signed integer is represented the same as the unsigned integer generated by adding UINT_MAX+1, etc.
If you want to inspect the representation, then do unsigned char *p = (unsigned char *)&x; printf("%02X%02X%02X%02X", p[0], p[1], p[2], p[3]);, depending on how many bytes are needed on your system.

It's always safe to subtract unsigned as in
unsigned x = 1;
unsigned y=x-2;
y will take on the value of -1 mod (UINT_MAX + 1) or UINT_MAX.
It is always safe to do subtraction, addition, and multiplication involving unsigned integers: there is no UB. The answer will always be the expected mathematical result reduced modulo UINT_MAX+1.
But do not do printf("%d", y ); - that is UB. Instead printf("%u", y);
C11 §6.2.5 9 "A computation involving unsigned operands can never overflow, because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type."
When unsigned and int are used in +, the int is converted to an unsigned. So x+i has an unsigned result and never is that sum < 0. Safe, but now if (x+i<0) continue is pointless. f(x+i); is safe, but need to see f() prototype to best explain what may happen.
Unsigned integers are always 0 to power(2,N)-1 and have well defined "overflow" results. Signed integers are 2's complement, 1's complement, or sign-magnitude and have UB on overflow. Some compilers take advantage of that and assume it never occurs when making optimized code.

Rather than really answering your questions directly, which has already been done, I'll make some broader observations that really go to the heart of your questions.
The first is that using unsigned in loop bounds where there's any chance that a signed value might crop up will eventually bite you. I've done it a bunch of times over 20 years and it has ultimately bit me every time. I'm now generally opposed to using unsigned for values that will be used for arithmetic (as opposed to being used as bitmasks and such) without an excellent justification. I have seen it cause too many problems when used, usually with the simple and appealing rationale that “in theory, this value is non-negative and I should use the most restrictive type possible”.
I understand that x, in your example, was decided to be unsigned by someone else, and you can't change it, but you want to do something involving x over an interval potentially involving negative numbers.
The “right” way to do this, in my opinion, is first to assess the range of values that x may take. Suppose that the length of an int is 32 bits. Then the length of an unsigned int is the same. If it is guaranteed to be the case that x can never be larger than 2^31-1 (as it often is), then it is safe in principle to cast x to a signed equivalent and use that, i.e. do this:
int y = (int)x;
// Do your stuff with *y*
x = (unsigned)y;
If you have a long that is longer than unsigned, then even if x uses the full unsigned range, you can do this:
long y = (long)x;
// Do your stuff with *y*
x = (unsigned)y;
Now, the problem with either of these approaches is that before assigning back to x (e.g. x=(unsigned)y; in the immediately preceding example), you really must check that y is non-negative. However, these are exactly the cases where working with the unsigned x would have bitten you anyway, so there's no harm at all in something like:
long y = (long)x;
// Do your stuff with *y*
assert( y >= 0L );
x = (unsigned)y;
At least this way, you'll catch the problems and find a solution, rather than having a strange bug that takes hours to find because a loop bound is four billion unexpectedly.

No, it's not safe.
Integers are usually 4 bytes long, which equals 32 bits. Their difference in representation is:
For signed integers, the most significant bit is used for the sign, so they can represent values between -2^31 and 2^31 - 1.
Unsigned integers don't use any bit for the sign, so they represent values from 0 to 2^32 - 1.
Part 2 isn't safe either for the same reason as Part 1. As int and unsigned types represent integers in a different way, in this case where negative values are used in the calculations, you can't know what the result of x + i will be.

No, it's not safe. Trying to represent negative numbers with unsigned ints smells like bug. Also, you should use %u to print unsigned ints.
If we slightly modify your code to put %u in printf:
#include <stdio.h>
main()
{
unsigned x = 1;
unsigned y=x-2;
printf("%u", y );
}
The number printed is 4294967295

The reason the result is correct is because C doesn't do any overflow checks and you are printing it as a signed int (%d). This, however, does not mean it is safe practice. If you print it as it really is (%u) you won't get the correct answer.

An unsigned integer type should be thought of not as representing a number, but as a member of something called an "abstract algebraic ring", specifically the equivalence class of integers congruent modulo (MAX_VALUE+1). For purposes of examples, I'll assume "unsigned int" is 16 bits for numerical brevity; the principles would be the same with 32 bits, but all the numbers would be bigger.
Without getting too deep into the abstract-algebraic nitty-gritty, when assigning a number to an unsigned type [abstract algebraic ring], zero maps to the ring's additive identity (so adding zero to a value yields that value), one means the ring's multiplicative identity (so multiplying a value by one yields that value). Adding a positive integer N to a value is equivalent to adding the multiplicative identity, N times; adding a negative integer -N, or subtracting a positive integer N, will yield the value which, when added to +N, would yield the original value.
Thus, assigning -1 to a 16-bit unsigned integer yields 65535, precisely because adding 1 to 65535 will yield 0. Likewise -2 yields 65534, etc.
Note that in an abstract algebraic sense, every integer can be uniquely assigned into algebraic rings of the indicated form, and a ring member can be uniquely assigned into a smaller ring whose modulus is a factor of its own [e.g. a 16-bit unsigned integer maps uniquely to one 8-bit unsigned integer], but ring members are not uniquely convertible to larger rings or to integers. Unfortunately, C sometimes pretends that ring members are integers, and implicitly converts them; that can lead to some surprising behavior.
Subtracting a value, signed or unsigned, from an unsigned value which is no smaller than int, and no smaller than the value being subtracted, will yield a result according to the rules of algebraic rings, rather than the rules of integer arithmetic. Testing whether the result of such computation is less than zero will be meaningless, because ring values are never less than zero. If you want to operate on unsigned values as though they are numbers, you must first convert them to a type which can represent numbers (i.e. a signed integer type). If the unsigned type can be outside the range that is representable with the same-sized signed type, it will need to be upcast to a larger type.

Initializing unsigned short int to signed value

#include<stdio.h>
int main()
{
unsigned short a=-1;
printf("%d",a);
return 0;
}
This gives me the output 65535. Why?
As I make a more negative, the output is 65536+a (65535 for -1, 65534 for -2, and so on).
I know the range of unsigned short int is 0 to 65535.
But why does it wrap around within the range 0 to 65535? What is going on inside?
#include<stdio.h>
int main()
{
unsigned int a=-1;
printf("%d",a);
return 0;
}
Output is -1.
%d is used for a signed decimal integer, so why does it not follow the same rule here and print the largest value of its (int) range?
Why is the output in this part -1?
I know %u is used for printing an unsigned decimal integer.
Why is the behavior undefined in the second program and not in the first?
I compiled this with gcc. It is C code.
On my machine, sizeof(short int) is 2 bytes and sizeof(int) is 4 bytes.

In your implementation, short is 16 bits and int is 32 bits.
unsigned short a=-1;
printf("%d",a);
First, -1 is converted to unsigned short. This results in the value 65535. For the precise definition see the standard "integer conversions". To summarize: the value is taken modulo USHRT_MAX+1.
This value 65535 is assigned to a.
Then for the printf, which uses varargs, the value is promoted back to int. varargs never pass integer types smaller than int, they're always converted to int. This results in the value 65535, which is printed.
unsigned int a=-1;
printf("%d",a);
First line, same as before but modulo UINT_MAX+1. a is 4294967295.
For the printf, a is passed as an unsigned int. Since %d requires an int the behavior is undefined by the C standard. But your implementation appears to have reinterpreted the unsigned value 4294967295, which has all bits set, as a signed integer with all bits set, i.e. the two's-complement value -1. This behavior is common but not guaranteed.

Variable assignment is done to the amount of memory of the type of the variable (e.g., short is 2 bytes, int is 4 bytes, in 32 bit hardware, typically). Sign of the variable is not important in the assignment. What matters here is how you are going to access it. When you assign to a 'short' (signed/unsigned) you assign the value to a '2 bytes' memory. Now if you are going to use '%d' in printf, printf will consider it 'integer' (4 bytes in your hardware) and the two MSBs will be 0 and hence you got [0|0](two MSBs) [-1] (two LSBs). Due to the new MSBs (introduced by %d in printf, migration) your sign bit is hidden in the LSBs and hence printf considers it unsigned (due to the MSBs being 0) and you see the positive value. To get a negative in this you need to use '%hd' in first case. In the second case you assigned to '4 bytes' memory and the MSB got its SIGN bit '1' (means negative) during assignment and hence you see the negative number in '%d' of printf. Hope it explains. For more clarification please comment on the answer.
NB: I used 'MSB' for a shorthand of higher-order byte(s). Please read it according to the context (e.g., 'SIGN bit' will make you read like 'Most Significant Bit'). Thanks.
