C bit field variables are printing unexpected values - c

#include <stdio.h>

struct m
{
    int parent:3;
    int child:3;
    int mother:2;
};

int main(void)
{
    struct m son = {2, -6, 5};
    printf("%d %d %d\n", son.parent, son.child, son.mother);
    return 0;
}
Can anybody please explain why the output of this program is 2 2 1?

Taking out all but the significant bits for the fields shown:
parent: 3 bits (1 sign bit + 2 more), value 010, result 2
child: 3 bits (1 sign bit + 2 more), value 010, result 2
mother: 2 bits (1 sign bit + 1 more), value 01, result 1
Details
It bears pointing out that your structure fields are declared as int bit-fields. By C99 §6.7.2/2, the following types are all equivalent: int, signed, and signed int. Therefore, your structure fields are signed. By C99 §6.2.6.2/2, one of those bits is consumed in representing the sign of the value (negative or positive). Further, the same section states that, excluding the sign bit, the representation of the remaining bits must correspond to the associated unsigned type of the remaining bit count. C99 §6.7.2/1 clearly defines how each of those bits represents a power of 2. Therefore, the only bit that can normally serve as the sign bit is the most significant bit (it's the only one left; if this is an inaccurate interpretation of the standard, I'm sure I'll hear about it in due time). That you assign a negative number as one of your test values suggests you may already be aware of this, but many people newly exposed to bit-fields are not, so it bears noting.
The following sections of the C99 standard are referenced in the remainder of this answer. The first deals with promotion to different types, the second with how values are converted and potentially changed (if any change occurs), and the last is important for understanding how a bit-field's int type is determined.
C99-§6.3.1.1: Boolean, characters, and integers
2: If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.
C99-§6.3.1.3 Signed and unsigned integers
When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
C99-§6.7.2.1 Structure and Union Specifiers
10: A bit-field is interpreted as having a signed or unsigned integer type consisting of the specified number of bits. If the value 0 or 1 is stored into a nonzero-width bit-field of type _Bool, the value of the bit-field shall compare equal to the value stored; a _Bool bit-field has the semantics of a _Bool.
Consider the regular int bit representation of your test values. The following are on a 32-bit int implementation:
value : s bits
2 : 0 0000000 00000000 00000000 00000010 <== note bottom three bits
-6 : 1 1111111 11111111 11111111 11111010 <== note bottom three bits
5 : 0 0000000 00000000 00000000 00000101 <== note bottom two bits
Walking through each of these, applying the requirements from the standard references above.
int parent:3: The first field is a 3-bit signed int, and it is being assigned the decimal value 2. Does the rvalue type, int, encompass the lvalue type, int:3? Yes, so the types are fine. Does the value 2 fit within the range of the lvalue type? Yes, 2 easily fits in an int:3, so no adjustment of the value is required either. The first field works fine.
int child:3: The second field is also a 3-bit signed int, this time being assigned the decimal value -6. Once again, does the rvalue type (int) fully encompass the lvalue type (int:3)? Yes, so again the types are fine. However, the minimum bit count required to represent -6 as a signed value is 4 bits (1010), accounting for the most significant bit as the sign bit. Therefore the value -6 is out of range of the allowable storage of a 3-bit signed bit-field, and the result is implementation-defined per §6.3.1.3-3.
int mother:2: The final field is a 2-bit signed int, this time being assigned the decimal value 5. Once again, does the rvalue type (int) fully encompass the lvalue type (int:2)? Yes, so again the types are fine. However, once again we're faced with a value that cannot fit within the target type. The minimum bit count needed to represent a signed positive 5 is four (0101), and we only have two bits to work with. Therefore, the result is once again implementation-defined per §6.3.1.3-3.
Therefore, if I read this correctly, the implementation in this case simply discards all but the number of bits needed to fill the declared bit width, and the result of that truncation is what you now have: 2 2 1.
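To illustrate, here is a minimal sketch (mine, not anything the standard mandates) of how a typical two's-complement implementation could arrive at those stored values: keep only the low width bits of the source, then sign-extend from the bit-field's own sign bit. The helper name truncate_to_bitfield is made up for this example.

#include <stdio.h>

/* A sketch of one plausible implementation choice: keep the low `width` bits,
 * then sign-extend from the bit-field's own sign bit. */
static int truncate_to_bitfield(int value, int width)
{
    unsigned int mask = (1u << width) - 1u;   /* e.g. width 3 -> 0b111 */
    unsigned int bits = (unsigned int)value & mask;
    unsigned int sign = 1u << (width - 1);    /* sign bit of the bit-field */
    return (int)((bits ^ sign) - sign);       /* sign-extend */
}

int main(void)
{
    printf("%d %d %d\n",
           truncate_to_bitfield(2, 3),   /* 2 */
           truncate_to_bitfield(-6, 3),  /* 2 */
           truncate_to_bitfield(5, 2));  /* 1 */
    return 0;
}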
Note
It is entirely possible I flipped the order of the promotion incorrectly (it is easy for me to get lost in the standard, as I am dyslexic and periodically flip things in my head). If that is the case, I would ask anyone with a stronger interpretation of the standard to please point it out to me, and I will amend the answer accordingly.

The bit-field sizes for child and mother are too small to contain the constant values you are assigning to them, and they're overflowing.

You will notice that during compilation you get the following warnings:
test.c: In function ‘main’:
test.c:18:11: warning: overflow in implicit constant conversion
test.c:18:11: warning: overflow in implicit constant conversion
That is because you have defined your fields as 3 bits of an int rather than a whole int. The overflow warning means the compiler is unable to store your values in 3 bits of memory.
So if you use the following structure definition, you will avoid the compilation warnings and get the expected values:
struct m
{
    int parent;
    int child;
    int mother;
};

You can't represent -6 in only 3 bits; similarly, you can't represent 5 in only two bits.
Is there a reason why parent, child, and mother need to be bit fields?

child requires a minimum of 4 bits to hold -6 (1010), and mother requires a minimum of 4 bits to hold 5 (0101).
The bit-field stores only the last 3 bits of child, so printf prints 2, and only the last 2 bits of mother, so printf prints 1.
You may think child needs only 3 bits to store -6, but it actually needs 4 bits including the sign bit. Negative values are stored in two's-complement form.
The binary equivalent of 6 is 110.
The one's complement of 6 is 001.
The two's complement of 6 is 010.
The sign bit is added at the MSB.
So the value of -6 is 1010; the 3-bit field keeps only 010, dropping the sign bit.
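As a quick sanity check of that claim (assuming a two's-complement int), masking off all but the low three bits of -6 does give 2:

#include <stdio.h>

/* Check that the low three bits of -6 are 010, i.e. 2, assuming a
 * two's-complement representation of int. */
int main(void)
{
    unsigned int low3 = (unsigned int)-6 & 0x7u;  /* ...11111010 & 111 */
    printf("%u\n", low3);                         /* prints 2 */
    return 0;
}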

First of all:
5 is 101 in binary, so it doesn't fit in 2 bits.
-6 also doesn't fit in 3 bits.

Your bit fields are too small. Also: is this just an exercise, or are you trying to (prematurely) optimize? You're not going to be very happy with the result... it'll be quite slow.

I finally found the reason after applying thought :-)
2 is represented in binary as 0010
-6 as 1010
and five as 0101
Now 2 can be represented by using only 3 bits so it gets stored as 010
-6 would get stored in 3 bits as 010
and five would get stored in 2 bits as 01
So the final output would be 2 2 1
Thanks everyone for your replies!

Related

Unsigned integers

Could anyone help me understand the difference between signed/unsigned int, as well as signed/unsigned char? In this case, if it's unsigned, wouldn't the value just never reach a negative number and continue in an infinite loop of 0's?
#include <stdio.h>

int main()
{
    unsigned int n = 3;
    while (n >= 0)
    {
        printf("%d", n);
        n = n - 1;
    }
    return 0;
}
Two important things:
At one level, the difference between signed and unsigned values is just the way we interpret the bits. If we limit ourselves to 3 bits, we have:
bits   signed   unsigned
000         0          0
001         1          1
010         2          2
011         3          3
100        -4          4
101        -3          5
110        -2          6
111        -1          7
The bit patterns don't change; it's just a matter of interpretation whether we have them represent nonnegative integers from 0 to 2^N - 1, or signed integers from -2^(N-1) to 2^(N-1) - 1.
The other important thing to know is what operations are defined on a type. For unsigned types, addition and subtraction are defined so that they "wrap around" within the range 0 to 2^N - 1. But for signed types, overflow and underflow are undefined. (On some machines they wrap around, but not on all.)
Finally, there's the issue of properly matching up your printf formats. For %d, you're supposed to give it a signed integer, but you gave it an unsigned one instead. Strictly speaking, that results in undefined behavior, too, but in this case (and not too surprisingly), what happened was that it took the same bit pattern and printed it out as if it were signed, rather than unsigned.
wouldn't the value just never reach a negative number
Correct, it can't be negative.
and continue on an infinite loop of 0's
No, it will wrap-around from zero to the largest value of an unsigned int, which is well-defined behavior. If you use the correct conversion specifier %u instead of the incorrect %d, you'll notice this output:
3
2
1
0
4294967295
4294967294
...
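For reference, here is a small sketch of the same countdown with the matching %u specifier and a loop condition rewritten so that it actually terminates (the rewritten loop is my own variation, not the original code):

#include <stdio.h>

/* Countdown with the correct %u specifier; the loop is restructured because
 * n >= 0 is always true for an unsigned int. */
int main(void)
{
    unsigned int n = 3;
    for (;;) {
        printf("%u\n", n);
        if (n == 0)
            break;      /* stop before wrapping around to UINT_MAX */
        n = n - 1;
    }
    return 0;
}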
Signed number representation covers negative as well as positive integers, while unsigned representation covers only non-negative integers. The code you wrote will run forever because n is unsigned and therefore always non-negative.
In this case, if it's unsigned wouldn't the value just never reach a negative number ...?
You are right. But in the statement printf("%d", n); you "deceived" the printf() function, by using the conversion specifier d, into treating the number in variable n as signed.
Use the type conversion specifier u instead: printf ("%u",n);
... never reach a negative number and continue on an infinite loop of 0's?
No. “Never reaching a negative number” is not the same as “stopping at 0 and resisting further decrementing”.
Other people already explained this. Here is my explanation, in the form of analogies:
Imagine a never-ending and never-beginning sequence of non-negative integers:
..., 0, 1, 2, 3, 0, 1, 2, 3, 0, 1, 2, 3, ... // the biggest is 3 only for simplicity
— or like the numbers on an analog clock, which wrap around as you keep going.
You may increase / decrease a number forever, going round and round.
The terms signed and unsigned refer to how the CPU treats sequences of bits.
There are 2 important things to understand here:
How the CPU adds finite sequences of bits to a single finite result
How the CPU differentiates between signed and unsigned operands
Let's start with (1).
Let's take 4-bit nibbles for example.
If we ask the CPU to add 0001 and 0001, the result should be 2, or 0010.
But if we ask it to add 1111 and 0001, the result should be 16, or 10000. But it only has 4 bits to hold the result. The convention is to wrap around, or circle, back to 0, effectively ignoring the most significant bit. See also integer overflow.
Why is this relevant? Because it produces an interesting result. That is, according to the definition above, if we let x = 1111, then we get x + 1 = 0. Well, x, or 1111, now looks and behaves awfully like -1. This is the birth of signed numbers and operations. And if 1111 can be deemed as -1, then 1111 - 1 = 1110 should be -2, and so on.
Now let's look at (2).
When the C compiler sees you defining an unsigned int, it will use special CPU instructions for dealing with unsigned numbers, where it deems relevant. For example, this is relevant in jump instructions, where the CPU needs to know if you mean it to jump way forward, or slightly backward. For this it needs to know if you mean your operand to be interpreted in a signed, or unsigned way.
The operation of adding two numbers, on the other hand, is fundamentally oblivious to the consequent interpretation. The only thing is that the CPU will turn on a special flag after an addition operation, to tell you whether a wrap-around has occurred, for your own auditing.
But the important thing to understand is that the sequence of bits doesn't change; only its interpretation.
To tie all of this to your example, subtracting 1 from an unsigned 0 will simply wrap around to all ones, which is 2^32 - 1 (4294967295) for your 32-bit unsigned int.
Finally, there are other uses for signed/unsigned. For example, by the very fact it is defined as a different type, this allows functions to be written that define a contract where only unsigned integers, let's say, can be passed to it. Also, it's relevant when you want to display or print the number.
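A tiny program that demonstrates the wrap-around described above, assuming a 32-bit unsigned int:

#include <limits.h>
#include <stdio.h>

/* Decrementing an unsigned 0 is well-defined and yields UINT_MAX (all bits set). */
int main(void)
{
    unsigned int u = 0u;
    u = u - 1u;               /* wraps around: 0 - 1 == UINT_MAX */
    printf("%u\n", u);        /* 4294967295 when unsigned int is 32 bits */
    printf("%u\n", UINT_MAX); /* the same value */
    return 0;
}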

Which sections of the C standard prove the relation between the integer type sizes?

In the late draft of C11 [C11_N1570] and C17 [C17_N2176] I fail to find the proof of the following (which, I believe, is commonly known):
sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)
Can anybody refer me to the particular sections?
I'm aware of this reply for C++11. The second part of the reply talks about C, but only touches the ranges of the values. It does not prove the ratio between the type sizes.
Thank you very much everybody who participated in the search of the answer.
Most of the replies have shared what I have already learned, but some of the comments provided very interesting insight.
Below I will summarize what I learned so far (for my own future reference).
Conclusion
Looks like C (as of late draft of C17 [C17_N2176]) does not guarantee that
sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)
(as opposed to C++).
What is Guaranteed
Below is my own interpretation/summary of what C does guarantee regarding the integer types (sorry if my terminology is not strict enough).
Multiple Aliases For the Same Type
This topic moves out of my way the multiple aliases for the same type ([C17_N2176], 6.2.5/4, the parenthesized sentence referring to 6.7.2/2; thanks @M.M for the reference).
The Number of Bits in a Byte
The number of bits in a byte is implementation-specific and is >= 8. It is given by the CHAR_BIT macro.
5.2.4.2.1/1 Sizes of integer types <limits.h>
Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign.
number of bits for smallest object that is not a bit-field (byte)
CHAR_BIT 8
The text below assumes that the byte is 8 bits (keep that in mind on the implementations where byte has a different number of bits).
The sizeof([[un]signed] char)
sizeof(char), sizeof(unsigned char), and sizeof(signed char) are all 1 byte.
6.5.3.4/2 The sizeof and _Alignof operators
The sizeof operator yields the size (in bytes) of its operand
6.5.3.4/4:
When sizeof is applied to an operand that has type char, unsigned char, or signed char,
(or a qualified version thereof) the result is 1.
The Range of the Values and the Size of the Type
Objects may use not all the bits to store a value
The object representation has value bits, may have padding bits, and for the signed types has exactly one sign bit (6.2.6.2/1/2 Integer types).
E.g. a variable can have a size of 4 bytes, but only the 2 least significant bytes may be used to store a value
(the object representation has only 16 value bits), similar to how the bool type has at least 1 value bit, and all other bits are padding bits.
The correspondence between the range of the values and the size of the type (or the number of value bits) is arguable.
On the one hand, @eric-postpischil refers to 3.19/1:
value
precise meaning of the contents of an object when interpreted as having a specific type
This gives the impression that every value has a unique bit representation (bit pattern).
On the other hand, @language-lawyer states:
different values don't have to be represented by different bit patterns. Thus, there can be more values than possible bit patterns.
when there is contradiction between the standard and a committee response (CR), committee response is chosen by implementors.
from DR260 Committee Response follows that a bit pattern in object representation doesn't uniquely determine the value.
Different values may be represented by the same bit pattern. So I think an implementation with CHAR_BIT == 8 and sizeof(int) == 1 is possible.
I didn't claim that an object has multiple values at the same time
@language-lawyer's statements give the impression that multiple values (e.g. 5, 23, -1), probably at different moments of time,
can correspond to the same bit pattern (e.g. 0xFFFF)
of the value bits of a variable. If that's true, then the integer types other than [[un]signed] char (see "The sizeof([[un]signed] char)" section above) can have any byte size >= 1
(they must have at least one value bit, which prevents them from having byte size 0 (paranoidly strictly speaking),
which results in a size of at least one byte),
and the whole range of values (mandated by <limits.h>, see below) can correspond to that "at least one value bit".
To summarize, the relation between sizeof(short), sizeof(int), sizeof(long), sizeof(long long) can be any
(any of these, in byte size, can be greater than or equal to any of the others. Again, somewhat paranoidly strictly speaking).
What Does Not Seem Arguable
What has not been mentioned is 6.2.6.2/1/2 Integer types:
For unsigned integer types .. If there are
N value bits, each bit shall represent a different power of 2 between 1 and 2^(N-1), so that objects of
that type shall be capable of representing values from 0 to 2^N - 1 using a pure binary representation ..
For signed integer types .. Each bit that is a value bit shall have
the same value as the same bit in the object representation of the corresponding unsigned type ..
This makes me believe that each value bit adds a unique value to the overall value of the object. E.g. the least significant value bit (I'll call it value bit number 0), regardless of where in the byte(s) it is located, adds a value of 2^0 == 1, and no other value bit adds that value, i.e. that value is contributed uniquely.
The value bit number 1 (again, regardless of its position in the byte(s), but position different from the position of any other value bit) uniquely adds a value of 2^1 == 2.
These two value bits together sum up to the overall absolute value of 1 + 2 == 3.
Here I won't dig into whether they add a value when set to 1 or when cleared to 0 or combination of those.
In the text below I assume that they add value if set to 1.
Just in case I'll also quote 6.2.6.2/2 Integer types:
If the sign bit is one, the value shall be modified in one of the following ways:
...
— the sign bit has the value -(2^M) (two’s complement);
Earlier in 6.2.6.2/2 it has been mentioned that M is the number of value bits in the signed type.
Thus, if we are talking about 8-bit signed value with 7 value bits and 1 sign bit, then the sign bit, if set to 1, adds the value of -(2^M) == -(2^7) == -128.
Earlier I considered an example where the two least significant value bits sum up to the overall absolute value of 3.
Together with the sign bit set to 1 for the 8-bit signed value with 7 value bits, the overall signed value will be -128 + 3 == -125.
As an example, that value can have a bit pattern of 0x83 (the sign bit is set to 1 (0x80), the two least significant value bits are set to 1 (0x03), and both value bits add to the overall value if are set to 1, rather than cleared to 0, in the two's complement representation).
This observation makes me think that, very likely, there is a one-to-one correspondence
between the range of values and the number of value bits in an object - every value has a unique pattern of value bits and every pattern of value bits uniquely maps to a single value.
(I realize that this intermediate conclusion can still be not strict enough or wrong or not cover certain cases)
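As a quick check of the arithmetic worked through above (under the common two's-complement assumption; the conversion of 0x83 to signed char is implementation-defined), the following sketch prints -125 twice:

#include <stdio.h>

/* The sign bit contributes -128, the two low value bits contribute 2 + 1,
 * giving -125 for the bit pattern 0x83 on a two's-complement implementation
 * with an 8-bit signed char. */
int main(void)
{
    signed char v = (signed char)0x83;  /* implementation-defined conversion */
    printf("%d\n", v);                  /* -125 on the common implementations */
    printf("%d\n", -128 + 2 + 1);       /* -125 */
    return 0;
}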
Minimum Number of Value Bits and Bytes
5.2.4.2.1/1 Sizes of integer types <limits.h>:
Important sentence:
Their implementation-defined values shall be equal or greater in magnitude (absolute value) to those shown, with the same sign.
Then:
SHRT_MIN -32767 // -(2^15 - 1)
SHRT_MAX +32767 // 2^15 - 1
USHRT_MAX 65535 // 2^16 - 1
This tells me that
short int has at least 15 value bits (see SHRT_MIN, SHRT_MAX above), i.e. at least 2 bytes (if byte is 8 bits, see "The Number of Bits in a Byte" above).
unsigned short int has at least 16 value bits (USHRT_MAX above), i.e. at least 2 bytes.
Continuing that logic (see 5.2.4.2.1/1):
int has at least 15 value bits (see INT_MIN, INT_MAX), i.e. at least 2 bytes.
unsigned int has at least 16 value bits (see UINT_MAX), i.e. at least 2 bytes.
long int has at least 31 value bits (see LONG_MIN, LONG_MAX), i.e. at least 4 bytes.
unsigned long int has at least 32 value bits (see ULONG_MAX), i.e. at least 4 bytes.
long long int has at least 63 value bits (see LLONG_MIN, LLONG_MAX), i.e. at least 8 bytes.
unsigned long long int has at least 64 value bits (see ULLONG_MAX), i.e. at least 8 bytes.
This proves to me that:
1 == sizeof(char) < any of { sizeof(short), sizeof(int), sizeof(long), sizeof(long long) }.
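As an aside, a small program like the following shows what one particular implementation actually chooses; the printed values are observations about that implementation, not guarantees that follow from the quoted text:

#include <limits.h>
#include <stdio.h>

/* Prints the byte width and type sizes chosen by this implementation. */
int main(void)
{
    printf("CHAR_BIT          = %d\n", CHAR_BIT);
    printf("sizeof(short)     = %zu\n", sizeof(short));
    printf("sizeof(int)       = %zu\n", sizeof(int));
    printf("sizeof(long)      = %zu\n", sizeof(long));
    printf("sizeof(long long) = %zu\n", sizeof(long long));
    return 0;
}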
The sizeof(int)
6.2.5/5 Types
A "plain" int object has the natural size suggested by the architecture of the execution environment (large enough to contain any value in the range INT_MIN to INT_MAX as defined in the header <limits.h>).
This proves to me that:
sizeof(int) == 4 on 32-bit architecture (if byte is 8 bits),
sizeof(int) == 8 on 64-bit architecture (if byte is 8 bits).
The sizeof(unsigned T)
6.2.5/6 Types
For each of the signed integer types, there is a corresponding (but different) unsigned integer type (designated with the keyword unsigned) that uses the same amount of storage (including sign information) and has the same alignment requirements.
This proves to me that:
sizeof(unsigned T) == sizeof(signed T).
The Ranges of Values
6.2.5/8 Types
For any two integer types with the same signedness and different integer conversion rank (see 6.3.1.1), the range of values of the type with smaller integer conversion rank is a subrange of the values of the other type.
(See the discussion of 6.3.1.1 below)
I assume that a subrange of values can contain the same or smaller number of values than the range.
I.e. the type with the smaller conversion rank can have the same or smaller number of values than the type with the greater conversion rank.
6.3.1.1/1 Boolean, characters, and integers
— The rank of long long int shall be greater than the rank of long int, which shall be greater than the rank of int, which shall be greater than the rank of short int, which shall be greater than the rank of signed char.
— The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type, if any.
— The rank of _Bool shall be less than the rank of all other standard integer types.
— The rank of any enumerated type shall equal the rank of the compatible integer type (see 6.7.2.2).
This tells me that:
range_of_values(bool) <= range_of_values(signed char) <= range_of_values(short int) <= range_of_values(int) <= range_of_values(long int) <= range_of_values(long long int).
For the unsigned types the relation between the ranges of values is the same.
This establishes the same relation for the number of value bits in the types.
But still does not prove the same relation between the sizes in bytes of objects of those types.
I.e. C (as of [C17_N2176]) does not guarantee the following statement (as opposed to C++):
sizeof(short) <= sizeof(int) <= sizeof(long) <= sizeof(long long)
6.2.6.2 Integer Types starts by defining value and padding bits for unsigned subtypes (except for unsigned char).
Of padding bits not much is said except that there don't have to be any at all. But there can be more than one, unlike the sign bit for signed types.
There is no common-sense rule against over-padding a short until it gets longer than a long, whether long has more value bits or not.
The direct implicit relation between number of (value) bits and the maximum value also shows in the title 5.2.4.2.1 Sizes of integer types <limits.h>. This defines minimum maximum values, not object sizes (except with CHAR_BIT).
The rest lies in the names themselves and in the hands of the implementation: short and long, not small and large. It is nicer to say "I am a space-saving integer" than "I am an integer with a reduced maximum value".
From initial examination of ISO/IEC 9899:2017 (your C17 C17_N2176 link):
Section "5.2.4.2.1 Sizes of integer types <limits.h>" has ranges with (+ or -)(2 raised to n) - 1 information (which indicates the number of bits for the type).
Section "6.2.5 Types" point 5 says '... A “plain” int object has the natural size suggested by the architecture of the execution
environment (large enough to contain any value in the range INT_MIN to INT_MAX as defined in the
header <limits.h>).'
This makes me think the ranges specify the smallest size in bits that the type can be. Maybe some architectures allot sizes greater than this smallest size.
The relevant parts are:
The environmental limits in <limits.h>, from C17 5.2.4.2.1 "Sizes of integer types <limits.h>". If we look at unsigned types only, then the minimum values the implementation needs to support are:
UCHAR_MAX 255
USHRT_MAX 65535
UINT_MAX 65535
ULONG_MAX 4294967295
ULLONG_MAX 18446744073709551615
Then check the part C17 6.3.1.1 regarding integer conversion rank (also see Implicit type promotion rules):
The rank of long long int shall be greater than the rank of long int, which
shall be greater than the rank of int, which shall be greater than the rank of short int, which shall be greater than the rank of signed char.
The rank of any unsigned integer type shall equal the rank of the corresponding
signed integer type, if any.
And then finally C17 6.2.5/8 is the normative text stating that every type with lower conversion rank is a subset of the larger ranked types:
For any two integer types with the same signedness and different integer conversion rank (see 6.3.1.1), the range of values of the type with smaller integer conversion rank is a subrange of the values of the other type.
To satisfy this requirement, we must have:
sizeof(unsigned char) <=
sizeof(unsigned short) <=
sizeof(unsigned int) <=
sizeof(unsigned long) <=
sizeof(unsigned long long)
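If a program relies on such an ordering anyway, the assumption can be made explicit at compile time with C11 _Static_assert; this is a sketch documenting one implementation's choices rather than a property the quoted passages guarantee:

#include <limits.h>

/* Compile-time checks of assumptions about this implementation's layout. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
_Static_assert(sizeof(short) <= sizeof(int), "assumed size ordering");
_Static_assert(sizeof(int) <= sizeof(long), "assumed size ordering");
_Static_assert(sizeof(long) <= sizeof(long long), "assumed size ordering");

int main(void) { return 0; }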

Why does a one-bit `int` bit-field have values 0 and −1?

Why does this code produce the output shown?
#include <stdio.h>

struct sample {
    int x:1;
} a[5] = {0, 1, 2, 3, 4};

int main() {
    int i;
    for (i = 0; i < 5; ++i) {
        printf("%d ", a[i].x);
    }
    return 0;
}
The output when compiled and executed using OnlineGDB is:
0 -1 0 -1 0
int x:1 acts as a signed one-bit integer. Per C 2018 6.7.2 5, an implementation may treat bit-fields declared as int as unsigned int, but the C implementation you are using appears to treat it as signed.
Additionally, the C implementation uses two’s complement. In two’s complement, an n-bit integer has 1 sign bit and n−1 value bits. Since your integer is one bit, it has one sign bit and zero value bits. Per the specification of two’s complement (C 2018 6.2.6.2 2), the value of the sign bit is −2^M, where M is the number of value bits. Since that is zero, the value of the sign bit is −2^0 = −1.
Thus, the bit-field has value 0 when the bit is zero and value −1 when the bit is one.1
During initialization with ={0,1,2,3,4};, the first bit-field in the array is set to zero. However, the others overflow the values representable in the bit-field. Per C 2018 6.3.1.3 3, either an implementation-defined result is produced or an implementation-defined signal is raised. This C implementation appears to be producing the low bit of the source value as the value for the bit in the bit-field. This produces bits of 0, 1, 0, 1, and 0 in the five bit fields, representing values of 0, −1, 0, −1, and 0. Thus, when they are printed, the values 0, −1, 0, −1, and 0 result.
Footnote
1 This may be seen as a natural consequence of how two’s complement normally works: With eight bits, the range is −128…127. With four bits, the range is −16…15. With fewer and fewer bits, the range is −8…7, −4…3, −2…1, and finally −1…0.
The designation 'int x:1' means only one bit of x (the 2^0 bit) is used as the range of this integer.
That explains why the stored bit is only ever 0 or 1.
The negative value arises because int normally reserves the last bit (2^31) to indicate whether the number is negative. Your int x:1 has only one bit, and that same bit also indicates a negative number.
If you try 'unsigned x:1' instead, it will output a non-negative number, because the single bit of your x no longer designates a negative value.
Try this program with 'int x:2' and 'unsigned x:2' and you will see an interesting behavior.
It clarifies why the ranges of int and unsigned are so different despite having the same number of bits:
int -> -2,147,483,648 to 2,147,483,647
unsigned -> 0 to 4,294,967,295
It is important to note that the bit position used by ':' to limit the range is up to the implementation and may give you other results.
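Here is a sketch of that suggested experiment with a one-bit unsigned field and two-bit signed/unsigned fields; the results for out-of-range values in the signed field are implementation-defined:

#include <stdio.h>

/* Same initializers as the question, stored in a one-bit unsigned field and
 * in two-bit signed/unsigned fields. */
struct u1 { unsigned x:1; } b[5] = {0, 1, 2, 3, 4};
struct s2 { int      x:2; } c[5] = {0, 1, 2, 3, 4};
struct u2 { unsigned x:2; } d[5] = {0, 1, 2, 3, 4};

int main(void)
{
    for (int i = 0; i < 5; ++i)
        printf("%d %d %d\n", b[i].x, c[i].x, d[i].x);
    /* A typical two's-complement implementation prints:
       0 0 0
       1 1 1
       0 -2 2
       1 -1 3
       0 0 0 */
    return 0;
}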
First of all, using GCC 9.2.1, I see several -Woverflow warnings, one for each initializer except 0 and 1 (which makes sense).
An initializer of -1 would also compile without a warning. This is because it is (defined as) a one-bit field: an integer with all (here, one) bits set is equal to -1, unless it is unsigned. And the LSB (the first bit) is never set for an even number, which explains why only the odd initializers produce non-zero values.

What will be the output in C? [duplicate]

This question already has answers here:
Is char signed or unsigned by default?
(6 answers)
Integer conversions(narrowing, widening), undefined behaviour
(2 answers)
Range of char type values in C
(6 answers)
Closed 5 years ago.
I am having trouble finding the output of this code. Please help me work out the output of the following code segment.
#include <stdio.h>

int main() {
    char c = 4;
    c = c * 200;
    printf("%d\n", c);
    return 0;
}
I want to know why the output is 32. Would you please tell me? I want the exact calculation.
Warning: long-winded answer ahead. Edited to reference the C standard and to be clearer and more concise with respect to the question being asked.
The correct answer for why you have 32 has been given a few times. Explaining the math using modular arithmetic is completely correct but might make it a little harder to grasp intuitively if you are new to programming. So, in addition to the existing correct answers, here's a visualization.
Your char is an 8-bit type, so it is made up of a series of 8 zeros and ones.
Looking at the raw bits in binary, when unsigned (let's leave signed types out of it for a moment, as they would just confuse the point), your variable c can take on values in the following range:
00000000 -> 0
11111111 -> 255
Now, c*200 = 800. This is of course larger than 255. In binary 800 looks like:
00000011 00100000
To represent this in memory you need at least 10 bits (see the two 1's in the upper byte). As an aside, the leading zeros don't explicitly need to be stored since they have no effect on the number. However the next largest data type will be 16 bits and it's easier to show consistently sized groupings of bits anyway, so there it is.
Since the char type is limited to 8 bits and cannot represent the result, there needs to be a conversion. ISO/IEC 9899:1999 section 6.3.1.3 says:
6.3.1.3 Signed and unsigned integers
1 When a value with integer type is converted to another integer type other than _Bool, if
the value can be represented by the new type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.
3 Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
So, if your new type is unsigned, then following rule #2, we repeatedly subtract one more than the maximum value of the new type (that is, 256) from 800 until we end up in the range of the new type: 800 − 256 = 544, 544 − 256 = 288, 288 − 256 = 32. This behaviour also happens to effectively truncate the result; as you can see, the higher bits which could not be represented have been discarded.
00100000 -> 32
The existing answers explain using the modulo operation, where 800 % 256 = 32. This is simply math that gives the remainder of a division operation. When we divide 800 by 256 we get 3 (because 256 fits into 800 at most three times) plus a remainder of 32. This is essentially the same as applying rule #2 here.
Hopefully this clarifies why you get a result of 32. However, as has been correctly pointed out, if the destination type is signed we're looking at rule #3, which says the behaviour is implementation-defined. Since the standard also says that the plain char type you are using may be signed or unsigned (and that this is implementation-defined) your particular case is then implementation-defined. However, in practice you will typically see the same behaviour where you lose the higher bits and hence you will still generally get 32.
Extending this a bit, if you were to have a signed 8-bit destination type, and you were to run your code with c=c*250 instead, you would have:
00000011 11101000 -> 1000
and you will probably find that after the conversion to the smaller signed type the result is similarly truncated as:
11101000
which in a signed type is interpreted as -24 for most systems which use two's complement. Indeed this is what happens when I run this on gcc, but again this is not guaranteed by the language itself.
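A minimal check of the truncation described above; since whether plain char is signed or unsigned is itself implementation-defined, both variants are spelled out explicitly:

#include <stdio.h>

/* Conversion back to unsigned char is well-defined (modulo 256); conversion
 * back to signed char is implementation-defined but typically truncates. */
int main(void)
{
    unsigned char u = 4;
    signed char   s = 4;
    u = (unsigned char)(u * 200);  /* well-defined: 800 % 256 == 32 */
    s = (signed char)(s * 250);    /* implementation-defined; typically -24 */
    printf("%d %d\n", u, s);       /* usually prints: 32 -24 */
    return 0;
}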

What will happen if I assign negative value to an unsigned char?

C++ Primer says that "if we assign an out-of-range value to an object of unsigned type, the result is the remainder of the value modulo the number of values the target type can hold."
It gives the example:
int main() {
    unsigned char i = -1;
    // As per the book, the value of i is 255.
}
Can anybody please explain to me how this works?
the result is the remainder of the value modulo the number of values the target type can hold
Start with "the number of values the target type can hold". For unsigned char, what is this? The range is from 0 to 255, inclusive, so there are a total of 256 values that can be represented (or "held").
In general, the number of values that can be represented in a particular unsigned integer representation is given by 2^n, where n is the number of bits used to store that type.
An unsigned char is an 8-bit type, so 2^8 == 256, just as we already knew.
Now, we need to perform a modulo operation. In your case of assigning -1 to unsigned char, you would have -1 MOD 256 == 255.
In general, the formula is: x MOD 2^n, where x is the value you're attempting to assign and n is the bit width of the type to which you are trying to assign.
More formally, this is laid out in the C++11 language standard (§ 3.9.1/4). It says:
Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer.*
* This implies that unsigned arithmetic does not overflow because a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting unsigned integer type.
Perhaps an easier way to think about modulo arithmetic (and the description that you'll most commonly see used) is that overflow and underflow wrap around. You started with -1, which underflowed the range of an unsigned char (which is 0–255), so it wrapped around to the maximum representable value (which is 255).
The C standard is equivalent, though worded differently:
6.3.1.3 Signed and unsigned integers
1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented by the new type until the value is in the range of the new type.
3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
The literal 1 is of type int. For this explanation, let's assume that sizeof(int) == 4 as it most probably is. So then 1 in binary would look like this:
00000000 00000000 00000000 00000001
Now let's apply the unary minus operator to get the -1. We're assuming two's complement is used as it most probably is (look up two's complement for more explanation). We get:
11111111 11111111 11111111 11111111
Note that in the above numbers the first bit is the sign bit.
When you try to assign this number to an unsigned char, for which sizeof(unsigned char) == 1 holds, the value is truncated to:
11111111
Now if you convert this to decimal, you'll get 255. Here the first bit is not seen as a sign bit, as the type is unsigned.
In Stroustrup's words:
If the destination type is unsigned, the resulting value is simply as many bits from the source as will fit in the destination (high-order bits are thrown away if necessary). More precisely, the result is the least unsigned integer congruent to the source integer modulo 2 to the nth, where n is the number of bits used to represent the unsigned type.
Excerpt from C++ standard N3936:
For each of the standard signed integer types, there exists a corresponding (but different) standard unsigned
integer type: “unsigned char”, “unsigned short int”, “unsigned int”, “unsigned long int”,
and “unsigned long long int”, each of which occupies the same amount of storage and has the same
alignment requirements (3.11) as the corresponding signed integer type47; that is, each signed integer type
has the same object representation as its corresponding unsigned integer type.
I was going through that excerpt from C++ Primer myself, and I think I have figured out a way to work out mathematically how those values come about (feel free to correct me if I'm wrong :) ). Take the particular code below as an example.
unsigned char c = -4489;
std::cout << +c << std::endl; // will yield 119 as its output
So how does this answer of 119 come out?
Well, take the 4489 and divide it by the total number of values an unsigned char can hold, i.e. 2^8 = 256; that gives 137 as the remainder:
4489 % 256 = 137.
Now just subtract that 137 from 256:
256 - 137 = 119.
That's how we derive the value for a negative source. Do try it yourself on other values as well; it has worked accurately for me!
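A quick C-style check of that remainder reasoning (the conversion rule quoted earlier from 6.3.1.3 defines the result as reduction modulo 256):

#include <stdio.h>

/* Converting -4489 to unsigned char is defined as reduction modulo 256,
 * which gives 119. */
int main(void)
{
    unsigned char c = (unsigned char)-4489;
    printf("%d\n", c);                   /* 119 */
    printf("%d\n", 256 - (4489 % 256));  /* 256 - 137 == 119 */
    return 0;
}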
