unsigned and int gotcha [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Arithmetic operations on unsigned and signed integers
unsigned int b=2;
int a=-2;
if(a>b)
    printf("a>b");
else
    printf("b>a");
OUTPUT: a>b
int b=2;
int a=-2;
if(a>b)
    printf("a>b");
else
    printf("b>a");
OUTPUT: b>a
Please, could someone explain the output?

In the first case both operands are converted to unsigned int, the converted a will be UINT_MAX-1, which is much larger than b and hence the output.
Don't compare signed and unsigned integers unless you understand the semantics of the arithmetic conversions; the results might surprise you.

When signed and unsigned values are compared, and the unsigned values can't all be represented in the signed type, the signed operand is converted to unsigned. This is done with a formula that, on a two's complement machine, amounts to a reinterpretation of the bit pattern.
Well, negative numbers have lots of high bits set...
Since your operands are all of the same rank, it's just a matter of unsigned bit patterns being compared.
And so -2 is represented with 111111..110, one less than the largest possible, and it easily beats 2 when interpreted as unsigned.

The following is taken from The C Programming Language by Kernighan and Ritchie - 2.7 Type Conversions - page 44; the second half of the page explains the same scenario in detail. A small portion is below for your reference.
Conversion rules are complicated when unsigned operands are involved. The problem is that comparisons between signed and unsigned values are machine-dependent, because they depend on the sizes of the various integer types. For example, suppose that int is 16 bits long and long is 32 bits. Then -1L < 1U, because 1U, which is an int, is promoted to a signed long. But -1L > 1UL, because -1L is promoted to unsigned long and thus appears to be a larger positive number.

You need to learn the operation of the operators in C and the C promotion and conversion rules. They are explained in the C standard. Some excerpts from it plus my comments:
6.5.8 Relational operators
Syntax
1 relational-expression:
shift-expression
relational-expression < shift-expression
relational-expression > shift-expression
relational-expression <= shift-expression
relational-expression >= shift-expression
Semantics
3 If both of the operands have arithmetic type, the usual arithmetic conversions are
performed.
Most operators include this "usual arithmetic conversions" step before the actual operation (addition, multiplication, comparison, etc.). - Alex
6.3.1.8 Usual arithmetic conversions
1 Many operators that expect operands of arithmetic type cause conversions and yield result
types in a similar way. The purpose is to determine a common real type for the operands
and result. For the specified operands, each operand is converted, without change of type
domain, to a type whose corresponding real type is the common real type. Unless
explicitly stated otherwise, the common real type is also the corresponding real type of
the result, whose type domain is the type domain of the operands if they are the same,
and complex otherwise. This pattern is called the usual arithmetic conversions:
First, if the corresponding real type of either operand is long double, the other
operand is converted, without change of type domain, to a type whose
corresponding real type is long double.
Otherwise, if the corresponding real type of either operand is double, the other
operand is converted, without change of type domain, to a type whose
corresponding real type is double.
Otherwise, if the corresponding real type of either operand is float, the other
operand is converted, without change of type domain, to a type whose
corresponding real type is float.
Otherwise, the integer promotions are performed on both operands. Then the
following rules are applied to the promoted operands:
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned
integer types, the operand with the type of lesser integer conversion rank is
converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or
equal to the rank of the type of the other operand, then the operand with
signed integer type is converted to the type of the operand with unsigned
integer type.
Otherwise, if the type of the operand with signed integer type can represent
all of the values of the type of the operand with unsigned integer type, then
the operand with unsigned integer type is converted to the type of the
operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type.
6.3.1.3 Signed and unsigned integers
When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type. (The rules describe arithmetic on the mathematical value, not the value of a given type of expression.)
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
So, in your a>b (with a being an int and b being an unsigned int), per the above rules a is converted to unsigned int before the comparison. Since a is negative (-2), the unsigned value becomes UINT_MAX+1+a (this is the "repeatedly adding or subtracting one more than the maximum value" part). And UINT_MAX+1+a in your case is UINT_MAX+1-2 = UINT_MAX-1, which is a huge positive number compared to the value of b (2). And so a>b yields "true".
Forget the math you learned at school. Learn how C does it.

In the first case the signed a is converted to unsigned int, and then the two values are compared.
A signed type and its corresponding unsigned type have the same conversion rank in C99. When the operands are such a pair, the result is not left to the compiler: the signed operand is converted to the unsigned type.
Here is a summary of the rules.

Related

Result of an operator if different data types as arguments? (C)

I haven't been able to find an answer to the following question.
My question is:
What is the result of an operator when different data types (like int, or float) are being operated on?
For example,
float * int = ?
float / int = ?
We know that operations on same data types give results of the same data type. For instance,
float * float = float
But I wanted to know what happens in this other case?
This question has probably already been discussed here, but it has been hard for me to find something similar.
Thanks.

When operating on values of differing types, the operands undergo the usual arithmetic conversions. These are specified in section 6.3.1.8 of the C standard.
First, if the corresponding real type of either operand is long double, the other operand is converted, without change of type domain, to a type whose corresponding real type is long double.
Otherwise, if the corresponding real type of either operand is double, the other operand is converted, without change of type domain, to a type whose corresponding real type is double.
Otherwise, if the corresponding real type of either operand is float, the other operand is converted, without change of type domain, to a type whose corresponding real type is float.
Otherwise, the integer promotions are performed on both
operands. Then the following rules are applied to the promoted
operands:
If both operands have the same type, then no further conversion is
needed.
Otherwise, if both operands have signed integer types or both have
unsigned integer types, the operand with the type of lesser
integer conversion rank is converted to the type of the operand
with greater rank.
Otherwise, if the operand that has unsigned integer type has
rank greater or equal to the rank of the type of the other
operand, then the operand with signed integer type is
converted to the type of the operand with unsigned integer
type.
Otherwise, if the type of the operand with signed integer type can
represent all of the values of the type of the operand with unsigned
integer type, then the operand with unsigned integer type is
converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned
integer type corresponding to the type of the operand with signed
integer type.
In the case of a float and an int as operands to * or /, the int operand will be converted to float.

Before performing the arithmetic operation, the compiler arranges for the "usual arithmetic conversions" to be performed.
The precise rules are slightly complicated, and well documented in the provided link, but the basic idea is:
If either argument is a floating point type, both arguments are converted to the more precise floating point between the two arguments.
Otherwise, if both arguments are integer types, they are first promoted to at least int, and then if the two arguments are not the same width, the narrower one is converted to the type of the other one.
The real rules are more complicated for a couple of reasons:
A modern C compiler may implement complex (and imaginary) types, which play into the conversions.
Conversion between signed and unsigned types can be counter-intuitive. If you haven't read the precise rules, it is best to avoid this case.

integer promotion and unsigned interpretation

int8_t int8 = ~0;
uint16_t uInt16 = (uint16_t) int8;
Regarding the cast above: where in the C standard can I find a reference for the following behaviour?
- sign extension to the larger type before the unsigned interpretation (uInt16=0xFFFF) rather than unsigned interpretation followed by zero extension to the larger type (uInt16=0xFF).
From C99 6.3.1.8
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
The above statement is clear about which variable needs to be converted; however, it is not very clear about how the conversion should actually be performed, hence my question asking for a reference from the standard.
Thanks

As per the standard:
6.3.1.3 Signed and unsigned integers
......
2. Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
And the footnote to avoid the confusion when interpreting the above:
The rules describe arithmetic on the mathematical value, not the value of a given type of expression.
I.e. if your int8 has a value of -1 (assuming negative numbers are represented in two's complement, as they are in your example), then when it is converted to uint16_t, the value (0xFFFF + 1) is added to it (which is one more than the max value that can be represented by uint16_t), which yields the result 0xFFFF + 1 - 1 = 0xFFFF.

I believe the answer is actually part of 6.3.1.8 as well:
Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:
....
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand.....
meaning that integer promotions are performed first before the conversion to unsigned using the rule 6.3.1.3.

Bit width of bit shift and arithmetic in C

I am working on a program that mixes 64 bit (for some calculations) and 32 bit (for space saving storage) unsigned integers, so it is very important I keep them sorted out during arithmetic to avoid overflows.
Here's an example problem.
I want to bit shift 1 to the left by an unsigned long n, but I want the result to be an unsigned long long. This will be used in a comparison operation in an if statement, so there is no assignment going on. I'll give you some code.
void example(unsigned long shift, unsigned long long compare)
{
    if((1<<shift)>compare)
    {
        do_stuff;
    }
}
I suspect that this would NOT do what I want, so would the following do what I want?
void example(unsigned long shift, unsigned long long compare)
{
    if(((unsigned long long)1<<shift)>compare)
    {
        do_stuff;
    }
}
How do I micromanage the bit width of these things? Which operand determines the bit width that the operation is performed with, or is it the larger of the two?
Also, I would like a run down of how this works for other operations too if possible, such as + * / % etc.
Perhaps a reference to a resource with this information would be good, I cannot seem to find a clear statement of this information anywhere. Or perhaps the rules are simple enough to just post. I am not sure.

Which operand determines the bit width that the operation is performed with, or is it the larger of the two?
For the bit-shifts, it's the left operand (the one to be shifted) that determines the type that the operation is performed with. If the integer promotions convert it to int or unsigned int, the operation is performed at that type, otherwise at the type of the left operand.
For the comparison, the result of the shift may then be converted to the type of the other operand. In your example code, the integer constant 1 has type int, hence the shift would be performed at type int, and the result of that converted to unsigned long long for the comparison. Casting works, since the result has a type that is not changed by the integer promotions, as would using a suffixed literal 1ull.
For the other operations you listed (+ * / %), as for comparisons, the type at which the operation is performed is determined by both operands through the usual arithmetic conversions. For integer operands:
Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.

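It will do exactly what you want. However, this may be achieved (in this particular case) just by using a literal constant of type unsigned long long: 1ULL. (1LL also widens the shift, but it is signed, so shifting a bit into its sign position is undefined behavior.)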

What you want is an unsigned long long literal. To do this, use 1ULL instead of 1.

K&R chapter 2: Difference between u and l integer suffix in C

I am a student, going through the book by Kernighan and Ritchie for C.
A line in the book says that -1l is less than 1u because in that case unsigned int is promoted to signed long. But -1l > 1ul because in this case -1l is promoted to unsigned long.
I can't really understand the promotion properly. What will be the value of -1l when it is promoted to unsigned long? It'll be great if anyone can help.
Thanks.

In -1l > 1ul the -1l is converted to unsigned long, and by the Standard's definition, -1 converted to an unsigned type is the largest value representable by that unsigned type.
I got my inspiration from memory of this answer here to a quite relevant question.
And after looking at the C99 draft I have lying around, see for example 6.3.1.3(2), where it says that one more than the maximum value representable by the type is added to or subtracted from the original value until it fits in the new type. I must warn you that char, although it is an integer type, is treated specially: it is implementation-defined whether char is signed or unsigned. But that is, strictly, beside the question at hand.

Implicit promotions are one of the most difficult things in the C language. If you have a C code expression looking like
if(-1l > 1ul)
then no "integer promotions" take place. Both types are of the same size, but different signedness. -1l will then be converted to unsigned long with a very large value. This is one of the rules in the "usual arithmetic conversions".

This is actually a conversion. Promotions go from types with lower rank than int to int (or unsigned int).
The rules for integer conversions in C are somewhat complex. They are, as per ISO C99 §6.3.1.8 ¶1:
Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
I'll try to explain them:
Try to convert to the larger type. When there is conflict between signed and unsigned, if the larger (including the case where the two types have the same rank) type is unsigned, go with unsigned. Otherwise, go with signed only in the case it can represent all the values of both types.

When you're learning C, if you have a question, just write yourself a simple program:
#include <stdio.h>
int main(void) {
    int si = -1;
    unsigned int ui = 1;
    /* si is converted to unsigned int before the comparison */
    if ( si > ui ) printf("-1 > 1u\n");
    else printf("-1 <= 1u\n");
    return 0;
}
You'll see that -1 > 1u is shown for the output.
Because si and ui have the same rank (one is int, the other unsigned int), the rule says that the negative value is converted to unsigned; -1 becomes UINT_MAX, which is the largest possible unsigned value.

In a C expression where unsigned int and signed int are present, which type will be promoted to what type?

I have a query about data type promotion rules in C language standard.
The C99 says that:
C integer promotions also require that "if an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int."
My question is: in a C expression where unsigned int and signed int are both present, which type will be promoted to what type?
E.g. int cannot represent all the values of unsigned int (values larger than INT_MAX), whereas unsigned int cannot represent negative values, so which type is converted to which in such cases?

I think you are confusing two things. Promotion is the process by which values of integer types "smaller" than int/unsigned int are converted either to int or unsigned int. The rules are expressed somewhat strangely (mostly so that char is handled adequately) but ensure that value and sign are preserved.
Then there is the different concept of the usual arithmetic conversions, by which the operands of arithmetic operators are converted to a common type. It begins by promoting the operands (to either int or unsigned int) if they are of a type smaller than int, and then chooses a target type by the following process (for integer types, 6.3.1.8/1):
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned
integer types, the operand with the type of lesser integer conversion rank is
converted to the type of the operand with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or
equal to the rank of the type of the other operand, then the operand with
signed integer type is converted to the type of the operand with unsigned
integer type.
Otherwise, if the type of the operand with signed integer type can represent
all of the values of the type of the operand with unsigned integer type, then
the operand with unsigned integer type is converted to the type of the
operand with signed integer type.
Otherwise, both operands are converted to the unsigned integer type
corresponding to the type of the operand with signed integer type.
(Note: I seem to recall that these rules changed slightly between C89 and C99.)

I think the following answers your question:
6.3.1.3 Signed and unsigned integers
1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
