Is the following guaranteed to work or implementation defined?
unsigned int a = 4294967294;
signed int b = a;
The value of b is -2 on gcc.
From C99
(§6.3.1.3/3) Otherwise, the new type is signed and the value cannot be
represented in it; either the result is implementation-defined or an
implementation-defined signal is raised.
The conversion of a value to signed int is implementation-defined (as you correctly mentioned, because of 6.3.1.3p3). On some systems, for example, the result can be INT_MAX (a saturating conversion).
For gcc the implementation behavior is defined here:
The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 6.3.1.3).
For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.
http://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html
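As a concrete illustration of that documented rule (a minimal sketch, assuming a 32-bit int as on typical gcc targets), 4294967294 is reduced modulo 2^32: 4294967294 - 4294967296 = -2.
#include <stdio.h>

int main(void) {
    unsigned int a = 4294967294u;   /* UINT_MAX - 1 with a 32-bit unsigned int  */
    signed int b = a;               /* implementation-defined conversion        */
    printf("%d\n", b);              /* prints -2 under gcc's modulo-2^N rule    */
    return 0;
}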
ouah's answer tells you that it's implementation-defined, but doesn't explain how your implementation yields -2 specifically. I'll answer that:
1) Your implementation seems to have a 32-bit-wide int type and two's complement representation.
2) 4294967294 is (UINT_MAX - 1) = 0xfffffffe.
In your implementation, (UINT_MAX - 1) is converted to signed int as follows:
The most significant bit of 0xfffffffe is 1, so on a two's complement implementation the bit pattern is interpreted as a negative number. Its magnitude is given by the two's complement rule: ~(0xfffffffe) + 1 = 0x00000001 + 1 = 0x00000002 = 2 in decimal.
Thus, you get -2 as your final answer after the conversion.
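A small sketch of that bit-level reasoning (again assuming a 32-bit int and two's complement):
#include <stdio.h>

int main(void) {
    unsigned int a = 0xfffffffeu;         /* UINT_MAX - 1                        */
    unsigned int magnitude = ~a + 1u;     /* two's complement "negate" gives 2   */
    printf("magnitude: %u\n", magnitude); /* prints 2                            */
    printf("as signed: %d\n", (int)a);    /* prints -2 on such an implementation */
    return 0;
}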
Hope this helped.
Related
What language in the standard makes this code work, printing '-1'?
unsigned int u = UINT_MAX;
signed int s = u;
printf("%d", s);
https://en.cppreference.com/w/c/language/conversion
otherwise, if the target type is signed, the behavior is implementation-defined (which may include raising a signal)
https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation
GCC supports only two’s complement integer types, and all bit patterns are ordinary values.
The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3):
For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.
To me it seems like converting UINT_MAX to an int would therefore mean reducing UINT_MAX modulo 2^(CHAR_BIT * sizeof(int)). For the sake of argument, with 32-bit ints, 0xFFFFFFFF mod 2^32 = 0xFFFFFFFF. So this doesn't really explain how the value -1 ends up in the int.
Is there some language somewhere else that says after the modulo reduction we just reinterpret the bits? Or some other part of the standard that takes precedence over the parts I have referenced?
No part of the C standard guarantees that your code shall print -1 in general. As it says, the result of the conversion is implementation-defined. However, the GCC documentation does promise that if you compile with their implementation, then your code will print -1. It's nothing to do with bit patterns, just math.
The clearly intended reading of "reduced modulo 2^N" in the GCC manual is that the result should be the unique number in the range of signed int that is congruent mod 2^N to the input. This is a precise mathematical way of defining the "wrapping" behavior that you expect, which happens to coincide with what you would get by reinterpreting the bits.
Assuming 32 bits, UINT_MAX has the value 4294967295. This is congruent mod 4294967296 to -1. That is, the difference between 4294967295 and -1 is a multiple of 4294967296, namely 4294967296 itself. Moreover, this is necessarily the unique such number in [-2147483648, 2147483647]. (Any other number congruent to -1 would be at least -1 + 4294967296 = 4294967295, or at most -1 - 4294967296 = -4294967297). So -1 is the result of the conversion.
In other words, add or subtract 4294967296 repeatedly until you get a number that's in the range of signed int. There's guaranteed to be exactly one such number, and in this case it's -1.
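That reduction is easy to sketch in code (assuming a 32-bit int, so 2^N = 4294967296, and a long long wide enough to hold the intermediate values):
#include <limits.h>
#include <stdio.h>

/* Reduce v modulo 2^32 into the range [INT_MIN, INT_MAX]. */
long long reduce_mod_2_32(long long v) {
    const long long two_pow_32 = 4294967296LL;
    while (v > INT_MAX) v -= two_pow_32;
    while (v < INT_MIN) v += two_pow_32;
    return v;
}

int main(void) {
    printf("%lld\n", reduce_mod_2_32(4294967295LL));   /* UINT_MAX -> -1 */
    return 0;
}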
How does the computer convert between differently sized signed integers?
For example, when I convert the long long int value 12000000000 to int, how does it reduce the value? And how does it handle negative numbers?
how does it reduce the value?
From C11 standard 6.3.1.3p3:
When a value with integer type is converted to another integer type [...]
[...]
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
The standard does not define how to convert the value - each compiler may have different behavior, but it has to document it. Nowadays we live in a two's complement world, so the behavior is the same nearly everywhere. Let's take a look at the gcc compiler - from, for example, the gcc documentation on integers' implementation-defined behavior:
The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3).
For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.
So we have:
long long int value 12000000000 to int
Let's assume long long int has 64 bits, int has 32 bits, a byte has 8 bits, and we use two's complement, so INT_MIN = -2147483648, INT_MAX = 2147483647, N is 32, and 2^N is 4294967296. Take a peek at modular arithmetic, and note that:
12000000000 = 3 * 4294967296 + -884901888
So it will be converted to -884901888. That is irrelevant to what format is used to store the number - it can be in any format it wishes.
Now, gcc is smart, and while the documentation describes the algorithm mathematically in terms of modular arithmetic, you can note that:
printf("%16llx\n%16x\n", 12000000000ll, (int)12000000000ll);
2cb417800
cb417800
I.e., the mathematical operation "reduce modulo 2^32" is, in binary, the same as ANDing with a mask of all 32 bits set: num & 0xffffffff.
And how does it handle negative numbers?
Exactly the same way; there's just a minus sign. For example, take -12000000000ll:
-12000000000ll = -3 * 4294967296 + 884901888
So (int)-12000000000ll will be converted to 884901888. Note that in binary it's just:
printf("%16llx\n%16x\n", -12000000000ll, (int)-12000000000ll);
fffffffd34be8800
34be8800
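For completeness, a self-contained version of the two snippets above (a sketch assuming 64-bit long long, 32-bit int and two's complement; the hex conversions just expose the bit patterns):
#include <stdio.h>

int main(void) {
    long long big = 12000000000ll;
    printf("%d\n", (int)big);    /* -884901888 under gcc's modulo-2^32 rule */
    printf("%d\n", (int)-big);   /*  884901888                              */
    printf("%llx\n%x\n", (unsigned long long)big, (unsigned)(int)big);
    /* prints 2cb417800 and cb417800, i.e. the masked low 32 bits */
    return 0;
}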
Attempting to convert an integer value to a smaller signed type in which it cannot be properly represented (such as your example of trying to convert 12000000000 to a 32-bit int) is implementation-defined behaviour. From the C11 Draft Standard (the third paragraph being relevant here):
6.3.1.3 Signed and unsigned integers
1 When a value with integer type is converted to another integer type other than _Bool, if the value can be represented by the new type, it is
unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.60)
3 Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised
I found a solution that works with two's complement. It can convert integers up and down in width, and it works with positive and negative numbers.
a is the number,
oldp is the old position of the sign bit
newp is the new position of the sign bit
#include <stdint.h>

uint64_t shsbc(uint64_t a, uint32_t oldp, uint32_t newp) {
    if (!(a >> oldp & 1)) {               /* old sign bit clear: non-negative */
        if (oldp > newp) {
            return a & UINT64_MAX >> (64 - (newp + 1));   /* truncate */
        } else {
            return a & UINT64_MAX >> (64 - (oldp + 1));   /* value already fits */
        }
    }
    if (oldp > newp) {
        /* Negative, narrowing: reduction modulo 2^(newp+1) simply keeps
           the low newp+1 bits. */
        return a & UINT64_MAX >> (64 - (newp + 1));
    } else {
        /* Negative, widening: drop the old sign bit, then sign-extend
           by setting bits oldp..newp. */
        a &= UINT64_MAX >> (64 - oldp);
        a |= (UINT64_MAX >> (63 - (newp - oldp))) << oldp;
        return a;
    }
}
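For example, with the function above (the positions are those of the sign bits, so 7 for an 8-bit quantity and 15 for a 16-bit one):
#include <stdio.h>
#include <inttypes.h>

uint64_t shsbc(uint64_t a, uint32_t oldp, uint32_t newp);   /* defined above */

int main(void) {
    uint64_t widened  = shsbc(0x80, 7, 15);     /* sign-extend -128 from 8 to 16 bits: 0xff80 */
    uint64_t narrowed = shsbc(0xff80, 15, 7);   /* truncate it back to 8 bits: 0x80           */
    printf("%" PRIx64 " %" PRIx64 "\n", widened, narrowed);   /* prints ff80 80 */
    return 0;
}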
Signed Short (Signed Int16) Multiplication Explanation?
short ss = -32768; // 0x8000 SHRT_MIN
ss *= (short) -1;
printf ("%d", (int)ss); // Prints -32768
What are the mechanics of how a signed short with a value of -32768 times -1 can be itself? My guess is that (int)32768 overflows and wraps back around to -32768, but nothing here asks for promotion to int or any larger data type.
Looking for the part of the C specification that defines this behavior.
There is no undefined behavior here -- assuming that int is wider than 16 bits.
This:
ss *= (short) -1;
is equivalent to:
ss = ss * (short)-1;
Both operands of * are promoted from short to int (by the integer promotions), and the multiplication is done in type int. It yields the int value 32768 (again, unless INT_MAX == 32767, which is legal but rare in modern non-embedded systems). (C has no arithmetic operations on integer types narrower than int and unsigned int.)
That int value is converted back to short. Unlike overflow in an arithmetic operation (which has undefined behavior), a conversion of an integer value to a signed integer type, when the value does not fit in the target type, yields an implementation-defined result (or raises an implementation-defined signal, but I don't think any implementations actually do that).
Converting 32768 to type short will probably yield -32768.
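A minimal sketch of that sequence (assuming a 32-bit int and a typical two's complement implementation such as gcc):
#include <stdio.h>

int main(void) {
    short ss = -32768;
    int product = ss * (short)-1;   /* both operands promoted to int: 32768 */
    ss = (short)product;            /* implementation-defined conversion    */
    printf("%d %d\n", product, ss); /* typically prints: 32768 -32768       */
    return 0;
}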
The behavior of signed conversion is specified in N1570 6.3.1.3p3:
Otherwise, the new type is signed and the value cannot be represented
in it; either the result is implementation-defined or an
implementation-defined signal is raised.
The integer promotions are described in 6.3.1.1p2:
If an int can represent all values of the original type (as restricted
by the width, for a bit-field), the value is converted to an int;
otherwise, it is converted to an unsigned int. These are called the
integer promotions. All other types are unchanged by the integer promotions.
Is the following code undefined behavior according to GCC in C99 mode:
signed char c = CHAR_MAX; // assume CHAR_MAX < INT_MAX
c = c + 1;
printf("%d", c);
signed char overflow does cause undefined behavior, but that is not what happens in the posted code.
With c = c + 1, the integer promotions are performed before the addition, so c is promoted to int in the expression on the right. Since 128 is less than INT_MAX, this addition occurs without incident. Note that char is typically narrower than int, but on rare systems char and int may be the same width. In either case a char is promoted to int in arithmetic expressions.
When the assignment to c is then made, if plain char is unsigned on the system in question, the result of the addition is less than UCHAR_MAX (which must be at least 255) and this value remains unchanged in the conversion and assignment to c.
If instead plain char is signed, the result of the addition is converted to a signed char value before assignment. Here, if the result of the addition can't be represented in a signed char the conversion "is implementation-defined, or an implementation-defined signal is raised," according to §6.3.1.3/3 of the Standard. SCHAR_MAX must be at least 127, and if this is the case then the behavior is implementation-defined for the values in the posted code when plain char is signed.
The behavior is not undefined for the code in question, but is implementation-defined.
No, it has implementation-defined behavior, either storing an implementation-defined result or possibly raising a signal.
Firstly, the usual arithmetic conversions are applied to the operands. This converts the operands to type int and so the computation is performed in type int. The result value 128 is guaranteed to be representable in int, since INT_MAX is guaranteed to be at least 32767 (5.2.4.2.1 Sizes of integer types), so next a value 128 in type int must be converted to type char to be stored in c. If char is unsigned, CHAR_MAX is guaranteed to be at least 255; otherwise, if SCHAR_MAX takes its minimal value of 127:
6.3.1.3 Signed and unsigned integers
When a value with integer type is converted to another integer type, [if] the new type is signed and the value cannot be represented in it[,] either the
result is implementation-defined or an implementation-defined signal is raised.
In particular, gcc can be configured to treat char as either signed or unsigned (-f[un]signed-char); by default it will pick the appropriate configuration for the target platform ABI, if any. If a signed char is selected, all current gcc target platforms that I am aware of have an 8-bit byte (some obsolete targets such as AT&T DSP1600 had a 16-bit byte), so it will have range [-128, 127] (8-bit, two's complement) and gcc will apply modulo arithmetic yielding -128 as the result:
The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3).
For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.
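For instance, on such a target (or with -fsigned-char), the documented modulo-2^8 reduction can be seen in a small sketch:
#include <limits.h>
#include <stdio.h>

int main(void) {
    signed char c = SCHAR_MAX;   /* 127 when signed char is 8 bits wide */
    c = c + 1;                   /* the addition is done in int: 128    */
    printf("%d\n", c);           /* 128 reduced modulo 2^8 gives -128   */
    return 0;
}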
I understand that a char variable holds values from -128 to 127 (signed) or from 0 to 255 (unsigned).
char x;
x = 128;
printf("%d\n", x);
But how does it work? Why do I get -128 for x?
printf is a variadic function, only providing an exact type for the first argument.
That means the default argument promotions are applied to the following arguments, so all integer types of rank less than int are promoted to int or unsigned int, and all floating values of rank smaller than double are promoted to double.
If your implementation has a CHAR_BIT of 8, plain char is signed, and you have an obliging two's complement implementation, you thus get
128 (literal) to -128 (char/signed char) to -128 (int) printed as int => -128
If all the listed conditions except the obliging two's complement implementation are fulfilled, you get a signal or some other implementation-defined value.
Otherwise, you get an output of 128, because 128 fits in an unsigned plain char.
Standard quote for case 2 (Thanks to Matt for unearthing the right reference):
6.3.1.3 Signed and unsigned integers
1 When a value with integer type is converted to another integer type other than _Bool, if
the value can be represented by the new type, it is unchanged.
2 Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or
subtracting one more than the maximum value that can be represented in the new type
until the value is in the range of the new type.60)
3 Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
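The signed and unsigned outcomes above can be seen side by side in a small sketch (assuming CHAR_BIT == 8, two's complement, and the typical implementation-defined result for the out-of-range conversion):
#include <stdio.h>

int main(void) {
    signed char   s = 128;    /* out of range: implementation-defined, typically -128 */
    unsigned char u = 128;    /* fits: value unchanged                                */
    printf("%d %d\n", s, u);  /* both promoted to int by the default argument
                                 promotions; typically prints -128 128                */
    return 0;
}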
This all has nothing to do with variadic functions, default argument promotions etc.
Assuming your system has signed chars, x = 128; performs an out-of-range assignment. The behaviour of this is implementation-defined, meaning that the compiler may choose an action, but it must document what it does (and therefore do it reliably). This action is allowed to include raising a signal.
The usual behaviour that modern compilers do for out-of-range assignment is to truncate the representation of the value to fit in the destination type.
In binary representation, 128 is 000....00010000000.
Truncating this into a signed char gives the signed char with binary representation 10000000. In two's complement representation, which is used by all modern C systems for negative numbers, this is the representation of the value -128. (For historical curiosity: in one's complement this is -127, and in sign-magnitude this is -0, which may be a trap representation and thus raise a signal.)
Finally, printf accurately prints out this char's value of -128. The %d conversion works for a char because of the default argument promotions and the facts that INT_MIN <= CHAR_MIN and INT_MAX >= CHAR_MAX; this behaviour is guaranteed except on systems which have plain char as unsigned and sizeof(int) == 1 (such systems do exist, but you'd know about it if you were on one).
Let's look at the binary representation of 128 when stored in 8 bits:
1000 0000
And now let's look at the binary representation of -128 when stored into 8 bits:
1000 0000
The default for char with your current setup looks to be signed char (note that whether plain char is signed is not fixed by the C standard; it is implementation-defined), so when you assign the value 128 to x you're assigning it the bit pattern 1000 0000, and when you compile and print it out, it prints the signed value of that binary representation (meaning -128).
It turns out my environment does the same, treating char as signed char. As expected, if I cast x to unsigned char I get the expected output of 128:
#include <stdio.h>
#include <stdlib.h>

int main() {
    char x;
    x = 128;                                  /* out of range for a signed char: implementation-defined, typically -128 */
    printf("%d %d\n", x, (unsigned char)x);   /* same bit pattern read as signed and as unsigned */
    return 0;
}
gives me the output of -128 128
Hope this helps!