Issue with absolute value of 64 bit integer - c

This C code tries to find the absolute value of a negative number, but the output is also negative. Can anyone tell me how to overcome this?
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <inttypes.h>
int main() {
int64_t a = 0x8000000000000000;
a = llabs(a);
printf("%" PRId64 "\n", a);
return 0;
}
Output
-9223372036854775808
UPDATE:
Thanks for all your answers. I understand that this is a non-standard value and that is why I am unable to perform an absolute operation on it. However, I did encounter this in an actual codebase that is a Genetic Programming simulation. The "organisms" in this do not know about the C standard and insist on generating this value :) Can anyone tell me an efficient way of working around this? Thanks again.

If the result of llabs() cannot be represented in the type long long, then the behaviour is undefined. We can infer that this is what's happening here: the out-of-range constant 0x8000000000000000 becomes -9223372036854775808 when converted to int64_t, and since your long long is 64 bits wide, the value 9223372036854775808 is not representable.
In order for your program to have defined behaviour, you must ensure that the value passed to llabs() is not less than -LLONG_MAX. How you do this is up to you - either modify the "organisms" so that they cannot generate this value (e.g. filter out those that create the out-of-range value as immediately unfit) or clamp the value before you pass it to llabs().
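For instance, a minimal clamping sketch (the function name is illustrative, not part of the original code):
#include <stdint.h>
#include <stdlib.h>
int64_t clamped_abs64(int64_t a) {
    if (a == INT64_MIN)      // llabs(INT64_MIN) would be undefined
        a = -INT64_MAX;      // clamp, so the absolute value stays representable
    return llabs(a);
}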

Basically, you can't.
The range of representable values for int64_t is -2^63 to +2^63 - 1. (And the standard requires int64_t to have a pure 2's-complement representation; if that's not supported, an implementation just won't define int64_t.)
That extra negative value has no corresponding representable positive value.
So unless your system has an integer type bigger than 64 bits, you're just not going to be able to represent the absolute value of 0x8000000000000000 as an integer.
In fact, your program's behavior is undefined according to the ISO C standard. Quoting section 7.22.6.1 of the N1570 draft of the 2011 ISO C standard:
The abs, labs, and llabs functions compute the absolute
value of an integer j. If the result cannot be represented, the
behavior is undefined.
For that matter, the result of
int64_t a = 0x8000000000000000;
is implementation-defined. Assuming long long is 64 bits, that constant is of type unsigned long long. It's implicitly converted to int64_t. It's very likely, but not guaranteed, that the stored value will be -2^63, or -9223372036854775808. (It's even permitted for the conversion to raise an implementation-defined signal, but that's not likely.)
(It's also theoretically possible for your program's behavior to be merely implementation-defined rather than undefined. If long long is wider than 64 bits, then the evaluation of llabs(a) is not undefined, but the conversion of the result back to int64_t is implementation-defined. In practice, I've never seen a C compiler with long long wider than 64 bits.)
If you really need to represent integer values that large, you might consider a multi-precision arithmetic package such as GNU GMP.
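If you only need the magnitude (e.g. to print or compare it), a common sidestep is to compute it in uint64_t, where 2^63 is representable. A minimal sketch (not from the answer above):
#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
int main(void) {
    int64_t a = INT64_MIN;
    // Negation of an unsigned value is well-defined (it wraps modulo 2^64),
    // so this gives the true magnitude even for INT64_MIN.
    uint64_t magnitude = (a < 0) ? -(uint64_t)a : (uint64_t)a;
    printf("%" PRIu64 "\n", magnitude);   // prints 9223372036854775808
    return 0;
}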

0x8000000000000000 is the smallest number that can be represented by a signed 64-bit integer. Because of quirks in two's complement, this is the only 64-bit integer with an absolute value that cannot be represented as a 64-bit signed integer.
This is because 0x8000000000000000 = -2^63, while the maximum representable 64-bit integer is 0x7FFFFFFFFFFFFFFF = 2^63-1.
Because of this, taking the absolute value of this is undefined behaviour that will generally result in the same value.

A signed 64-bit integer ranges from −(2^63) to 2^63 − 1. The absolute value of 0x8000000000000000, or −(2^63), is 2^63, which is bigger than the maximum 64-bit signed integer.

For a signed integer whose highest bit is set and all other bits are clear, the absolute value of that integer is not representable in the same type.
Observe an 8-bit integer
int8_t x = 0x80; // binary 1000_0000, decimal -128
An 8-bit signed integer can hold values between -128 and +127 inclusive, so the value +128 is out of range.
For a 16-bit integer this holds as well
int16_t y = 0x8000; // binary 1000_0000_0000_0000, decimal -32,768
A 16-bit integer can hold values between -32,768 and +32,767 inclusive.
This pattern holds for any size integer as long as it is represented in two's complement, as is the de-facto representation for integers in computers. Two's complement holds 0 as all bits low and -1 as all bits high.
So an N-bit signed integer can hold values between -2^(N-1) and 2^(N-1)-1 inclusive, and an unsigned integer can hold values between 0 and 2^N-1 inclusive.
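As a quick sanity check of those ranges, the limits from <stdint.h> can be printed directly (a minimal sketch):
#include <stdio.h>
#include <stdint.h>
int main(void) {
    printf("int8_t : %d .. %d\n", INT8_MIN, INT8_MAX);     // -128 .. 127
    printf("int16_t: %d .. %d\n", INT16_MIN, INT16_MAX);   // -32768 .. 32767
    printf("uint8_t: 0 .. %d\n", UINT8_MAX);               // 0 .. 255
    return 0;
}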

Interestingly:
int64_t value = std::numeric_limits<int64_t>::max();
std::cout << abs(value) << std::endl;
yields a value of 1 on gcc-9.
Frustrating!
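A plausible explanation (an assumption, not stated in that answer): the C function abs() takes an int, so on a platform with 32-bit int the int64_t argument is first converted, keeping only its low 32 bits. A sketch reproducing that in C:
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
int main(void) {
    int64_t value = INT64_MAX;
    // Converting INT64_MAX to a 32-bit int keeps the low bits 0xFFFFFFFF,
    // i.e. -1 (implementation-defined), and abs(-1) is 1.
    printf("%d\n", abs((int)value));
    // llabs() operates on the full 64-bit value.
    printf("%lld\n", llabs(value));
    return 0;
}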

Related

How does C casting unsigned to signed work?

What language in the standard makes this code work, printing '-1'?
unsigned int u = UINT_MAX;
signed int s = u;
printf("%d", s);
https://en.cppreference.com/w/c/language/conversion
otherwise, if the target type is signed, the behavior is implementation-defined (which may include raising a signal)
https://gcc.gnu.org/onlinedocs/gcc/Integers-implementation.html#Integers-implementation
GCC supports only two’s complement integer types, and all bit patterns are ordinary values.
The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3):
For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.
To me it seems like converting UINT_MAX to an int would therefore mean reducing UINT_MAX modulo 2^(CHAR_BIT * sizeof(int)). For the sake of argument, with 32-bit ints, 0xFFFFFFFF mod 2^32 = 0xFFFFFFFF. So this doesn't really explain how the value '-1' ends up in the int.
Is there some language somewhere else that says after the modulo reduction we just reinterpret the bits? Or some other part of the standard that takes precedence over the parts I have referenced?
No part of the C standard guarantees that your code shall print -1 in general. As it says, the result of the conversion is implementation-defined. However, the GCC documentation does promise that if you compile with their implementation, then your code will print -1. It's nothing to do with bit patterns, just math.
The clearly intended reading of "reduced modulo 2^N" in the GCC manual is that the result should be the unique number in the range of signed int that is congruent mod 2^N to the input. This is a precise mathematical way of defining the "wrapping" behavior that you expect, which happens to coincide with what you would get by reinterpreting the bits.
Assuming 32 bits, UINT_MAX has the value 4294967295. This is congruent mod 4294967296 to -1. That is, the difference between 4294967295 and -1 is a multiple of 4294967296, namely 4294967296 itself. Moreover, this is necessarily the unique such number in [-2147483648, 2147483647]. (Any other number congruent to -1 would be at least -1 + 4294967296 = 4294967295, or at most -1 - 4294967296 = -4294967297). So -1 is the result of the conversion.
In other words, add or subtract 4294967296 repeatedly until you get a number that's in the range of signed int. There's guaranteed to be exactly one such number, and in this case it's -1.
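That reduction can be spelled out in code. A minimal sketch, assuming 32-bit int so that 2^32 = 4294967296:
#include <stdio.h>
#include <limits.h>
int main(void) {
    long long v = 4294967295LL;                // the value of UINT_MAX here
    while (v > INT_MAX) v -= 4294967296LL;     // subtract 2^32 until in range
    while (v < INT_MIN) v += 4294967296LL;     // (or add, if it started too low)
    printf("%lld\n", v);                       // prints -1
    return 0;
}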

Convert overflow to negative

Is there an easy way to replace an overflow at counting down with a negative value?
For example a 32 bit variable. Possible values are 0x00000000 - 0xFFFFFFFF. When I subtract 1 from the lowest possible value (0x00000000 - 1), the result is 0xFFFFFFFF. How can the operation be changed to give a result of -1?
Is there an easy way to replace an overflow at counting down with a negative value?
Using wider math is a direct approach.
"example a 32 bit variable. Possible values are 0x00000000 - 0xFFFFFFFF" implies that the variable is some unsigned type like uint32_t.
Subtracting 1 from (uint32_t)0 is (uint32_t)0xFFFFFFFF as OP reported. So instead use wider signed math, such as long long (which is at least 64 bits) or int64_t.
// Ensure the subtraction is done using `long long` math via the 1LL operand
long long result = var_uint32_bit - 1LL;
Alternatively code could stick with a corresponding same width signed type.
// Free of implementation-defined behaviour only for values 0...0x7FFFFFFF
int32_t result = (int32_t)var_uint32_bit - 1;
The (int32_t)var_uint32_bit has a limitation concerning conversion of an unsigned type to a signed integer type.
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised. C11dr §6.3.1.3 3
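Putting the wider-math approach together, a minimal sketch (the variable name is illustrative):
#include <stdio.h>
#include <stdint.h>
int main(void) {
    uint32_t counter = 0;                // lowest possible value
    long long result = counter - 1LL;    // the subtraction is done in long long
    printf("%lld\n", result);            // prints -1 instead of 4294967295
    return 0;
}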

Why is -(-2147483648) = - 2147483648 in a 32-bit machine?

I think the question is self explanatory, I guess it probably has something to do with overflow but still I do not quite get it. What is happening, bitwise, under the hood?
Why does -(-2147483648) = -2147483648 (at least while compiling in C)?
Negating an (unsuffixed) integer constant:
The expression -(-2147483648) is perfectly defined in C, however it may not be obvious why this is so.
When you write -2147483648, it is formed as the unary minus operator applied to an integer constant. If 2147483648 can't be expressed as int, then it is represented as long or long long* (whichever fits first), where the latter type is guaranteed by the C Standard to cover that value†.
To confirm that, you could examine it by:
printf("%zu\n", sizeof(-2147483648));
which yields 8 on my machine.
The next step is to apply the second - operator, in which case the final value is 2147483648L (assuming that it was eventually represented as long). If you try to assign it to an int object, as follows:
int n = -(-2147483648);
then the actual behavior is implementation-defined. Referring to the Standard:
C11 §6.3.1.3/3 Signed and unsigned integers
Otherwise, the new type is signed and the value cannot be represented
in it; either the result is implementation-defined or an
implementation-defined signal is raised.
The most common way is to simply cut off the higher bits. For instance, GCC documents it as:
For conversion to a type of width N, the value is reduced modulo 2^N
to be within range of the type; no signal is raised.
Conceptually, the conversion to type of width 32 can be illustrated by bitwise AND operation:
value & (2^32 - 1) // preserve 32 least significant bits
In accordance with two's complement arithmetic, the value of n is formed with the MSB (sign) bit set and all other bits zero, which represents the value -2^31, that is -2147483648.
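The same truncation, written out in code (a sketch assuming 64-bit long long, 32-bit int and two's complement; the final conversion is implementation-defined):
#include <stdio.h>
int main(void) {
    long long big = 2147483648LL;             // the value of -(-2147483648)
    long long low32 = big & 0xFFFFFFFFLL;     // keep the 32 least significant bits: 0x80000000
    int n = (int)low32;                       // typically -2147483648
    printf("%d\n", n);
    return 0;
}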
Negating an int object:
If you try to negate int object, that holds value of -2147483648, then assuming two's complement machine, the program will exhibit undefined behavior:
n = -n; // UB if n == INT_MIN and INT_MAX == 2147483647
C11 §6.5/5 Expressions
If an exceptional condition occurs during the evaluation of an
expression (that is, if the result is not mathematically defined or
not in the range of representable values for its type), the behavior
is undefined.
Additional references:
INT32-C. Ensure that operations on signed integers do not result in overflow
*) In the withdrawn C90 Standard, there was no long long type and the rules were different. Specifically, the sequence for an unsuffixed decimal constant was int, long int, unsigned long int (C90 §6.1.3.2 Integer constants).
†) This is due to LLONG_MAX, which must be at least +9223372036854775807 (C11 §5.2.4.2.1/1).
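Because negating an int holding INT_MIN is undefined, code that may meet that value has to check first. A minimal sketch (the saturating policy is just one illustrative choice):
#include <limits.h>
// Negate n without invoking undefined behaviour; saturates at INT_MAX
// when n == INT_MIN.
int checked_negate(int n) {
    if (n == INT_MIN)
        return INT_MAX;
    return -n;
}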
Note: this answer does not apply as such on the obsolete ISO C90 standard that is still used by many compilers
First of all, on C99, C11, the expression -(-2147483648) == -2147483648 is in fact false:
int is_it_true = (-(-2147483648) == -2147483648);
printf("%d\n", is_it_true);
prints
0
So how it is possible that this evaluates to true?
The machine is using 32-bit two's complement integers. 2147483648 is an integer constant that doesn't quite fit in 32 bits, so it will be either long int or long long int, whichever it fits in first. Negating this gives -2147483648 - and again, even though the number -2147483648 fits in a 32-bit integer, the expression -2147483648 consists of a >32-bit positive integer preceded by unary -!
You can try the following program:
#include <stdio.h>
int main() {
printf("%zu\n", sizeof(2147483647));
printf("%zu\n", sizeof(2147483648));
printf("%zu\n", sizeof(-2147483648));
}
The output on such machine most probably would be 4, 8 and 8.
Now, -2147483648 negated will again result in +2147483648, which is still of type long int or long long int, and everything is fine.
In C99, C11, the integer constant expression -(-2147483648) is well-defined on all conforming implementations.
Now, when this value is assigned to a variable of type int, with 32 bits and two's complement representation, the value is not representable in it - the values on 32-bit 2's complement would range from -2147483648 to 2147483647.
The C11 standard 6.3.1.3p3 says the following of integer conversions:
[When] the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
That is, the C standard doesn't actually define what the value in this case would be, or doesn't preclude the possibility that the execution of the program stops due to a signal being raised, but leaves it to the implementations (i.e. compilers) to decide how to handle it (C11 3.4.1):
implementation-defined behavior
unspecified behavior where each implementation documents how the choice is made
and (3.19.1):
implementation-defined value
unspecified value where each implementation documents how the choice is made
In your case, the implementation-defined behaviour is that the value is the 32 lowest-order bits [*]. Due to the 2's complement, the (long) long int value 0x80000000 has the bit 31 set and all other bits cleared. In 32-bit two's complement integers the bit 31 is the sign bit - meaning that the number is negative; all value bits zeroed means that the value is the minimum representable number, i.e. INT_MIN.
[*] GCC documents its implementation-defined behaviour in this case as follows:
The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3).
For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.
This is not a C question, for on a C implementation featuring 32-bit two's complement representation for type int, the effect of applying the unary negation operator to an int having the value -2147483648 is undefined. That is, the C language specifically disavows designating the result of evaluating such an operation.
Consider more generally, however, how the unary - operator is defined in two's complement arithmetic: the inverse of a positive number x is formed by flipping all the bits of its binary representation and adding 1. This same definition serves as well for any negative number that has at least one bit other than its sign bit set.
Minor problems arise, however, for the two numbers that have no value bits set: 0, which has no bits set at all, and the number that has only its sign bit set (-2147483648 in 32-bit representation). When you flip all the bits of either of these, you end up with all value bits set. Therefore, when you subsequently add 1, the result overflows the value bits. If you imagine performing the addition as if the number were unsigned, treating the sign bit as a value bit, then you get
-2147483648 (decimal representation)
--> 0x80000000 (convert to hex)
--> 0x7fffffff (flip bits)
--> 0x80000000 (add one)
--> -2147483648 (convert to decimal)
Something similar applies to inverting zero, but in that case the overflow upon adding 1 overflows the erstwhile sign bit, too. If the overflow is ignored, the resulting 32 low-order bits are all zero, hence -0 == 0.
I'm gonna use a 4-bit number, just to make maths simple, but the idea is the same.
In a 4-bit number, the possible values are between 0000 and 1111. That would be 0 to 15, but if you wanna represent negative numbers, the first bit is used to indicate the sign (0 for positive and 1 for negative).
So 1111 is not 15. As the first bit is 1, it's a negative number. To know its value, we use the two's complement method as already described in previous answers: "invert the bits and add 1":
inverting the bits: 0000
adding 1: 0001
0001 in binary is 1 in decimal, so 1111 is -1.
The two's complement method goes both ways, so if you use it with any number, it will give you the binary representation of that number with the inverted sign.
Now let's see 1000. The first bit is 1, so it's a negative number. Using the two's complement method:
invert the bits : 0111
add 1: 1000 (8 in decimal)
So 1000 is -8. If we do -(-8), in binary it means -(1000), which actually means applying the two's complement method to 1000. As we saw above, the result is also 1000.
So, in a 4-bit number, -(-8) equals -8.
In a 32-bit number, -2147483648 in binary is 1000...(31 zeroes), but if you use the two's complement method, you'll end up with the same value (the result is the same number).
That's why, in a 32-bit number, -(-2147483648) equals -2147483648.
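The flip-and-add-one walk can be reproduced with unsigned 8-bit arithmetic (a sketch using uint8_t instead of a true 4-bit type):
#include <stdio.h>
#include <stdint.h>
int main(void) {
    uint8_t x = 0x80u;                   // bit pattern 1000 0000, i.e. -128 as int8_t
    uint8_t neg = (uint8_t)(~x + 1u);    // invert the bits, add 1
    printf("0x%02X\n", (unsigned)neg);   // prints 0x80: the pattern negates to itself
    return 0;
}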
It depends on the version of C, the specifics of the implementation and whether we are talking about variables or literal values.
The first thing to understand is that there are no negative integer literals in C: "-2147483648" is a unary minus operation followed by a positive integer literal.
Let's assume that we are running on a typical 32-bit platform where int and long are both 32 bits and long long is 64 bits, and consider the expression.
(-(-2147483648) == -2147483648 )
The compiler needs to find a type that can hold 2147483648; on a conforming C99 compiler it will use type "long long", but a C90 compiler can use type "unsigned long".
If the compiler uses type long long then nothing overflows and the comparison is false. If the compiler uses unsigned long then the unsigned wraparound rules come into play and the comparison is true.
For the same reason that winding a tape deck counter 500 steps forward from 000 (through 001 002 003 ...) will show 500, and winding it 500 steps backward from 000 (through 999 998 997 ...) will also show 500.
This is two's complement notation. Of course, since 2's complement sign convention is to consider the topmost bit the sign bit, the result overflows the representable range, just like 2000000000+2000000000 overflows the representable range.
As a result, the processor's "overflow" bit will be set (seeing this requires access to the machine's arithmetic flags, generally not the case in most programming languages outside of assembler). This is the only value which will set the "overflow" bit when negating a 2's complement number: any other value's negation lies in the range representable by 2's complement.
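Standard C has no portable way to read that flag, but GCC and Clang expose overflow-checked builtins that report the same condition (a sketch assuming one of those compilers):
#include <stdio.h>
#include <limits.h>
int main(void) {
    int r;
    // Negation is 0 - x; the builtin reports whether the subtraction overflows.
    if (__builtin_sub_overflow(0, INT_MIN, &r))
        puts("negating INT_MIN overflows");
    if (!__builtin_sub_overflow(0, INT_MIN + 1, &r))
        printf("negating INT_MIN + 1 gives %d\n", r);
    return 0;
}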

About integer constants in the book "C: A reference manual"

In section 2.7.1 Integer constants, it says:
To illustrate some of the subtleties of integer constants, assume that
type int uses a 16-bit twos-complement representation, type long uses
a 32-bit twos-complement representation, and type long long uses a
64-bit twos-complement representation. We list in Table 2-6 some
interesting integer constants...
An interesting point to note from this table is that integers in the
range 2^15 through 2^16 - 1 will have positive values when written as
decimal constants but negative values when written as octal or
hexadecimal constants (and cast to type int).
But, as far as I know, integers in the range 2^15 through 2^16 - 1 written as hex/octal constants also have positive values when cast to type unsigned. Is the book wrong?
In the described setup, decimal literals in the range [32768,65535] have type long int, and hexadecimal literals in that range have type unsigned int.
So, the constant 0xFFFF is an unsigned int with value 65535, and the constant 65535 is a signed long int with value 65535.
I think your text is trying to discuss the cases:
(int)0xFFFF
(int)65535
Now, since int cannot represent the value 65535 both of these cause out-of-range conversion which is implementation-defined (or may raise an implementation-defined signal).
Most commonly (in fact, all 2's complement systems I've ever heard of), it will use a combination of truncation and reinterpretation in both of those cases, giving a value of -1.
So the last paragraph of your quote is a bit strange. 65535 and 0xFFFF are both large positive numbers; (int)0xFFFF and (int)65535 are (probably) both negative numbers; but if you cast one and don't cast the other then you get a discrepancy which is not surprising.
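On a typical modern system with 32-bit int, the book's 16-bit scenario can be emulated with int16_t (an illustrative sketch; the out-of-range conversions are implementation-defined):
#include <stdio.h>
#include <stdint.h>
int main(void) {
    printf("%d\n", (int16_t)0xFFFF);    // typically -1
    printf("%d\n", (int16_t)65535);     // typically -1: same value, same result
    printf("%u\n", 0xFFFFu);            // 65535: as an unsigned constant it stays positive
    return 0;
}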

Maximum value of typedefed signed type

I was reading John Regehr's blog on how he gives his students an assignment about saturating arithmetic. The interesting part is that the code has to compile as-is while using typedefs to specify different integer types, see the following excerpt of the full header:
typedef signed int mysint;
//typedef signed long int mysint;
mysint sat_signed_add (mysint, mysint);
mysint sat_signed_sub (mysint, mysint);
The corresponding unsigned version is simple to implement (although I'm actually not sure if padding bits wouldn't make that problematic too), but I actually don't see how I can get the maximum (or minimum) value of an unknown signed type in C, without using macros for MAX_ and MIN_ or causing undefined behavior.
Am I missing something here or is the assignment just flawed (or more likely I'm missing some crucial information he gave his students)?
I don't see any way to do this without making assumptions or invoking implementation-defined (not necessarily undefined) behavior. If you assume that there are no padding bits in the representation of mysint or of uintmax_t, however, then you can compute the maximum value like this:
mysint mysint_max = (mysint)
((~(uintmax_t)0) >> (1 + CHAR_BIT * (sizeof(uintmax_t) - sizeof(mysint))));
The minimum value is then either -mysint_max (sign/magnitude or ones' complement) or -mysint_max - 1 (two's complement), but it is a bit tricky to determine which. You don't know a priori which bit is the sign bit, and there are possible trap representations that differ for different representation styles. You also must be careful about evaluating expressions, because of the possibility of "the usual arithmetic conversions" converting values to a type whose representation has different properties than those of the one you are trying to probe.
Nevertheless, you can distinguish the type of negative-value representation by computing the bitwise negation of the mysint representation of -1. For two's complement the mysint value of the result is 0, for ones' complement it is 1, and for sign/magnitude it is mysint_max - 1.
If you add the assumption that all signed integer types have the same kind of negative-value representation then you can simply perform such a test using an ordinary expression on default int literals. You don't need to make that assumption, however. Instead, you can perform the operation directly on the type representation's bit pattern, via a union:
union mysint_bits {
mysint i;
unsigned char bits[sizeof(mysint)];
} msib;
int counter = 0;
for (msib.i = -1; counter < sizeof(mysint); counter += 1) {
msib.bits[counter] = ~msib.bits[counter];
}
As long as the initial assumption holds (that there are no padding bits in the representation of type mysint) msib.i must then be a valid representation of the desired result.
I don't see a way to determine the largest and smallest representable values for an unknown signed integer type in C, without knowing something more. (In C++, you have std::numeric_limits available, so it is trivial.)
The largest representable value for an unsigned integer type is (myuint)(-1). That is guaranteed to work independent of padding bits, because (§ 6.3.1.3/1-2):
When a value with integer type is converted to another integer type… if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.
So to convert -1 to an unsigned type, you add one more than the maximum representable value to it, and that result must be the maximum representable value. (The standard makes it clear that the meaning of "repeatedly adding or subtracting" is mathematical.)
Now, if you knew that the number of padding bits in the signed type was the same as the number of padding bits in the unsigned type [but see below], you could compute the largest representable signed value from the largest representable unsigned value:
(mysint)( (myuint)(-1) / (myuint)2 )
Unfortunately, that's not enough to compute the minimum representable signed value, because the standard permits the minimum to be either one less than the negative of the maximum (2's-complement representation) or exactly the negative of the maximum (1's-complement or sign/magnitude representations).
Moreover, the standard does not actually guarantee that the number of padding bits in the signed type is the same as the number of padding bits in the unsigned type. All it guarantees is that the number of value bits in the signed type be no greater than the number of value bits in the unsigned type. In particular, it would be legal for the unsigned type to have one more padding bit than the corresponding signed type, in which case they would have the same number of value bits and the maximum representable values would be the same. [Note: a value bit is neither a padding bit nor the sign bit.]
In short, if you knew (for example by being told) that the architecture were 2's-complement and that corresponding signed and unsigned types had the same number of padding bits, then you could certainly compute both signed min and max:
myuint max_myuint = (myuint)(-1);
mysint max_mysint = (mysint)(max_myuint / (myuint)2);
mysint min_mysint = (-max_mysint) - (mysint)1;
Finally, casting an out-of-range unsigned integer to a signed integer is not undefined behaviour, although most other signed overflows are. The conversion, as indicated by §6.3.1.3/3, is implementation-defined behaviour:
Otherwise, the new type is signed and the value cannot be represented in it; either the
result is implementation-defined or an implementation-defined signal is raised.
Implementation-defined behaviour is required to be documented by the implementation. So, suppose we knew that the implementation was gcc. Then we could examine the gcc documentation, where we would read the following, in the section "C Implementation-defined behaviour":
Whether signed integer types are represented using sign and
magnitude, two's complement, or one's complement, and whether the
extraordinary value is a trap representation or an ordinary value
(C99 6.2.6.2).
GCC supports only two's complement integer types, and all bit
patterns are ordinary values.
The result of, or the signal raised by, converting an integer to a
signed integer type when the value cannot be represented in an
object of that type (C90 6.2.1.2, C99 6.3.1.3).
For conversion to a type of width N, the value is reduced modulo
2^N to be within range of the type; no signal is raised.
Knowing that signed integers are 2s-complement and that unsigned to signed conversions will not trap, but will produce the expected pattern of low-order bits, we can find the maximum and minimum values for any signed type starting with the maximum representable value for the widest unsigned type, uintmax_t:
uintmax_t umax = (uintmax_t)(-1);
while ( (mysint)(umax) < 0 ) umax >>= 1;
mysint max_mysint = (mysint)(umax);
mysint min_mysint = (-max_mysint) - (mysint)1;
This is a suggestion for getting the MAX value of a specific type defined with a typedef, without using any library:
typedef signed int mysint;
mysint size = sizeof(mysint) * (mysint)8 - (mysint)1; // number of value bits
                                                      // (one bit is reserved for the sign)
mysint max = 1;                                       // start with the lowest bit set
while (--size)
{
    max = (max << (mysint)1) | (mysint)1;             // shift in one more 1-bit
}
// max now contains the max value of the type mysint
If you assume eight-bit chars and a two's complement representation (both reasonable on all modern hardware, with the exception of some embedded DSP stuff), then you just need to form an unsigned integer (use uintmax_t to make sure it's big enough) with sizeof(mysint)*8 - 1 1's in the bottom bits, then cast it to mysint. For the minimum value, negate the maximum value and subtract one.
If you don't want to assume those things, then it's still possible, but you'll need to do some more digging through limits.h to compensate for the size of chars and the sign representation.
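A sketch of that approach as macros (assuming CHAR_BIT == 8, two's complement and no padding bits; the macro names are illustrative):
#include <stdint.h>
typedef signed int mysint;
#define MYSINT_MAX ((mysint)((UINTMAX_C(1) << (sizeof(mysint) * 8 - 1)) - 1))
#define MYSINT_MIN ((mysint)(-MYSINT_MAX - 1))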
I guess this should work irrespective of the negative number representation:
// MSB set and all other bits zero is the minimum number in both 2's and 1's
// complement representations.
mysint min = (mysint)((uintmax_t)1 << (sizeof(mysint) * 8 - 1)); // shift in unsigned arithmetic
mysint max = ~min;
