Why do integers have different behaviors when they overflow? [duplicate] - c

This question already has answers here:
Why is unsigned integer overflow defined behavior but signed integer overflow isn't?
A C programming book that I'm reading (C Programming: A Modern Approach, 2nd edition) says that when "overflow occurs during an operation on unsigned integers, though, the result is defined."
Here is a small code example
#include <stdio.h>
int main(void)
{
    unsigned short int x = 65535; // the unsigned short int is at its maximum value (assuming 16 bits)
    x += 1;                       // if I add one to it, it will overflow
    printf("%u", x);              // the output will be zero (or one if I add one to x again)
    return 0;
}
He then goes on to say that "for signed integers, the behaviors for these integers are not defined", meaning the program can either print an incorrect result or crash.
Why is this so?

It comes down to hardware representation: there is more than one way to represent signed integral types in binary (sign-magnitude, ones' complement, two's complement) and to implement operations on them, and those approaches have quite different implications when an overflow occurs (e.g. triggering a hardware trap, wrapping modulo the type's range, etc.).
All of the obvious means of representing unsigned integral values in binary, and of implementing numerical operations on such values, have the same consequence: numeric operations in hardware essentially work as modulo arithmetic.
For basic types (and other things) the standard generally allows freedom to compiler vendors when there is more than one feasible way of implementing something, and those options have different consequences. There are multiple ways with signed integral types, and real-world hardware that uses each approach. They are different enough to warrant the behaviour being undefined (as that term is defined in the standard).
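To make the contrast concrete, here is a minimal sketch (assuming a typical 32-bit int) showing the defined unsigned wrap-around alongside the pre-check needed before a signed addition, since nothing can be relied upon after a signed overflow:
#include <limits.h>
#include <stdio.h>
int main(void)
{
    // Unsigned arithmetic is defined to wrap modulo 2^N.
    unsigned int u = UINT_MAX;
    u += 1;
    printf("%u\n", u); // prints 0
    // Signed overflow is undefined, so test before adding.
    int s = INT_MAX;
    int addend = 1;
    if (addend > 0 && s > INT_MAX - addend)
        printf("would overflow\n");
    else
        printf("%d\n", s + addend);
    return 0;
}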

Related

Effect of type casting on printf function

Here is a question from my book,
Actually, I don't know what the effect on the printf function will be, so I tried the statements on an actual C implementation. Here is my code:
#include <stdio.h>
int main(void) {
    int x = 4;
    printf("%hi\n", x);
    printf("%hu\n", x);
    printf("%i\n", x);
    printf("%u\n", x);
    printf("%li\n", x);
    printf("%lu\n", x);
    return 0;
}
So, the output is very simple. But is this really the solution to the above problem?
There are numerous problems in this question that make it unsuitable for teaching C.
First, to work on this problem at all, we have to assume a non-standard C implementation is used. In standard C, %x is a complete conversion specification, so %xu and %xd cannot be; the conversion specification has already ended before the u or d. And the use of z in a conversion specification interferes with its standard use for size_t.
Nonetheless, let’s assume this C variant does not have those standard conversion specifications and instead uses the ones shown in the table but that this C variant otherwise conforms to the C standard with minimal changes.
Our next problem is that, in Y num = 42;, we have a plain Y, not the signed Y or unsigned Y shown in the table. Let’s assume signed Y is intended.
Then num is a signed four-bit integer. The greatest value it can represent is 0111₂ = 7₁₀. So it cannot represent 42. Attempting to initialize it with 42 results in a conversion specified by C 2018 6.3.1.3, which says, in part:
Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
The result is we do not know what value is in num or even whether the program continues to execute; it may trap and terminate.
Well, let’s assume this implementation just takes the low bits of the value. 42 is 101010₂, so its low four bits are 1010. So if the bits in num are 1010, it is negative. The C standard permits several methods of representation for negative numbers, but we will assume the overwhelmingly most common one, two’s complement, so the bits 1010 in num represent −6.
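Here is a small sketch of that derivation (hypothetical, since C has no four-bit integer type; the masking and the two's-complement reinterpretation are done by hand):
#include <stdio.h>
int main(void)
{
    int value = 42;              // 101010 in binary
    int low4 = value & 0xF;      // keep the low four bits: 1010 (decimal 10)
    // Reinterpret 1010 as a four-bit two's-complement value:
    // if the sign bit (bit 3) is set, subtract 2^4.
    int as_signed4 = (low4 & 0x8) ? low4 - 16 : low4;
    printf("%d\n", as_signed4);  // prints -6
    return 0;
}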
Now, we get to the printf statements. Except the problem text shows Printf, which is not defined by the C standard. (Are you sure this problem relates to C code at all?) Let’s assume it means printf.
In printf("%xu",num);, if the conversion specification is supposed to work like the ones in standard C, then the corresponding argument should be an unsigned X value that has been promoted to int for the function call. As a two-bit unsigned integer, an unsigned X can represent 0, 1, 2, or 3. Passing it −6 is not defined. So we do not know what the program will print. It might take just the low two bits, 10, and print “2”. Or it might use all the bits and print “-6”. Both of those would be consistent with the requirement that the printf behave as specified for values that are in the range representable by unsigned X.
In printf("%xd",num); and printf("%yu",num);, the same problem exists.
In printf("%yd",num);, we are correctly passing a signed Y value for a signed Y conversion specification, so “-6” is printed.
Then printf("%zu",num); has the same problem with the value mismatched for the type.
Finally, in printf("%zd",num);, the value is again in the correct range, and “-6” is printed.
From all the assumptions we had to make and all the points where the behavior is undefined, you can see this is a terrible exercise. You should question the quality of the book it is in and of any school using it.

Why shifting a negative value with literal is giving [-Wshift-negative-value] warning

I am doing a bitwise left shift operation on a negative number.
#include <stdio.h>
int main(void) {
    int count = 2;
    printf("%d\n", ~0 << count);
    printf("%d\n", ~0 << 2); // warning: shifting a negative signed value is undefined [-Wshift-negative-value]
    return 0;
}
My question is why the warning appears when compiling the above code only when an integer literal is used in the shift, and not when a variable is used.
Under C89, ones'-complement and sign-magnitude implementations were required to process left shifts of negative values in ways that may not have been the most logical on those platforms. For example, on a ones'-complement platform, C89 defined -1<<1 as -3. The authors of the Standard decided to correct this problem by allowing compiler writers to handle left shifts of negative numbers in any way they saw fit. The fact that they allowed that flexibility to all implementations, including two's-complement ones, shouldn't be taken to imply that they intended two's-complement implementations to deviate from the C89 behavior. Much more likely, they intended and expected that the sensible behavior on two's-complement platforms would be sufficiently obvious that compiler writers would figure it out with or without a mandate.
Compilers often squawk about left-shifting negative constants by other constants because x<<y can be simplified when both x and y are constants, but such simplification would require performing the shift at compile time whether or not the code containing the shift is executed. By contrast, given someConstant << nonConstant, no simplification would usually be possible and thus the compiler would simply generate code that does the shift at run-time.
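As a sketch of one way to keep well-defined behavior and silence the warning, the shift can be done on an unsigned value instead (assuming 32-bit int here):
#include <stdio.h>
int main(void)
{
    int count = 2;
    // ~0u is UINT_MAX; left-shifting an unsigned value is well defined
    // (the result is reduced modulo UINT_MAX + 1).
    printf("%u\n", ~0u << count); // 4294967292 with 32-bit unsigned int
    printf("%u\n", ~0u << 2);     // same value, no -Wshift-negative-value warning
    return 0;
}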

Defending "U" suffix after Hex literals

There is some debate between my colleague and me about the U suffix after hexadecimally represented literals. Note, this is not a question about the meaning of this suffix or about what it does. I have found several of those topics here, but I have not found an answer to my question.
Some background information:
We're trying to come to a set of rules that we both agree on, to use that as our style from that point on. We have a copy of the 2004 Misra C rules and decided to use that as a starting point. We're not interested in being fully Misra C compliant; we're cherry picking the rules that we think will most increase efficiency and robustness.
Rule 10.6 from the aforementioned guidelines states:
A “U” suffix shall be applied to all constants of unsigned type.
I personally think this is a good rule. It takes little effort, looks better than explicit casts, and shows the intention of a constant more explicitly. To me it makes sense to use it for all unsigned constants, not just numerics, since enforcing a rule doesn't happen by allowing exceptions, especially for a commonly used representation of constants.
My colleague, however, feels that the hexadecimal representation doesn't need the suffix. Mostly because we almost exclusively use it to set micro-controller registers, and signedness doesn't matter when setting registers to hex constants.
My Question
My question is not one about who is right or wrong. It is about finding out whether there are cases where the absence or presence of the suffix changes the outcome of an operation. Are there any such cases, or is it a matter of consistency?
Edit, for clarification: this is specifically about setting micro-controller registers by assigning hexadecimal values to them. Would there be a case where the suffix could make a difference there? I feel like there wouldn't. As an example, the Freescale Processor Expert generates all register assignments as unsigned.
Appending a U suffix to all hexadecimal constants makes them unsigned as you already mentioned. This may have undesirable side-effects when these constants are used in operations along with signed values, especially comparisons.
Here is a pathological example:
#define MY_INT_MAX 0x7FFFFFFFU // blindly applying the rule
if (-1 < MY_INT_MAX) {
    printf("OK\n");
} else {
    printf("OOPS!\n");
}
The C rules for signed/unsigned conversions are precisely specified, but somewhat counter-intuitive, so the above code will indeed print OOPS: -1 is converted to unsigned int, yielding UINT_MAX, which is not less than 0x7FFFFFFFU.
The MISRA-C rule is precise in that it states A “U” suffix shall be applied to all constants of unsigned type. The word unsigned has far-reaching consequences and indeed most constants should not really be considered unsigned.
Furthermore, the C Standard makes a subtle distinction between decimal and hexadecimal constants:
A hexadecimal constant is considered unsigned if its value can be represented by the unsigned integer type and not the signed integer type of the same size for types int and larger.
This means that on 32-bit 2's complement systems, 2147483648 is a long or a long long whereas 0x80000000 is an unsigned int. Appending a U suffix may make this more explicit in this case but the real precaution to avoid potential problems is to mandate the compiler to reject signed/unsigned comparisons altogether: gcc -Wall -Wextra -Werror or clang -Weverything -Werror are life savers.
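Here is a sketch of how that difference plays out, assuming 32-bit int (the exact sizes are implementation-dependent):
#include <stdio.h>
int main(void)
{
    // With 32-bit int, the decimal constant doesn't fit in int and becomes
    // a (signed) long or long long; the hex constant becomes unsigned int.
    printf("%zu\n", sizeof(2147483648));  // typically 8
    printf("%zu\n", sizeof(0x80000000));  // typically 4
    // The different types change comparison results:
    printf("%d\n", 2147483648 > -1);      // 1: a signed comparison
    printf("%d\n", 0x80000000 > -1);      // 0: -1 converts to UINT_MAX, which is larger
    return 0;
}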
Here is how bad it can get:
if (-1 < 0x8000) {
    printf("OK\n");
} else {
    printf("OOPS!\n");
}
The above code should print OK on 32-bit systems and OOPS on 16-bit systems. To make things even worse, it is still quite common to see embedded projects use obsolete compilers which do not even implement the Standard semantics for this issue.
For your specific question, the values defined for micro-controller registers and used specifically to set them via assignment (assuming these registers are memory-mapped) need not have the U suffix at all. The register lvalue should have an unsigned type, and the hex value will be signed or unsigned depending on its value, but the operation will proceed the same way. The opcode for setting a signed number or an unsigned number is the same on your target architecture and on any architecture I have ever seen.
With all integer-constants
Appending u/U ensures the integer-constant will be some unsigned type.
Without a u/U
For a decimal-constant, the integer-constant will be some signed type.
For a hexadecimal/octal-constant, the integer-constant will be a signed or unsigned type, depending on its value and the ranges of the integer types.
Note: all integer-constants have non-negative values.
// +-------- unary operator
// |+-+----- integer-constant
int x = -123;
absence or presence of the suffix changes the outcome of an operation?
When is this important?
With various expressions, the signedness and width of the math need to be controlled, and preferably should not be surprising.
// Examples: assume 32-bit `int`, `unsigned`, `long`, and 64-bit `long long`
// Bad: signed int overflow (UB)
unsigned a = 4000 * 1000 * 1000;
// OK
unsigned b = 4000u * 1000 * 1000;
// Undefined behavior: the shift overflows a signed int
unsigned c = 1 << 31;
// OK
unsigned d = 1u << 31;
printf("Size %zu\n", sizeof(4294967295));  // 8: decimal constant, type is `long long`
printf("Size %zu\n", sizeof(0xFFFFFFFF));  // 4: hex constant, type is `unsigned` even without a suffix
printf("Size %zu\n", sizeof(0xFFFFFFFFu)); // 4: type is `unsigned`
// 2 ** 63
long long e = -9223372036854775808; // C99: bad, "9223372036854775808" is not representable
long long f = -9223372036854775807 - 1; // OK
long long g = -9223372036854775808u; // implementation-defined behavior **
some_unsigned_type h_max = -1;  // OK, max value for the target type
some_unsigned_type i_max = -1u; // OK, but not the max value for wide unsigned types
// When negating a negative `int`
unsigned j = 0 - INT_MIN; // typically signed int overflow, which is UB
unsigned k = 0u - INT_MIN; // never UB
** or an implementation-defined signal is raised.
For the specific question, which was about loading registers, the U makes it an unsigned value, but whether the compiler treats the n-bit word pattern as a signed or unsigned value, it will move the same bit pattern, assuming there isn't any size extension that would propagate an MSB. The difference that might matter is whether the register load operation will set any processor condition flags based on a signed or unsigned load.
As an overall guide: if the processor supports storing a constant to a configuration register or a memory address, then loading a peripheral register is unlikely to set the processor's NEG condition flag. Loading a general-purpose register connected to an ALU, one that can be the target of an arithmetic operation like add, increment, or decrement, might set a negative flag on loading, so that e.g. a trailing "branch (if) negative" opcode would execute the branch. You would want to check the processor's references to be sure, and the processor's errata (the boo-boo list) if you need a specific flag state.
Small instruction set processors tend to have only a load-register instruction, while larger instruction sets are more likely to have a load-unsigned variant that doesn't set the NEG bit in the processor's flags; but again, check the processor's references.
All of this only tends to come up when an optimizing compiler rearranges code around an inline assembly instruction and in other uncommon situations. Examine the generated assembly code, turn off some or all compiler optimizations for the module when needed, etc.

C long double in golang

I am porting an algorithm from C to Go. And I got a little bit confused. This is the C function:
#include <stdint.h> // for uint64_t

void gauss_gen_cdf(uint64_t cdf[], long double sigma, int n)
{
    int i;
    long double s, d, e;
    // Calculations ...
    for (i = 1; i < n - 1; i++) {
        cdf[i] = s;
    }
}
And in the for loop the value "s" is assigned to element "i" of the array cdf. How is this possible? As far as I know, a long double is a float64 (in the Go context). So I shouldn't be able to compile the C code, because I am assigning a long double to an array which just contains uint64 elements. But the C code is working fine.
So can someone please explain why this is working?
Thank you very much.
UPDATE:
The original C code of the function can be found here: https://github.com/mjosaarinen/hilabliss/blob/master/distribution.c#L22
The assignment cdf[i] = s performs an implicit conversion to uint64_t. It's hard to tell if this is intended without the calculations you omitted.
In practice, long double as a type has considerable variance across architectures. Whether Go's float64 is an appropriate replacement depends on the architecture you are porting from. For example, on x86, long double is an 80-bit extended precision type, but Windows systems are usually configured in such a way as to compute results only with the 53-bit mantissa, which means that float64 could still be equivalent for your purposes.
EDIT In this particular case, the values computed by the sources appear to be static and independent of the input. I would just use float64 on the Go side and see if the computed values are identical to those of the C version, when run on an x86 machine under real GNU/Linux (virtualization should be okay), to work around the Windows FPU issues. The choice of x86 is just a guess because it is likely what the original author used. I do not understand the underlying cryptography, so I can't say whether a difference in the computed values impacts the security. (Also note that the C code does not seem to properly seed its PRNG.)
C long double in golang
The title suggests an interest in whether or not Go has an extended precision floating-point type similar to long double in C.
The answer is:
Not as a primitive, see Basic types.
But arbitrary precision is supported by the math/big library.
Why this is working?
long double s = some_calculation();
uint64_t a = s;
It compiles because, unlike Go, C allows for certain implicit type conversions. The integer portion of the floating-point value of s will be copied. Presumably the s value has been scaled such that it can be interpreted as a fixed-point value where, based on the linked library source, 0xFFFFFFFFFFFFFFFF (2^64-1) represents the value 1.0. In order to make the most of such assignments, it may be worthwhile to have used an extended floating-point type with 64 precision bits.
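As a small C-side sketch of those two points (the values are illustrative; they are not taken from the linked library):
#include <stdint.h>
#include <stdio.h>
int main(void)
{
    long double s = 0.75L;
    // Plain implicit conversion: the fractional part is simply discarded.
    uint64_t a = s;                           // a == 0
    // Fixed-point style: scale so that 0xFFFFFFFFFFFFFFFF represents 1.0.
    uint64_t b = s * 18446744073709551615.0L; // roughly 0.75 * (2^64 - 1)
    printf("%llu %llu\n", (unsigned long long)a, (unsigned long long)b);
    return 0;
}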
If I had to guess, I would say that the (crypto-related) library is using fixed-point here because they want to ensure deterministic results, see: How can floating point calculations be made deterministic?. And since the extended-precision floating point is only being used for initializing a lookup table, using the (presumably slow) math/big library would likely perform perfectly fine in this context.

Integer type with floating point semantics for C or D

I'm looking for an existing implementation for C or D, or advice on implementing, signed and/or unsigned integer types with floating-point semantics.
That is to say, an integer type that behaves as floating point types do when doing arithmetic: Overflow produces infinity (-infinity for signed underflow) rather than wrapping around or having undefined behavior, undefined operations produce NaN, etc.
In essence, a version of floating point where the distribution of representable numbers falls evenly on the number line, instead of clustering around 0.
In addition, all operations should be deterministic; any given two's complement 32-bit architecture should produce the exact same result for the same computation, regardless of its implementation (whereas floating point may, and often will, produce slightly differing results).
Finally, performance is a concern, which has me worried about potential "bignum" (arbitrary-precision) solutions.
See also: Fixed-point and saturation arithmetic.
I do not know of any existing implementations of this.
But I would imagine implementing it would be a matter of (in D):
enum CheckedIntState : ubyte
{
ok,
overflow,
underflow,
nan,
}
struct CheckedInt(T)
if (isIntegral!T)
{
private T _value;
private CheckedIntState _state;
// Constructors, getters, conversion helper methods, etc.
// And a bunch of operator overloads that check the
// result on every operation and yield a CheckedInt!T
// with an appropriate state.
// You'll also want to overload opEquals and opCmp and
// make them check the state of the operands so that
// NaNs compare equal and so on.
}
Saturating arithmetic does what you want except for the part where undefined operations produce NaN; this is going to turn out to be problematic, because most saturating implementations use the full number range, and so there are no values left over to reserve for NaN. Thus, you probably can't easily build this on the back of saturating hardware instructions unless you have an additional "is this value NaN" field, which is rather wasteful.
Assuming that you're wedded to the idea of NaN values, all of the edge case detection will probably need to happen in software. For most integer operations, this is pretty straightforward, especially if you have a wider type available (let's assume long long is strictly larger than whatever integer type underlies myType):
myType add(myType x, myType y) {
    if ((x == positiveInfinity && y == negativeInfinity) ||
        (x == negativeInfinity && y == positiveInfinity))
        return notANumber;
    long long wideResult = (long long)x + y; // widen before adding so the addition itself cannot overflow
    if (wideResult >= positiveInfinity) return positiveInfinity;
    if (wideResult <= negativeInfinity) return negativeInfinity;
    return (myType)wideResult;
}
One solution might be to implement multiple-precision arithmetic with abstract data types. The book C Interfaces and Implementations by David Hanson has a chapter (interface and implementation) on MP arithmetic.
Doing calculations using scaled integers is also a possibility. You might be able to use his arbitrary-precision arithmetic, although I believe this implementation can't overflow. You could run out of memory, but that's a different problem.
In either case, you might need to tweak the code to return exactly what you want on overflow and such.
Source code (MIT license)
That page also has a link to buy the book from amazon.com.
Half of the requirements are satisfied by saturating arithmetic, which is implemented in e.g. ARM instructions, MMX and SSE.
As also pointed out by Stephen Canon, one needs additional elements to check overflow / NaN. Some instruction sets (Atmel at least) have a sticky flag to test for overflows (which could be used to differentiate inf from max_int). And perhaps "Q" + 0 could mark NaN.
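As a sketch of the software-side equivalent (hypothetical names; a saturating add in C with a sticky overflow flag, assuming long long is wider than int):
#include <limits.h>
#include <stdbool.h>
#include <stdio.h>
// Sticky flag: once set by an overflow it stays set until the caller clears it.
static bool sat_overflow = false;
static int sat_add(int x, int y)
{
    long long wide = (long long)x + y; // widen so the raw addition cannot overflow
    if (wide > INT_MAX) { sat_overflow = true; return INT_MAX; }
    if (wide < INT_MIN) { sat_overflow = true; return INT_MIN; }
    return (int)wide;
}
int main(void)
{
    printf("%d\n", sat_add(INT_MAX, 1));      // saturates to INT_MAX
    printf("overflowed: %d\n", sat_overflow); // 1
    return 0;
}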

Resources