In the following code, x and y are int32_t variables. In this simplified example, they always differ by 1. When they span the int32_t overflow boundary (0x7FFFFFFF, the largest positive 32-bit two's complement number, to 0x80000000, the negative number of largest magnitude), subtracting them seems to give a different result when it is done inside the condition of the if statement (Method 1) than when the result is first stored in a temporary variable (Method 2). Why don't they give the same result?
I would think that subtracting two int32_t variables would yield a result of type int32_t, so using a temporary of that type shouldn't change anything. I tried explicitly typecasting inside the if statement conditional; that didn't change anything. FWIW, Method 2 gives the result I would expect.
The code:
int32_t x = (0x80000000 - 3);
int i;
for( i = 0; i < 5; ++i )
{
int32_t y = x + 1; // this may cause rollover from 0x7fffffff (positive) to 0x80000000 (negative)
UARTprintf("\n" "x = 0x%08X, y = 0x%08X", x, y );
if( ( y - x ) >= 1 ) // Method 1
UARTprintf(" - true ");
else
UARTprintf(" - FALSE");
int32_t z = ( y - x ); // Method 2
if( ( z ) >= 1 )
UARTprintf(" - true ");
else
UARTprintf(" - false");
++x;
}
Output:
x = 0x7ffffffd, y = 0x7ffffffe - true - true
x = 0x7ffffffe, y = 0x7fffffff - true - true
x = 0x7fffffff, y = 0x80000000 - FALSE - true
x = 0x80000000, y = 0x80000001 - true - true
x = 0x80000001, y = 0x80000002 - true - true
In my actual application (not this simplified example), y is incremented by a hardware timer and x is a record of when some code was last executed. The test is intended to make some code run at intervals. Considering that y represents time and the application may run for a very long time before it is restarted, just not letting it overflow isn't an option.
Noting, as several of you did, that the standard does not define the behavior when signed integer overflow occurs tells me that I don't have a right to complain that I can't count on it working the way I want it to, but it doesn't give me a solution I can count on. Even using a temporary variable, which seems to work with my current compiler version and settings, might quit working when one of those things changes. Do I have any trustworthy options short of resorting to assembly code?
Signed integer overflow is undefined behaviour, so there is little point in trying to explain the result.
Your assumptions are based on "common sense", not on the standard.
You could inspect the generated assembly and debug it, but the outcome would not generalize: you would not be able to apply what you learn to any other case (though no doubt it would be fun to do).
The question I didn't know enough to ask originally is, "How can I avoid undefined behavior when doing subtraction on integers that might have overflowed?" Please correct me if I am wrong, but it appears that the answer to that question would be "use unsigned rather than signed integers" because the results are well defined (per C11 6.2.5/9) "a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type."
In this case, that is enough to come up with a working solution: the elapsed time will always be zero or a small positive number, so the result of the subtraction will never be negative. The result could therefore be kept as an unsigned number and compared ( ">=1" ), or converted back to a signed int for comparison (C11 6.3.1.3: "When a value with integer type is converted to another integer type ... if the value can be represented by the new type, it is unchanged."). This code works, and I believe it does not rely on any undefined behavior: "if( (int32_t)( (uint32_t)y - (uint32_t)x ) >= 1 )"
In the more general case, however, converting to unsigned to do the subtraction, then converting back to a signed result (which might be negative) is not well defined. C11 6.3.1.3 says regarding converting to another integer type that if "the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised." So I can still imagine a scenario in which assembly code would be needed to achieve well-defined results; this just isn't one of them.
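The unsigned approach described above could be sketched as follows. The function names interval_elapsed and wrapped_diff are hypothetical, not from the original code. The second function is one possible way to get a signed difference in the general case without relying on the implementation-defined conversion; it only ever converts values that fit in int32_t.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when at least `interval` ticks have passed since `last`.
   Unsigned subtraction wraps modulo 2^32, so this stays correct
   when the timer rolls over between `last` and `now`. */
static bool interval_elapsed(uint32_t now, uint32_t last, uint32_t interval)
{
    return (uint32_t)(now - last) >= interval;
}

/* Signed difference a - b, interpreted modulo 2^32.
   Only values representable in int32_t are ever converted, so no
   implementation-defined conversion takes place. */
static int32_t wrapped_diff(uint32_t a, uint32_t b)
{
    uint32_t d = (uint32_t)(a - b);
    if (d <= (uint32_t)INT32_MAX)
        return (int32_t)d;
    /* d is in [2^31, 2^32 - 1], so UINT32_MAX - d is in [0, INT32_MAX] */
    return -(int32_t)(UINT32_MAX - d) - 1;
}
```

With the timer value and the last-run timestamp both stored as uint32_t, the interval test becomes interval_elapsed(y, x, 1), and it keeps working across the 0xFFFFFFFF to 0 rollover as well.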
Comparison operation on unsigned and signed integers
I have a C code snippet as below:
int32_t A = 5;
uint32_t B = 8;
if ( A >= B )
{
printf("Test");
}
When I build this, I receive a remark/warning: "comparison between signed and unsigned operands". Can anyone address this issue?
Everything is fine as long as A is non-negative, because converting it to unsigned does not change its value.
But if A is less than 0, the comparison gives an unexpected result.
A = -1 is stored in memory as 0xFFFFFFFF.
B = 5 is stored in memory as 0x00000005.
When you write
if (A < B) {
// You expect execution to reach here
}
the compiler compares both operands as unsigned 32-bit integers, so your if is effectively:
if (0xFFFFFFFF < 0x00000005) {
// Never reached: the condition is false
}
The compiler warns you about this possible problem.
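A small sketch of the difference, assuming the usual case where int is 32 bits wide. The names naive_lt and checked_lt are hypothetical; naive_lt spells out the conversion the compiler applies implicitly, and checked_lt tests the sign first.

```c
#include <stdbool.h>
#include <stdint.h>

/* What `a < b` effectively does: a is converted to unsigned first,
   so -1 becomes 0xFFFFFFFF. */
static bool naive_lt(int32_t a, uint32_t b)
{
    return (uint32_t)a < b;
}

/* Mathematically correct comparison: any negative a is less than
   every unsigned b; otherwise the conversion is value-preserving. */
static bool checked_lt(int32_t a, uint32_t b)
{
    return a < 0 || (uint32_t)a < b;
}
```

For A = -1 and B = 5, naive_lt returns false while checked_lt returns true; for non-negative A the two agree.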
Good, very good! You are reading and paying attention to your compiler warnings.
In your code:
int32_t A = 5;
uint32_t B = 8;
if ( A >= B )
{
printf("Test");
}
You have 'A' as a signed int32_t value with min/max values of -2147483648/2147483647, and you have an unsigned uint32_t with min/max of 0/4294967295, respectively. The compiler generates the warning to guard against cases whose outcome is decided by the types involved rather than by the values you had in mind. Mathematically, A can never be greater than or equal to B for any value of B from 2147483648 to 4294967295, so that whole swath of numbers should yield false regardless of the individual values. But A is converted to uint32_t before the comparison, so a negative A turns into a large unsigned value and the test can come out true.
Another great example would be if ( A < B ), which should be true for all values of A from -2147483648 to -1 because the unsigned type can never be less than zero. After the conversion, however, those negative values of A compare as values from 2147483648 to 4294967295, and the test is usually false.
The compiler warnings are there to warn that testing with these types may not provide valid comparisons for certain ranges of numbers -- that you might not have anticipated.
In the real world, if you know A is only holding values from 0 - 900, then you can simply tell the compiler that 1) you understand the warning and by your cast will 2) guarantee the values will provide valid tests, e.g.
int32_t A = 5;
uint32_t B = 8;
if ( A >= 0 ) {
    /* A is non-negative here, so the cast preserves its value */
    if ( (uint32_t)A >= B )
        printf("Test");
}
else {
    /* handle error */
}
If you cannot make the guarantees for 1) & 2), then it is time to go rewrite the code in a way you are not faced with the warning.
Two good things happened here. You had compiler warnings enabled, and you took the time to read and understand what the compiler was telling you. This will come up time and time again. Now you know how to approach a determination of what can/should be done.
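If the range of A cannot be guaranteed, one possible rewrite is to widen both operands to a signed type that can hold every int32_t and every uint32_t value. The sketch below uses int64_t for that; the function name ge_widened is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compare a signed and an unsigned 32-bit value without losing the sign.
   int64_t can represent every int32_t and every uint32_t value, so both
   conversions are value-preserving and the comparison is exact. */
static bool ge_widened(int32_t a, uint32_t b)
{
    return (int64_t)a >= (int64_t)b;
}
```

This gives the mathematically expected answer for negative A as well, and both operands of the comparison have the same signed type.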
For example, there are 3 variables of long type, we add a and b and get s:
long a, b, s;
...
s = a + b;
Now what does ((s^a) < 0 && (s^b) < 0) mean?
I saw a check like this in the source code of Python:
if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {
/* INLINE: int + int */
register long a, b, i;
a = PyInt_AS_LONG(v);
b = PyInt_AS_LONG(w);
i = a + b;
if ((i^a) < 0 && (i^b) < 0)
goto slow_iadd;
x = PyInt_FromLong(i);
}
This code is wrong.
Assuming the usual 2's-complement rules of bitwise XOR for signed integers, then
(s^a) < 0
is the case if s and a have their sign bits set to opposite values. Thus,
((s^a) < 0 && (s^b) < 0)
indicates that s has sign different from both a and b, which must then have equal sign (pretending 0 is positive). If you added two integers of equal sign and got a result of different sign, there must have been an overflow, so this is an overflow check.
If we assume that signed overflow wraps, then s has opposite sign from a and b exactly when overflow has occurred. However, signed overflow is undefined behavior. Computing s is already wrong; we need to check whether the overflow would occur without actually performing the operation.
Python isn't supposed to do this. You can see what it's supposed to do in the Python 2 source code for int.__add__:
/* casts in the line below avoid undefined behaviour on overflow */
x = (long)((unsigned long)a + b);
if ((x^a) >= 0 || (x^b) >= 0)
It's supposed to cast to unsigned to get defined overflow behavior. The cast-to-unsigned fix was introduced in 5 different places as a result of issue 7406 on the Python bug tracker, but it looks like they missed a spot, or perhaps INPLACE_ADD was changed since then. I've left a message on the tracker.
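For comparison, checking whether the overflow would occur without performing the addition can be sketched like this. This is a generic pattern, not the code CPython uses, and the function names add_would_overflow and checked_add are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* True when a + b would not fit in a long.
   Every subtraction below stays inside the range of long. */
static bool add_would_overflow(long a, long b)
{
    if (b > 0)
        return a > LONG_MAX - b;
    if (b < 0)
        return a < LONG_MIN - b;
    return false;
}

/* Stores a + b in *sum and returns true, or returns false
   (leaving *sum untouched) when the addition would overflow. */
static bool checked_add(long a, long b, long *sum)
{
    if (add_would_overflow(a, b))
        return false;
    *sum = a + b;
    return true;
}
```

Because the test runs before the addition, no signed overflow is ever evaluated, so nothing here depends on wraparound.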
I don't see how this is possible.
If (s^a) < 0, then the sign bit of either s or a must be a 1, so either s or a (but not both) must be negative. Same for s and b. So either s is negative and both a and b are positive, or s is positive and both a and b are negative. Both situations seem impossible.
Unless you count integer overflow/underflow, of course.
I have encountered following code in FORTRAN77
(http://www-thphys.physics.ox.ac.uk/people/SubirSarkar/bbn/fastbbn.f):
Update = 1.
do k=1,12
Update = Update + alpha(i,k,x,effN)*(R(k)-1.)/1.
enddo
Y = Y * Update
I am wondering about the division by 1.! What's the reason?
I have translated to C as follows:
double Update = 1.;
for ( int k = 0; k < 12; ++k )
Update += alpha(i+1, k+1, x, effN) * (R[k]-1.) /*/ 1.*/; // CHECK!
Y *= Update;
Is that correct?
Remark: because arrays are indexed differently in C, there is a shift of +1 or -1 in the array index compared with the original code (I wanted to keep the same value as in the original code for the definition of the index, and therefore for the index passed as an argument to the function).
Thank you for your help!
Alain
The division by 1. has no effect that I can discern. Any type promotions that it might otherwise require are already required by the -1. in the dividend.
It is conceivable that on some specific platforms the division triggers some kind of desired behavior when the dividend has an exceptional value (i.e. an infinity or NaN), but that would be highly platform-specific.
It is also conceivable that the division is a holdover from some earlier version of the code where it actually had some effect.
Either way, your translation appears to be equivalent to the Fortran version, EXCEPT that nothing in what you presented justifies changing function alpha()'s first argument from i to i+1.
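One way to keep the index bookkeeping visible is to loop over the Fortran-style 1-based k and shift only where the C array is indexed. This is a sketch under assumptions: the signature of alpha is guessed from the call in the question, and alpha_fn, update_factor and alpha_demo are hypothetical names (alpha_demo is just a stand-in so the sketch compiles on its own).

```c
/* Assumed signature of alpha(), inferred from the call site only. */
typedef double (*alpha_fn)(int i, int k, double x, double effN);

/* Computes the factor that Y is multiplied by. k runs from 1 to 12 as in
   the Fortran loop; only the access to the 0-based C array R is shifted.
   The index i is passed through unchanged. */
static double update_factor(alpha_fn alpha, int i, const double R[12],
                            double x, double effN)
{
    double update = 1.0;
    for (int k = 1; k <= 12; ++k)
        update += alpha(i, k, x, effN) * (R[k - 1] - 1.0);
    return update;
}

/* Hypothetical stand-in for alpha(): returns k and ignores the rest. */
static double alpha_demo(int i, int k, double x, double effN)
{
    (void)i; (void)x; (void)effN;
    return (double)k;
}
```

The caller would then write Y *= update_factor(alpha, i, R, x, effN); whether i needs an offset depends on how alpha itself was translated.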
I'm just starting to learn C at school, I'm trying to get a hold of the basic concepts.
Our homework has a question,
for every int x: x+1 > x
Determine whether true or false, give reasoning if true and counterexample if false.
I'm confused because we were taught that the type int is 32 bits wide, which basically means the integer is stored in binary format. Is x+1 adding 1 to the decimal value of x?
x + 1 > x
is 1 for every int value except INT_MAX: INT_MAX + 1 overflows, so evaluating the expression x + 1 > x is undefined behavior when x is INT_MAX.
This actually means a compiler has the right to replace the expression:
x + 1 > x
with
1
As INT_MAX + 1 is undefined behavior, the compiler has the right to say that for this specific > expression INT_MAX + 1 is > INT_MAX.
As the x + 1 > x expression is undefined behavior for x == INT_MAX, it is also not safe to assume x + 1 > x can be false (0).
Note that if x was declared as an unsigned int instead of an int the situation is completely different. unsigned int operands never overflow (they wrap around): UINT_MAX + 1 == 0 and therefore x + 1 > x is 0 for x == UINT_MAX and 1 for all the other x values.
Modern compilers (like gcc) usually take the opportunity to optimize this expression and replace it with 1.
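If the goal is to find out whether x + 1 is safe to compute, the test has to avoid evaluating x + 1 in the first place. A minimal sketch, with hypothetical function names; the second function shows the well-defined unsigned case described above.

```c
#include <limits.h>
#include <stdbool.h>

/* True when x + 1 can be computed without signed overflow.
   The comparison itself never overflows. */
static bool can_increment(int x)
{
    return x < INT_MAX;
}

/* Unsigned arithmetic wraps, so this is well defined for every x:
   it is false only for x == UINT_MAX, where x + 1u becomes 0. */
static bool unsigned_successor_is_greater(unsigned int x)
{
    return x + 1u > x;
}
```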
For the record, there were some serious security issues with known server programs using code like:
if (ptr + offset < ptr)
The code was meant to trigger a safety condition but the compiler would optimize out the if statement (by replacing the expression with 0) and it allowed an attacker to gain privilege escalation in the server program (by opening the possibility of an exploitable buffer overflow if I remember correctly).
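A check of this kind can be written so that it never forms an out-of-range pointer at all, by comparing sizes instead of pointers. The sketch below assumes the code knows the buffer length and the current position as size_t values; the name range_fits and its parameters are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True when `offset` more bytes starting at index `pos` stay inside a
   buffer of `buf_len` bytes. Only unsigned sizes are compared, and
   buf_len - pos is evaluated only after pos <= buf_len is known, so
   nothing can wrap or overflow. */
static bool range_fits(size_t buf_len, size_t pos, size_t offset)
{
    return pos <= buf_len && offset <= buf_len - pos;
}
```

Because the comparison involves no pointer arithmetic, there is no undefined behavior for the compiler to exploit, and the check cannot be optimized away on those grounds.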
Note that the range of a 32-bit signed number is [-2147483648, 2147483647], which equals [-2^31, 2^31 - 1].
So the expression x + 1 > x is true for x in [-2147483648, 2147483646].
But not for 2147483647, because adding 1 to 2147483647 in a 32-bit number overflows. On many implementations this makes x + 1 equal to -2147483648, but the behavior is really undefined in the C standard.
So,
x + 1 > x is true for x in [-2147483648, 2147483646] only.
x + 1 > x for x = 2147483647 is undefined; the result may be true or false depending on the compiler. If the compiler computes x + 1 = -2147483648, the result will be false.
I don't want to hand you the answer, so I'll reply with a question that should get you on the right track.
What is x + 1 when x is the largest possible value that can be stored in a 32-bit signed integer? (2,147,483,647)
Yes, x + 1 adds 1 to the decimal value of x.
This will be true almost all of the time. But if you add 1 to INT_MAX (which is 2^15 - 1 or greater), you might flip the sign. Think about the decimal value of 01111111 versus 10000000. (Obviously not 32 bits, but the idea holds.)
Look up two's complement if you're confused about why it flips. It's a pretty clever implementation of integers that makes addition easy.
EDIT: INT_MAX + 1 is undefined behavior. Doesn't necessarily become INT_MIN. But since x + 1 is not necessarily > x when x == INT_MAX, then the answer is clearly false!
Following is the program that raised the mentioned doubt for me.
#include <stdio.h>
int main() {
int g = 300000*300000/300000;
printf("%d",g);
return 0;
}
When the * is evaluated, the result would be 90000000000, which is then divided by 300000.
I expected the first expression result to be stored somewhere then divided by 300000. So output would be 300000.
But it is giving me -647.
Does this mean it is evaluated as :
g = 300000*300000;
g = g / 300000;
Regardless of where it's stored, it's still of type int. Assuming int is 32-bits on your machine, you're getting integer overflow with 300000*300000.
300000*300000 -> 90000000000 -> -194313216 (integer overflow)
-194313216 / 300000 -> -647
Basically, temporaries (or intermediates) don't magically allow you to get around overflow.
*Note that signed integer overflow is technically undefined behavior. But in this case it happens to wrap-around the way you'd expect.
Both 300000 literals are of type int, so the compiler tries to fit the result into an int as well, and that leads to an integer overflow. The / 300000 afterwards can no longer help.
You might write 300000LL for one of the factors to make it a long long.
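As a sketch of that suggestion, the multiplication can be carried out in long long and narrowed only at the end. The function name mul_div is hypothetical, and the sketch assumes c is non-zero, that the product fits in long long (always true when int is 32 bits wide), and that the final quotient fits in int.

```c
/* Computes a * b / c with the intermediate product held in long long,
   so a product that exceeds the range of int does not overflow.
   Assumes c != 0 and that the quotient fits in int. */
static int mul_div(int a, int b, int c)
{
    long long wide = (long long)a * b;  /* b is converted to long long too */
    return (int)(wide / c);
}
```

With this, mul_div(300000, 300000, 300000) yields 300000, because the intermediate value 90000000000 is never squeezed into an int.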