I have a "C"code snippet as below
int32_t A = 5;
uint32_t B = 8;
if ( A >= B )
{
    printf("Test");
}
When I build this I receive a remark/warning: "comparison between signed and unsigned operands". Can anyone address this issue?
Everything is OK as long as A is non-negative and B is less than 2^31.
But if A is less than 0, unexpected behavior occurs.
A = -1 will be stored in memory as 0xFFFFFFFF (two's complement).
B = 5 will be stored in memory as 0x00000005.
When you do
if (A < B) {
    // something you expect to run
}
The compiler will compare them as unsigned 32-bit integers, so your if effectively becomes:
if (0xFFFFFFFF < 0x00000005) {
    // never taken: 0xFFFFFFFF is not less than 0x00000005
}
The compiler warns you about this possible problem.
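Here is a minimal program (my own sketch, not from the question) that triggers the warning and shows the surprising result:

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    int32_t  A = -1;
    uint32_t B = 5;

    // A is converted to uint32_t (0xFFFFFFFF), so the test is false
    if (A < B)
        printf("A < B (what you might expect)\n");
    else
        printf("A >= B (what actually happens)\n");
    return 0;
}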
Good, very good! You are reading and paying attention to your compiler warnings.
In your code:
int32_t A = 5;
uint32_t B = 8;
if ( A >= B )
{
    printf("Test");
}
You have A as a signed int32_t with min/max values of -2147483648/2147483647, and you have B as an unsigned uint32_t with min/max values of 0/4294967295, respectively. The compiler generates the warning to guard against comparisons that are always true or always false based on the types involved. Here A can never be greater than or equal to B for any value of B in the range 2147483648 through 4294967295; that whole swath of numbers yields false regardless of the individual values involved.
Another good example is if ( A < B ), which is mathematically true for every value of A from -2147483648 through -1, because an unsigned type can never be less than zero.
The compiler warnings are there to tell you that tests involving these types may not give valid comparisons for certain ranges of numbers -- ranges you might not have anticipated.
In the real world, if you know A only ever holds values from 0 through 900, then you can tell the compiler that (1) you understand the warning and, by your cast, (2) you guarantee the values will give a valid test, e.g.
int32_t A = 5;
uint32_t B = 8;

if (A >= 0) {
    if ((uint32_t)A >= B)
        printf("Test");
}
else {
    /* handle error */
}
If you cannot make guarantees (1) and (2), then it is time to rewrite the code so that you are not faced with the warning.
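One such rewrite, as a sketch (the helper name a_ge_b is mine): if 64-bit arithmetic is acceptable, widen both operands to a signed 64-bit type, which can represent every int32_t and every uint32_t value exactly.

#include <stdint.h>

/* sketch: compare in int64_t, which holds every value of both operand types */
static int a_ge_b(int32_t a, uint32_t b)
{
    return (int64_t)a >= (int64_t)b;
}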
Two good things happened here. You had compiler warnings enabled, and you took the time to read and understand what the compiler was telling you. This will come up time and time again. Now you know how to approach a determination of what can/should be done.
I'm making a function that takes a value using scanf_s and converts that into a binary value. The function works perfectly... until I put in a really high value.
I'm also doing this on VS 2019 in x64 in C
And in case it matters, I'm using
main(int argc, char* argv[])
for the main function.
Since I'm not sure what on earth is happening, here's the whole code I guess.
void BinaryGet(void)
{
    // Declaring lots of stuff
    int x, y, z, d, b, c;
    int counter = 0;
    int doubler = 1;
    int getb;
    int binarray[2000] = { 0 };
    // I only have to change things to 1 now, ain't I smart?
    int binappend[2000] = { 0 };

    // Get number
    printf("Gimme a number\n");
    scanf_s("%d", &getb);

    // Because why not
    printf("\n");

    // Get the amount of binary places to be used (how many times getb divides by 2)
    x = getb;
    while (x > 1)
    {
        d = x;
        counter += 1;
        // Tried x /= 2, gave me an infinite loop ;(
        x = d / 2;
    }

    // Fill the array with binary values (i.e. 1, 2, 4, 8, 16, 32, etc)
    for (b = 1; b <= counter; b++)
    {
        binarray[b] = doubler * 2;
        doubler *= 2;
    }

    // Compare the value of getb to binary values, subtract and repeat until getb = 0
    c = getb;
    for (y = counter; c >= 1; y--)
    {
        // Printing c at each subtraction
        printf("\n%d\n", c);

        // If the value of c (a temp variable) compares right to the binary value, subtract that binary value
        // and put a 1 in that spot in binappend, the 1 and 0 list
        if (c >= binarray[y])
        {
            c -= binarray[y];
            binappend[y] += 1;
        }

        // Prevents buffer underruns
        if (y <= 0)
        {
            break;
        }
    }

    // Print the result
    for (z = 0; z <= counter; z++)
    {
        printf("%d", binappend[z]);
    }
}
The problem is that when I put in the value 999999999999999999 (18 digits) it just prints 0 once and ends the function. The value of the digits doesn't matter though, 18 ones will have the same result.
However, when I put in 17 digits, it gives me this:
99999999999999999
// This is the input value after each subtraction
1569325055
495583231
495583231
227147775
92930047
25821183
25821183
9043967
655359
655359
655359
655359
131071
131071
131071
65535
32767
16383
8191
4095
2047
1023
511
255
127
63
31
15
7
3
1
// This is the binary
1111111111111111100100011011101
The binary value it gives me is 31 digits. I thought that it was weird that at 32, a convenient number, it gimps out, so I put in the value of the 32nd binary place minus 1 (2,147,483,647) and it worked. But adding 1 to that gives me 0.
Changing the type of array (unsigned int and long) didn't change this. Neither did changing the value in the brackets of the arrays. I tried searching to see if it's a limit of scanf_s, but found nothing.
I know for sure (I think) it's not the arrays, but probably something dumb I'm doing with the function. Can anyone help please? I'll give you a long-distance high five.
The problem is indeed related to the power-of-two size of the number you've noticed, but it's in this call:
scanf_s("%d", &getb);
The %d conversion means it is reading into a signed int, which on your platform is probably 32 bits; since it's signed, it can only go up to 2³¹-1 in the positive direction.
The conversion specifiers used by scanf() and related functions can accept larger sizes of data types though. For example %ld will accept a long int, and %lld will accept a long long int. Check the data type sizes for your platform, because a long int and an int might actually be the same size (32 bits) eg. on Windows.
So if you use %lld instead, you should be able to read larger numbers, up to the range of a long long int, but make sure you change the target (getb) to match! Also if you're not interested in negative numbers, let the type system help you out and use an unsigned type: %llu for an unsigned long long.
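As a sketch (using standard scanf here; the same numeric conversions work with scanf_s), assuming the input is non-negative and fits in an unsigned long long:

#include <stdio.h>

int main(void)
{
    unsigned long long getb;

    // %llu matches unsigned long long; check the return value (more on that below)
    if (scanf("%llu", &getb) == 1)
        printf("read %llu\n", getb);
    return 0;
}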
Some details:
If scanf or its friends fail, the value in getb is indeterminate ie. uninitialised, and reading from it is undefined behaviour (UB). UB is an extremely common source of bugs in C, and you want to avoid it. Make sure your code only reads from getb if scanf tells you it worked.
In fact, in general it is not possible to avoid UB with scanf unless you're in complete control of the input (eg. you wrote it out previously with some other, bug free, software). While you can check the return value of scanf and related functions (it will return the number of fields it converts), its behaviour is undefined if, say, a field is too large to fit into the data type you have for it.
There's a lot more detail on scanf and friends in its documentation.
To avoid problems with not knowing how big an int is, or whether a long int differs from one platform to another, there is also the header stdint.h, which defines integer types of a specific width, e.g. int64_t. Its companion header inttypes.h provides macros for use with scanf() like SCNd64. These are available from C99 onwards, but note that Windows' support of C99 in its compilers is incomplete and may not include this.
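A sketch of that fixed-width route (the "%" SCNd64 splicing is the standard inttypes.h idiom):

#include <stdio.h>
#include <inttypes.h>  // int64_t (via stdint.h) plus the SCNd64/PRId64 macros

int main(void)
{
    int64_t getb;

    // SCNd64 expands to the correct conversion specifier for int64_t
    if (scanf("%" SCNd64, &getb) == 1)
        printf("read %" PRId64 "\n", getb);
    return 0;
}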
Don't be so hard on yourself, you're not dumb, C is a hard language to master and doesn't follow modern idioms that have developed since it was first designed.
I want the difference between two unbounded integers, each represented by a uint32_t value which is the unbounded integer taken modulo 2^32. As in, for example, TCP sequence numbers. Note that the modulo 2^32 representation can wrap around 0, unlike more restricted questions that do not allow wrapping around 0.
Assume that the difference between the underlying unbounded integers is in the range of a normal int. I want this signed difference value. In other words, return a value within the normal int range that is equivalent to the difference of the two uint32_t inputs modulo 2^32.
For example, 0 - 0xffffffff = 1 because we assume that the underlying unbounded integers are in int range. Proof: if A mod 2^32 = 0 and B mod 2^32 = 0xffffffff, then (A=0, B=-1) (mod 2^32) and therefore (A-B=1) (mod 2^32) and in the int range this modulo class has the single representative 1.
I have used the following code:
static inline int sub_tcp_sn(uint32_t a, uint32_t b)
{
    uint32_t delta = a - b;

    // this would work on most systems
    return delta;
    // what is the language-safe way to do this?
}
This works on most systems because they use modulo-2^32 representations for both uint and int, and a normal modulo-2^32 subtraction is the only reasonable assembly code to generate here.
However, I believe that the C standard only defines the result of the above code when delta is at most INT_MAX (so the conversion to int preserves the value). For example, on this question one answer says:
If we assign an out-of-range value to an object of signed type, the
result is undefined. The program might appear to work, it might crash,
or it might produce garbage values.
How should a modulo-2^32 conversion from uint to int be done according to the C standard?
Note: I would prefer the answer code not to involve conditional expressions, unless you can prove it's required. (case analysis in the explanation of the code is OK).
There must be a standard function that does this... but in the meantime:
#include <stdint.h> // uint32_t
#include <limits.h> // INT_MAX
#include <assert.h> // assert
static inline int sub_tcp_sn(uint32_t a, uint32_t b)
{
    uint32_t delta = a - b;

    return delta <= INT_MAX ? delta : -(int)~delta - 1;
}
Note that if the true difference is not representable in an int, this returns the wrong representative (the difference reduced modulo 2^32) rather than the real value, but the question said that was OK.
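A quick sanity check against the example from the question:

#include <assert.h>

int main(void)
{
    assert(sub_tcp_sn(0u, 0xffffffffu) == 1);  // wraps around 0
    assert(sub_tcp_sn(0u, 1u) == -1);
    return 0;
}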
If the system has a 64-bit long long type, then the range can easily be customized and checked as well:
typedef long long sint64_t;

static inline sint64_t sub_tcp_sn_custom_range(uint32_t a, uint32_t b,
                                               sint64_t out_min, sint64_t out_max)
{
    assert(sizeof(sint64_t) == 8);
    uint32_t delta = a - b;
    sint64_t result = delta <= out_max ? delta : -(sint64_t)-delta;
    assert(result >= out_min && result <= out_max);
    return result;
}
For example, sub_tcp_sn_custom_range(0x10000000, 0, -0xf0000000LL, 0x0fffffffLL) == -0xf0000000.
With the range customization, this solution minimizes range loss in all situations, assuming the timestamps behave linearly (for example, no special meaning attached to wrapping around 0) and a signed 64-bit type is available.
In the following code, x and y are int32_t variables. In this simplified example, they always differ by 1. When they span the int32_t overflow boundary (0x7FFFFFFF, the maximum 2's complement 32-bit positive number, to 0x80000000, the largest-magnitude negative number), subtracting them seems to give different results when it is done inside the conditional of the if statement (Method 1) than when the result is stored in a temporary variable (Method 2). Why don't they give the same result?
I would think that subtracting two int32_t variables would yield a result of type int32_t, so using a temporary of that type shouldn't change anything. I tried explicitly typecasting inside the if statement conditional; that didn't change anything. FWIW, Method 2 gives the result I would expect.
The code:
int32_t x = (0x80000000 - 3);
int i;

for (i = 0; i < 5; ++i)
{
    int32_t y = x + 1; // this may cause rollover from 0x7fffffff (positive) to 0x80000000 (negative)
    UARTprintf("\n" "x = 0x%08X, y = 0x%08X", x, y);

    if ((y - x) >= 1) // Method 1
        UARTprintf(" - true ");
    else
        UARTprintf(" - FALSE");

    int32_t z = (y - x); // Method 2
    if (z >= 1)
        UARTprintf(" - true ");
    else
        UARTprintf(" - false");

    ++x;
}
Output:
x = 0x7ffffffd, y = 0x7ffffffe - true - true
x = 0x7ffffffe, y = 0x7fffffff - true - true
x = 0x7fffffff, y = 0x80000000 - FALSE - true
x = 0x80000000, y = 0x80000001 - true - true
x = 0x80000001, y = 0x80000002 - true - true
In my actual application (not this simplified example), y is incremented by a hardware timer and x is a record of when some code was last executed. The test is intended to make some code run at intervals. Considering that y represents time and the application may run for a very long time before it is restarted, just not letting it overflow isn't an option.
Noting, as several of you did, that the standard does not define the behavior when signed integer overflow occurs tells me that I don't have a right to complain that I can't count on it working the way I want it to, but it doesn't give me a solution I can count on. Even using a temporary variable, which seems to work with my current compiler version and settings, might quit working when one of those things changes. Do I have any trustworthy options short of resorting to assembly code?
Given that signed integer overflow leads to undefined behaviour, you had better not try to explain it.
Because your assumptions are based on "common sense", not the standard.
Otherwise, check the assembly and try to debug it, but again, the outcome would not be scalable: you won't be able to apply the new knowledge to some other case (though no doubt it would be fun to do).
The question I didn't know enough to ask originally is, "How can I avoid undefined behavior when doing subtraction on integers that might have overflowed?" Please correct me if I am wrong, but it appears that the answer to that question would be "use unsigned rather than signed integers" because the results are well defined (per C11 6.2.5/9) "a result that cannot be represented by the resulting unsigned integer type is reduced modulo the number that is one greater than the largest value that can be represented by the resulting type."
In this case, that is enough to come up with a working solution because the elapsed time will always be zero or a small positive number, therefore the result of the subtraction will always be positive. So the result of the subtraction could be kept as an unsigned number and compared ( ">=1" ), or converted back to a signed int for comparison (C11 6.3.1.3: "When a value with integer type is converted to another integer type ... if the value can be represented by the new type, it is unchanged."). This code works, and I believe does not rely on any undefined behavior: "if( (int32_t)( (uint32_t)y - (uint32_t)x ) >= 1 )"
In the more general case, however, converting to unsigned to do the subtraction, then converting back to a signed result (which might be negative) is not well defined. C11 6.3.1.3 says regarding converting to another integer type that if "the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised." So I can still imagine a scenario in which assembly code would be needed to achieve well-defined results; this just isn't one of them.
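For the timer use case specifically, here is a minimal sketch (the helper name and shape are mine, not from the question) that keeps the subtraction in unsigned arithmetic, where wraparound is well defined:

#include <stdint.h>

/* Hypothetical helper for the timer scenario: the subtraction is done in
   uint32_t, where wraparound is well defined (C11 6.2.5/9), and the result
   stays unsigned because elapsed time here is never negative. */
static int interval_elapsed(uint32_t now, uint32_t last, uint32_t interval)
{
    return (uint32_t)(now - last) >= interval;
}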
For example, say there are three variables of type long; we add a and b and get s:
long a, b, s;
...
s = a + b;
Now what does ((s^a) < 0 && (s^b) < 0) mean?
I saw a check like this in the source code of Python:
if (PyInt_CheckExact(v) && PyInt_CheckExact(w)) {
    /* INLINE: int + int */
    register long a, b, i;
    a = PyInt_AS_LONG(v);
    b = PyInt_AS_LONG(w);
    i = a + b;
    if ((i^a) < 0 && (i^b) < 0)
        goto slow_iadd;
    x = PyInt_FromLong(i);
}
This code is wrong.
Assuming the usual 2's-complement rules of bitwise XOR for signed integers, then
(s^a) < 0
is the case if s and a have their sign bits set to opposite values. Thus,
((s^a) < 0 && (s^b) < 0)
indicates that s has sign different from both a and b, which must then have equal sign (pretending 0 is positive). If you added two integers of equal sign and got a result of different sign, there must have been an overflow, so this is an overflow check.
If we assume that signed overflow wraps, then s has opposite sign from a and b exactly when overflow has occurred. However, signed overflow is undefined behavior. Computing s is already wrong; we need to check whether the overflow would occur without actually performing the operation.
Python isn't supposed to do this. You can see what it's supposed to do in the Python 2 source code for int.__add__:
/* casts in the line below avoid undefined behaviour on overflow */
x = (long)((unsigned long)a + b);
if ((x^a) >= 0 || (x^b) >= 0)
It's supposed to cast to unsigned to get defined overflow behavior. The cast-to-unsigned fix was introduced in 5 different places as a result of issue 7406 on the Python bug tracker, but it looks like they missed a spot, or perhaps INPLACE_ADD was changed since then. I've left a message on the tracker.
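For reference, a minimal sketch (my own, not Python's code) of checking whether signed overflow would occur before performing the addition, using only defined behavior:

#include <limits.h>

/* Returns nonzero if a + b would overflow a long. The checks themselves
   never overflow because they only subtract b from the relevant limit. */
static int add_would_overflow(long a, long b)
{
    return (b > 0 && a > LONG_MAX - b) ||
           (b < 0 && a < LONG_MIN - b);
}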
I don't see how this is possible.
If (s^a) < 0, then the sign bit of either s or a must be a 1, so either s or a (but not both) must be negative. Same for s and b. So either s is negative and both a and b are positive, or s is positive and both a and b are negative. Both situations seem impossible.
Unless you count integer overflow/underflow, of course.
John Regehr's blog post A Guide to Undefined Behavior in C and C++, Part 1 contains the following "safe" function for "performing integer division without executing undefined behavior":
int32_t safe_div_int32_t (int32_t a, int32_t b) {
    if ((b == 0) || ((a == INT32_MIN) && (b == -1))) {
        report_integer_math_error();
        return 0;
    } else {
        return a / b;
    }
}
I'm wondering what is wrong with the division (a/b) when a = INT32_MIN and b = -1. Is it undefined? If so why?
I think it's because the absolute value of INT32_MIN is 1 larger than INT32_MAX. So INT32_MIN/-1 actually equals INT32_MAX + 1 which would overflow.
So for 32-bit integers, there are 4,294,967,296 values.
There are 2,147,483,648 values for negative numbers (-2,147,483,648 to -1).
There is 1 value for zero (0).
There are 2,147,483,647 values for positive numbers (1 to 2,147,483,647) because 0 took 1 value away from the positive numbers.
This is because int32_t is represented using two's complement, and numbers with N bits in two's complement range from -2^(N-1) to 2^(N-1)-1. Therefore, when you carry out the division with N = 32, you get -2^31 / -1 = 2^31. Notice that the result is larger than 2^31 - 1 (INT32_MAX), meaning you get an overflow!
The other posters are correct about the causes of the overflow. The implication of the overflow on most machines is that INT_MIN / -1 => INT_MIN. The same thing happens when multiplying by -1. This is an unexpected and possibly dangerous result. I've seen a fixed-point motor controller go out of control because it didn't check for this condition.
Because INT32_MIN is defined as (-INT32_MAX - 1) = -(INT32_MAX + 1), dividing it by -1 would give INT32_MAX + 1 => an integer overflow. I must say, that is a nice way to check for overflows. Thoughtfully written code. +1 to the developer.
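A quick sketch exercising the guard (report_integer_math_error is the blog post's placeholder; it is stubbed here only to make the example self-contained):

#include <stdint.h>
#include <stdio.h>
#include <inttypes.h>

static void report_integer_math_error(void)
{
    fprintf(stderr, "integer math error\n");
}

static int32_t safe_div_int32_t(int32_t a, int32_t b)
{
    if ((b == 0) || ((a == INT32_MIN) && (b == -1))) {
        report_integer_math_error();
        return 0;
    }
    return a / b;
}

int main(void)
{
    printf("%" PRId32 "\n", safe_div_int32_t(10, 3));          /* prints 3 */
    printf("%" PRId32 "\n", safe_div_int32_t(INT32_MIN, -1));  /* guarded: prints 0 */
    return 0;
}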