"comparison between signed and unsigned integer expressions" with only unsigned integers - c

This warning should not appear for this code, should it?
#include <stdio.h>

int main(void) {
    unsigned char x = 5;
    unsigned char y = 4;
    unsigned int z = 3;
    puts((z >= x - y) ? "A" : "B");
    return 0;
}
z is a different size but it is the same signedness. Is there something about integer conversions that I'm not aware of? Here's the gcc output:
$ gcc -o test test.c -Wsign-compare
test.c: In function ‘main’:
test.c:10:10: warning: comparison between signed and unsigned integer expressions [-Wsign-compare]
puts((z >= x - y) ? "A" : "B");
^
$ gcc --version
gcc (Debian 4.9.1-15) 4.9.1
If z is an unsigned char I do not get the error.

The issue is that the additive operators perform the usual arithmetic conversions on their operands. In this case that means the integer promotions are applied first, which convert each unsigned char operand to int, since signed int can represent all the values of unsigned char. So x - y has type int, a signed type.
A related thread Why must a short be converted to an int before arithmetic operations in C and C++? explains the rationale for promotions.
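As a quick way to see the promotion in action, C11's _Generic can report the type of x - y at compile time (my own illustration, not part of the original answer):

#include <stdio.h>

int main(void) {
    unsigned char x = 5, y = 4;
    /* _Generic selects a string based on the type of x - y */
    puts(_Generic(x - y,
                  int: "int",
                  unsigned int: "unsigned int",
                  default: "something else"));  /* prints "int" */
    return 0;
}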

C has this concept called "Integer Promotion".
Basically it means that all maths is done in signed int unless you really insist otherwise, or it doesn't fit.
If I put in the implicit conversions, your example actually reads like this:
puts((z >= (int)x - (int)y) ? "A" : "B");
So, now you see the signed/unsigned mismatch.
Unfortunately, you can't safely correct this problem using casts alone. There are a few options:
puts((z >= (unsigned int)(x - y)) ? "A" : "B");
or
puts((z >= (unsigned int)x - (unsigned int)y) ? "A" : "B");
or
puts(((int)z >= x - y) ? "A" : "B");
But they all suffer from the same problem: what if y is larger than x, and what if z is larger than INT_MAX (not that it will be in this example)?
A properly correct solution might look like this:
puts((y > x || z >= (unsigned)(x - y)) ? "A" : "B");
In the end, unless you really need the extra bit, it is usually best to avoid unsigned integers.
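Putting the pieces together, a version of the original program that compiles cleanly with -Wsign-compare might look like this (my own assembly of the fix above, assuming the y > x guard is the behavior you want):

#include <stdio.h>

int main(void) {
    unsigned char x = 5;
    unsigned char y = 4;
    unsigned int z = 3;

    /* handle the mathematically negative case explicitly before
       casting the difference to unsigned */
    puts((y > x || z >= (unsigned)(x - y)) ? "A" : "B");
    return 0;
}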

How to implement wrapping signed int addition in C

This is a complete rewrite of the question. Hopefully it is clearer now.
I want to implement in C a function that performs addition of signed ints with wrapping in case of overflow.
I want to target mainly the x86-64 architecture, but of course the more portable the implementation is the better. I'm also concerned mostly about producing decent assembly code through gcc, clang, icc, and whatever is used on Windows.
The goal is twofold:
write correct C code that doesn't fall into the undefined behavior black hole;
write code that gets compiled to decent machine code.
By decent machine code I mean a single leal or a single addl instruction on machines which natively support the operation.
I'm able to satisfy either of the two requirements, but not both.
Attempt 1
The first implementation that comes to mind is
int add_wrap(int x, int y) {
    return (unsigned) x + (unsigned) y;
}
This seems to work with gcc, clang and icc. However, as far as I know, the C standard doesn't fully specify the conversion from unsigned int to signed int, leaving freedom to the implementations (see also here).
Otherwise, if the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.
I believe most (all?) major compilers do the expected conversion from unsigned to int, meaning that they take the representative congruent modulo 2^N, where N is the number of bits, but it's not mandated by the standard so it cannot be relied upon (stupid C standard hits again). Also, while this is the simplest thing to do on two's complement machines, it is impossible on ones' complement machines, because one residue class is not representable: that of 2^(N-1).
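For what it's worth, there is a well-known portable idiom for the modulo-2^N unsigned-to-int conversion that never performs an out-of-range signed conversion (my own sketch; it assumes INT_MIN == -INT_MAX - 1, i.e. the two's complement range, which C23 guarantees):

#include <limits.h>

static int u_to_i(unsigned u) {
    if (u <= (unsigned)INT_MAX)
        return (int)u;
    /* u - INT_MAX - 1 lies in [0, INT_MAX], so both the cast and
       the addition below stay in range */
    return (int)(u - (unsigned)INT_MAX - 1u) + INT_MIN;
}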
Attempt 2
According to the clang docs, one can use __builtin_add_overflow like this
int add_wrap(int x, int y) {
    int res;
    __builtin_add_overflow(x, y, &res);
    return res;
}
and this should do the trick with clang, because the docs clearly say
If possible, the result will be equal to mathematically-correct result and the builtin will return 0. Otherwise, the builtin will return 1 and the result will be equal to the unique value that is equivalent to the mathematically-correct result modulo two raised to the k power, where k is the number of bits in the result type.
The problem is that in the GCC docs they say
These built-in functions promote the first two operands into infinite precision signed type and perform addition on those promoted operands. The result is then cast to the type the third pointer argument points to and stored there.
As far as I know, converting an out-of-range value to int is implementation-defined, so I don't see any guarantee that this will result in the wrapping behavior.
As you can see here, GCC will also generate the expected code, but I wanted to be sure that this is not by chance and is indeed part of the specification of __builtin_add_overflow.
icc also seems to produce something reasonable.
This produces decent assembly, but relies on intrinsics, so it's not really standard compliant C.
Attempt 3
Follow the suggestions of those pedantic guys from SEI CERT C Coding Standard.
In their CERT INT32-C recommendation they explain how to check in advance for potential overflow. Here is what comes out of following their advice:
#include <limits.h>

int add_wrap(int x, int y) {
    if ((x > 0) && (y > INT_MAX - x))
        return (x + INT_MIN) + (y + INT_MIN);
    else if ((x < 0) && (y < INT_MIN - x))
        return (x - INT_MIN) + (y - INT_MIN);
    else
        return x + y;
}
The code performs the correct checks and compiles to leal with gcc, but not with clang or icc.
The whole CERT INT32-C recommendation is complete garbage, because it tries to transform C into a "safe" language by forcing the programmers to perform checks that should be part of the definition of the language in the first place. And in doing so it also forces the programmer to write code which the compiler can no longer optimize, so what is the reason to use C anymore?!
Edit
The tension is between portability and the quality of the generated assembly.
For instance, with both gcc and clang, the two following functions, which are supposed to do the same thing, get compiled to different assembly.
f is bad in both cases, g is good in both cases (addl + jo or addl + cmovnol). I don't know whether jo is better than cmovnol, but g is consistently better than f.
#include <limits.h>

signed int f(signed int si_a, signed int si_b) {
    if (((si_b > 0) && (si_a > (INT_MAX - si_b))) ||
        ((si_b < 0) && (si_a < (INT_MIN - si_b)))) {
        return 0;
    } else {
        return si_a + si_b;
    }
}

signed int g(signed int si_a, signed int si_b) {
    signed int sum;
    if (__builtin_add_overflow(si_a, si_b, &sum)) {
        return 0;
    } else {
        return sum;
    }
}
A bit like @Andrew's answer, but without the memcpy().
Use a union to negate the need for memcpy(). With C2x, we are sure that int is two's complement.
int add_wrap(int x, int y) {
    union {
        unsigned un;
        int in;
    } u = { .un = (unsigned) x + (unsigned) y };
    return u.in;
}
For those who like 1-liners, use a compound literal.
int add_wrap2(int x, int y) {
    return (union { unsigned un; int in; }) { .un = (unsigned) x + (unsigned) y }.in;
}
I'm not so sure because of the rules for casting from unsigned to signed
You quoted the rules exactly. If you convert from an unsigned value to a signed one, the result is implementation-defined or a signal is raised. In simple words, what happens is described by your compiler's documentation.
For example, the gcc 9.2.0 compiler has the following in its documentation about implementation-defined behavior of integers:
The result of, or the signal raised by, converting an integer to a signed integer type when the value cannot be represented in an object of that type (C90 6.2.1.2, C99 and C11 6.3.1.3).
For conversion to a type of width N, the value is reduced modulo 2^N to be within range of the type; no signal is raised.
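So with gcc specifically, the conversion back to int has documented behavior (my own illustration, assuming 32-bit int):

#include <stdio.h>

int main(void) {
    unsigned u = 0xFFFFFFFFu;  /* UINT_MAX with 32-bit unsigned */
    int i = (int)u;            /* reduced modulo 2^32 per gcc's docs: i == -1 */
    printf("%d\n", i);
    return 0;
}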
I had to do something similar; however, I was working with fixed-width types from stdint.h and needed to handle wrapping 32-bit signed integer operations. The implementation below works because the stdint.h exact-width types are required to be two's complement. I was trying to emulate the behaviour in Java, so I had some Java code generate a bunch of test cases, and I have tested on clang, gcc and MSVC.
#include <stdint.h>

inline int32_t add_wrap_i32(int32_t a, int32_t b)
{
    const int64_t a_widened = a;
    const int64_t b_widened = b;
    const int64_t sum = a_widened + b_widened;
    return (int32_t)(sum & INT64_C(0xFFFFFFFF));
}

inline int32_t sub_wrap_i32(int32_t a, int32_t b)
{
    const int64_t a_widened = a;
    const int64_t b_widened = b;
    const int64_t difference = a_widened - b_widened;
    return (int32_t)(difference & INT64_C(0xFFFFFFFF));
}

inline int32_t mul_wrap_i32(int32_t a, int32_t b)
{
    const int64_t a_widened = a;
    const int64_t b_widened = b;
    const int64_t product = a_widened * b_widened;
    return (int32_t)(product & INT64_C(0xFFFFFFFF));
}
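For example (my own spot check, mirroring Java's wrapping semantics):

#include <inttypes.h>
#include <stdio.h>

int main(void) {
    /* INT32_MAX + 1 wraps to INT32_MIN, as it does in Java */
    printf("%" PRId32 "\n", add_wrap_i32(INT32_MAX, 1));   /* -2147483648 */
    printf("%" PRId32 "\n", mul_wrap_i32(65536, 65536));   /* 2^32 wraps to 0 */
    return 0;
}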
It seems ridiculous, but I think the recommended method is to use memcpy. Apparently all modern compilers optimize the memcpy away, and it ends up doing just what you're hoping for in the first place: preserving the bit pattern from the unsigned addition.
#include <string.h>

int add_wrap(int a, int b) {
    unsigned u = (unsigned)a + b;
    int result;
    memcpy(&result, &u, sizeof(result));  /* optimized away by modern compilers */
    return result;
}
On x86, clang with optimization compiles this to a single instruction if the destination is a register.

Why does this conditional expression have the size of a float?

I expect the output to be "short int" but the output is "float".
#include <stdio.h>

int main(void)
{
    int x = 1;
    short int i = 2;
    float f = 3;

    if (sizeof((x == 2) ? f : i) == sizeof(float))
        printf("float\n");
    else if (sizeof((x == 2) ? f : i) == sizeof(short int))
        printf("short int\n");
}
You expect (x == 2) ? f : i to have a type based on the value of x. But that is not how the C type system operates. The conditional operator is an expression, and all* expressions in C have a fixed type at compile time. It is this type that sizeof operates on. The value of the expression will depend on the value of x, but the type depends on f and i alone.
In this case, the type is determined by the usual arithmetic conversions, which nominate float as the type of the result, the same as if you had written f + i, where the result would unsurprisingly be a float too.
(*) - VLAs are an exception to this rule, but your question is not about one, so it's irrelevant here.
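One way to confirm that the type, and therefore the size, is fixed at compile time regardless of x (my own demonstration):

#include <stdio.h>

int main(void) {
    short int i = 2;
    float f = 3;

    for (int x = 1; x <= 2; ++x) {
        /* the value changes with x, but the size never does */
        printf("x = %d: value = %f, size = %zu\n",
               x, (double)((x == 2) ? f : i), sizeof((x == 2) ? f : i));
    }
    return 0;
}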
You are asking the compiler to compute the size of (x == 2) ? f : i and that expression is a float.
Remember that sizeof is a compile-time operator, and that the ?: ternary conditional operator yields a type to which both the "then" and the "else" operands can be converted.
For details, refer to some C reference and to the C11 standard n1570

Integer overflow test operator? (&+)

This question could just be another case of interpreting the operator incorrectly. But a while ago, I saw someone tweeting about an operator that allegedly can be used to check for integer overflow in C. Namely the &+ (ampersand-plus) operator, and it could be used simply like so:
#include <stdio.h>
#include <stdint.h>

int main()
{
    uint32_t x, y;
    x = 0xFFFFFFFF;
    y = 1;

    if (x &+ y) {
        printf("Integer overflow!\n");
    } else {
        printf("No overflow\n");
    }

    return 0;
}
It does seem to work as one would expect, and GCC 6 doesn't throw me any warnings or errors when compiling it with these parameters: gcc -Wall -Wextra -Werror of.c
But oddly enough, I have yet to find any documentation about this operator, and I never saw it used anywhere. Could someone please explain how this works?
The expression
x &+ y
is parsed as
x & (+y)
using the unary plus operator, which has no effect (in this case) and just returns y. That means the expression is equivalent to
x & y
which does not test for integer overflow and instead just checks if x and y have any bits in common. Try changing x and y to 1 and see what happens; it'll report an overflow even though none will occur.
That was probably a joke; there's no such thing as the &+ operator. If you write x&+y, it is interpreted as x & (+y), where & is the binary bitwise AND operator, and + is the unary plus operator, which does nothing besides possibly performing arithmetic promotion (e.g. if y were a short or a char, it would be promoted to an int; in your case it does nothing).
Anyhow, this expression doesn't really have anything to do with checking for overflow.
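For unsigned types, where wraparound is well defined, a correct test is simply whether the sum came out smaller than one of the operands (my own illustration):

#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t x = 0xFFFFFFFF;
    uint32_t y = 1;

    /* unsigned addition wraps modulo 2^32, so the wrapped sum is
       smaller than x exactly when the true sum doesn't fit */
    if ((uint32_t)(x + y) < x) {
        printf("Integer overflow!\n");
    } else {
        printf("No overflow\n");
    }
    return 0;
}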
You probably want to use __builtin_add_overflow (fully generic, and somewhat less common) or __builtin_uadd_overflow (for unsigned ints):
#include <stdio.h>
#include <stdint.h>

int main()
{
    uint32_t x, y;
    x = 0xFFFFFFFF;
    y = 1;

    if (__builtin_add_overflow(x, y, &x)) {
        printf("Integer overflow!\n");
    } else {
        printf("No overflow\n");
    }

    return 0;
}
There are no built-in checking operators in gcc/clang, AFAIK, and &+ is just & followed by a unary +.
I'm personally using a wrapper macro that uses these builtins, if they're available, or falls back to a builtin-less solution inspired by the overflow checking code that's available at
https://www.securecoding.cert.org/confluence/display/c/INT32-C.+Ensure+that+operations+on+signed+integers+do+not+result+in+overflow
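Such a wrapper might look roughly like this (my own sketch, not the author's actual macro; the fallback mirrors the CERT-style precondition checks, here for int only):

#include <limits.h>

#if defined(__GNUC__) || defined(__clang__)
#define ADD_OVERFLOWS(a, b, res) __builtin_add_overflow((a), (b), (res))
#else
/* builtin-less fallback: returns 1 if a + b would overflow */
static int add_overflows_int(int a, int b, int *res)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 1;
    *res = a + b;
    return 0;
}
#define ADD_OVERFLOWS(a, b, res) add_overflows_int((a), (b), (res))
#endif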

C modulus returning negative number

My data has type unsigned __int128, so I don't think this is a type issue, but I have no idea why it is occurring:
#include <stdio.h>
#include <math.h>

int main(int argc, char *argv[]) {
    unsigned __int128 z = 1911602146;
    unsigned __int128 n = 4003562209;

    // case 1
    unsigned __int128 result = fmod((pow(z, 2) * 2), n);
    printf("%d\n", result);

    // case 2
    unsigned __int128 result_2 = fmod(pow(z, 2), n);
    printf("%d\n", result_2);
}
returns:
-669207835 => this is the computation I actually want, and it should be 7629321670
-480306461
printf("%d\n", result);
// ^^
%d expects an int. You're passing it an unsigned __int128 instead, resulting in undefined behavior. Most likely, printf is taking part of the representation of result and interpreting it as an int.
I don't know what the right format specifier would be, but you should find the right one and use it. Also, you shouldn't be using floating-point functions on your data; you're losing precision there.
First, avoid the floating-point round trip during the calculation.
Then, only for printing, convert the result (which is < 10^10) to double in order to use the printf function:
unsigned __int128 z = 1911602146;
unsigned __int128 n = 4003562209;

unsigned __int128 result = (z * z * 2) % n;
printf("%.0lf\n", (double)result);

unsigned __int128 result_2 = (z * z) % n;
printf("%.0lf\n", (double)result_2);
That should give you
3625759213
3814660711
(you cannot get 7629321670 anyway as a result, since it is bigger than the modulus n)
First of all, __int128 is a GNU CC extension, and thus there's no portable way to handle these values, nor a portable way to print them.
As it happens, there's (ironically...) no support, not even in glibc, for printf()ing __int128s or unsigned __int128s.
The only alternative you have is to write your own functions to print them out, in decimal or, better yet, in hexadecimal, because integers this large get unreadable in decimal all too easily.
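A minimal sketch of such a helper (my own illustration, not from the original answer), printing in decimal via recursion:

#include <stdio.h>

/* recursive decimal printer for unsigned __int128 */
static void print_u128(unsigned __int128 v)
{
    if (v >= 10)
        print_u128(v / 10);   /* higher-order digits first */
    putchar('0' + (int)(v % 10));
}

You would call print_u128(result); putchar('\n'); in place of the printf.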
BTW, this is undefined behaviour:
printf("%d\n", result);
Because the "%d" specifier expects an int as argument, nothing less, nothing more, nothing else.
I hope this has shed some light on the issue!

warning: comparison of unsigned expression >= 0 is always true

I have the following error when compiling a C file:
ft_memmove.c: In function ‘ft_memmove’:
ft_memmove.c:19: warning: comparison of unsigned expression >= 0 is always true
Here's the full code, via cat ft_memmove.c:
#include "libft.h"
#include <string.h>
void *ft_memmove(void *s1, const void *s2, size_t n)
{
char *s1c;
char *s2c;
size_t i;
if (!s1 || !s2 || !n)
{
return s1;
}
i = 0;
s1c = (char *) s1;
s2c = (char *) s2;
if (s1c > s2c)
{
while (n - i >= 0) // this triggers the error
{
s1c[n - i] = s2c[n - i];
++i;
}
}
else
{
while (i < n)
{
s1c[i] = s2c[i];
++i;
}
}
return s1;
}
I do understand that size_t is unsigned and that both integers will be >= 0 because of that. But since I'm subtracting one from the other, I don't get it. Why does this error come up?
If you subtract two unsigned integers in C, the result will also be unsigned. It doesn't automatically become signed just because you subtracted. One way to fix that is to use n >= i instead of n - i >= 0.
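Applied to the loop in question, a version that avoids the always-true comparison might look like this (my sketch; note that the original also indexes s1c[n - i], which starts one element past the end, so the index is adjusted here as well):

size_t i = n;
while (i > 0)
{
    --i;
    s1c[i] = s2c[i];   /* copies indices n - 1 down to 0 */
}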
Consider this loop:
for (unsigned int i = 5; i >= 0; i--)
{
}
This loop will be infinite, because when i would become -1 it is instead interpreted as a very large positive value, since there is no sign bit in an unsigned int.
This is the reason a warning is generated here.
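A common idiom that counts down safely with an unsigned counter (my own addition):

/* the test happens before the decrement, so the body sees
   i = 5, 4, 3, 2, 1, 0 and then the loop terminates cleanly */
for (unsigned int i = 6; i-- > 0; )
{
    /* ... */
}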
According to section 6.3.1.8 Usual arithmetic conversions of the draft C99 standard, since both operands have the same type, the result will also be size_t. The section states:
[...]Unless explicitly stated otherwise, the common real type is also the corresponding real type of the result[...]
and later on says:
If both operands have the same type, then no further conversion is needed.
Mathematically, you can just move the i over to the other side of the expression, like so:
n >= i
Arithmetic on unsigned operands yields an unsigned result, and that's why you are getting this warning. Better to change n - i >= 0 to n >= i.
Operations with unsigned operands are performed in the domain of the unsigned type. Unsigned arithmetic follows the rules of modular arithmetic. This means that the result will never be negative, even if you are subtracting something from something. For example, 1u - 5u does not produce -4. It produces UINT_MAX - 3, which is a huge positive value congruent to -4 modulo UINT_MAX + 1.
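To see it concretely (my own example; the printed value assumes 32-bit unsigned int):

#include <stdio.h>

int main(void)
{
    /* wraps modulo 2^32: prints 4294967292, i.e. UINT_MAX - 3 */
    printf("%u\n", 1u - 5u);
    return 0;
}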
