Finding the output of 2**32 % x in arc4random.c

I saw some code (in arc4random.c of libbsd) calculating 2**32 % x. A cleaned up version is below:
uint32_t x;
...
if (x >= 2) {
/* Calculate (2**32 % x) avoiding 64-bit math */
if (x > 0x80000000)
mod_res = 1 + ~x; /* 2**32 - x */
else {
/* (2**32 - (x * 2)) % x == 2**32 % x when x <= 2**31 */
mod_res = ((0xffffffff - (x * 2)) + 1) % x;
}
}
While the reasoning makes sense, my question is whether there is some obscure reason not to use the simpler:
uint32_t x;
...
if (x >= 2) {
/* Calculate (2**32 % x) avoiding 64-bit math */
mod_res = -x % x;
}

Your code won't work on a machine where int is larger than 32 bits. In this case, in the expression -x, the operand would be promoted to int type, and thus become signed. This would cause the result of the expression -x % x to always be zero.
This behavior is due to C's integer promotion rules, which state that if an int can represent all values of an operand, then that operand will be promoted to an int. While this always preserves value, it may change the signedness of the type.
On a compiler with 32-bit ints it would work correctly, because unsigned int would not be promoted to int, and so -x would be equal to 2**32 - x.
Your version can be fixed by casting the promoted value back to unsigned:
mod_res = ((uint32_t) -x) % x;
Here is an example demonstrating this with a 16-bit type on a machine with 32-bit ints.
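The originally linked example is not preserved here. The following is only a minimal sketch of what such a demonstration could look like, assuming int is wider than 16 bits; the function names mod16_naive, mod16_cast and mod16_reference are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Naive version: x is promoted to (signed) int, so -x is a negative int
 * and -x % x is always 0 when int is wider than 16 bits. */
uint16_t mod16_naive(uint16_t x)
{
    return (uint16_t)(-x % x);
}

/* Fixed version: the cast wraps -x back to 2**16 - x before the modulo. */
uint16_t mod16_cast(uint16_t x)
{
    return (uint16_t)((uint16_t)-x % x);
}

/* Reference: 2**16 % x computed directly in a wider type. */
uint16_t mod16_reference(uint16_t x)
{
    return (uint16_t)(65536UL % x);
}

/* Compare the three versions for a few divisors. */
void check_mod16(void)
{
    assert(mod16_naive(3) == 0);      /* promotion makes the result 0 */
    assert(mod16_cast(3) == mod16_reference(3));
    assert(mod16_cast(1000) == mod16_reference(1000));
}
```

The same reasoning carries over to uint32_t on a platform where int is wider than 32 bits.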

Related

How to "blend" two values without overflow?

Consider the following function:
// Return a blended value of x and y:
// blend(100, 200, 1, 1) -> 150
// blend(100, 200, 2, 1) -> 133
uint8_t blend(uint8_t x, uint8_t y, uint8_t parts_x, uint8_t parts_y) {
uint32_t big_parts_x = parts_x;
uint32_t big_parts_y = parts_y;
return (uint8_t) ((big_parts_x * x + big_parts_y * y) /
(big_parts_x + big_parts_y));
}
Is there a way to get close to appropriate return values without requiring any allocations greater than uint8_t? You could break it up (less rounding) into an addition of two uint16_t easily by performing two divisions. Can you do it with only uint8_t?

A standards compliant C implementation is guaranteed to perform arithmetic operations with at least 16 bits.
Section 6.3.1.1p2 of the C standard states:
The following may be used in an expression wherever an int or unsigned
int may be used:
An object or expression with an integer type (other than int or unsigned int ) whose integer conversion rank is less than
or equal to the rank of int and unsigned int .
A bit-field of type
_Bool , int , signed int ,or unsigned int .
If an int can represent all values of the original type (as
restricted by the width, for a bit-field), the value is
converted to an int ; otherwise, it is converted to an unsigned
int . These are called the integer promotions. All other
types are unchanged by the integer promotions.
Section E.1 also states that an int must be able to support values at least in the range -32767 to 32767, and an unsigned int must support values in at least the range 0 to 65535.
Since a uint8_t has lower rank than an int, the former will always be promoted to the latter when it is the subject of most operators, including +, -, * and /.
Given that, you can safely compute the value with the following slight modification:
uint8_t blend(uint8_t x, uint8_t y, uint8_t parts_x, uint8_t parts_y) {
return ((1u*parts_x*x) / (parts_x + parts_y)) + ((1u*parts_y*y) / (parts_x + parts_y));
}
The expressions parts_x*x and parts_y*y will have a maximum value of 65025. This is too big for a 16 bit int but not a 16 bit unsigned int, so each is multiplied by 1u to force the values to be converted to unsigned int as per the usual arithmetic conversions specified in section 6.3.1.8:
the integer promotions are performed on both operands. Then the
following rules are applied to the promoted operands:
If both operands have the same type, then no further conversion is needed.
Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser
integer conversion rank is converted to the type of the operand
with greater rank.
Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the
other operand, then the operand with signed integer type is
converted to the type of the operand with unsigned integer
type.
Note also that we divide each part by the sum total separately. If we added both parts first before dividing, the numerator could exceed 65535. Doing the division first brings each subexpression back down into the range of a uint8_t, and the sum of the two quotients is again in the range of a uint8_t.
So the above expression is guaranteed not to overflow on a compiler that is compliant with the C standard. Because each quotient is truncated separately, the result can be 1 lower than with a single division: blend(100, 200, 2, 1) gives 66 + 66 = 132 rather than 133.
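As a rough check of the split-division approach, the sketch below compares it with the single-division version from the question. The names blend_wide, blend_split and check_blend are hypothetical, and both functions assume parts_x + parts_y is not zero.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: a single division in 32-bit arithmetic, as in the question. */
uint8_t blend_wide(uint8_t x, uint8_t y, uint8_t parts_x, uint8_t parts_y)
{
    uint32_t px = parts_x, py = parts_y;
    return (uint8_t)((px * x + py * y) / (px + py));
}

/* Split version: each product is at most 65025, so it fits in a
 * 16-bit unsigned int; the two quotients are added afterwards. */
uint8_t blend_split(uint8_t x, uint8_t y, uint8_t parts_x, uint8_t parts_y)
{
    unsigned total = 0u + parts_x + parts_y;
    return (uint8_t)((1u * parts_x * x) / total + (1u * parts_y * y) / total);
}

void check_blend(void)
{
    assert(blend_wide(100, 200, 1, 1) == 150);
    assert(blend_split(100, 200, 1, 1) == 150);
    assert(blend_wide(100, 200, 2, 1) == 133);
    assert(blend_split(100, 200, 2, 1) == 132); /* 200/3 + 200/3, each truncated */
}
```

The second pair of assertions shows the price of dividing twice: each quotient is truncated on its own, so the split version can land one below the single-division result.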

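The version below combines both terms in a single expression without any wider declared variables.
It relies on unsigned arithmetic. Where unsigned is only 16 bits, the sum of the two products (up to 130050) can wrap, so it is only exact there while that sum stays at or below 65535.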
return (uint8_t) ((1u*parts_x*x + 1u*parts_y*y) / (0u + parts_x + parts_y));

Is there a way to get close to appropriate return values without requiring any allocations greater than uint8_t?
In theory, yes:
uint8_t blend(uint8_t x, uint8_t y, uint8_t parts_x, uint8_t parts_y) {
return lookup_table[x][y][parts_x][parts_y];
}
In practice that's going to cost 4 GiB of RAM for the lookup table, so it's probably not a great idea.
Apart from that, it depends on what you mean by "close" (how large "acceptable worst case error" can be) and what range of values are valid (especially for parts_x and parts_y).
For example (if parts_x and parts_y have a range from 1 to 15 only):
uint8_t blend(uint8_t x, uint8_t y, uint8_t parts_x, uint8_t parts_y) {
uint8_t scaleX = (parts_x << 4) / (parts_x + parts_y);
uint8_t scaleY = (parts_y << 4) / (parts_x + parts_y);
return (x >> 4) * scaleX + (y >> 4) * scaleY;
}
Of course in this case "close" means:
blend(100, 200, 1, 1) = 6*8 + 12*8 = 144 (not 150)
blend(100, 200, 2, 1) = 6*10 + 12*5 = 120 (not 133)
Note that (in general) multiplication is "expanding". What I mean is that if a has M bits of range and b has N bits of range, then a*b will have M+N bits of range. In other words (using full range), to avoid overflow you need uint8_t * uint8_t = uint16_t. Division is significantly worse: some precision loss is impossible to avoid (e.g. representing 1/3 exactly needs infinitely many bits), the number of bits in the result determines how much precision is lost, and 8 bits of precision is "not much".
Also note that the simple example I've shown above can be improved for some cases by adding extra code for those cases. For example:
uint8_t blend(uint8_t x, uint8_t y, uint8_t parts_x, uint8_t parts_y) {
if(parts_x < parts_y) {
return blend(y, x, parts_y, parts_x);
}
// parts_x >= parts_y now
if(parts_x == parts_y*2) {
return 2*(x/3) + y/3;
} else if(parts_x == parts_y*3) {
return 3*(x/4) + y/4;
} else if(parts_x == parts_y*4) {
return 4*(x/5) + y/5;
} else if(parts_x == parts_y*5) {
return 5*(x/6) + y/6;
} else if( (x > 16) && (y > 16) ){
uint8_t scaleX = (parts_x << 4) / (parts_x + parts_y);
uint8_t scaleY = (parts_y << 4) / (parts_x + parts_y);
return (x * scaleX + y * scaleY) >> 4;
} else {
uint8_t scaleX = (parts_x << 4) / (parts_x + parts_y);
uint8_t scaleY = (parts_y << 4) / (parts_x + parts_y);
return (x >> 4) * scaleX + (y >> 4) * scaleY;
}
}
Of course it's significantly easier and faster to use something larger than uint8_t, so...

Greater than function in C

I know this is an age-old question and you have probably come across it as well, but there's a bug in my solution and I don't know how to solve it. I need to write a function that compares two integers. I am only allowed to use the operations (!,~,&,^,|,+,>>,<<) and no control structures (if, else, loops, etc.).
int isGreater(int x, int y) {
    //returns 1 if x > y.
    return ((y+(~x+1))>>31)&1;
}
My idea is simple: we compute y-x and shift by 31 to get the sign bit; if the difference is negative (x > y) we return 1, otherwise we return 0. This fails when x is negative and falsely returns 1 although it should return zero. I'm stuck at this and don't know how to proceed.
We assume that integer is 32bits and uses two's complement representation. This question is NOT about portability.
Some help would be much appreciated.
Thanks in advance

Hacker's Delight has a chapter Comparison Predicates, which is exactly what we need here.
One of the things it writes is:
x < y: (x - y) ^ ((x ^ y) & ((x - y) ^ x))
Which we can use almost directly, except that x and y should be swapped, the subtractions must be replaced by something legal, and the result appears in the top bit instead of the lowest bit. Fortunately a - b == ~(~a + b) so that's not too hard. First applying those transformations:
// swap x <-> y
(y - x) ^ ((y ^ x) & ((y - x) ^ y))
// rewrite subtraction
~(~y + x) ^ ((y ^ x) & (~(~y + x) ^ y))
// get answer in lsb
((~(~y + x) ^ ((y ^ x) & (~(~y + x) ^ y))) >> 31) & 1
I have a website here that says it works.
If local variables are allowed it can be simplified a bit by factoring out the subexpression
~(~y + x):
int diff = ~(~y + x);
return ((diff ^ ((y ^ x) & (diff ^ y))) >> 31) & 1;
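For experimenting with this formula without relying on signed overflow, here is a hedged sketch that does the same bit manipulation on unsigned copies of the inputs. The casts and the name is_greater_u are additions for illustration; they are not part of the original exercise, which works directly on int.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if x > y. The arithmetic is done on uint32_t so that the
 * wraparound is well defined; the operators are still only ~ ^ & + >>. */
int is_greater_u(int32_t x, int32_t y)
{
    uint32_t ux = (uint32_t)x;
    uint32_t uy = (uint32_t)y;
    uint32_t diff = ~(~uy + ux);   /* y - x modulo 2**32 */
    return (int)(((diff ^ ((uy ^ ux) & (diff ^ uy))) >> 31) & 1u);
}

void check_is_greater(void)
{
    assert(is_greater_u(1, 0) == 1);
    assert(is_greater_u(0, 1) == 0);
    assert(is_greater_u(5, 5) == 0);
    assert(is_greater_u(INT32_MIN, 1) == 0);  /* the case the naive version gets wrong */
    assert(is_greater_u(1, INT32_MIN) == 1);
}
```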

First of all let's clarify that we assume:
negative integers are represented in 2's complement
int is exactly 32 bits wide and long long is exactly 64 bits wide
right shifting a negative number is an arithmetic shift
There is a problem with the (~x+1) part in your solution which is supposed to return -x. The problem is that the absolute value of INT_MIN is greater than the absolute value of INT_MAX, thus when x is INT_MIN then (~x+1) yields INT_MIN instead of -INT_MIN as you expected.
There's also a problem with overflows in the y+(-x) part of your solution (second step).
Now if you're allowed to use other types than int, we can solve both of these problems by casting the values to long long before the conversion, assuming that it's a 64-bit type, so that (~x+1) would return the expected result -x and y+(-x) would not cause any overflows. Then, obviously, we will have to change the >>31 bit to >>63.
The end solution is as follows:
static bool isGreater(int x, int y) {
long long llx = x;
long long lly = y;
long long result = ((lly+(~llx+1))>>63)&1;
return result;
}
It's feasible to test it with some corner-cases, such as x == INT_MIN, x == 0 and x == INT_MAX:
int main(void) {
int x = INT_MIN;
for (long long y = INT_MIN; y <= INT_MAX; ++y) {
assert(isGreater(x, y) == (x > y));
}
x = INT_MAX;
for (long long y = INT_MIN; y <= INT_MAX; ++y) {
assert(isGreater(x, y) == (x > y));
}
x = 0;
for (long long y = INT_MIN; y <= INT_MAX; ++y) {
assert(isGreater(x, y) == (x > y));
}
}
This was successful on my particular machine with my particular compiler. The testing took 163 seconds.
But again, this depends on being able to use other types than int (but then again with more work you could emulate long long with int).
This whole thing could be more portable if you used int32_t and int64_t instead of int and long long, accordingly. However, it still would not be portable:
ISO/IEC 9899:2011 §6.5.7 Bitwise shift operators
5 The result of E1 >> E2 is E1 right-shifted E2 bit positions. If E1 has an unsigned type or if E1 has a signed type and a nonnegative value, the value of the result is the integral part of the quotient of E1 / 2^E2. If E1 has a signed type and a negative value, the resulting value is implementation-defined.
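One way to sidestep the implementation-defined shift is to convert the 64-bit difference to an unsigned type before shifting, since that conversion is defined in terms of values. The sketch below assumes int32_t, int64_t and uint64_t are available; the name is_greater_wide is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if x > y. The difference y - x is exact in 64 bits, and the
 * sign is read from the top bit of the value converted to uint64_t. */
int is_greater_wide(int32_t x, int32_t y)
{
    int64_t diff = (int64_t)y + (~(int64_t)x + 1);  /* y - x, cannot overflow */
    return (int)(((uint64_t)diff >> 63) & 1u);
}

void check_is_greater_wide(void)
{
    assert(is_greater_wide(INT32_MIN, INT32_MAX) == 0);
    assert(is_greater_wide(INT32_MAX, INT32_MIN) == 1);
    assert(is_greater_wide(0, 0) == 0);
}
```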

C programming type casting and fixed point

How should you implement this function in C code?
U16 newValue function(U16 value, S16 x, U16 y){
newValue = min((((value - x) * y) >> 10) >> 4, 4095)
return newValue
}
y is fixed point with 10 fractional bits
If x is greater than value the final result should be 0.
My concern is the mix of different types, and especially that overflow does not occur. Also, how can this be written in a clean way if it needs a lot of type casts?

You need to code the function for all possible values of its parameters. Take the expression (value - x). If value is equal to 2^16 - 1 and x is equal to -2^15, then the result of (value - x) is 98303, bigger than a U16 can hold. Therefore, I would cast value to S32 before this operation.
Let's take the expression (value - x) at its maximum value, 98303. The maximum value of the expression ((value - x) * y) is then 98303 * (2^16 - 1), which is equal to 6442287105, a bigger value than a 32-bit integer can hold. Therefore, you need to compute this expression in 64 bits. You can simply replace the initial S32 cast with an S64 cast, since you'll need it anyway.
Right bit shift operations only reduce the number of significant bits. Therefore, this does not require a bigger number of bits to be computed.
The min call ensures that the result cannot be bigger than 4095, which can be held in a U16; no more cast should be necessary.
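The range argument can be double-checked with a few lines of arithmetic. This is only a sketch using the exact extremes of the types (UINT16_MAX and INT16_MIN); the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Largest possible (value - x): 65535 - (-32768). */
int64_t worst_case_difference(void)
{
    return (int64_t)UINT16_MAX - INT16_MIN;
}

/* Largest possible (value - x) * y. */
int64_t worst_case_product(void)
{
    return worst_case_difference() * UINT16_MAX;
}

void check_ranges(void)
{
    assert(worst_case_difference() > UINT16_MAX);  /* does not fit in 16 bits */
    assert(worst_case_product() > UINT32_MAX);     /* does not fit in 32 bits */
    assert((worst_case_product() >> 14) > 4095);   /* so the clamp is needed */
}
```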
Final function:
uint16_t newValue(uint16_t value, int16_t x, uint16_t y){
    /* Widen first: (value - x) * y can need more than 32 bits. */
    int64_t result = (int64_t)value - x;
    if (result <= 0)
        return 0;    /* x >= value: the result is required to be 0 */
    result *= y;
    result >>= 10;
    result >>= 4;
    if (result > 4095)
        result = 4095;    /* clamp; C has no standard min() */
    return (uint16_t) result;
}

Here it is
unsigned int function(unsigned int value, signed int x, unsigned int y){
if((((value - x) * y) >> 10) >> 4<4095)
return (((value - x) * y) >> 10) >> 4;
else return 4095;
}

How can i determine if I can compute x+y without overflow in C? [duplicate]

I can only use the operations ! ~ & ^ | + << >>, and I'm having trouble grasping overflow; I could use any tips or help!

It depends on whether the numbers are signed or unsigned.
If both operands are unsigned, overflow will wrap back around to 0.
If both operands are signed, overflow is undefined behavior in C. In practice most implementations represent signed integers in 2's complement, where positive overflow typically wraps around to the negative side and negative overflow wraps around to the positive side, but the compiler is allowed to assume overflow never happens.
In the case of unsigned overflow, the result will be less than at least one operand, so you can test for it this way:
if ((x + y < x) || (x + y < y)) {
printf("overflow\n");
}
In the signed case, you first need to check whether both are positive (and check for negative wraparound) or both are negative (and check for positive wraparound):
if ((x > 0) && (y > 0) && ((x + y < x) || (x + y < y))) {
printf("negative overflow\n");
}
if ((x < 0) && (y < 0) && ((x + y > x) || (x + y > y))) {
printf("positive overflow\n");
}
As mentioned above, signed overflow is undefined behavior, so these checks only work where the implementation actually wraps in 2's complement (for example, with a wrapping option such as GCC's -fwrapv); otherwise an optimizing compiler may remove them.
This should give you an idea of how overflow works, although it doesn't use only the specific operators you mentioned. With this, you should be able to figure out how to use those other operators to achieve what the expressions above do.
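One possible translation into the restricted operator set is sketched below. It assumes 32-bit values and does the addition on unsigned copies so the wraparound is well defined; the casts are a convenience of this sketch, and the name add_ok is hypothetical. Signed overflow happens exactly when both operands have the same sign and the sum has the opposite sign.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if x + y can be computed without signed overflow, else 0. */
int add_ok(int32_t x, int32_t y)
{
    uint32_t ux = (uint32_t)x;
    uint32_t uy = (uint32_t)y;
    uint32_t sum = ux + uy;                    /* wraps modulo 2**32 */
    /* Top bit set when x and y agree in sign but the sum disagrees with x. */
    uint32_t overflow = (~(ux ^ uy) & (ux ^ sum)) >> 31;
    return (int)(overflow ^ 1u);
}

void check_add_ok(void)
{
    assert(add_ok(1, 2) == 1);
    assert(add_ok(INT32_MAX, 1) == 0);
    assert(add_ok(INT32_MIN, -1) == 0);
    assert(add_ok(INT32_MAX, INT32_MIN) == 1);
}
```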

With signed integer math, unless you have access to the limits like INT_MAX INT_MIN, there is no answer that gets around undefined behavior.
#include <limits.h>
int is_overflow_add_signed(int a, int b) {
// This uses -, so does not meet OP's goal.
// Available as a guide
return (a < 0) ? (b < INT_MIN - a) : (b > INT_MAX - a);
}
With unsigned math, simply see if the result "wrapped" around.
int is_overflow_add_unsigned(unsigned a, unsigned b) {
return (a + b) < a;
}

As many people pointed out, this is not right for signed values, so I changed it to handle unsigned values first.
You need to calculate the sum part by part.
Since you didn't tell us the data type, I assumed it is 4-byte unsigned data.
unsigned long x, y;
// x = ...
// y = ...
unsigned long first_byte_x = (x & 0xFF000000) >> 24;
unsigned long first_byte_y = (y & 0xFF000000) >> 24;
unsigned long other_bytes_x = x & 0x00FFFFFF;
unsigned long other_bytes_y = y & 0x00FFFFFF;
unsigned long other_bytes_sum = other_bytes_x + other_bytes_y;
unsigned long carry = (other_bytes_sum & 0xFF000000) >> 24;
unsigned long first_byte_sum = first_byte_x + first_byte_y + carry;
if (first_byte_sum > 0xFF) {
    // overflow
} else {
    // no overflow
}
If you can use the modulo operator (%), it becomes simpler.
*This looks like homework, so I hope you thought about it enough before asking...
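The byte-splitting idea can also be written as a function that avoids comparisons altogether, which is closer to the operator restriction in the question. This is a sketch assuming 32-bit unsigned data; add_overflows_u32 is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if x + y does not fit in 32 bits, else 0. */
int add_overflows_u32(uint32_t x, uint32_t y)
{
    uint32_t hi_x = x >> 24;                                  /* top byte of x */
    uint32_t hi_y = y >> 24;                                  /* top byte of y */
    uint32_t lo_sum = (x & 0x00FFFFFFu) + (y & 0x00FFFFFFu);  /* at most 25 bits */
    uint32_t carry = lo_sum >> 24;                            /* carry into the top byte */
    uint32_t hi_sum = hi_x + hi_y + carry;                    /* at most 511 */
    return (int)((hi_sum >> 8) & 1u);                         /* bit 8 is the carry out */
}

void check_add_overflows(void)
{
    assert(add_overflows_u32(1u, 2u) == 0);
    assert(add_overflows_u32(0xFFFFFFFFu, 1u) == 1);
    assert(add_overflows_u32(0x7FFFFFFFu, 0x7FFFFFFFu) == 0);
}
```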

C standard on negative zero (1's complement and signed magnitude)

All of these functions give the expected result on my machine. Do they all work on other platforms?
More specifically, if x has the bit representation 0xffffffff on 1's complement machines or 0x80000000 on signed magnitude machines, what does the standard say about the representation of (unsigned)x ?
Also, I think the (unsigned) cast in v2, v2a, v3, v4 is redundant. Is this correct?
Assume sizeof(int) = 4 and CHAR_BIT = 8
int logicalrightshift_v1 (int x, int n) {
return (unsigned)x >> n;
}
int logicalrightshift_v2 (int x, int n) {
int msb = 0x4000000 << 1;
return ((x & 0x7fffffff) >> n) | (x & msb ? (unsigned)0x80000000 >> n : 0);
}
int logicalrightshift_v2a (int x, int n) {
return ((x & 0x7fffffff) >> n) | (x & (unsigned)0x80000000 ? (unsigned)0x80000000 >> n : 0);
}
int logicalrightshift_v3 (int x, int n) {
return ((x & 0x7fffffff) >> n) | (x < 0 ? (unsigned)0x80000000 >> n : 0);
}
int logicalrightshift_v4 (int x, int n) {
return ((x & 0x7fffffff) >> n) | (((unsigned)x & 0x80000000) >> n);
}
int logicalrightshift_v5 (int x, int n) {
unsigned y;
*(int *)&y = x;
y >>= n;
*(unsigned *)&x = y;
return x;
}
int logicalrightshift_v6 (int x, int n) {
unsigned y;
memcpy (&y, &x, sizeof (x));
y >>= n;
memcpy (&x, &y, sizeof (x));
return x;
}

If x has the bit representation 0xffffffff on 1's
complement machines or 0x80000000 on signed magnitude machines, what
does the standard say about the representation of (unsigned)x ?
The conversion to unsigned is specified in terms of values, not representations. If you convert -1 to unsigned, you always get UINT_MAX (so if your unsigned is 32 bits, you always get 4294967295). This happens regardless of the representation of signed numbers that your implementation uses.
Likewise, if you convert -0 to unsigned then you always get 0. -0 is numerically equal to 0.
Note that a ones complement or sign-magnitude implementation is not required to support negative zeroes; if it does not, then accessing such a representation causes the program to have undefined behaviour.
Going through your functions one-by-one:
int logicalrightshift_v1(int x, int n)
{
return (unsigned)x >> n;
}
The result of this function for negative values of x will depend on UINT_MAX, and will further be implementation-defined if (unsigned)x >> n is not within the range of int. For example, logicalrightshift_v1(-1, 1) will return the value UINT_MAX / 2 regardless of what representation the machine uses for signed numbers.
int logicalrightshift_v2(int x, int n)
{
int msb = 0x4000000 << 1;
return ((x & 0x7fffffff) >> n) | (x & msb ? (unsigned)0x80000000 >> n : 0);
}
Almost everything about this could be implementation-defined. Assuming that you are attempting to create a value in msb with 1 in the sign bit and zeroes in the value bits, you cannot do this portably by use of shifts - you can use ~INT_MAX, but this is allowed to have undefined behaviour on a sign-magnitude machine that does not allow negative zeroes, and is allowed to give an implementation-defined result on two's complement machines.
The types of 0x7fffffff and 0x80000000 will depend on the ranges of the various types, which will affect how other values in this expression are promoted.
int logicalrightshift_v2a(int x, int n)
{
return ((x & 0x7fffffff) >> n) | (x & (unsigned)0x80000000 ? (unsigned)0x80000000 >> n : 0);
}
If you create an unsigned value that is not in the range of int (for example, given a 32bit int, values > 0x7fffffff) then the implicit conversion in the return statement produces an implementation-defined value. The same applies to v3 and v4.
int logicalrightshift_v5(int x, int n)
{
unsigned y;
*(int *)&y = x;
y >>= n;
*(unsigned *)&x = y;
return x;
}
This is still implementation defined, because it is unspecified whether the sign bit in the representation of int corresponds to a value bit or a padding bit in the representation of unsigned. If it corresponds to a padding bit it could be a trap representation, in which case the behaviour is undefined.
int logicalrightshift_v6(int x, int n)
{
unsigned y;
memcpy (&y, &x, sizeof (x));
y >>= n;
memcpy (&x, &y, sizeof (x));
return x;
}
The same comments applying to v5 apply to this.
Also, I think the (unsigned) cast in v2, v2a, v3, v4 is redundant. Is
this correct?
It depends. As a hex constant, 0x80000000 will have type int if that value is within the range of int; otherwise unsigned if that value is within the range of unsigned; otherwise long if that value is within the range of long; otherwise unsigned long (because that value is within the minimum allowed range of unsigned long).
If you wish to ensure that it has unsigned type, then suffix the constant with a U, to 0x80000000U.
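On a C11 compiler, the type chosen for a constant can be inspected with _Generic. The sketch below assumes a platform with a 32-bit int; the macro and function names are made up for illustration.

```c
#include <assert.h>

/* Evaluate to 1 when the expression has exactly the named type. */
#define IS_INT(e)          _Generic((e), int: 1, default: 0)
#define IS_UNSIGNED_INT(e) _Generic((e), unsigned int: 1, default: 0)

/* Returns 1 if the constants get the types described above
 * (assuming int is 32 bits wide). */
int hex_constant_types_as_described(void)
{
    return IS_INT(0x7fffffff)            /* fits in int */
        && IS_UNSIGNED_INT(0x80000000)   /* too big for int, fits in unsigned */
        && IS_UNSIGNED_INT(0x80000000U); /* suffix forces an unsigned type */
}
```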
Summary:
Converting a number greater than INT_MAX to int gives an implementation-defined result (or indeed, allows an implementation-defined signal to be raised).
Converting an out-of-range number to unsigned is done by repeated addition or subtraction of UINT_MAX + 1, which means it depends on the mathematical value, not the representation.
Inspecting a negative int representation as unsigned is not portable (positive int representations are OK, though).
Generating a negative zero through use of bitwise operators and trying to use the resulting value is not portable.
If you want "logical shifts", then you should be using unsigned types everywhere. The signed types are designed for dealing with algorithms where the value is what matters, not the representation.
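A minimal sketch of the "unsigned everywhere" approach, assuming uint32_t is available and the shift count is less than 32. The name logical_shift_right is hypothetical. The first assertion also illustrates that conversion to unsigned is defined by value, not by representation.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Logical right shift on an unsigned type; n must be less than 32. */
uint32_t logical_shift_right(uint32_t x, unsigned n)
{
    return x >> n;
}

void check_logical_shift(void)
{
    assert((unsigned)-1 == UINT_MAX);  /* value conversion, any representation */
    assert(logical_shift_right(0x80000000u, 31) == 1u);
    assert(logical_shift_right(0xFFFFFFFFu, 4) == 0x0FFFFFFFu);
    assert(logical_shift_right((uint32_t)-8, 1) == 0x7FFFFFFCu);
}
```

Keeping the parameter and return type unsigned avoids the implementation-defined conversion back to int that the versions in the question run into.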

If you follow the standard to the word, none of these are guaranteed to be the same on all platforms.
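In v5, you reinterpret an int as unsigned through pointer casts. Accessing an object through its corresponding signed or unsigned type is permitted by the aliasing rules, but the result depends on how the two representations line up.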
In v2 - v4, you have signed right-shift, which is implementation defined. (see comments for more details)
In v1, the unsigned result is converted back to int on return, which is implementation-defined when the value is out of range of int.
EDIT:
v6 might actually work given the following assumptions:
'int' is either 2's or 1's complement.
unsigned and int are exactly the same size (in both bytes and bits, and are densely packed).
The endian of unsigned matches that of int.
The padding and bit-layout is the same: (See caf's comment for more details.)
