C standard on negative zero (1's complement and signed magnitude) - c

All of these functions gives the expected result on my machine. Do they all work on other platforms?
More specifically, if x has the bit representation 0xffffffff on 1's complement machines or 0x80000000 on signed magnitude machines what does the standard says about the representation of (unsigned)x ?
Also, I think the (unsigned) cast in v2, v2a, v3, v4 is redundant. Is this correct?
Assume sizeof(int) = 4 and CHAR_BIT = 8
int logicalrightshift_v1 (int x, int n) {
return (unsigned)x >> n;
}
int logicalrightshift_v2 (int x, int n) {
int msb = 0x4000000 << 1;
return ((x & 0x7fffffff) >> n) | (x & msb ? (unsigned)0x80000000 >> n : 0);
}
int logicalrightshift_v2a (int x, int n) {
return ((x & 0x7fffffff) >> n) | (x & (unsigned)0x80000000 ? (unsigned)0x80000000 >> n : 0);
}
int logicalrightshift_v3 (int x, int n) {
return ((x & 0x7fffffff) >> n) | (x < 0 ? (unsigned)0x80000000 >> n : 0);
}
int logicalrightshift_v4 (int x, int n) {
return ((x & 0x7fffffff) >> n) | (((unsigned)x & 0x80000000) >> n);
}
int logicalrightshift_v5 (int x, int n) {
unsigned y;
*(int *)&y = x;
y >>= n;
*(unsigned *)&x = y;
return x;
}
int logicalrightshift_v6 (int x, int n) {
unsigned y;
memcpy (&y, &x, sizeof (x));
y >>= n;
memcpy (&x, &y, sizeof (x));
return x;
}

If x has the bit representation 0xffffffff on 1's
complement machines or 0x80000000 on signed magnitude machines what
does the standard says about the representation of (unsigned)x ?
The conversion to unsigned is specified in terms of values, not representations. If you convert -1 to unsigned, you always get UINT_MAX (so if your unsigned is 32 bits, you always get 4294967295). This happens regardless of the representation of signed numbers that your implementation uses.
Likewise, if you convert -0 to unsigned then you always get 0. -0 is numerically equal to 0.
Note that a ones complement or sign-magnitude implementation is not required to support negative zeroes; if it does not, then accessing such a representation causes the program to have undefined behaviour.
Going through your functions one-by-one:
int logicalrightshift_v1(int x, int n)
{
return (unsigned)x >> n;
}
The result of this function for negative values of x will depend on UINT_MAX, and will further be implementation-defined if (unsigned)x >> n is not within the range of int. For example, logicalrightshift_v1(-1, 1) will return the value UINT_MAX / 2 regardless of what representation the machine uses for signed numbers.
int logicalrightshift_v2(int x, int n)
{
int msb = 0x4000000 << 1;
return ((x & 0x7fffffff) >> n) | (x & msb ? (unsigned)0x80000000 >> n : 0);
}
Almost everything about this is could be implementation-defined. Assuming that you are attempting to create a value in msb with 1 in the sign bit and zeroes in the value bits, you cannot do this portably by use of shifts - you can use ~INT_MAX, but this is allowed to have undefined behaviour on a sign-magnitude machine that does not allow negative zeroes, and is allowed to give an implementation-defined result on two's complement machines.
The types of 0x7fffffff and 0x80000000 will depend on the ranges of the various types, which will affect how other values in this expression are promoted.
int logicalrightshift_v2a(int x, int n)
{
return ((x & 0x7fffffff) >> n) | (x & (unsigned)0x80000000 ? (unsigned)0x80000000 >> n : 0);
}
If you create an unsigned value that is not in the range of int (for example, given a 32bit int, values > 0x7fffffff) then the implicit conversion in the return statement produces an implementation-defined value. The same applies to v3 and v4.
int logicalrightshift_v5(int x, int n)
{
unsigned y;
*(int *)&y = x;
y >>= n;
*(unsigned *)&x = y;
return x;
}
This is still implementation defined, because it is unspecified whether the sign bit in the representation of int corresponds to a value bit or a padding bit in the representation of unsigned. If it corresponds to a padding bit it could be a trap representation, in which case the behaviour is undefined.
int logicalrightshift_v6(int x, int n)
{
unsigned y;
memcpy (&y, &x, sizeof (x));
y >>= n;
memcpy (&x, &y, sizeof (x));
return x;
}
The same comments applying to v5 apply to this.
Also, I think the (unsigned) cast in v2, v2a, v3, v4 is redundant. Is
this correct?
It depends. As a hex constant, 0x80000000 will have type int if that value is within the range of int; otherwise unsigned if that value is within the range of unsigned; otherwise long if that value is within the range of long; otherwise unsigned long (because that value is within the minimum allowed range of unsigned long).
If you wish to ensure that it has unsigned type, then suffix the constant with a U, to 0x80000000U.
Summary:
Converting a number greater than INT_MAX to int gives an implementation-defined result (or indeed, allows an implementation-defined signal to be raised).
Converting an out-of-range number to unsigned is done by repeated addition or subtraction of UINT_MAX + 1, which means it depends on the mathematical value, not the representation.
Inspecting a negative int representation as unsigned is not portable (positive int representations are OK, though).
Generating a negative zero through use of bitwise operators and trying to use the resulting value is not portable.
If you want "logical shifts", then you should be using unsigned types everywhere. The signed types are designed for dealing with algorithms where the value is what matters, not the representation.

If you follow the standard to the word, none of these are guaranteed to be the same on all platforms.
In v5, you violate strict-aliasing, which is undefined behavior.
In v2 - v4, you have signed right-shift, which is implementation defined. (see comments for more details)
In v1, you have signed to unsigned cast, which is implementation defined when the number is out of range.
EDIT:
v6 might actually work given the following assumptions:
'int' is either 2's or 1's complement.
unsigned and int are exactly the same size (in both bytes and bits, and are densely packed).
The endian of unsigned matches that of int.
The padding and bit-layout is the same: (See caf's comment for more details.)

Related

Modulus operator returning quotient instead of remainder

I am currenly attempting to implement a basic pairing function in c. The pairing function will take in 2 unsigned integers and output a single unsigned long value. In order to unpair the result and retrieve the original values, the modulus operator must be used. But for some reason the modulues operator is returning the quotient and not the remainder like it is supposed to.
Here is the code:
unsigned long pair(unsigned int x, unsigned int y)
{
return (unsigned long)UINT_MAX * y + x;
}
unsigned int depair_x(unsigned long z)
{
return (unsigned int)(z % UINT_MAX);
}
unsigned int depair_y(unsigned long z)
{
return (unsigned int)(z / UINT_MAX);
}
A good example is that when I input the values 1058242433 and 1063370847 I get the result 4567143011379691298. However the result of the modulus operator is roughly 1063370847, which is incorrect (feel free to check it). However that result is the quotient. What is happening here?
Looks like "off-by-1".
In order to achieve "to unpair the result and retrieve the original values" a scale of UINT_MAX + 1UL is needed, not UINT_MAX.
// return (unsigned long)UINT_MAX * y + x;
return (UINT_MAX + 1UL)* y + x;
//return (unsigned int)(z % UINT_MAX);
return (unsigned int)(z % (UINT_MAX + 1ul));
// Likewise for `/`
If unsigned long has same range as unsigned, look to using unsigned long long instead of unsigned long.

Bizarre right bitshift inconsistency

I've been working with bits in C (running on ubuntu). In using two different ways to right shift an integer, I got oddly different outputs:
#include <stdio.h>
int main(){
int x = 0xfffffffe;
int a = x >> 16;
int b = 0xfffffffe >> 16;
printf("%X\n%X\n", a, b);
return 0;
}
I would think the output would be the same for each: FFFF, because the right four hex places (16 bits) are being rightshifted away. Instead, the output is:
FFFFFFFF
FFFF
What explains this behaviour?
When you say:
int x = 0xfffffffe;
That sets x to -2 because the maximum value an int can hold here is 0x7FFFFFFF and it wraps around during conversion. When you bit-shift the negative number it gets weird.
If you change those values to unsigned int it all works out.
#include <stdio.h>
int main(){
unsigned int x = 0xfffffffe;
unsigned int a = x >> 16;
unsigned int b = 0xfffffffe >> 16;
printf("%X\n%X\n", a, b);
return 0;
}
The behaviour you see here has to do with shifting on signed or unsigned integers which give different results.
Shifts on unsigned integers are logical. On the contrary, shift on signed integers are arithmetic. EDIT: In C, it's implementation defined but generally the case.
Consequently,
int x = 0xfffffffe;
int a = x >> 16;
this part performs an arithmetic shift because x is signed. And because x is actually negative (-2 in two's complement), x is sign extended, so '1's are appended which results in 0xFFFFFFFF.
On the contrary,
int b = 0xfffffffe >> 16;
0xfffffffe is a litteral interpreted as an unsigned integer. Therefore a logical shift of 16 results in 0x0000FFFF as expected.

Iterate bits from left to right for any number

I am trying to implement Modular Exponentiation (square and multiply left to right) algorithm in c.
In order to iterate the bits from left to right, I can use masking which is explained in this link
In this example mask used is 0x80 which can work only for a number with max 8 bits.
In order to make it work for any number of bits, I need to assign mask dynamically but this makes it a bit complicated.
Is there any other solution by which it can be done.
Thanks in advance!
-------------EDIT-----------------------
long long base = 23;
long long exponent = 297;
long long mod = 327;
long long result = 1;
unsigned int mask;
for (mask = 0x80; mask != 0; mask >>= 1) {
result = (result * result) % mod; // Square
if (exponent & mask) {
result = (base * result) % mod; // Mul
}
}
As in this example, it will not work if I will use mask 0x80 but if I use 0x100 then it works fine.
Selecting the mask value at run time seems to be an overhead.
If you want to iterate over all bits, you first have to know how many bits there are in your type.
This is a surprisingly complicated matter:
sizeof gives you the number of bytes, but a byte can have more than 8 bits.
limits.h gives you CHAR_BIT to know the number of bits in a byte, but even if you multiply this by the sizeof your type, the result could still be wrong because unsigned types are allowed to contain padding bits that are not part of the number representation, while sizeof returns the storage size in bytes, which includes these padding bits.
Fortunately, this answer has an ingenious macro that can calculate the number of actual value bits based on the maximum value of the respective type:
#define IMAX_BITS(m) ((m) /((m)%0x3fffffffL+1) /0x3fffffffL %0x3fffffffL *30 \
+ (m)%0x3fffffffL /((m)%31+1)/31%31*5 + 4-12/((m)%31+3))
The maximum value of an unsigned type is surprisingly easy to get: just cast -1 to your unsigned type.
So, all in all, your code could look like this, including the macro above:
#define UNSIGNED_BITS IMAX_BITS((unsigned)-1)
// [...]
unsigned int mask;
for (mask = 1 << (UNSIGNED_BITS-1); mask != 0; mask >>= 1) {
// [...]
}
Note that applying this complicated macro has no runtime drawback at all, it's a compile-time constant.
Your algorithm seems unnecessarily complicated: bits from the exponent can be tested from the least significant to the most significant in a way that does not depend on the integer type nor its maximum value. Here is a simple implementation that does not need any special case for any size integers:
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv) {
unsigned long long base = (argc > 1) ? strtoull(argv[1], NULL, 0) : 23;
unsigned long long exponent = (argc > 2) ? strtoull(argv[2], NULL, 0) : 297;
unsigned long long mod = (argc > 3) ? strtoull(argv[3], NULL, 0) : 327;
unsigned long long y = exponent;
unsigned long long x = base;
unsigned long long result = 1;
for (;;) {
if (y & 1) {
result = result * x % mod;
}
if ((y >>= 1) == 0)
break;
x = x * x % mod;
}
printf("expmod(%llu, %llu, %llu) = %llu\n", base, exponent, mod, result);
return 0;
}
Without any command line arguments, it produces: expmod(23, 297, 327) = 185. You can try other numbers by passing the base, exponent and modulo as command line arguments.
EDIT:
If you must scan the bits in exponent from most significant to least significant, mask should be defined as the same type as exponent and initialized this way if the type is unsigned:
unsigned long long exponent = 297;
unsigned long long mask = 0;
mask = ~mask - (~mask >> 1);
If the type is signed, for complete portability, you must use the definition for its maximum value from <limits.h>. Note however that it would be more efficient to use the unsigned type.
long long exponent = 297;
long long mask = LLONG_MAX - (LLONG_MAX >> 1);
The loop will waste time running through all the most significant 0 bits, so a simpler loop could be used first to skip these bits:
while (mask > exponent) {
mask >>= 1;
}

How can i determine if I can compute x+y without overflow in C? [duplicate]

This question already has answers here:
Detecting signed overflow in C/C++
(13 answers)
Closed 7 years ago.
I can only use the operations ! ~ & ^ ! + << >>, and I'm having trouble grasping overflow, could use any tips or help!
It depends on whether the numbers are signed or unsigned.
If both operands are unsigned, overflow will wrap back around to 0.
If one or both operands are signed, the behavior is implementation defined, however most implementations represent signed integers in 2's complement, so in those cases positive overflow will wrap around to the negative side, and negative overflow will wrap around to the positive side.
In the case of unsigned overflow, the result will be less than at least one operand, so you can test for it this way:
if ((x + y < x) || (x + y < y) {
printf("overflow\n");
}
In the signed case, you first need to check whether both are positive (and check for negative wraparound) or both are negative (and check for positive wraparound):
if ((x > 0) && (y > 0) && ((x + y < x) || (x + y < y))) {
printf("negative overflow\n");
}
if ((x < 0) && (y < 0) && ((x + y > x) || (x + y > y))) {
printf("positive overflow\n");
}
As I mentioned before, the signed case is implementation defined, and the above will only work if signed integers are represented as 2's complement. In practice however, this will typically be the case.
This should give you an idea of how overflow works, although it doesn't use only the specific operators you mentioned. With this, you should be able to figure out how to use those other operators to achieve what the expressions above do.
With signed integer math, unless you have access to the limits like INT_MAX INT_MIN, there is no answer that gets around undefined behavior.
#include <limits.h>
int is_overflow_add_signed(int a, int b) {
// This uses -, so does not meet OP's goal.
// Available as a guide
return (a < 0) ? (b < INT_MIN - a) : (b > INT_MAX - a);
}
With unsigned math, simply see if the result "wrapped" around.
int is_overflow_add_unsigned(unsigned a, unsigned b) {
return (a + b) < a;
}
As pointed by many peoples, it is not right for signed...
So I changed it for unsigned first.
You need to calculate part by part.
Since you didn't tell us the data type, I assumed it is 4 byte unsigned data.
unsigned long x, unsigned long y;
// x = ...
// y = ...
unsigned long first_byte_x = (x & 0xFF000000) >> 24;
unsigned long first_byte_y = (y & 0xFF000000) >> 24;
unsigned long other_bytes_x = x & 0x00FFFFFF;
unsigned long other_bytes_y = y & 0x00FFFFFF;
unsigned long other_bytes_sum = other_bytes_x + other_bytes_y;
unsigned long carry = (other_bytes_sum & 0xFF000000) >> 24;
unsigned long first_byte_sum = first_byte_x + first_byte_y + carry;
if (first_byte_sum > 0xFF)
// overflow
else
// not overflow
If you can use mod(%), then it will be more simple.
*It looks like a homework so I hoped you considered enough before your asking...

function to convert float to int (huge integers)

This is a university question. Just to make sure :-) We need to implement (float)x
I have the following code which must convert integer x to its floating point binary representation stored in an unsigned integer.
unsigned float_i2f(int x) {
if (!x) return x;
/* get sign of x */
int sign = (x>>31) & 0x1;
/* absolute value of x */
int a = sign ? ~x + 1 : x;
/* calculate exponent */
int e = 0;
int t = a;
while(t != 1) {
/* divide by two until t is 0*/
t >>= 1;
e++;
};
/* calculate mantissa */
int m = a << (32 - e);
/* logical right shift */
m = (m >> 9) & ~(((0x1 << 31) >> 9 << 1));
/* add bias for 32bit float */
e += 127;
int res = sign << 31;
res |= (e << 23);
res |= m;
/* lots of printf */
return res;
}
One problem I encounter now is that when my integers are too big then my code fails. I have this control procedure implemented:
float f = (float)x;
unsigned int r;
memcpy(&r, &f, sizeof(unsigned int));
This of course always produces the correct output.
Now when I do some test runs, this are my outputs (GOAL is what It needs to be, result is what I got)
:!make && ./btest -f float_i2f -1 0x80004999
make: Nothing to be done for `all'.
Score Rating Errors Function
x: [-2147464807] 10000000000000000100100110011001
sign: 1
expone: 01001110100000000000000000000000
mantis: 00000000011111111111111101101100
result: 11001110111111111111111101101100
GOAL: 11001110111111111111111101101101
So in this case, a 1 is added as the LSB.
Next case:
:!make && ./btest -f float_i2f -1 0x80000001
make: Nothing to be done for `all'.
Score Rating Errors Function
x: [-2147483647] 10000000000000000000000000000001
sign: 1
expone: 01001110100000000000000000000000
mantis: 00000000011111111111111111111111
result: 11001110111111111111111111111111
GOAL: 11001111000000000000000000000000
Here 1 is added to the exponent while the mantissa is the complement of it.
I tried hours to look ip up on the internet plus in my books etc but I can't find any references to this problem. I guess It has something to do with the fact that the mantissa is only 23 bits. But how do I have to handle it then?
EDIT: THIS PART IS OBSOLETE THANKS TO THE COMMENTS BELOW. int l must be unsigned l.
int x = 2147483647;
float f = (float)x;
int l = f;
printf("l: %d\n", l);
then l becomes -2147483648.
How can this happen? So C is doing the casting wrong?
Hope someone can help me here!
Thx
Markus
EDIT 2:
My updated code is now this:
unsigned float_i2f(int x) {
if (x == 0) return 0;
/* get sign of x */
int sign = (x>>31) & 0x1;
/* absolute value of x */
int a = sign ? ~x + 1 : x;
/* calculate exponent */
int e = 158;
int t = a;
while (!(t >> 31) & 0x1) {
t <<= 1;
e--;
};
/* calculate mantissa */
int m = (t >> 8) & ~(((0x1 << 31) >> 8 << 1));
m &= 0x7fffff;
int res = sign << 31;
res |= (e << 23);
res |= m;
return res;
}
I also figured out that the code works for all integers in the range -2^24, 2^24. Everything above/below sometimes works but mostly doesn't.
Something is missing, but I really have no idea what. Can anyone help me?
The answer printed is absolutely correct as it's totally dependent on the underlying representation of numbers being cast. However, If we understand the binary representation of the number, you won't get surprised with this result.
To understand an implicit conversion is associated with the assignment operator (ref C99 Standard 6.5.16). The C99 Standard goes on to say:
6.3.1.4 Real floating and integer
When a finite value of real floating type is converted to an integer type other than _Bool, the fractional part is discarded (i.e., the value is truncated toward zero). If the value of the integral part cannot be represented by the integer type, the behavior is undefined.
Your earlier example illustrates undefined behavior due to assigning a value outside the range of the destination type. Trying to assign a negative value to an unsigned type, not from converting floating point to integer.
The asserts in the following snippet ought to prevent any undefined behavior from occurring.
#include <limits.h>
#include <math.h>
unsigned int convertFloatingPoint(double v) {
double d;
assert(isfinite(v));
d = trunc(v);
assert((d>=0.0) && (d<=(double)UINT_MAX));
return (unsigned int)d;
}
Another way for doing the same thing, Create a union containing a 32-bit integer and a float. The int and float are now just different ways of looking at the same bit of memory;
union {
int myInt;
float myFloat;
} my_union;
my_union.myInt = 0x BFFFF2E5;
printf("float is %f\n", my_union.myFloat);
float is -1.999600
You are telling the compiler to take the number you have (large integer) and make it into a float, not to interpret the number AS float. To do that, you need to tell the compiler to read the number from that address in a different form, so this:
myFloat = *(float *)&myInt ;
That means, if we take it apart, starting from the right:
&myInt - the location in memory that holds your integer.
(float *) - really, I want the compiler use this as a pointer to float, not whatever the compiler thinks it may be.
* - read from the address of whatever is to the right.
myFloat = - set this variable to whatever is to the right.
So, you are telling the compiler: In the location of (myInt), there is a floating point number, now put that float into myFloat.

Resources