a = -2147483648 - a; compiler optimization - c

I'm trying to learn how to reverse engineer software and all the tricks for understanding what the code looked like before compiler optimizations.
I found something like this several times:
if (a < 0)
a = -2147483648 - a;
I originally thought it was an abs(): a underflows so you get the positive value. But since a is negative (see the if), this is equivalent to:
if (a < 0)
a = -2147483648 + abs(a);
Which will be a very small negative number, and not the absolute value of a at all. What am I missing?

It is converting the number so that bit 31 becomes a sign bit and the remaining bits (0...30) denote the absolute magnitude. E.g. if a = -5, then after the operation it becomes 0x80000005.

It appears to be converting from 2's complement to sign-magnitude

Maybe: http://en.wikipedia.org/wiki/Two%27s_complement ?

I sincerely hope that the original source said 0x80000000 and not -2147483648 ! The hex number at least gives the reader a clue. The decimal is very cryptic.

Related

Attempting to convert a value into 2s complement in C

I am writing an emulator in C and I want to make the constant constantVariable, which is 65530 (0xFFFA), be the two's complement value of 5; however, I cannot seem to get it quite right. Below is the example of the if statement where I would like this to be done.
if(opCodeType == 4)
{
if(registers[rsVariable] == registers[rtVariable])
{
int twosVariable = ~(constantVariable) + 1;
printf("%d", twosVariable);
pc = pc + (twosVariable*4);
}
}
I can't seem to understand why this does not work.
Indeed, 2's complement is such that the complement of your number n (which algebraically is -n) is derived by inverting the bit pattern of n and then adding 1. Note that in a 2's complement scheme, -1 has all its bits set to 1.
The problem with inverting the bit pattern using ~ is that it can cause unwanted type promotion, which ruins the result.
One solution is to mask the result of ~; another is to cast the result. Of course, on a 2's complement platform, you can simply write -n, taking care to ensure that n is not already the most negative representable value.

Getting the negative integer from a two's complement value Embedded C

I know that many have asked similar questions here about converting to/from two's complement format, and I tried many of the answers, but nothing seems to help in my case.
Well, I'm working on an embedded project that involves writing/reading registers of a slave device over SPI. The register concerned here is a 22-bit position register that stores the uStep value in two's complement format and it ranges from -2^21 to +2^21 -1. The problem is when I read the register, I get a big integer that has nothing to do with the actual value.
Example:
After sending a command to the slave to move 4000 steps (forward/positive), I read the position register and I get exactly 4000. However, if I send a reverse move command, say -1, and then read the register, the value I get is something like 4292928. I believe it's the negative offset of the register as the two's complement has no zero. I have no problem sending a negative integer to the device to move x number of steps, however, getting the actual negative integer from the value retrieved is something else.
I know that this involves two's complement, but the question is: how do I get the actual negative integer out of that strange value? I mean, if I moved the device -4000 steps, what do I have to do to get the exact value for the negative steps moved so far from my register?
You need to sign-extend bit 21 through the bits to the left.
For negative values when bit 21 is set, you can do this by ORring the value with 0xFFC00000.
For positive values, when bit 21 is clear, you can ensure the upper bits are zero by ANDing the value with 0x003FFFFF.
The solutions by Clifford and Weather Vane assume the target machine is two's-complement. This is very likely true, but a solution that removes this dependency is:
static const int32_t sign_bit = 0x00200000;
int32_t pos_count = (getPosRegisterValue() ^ sign_bit) - sign_bit;
It has the additional advantage of being branch-free.
The simplest method perhaps is simply to shift the position value left by 10 bits and assign it to an int32_t. You will then have a 32-bit value and the position will be scaled up by 2^10 (1024), with 32-bit resolution but 10-bit granularity, which normally shouldn't matter since the position units are entirely arbitrary in any case, and can be converted to real-world units if necessary, taking the scaling into account:
int32_t pos_count = (int32_t)(getPosRegisterValue() << 10) ;
Where getPosRegisterValue() returns a uint32_t.
If you do however want to retain 22 bit resolution then it is simply a case of dividing the value by 1024:
int32_t pos_count = (int32_t)(getPosRegisterValue() << 10) / 1024 ;
Both solutions rely on the implementation-defined behaviour of converting a uint32_t value not representable in an int32_t; but on a two's complement machine any plausible implementation will not modify the bit pattern, and the result will be as required.
Another perhaps less elegant solution also retaining 22 bit resolution and single bit granularity is:
int32_t pos_count = getPosRegisterValue() ;
// If 22 bit sign bit set...
if( (pos_count & 0x00200000) != 0)
{
// Sign-extend to 32bit
pos_count |= 0xFFC00000 ;
}
It would be wise perhaps to wrap the solution in a function to isolate any implementation-defined behaviour:
int32_t posCount()
{
return (int32_t)(getPosRegisterValue() << 10) / 1024 ;
}

Getting absolute value from a binary int using bit arithmetics

flt32 flt32_abs (flt32 x) {
int mask=x>>31;
printMask(mask,32);
puts("Original");
printMask(x,32);
x=x^mask;
puts("after XOR");
printMask(x,32);
x=x-mask;
puts("after x-mask");
printMask(x,32);
return x;
}
Here's my code, calling the function on the value -32 is returning .125. I'm confused because it's a pretty straight up formula for abs on bits, but I seem to be missing something. Any ideas?
Is flt32 a type for floating point or fixed point numbers?
I suspect it's a type for fixed point arithmetic and you are not using it correctly. Let me explain it.
A fixed-point number uses, as the name says, a fixed position for the radix point; this means it uses a fixed number of bits for the fractional part. It is, in fact, a scaled integer.
I guess the flt32 type you are using uses the most significant 24 bits for the whole part and the least significant 8 bits for the fractional part; the value as a real number of the 32-bit representation is its value as an integer, divided by 256 (i.e. 2^8).
For example, the 32-bit number 0x00000020 is interpreted as integer as 32. As fixed-point number using 8 bits for the decimal part, its value is 0.125 (=32/256).
The code you posted is correct but you are not using it correctly.
The number -32 encoded as a fixed-point number with 8 fractional bits is 0xFFFFE000, which is the integer representation of -8192 (= -32*256). The algorithm correctly produces 8192, which is 0x00002000 (= 32*256); this is also 32 when interpreted as fixed-point.
If you pass -32 to the function without taking care to encode it as fixed-point, it correctly converts it to 32 and returns this value. But 32 (0x00000020) is 0.125 (=1/8=32/256) when it is interpreted as fixed-point (what I assume the function printMask() does).
How can you test the code correctly?
You probably have a function that creates fixed-point numbers from integers. Use it to get the correct representation of -32 and pass that value to the flt32_abs() function.
In case you don't have such a function, it is easy to write it. Just multiply the integer with 256 (or even better, left-shift it 8 bits) and that's all:
fx32 int_to_fx32(int x)
{
return x << 8;
}
The fixed-point libraries usually use macros for such conversions because they produce faster code. Expressed as macro, it looks like this:
#define int_to_fx32(x) ((x) << 8)
Now you do the test:
fx32 negative = int_to_fx32(-32);
fx32 positive = fx32_abs(negative);
// This should print 32
printMask(positive, 32);
// This should print 8192
printf("%d", positive);
// This should print -8192
printf("%d", negative);
// This should print 0.125
printMask(32, 32);
int flt32_abs (int x) {
^^^ ^^^
int mask=x>>31;
x=x^mask;
x=x-mask;
return x;
}
I've been able to fix this and obtain the result of 32 by changing float to int, else the code wouldn't build with the error:
error: invalid operands of types 'float' and 'int' to binary 'operator>>'
For an explanation of why binary operations on floats are not allowed in C++, see
How to perform a bitwise operation on floating point numbers
I would like to ask more experienced developers, why did the code even build for OP? Relaxed compiler settings, I guess?

Is bitwise & equivalent to modulo operation

I came across the following snippet which, as I understand it, converts an integer into its binary representation.
Can anyone tell me why an &1 is used instead of %2 ? Many thanks.
for (i = 0; i <= nBits; ++i) {
bits[i] = ((unsigned long) x & 1) ? 1 : 0;
x = x / 2;
}
The representation of unsigned integers is specified by the Standard: an unsigned integer with n value bits represents numbers in the range [0, 2^n), with the usual binary semantics. Therefore, the least significant bit is the remainder of the value of the integer after division by 2.
It is debatable whether it's useful to replace readable mathematics with low-level bit operations; this kind of style was popular in the 70s when compilers weren't very smart. Nowadays I think you can assume that a compiler will know that dividing by two can be realized as bit shift etc., so you can just Write What You Mean.
What the code snippet does is not convert an unsigned int into a binary number (its internal representation is already binary). It creates a bit array with the values of the unsigned int's bits; it spreads the number out over an array, if you will.
e.g. x=3 => bits[2]=0 bits[1]=1 bits[0]=1
To do this it:
- selects the last bit of the number and places it in the bits array (the &1 operation);
- shifts the number to the right by one position (/2 is equivalent to >>1);
- repeats the above operations for all the bits.
You could have used %2 instead of &1, the generated code should be the same. But I guess it's just a matter of programming style and preference. For most programmers, the &1 is a lot clearer than %2.
In your example, %2 and &1 are the same. Which one to use is probably simply a matter of taste. While %2 is probably easier to read for people with a strong mathematics background, &1 is easier to understand for people with a strong technical background.
They are equivalent only in this special case. It's an old, Fortran-influenced style.

Bitwise operations and shifts

I'm having some trouble understanding how and why this code works the way it does. My partner in this assignment finished this part and I can't get ahold of him to find out how and why it works. I've tried a few different things to understand it, but any help would be much appreciated. This code uses a 2's complement, 32-bit representation.
/*
* fitsBits - return 1 if x can be represented as an
* n-bit, two's complement integer.
* 1 <= n <= 32
* Examples: fitsBits(5,3) = 0, fitsBits(-4,3) = 1
* Legal ops: ! ~ & ^ | + << >>
* Max ops: 15
* Rating: 2
*/
int fitsBits(int x, int n) {
int r, c;
c = 33 + ~n;
r = !(((x << c)>>c)^x);
return r;
}
c = 33 + ~n;
This calculates how many high order bits are remaining after using n low order bits.
((x << c) >> c)
This fills the high order bits with the same value as the sign bit of x.
!(blah ^ x)
This is equivalent to
blah == x
On a 2's-complement platform -n is equivalent to ~n + 1. For this reason, c = 33 + ~n on such platform is actually equivalent to c = 32 - n. This c is intended to represent how many higher-order bits remain in a 32-bit int value if n lower bits are occupied.
Note two pieces of platform dependence present in this code: 2's-complement platform, 32-bit int type.
Then ((x << c) >> c) is intended to sign-fill those c higher-order bits. Sign-fill means that for those values of x that have 0 in bit position n - 1, the higher-order bits have to be zeroed out, while for those values of x that have 1 in bit position n - 1, the higher-order bits have to be filled with 1s. This is important to make the code work properly for negative values of x.
This introduces another two pieces of platform dependence: the << operator, which must behave nicely when shifting negative values or when a 1 is shifted into the sign bit (formally this is undefined behavior), and the >> operator, which must perform sign-extension when shifting negative values (formally this is implementation-defined).
The rest is, as answered above, just a comparison with the original value of x: !(a ^ b) is equivalent to a == b. If the above transformations did not destroy the original value of x then x does indeed fit into n lower bits of 2's-complement representation.
Using the bitwise complement (unary ~) operator on a signed integer has implementation-defined and undefined aspects. In other words, this code isn't portable, even when you consider only two's complement implementations.
It is important to note that even two's complement representations in C may have trap representations. 6.2.6.2p2 even states this quite clearly:
If the sign bit is one, the value shall be modified in one of the following ways:
-- the corresponding value with sign bit 0 is negated (sign and magnitude);
-- the sign bit has the value -(2^M) (two's complement);
-- the sign bit has the value -(2^M - 1) (ones' complement).
Which of these applies is implementation-defined, as is whether the value with sign bit 1 and all value bits zero (for the first two), or with sign bit and all value bits 1 (for ones' complement), is a trap representation or a normal value.
The emphasis is mine. Using trap representations is undefined behaviour.
There are actual implementations that reserve that value as a trap representation in the default mode. The notable one I tend to cite is the Unisys ClearPath Dorado on OS 2200 (go to 2-29). Do note the date on that document; such implementations aren't necessarily ancient (hence the reason I cite this one).
According to 6.2.6.2p4, shifting negative values left is undefined behaviour, too. I haven't done a whole lot of research into what behaviours are out there in reality, but I would reasonably expect that there might be implementations that sign-extend, as well as implementations that don't. This would also be one way of forming the trap representations mentioned above, which are undefined in nature and thus undesirable. Theoretically (or perhaps some time in the distant or not-so-distant future), you might also face signals "corresponding to a computational exception" (that's a C standard category similar to that which SIGSEGV falls into, corresponding to things like "division by zero") or otherwise erratic and/or undesirable behaviours...
In conclusion, the only reason the code in the question works is by coincidence that the decisions your implementation made happen to align in the right way. If you use the implementation I've listed, you'll probably find that this code doesn't work as expected for some values.
Such heavy wizardry (as it has been described in comments) isn't really necessary, and doesn't really look that optimal to me. If you want something that doesn't rely upon magic (e.g. something portable) to solve this problem consider using this (actually, this code will work for at least 1 <= n <= 64):
#include <stdint.h>
int fits_bits(intmax_t x, unsigned int n) {
uintmax_t min = 1ULL << (n - 1),
max = min - 1;
return (x < 0) * min + x <= max;
}
