Performing arithmetic operations in binary using only bitwise operators [duplicate] - c

Possible Duplicate:
How can I multiply and divide using only bit shifting and adding?
I have to write functions to perform binary subtraction, multiplication, and division without using any arithmetic operators except for loop control. I've only written code in Java before now, so I'm having a hard time wrapping my head around this.
Starting with subtraction, I need to write a function with prototype
int bsub(int x, int y)
I know I need to convert y to two's complement in order to make it negative and add it to x, but I only know how to do this by using one's complement ~ operator and adding 1, but I can't use the + operator.
The badd function was provided, and I will be able to implement it in bsub if I can figure out how to make y a negative number. The code for badd is shown below. Thanks in advance for any tips.
int badd(int x,int y){
int i;
char sum;
char car_in=0;
char car_out;
char a,b;
unsigned int mask=0x00000001;
int result=0;
for(i=0;i<32;i++){
a=(x&mask)!=0;
b=(y&mask)!=0;
car_out=car_in & (a|b) |a&b;
sum=a^b^car_in;
if(sum) {
result|=mask;
}
if(i!=31) {
car_in=car_out;
} else {
if(car_in!=car_out) {
printf("Overflow occurred\n");
}
}
mask<<=1;
}
return result;
}

Subtracting with bitwise operations only, without the + or - operators, is slightly tricky but can be done. You have the right idea with the complement; the difficulty is adding 1 without using +.
You can solve this by first implementing addition with bitwise operators only, then using that addition both to form the two's complement and to perform the subtraction. The code looks like this:
int badd(int n1, int n2){
int carry, sum;
carry = (n1 & n2) << 1; // Find bits that are used for carry
sum = n1 ^ n2; // Add each bit, discard carry.
if (sum & carry) // If bits match, add current sum and carry.
return badd(sum, carry);
else
return sum ^ carry; // Return the sum.
}
int bsub(int n1, int n2){
// Add two's complement and return.
return badd(n1, badd(~n2, 1));
}
And then if we use the above code in an example:
int main(){
printf("%d\n", bsub(53, 17));
return 0;
}
This prints 36, and that is how subtraction works with bitwise-only operations.
Afterwards multiplication and division get more complicated, but can be done; for those two operations, use shifts along with addition and/or subtraction to get the job done. You may also want to read this question and this article on how to do it.
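As a rough illustration of the "shifts along with addition" idea for multiplication, here is a minimal sketch. The names add_bits and mul_bits are made up for this example (they are not from the question), and unsigned operands are assumed so that every shift is well defined:

```c
/* Bitwise adder: XOR gives the sum without carries, AND plus a left
   shift gives the carries; repeat until no carry is left.
   add_bits is a hypothetical name used only in this sketch. */
static unsigned add_bits(unsigned a, unsigned b)
{
    while (b != 0) {
        unsigned carry = (a & b) << 1;
        a ^= b;
        b = carry;
    }
    return a;
}

/* Shift-and-add multiplication: for every set bit of y, add the
   correspondingly shifted copy of x to the result.
   mul_bits is a hypothetical name used only in this sketch. */
static unsigned mul_bits(unsigned x, unsigned y)
{
    unsigned result = 0;
    while (y != 0) {
        if (y & 1u)
            result = add_bits(result, x);
        x <<= 1;
        y >>= 1;
    }
    return result;
}
```

Calling mul_bits(53, 17) walks over the five bits of 17 and adds 53 and 53 << 4, with no +, - or * operator involved.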

You have to implement the binary addition first:
Example with 4 bits:
a = 1101
b = 1011
mask will range from 0001 to 1000
carry = 0;
result = 0;
for (i=0;i<4;i++) {
x = (a >> i) & 1; //bit i of a; you can shift a mask left as well
y = (b >> i) & 1; //bit i of b
z = x ^ y; //XOR to calculate addition
z = z ^ carry; //add previous carry
result = result | (z << i); //store the sum bit
carry = x & y | x & carry | y & carry; //new carry
}
This is pseudocode. The mask allows for operating bit by bit from right to left (least significant bit first). The sum bits z are collected in result.
Once you have the addition, you'll be able to implement subtraction by taking the one's complement and adding 1.
Multiplication goes the same way, but is slightly more difficult. Basically it's the same long multiplication method you learned at school, using masks to select bits conveniently and adding the intermediate results using the addition above.
Division is a bit more complicated; it would take some more time to explain, but basically it's the same principle.

Related

Is there a way to know if an integer division operation had a remainder?

I don't want to know what the remainder is, I just want to know if there was a remainder as a boolean value.
As such, using the modulo operator is not what I'm looking for.
Something in C would be preferable, but any language works.
If you really cannot use the remainder operation (homework constraint?), then you can use
check_remainder = b*(a/b) != a;
But using the remainder operator % is the natural way to go.
Use div(), which, hopefully, calculates the quotient and the remainder in a single operation.
#include <stdlib.h>
//...
int a = 42, b = 5;
div_t c = div(a, b);
if (c.rem) /* remainder for a/b is not zero */;
//...
You cannot tell if there is a remainder until you calculate such remainder, which means you have to perform the division.
That said, there are some shortcuts you can take, but only if the divisor is a power of 2. For example, a division by 2 will have a remainder if the least significant bit is 1. In general, a division by 2 raised to the power n will have a remainder if the n least significant bits have a value other than 0000.....0 (n zeroes).
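A minimal sketch of that power-of-2 shortcut, assuming unsigned operands and an exponent n smaller than the bit width of unsigned; has_remainder_pow2 is a made-up name for this example:

```c
/* Returns 1 if a / 2^n leaves a remainder, 0 otherwise.
   Assumption: 0 <= n < number of bits in unsigned.
   has_remainder_pow2 is a hypothetical helper name. */
static int has_remainder_pow2(unsigned a, unsigned n)
{
    unsigned mask = (1u << n) - 1u;   /* the n least significant bits set */
    return (a & mask) != 0;
}
```

For example, has_remainder_pow2(44, 3) checks the low three bits of 44 (binary 101100), which are 100, so dividing by 8 leaves a remainder.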
At the assembly level, some architectures, such as x86, provide you with both the quotient and the remainder as the result of a DIV instruction.
If using the modulus operator (or the division operator which is used internally) is not an option, you can do it the naive way:
unsigned has_remainder (unsigned a, unsigned b)
{
while (a >= b)
a -= b;
return a;
}

binary division in c require assistance

I need help; I can't seem to get the right result. The assignment I am doing right now has to do with binary division. The purpose of this project is to mimic the ALU, and we are not allowed to use the addition or subtraction operators. Are there some parts in the code that I am missing?
#include <stdio.h>
#include <math.h>
int subtraction(int operand_1, int operand_2);
int division(int dividen, int divisor);
int addition(int operand_1, int operand_2);
int main()
{
int operand_1, operand_2,res;
printf (" enter the value for operand_1(dividen) and operand_2(divisor): ");
scanf ("%d %d",&operand_1,&operand_2);
res = division(operand_1,operand_2);
printf(" binary division: %d\n\n",res);
}
int addition(int operand_1, int operand_2)
{
int carry = operand_2;
int sum = operand_1;
while (carry !=0)
{
int temp =(sum & carry )<<1;
sum = sum ^ carry;
carry = temp;
}
return sum;
}
int subtraction(int operand_1, int operand_2)
{
operand_2= addition(~operand_2,1);
return addition (operand_1,operand_2);
}
int division ( int dividen, int divisor)
{
int i;
int quotient =0;
for (i= 0; i < 33 ; i++)
{
int remainder = subtraction(remainder,divisor);
if (remainder<0)
{
remainder = addition(divisor, remainder);
quotient= quotient << 1 & 0xfe;
}
else
{
quotient = quotient >> 1 & 0xfe;
}
divisor =divisor >> 1 ;
}
return quotient;
}
Issues with the division() function presented
Function division() initializes its quotient variable to 0. It later performs shift operations on it (quotient >> 1, quotient << 1), and bitwise & operations on the shifted results, but all of these will always produce 0 when quotient is 0. Those results are the only thing ever assigned back to quotient, so the function presented will never return anything other than 0 (unless as a result of exercising undefined behavior).
Moreover, this line is certainly wrong:
int remainder = subtraction(remainder,divisor);
It passes the indeterminate initial value of remainder to the subtraction() function, and uses the undefined result of that as the value of remainder.
And perhaps most telling, the function never uses its dividen parameter.
If your compiler is not producing warnings about the last two, then you would be well advised to learn how to turn up its warning output, or else to find a more helpful compiler. On the other hand, if you are ignoring your compiler's warnings then stop doing that. Compiler warnings are there to help you. Take the time to understand what they are telling you, and fix the problems they describe, or else be sure you can explain why it is safe to ignore them.
Binary long division
Based on the small, fixed bounds of the for loop in the presented division() function and its use of bit shifting, I'm inclined to think that you are specifically trying to implement binary long division. That would be a reasonable way to go, but the details aren't right at all.
is there some parts in the code that i am missing?
It's not a matter of some essential detail having been skipped. Although the algorithm implemented is reminiscent of binary long division, there's pretty much nothing correct about it.
Setting aside the questions of negative inputs and division by zero, binary long division could take this general form:
1. Convert dividend and divisor to type unsigned int. This gives you an extra value bit to work with, and it gives you defined behavior in relevant cases where [signed] int does not. (See also below.)
2. Shift the (unsigned) divisor left until it exceeds the (unsigned) dividend, then shift it one bit back right. Let b designate the total number of left shifts performed in that process, which may be zero.
3. Initialize the working quotient to 0, and the working remainder to the dividend.
4. Perform the following steps b times:
   a. shift the working quotient one bit left
   b. subtract the (shifted) divisor from the working remainder
   c. if the difference is non-negative then
      set the working remainder to the difference
      turn on the least-significant bit of the working quotient (equivalently, add 1 to the quotient)
      (else the difference is negative. The working remainder should not be updated, and the corresponding quotient bit is (already) zero).
   d. shift the divisor one bit right.
After the iterations are finished, the working quotient is the correct binary quotient (of the unsigned operands), and the working remainder is the remainder, as would be computed by the % operator.
Note that although the bit-shifting disguises it somewhat, this is the same long division algorithm you learned in grade school, simplified by the fact that 1 and 0 are the only digits to be concerned with.
If your function must handle negative inputs then the initial conversion to unsigned must capture information about their signs and yield their absolute values. The sign information will inform whether the quotient needs to be inverted at the end.
I have intentionally avoided writing actual C code for the division function, so as not to rob you of the instructional value of writing that code yourself. I would encourage you, moreover, to study the algorithm description until you understand what it's doing and why, and then attempt to rewrite your division() function without referring further to this answer.
This works. I check the signs of the operands; if both are negative, the result is positive.
By the way, your keyboard input doesn't work well for two negative values; it is probably better to read them on two separate lines, with two scanf calls.
int division ( int dividen, int divisor)
{
char sign=0;
int i;
int quotient =0;
int remainder;
if(dividen<0)
{
dividen=addition(~dividen,1);
sign^=1;
}
if(divisor<0)
{
divisor=addition(~divisor,1);
sign^=1;
}
/* take the remainder only after dividen has been made non-negative */
remainder=dividen;
do
{
printf("%d %d\r\n ",quotient ,remainder);
if (remainder>=divisor)
{
quotient= addition(quotient,1);
remainder=subtraction(remainder,divisor);
}
}while(remainder>=divisor);
if(sign)
quotient=addition(~quotient,1);
return quotient;
}
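For comparison with the repeated-subtraction loop above, the shift-and-subtract long division described earlier might be sketched roughly as follows. This is only an illustration under some assumptions: the operands are unsigned, the divisor is non-zero, and udiv_long is a made-up name. The -= here stands in for the assignment's bitwise subtraction() helper to keep the sketch short.

```c
/* Unsigned binary long division (shift and subtract).
   Assumption: divisor != 0 (the alignment loop never ends otherwise).
   udiv_long is a hypothetical name; the remainder is stored in *rem. */
static unsigned udiv_long(unsigned dividend, unsigned divisor, unsigned *rem)
{
    unsigned quotient = 0;
    unsigned steps = 0;

    /* Align: shift the divisor left while it still fits under the dividend.
       Comparing against dividend >> 1 avoids shifting a bit out of the top. */
    while (divisor <= (dividend >> 1)) {
        divisor <<= 1;
        steps++;
    }

    /* One quotient bit per alignment shift, plus one for the unshifted divisor. */
    for (;;) {
        quotient <<= 1;
        if (dividend >= divisor) {
            dividend -= divisor;   /* replace with subtraction() in the assignment */
            quotient |= 1u;
        }
        if (steps == 0)
            break;
        divisor >>= 1;
        steps--;
    }

    *rem = dividend;   /* what is left of the dividend is the remainder */
    return quotient;
}
```

The number of loop iterations depends on the bit length of the quotient rather than on its value, which is the main difference from the repeated-subtraction version.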

In C bits, multiply by 3 and divide by 16

A buddy of mine had these puzzles and this is one that is eluding me. Here is the problem: you are given a number and you want to return that number times 3 and divided by 16, rounding towards 0. Should be easy. The catch? You can only use the ! ~ & ^ | + << >> operators, and at most 12 of them in total.
int mult(int x){
//some code here...
return y;
}
My attempt at it has been:
int hold = x + x + x;
int hold1 = 8;
hold1 = hold1 & hold;
hold1 = hold1 >> 3;
hold = hold >> 4;
hold = hold + hold1;
return hold;
But that doesn't seem to be working. I think I have a problem of losing bits but I can't seem to come up with a way of saving them. Another perspective would be nice. Just to add, you also can only use variables of type int and no loops, if statements or function calls may be used.
Right now I have the number 0xfffffff. It is supposed to return 0x2ffffff but it is returning 0x3000000.
For this question you need to worry about the lost bits before your division (obviously).
Essentially, if it is negative then you want to add 15 after you multiply by 3. A simple if statement (using your operators) should suffice.
I am not going to give you the code but a step by step would look like,
x = x*3
get the sign and store it in variable foo.
have another variable hold x + 15;
Set up an if statement so that if x is negative it uses that added 15 and if not then it uses the regular number (times 3 which we did above).
Then divide by 16 which you already showed you know how to do. Good luck!
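This seems to work (as long as no overflow occurs) if rounding down is acceptable; note that for negative inputs it rounds toward negative infinity rather than toward zero, just like the Math.floor reference in the test below: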
((num<<2)+~num+1)>>4
Try this JavaScript code, run in console:
for (var num = -128; num <= 128; ++num) {
var a = Math.floor(num * 3 / 16);
var b = ((num<<2)+~num+1)>>4;
console.log(
"Input:", num,
"Regular math:", a,
"Bit math:", b,
"Equal: ", a===b
);
}
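If rounding toward zero is required for negative inputs as well, one possible sketch is to add a bias of 15 before the final shift whenever the tripled value is negative. The assumptions here are a 32-bit two's-complement int, a >> that shifts negative values arithmetically, and a 3 * x that does not overflow; mul3div16 is a made-up name.

```c
/* Returns x * 3 / 16, rounded toward zero.
   Assumptions: 32-bit two's-complement int, >> on a negative int is an
   arithmetic shift, and 3 * x does not overflow.
   mul3div16 is a hypothetical name used only in this sketch. */
static int mul3div16(int x)
{
    int t = x + x + x;           /* 3 * x using two additions */
    int bias = (t >> 31) & 15;   /* 15 if t is negative, 0 otherwise */
    return (t + bias) >> 4;      /* the biased shift truncates toward zero */
}
```

This uses six of the allowed operators (three +, two >>, one &), so it stays within the limit of 12.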
The Maths
When you divide a positive integer n by 16, you get a positive integer quotient k and a remainder c < 16:
(n/16) = k + (c/16).
(Or simply apply Euclidean division.) The question asks for multiplication by 3/16, so multiply by 3:
(n/16) * 3 = 3k + (c/16) * 3.
The number k is an integer, so the part 3k is still a whole number. However, int arithmetic rounds down, so the second term may lose precision if you divide first. And since c < 16, you can safely multiply first without overflowing (assuming int has at least 7 bits). So the algorithm design can be
(3n/16) = 3k + (3c/16).
The design
The integer k is simply n/16 rounded down towards 0. So k can be found by applying a single right shift operation. Two further operations will give 3k. Operation count: 3.
The remainder c can also be found using an AND operation (with the low four bits that the shift drops). Multiplication by 3 uses two more operations, and a shift finishes the division. Operation count: 4.
Adding them together gives you the final answer.
Total operation count: 8.
Negatives
The above algorithm uses shift operations. It may not work well on negatives. However, assuming two's complement, the sign of n is stored in a sign bit. It can be removed before applying the algorithm and reapplied on the answer.
To find and store the sign of n, a single AND is sufficient.
To remove this sign, OR can be used.
Apply the above algorithm.
To restore the sign bit, use a final OR operation on the algorithm output with the stored sign bit.
This brings the final operation count up to 11.
What you can do is first divide by 4, then add the result three times, then divide by 4 again (approximately, since the two bits dropped by the first division are lost):
3*x/16=(x/4+x/4+x/4)/4
With this logic the program can be
#include <stdio.h>
int main(void)
{
int x=0xefffffff;
int y;
printf("%x",x);
y=x&(0x80000000);
y=y>>31;
x=(y&(~x+1))+(~y&(x));
x=x>>2;
x=x&(0x3fffffff);
x=x+x+x;
x=x>>2;
x=x&(0x3fffffff);
x=(y&(~x+1))+(~y&(x));
printf("\n%x %d",x,x);
}
AND with 0x3fffffff to make the most significant bits zero. It will even convert numbers to positive.
This uses the two's complement of negative numbers. With direct methods of division there will be a loss of bit accuracy for negative numbers, so use this workaround of converting a negative number to a positive one and then performing the division operations.
Note that the C99 standard states in section 6.5.7 that a right shift of a negative signed integer yields an implementation-defined value. Provided that int comprises 32 bits and that right shifting of signed integers maps to an arithmetic shift instruction, the following code works for all int inputs. A fully portable solution that also fulfills the requirements set out in the question may be possible, but I cannot think of one right now.
My basic idea is to split the number into high and low bits to prevent intermediate overflow. The high bits are divided by 16 first (this is an exact operation), then multiplied by three. The low bits are first multiplied by three, then divided by 16. Since arithmetic right shift rounds towards negative infinity instead of towards zero like integer division, a correction needs to be applied to the right shift for negative numbers. For a right shift by N, one needs to add 2^N-1 prior to the shift if the number to be shifted is negative.
#include <stdio.h>
#include <stdlib.h>
int ref (int a)
{
long long int t = ((long long int)a * 3) / 16;
return (int)t;
}
int main (void)
{
int a, t, r, c, res;
a = 0;
do {
t = a >> 4; /* high order bits */
r = a & 0xf; /* low order bits */
c = (a >> 31) & 15; /* shift correction. Portable alternative: (a < 0) ? 15 : 0 */
res = t + t + t + ((r + r + r + c) >> 4);
if (res != ref(a)) {
printf ("!!!! error a=%08x res=%08x ref=%08x\n", a, res, ref(a));
return EXIT_FAILURE;
}
a++;
} while (a);
return EXIT_SUCCESS;
}

Implement ceil() in C

I want to implement my own ceil() in C. Searched through the libraries for source code & found here, but it seems pretty difficult to understand. I want clean & elegant code.
I also searched on SO and found some answers here. None of the answers seems to be correct. One of the answers is:
#define CEILING_POS(X) ((X-(int)(X)) > 0 ? (int)(X+1) : (int)(X))
#define CEILING_NEG(X) ((X-(int)(X)) < 0 ? (int)(X-1) : (int)(X))
#define CEILING(X) ( ((X) > 0) ? CEILING_POS(X) : CEILING_NEG(X) )
AFAIK, the return type of the ceil() is not int. Will macro be type-safe here?
Further, will the above implementation work for negative numbers?
What will be the best way to implement it?
Can you provide the clean code?
The macro you quoted definitely won't work correctly for numbers that are greater than INT_MAX but which can still be represented exactly as a double.
The only way to implement ceil() correctly (assuming you can't implement it using an equivalent assembly instruction) is to do bit-twiddling on the binary representation of the floating point number, as is done in the s_ceil.c source file behind your first link. Understanding how the code works requires an understanding of the floating point representation of the underlying platform -- the representation is most probably going to be IEEE 754 -- but there's no way around this.
Edit:
Some of the complexities in s_ceil.c stem from the special cases it handles (NaNs, infinities) and the fact that it needs to do its work without being able to assume that a 64-bit integral type exists.
The basic idea of all the bit-twiddling is to mask off the fractional bits of the mantissa and add 1 to it if the number is greater than zero... but there's a bit of additional logic involved as well to make sure you do the right thing in all cases.
Here's an illustrative version of ceil() for floats that I cobbled together. Beware: This does not handle the special cases correctly and it is not tested extensively -- so don't actually use it. It does however serve to illustrate the principles involved in the bit-twiddling. I've tried to comment the routine extensively, but the comments do assume that you understand how floating point numbers are represented in IEEE 754 format.
union float_int
{
float f;
int i;
};
float myceil(float x)
{
union float_int val;
val.f=x;
// Extract sign, exponent and mantissa
// Bias is removed from exponent
int sign=val.i >> 31;
int exponent=((val.i & 0x7fffffff) >> 23) - 127;
int mantissa=val.i & 0x7fffff;
// Is the exponent less than zero?
if(exponent<0)
{
// In this case, x is in the open interval (-1, 1)
if(x<=0.0f)
return 0.0f;
else
return 1.0f;
}
else
{
// Construct a bit mask that will mask off the
// fractional part of the mantissa
int mask=0x7fffff >> exponent;
// Is x already an integer (i.e. are all the
// fractional bits zero?)
if((mantissa & mask) == 0)
return x;
else
{
// If x is positive, we need to add 1 to it
// before clearing the fractional bits
if(!sign)
{
mantissa+=1 << (23-exponent);
// Did the mantissa overflow?
if(mantissa & 0x800000)
{
// The mantissa can only overflow if all the
// integer bits were previously 1 -- so we can
// just clear out the mantissa and increment
// the exponent
mantissa=0;
exponent++;
}
}
// Clear the fractional bits
mantissa&=~mask;
}
}
// Put sign, exponent and mantissa together again
val.i=(sign << 31) | ((exponent+127) << 23) | mantissa;
return val.f;
}
Nothing you will write is more elegant than using the standard library implementation. No code at all is always more elegant than elegant code.
That aside, this approach has two major flaws:
If X is greater than INT_MAX + 1 or less than INT_MIN - 1, the behavior of your macro is undefined. This means that your implementation may give incorrect results for nearly half of all floating-point numbers. You will also raise the invalid flag, contrary to IEEE-754.
It gets the edge cases for -0, +/-infinity, and nan wrong. In fact, the only edge case it gets right is +0.
You can implement ceil in a manner similar to what you tried, like so (this implementation assumes IEEE-754 double precision):
#include <math.h>
double ceil(double x) {
// All floating-point numbers larger than 2^52 are exact integers, so we
// simply return x for those inputs. We also handle ceil(nan) = nan here.
if (isnan(x) || fabs(x) >= 0x1.0p52) return x;
// Now we know that |x| < 2^52, and therefore we can use conversion to
// long long to force truncation of x without risking undefined behavior.
const double truncation = (long long)x;
// If the truncation of x is smaller than x, then it is one less than the
// desired result. If it is greater than or equal to x, it is the result.
// Adding one cannot produce a rounding error because `truncation` is an
// integer smaller than 2^52.
const double ceiling = truncation + (truncation < x);
// Finally, we need to patch up one more thing; the standard specifies that
// ceil(-small) be -0.0, whereas we will have 0.0 right now. To handle this
// correctly, we apply the sign of x to the result.
return copysign(ceiling, x);
}
Something like that is about as elegant as you can get and still be correct.
I flagged a number of concerns with the (generally good!) implementation that Martin put in his answer. Here's how I would implement his approach:
#include <stdint.h>
#include <string.h>
static inline uint64_t toRep(double x) {
uint64_t r;
memcpy(&r, &x, sizeof x);
return r;
}
static inline double fromRep(uint64_t r) {
double x;
memcpy(&x, &r, sizeof x);
return x;
}
double ceil(double x) {
const uint64_t signbitMask = UINT64_C(0x8000000000000000);
const uint64_t significandMask = UINT64_C(0x000fffffffffffff);
const uint64_t xrep = toRep(x);
const uint64_t xabs = xrep & ~signbitMask;
// If |x| is larger than 2^52 or x is NaN, the result is just x.
if (xabs >= toRep(0x1.0p52)) return x;
if (xabs < toRep(1.0)) {
// If x is in (-1.0, 0.0], the result is copysign(0.0, x).
// We can generate this value by clearing everything except the signbit.
if (x <= 0.0) return fromRep(xrep & signbitMask);
// Otherwise x is in (0.0, 1.0), and the result is 1.0.
else return 1.0;
}
// Now we know that the exponent of x is strictly in the range [0, 51],
// which means that x contains both integral and fractional bits. We
// generate a mask covering the fractional bits.
const int exponent = (int)(xabs >> 52) - 1023;
const uint64_t fractionalBits = significandMask >> exponent;
// If x is negative, we want to truncate, so we simply mask off the
// fractional bits.
if (xrep & signbitMask) return fromRep(xrep & ~fractionalBits);
// x is positive; to force rounding to go away from zero, we first *add*
// the fractionalBits to x, then truncate the result. The add may
// overflow the significand into the exponent, but this produces the
// desired result (zero significand, incremented exponent), so we just
// let it happen.
return fromRep((xrep + fractionalBits) & ~fractionalBits);
}
One thing to note about this approach is that it does not raise the inexact floating-point flag for non-integral inputs. That may or may not be a concern for your usage. The first implementation that I listed does raise the flag.
I don't think a macrofunction is a good solution: it isn't type safe and there is a multi-evaluation of the arguments (side-effects). You should rather write a clean and elegant function.
As I would have expected more jokes in answers, I will try a couple
#define CEILING(X) ceil(X)
Bonus: a macro with not so many side effects
If you don't care too much about negative zeroes
#define CEILING(X) (-floor(-(X)))
If you care of negative zero, then
#define CEILING(X) (NEGATIVE_ZERO - floor(-(X)))
Portable definition of NEGATIVE_ZERO left as an exercise....
Bonus, it will also set FP flags (OVERFLOW INVALID INEXACT)

How will you implement pow(a,b) in C ? condition follows --

without using multiplication or division operators.
You can use only add/subtract operators.
A pointless problem, but solvable with the properties of logarithms:
pow(a,b) = exp( b * log(a) )
= exp( exp( log(b) + log(log(a)) ) )
Take care to ensure that your exponential and logarithm functions are using the same base.
Yes, I know how to use a sliderule. Learning that trick will change your perspective of logarithms.
If they are integers, it's simple to turn pow (a, b) into b multiplications of a.
pow(a, b) = a * a * a * a ... ; // do this b times
And simple to turn a * a into additions
a * a = a + a + a + a + ... ; // do this a times
If you combine them, you can make pow.
First, make mult(int a, int b), then use it to make pow.
A recursive solution :
#include<stdio.h>
int multiplication(int a1, int b1)
{
if(b1)
return (a1 + multiplication(a1, b1-1));
else
return 0;
}
int pow(int a, int b)
{
if(b)
return multiplication(a, pow(a, b-1));
else
return 1;
}
int main()
{
printf("\n %d", pow(5, 4));
}
You've already gotten answers purely for FP and purely for integers. Here's one for a FP number raised to an integer power:
double power(double x, int y) {
double z = 1.0;
while (y > 0) {
while (!(y&1)) {
y >>= 1;
x *= x;
}
--y;
z = x * z;
}
return z;
}
At the moment this uses multiplication. You can implement multiplication using only bit shifts, a few bit comparisons, and addition. For integers it looks like this:
int mul(int x, int y) {
int result = 0;
while (y) {
if (y&1)
result += x;
x <<= 1;
y >>= 1;
}
return result;
}
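Putting the two pieces together for integers, a minimal sketch of exponentiation by squaring on top of a shift-and-add multiply could look like this. Unsigned operands are assumed (so overflow simply wraps), and the names mul_u and ipow_u are made up for this example:

```c
/* Shift-and-add multiply for unsigned operands; no * operator.
   mul_u is a hypothetical name used only in this sketch. */
static unsigned mul_u(unsigned x, unsigned y)
{
    unsigned result = 0;
    while (y) {
        if (y & 1u)
            result += x;
        x <<= 1;
        y >>= 1;
    }
    return result;
}

/* Exponentiation by squaring built on mul_u; no * or / operator.
   ipow_u is a hypothetical name used only in this sketch. */
static unsigned ipow_u(unsigned base, unsigned exp)
{
    unsigned result = 1;
    while (exp) {
        if (exp & 1u)
            result = mul_u(result, base);   /* this bit of exp contributes */
        base = mul_u(base, base);           /* square for the next bit */
        exp >>= 1;
    }
    return result;
}
```

With this, ipow_u(5, 4) needs three squarings and one extra multiply instead of the hundreds of additions the recursive version performs.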
Floating point is pretty much the same, except you have to normalize your results -- i.e., in essence, a floating point number is 1) a significand expressed as a (usually fairly large) integer, and 2) a scale factor. If you want to produce normal IEEE floating point numbers a few parts get a bit ugly though -- for example, the scale factor is stored as a "bias" number instead of any of the usual 1's complement, 2's complement, etc., so working with it is clumsy (basically, each operation you subtract off the bias, do the operation, check for overflow, and (assuming it hasn't overflowed) add the bias back on again).
Doing the job without any kind of logical tests sounds (to me) like it probably wasn't really intended. For quite a few computer architecture classes, it's interesting to reduce a problem to primitive operations you can express directly in hardware (e.g., bit shifts, bitwise-AND, -OR and -NOT, etc.). The implementation shown above fits that reasonably well (if you want to get technical, an adder takes a few gates, but it's included in things like VHDL and Verilog anyway).
