I am trying to write a program that takes a floating-point number in base 10 and converts its fractional part to base 2. In the following code, I intend to call my conversion function inside a printf and format the output. The issue lies in fra_binary(): I can't figure out the best way to return an integer made by concatenating the digit produced at each iteration. Here is what I have so far (the code is not optimized because I am still working on it):
#include <stdio.h>
#include <math.h>
int fra_binary(double fract) ;
int main()
{
long double n ;
double fract, deci ;
printf("base 10 :\n") ;
scanf("%Lf", &n) ;
fract = modf(n, &deci) ;
int d = deci ;
printf("base 2: %d.%d\n", d, fra_binary(fract)) ;
return(0) ;
}
int fra_binary(double F)
{
double fl ;
double decimal ;
int array[30] ;
for (int i = 0 ; i < 30 ; i++) {
fl = F * 2 ;
F = modf(fl, &decimal) ;
array[i] = decimal ;
if (F == 0) break ;
}
return array[0] ;
}
Obviously this returns only part of the desired output: I would need the whole array concatenated into one int or char string to display the series of 1s and 0s. So at each iteration, I want to append the digit I just computed to the result (1 followed by 0 gives 10, not 1). How would I go about it?
Hope this makes sense!
return array[0] ; returns only the first value of int array[30] set in fra_binary(). The code discards all but the first result of the loop for (int i = 0 ; i < 30 ; i++).
convert its fractional part in base 2
OP's loop idea is a good starting point. Yet int array[30] is insufficient to encode the fractional portion of all double into a "binary".
can't figure out the best way to return an integer
Returning an int will be insufficient. Instead, consider using a string, or manage an integer array in a similar fashion.
Use defines from <float.h> to drive the buffer requirements.
#include <stdio.h>
#include <math.h>
#include <float.h>
char *fra_binary(char *dest, double x) {
_Static_assert(FLT_RADIX == 2, "Unexpected FP base");
double deci;
double fract = modf(x, &deci);
fract = fabs(fract);
char *s = dest;
do {
double d;
fract = modf(fract * 2.0, &d);
*s++ = "01"[(int) d];
} while (fract);
*s = '\0';
// For debug
printf("%*.*g --> %.0f and .", DBL_DECIMAL_DIG + 8, DBL_DECIMAL_DIG, x,
deci);
return dest;
}
int main(void) {
// Perhaps 53 - -1021 + 1
char fraction_string[DBL_MANT_DIG - DBL_MIN_EXP + 1];
puts(fra_binary(fraction_string, -0.0));
puts(fra_binary(fraction_string, 1.0));
puts(fra_binary(fraction_string, acos(-1))); // machine pi
puts(fra_binary(fraction_string, -0.1));
puts(fra_binary(fraction_string, DBL_MAX));
puts(fra_binary(fraction_string, DBL_MIN));
puts(fra_binary(fraction_string, DBL_TRUE_MIN));
}
Output
-0 --> -0 and .0
1 --> 1 and .0
3.1415926535897931 --> 3 and .001001000011111101101010100010001000010110100011
-0.10000000000000001 --> -0 and .0001100110011001100110011001100110011001100110011001101
1.7976931348623157e+308 --> 179769313486231570814527423731704356798070600000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 and .0
2.2250738585072014e-308 --> 0 and .00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001
4.9406564584124654e-324 --> 0 and .000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001
It is also unclear why the input is a long double while the processing uses double. I recommend using just one floating-point type.
Note that your algorithm finds out the binary representation of the fraction most significant bit first.
One way to convert the fractional part to a binary string, would be to supply the function with a string and a string length, and have the function fill it with up to that many binary digits:
/* This function returns the number of chars needed in dst
to describe the fractional part of value in binary,
not including the trailing NUL ('\0').
Returns zero in case of an error (non-finite value).
*/
size_t fractional_bits(char *dst, size_t len, double value)
{
double fraction, integral;
size_t i = 0;
if (!isfinite(value))
return 0;
if (value > 0.0)
fraction = modf(value, &integral);
else
if (value < 0.0)
fraction = modf(-value, &integral);
else {
/* Zero fraction. */
if (len > 1) {
dst[0] = '0';
dst[1] = '\0';
} else
if (len > 0)
dst[0] = '\0';
/* One binary digit was needed for exact representation. */
return 1;
}
while (fraction > 0.0) {
fraction = fraction * 2.0;
if (fraction >= 1.0) {
fraction = fraction - 1.0;
if (i < len)
dst[i] = '1';
} else
if (i < len)
dst[i] = '0';
i++;
}
if (i < len)
dst[i] = '\0';
else
if (len > 0)
dst[len - 1] = '\0';
return i;
}
The above function works very much like snprintf(), except that it takes only the double whose fractional bits are to be stored as a string of binary digits (0 or 1), and returns 0 in case of an error (non-finite double value).
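Because the return value is the number of chars needed, the usual snprintf() two-pass idiom applies: measure first, then allocate and fill. The following is only a sketch. fractional_bits_alloc() is a hypothetical helper that is not part of the answer, and the function above is repeated in condensed form (intended to behave identically) just so the block compiles on its own.

```c
#include <math.h>
#include <stdlib.h>

/* Condensed copy of the function above, repeated only so that this
   sketch compiles on its own. */
size_t fractional_bits(char *dst, size_t len, double value)
{
    double fraction, integral;
    size_t i = 0;

    if (!isfinite(value))
        return 0;
    if (value == 0.0) {
        if (len > 1) { dst[0] = '0'; dst[1] = '\0'; }
        else if (len > 0) dst[0] = '\0';
        return 1;
    }
    fraction = modf(fabs(value), &integral);
    while (fraction > 0.0) {
        fraction *= 2.0;
        if (fraction >= 1.0) {
            fraction -= 1.0;
            if (i < len) dst[i] = '1';
        } else if (i < len)
            dst[i] = '0';
        i++;
    }
    if (i < len) dst[i] = '\0';
    else if (len > 0) dst[len - 1] = '\0';
    return i;
}

/* Hypothetical helper: returns a malloc()ed string holding all fractional
   bits of value, or NULL on error. The caller must free() the result. */
char *fractional_bits_alloc(double value)
{
    size_t needed = fractional_bits(NULL, 0, value);   /* pass 1: measure */
    char *s;

    if (needed == 0)
        return NULL;   /* non-finite, or a nonzero value with no fraction */
    s = malloc(needed + 1);                            /* +1 for the NUL */
    if (s)
        fractional_bits(s, needed + 1, value);         /* pass 2: fill */
    return s;
}
```

Note that a nonzero value without a fractional part (such as 1.0) also makes the function return 0, so this helper cannot tell that case apart from a non-finite input.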
Another option is to use an unsigned integer type to hold the bits. For example, if your code is intended to work on architectures where double is an IEEE-754 Binary64 type or similar, the mantissa has up to 53 bits of precision, and a uint64_t would suffice.
Here is an example of that:
uint64_t fractional_bits(const double val, size_t bits)
{
double fraction, integral;
uint64_t result = 0;
if (bits < 1 || bits > 64) {
errno = EINVAL;
return 0;
}
if (!isfinite(val)) {
errno = EDOM;
return 0;
}
if (val > 0.0)
fraction = modf(val, &integral);
else
if (val < 0.0)
fraction = modf(-val, &integral);
else {
errno = 0;
return 0;
}
while (bits-->0) {
result = result << 1;
fraction = fraction * 2.0;
if (fraction >= 1.0) {
fraction = fraction - 1.0;
result = result + 1;
}
}
errno = 0;
return result;
}
The return value is the binary representation of the fractional part: fractional_part ≈ result / 2^bits, where bits is between 1 and 64, inclusive.
In order for the caller to detect an error, the function clears errno to zero if no error occurred. If an error does occur, the function returns zero with errno set to EDOM if the value is not finite, or to EINVAL if bits is less than 1 or greater than 64.
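To display such a result, the bits have to be written out most significant first, including leading zeros. A minimal sketch, assuming the caller passes the same bits count that was used for the conversion; bits_to_string() is a hypothetical name, not part of the answer.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: writes the low `bits` bits of value into dst as
   '0'/'1' characters, most significant bit first, and NUL-terminates.
   dst must have room for bits + 1 chars; bits must be between 1 and 64. */
char *bits_to_string(char *dst, uint64_t value, size_t bits)
{
    for (size_t i = 0; i < bits; i++) {
        const uint64_t mask = UINT64_C(1) << (bits - 1 - i);
        dst[i] = (value & mask) ? '1' : '0';
    }
    dst[bits] = '\0';
    return dst;
}
```

For example, the fractional part 0.625 converted with bits = 3 gives the integer 5, which this helper renders as 101.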
You can combine the two approaches, if you implement an arbitrary-size unsigned integer type, or a bitmap type.
I'm having problems converting negative numbers, from decimal base to hexadecimal base, with the following function:
#include <stdio.h>
int main()
{
int quotient, remainder;
int i, j = 0;
char hexadecimalnum[100];
quotient = -50;
while (quotient != 0)
{
remainder = quotient % 16;
if (remainder < 10)
hexadecimalnum[j++] = 48 + remainder;
else
hexadecimalnum[j++] = 55 + remainder;
quotient = quotient / 16;
}
strrev(hexadecimalnum);
printf("%s", hexadecimalnum);
return 0;
}
For quotient = -50; the correct output should be:
ffffffce
But this function's output is:
.
With positive numbers the output is always correct but with negative numbers not.
I'm having a hard time understanding to why it doesn't work with negative numbers.
Some fixes:
unsigned int quotient - you need to convert -50 to a large hex number in two's complement or you'll get the wrong number of iterations (2) in the loop, instead of 8 as required.
Removal of "magic numbers": '0' + remainder and 'A' + remainder - 10.
Zero initialize hexadecimalnum because it needs to be null terminated before printing a string from there. Better yet, add the null termination explicitly.
Use for loops when possible.
Might as well store the characters from the back to front and save the extra call of reversing the string.
Result:
#include <stdio.h>
// 4 bytes*2 = 8 nibbles
#define HEX_STRLEN (sizeof(int)*2)
int main()
{
unsigned int remainder;
int i = 0;
char hex[100];
for(unsigned int q = -50; q!=0; q/=16)
{
remainder = q % 16;
if (remainder < 10)
hex[HEX_STRLEN-i-1] = '0' + remainder;
else
hex[HEX_STRLEN-i-1] = 'A' + remainder - 10;
i++;
}
hex[HEX_STRLEN] = '\0'; // explicit null termination
printf("%s\n", hex);
}
(There are still lots of improvements that can be made; this is just to be considered a first draft.)
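One possible next step is to move the loop into a function that always emits a fixed number of digits, so that small positive values get leading zeros instead of uninitialized chars. This is only a sketch: to_hex() and HEX_DIGITS are hypothetical names, lowercase digits are used to match the output expected in the question, and the ffffffce result assumes a 32-bit unsigned int.

```c
#include <limits.h>
#include <stddef.h>

/* Number of hex digits (nibbles) in an unsigned int on this platform. */
#define HEX_DIGITS (sizeof(unsigned int) * CHAR_BIT / 4)

/* Hypothetical helper: writes value as exactly HEX_DIGITS lowercase hex
   digits, filled from back to front, then NUL-terminates.
   dst must have room for HEX_DIGITS + 1 chars. */
void to_hex(unsigned int value, char *dst)
{
    static const char digits[] = "0123456789abcdef";

    for (size_t i = HEX_DIGITS; i-- > 0; ) {
        dst[i] = digits[value % 16];
        value /= 16;
    }
    dst[HEX_DIGITS] = '\0';
}
```

Calling to_hex((unsigned int)-50, buf) relies on the well-defined conversion of a negative int to unsigned, which yields the two's complement bit pattern.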
You can use printf's format specifier "%08x" to print any int in its hexadecimal representation.
#include <stdio.h>
void num_to_hex(int a, char *ptr) { snprintf(ptr, 9, "%08x", (unsigned int)a); }
int main() {
char hex[10] = {0};
num_to_hex(-50, hex);
printf("%s\n", hex);
return 0;
}
Output:
ffffffce
I'm coding a C function that allows me to transform any float or double into a string containing 32 bits of 0s and 1s (according to IEEE754 standard). I'm not going to make use of printf as the objective is to understand the way it works and be able to store the string.
I took the calculus method from this video: https://www.youtube.com/watch?v=8afbTaA-gOQ. It enabled me to deconstruct the floats into 1 bit for the sign, 8 bits for the exponent and 23 bits for the mantissa.
I'm getting some pretty good results, but my converter is still not accurate, and my mantissa is often wrong in the last bits. The method I use to calculate the mantissa is (where strnew is just a malloc of the appropriate length):
char *ft_double_decimals(double n, int len)
{
char *decimals;
int i;
if (!(decimals = ft_strnew(len)))
return (NULL);
i = 0;
while (i < len)
{
n = n * 2;
decimals[i++] = (n >= 1) ? '1' : '0';
n = n - (int)n;
}
return (decimals);
}
For a float such as 0.1 I get this mantissa: 1001 1001 1001 1001 1001 100 where I should get 1001 1001 1001 1001 1001 101. This is so frustrating! I'm obviously missing something here, and I guess it has something to do with a wrong approximation of the decimals, so if someone knows what method I should use instead of the one I'm using I'll be very grateful!
my mantissa is often wrong in the last bits.
When the conversion is incomplete, results should be rounded. @Eric Postpischil
The below rounds half-way cases away from zero.
char *ft_double_decimals(double n, int len) {
char *decimals;
int i;
if (!(decimals = ft_strnew(len)))
return (NULL);
i = 0;
while (i < len) {
n = n * 2;
decimals[i++] = (n >= 1) ? '1' : '0';
n = n - (int) n;
}
// Add rounding code
if (n >= 0.5) {
int carry = 1;
while (i > 0) {
i--;
int sum = decimals[i] - '0' + carry;
decimals[i] = sum % 2 + '0';
carry = sum / 2;
}
if (i == 0 && carry > 0) {
// Rounding carried into the "one's" digit
// TBD code to indicate to the caller that event
}
}
return (decimals);
}
int main(void) {
printf("%s\n", ft_double_decimals(0.1f, 23)); // --> 00011001100110011001101
return 0;
}
A more common rounding: round half-way cases to nearest even.
if (n >= 0.5 && (n > 0.5 || ((i > 0) && decimals[i-1] > '0'))) {
Further, calling code needs to be informed when the rounded result is "1.00000..."
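Putting both remarks together, the following sketch rounds half-way cases to even and reports a carry into the ones digit through its return value. It is an illustration only: frac_bits_round_even() is a hypothetical name, it writes into a caller-supplied buffer instead of using ft_strnew(), and it assumes 0 <= n < 1.

```c
/* Hypothetical helper: writes len binary digits of the fraction n
   (0 <= n < 1) into out, rounded to nearest with ties to even.
   out must have room for len + 1 chars.
   Returns 1 if rounding carried into the ones digit, else 0. */
int frac_bits_round_even(double n, int len, char *out)
{
    int i = 0;
    int carry = 0;

    while (i < len) {
        n = n * 2;
        out[i++] = (n >= 1) ? '1' : '0';
        n = n - (int)n;
    }
    out[len] = '\0';

    /* Round up when the discarded part is above one half, or exactly
       one half with an odd last digit. */
    if (n > 0.5 || (n == 0.5 && len > 0 && out[len - 1] == '1'))
        carry = 1;

    /* Propagate the carry towards the most significant digit. */
    for (i = len - 1; i >= 0 && carry; i--) {
        if (out[i] == '1') {
            out[i] = '0';
        } else {
            out[i] = '1';
            carry = 0;
        }
    }
    return carry;
}
```

With n = 0.75 and len = 1 the digits become 0 and the function returns 1, which tells the caller that the rounded value is 1.0.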
Short story: I made a program that does addition for binary integers. I need to make it work for binary real numbers, e.g. 1010.1010 (binary) = 10.625 (decimal).
The input is given as a binary string.
I made a lot of attempts and I couldn't find a simple way to do it. Please help create such a program.
Example: {input: 1010.1010(10.625 in decimal) 0.1(0.5 in decimal)
output: 1011.001 (11.125 in decimal)}
Code:
#include <stdio.h>
#include <string.h>
void bin_add(int c[400], int d[400])
{
int car[400]; //carry
int i = 199;
car[i] = 0;
while (i >= 0)
{
//find carry and shift it left
//find the sum
car[i - 1] = (c[i] & d[i]) | (c[i] & car[i]) | (d[i] & car[i]);
c[i] = (c[i] ^ d[i]) ^ car[i];
printf("car[i-1]=%d c[i]=%d\n", car[i - 1], c[i]);
i--;
}
// printf("\n");
}
int main()
{
int l, l1, i;//l and l1 are lengths
char a[200], b[200]; //a and b are the inputs
int c[200], d[200]; //c and d are used for processing
for (i = 0; i < 200; i++)
{
c[i] = 0;
d[i] = 0;
}
gets(a);
gets(b);
l = strlen(a);
l1 = strlen(b);
for (int i = 0; i < l; i++)
{
c[200 - l + i] = a[i] - 48;
}
////////////////////////////////////////////
for (int i = 0; i < l1; i++)
{
d[200 - l1 + i] = b[i] - 48;
}
////////////////////////////////
bin_add(c, d);
for (i = 0; i < 200; i++)
printf("%d", c[i]);
return 0;
}
What you really want to do, is handle each digit in order of increasing importance. To make that easier, you should implement the following functions:
/* Return the number of fractional bits in bs */
int bs_fractbits(const char *bs);
/* Return the number of integer bits in bs */
int bs_intbits(const char *bs);
/* Return the bit in bs corresponding to value 2**i,
0 if outside the bit string */
int bs_bit(const char *bs, int i);
/* Return -1 if bs is negative,
0 if bs is zero or NULL,
+1 if bs is positive */
int bs_sign(const char *bs);
/* Return -1 if bs1 < bs2,
0 if bs1 == bs2,
+1 if bs1 > bs2. */
int bs_cmp(const char *bs1, const char *bs2);
To support negative values, you'll need to implement both addition and subtraction (of "unsigned" bit strings):
Addition: The result has as many fractional bits as the term that has most fractional bits, and possibly one more integer bit than the term that has most integer bits. Start at the least significant bit in either term, and work your way up to the most significant bit in either term, summing each bit, and keeping the "carry bit" along, just like you'd do by hand. If the carry is nonzero at end, you'll get that one additional bit.
Subtraction: Always subtract smaller from larger. If that changes the order of the terms, negate the result. The result has at most as many fractional bits as the term that has most fractional bits, and at most as many integer bits as the term that has most integer bits. This is just like addition, except you subtract the bits, and instead of "carry bit", you use a "borrow bit". Because you subtract smaller unsigned value from larger unsigned value, the "borrow bit" will be zero at end.
Multiplication: The result has as many integer bits, and as many fractional bits, as the terms have in total (summed). You can implement the operation as if multiplying two unsigned integer values, and just insert the binary point at the end (so that the result has as many fractional bits as the input terms have in total). This usually involves a double loop, just like in long multiplication by hand.
Note that the same logic also works if you use larger radix instead of 2. The "carry"/"borrow" is a digit, between zero and one less than the radix.
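The subtraction rule with its borrow bit can be sketched directly on plain strings. This is a minimal illustration under stated assumptions: both inputs are unsigned binary strings containing only 0, 1 and at most one point, leading zeros of the result are not trimmed, and bit_at() and bin_sub() are hypothetical names.

```c
#include <string.h>

/* Digit of s at position i (weight 2^i); 0 when outside the string. */
static int bit_at(const char *s, int i)
{
    const int idigits = (int)strcspn(s, ".");
    const int fdigits = s[idigits] ? (int)strlen(s + idigits + 1) : 0;

    if (i >= 0)
        return (i < idigits) ? s[idigits - 1 - i] - '0' : 0;
    return (-i <= fdigits) ? s[idigits - i] - '0' : 0;
}

/* out = a - b for unsigned binary strings, assuming a >= b.
   out must hold max integer digits + 1 + max fractional digits + 1 chars.
   Returns 0 on success, or -1 if a borrow is left over (a < b). */
int bin_sub(const char *a, const char *b, char *out)
{
    const int ia = (int)strcspn(a, "."), ib = (int)strcspn(b, ".");
    const int fa = a[ia] ? (int)strlen(a + ia + 1) : 0;
    const int fb = b[ib] ? (int)strlen(b + ib + 1) : 0;
    const int idigits = (ia > ib) ? ia : ib;
    const int fdigits = (fa > fb) ? fa : fb;
    char *point = out + idigits;        /* where the binary point goes */
    int borrow = 0;

    /* Work from the least significant position upwards. */
    for (int i = -fdigits; i < idigits; i++) {
        int diff = bit_at(a, i) - bit_at(b, i) - borrow;
        borrow = (diff < 0);
        if (borrow)
            diff += 2;
        if (i < 0)
            point[-i] = (char)('0' + diff);
        else
            point[-i - 1] = (char)('0' + diff);
    }
    if (fdigits > 0) {
        point[0] = '.';
        point[fdigits + 1] = '\0';
    } else
        point[0] = '\0';
    return borrow ? -1 : 0;
}
```

For instance, bin_sub("1011.001", "0.1", out) leaves 1010.101 in out, that is 11.125 - 0.5 = 10.625.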
Personally, I'd be very tempted to use a structure to describe each digit string:
typedef struct {
int idigits; /* Number of integral digits before point */
int fdigits; /* Number of fractional digits after point */
int size; /* Number of chars dynamically allocated at data */
char *point; /* Location of decimal point */
char *data; /* Dynamically allocated buffer */
} digitstring;
#define DIGITSTRING_INIT { 0, 0, 0, NULL, NULL }
with an additional flag if negative digit strings are to be supported.
Digit D with numerical value D×B^i, where B is the radix (number of unique digits used) and i is the position of said digit, is located at point[-i] if i < 0 (and -i <= fdigits), or at point[-i-1] if i >= 0 (and i < idigits). point[0] itself is where the decimal point is, if there is one.
For example, if we have string 0100.00, then idigits = 4, fdigits = 2, and the only nonzero digit is at position 2. (Position 0 is on the left side of the decimal point, and -1 on the right side.)
size and data fields allow reuse of the dynamically allocated buffer. Each declaration of a digitstring must be initialized, digitstring value = DIGITSTRING_INIT;, because there is no initialization function; this way you are less likely to leak memory (unless you forget to free a digitstring when no longer needed):
/* Free the specified digit string. */
static inline void digitstring_free(digitstring *ds)
{
if (ds) {
if (ds->data)
free(ds->data);
ds->idigits = 0;
ds->fdigits = 0;
ds->size = 0;
ds->point = NULL;
ds->data = NULL;
}
}
To use the digit string as a C string, you use a helper function to obtain the pointer to the most significant digit in the digit string:
/* Return a pointer to a printable version of the digit string. */
static const char *digitstring_str(const digitstring *ds, const char *none)
{
if (ds && ds->point)
return (const char *)(ds->point - ds->idigits);
else
return none;
}
I've found that rather than crash, it is often useful to pass an extra parameter that is only used for the return value when the return value is otherwise undefined. For example, if you have an initialized digit string foo without any contents, then digitstring_str(&foo, "0") returns the string literal "0".
The main point of the digit string structure is to have accessor functions that get and set each individual digit:
/* Get the value of a specific digit. */
static inline unsigned int digitstring_get(const digitstring *ds, const int position)
{
if (ds) {
if (position < 0) {
if (-position <= ds->fdigits)
return digit_to_value(ds->point[-position]);
else
return 0;
} else {
if (position < ds->idigits)
return digit_to_value(ds->point[-position-1]);
else
return 0;
}
} else
return 0;
}
/* Set the value of a specific digit. */
static inline void digitstring_set(digitstring *ds, const int position, const unsigned int value)
{
if (!ds) {
fprintf(stderr, "digitstring_set(): NULL digitstring specified.\n");
exit(EXIT_FAILURE);
} else
if (position < 0) {
if (-position > ds->fdigits) {
fprintf(stderr, "digitstring_set(): Digit position underflow (in fractional part).\n");
exit(EXIT_FAILURE);
}
ds->point[-position] = value_to_digit(value);
} else {
if (position >= ds->idigits) {
fprintf(stderr, "digitstring_set(): Digit position overflow (in integer part).\n");
exit(EXIT_FAILURE);
}
ds->point[-position-1] = value_to_digit(value);
}
}
Above, value_to_digit() is a helper function that converts a numerical value to the corresponding character, and digit_to_value() converts a character to the corresponding numerical value.
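The answer does not show these helpers, nor the RADIX and decimal_point names used by the other functions. One way they might look is sketched below, assuming a radix of at most 10 so that digits are consecutive characters starting at '0'. Returning RADIX for an invalid character is my own convention, not something the answer specifies.

```c
#define RADIX 2                       /* assumed: binary digit strings */

static const char decimal_point = '.';

/* Convert a numerical value (0 .. RADIX-1) to its digit character.
   Out-of-range values map to '?' (an assumption made for this sketch). */
static inline char value_to_digit(unsigned int value)
{
    return (value < RADIX) ? (char)('0' + value) : '?';
}

/* Convert a digit character to its numerical value.
   Returns RADIX for characters that are not valid digits
   (also an assumption made for this sketch). */
static inline unsigned int digit_to_value(int c)
{
    if (c >= '0' && c < '0' + RADIX)
        return (unsigned int)(c - '0');
    return RADIX;
}
```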
All operations (from parsing to arithmetic operators) really need a "constructor", that creates a new digit string with sufficient number of digits. (The number of digits is known beforehand for each operation, and depends only on the number of significant digits in the terms.) For this, I created a function that constructs a zero of desired size:
/* Clear the specified digit string to zero. */
static inline void digitstring_zero(digitstring *ds, int idigits, int fdigits)
{
int size;
char *data;
if (!ds) {
fprintf(stderr, "digitstring_zero(): No digitstring specified.\n");
exit(EXIT_FAILURE);
}
/* Require at least one integral digit. */
if (idigits < 1)
idigits = 1;
if (fdigits < 0)
fdigits = 0;
/* Total number of chars needed, including decimal point
and string-terminating nul char. */
size = idigits + 1 + fdigits + 1;
/* Check if dynamically allocated buffer needs resizing. */
if (ds->size < size) {
if (ds->data)
data = realloc(ds->data, size);
else
data = malloc(size);
if (!data) {
fprintf(stderr, "digitstring_zero(): Out of memory.\n");
exit(EXIT_FAILURE);
}
ds->data = data;
ds->size = size;
} else {
data = ds->data;
size = ds->size;
}
/* Fill it with zeroes. */
memset(data, value_to_digit(0), idigits + 1 + fdigits);
/* Pad the unused space with nul chars, terminating the string. */
memset(data + idigits + 1 + fdigits, '\0', size - idigits - 1 - fdigits);
/* Assign the decimal point. */
ds->point = data + idigits;
/* If there are no decimals, no need for a decimal point either. */
if (fdigits > 0)
ds->point[0] = decimal_point;
else
ds->point[0] = '\0';
/* After setting the desired digits, use digitstring_trim(). */
ds->idigits = idigits;
ds->fdigits = fdigits;
}
It will ensure the digit string has enough room for the specified number of digits, reallocating its dynamically allocated buffer if necessary, reusing it if already large enough.
The idea is that to implement an operation, you first find out the maximum number of integral and fractional digits the result can have. You use the above to create the result digit string, then digitstring_set() to set each digit to their respective values. You will typically operate in increasing digit significance, which means increasing digit "positions".
If we have additional helper functions int digits(const char *src), which returns the number of consecutive valid digit characters starting at src, and int decimal_points(const char *src), which returns 1 if src points to a decimal point, and 0 otherwise, we can parse input strings into digit strings using
/* Parse a string into a digit string, returning the pointer
to the first unparsed character, or NULL if an error occurs. */
static const char *digitstring_parse(digitstring *ds, const char *src)
{
const int zero = value_to_digit(0);
const char *idigit, *fdigit;
int idigits, fdigits, fextra, n;
/* Fail if nothing to parse. */
if (!src)
return NULL;
/* Skip leading whitespace. */
while (isspace((unsigned char)(*src)))
src++;
/* Fail if nothing to parse. */
if (*src == '\0')
return NULL;
/* Scan integer digits. */
idigit = src;
src += digits(src);
idigits = (int)(src - idigit);
/* Decimal point? */
fextra = 0;
n = decimal_points(src);
if (n > 0) {
src += n;
/* Scan fractional digits. */
fdigit = src;
src += digits(src);
fdigits = (int)(src - fdigit);
if (fdigits < 1)
fextra = 1;
} else {
fdigit = src;
fdigits = 0;
}
/* No digits? */
if (idigits == 0 && fdigits == 0)
return NULL;
/* Trim leading zeroes. */
while (idigits > 1 && *idigit == zero) {
idigits--;
idigit++;
}
/* Trim trailing zeroes. */
while (fdigits > 1 && fdigit[fdigits - 1] == zero)
fdigits--;
/* Create the necessary digit string, */
digitstring_zero(ds, idigits, fdigits + fextra);
/* copy the integer digits, if any, */
if (idigits > 0)
memcpy(ds->point - idigits, idigit, idigits);
/* and the fractional digits, if any. */
if (fdigits > 0)
memcpy(ds->point + 1, fdigit, fdigits);
/* Return a pointer to the first unparsed character. */
return src;
}
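The two helpers that the parser relies on, digits() and decimal_points(), are described but not shown. A minimal sketch of how they could be written follows, assuming a radix of at most 10, a '.' as the only accepted decimal point, and a non-NULL src.

```c
#define RADIX 2   /* assumed: binary digit strings */

/* Number of consecutive valid digit characters starting at src. */
int digits(const char *src)
{
    int n = 0;

    while (src[n] >= '0' && src[n] < '0' + RADIX)
        n++;
    return n;
}

/* 1 if src points to a decimal point, 0 otherwise. */
int decimal_points(const char *src)
{
    return (src[0] == '.') ? 1 : 0;
}
```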
After updating its digits, one can call a helper function to remove any extra leading zeroes:
static inline void digitstring_ltrim(digitstring *ds)
{
if (ds && ds->point) {
const int zero = value_to_digit(0);
while (ds->idigits > 1 && ds->point[-ds->idigits] == zero)
ds->idigits--;
}
}
Adding two (unsigned) digit strings, possibly reusing one of the terms, is now quite simple to implement:
static void digitstring_add(digitstring *to, const digitstring *src1, const digitstring *src2)
{
digitstring result = DIGITSTRING_INIT;
unsigned int carry = 0;
int i, idigits, fdigits;
if (!to || !src1 || !src2) {
fprintf(stderr, "digitstring_add(): NULL digitstring specified.\n");
exit(EXIT_FAILURE);
}
/* For addition, the result has as many digits
as the longer source term. */
idigits = (src1->idigits >= src2->idigits) ? src1->idigits : src2->idigits;
fdigits = (src1->fdigits >= src2->fdigits) ? src1->fdigits : src2->fdigits;
/* Result needs possibly one more integer digit,
in case carry overflows. */
digitstring_zero(&result, idigits + 1, fdigits);
/* Addition loop, in order of increasing digit significance. */
for (i = -fdigits; i < idigits; i++) {
const unsigned int sum = digitstring_get(src1, i)
+ digitstring_get(src2, i)
+ carry;
digitstring_set(&result, i, sum % RADIX);
carry = sum / RADIX;
}
digitstring_set(&result, idigits, carry);
/* Trim leading zeroes. */
digitstring_ltrim(&result);
/* At this point, we can discard the target, even if it is actually
one of the sources, and copy the result to it. */
digitstring_free(to);
*to = result;
}
where RADIX is the radix used (the number of unique digits, 2 for binary). Pay extra attention to the digit loop. -fdigits is the least significant position in the result, and idigits-1 the most significant position. We need the accessor functions, because the source terms might not contain those digits at all (they are logically zero then).
These functions have been tested to work on both binary and octal number strings. I like this implementation, because it omits the decimal point if all terms are integers (so you get 12 + 33 = 45), but (due to fextra in digitstring_parse()) if any of the terms have a decimal point, then the result will have at least one fractional digit (so 12. + 33 = 45.0).
After all the beautiful ideas in Animals' answer, I felt the strange urge to present my own solution:
#include <stdbool.h>
#include <stddef.h>
#include <string.h>
#include <stdlib.h>
#include <stdio.h>
#include <errno.h>
#define MAX(x, y) ((x) > (y) ? (x) : (y))
size_t gpp(char const *s)
{
char *n = strchr(s, '.');
return n ? n - s + 1 : 0;
}
char* bin_add(char const *a, char const *b)
{
char const *inp[] = { a, b };
size_t ll[] = { strlen(a), strlen(b) };
size_t pp[] = { gpp(a), gpp(b) };
size_t OO[2], off[2];
for (size_t i = 0; i < 2; ++i) {
OO[i] = pp[i] ? pp[i] - 1 : ll[i];
pp[i] = pp[i] ? ll[i] - pp[i] : 0;}
for (size_t i = 0; i < 2; ++i)
off[i] = OO[i] < OO[!i] ? OO[!i] - OO[i] : 0;
size_t ML = MAX(OO[0], OO[1]) + MAX(pp[0], pp[1]) + (!!pp[0] || !!pp[1]);
char *Ol = calloc(ML + 2, 1);
if(!Ol) return Ol;
char ops[2];
int xc = 0;
size_t lO = ML;
unsigned cc[2] = { 0 };
for (size_t i = ML; i; --i) {
bool pt = false;
for (size_t l = 0; l < 2; ++l) {
ops[l] = i <= ll[l] + off[l] && i - off[l] - 1
< ll[l] ? inp[l][i - off[l] - 1] : '0';
if (ops[l] == '.') {
if (cc[l]) {
free(Ol);
return NULL;
}
pt = true;
++cc[l];
}
ops[l] -= '0';
}
if (pt) {
Ol[i] = '.';
continue;
}
if ((Ol[i] = ops[0] + ops[1] + xc) > 1) {
Ol[i] = 0;
xc = 1;
}
else xc = 0;
lO = (Ol[i] += '0') == '1' ? i : lO;
}
if((Ol[0] = '0' + xc) == '1') return Ol;
for (size_t i = 0; i <= ML - lO + 1; ++i)
Ol[i] = Ol[lO + i];
return Ol;
}
int main(void)
{
char a[81], b[81];
while (scanf(" %80[0.1] %80[0.1]", a, b) == 2) {
char *c = bin_add(a, b);
if (!c && errno == ENOMEM) {
fputs("Not enough memory :(\n\n", stderr);
return EXIT_FAILURE;
}
else if (!c) {
fputs("Input error :(\n\n", stderr);
goto clear;
}
char* O[] = { a, b, c };
size_t lO[3], Ol = 0;
for (size_t i = 0; i < 3; ++i) {
lO[i] = gpp(O[i]);
lO[i] = lO[i] ? lO[i] : strlen(i[O]) + 1;
Ol = lO[i] > Ol ? lO[i] : Ol;
}
putchar('\n');
for (size_t i = 0; i < 3; ++i) {
for (size_t l = 0; l < Ol - lO[i]; ++l, putchar(' '));
puts(O[i]);
}
putchar('\n');
free(c);
clear :{ int c; while ((c = getchar()) != '\n' && c != EOF); }
}
}
Sample Output:
11001001 .11001001
11001001
.11001001
11001001.11001001
11001001 1010
11001001
1010
11010011
111111 1
111111
1
1000000
1010101 010111001.0101110101010
1010101
010111001.0101110101010
100001110.0101110101010
1001001.010111010101 10100101100.10010111101
1001001.010111010101
10100101100.10010111101
10101110101.111000001111
. .
.
.
0
.. .
Input error :(
A
Press any key to continue . . .
I contemplated whether I should ask for a review at codereview. But I think I should rather not.
There are two answers, depending upon whether you desire fixed-point or floating-point arithmetic.
The first issue is reading the number. strtol() is your friend here:
char input[BUFFER_SIZE];
char * tmp;
long integral, fractional;
fgets(input, sizeof input, stdin);
integral = strtol(input, &tmp, 2); /* Read BINARY integral part */
tmp++; /* Pass over the binary point. You may want to check that it is actually a dot */
fractional = strtol(tmp, NULL, 2); /* Read BINARY fractional part */
The next issue is figuring out how you will do the arithmetic. fractional must be bit-shifted an amount depending on how many digits past the point were provided and your desired precision. Fixed point arithmetic is simple: fractional <<= FRAC_BITS - strlen(tmp) then add the fractional parts together. Mask by ((1<<FRAC_BITS)-1) for the fractional part of the sum, shift the remaining bits and add them to the integral parts for the integral part of the sum. Floating-point is a little more finicky, but not too much harder.
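The fixed-point route can be sketched as follows. This is an illustration under stated assumptions rather than the answer's own code: FRAC_BITS is fixed at 16, the strings are parsed by hand instead of with strtol() so that leading zeros after the point keep their weight, inputs are unsigned, and parse_fixed() and format_fixed() are hypothetical names.

```c
#include <stdint.h>

#define FRAC_BITS 16                                  /* assumed precision */
#define FRAC_MASK ((UINT64_C(1) << FRAC_BITS) - 1)

/* Parse an unsigned binary string such as "1010.101" into fixed point.
   Fractional digits beyond FRAC_BITS are dropped; the integer part is
   assumed to fit in 48 bits. Returns 0 on success, -1 on a bad character. */
int parse_fixed(const char *s, uint64_t *out)
{
    uint64_t ipart = 0, fpart = 0;
    int fbits = 0;

    while (*s == '0' || *s == '1')
        ipart = (ipart << 1) | (uint64_t)(*s++ - '0');
    if (*s == '.') {
        s++;
        while (*s == '0' || *s == '1') {
            if (fbits < FRAC_BITS) {
                fpart = (fpart << 1) | (uint64_t)(*s - '0');
                fbits++;
            }
            s++;
        }
    }
    if (*s != '\0')
        return -1;
    fpart <<= (FRAC_BITS - fbits);     /* align to the fixed-point scale */
    *out = (ipart << FRAC_BITS) | fpart;
    return 0;
}

/* Write v as a binary string without trailing fractional zeros.
   dst must hold at least 48 + 1 + FRAC_BITS + 1 chars. */
void format_fixed(uint64_t v, char *dst)
{
    uint64_t ipart = v >> FRAC_BITS;
    uint64_t fpart = v & FRAC_MASK;    /* mask off the fractional part */
    char tmp[64];
    int n = 0;

    do {                               /* integer digits, least significant first */
        tmp[n++] = (char)('0' + (ipart & 1));
        ipart >>= 1;
    } while (ipart);
    while (n > 0)
        *dst++ = tmp[--n];
    if (fpart) {
        *dst++ = '.';
        while (fpart) {                /* fractional digits, most significant first */
            fpart <<= 1;
            *dst++ = (fpart >> FRAC_BITS) ? '1' : '0';
            fpart &= FRAC_MASK;
        }
    }
    *dst = '\0';
}
```

With both terms parsed to the same scale, the sum is a plain integer addition: parsing 1010.1010 and 0.1, adding the two uint64_t values and formatting the result gives 1011.001.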
For real numbers, convert the integer part and the fractional part to numeric values, do the addition, and print the result as binary. This requires a function that converts a number to a binary string. Note that real numbers in C are floating-point types, stored in binary as a mantissa and an exponent: the value is the mantissa multiplied by 2 raised to the exponent (for example, 1.25 × 2^3 = 10).
This is a follow-up to my original post. But I'll repeat it for clarity:
As per DICOM standard, a type of floating point can be stored using a Value Representation of Decimal String. See Table 6.2-1. DICOM Value Representations:
Decimal String: A string of characters representing either a fixed
point number or a floating point number. A fixed point number shall
contain only the characters 0-9 with an optional leading "+" or "-"
and an optional "." to mark the decimal point. A floating point number
shall be conveyed as defined in ANSI X3.9, with an "E" or "e" to
indicate the start of the exponent. Decimal Strings may be padded with
leading or trailing spaces. Embedded spaces are not allowed.
"0"-"9", "+", "-", "E", "e", "." and the SPACE character of Default
Character Repertoire. 16 bytes maximum
The standard is saying that the textual representation is fixed point vs. floating point. The standard only refers to how the values are represented within the DICOM data set itself. As such, there is no requirement to load a fixed point textual representation into a fixed-point variable.
So now that it is clear that the DICOM standard implicitly recommends double (IEEE 754-1985) for representing a Value Representation of type Decimal String (maximum of 16 significant digits), my question is: how do I use the standard C I/O library to convert this binary representation from memory back into ASCII within this size-limited string?
According to various sources on the internet, this is non-trivial, but a generally accepted solution is either:
printf("%1.16e\n", d); // Round-trippable double, always with an exponent
or
printf("%.17g\n", d); // Round-trippable double, shortest possible
Of course, both expressions are unusable in my case, since they can produce output much longer than my maximum of 16 bytes. So what is the solution for minimizing the loss of precision when writing an arbitrary double value to a string limited to 16 bytes?
Edit: if this is not clear, I am required to follow the standard. I cannot use hex/uuencode encoding.
Edit 2: I am running the comparison using travis-ci.
So far the suggested codes are:
Serge Ballesta
chux
Mark Dickinson
chux
Results I see over here are:
compute1.c leads to a total sum error of: 0.0095729050923877828
compute2.c leads to a total sum error of: 0.21764383725715469
compute3.c leads to a total sum error of: 4.050031792674619
compute4.c leads to a total sum error of: 0.001287056579548422
So compute4.c gives the best precision (a total error of 0.001287056579548422, the smallest of the four), but triples the overall execution time (only tested in debug mode using the time command).
It is trickier than first thought.
Given the various corner cases, it seems best to try at a high precision and then work down as needed.
Any negative number prints the same as a positive number with 1 less precision due to the '-'.
A '+' sign is not needed at the beginning of the string nor after the 'e'.
The '.' is not needed.
It is dangerous to use anything other than sprintf() to do the mathematical part, given so many corner cases. Given various rounding modes, FLT_EVAL_METHOD, etc., leave the heavy coding to well-established functions.
When an attempt is too long by more than 1 character, iterations can be saved. E.g. if an attempt with precision 14 resulted in a width of 20, there is no need to try precision 13 and 12; just go to 11.
Scaling of the exponent due to the removal of the '.' must be done after sprintf() to avoid 1) injecting computational error and 2) decrementing a double to below its minimum exponent.
Maximum relative error is less than 1 part in 2,000,000,000 as with -1.00000000049999e-200. Average relative error about 1 part in 50,000,000,000.
14 digit precision, the highest, occurs with numbers like 12345678901234e1 so start with 16-2 digits.
#include <assert.h>
#include <math.h>
#include <stdio.h>
#include <string.h>
static size_t shrink(char *fp_buffer) {
int lead, expo;
long long mant;
int n0, n1;
int n = sscanf(fp_buffer, "%d.%n%lld%ne%d", &lead, &n0, &mant, &n1, &expo);
assert(n == 3);
return sprintf(fp_buffer, "%d%0*llde%d", lead, n1 - n0, mant,
expo - (n1 - n0));
}
int x16printf(char *dest, size_t width, double value) {
if (!isfinite(value)) return 1;
if (width < 5) return 2;
if (signbit(value)) {
value = -value;
strcpy(dest++, "-");
width--;
}
int precision = width - 2;
while (precision > 0) {
char buffer[width + 10];
// %.*e prints 1 digit, '.' and then `precision - 1` digits
snprintf(buffer, sizeof buffer, "%.*e", precision - 1, value);
size_t n = shrink(buffer);
if (n <= width) {
strcpy(dest, buffer);
return 0;
}
if (n > width + 1) precision -= n - width - 1;
else precision--;
}
return 3;
}
Test code
#include <stdlib.h>
double rand_double(void) {
union {
double d;
unsigned char uc[sizeof(double)];
} u;
do {
for (size_t i = 0; i < sizeof(double); i++) {
u.uc[i] = rand();
}
} while (!isfinite(u.d));
return u.d;
}
void x16printf_test(double value) {
printf("%-27.*e", 17, value);
char buf[16+1];
buf[0] = 0;
int y = x16printf(buf, sizeof buf - 1, value);
printf(" %d\n", y);
printf("'%s'\n", buf);
}
int main(void) {
for (int i = 0; i < 10; i++)
x16printf_test(rand_double());
}
Output
-1.55736829786841915e+118 0
'-15573682979e108'
-3.06117209691283956e+125 0
'-30611720969e115'
8.05005611774356367e+175 0
'805005611774e164'
-1.06083057094522472e+132 0
'-10608305709e122'
3.39265065244054607e-209 0
'33926506524e-219'
-2.36818580315246204e-244 0
'-2368185803e-253'
7.91188576978592497e+301 0
'791188576979e290'
-1.40513111051994779e-53 0
'-14051311105e-63'
-1.37897140950449389e-14 0
'-13789714095e-24'
-2.15869805640288206e+125 0
'-21586980564e115'
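To check how much precision a shortened string actually loses, one can parse it back and compare with the original value. The sketch below does that with strtod(); the helper name `roundtrip_rel_error` is hypothetical, and it assumes the default "C" locale and a non-zero original value.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdlib.h>

/* Relative error introduced by writing `original` as `text`.
   `roundtrip_rel_error` is an illustrative name; original must be non-zero. */
double roundtrip_rel_error(const char *text, double original) {
    assert(text != NULL && original != 0.0);
    double parsed = strtod(text, NULL);   /* accepts forms such as 15573682979e108 */
    return fabs((parsed - original) / original);
}
```

Applied to the first output pair above ('-15573682979e108' against -1.55736829786841915e+118), the result stays well below the 1 part in 2,000,000,000 bound quoted earlier.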
For finite floating point values the printf() format specifier "%e" well matches
"A floating point number shall be ... with an "E" or "e" to indicate the start of the exponent"
[−]d.ddd...ddde±dd
The sign is present with negative numbers and likely -0.0. The exponent is at least 2 digits.
If we assume DBL_MAX < 1e1000, (safe for IEEE 754-1985 double), then the below works in all cases: 1 optional sign, 1 lead digit, '.', 8 digits, 'e', sign, up to 3 digits.
(Note: the "16 bytes maximum" does not seem to refer to C string null character termination. Adjust by 1 if needed.)
// Room for 16 printable characters.
char buf[16+1];
int n = snprintf(buf, sizeof buf, "%.*e", 8, x);
assert(n >= 0 && n < sizeof buf);
puts(buf);
But this reserves room for the optional sign and 2 to 3 exponent digits.
The tricky part is that the boundary between numbers that use 2 exponent digits and those that use 3 is fuzzy because of rounding. Even when testing for negative numbers, -0.0 is an issue.
[Edit] A test for very small numbers is also needed.
Candidate:
// Room for 16 printable characters.
char buf[16+1];
assert(isfinite(x)); // for now, only address finite numbers
int precision = 8+1+1;
if (signbit(x)) precision--; // Or simply `if (x <= 0.0) precision--;`
if (fabs(x) >= 9.99999999e99) precision--; // some refinement possible here.
else if (fabs(x) <= 1.0e-99) precision--;
int n = snprintf(buf, sizeof buf, "%.*e", precision, x);
assert(n >= 0 && n < sizeof buf);
puts(buf);
Additional concerns:
Some compilers print at least 3 exponent digits.
The maximum number of significant decimal digits needed for an IEEE 754-1985 double depends on the definition of need, but is likely about 15-17. See: Printf width specifier to maintain precision of floating-point value
Candidate 2: One time test for too long an output
// Room for N printable characters.
#define N 16
char buf[N+1];
assert(isfinite(x)); // for now, only address finite numbers
int precision = N - 2 - 4; // 1.xxxxxxxxxxe-dd
if (signbit(x)) precision--;
int n = snprintf(buf, sizeof buf, "%.*e", precision, x);
if (n >= (int) sizeof buf) {
// Keep the arithmetic in int: the precision argument for '*' must be an int.
n = snprintf(buf, sizeof buf, "%.*e", precision - (n - (int) sizeof buf) - 1, x);
}
assert(n >= 0 && n < sizeof buf);
puts(buf);
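Candidate 2 could be wrapped in a function roughly as follows. This is only a sketch: the names `format_ds` and `DS_MAX_CHARS` are made up here, and non-finite input is rejected with -1 instead of asserting.

```c
#include <assert.h>
#include <math.h>
#include <stddef.h>
#include <stdio.h>

#define DS_MAX_CHARS 16  /* printable characters allowed; hypothetical name */

/* Write x in "%e" style using at most DS_MAX_CHARS characters plus '\0'.
   buf must hold DS_MAX_CHARS + 1 bytes. Returns the length, or -1 on failure. */
int format_ds(char *buf, double x) {
    assert(buf != NULL);
    if (!isfinite(x)) return -1;
    int precision = DS_MAX_CHARS - 2 - 4;          /* d.xxxxxxxxxxe-dd */
    if (signbit(x)) precision--;                   /* room for the '-' */
    int n = snprintf(buf, DS_MAX_CHARS + 1, "%.*e", precision, x);
    if (n > DS_MAX_CHARS) {
        /* Too long (e.g. a 3-digit exponent): drop the excess and retry once. */
        precision -= n - DS_MAX_CHARS;
        n = snprintf(buf, DS_MAX_CHARS + 1, "%.*e", precision, x);
    }
    return (n >= 0 && n <= DS_MAX_CHARS) ? n : -1;
}
```

With a library that prints at least 2 exponent digits, 1.0 comes out as 1.0000000000e+00 and the most negative finite double as -1.79769313e+308, both exactly 16 characters.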
The C library formatter has no direct format for your requirement. At a simple level, if you can accept the characters wasted by the standard %g format (e20 is written e+020: 2 chars wasted), you can:
generate the output for the %.17g format
if it is longer than 16 characters, compute the precision that would lead to 16
generate the output for that format.
Code could look like:
void encode(double f, char *buf) {
char line[40];
char format[8];
int prec;
int l;
l = sprintf(line, "%.17g", f);
if (l > 16) {
prec = 33 - strlen(line);
l = sprintf(line, "%.*g", prec, f);
while(l > 16) {
/* putc('.', stdout);*/
prec -=1;
l = sprintf(line, "%.*g", prec, f);
}
}
strcpy(buf, line);
}
If you really want to be optimal (meaning write e30 instead of e+030), you could use the %1.16e format and post-process the output. Rationale (for positive numbers):
the %1.16e format allows you to separate the mantissa and the exponent (base 10)
if the exponent is between size-2 (included) and size (excluded): just correctly round the mantissa to its integer part and display it
if the exponent is between 0 and size-2 (both included): display the rounded mantissa with the dot correctly placed
if the exponent is between -1 and -3 (both included): start with a dot, add any needed zeros and fill with the rounded mantissa
else use an e format with minimal size for the exponent part and fill with the rounded mantissa
Corner cases:
for negative numbers, put a starting - and add the display for the opposite number and size-1
rounding : if first rejected digit is >=5, increase preceding number and iterate if it was a 9. Process 9.9999999999... as a special case rounded to 10
Possible code:
void clean(char *mant) {
char *ix = mant + strlen(mant) - 1;
while(('0' == *ix) && (ix > mant)) {
*ix-- = '\0';
}
if ('.' == *ix) {
*ix = '\0';
}
}
int add1(char *buf, int n) {
if (n < 0) return 1;
if (buf[n] == '9') {
buf[n] = '0';
return add1(buf, n-1);
}
else {
buf[n] += 1;
}
return 0;
}
int doround(char *buf, unsigned int n) {
char c;
if (n >= strlen(buf)) return 0;
c = buf[n];
buf[n] = 0;
if ((c >= '5') && (c <= '9')) return add1(buf, n-1);
return 0;
}
int roundat(char *buf, unsigned int i, int iexp) {
if (doround(buf, i) != 0) {
iexp += 1;
switch(iexp) {
case -2:
strcpy(buf, ".01");
break;
case -1:
strcpy(buf, ".1");
break;
case 0:
strcpy(buf, "1.");
break;
case 1:
strcpy(buf, "10");
break;
case 2:
strcpy(buf, "100");
break;
default:
sprintf(buf, "1e%d", iexp);
}
return 1;
}
return 0;
}
void encode(double f, char *buf, int size) {
char line[40];
char *mant = line + 1;
int iexp, lexp, i;
char exp[6];
if (f < 0) {
f = -f;
size -= 1;
*buf++ = '-';
}
sprintf(line, "%1.16e", f);
if (line[0] == '-') {
f = -f;
size -= 1;
*buf++ = '-';
sprintf(line, "%1.16e", f);
}
*mant = line[0];
i = strcspn(mant, "eE");
mant[i] = '\0';
iexp = strtol(mant + i + 1, NULL, 10);
lexp = sprintf(exp, "e%d", iexp);
if ((iexp >= size) || (iexp < -3)) {
i = roundat(mant, size - 1 -lexp, iexp);
if(i == 1) {
strcpy(buf, mant);
return;
}
buf[0] = mant[0];
buf[1] = '.';
strncpy(buf + i + 2, mant + 1, size - 2 - lexp);
buf[size-lexp] = 0;
clean(buf);
strcat(buf, exp);
}
else if (iexp >= size - 2) {
roundat(mant, iexp + 1, iexp);
strcpy(buf, mant);
}
else if (iexp >= 0) {
i = roundat(mant, size - 1, iexp);
if (i == 1) {
strcpy(buf, mant);
return;
}
strncpy(buf, mant, iexp + 1);
buf[iexp + 1] = '.';
strncpy(buf + iexp + 2, mant + iexp + 1, size - iexp - 1);
buf[size] = 0;
clean(buf);
}
else {
int j;
i = roundat(mant, size + 1 + iexp, iexp);
if (i == 1) {
strcpy(buf, mant);
return;
}
buf[0] = '.';
for(j=0; j< -1 - iexp; j++) {
buf[j+1] = '0';
}
if ((i == 1) && (iexp != -1)) {
buf[-iexp] = '1';
buf++;
}
strncpy(buf - iexp, mant, size + 1 + iexp);
buf[size] = 0;
clean(buf);
}
}
I think your best option is to use printf("%.17g\n", d); to generate an initial answer and then trim it. The simplest way to trim it is to drop digits from the end of the mantissa until it fits. This actually works very well but will not minimize the error because you are truncating instead of rounding to nearest.
A better solution would be to examine the digits to be removed, treating them as an n-digit number between 0.0 and 1.0, so '49' would be 0.49. If their value is less than 0.5 then just remove them. If their value is greater than 0.50 then increment the printed value in its decimal form. That is, add one to the last digit, with wrap-around and carry as needed. Any trailing zeroes that are created should be trimmed.
The only time this becomes a problem is if the carry propagates all the way to the first digit and overflows it from 9 to zero. This might be impossible, but I don't know for sure. In this case (+9.99999e17) the answer would be +1e18, so as long as you have tests for that case you should be fine.
So, print the number, split it into sign/mantissa strings and an exponent integer, and string manipulate them to get your result.
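A minimal sketch of the "round the digit string with carry" step might look like this. The function name `round_digits` is illustrative, and it rounds half up, which is one possible reading of the rule above (the exactly-0.5 case is not specified there). It works on the mantissa digits only, with sign and '.' already removed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Round the decimal digit string `digits` (e.g. "99996") to its first
   `keep` digits, in place. Returns 1 when the carry ran past the first
   digit: the buffer then holds "100..." and the caller must add 1 to the
   exponent. Returns 0 otherwise. */
int round_digits(char *digits, size_t keep) {
    size_t len = strlen(digits);
    assert(keep > 0);
    if (keep >= len) return 0;             /* nothing to drop */
    int round_up = digits[keep] >= '5';    /* dropped part is >= 0.5 */
    digits[keep] = '\0';
    if (!round_up) return 0;
    size_t i = keep;
    while (i > 0) {
        i--;
        if (digits[i] != '9') { digits[i]++; return 0; }
        digits[i] = '0';                   /* 9 wraps to 0, carry continues */
    }
    digits[0] = '1';                       /* all nines: 99..9 becomes 10..0 */
    return 1;
}
```

For instance "12349" kept to 4 digits becomes "1235", while "99996" kept to 4 digits becomes "1000" with a return value of 1, which is the +9.99999e17 to +1e18 situation described above. Trimming trailing zeros is left to the caller.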
Printing in decimal cannot work because for some numbers a 17 digit mantissa is needed which uses up all of your space without printing the exponent. To be more precise, printing a double in decimal sometimes requires more than 16 characters to guarantee accurate round-tripping.
Instead you should print the underlying binary representation using hexadecimal. This will use exactly 16 bytes, assuming that a null-terminator isn't needed.
If you want to print the results using fewer than 16 bytes then you can basically uuencode it. That is, use a base larger than 16 (more than 16 distinct digit characters) so that you can squeeze more bits into each character. If you use 64 different characters (six bits each) then a 64-bit double can be printed in eleven characters. Not very readable, but tradeoffs must be made.
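For reference, the hexadecimal variant mentioned above can be sketched as follows. It assumes double is a 64-bit IEEE 754 type, and `double_to_hex16` is an illustrative name. Note that the characters a-f are outside the Decimal String repertoire quoted in the question, so this does not satisfy the DICOM constraint.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Write the raw bit pattern of d as exactly 16 hex digits plus '\0'.
   Assumes sizeof(double) == 8 (IEEE 754 binary64). */
void double_to_hex16(double d, char out[17]) {
    uint64_t bits;
    assert(sizeof bits == sizeof d);
    memcpy(&bits, &d, sizeof bits);        /* well-defined way to read the bits */
    snprintf(out, 17, "%016" PRIx64, bits);
}
```

With this sketch, 1.0 is written as 3ff0000000000000, and reading it back is lossless because no decimal conversion takes place.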
Following this question:
If you dont have %f in C, how to write a C program to print decimal number upto 2 decimal places without %f?
This is my attempt:
#include <stdio.h>
#include <math.h>
void dbl2str(char *s, double number, int decimals)
{
double integral, fractional;
int n, i;
fractional = modf(number, &integral);
n = sprintf(s, "%d%c", (int)integral, decimals ? '.' : 0);
for (i = 0; i < decimals; i++) {
fractional *= 10;
s[n + i] = '0' + (int)fractional;
fractional = modf(fractional, &integral);
}
s[n + i] = '\0';
}
int main(void)
{
char s[32];
dbl2str(s, 3.1416, 4);
printf("%s\n", s);
dbl2str(s, 3.159, 4);
printf("%s\n", s);
dbl2str(s, 3.04, 2);
printf("%s\n", s);
return 0;
}
Output:
3.1415
3.1589
3.04
As you can see there are rounding errors. Is there a way to get the correct output?
The problem is that 0.0006 isn't exactly representable in binary; instead, the stored value will be something like 0.0005999... This means that when you do the multiplication by 10 four times you get 5 as your last digit instead of 6. You need to look at the next digit in your sequence and round appropriately (the next digit will be greater than 5 in this case).
Be careful with the rounding, because rounding up the last digit may cause the one before it to round up as well (if the last digit was a 9, the carry propagates).
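One way to avoid handling the carry digit by digit is to scale the whole value, round it once, and then print the integer and fractional parts as integers. This is a different technique from the digit loop in the question, shown only as a sketch: `dbl2str_rounded` is a made-up name, and it assumes the scaled value fits in a long long.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format `number` with `decimals` digits after the point, rounded half up
   on the magnitude, without using %f. Assumes |number| * 10^decimals fits
   in a long long. */
void dbl2str_rounded(char *s, size_t size, double number, int decimals) {
    assert(s != NULL && size > 0 && decimals >= 0);
    long long scale = 1;
    for (int i = 0; i < decimals; i++) scale *= 10;
    double mag = number < 0 ? -number : number;
    /* Adding 0.5 and truncating rounds the non-negative scaled value. */
    long long scaled = (long long)(mag * (double)scale + 0.5);
    const char *sign = (number < 0 && scaled != 0) ? "-" : "";
    if (decimals > 0)
        snprintf(s, size, "%s%lld.%0*lld", sign,
                 scaled / scale, decimals, scaled % scale);
    else
        snprintf(s, size, "%s%lld", sign, scaled);
}
```

Because the carry is resolved in integer arithmetic, a value such as 3.1416 with 4 decimals ends in 6 rather than 5, and 3.159 with 4 decimals ends in 90.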
The line
s[n + i] = '0' + (int)fractional;
truncates the fractional part to an integer. You want the last digit to be correctly rounded, so you have to treat it differently from the others:
void dbl2str(char *s, double number, int decimals)
{
double integral, fractional;
int n, i;
fractional = modf(number, &integral);
if (fractional < 0)
fractional = -fractional;
n = sprintf(s, "%d%c", (int)integral, decimals ? '.' : 0);
for (i = 0; i < decimals-1; i++) {
fractional *= 10;
s[n++] = '0' + (int)fractional;
fractional = modf(fractional, &integral);
}
fractional *= 10;
s[n++] = '0' + (int)(fractional+0.5f);
fractional = modf(fractional, &integral);
s[n] = '\0';
}
I added a test for negative numbers as well. Without it, the fractional part goes wild.
I would do something like
void dbl2str(char *s, double number, int decimals)
{
double integral, fractional;
int n, i;
/* use an epsilon and add the correct rounding value */
double eps= 1e-9;
double round = 0.5*pow(10,-decimals);
fractional = modf(number+round+eps, &integral);
n = sprintf(s, "%d%c", (int)integral, decimals ? '.' : 0);
for (i = 0; i < decimals; i++) {
fractional *= 10;
s[n + i] = '0' + (int)fractional;
fractional = modf(fractional, &integral);
}
s[n + i] = '\0';
}
This way, I add 0.5*10^-decimals to get the (int) fractional to round correctly up to an epsilon.
It works for your input:
3.1416
3.1590
3.04
Of course, you may need to adjust the epsilon, depending on the digits, and then think about the case where you have exhausted your 16 digit precision. Overall, this is very hard to do right.
Add this:
if(i == decimals - 1)
fractional = round(fractional);
after fractional *= 10;