Numbers bigger than 8 bytes in C

I am writing some code to deal with numbers in C which are bigger than 8 bytes in size (they don't fit into an unsigned long). In this example I will use 16 bytes (128 bits) as the width. The numbers are unsigned integers (no decimal places). They are stored as an array of unsigned chars, e.g.:
unsigned char n[16];
I have managed to get addition to work. It behaves like an unsigned number in C, so if you had 0xffffffffffffffffffffffffffffffff (2**128 - 1) and added 1, you would get 0. However, I cannot get subtraction to work. I assumed the code would be similar to addition, but I haven't been able to make it work.
Addition code:
//a and b are numbers
unsigned char *add(unsigned char *a, unsigned char *b){
unsigned char *c = malloc(NUM_SIZE);
//d is the carry and c is the output number
unsigned short d = 0;
if(!c){
return NULL;
}
for(int i = 0; i < NUM_SIZE; i++){
c[i] = 0;
}
for(int i = NUM_SIZE * 2 - 1; i >= 0; i--){
d += a[i % NUM_SIZE] + b[i % NUM_SIZE];
c[i % NUM_SIZE] = d % 256;
d >>= 8;
}
return c;
}
NUM_SIZE is defined as 16 (the width of the number in bytes)
What I have tried:
//changing the signs to minuses
d -= a[i % NUM_SIZE] - b[i % NUM_SIZE];
//changing the some signs to minuses
d -= a[i % NUM_SIZE] + b[i % NUM_SIZE];
//or
d += a[i % NUM_SIZE] - b[i % NUM_SIZE];
//looping through the number backwards
for(int i = 0; i < NUM_SIZE * 2; i++)

Just an idea (not compiled):
void not( unsigned char* a, unsigned int n )
{
for ( unsigned int i = 0; i < n; ++i )
a[i] = ~a[i];
}
void inc( unsigned char* a, unsigned int n )
{
for ( unsigned int i = 0; i < n; ++i )
if ( ++a[i] )
return;
}
void add( unsigned char* c, unsigned char* a, unsigned char* b, unsigned int n )
{
for ( unsigned int i = 0, r = 0; i < n; ++i )
c[i] = r = a[i] + b[i] + ( r >> 8 );
}
void sub( unsigned char* c, unsigned char* a, unsigned char* b, unsigned int n )
{
not( b, n );
add( c, a, b, n );
not( b, n ); // revert
inc( c, n );
}

You may want to use arbitrary-precision arithmetic, a.k.a. bigint or bignum. You should use a library for that (because bignum algorithms are very clever and use some assembler code). I recommend GMPlib.

NUM_SIZE * 2 does not make sense with malloc(NUM_SIZE); ... for(int i = NUM_SIZE * 2 - 1. Only a loop of NUM_SIZE iterations is needed.
Repaired code
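#include <limits.h> // UCHAR_MAX
#include <stdlib.h> // malloc
#define NUM_SIZE 16 // 16 bytes = 128 bits, as in the question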
//a - b
unsigned char *sub(const unsigned char *a, const unsigned char *b) {
unsigned char *c = malloc(NUM_SIZE);
if (!c) {
return NULL;
}
// zeroing `c[]` not needed. Retain that code if desired
int d = 0; // Use signed accumulator to save the "borrow"
// drop *2
for (int i = NUM_SIZE - 1; i >= 0; i--) {
d += a[i] - b[i]; // Perform the subtraction
c[i] = d; // Save the 8 least significant bits in c[]
d = (d - c[i]) / (UCHAR_MAX+1); // Form the "borrow" for the next loop
}
// If d<0 at this point, b was greater than a
return c;
}
Various performance improvements can be made, but it is important to get the functionality correct first.
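To check the borrow handling on its own, here is a minimal sketch of the same loop that writes into a caller-supplied buffer instead of calling malloc. The name sub_into and the returned borrow flag are assumptions made for illustration; they are not part of the answer above.

```c
#include <assert.h>

#define NUM_SIZE 16 /* width in bytes, as in the question */

/* Computes c = a - b (mod 2^128) on big-endian byte arrays.
   Returns 1 if a borrow left the top byte (b > a), else 0. */
int sub_into(unsigned char *c, const unsigned char *a, const unsigned char *b)
{
    int borrow = 0;
    assert(c && a && b);
    for (int i = NUM_SIZE - 1; i >= 0; i--) {
        int d = a[i] - b[i] - borrow;             /* range -256..255 */
        borrow = d < 0;                           /* borrow from the next byte */
        c[i] = (unsigned char)(d + (borrow << 8)); /* add 256 back if we borrowed */
    }
    return borrow;
}
```

Subtracting 1 from 0 with this sketch should give sixteen 0xFF bytes and a returned borrow of 1, which mirrors the wrap-around the question describes for addition.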

Numbers have a "base" that determines the range of each digit (e.g. "base 10" is decimal).
One uint8_t is a single digit in "base 256". One uint16_t is a single digit in "base 65536". One uint32_t is a single digit in "base 4294967296".
For mathematical operations, performance is heavily affected by the number of digits. By using a larger base you need fewer digits for the same number, which improves performance (until you exceed the CPU's native word size).
For subtraction of unsigned numbers:
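#include <stdint.h>
#define DIGITS 4
// result = src1 - src2, least significant digit first; returns the final borrow
int subtract(uint32_t *result, uint32_t *src1, uint32_t *src2) {
    int carry = 0;
    int oldCarry;
    int i;
    for(i = 0; i < DIGITS; i++) {
        oldCarry = carry;
        if(src1[i] < src2[i]) {
            carry = 1; // borrow: this digit of src1 is too small
        } else if( (src1[i] == src2[i]) && (oldCarry != 0) ) {
            carry = 1; // equal digits, but the incoming borrow underflows
        } else {
            carry = 0;
        }
        result[i] = src1[i] - src2[i] - oldCarry;
    }
    return carry;
}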
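For comparison, addition in base 2^32 can lean on a 64-bit accumulator so the carry falls out of a shift. This is only a sketch under the same least-significant-digit-first layout as the subtraction routine; the name add_digits is an assumption, not something from the answer.

```c
#include <assert.h>
#include <stdint.h>

#define DIGITS 4 /* 4 x 32 bits = 128 bits, least significant digit first */

/* result = src1 + src2 (mod 2^128); returns the carry out of the top digit. */
int add_digits(uint32_t *result, const uint32_t *src1, const uint32_t *src2)
{
    uint64_t acc = 0;
    assert(result && src1 && src2);
    for (int i = 0; i < DIGITS; i++) {
        acc += (uint64_t)src1[i] + src2[i]; /* at most 2^33 - 1, no overflow */
        result[i] = (uint32_t)acc;          /* low 32 bits are this digit */
        acc >>= 32;                         /* what remains is the carry */
    }
    return (int)acc;
}
```

Using an accumulator twice as wide as a digit is a common design choice: it avoids the explicit comparisons needed when the digit type is already the widest type available.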

Your compiler may provide a built-in 128-bit type such as __int128_t. If it does not, you can define a struct with hi and lo members of the biggest type you have. In C++ you can also add operators similar to the ones you know from the other int_t types.
typedef struct uint128 {
uint64_t lo, hi; // lo comes first if you want to use little-endian else hi comes first
} uint128_t;
If you want to double the size, you use uint128_t in a similar struct.
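As a rough illustration of arithmetic on such a two-word struct in plain C, the sketch below adds and subtracts with wrap-around. The type name u128 and the function names are hypothetical, chosen to avoid clashing with the uint128_t defined above.

```c
#include <assert.h>
#include <stdint.h>

typedef struct u128 {
    uint64_t lo, hi;
} u128;

/* a + b modulo 2^128 */
u128 u128_add(u128 a, u128 b)
{
    u128 r;
    r.lo = a.lo + b.lo;
    r.hi = a.hi + b.hi + (r.lo < a.lo); /* unsigned wrap signals a carry */
    return r;
}

/* a - b modulo 2^128 */
u128 u128_sub(u128 a, u128 b)
{
    u128 r;
    r.lo = a.lo - b.lo;
    r.hi = a.hi - b.hi - (a.lo < b.lo); /* borrow when the low half wraps */
    return r;
}
```

Because unsigned arithmetic in C is defined to wrap, comparing the result with an operand is enough to detect the carry or borrow without any wider type.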
Edit:
A simple function to increment the uint128_t (it takes a reference, so it is C++; in C, pass a pointer instead):
uint128_t& uint128_increase(uint128_t& value) {
// increment the low part; it wraps to 0 on overflow,
// in which case the carry goes into hi
if (!(++value.lo)) {
++value.hi;
}
return value;
}
Edit:
A version whose size is chosen at run time. I use machine words because they are faster to access in memory:
typedef struct uint_dynamic {
// the length as a multiple of the wordsize
size_t length;
size_t* words;
} uint_dynamic_t;
uint_dynamic_t& uint_dynamic_increase(uint_dynamic_t& value) {
size_t* ptr = value.words; size_t i = value.length;
while(i && !(++*ptr)) { ++ptr; --i; };
return value;
};
Or if you want some constant size, put it clearly into a struct.
#define uint_fixed_SIZE (16 / sizeof(size_t))
typedef struct uint_fixed {
size_t words[uint_fixed_SIZE];
} uint_fixed_t;
uint_fixed_t& uint_fixed_increase(uint_fixed_t& value) {
size_t* ptr = value.words; size_t i = uint_fixed_SIZE;
while(i && !(++*ptr)) { ++ptr; --i; };
return value;
};
This can be rewritten as a #define macro, where you replace the specific values by a parameter. Similar functionality can be had by defining the specific values and then including a file:
File fixed_int.h
// note that here is no #ifndef FILE_H or #pragma once
// to reuse the file
#define _concat1(a, b) a ## b
#define _concat(a, b) _concat1(a, b)
#define _size ((fixed_int_size + sizeof(size_t) * 8 - 1) / (sizeof(size_t) * 8)) // bits -> words, rounded up
#ifndef fixed_int_name
#define _name _concat(uint, fixed_int_size)
#else
#define _name fixed_int_name
#endif
#define _name_(member) _concat(_concat(_name, _), member)
typedef struct _name {
size_t words[_size];
} _name_(t);
_name_(t)& _name_(increase)(_name_(t)& value) {
size_t* ptr = value.words; size_t i = _size;
while(i && !(++*ptr)) { ++ptr; --i; };
return value;
};
// undef all defines!
#undef _concat1
#undef _concat
#undef _size
#undef _name
#undef _name_
File my_ints.h
//...
// the following lines define the type uint128_t and the function uint128_t& uint128_increase(uint128_t&)
#define fixed_int_name uint128 // is optional
#define fixed_int_size 128
#include"fixed_int.h"
#undef fixed_int_size
#undef fixed_int_name
//...

Related

How to convert large HEX string to INT in C

I have large hex strings whose values, as integers, can be more than 10^30. I need to sum three hex strings and remove the last 12 digits of the result.
Example hex strings: "000000000000000000000000bd4c61f945644cf099d41ab8a0ab2ac5d2533835", "000000000000000000000000000000000000000000000000f32f5908b7f3c000", "00000000000000000000000000000000000000000000000000e969cd49be4000". I need to sum them and get the result as an integer. Thank you.
I wrote two small functions and they work, but I think they could be better, and they don't convert to a normal integer.
// convert hex to unsigned char decimal
unsigned char div10(unsigned char *hex, unsigned size)
{
unsigned rem = 0;
for(int i = 0; i < size; i++)
{
unsigned n = rem * 256 + hex[i];
hex[i] = n / 10;
rem = n % 10;
}
return rem;
}
unsigned char hex_to_dec_summer(char *local){
unsigned char result[32]={0};
unsigned char output[18]={};
char input[64];
strcpy(input, local);
unsigned char hexnr[sizeof(input)/2]={};
for (int i=0; i<sizeof(input)/2; i++) {
sscanf(&input[i*2], "%02xd", &hexnr[i]);
}
unsigned char hexzero[32] = {0};
unsigned i = 0;
while(memcmp(hexnr, hexzero, sizeof(hexnr)) != 0 && i < sizeof(result))
{
result[sizeof(result) - i - 1] = div10(hexnr, sizeof(hexnr));
i++;
}
printf("\n");
for(unsigned j = 0; j < sizeof output; j++)
{
output[j]=result[j];
printf("%d", output[j]);
}
output[18]='\0';
}
I know how to do it in Python 3: int(hex_number, 16)/(10**12), or something like that, but I need it in C.
The reason this sort of thing works so easily in Python is that, unusually, Python supports arbitrary-precision integers natively.
Most languages, including C, use fixed sizes for their native types. To perform arbitrary-precision arithmetic, you generally need a separate library, such as GMP.
Here is a basic example of using GMP to solve your problem:
#include <stdio.h>
#include <gmp.h>
char *inputs[] = {
"000000000000000000000000bd4c61f945644cf099d41ab8a0ab2ac5d2533835",
"000000000000000000000000000000000000000000000000f32f5908b7f3c000",
"00000000000000000000000000000000000000000000000000e969cd49be4000"
};
int main()
{
char outstr[100];
mpz_t x; mpz_init(x);
mpz_t y; mpz_init(y);
mpz_t sum; mpz_init(sum);
mpz_t ten; mpz_init_set_si(ten, 10);
mpz_t fac; mpz_init(fac);
mpz_pow_ui(fac, ten, 12); /* fac = 10**12 */
int i;
for(i = 0; i < 3; i++) {
mpz_set_str(x, inputs[i], 16);
mpz_tdiv_q(y, x, fac);
mpz_add(sum, sum, y); /* sum += x / fac */
}
printf("%s\n", mpz_get_str(outstr, 10, sum));
}
The code is a bit verbose, because arbitrary-precision integers (that is, variables of type mpz_t) have nontrivial memory allocation requirements, and everything you do with them requires explicit function calls. (Working with extended types like this would be considerably more convenient in a language with good support for object-oriented programming, like C++.)
To compile this, you'll need to have GMP installed. On my machine, I used
cc testprog.c -lgmp
When run, this program prints
1080702647035076263416932216315997551
Or, if I changed 10 to 16 in the last line, it would print d022c1183a2720991b1fea332a6d6f.
It will make a slight difference whether you divide by 10^12 and then sum, or sum and then divide. To sum and then divide, you could get rid of the line mpz_tdiv_q(y, x, fac) inside the loop, change mpz_add(sum, sum, y) to mpz_add(sum, sum, x), and add the line
mpz_tdiv_q(sum, sum, fac);
outside the loop, just before printing.
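The difference between the two orders comes from truncation, and it shows up even with ordinary 64-bit integers. The sketch below is not GMP code; the function names and the small sample values are assumptions chosen only to make the effect visible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Divide each term first, then add the truncated quotients. */
uint64_t divide_then_sum(const uint64_t *v, size_t n, uint64_t div)
{
    uint64_t total = 0;
    assert(div != 0);
    for (size_t i = 0; i < n; i++)
        total += v[i] / div;
    return total;
}

/* Add first, then divide once (assumes the sum fits in 64 bits). */
uint64_t sum_then_divide(const uint64_t *v, size_t n, uint64_t div)
{
    uint64_t total = 0;
    assert(div != 0);
    for (size_t i = 0; i < n; i++)
        total += v[i];
    return total / div;
}
```

For the values 1999 and 1 with a divisor of 1000, dividing first gives 1 + 0 = 1, while summing first gives 2000 / 1000 = 2. Each truncated division can lose up to one unit, so the two results can differ by at most the number of terms minus one.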
It's fairly straightforward to add up the (in this case hex) digits of two strings.
This doesn't try to be "optimal", but it does give a sum (as a string of hex digits). vals[0] acts as the accumulator.
When OP clarifies what is meant by "I need sum (3 hex string) and remove last 12 numbers", this answer could be extended.
If more speed is needed, the accumulator could be allocated and used as an array of uint8_t's (saving converting back to ASCII hex until a final total is available.) Also the LUT to convert ASCII hex to '0-F' could be 'binary' (not requiring the subtraction of ASCII character values.)
Anyway...
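#include <stdio.h>
#include <string.h> // strlen
// Writable arrays rather than pointers to string literals: addTo() modifies vals[0] in place
char vals[][65] = {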
"000000000000000000000000bd4c61f945644cf099d41ab8a0ab2ac5d2533835",
"000000000000000000000000000000000000000000000000f32f5908b7f3c000",
"00000000000000000000000000000000000000000000000000e969cd49be4000",
};
char *frmHex =
"................................................0000000000......"
".777777..........................WWWWWW.........................";
char *tohex = "0123456789ABCDEF";
void addTo( char *p0, char *p1 ) {
printf( " %s\n+ %s\n", p0, p1 );
char *px = p0 + strlen( p0 ) - 1;
char *py = p1 + strlen( p1 ) - 1;
for( int carry = 0; px >= p0 && py >= p1; px--, py-- ) {
int val = *px - frmHex[ *px ] + *py - frmHex[ *py ] + carry;
carry = val / 0x10; *px = tohex[ val % 0x10 ];
}
printf( "= %s\n\n", p0 );
}
int main() {
addTo( vals[ 0 ], vals[ 1 ] );
addTo( vals[ 0 ], vals[ 2 ] );
return 0;
}
Output
000000000000000000000000bd4c61f945644cf099d41ab8a0ab2ac5d2533835
+ 000000000000000000000000000000000000000000000000f32f5908b7f3c000
= 000000000000000000000000BD4C61F945644CF099D41AB993DA83CE8A46F835
000000000000000000000000BD4C61F945644CF099D41AB993DA83CE8A46F835
+ 00000000000000000000000000000000000000000000000000e969cd49be4000
= 000000000000000000000000BD4C61F945644CF099D41AB994C3ED9BD4053835
If this were to progress (and use binary accumulators), 'compaction' after summing would quickly lead into integer division (that could be done simply with shifting and repeated subtraction.) Anyway...
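If the accumulator is kept as a big-endian byte array, dividing by a small constant does not need shift-and-subtract; schoolbook long division one byte at a time is enough. The sketch below assumes that "remove last 12 numbers" means dividing by 10^12, as the Python expression in the question suggests. The name div_small is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Divides the big-endian byte array num[0..len-1] by div in place and
   returns the remainder. Requires div <= 2^56 so that rem * 256 + byte
   cannot overflow 64 bits. */
uint64_t div_small(unsigned char *num, size_t len, uint64_t div)
{
    uint64_t rem = 0;
    assert(div != 0 && div <= (UINT64_C(1) << 56));
    for (size_t i = 0; i < len; i++) {
        uint64_t cur = rem * 256 + num[i];   /* bring down the next byte */
        num[i] = (unsigned char)(cur / div); /* quotient byte, always < 256 */
        rem = cur % div;
    }
    return rem;
}
```

Calling div_small(bytes, 32, 1000000000000) on a 32-byte sum would leave the quotient in bytes and return the twelve dropped decimal digits as the remainder.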

Decimal to Binary on C library

I want to know if there is a function in the C library that converts a decimal number to binary and stores it digit by digit in the positions of an array.
For example: 2 -> 10 -> array [0] = 0 array[1] = 1.
Thanks.
here:
void dec2bin(int c)
{
int i = 0;
for(i = 31; i >= 0; i--){
if(((unsigned)c & (1u << i)) != 0){ /* unsigned shift: 1 << 31 overflows a signed int */
printf("1");
}else{
printf("0");
}
}
}
But this only prints the value of an integer in binary format. All data is represented in binary format internally anyway.
You did not define what is a decimal number for you. I am guessing it is character representation (e.g. in ASCII) of that number.
Notice that numbers are just numbers. Binary or decimal numbers do not exist, but a given number may have a binary, and a decimal, representation. Numbers are not made of digits!
Then you probably want sscanf(3), strtol(3) or atoi(3) to convert a string to an integer (e.g. an int or a long), and snprintf(3) to convert an integer to a string.
If you want to convert a number to a binary string (with only 0 or 1 char-s in it) you need to code that conversion by yourself. To convert a binary string to some long use strtol.
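Both directions can be sketched in a few lines. The helper names to_binary and from_binary are assumptions for illustration; only strtol is a standard library function here.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdlib.h>

/* Writes the binary digits of value into buf, most significant first and
   without leading zeros, then returns buf. size is the capacity of buf. */
char *to_binary(unsigned long value, char *buf, size_t size)
{
    char tmp[sizeof value * CHAR_BIT];
    size_t n = 0;
    do {                          /* collect digits least significant first */
        tmp[n++] = (char)('0' + (value & 1u));
        value >>= 1;
    } while (value != 0);
    assert(size > n);             /* room for the digits plus '\0' */
    for (size_t i = 0; i < n; i++)
        buf[i] = tmp[n - 1 - i];  /* reverse into the output buffer */
    buf[n] = '\0';
    return buf;
}

/* Parses a string of '0'/'1' characters with the standard strtol. */
long from_binary(const char *s)
{
    return strtol(s, NULL, 2);
}
```

With a 65-byte buffer, to_binary(42, buf, sizeof buf) produces "101010", and from_binary("101010") gives back 42.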
There is no such function in C standard library. Anyway, you can write your own:
void get_bin(int *dst, intmax_t x);
Where dst is the resulting array (with 1s and 0s), and x is the decimal number.
For example:
C89 version:
#include <limits.h>
void get_bin(int *dst, int x)
{
int i;
for (i = sizeof x * CHAR_BIT - 1; i >= 0; --i)
*dst++ = x >> i & 1;
}
C99 version:
/* C99 version */
#include <limits.h>
#include <stdint.h>
void get_bin(int *dst, intmax_t x)
{
for (intmax_t i = sizeof x * CHAR_BIT - 1; i >= 0; --i)
*dst++ = x >> i & 1;
}
It works as follows: we run through the binary representation of x, from left to right. The expression (sizeof x * CHAR_BIT - 1) gives the number of bits of x, minus 1. Then we get the value of each bit (*dst++ = x >> i & 1) and push it into the array.
Example usage:
#include <limits.h>
#include <stdio.h>
void get_bin(int *dst, int x)
{
int i;
for (i = sizeof x * CHAR_BIT - 1; i >= 0; --i)
*dst++ = x >> i & 1;
}
int main(void)
{
int buf[128]; /* binary number */
int n = 42; /* decimal number */
unsigned int i;
get_bin(buf, n);
for (i = 0; i < sizeof n * CHAR_BIT; ++i)
printf("%d", buf[i]);
return 0;
}
Here is a version that explicitly uses a string buffer:
#include <string.h>
const char *str2bin(int num, char buffer[], const int BUFLEN)
{
(void) memset(buffer, '\0', BUFLEN );
int i = BUFLEN - 1; /* Index into buffer, running backwards. */
int r = 0; /* Remainder. */
char *p = &buffer[i - 1]; /* buffer[i] holds string terminator '\0'. */
while (( i > 0 ) && ( num > 0 )) { /* i > 0 keeps p inside buffer */
r = num % 2;
num = num / 2;
*p = r + '0';
i--;
p--;
}
return (p+1);
}
You can use char * itoa ( int value, char * str, int base ); with base 2. Note that itoa is not part of the C standard library, although some compilers provide it.
the function should go like this:
int dec2bin(int n){
static int bin,osn=1,c;
if(n==0) return 0;
else {
c=n%2;
bin += c*osn;
osn*=10;
dec2bin(n/2);
}
return bin;
}
As far as I know there is no such function in any C library. But here's a recursive function that returns a binary representation of a decimal number as an int:
int dec2bin(int n)
{
if(n == 0) return 0;
return n % 2 + 10 * dec2bin(n / 2);
}
The max number that it can represent is 1023 (1111111111 in binary) because of int data type limit, but you can substitute int for long long data type to increase the range. Then, you can store the return value to array like this:
int array[100], i = 0;
int n = dec2bin(some_number);
do{
array[i] = n % 10;
n /= 10;
i++;
}while(n > 0); /* n > 10 would drop the most significant digit */
I know this is an old post, but i hope this will still help somebody!
If it helps, in C++ (not C) you can convert any decimal to binary using std::bitset, for example:
#include <iostream>
#include <bitset>
using namespace std;
int main(){
int decimal = 20;
bitset<5> binary20(decimal);
cout << binary20 << endl;
return 0;
}
So, you have an output like 10100. Bitsets also have a to_string() method for any purpose.

Bitwise Operations C on long hex Linux

Briefly: Question is related to bitwise operations on hex - language C ; O.S: linux
I would simply like to do some bitwise operations on a "long" hex string.
I tried the following:
First try:
I cannot use the following because of overflow:
long t1 = 0xabefffcccaadddddffff;
long t2 = 0xdeeefffffccccaaadacd;
Second try: Does not work because abcdef are interpreted as string instead of hex
char* t1 = "abefffcccaadddddffff";
char* t2 = "deeefffffccccaaadacd";
int len = strlen(t1);
for (int i = 0; i < len; i++ )
{
char exor = *(t1 + i) ^ *(t2 + i);
printf("%x", exor);
}
Could someone please let me know how to do this? thx
Bitwise operations are usually very easily extended to larger numbers.
The best way to do this is to split them up into 4 or 8 byte sequences, and store them as an array of uints. In this case you need at least 80 bits for those particular strings.
For AND it is pretty simple, something like:
unsigned int A[3] = { 0xabef, 0xffcccaad, 0xddddffff };
unsigned int B[3] = { 0xdeee, 0xfffffccc, 0xcaaadacd };
unsigned int R[3] = { 0 };
for (int b = 0; b < 3; b++) {
R[b] = A[b] & B[b];
}
A more full example including scanning hex strings and printing them:
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
typedef unsigned int uint;
void long_Print(int size, const uint a[]) {
printf("0x");
for (int i = 0; i < size; i++) {
printf(i == 0 ? "%x" : "%08x", a[i]); // pad all but the first word so inner zeros are kept
}
}
void long_AND(int size, const uint a[], const uint b[], uint r[]) {
for (int i = 0; i < size; i++) {
r[i] = a[i] & b[i];
}
}
// Reads a long hex string into r[0..size-1], most significant word first.
// Returns the number of words that received digits.
int long_Scan(int size, const char* str, uint r[]) {
    int len = (int)strlen(str);
    int ri = size - 1;
    for (int i = 0; i < size; i++) {
        r[i] = 0; // words that receive no digits stay zero
    }
    // Walk backwards from the end of the string in chunks of up to 8 hex digits.
    for (int end = len; end > 0 && ri >= 0; end -= 8, ri--) {
        int start = end >= 8 ? end - 8 : 0;
        char chunk[9] = { 0 }; // 8 digits plus the terminator
        memcpy(chunk, str + start, (size_t)(end - start));
        r[ri] = (uint)strtoul(chunk, NULL, 16);
    }
    return size - 1 - ri;
}
int main(int argc, char* argv[])
{
uint A[3] = { 0 };
uint B[3] = { 0 };
uint R[3] = { 0 };
long_Scan(3, "abefffcccaadddddffff", A);
long_Scan(3, "deeefffffccccaaadacd", B);
long_Print(3, A);
puts("\nAND");
long_Print(3, B);
puts("\n=");
long_AND(3, A, B, R);
long_Print(3, R);
getchar();
return 0;
}
You'll certainly need to use a library that can handle arbitrarily long integers. Consider using libgmp: http://gmplib.org/
Before you can do any sort of bitwise operations, you need to be working with integers. "abeffccc" is not an integer. It is a string. You need to use something like strtol to first convert the string to an integer.
If your values are too big to fit into a 64-bit long long int (0xFFFFFFFF,FFFFFFFF) then you'll need to use a Big Integer library, or something similar, to support arbitrarily large values. As H2CO3 mentioned, libgmp is an excellent choice for large numbers in C.
Instead of using unsigned long directly, you could try using an array of unsigned int. Each unsigned int holds 32 bits, or 8 hex digits. You would therefore have to chop-up your constant into chunks of 8 hex digits each:
unsigned int t1[3] = { 0xabef , 0xffcccaad , 0xddddffff };
Note that for sanity, you should store them in reverse order so that the first entry of t1 contains the lowest-order bits.
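Since the question was about XOR, here is a minimal sketch using that lowest-order-first layout with fixed-width uint32_t words. The function names wide_xor and wide_to_hex are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* r = a ^ b, word by word. Bitwise operations never carry between
   words, so the order only has to be the same for a and b. */
void wide_xor(uint32_t *r, const uint32_t *a, const uint32_t *b, size_t n)
{
    assert(r && a && b);
    for (size_t i = 0; i < n; i++)
        r[i] = a[i] ^ b[i];
}

/* Formats a least-significant-first array as one hex string, zero-padding
   every word to 8 digits. out needs room for 8 * n + 1 chars. */
void wide_to_hex(char *out, const uint32_t *v, size_t n)
{
    out[0] = '\0';
    for (size_t i = 0; i < n; i++)
        sprintf(out + 8 * i, "%08lx", (unsigned long)v[n - 1 - i]);
}
```

With t1 stored as { 0xddddffff, 0xffcccaad, 0xabef } and t2 as { 0xcaaadacd, 0xfffffccc, 0xdeee }, the XOR words come out as { 0x17772532, 0x00333661, 0x7501 }, which wide_to_hex renders as 000075010033366117772532.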

Porting MMX/SSE instructions to AltiVec

Let me preface this with.. I have extremely limited experience with ASM, and even less with SIMD.
But it happens that I have the following MMX/SSE optimised code, that I would like to port across to AltiVec instructions for use on PPC/Cell processors.
This is probably a big ask.. Even though it's only a few lines of code, I've had no end of trouble trying to work out what's going on here.
The original function:
static inline int convolve(const short *a, const short *b, int n)
{
int out = 0;
union {
__m64 m64;
int i32[2];
} tmp;
tmp.i32[0] = 0;
tmp.i32[1] = 0;
while (n >= 4) {
tmp.m64 = _mm_add_pi32(tmp.m64,
_mm_madd_pi16(*((__m64 *)a),
*((__m64 *)b)));
a += 4;
b += 4;
n -= 4;
}
out = tmp.i32[0] + tmp.i32[1];
_mm_empty();
while (n --)
out += (*(a++)) * (*(b++));
return out;
}
Any tips on how I might rewrite this to use AltiVec instructions?
My first attempt (a very wrong attempt) looks something like this.. But it's not entirely (or even remotely) correct.
static inline int convolve_altivec(const short *a, const short *b, int n)
{
int out = 0;
union {
vector unsigned int m128;
int i64[2];
} tmp;
vector unsigned int zero = {0, 0, 0, 0};
tmp.i64[0] = 0;
tmp.i64[1] = 0;
while (n >= 8) {
tmp.m128 = vec_add(tmp.m128,
vec_msum(*((vector unsigned short *)a),
*((vector unsigned short *)b), zero));
a += 8;
b += 8;
n -= 8;
}
out = tmp.i64[0] + tmp.i64[1];
while (n --)
out += (*(a++)) * (*(b++));
return out;
}
You're not far off - I fixed a few minor problems, cleaned up the code a little, added a test harness, and it seems to work OK now:
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <altivec.h>
static int convolve_ref(const short *a, const short *b, int n)
{
int out = 0;
int i;
for (i = 0; i < n; ++i)
{
out += a[i] * b[i];
}
return out;
}
static inline int convolve_altivec(const short *a, const short *b, int n)
{
int out = 0;
union {
vector signed int m128;
int i32[4];
} tmp;
const vector signed int zero = {0, 0, 0, 0};
assert(((unsigned long)a & 15) == 0);
assert(((unsigned long)b & 15) == 0);
tmp.m128 = zero;
while (n >= 8)
{
tmp.m128 = vec_msum(*((vector signed short *)a),
*((vector signed short *)b), tmp.m128);
a += 8;
b += 8;
n -= 8;
}
out = tmp.i32[0] + tmp.i32[1] + tmp.i32[2] + tmp.i32[3];
while (n --)
out += (*(a++)) * (*(b++));
return out;
}
int main(void)
{
const int n = 100;
vector signed short _a[n / 8 + 1];
vector signed short _b[n / 8 + 1];
short *a = (short *)_a;
short *b = (short *)_b;
int sum_ref, sum_test;
int i;
for (i = 0; i < n; ++i)
{
a[i] = rand();
b[i] = rand();
}
sum_ref = convolve_ref(a, b, n);
sum_test = convolve_altivec(a, b, n);
printf("sum_ref = %d\n", sum_ref);
printf("sum_test = %d\n", sum_test);
printf("%s\n", sum_ref == sum_test ? "PASS" : "FAIL");
return 0;
}
(Warning: all of my Altivec experience comes from working on Xbox360/PS3 - I'm not sure how different they are from other Altivec platforms).
First off, you should check your pointer alignment. Most vector loads (and stores) operations are expected to be from 16-byte aligned addresses. If they aren't, things will usually carry on without warning, but you won't get the data you were expecting.
It's possible (but slower) to do unaligned loads, but you basically have to read a bit before and after your data and combine them. See Apple's Altivec page. I've also done it before using an lvlx and lvrx load instructions, and then ORing them together.
Next up, I'm not sure your multiplies and adds are the same. I've never used either _mm_madd_pi16 or vec_msum, so I'm not positive they're equivalent. You should step through in a debugger and make sure they give you the same output for the same input data. Another possible difference is that they may treat overflow differently (e.g. modular vs. saturate).
Last but not least, you're computing 4 ints at a time instead of 2. So your union should hold 4 ints, and you should sum all 4 of them at the end.
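To see why four partial sums appear, it can help to model one multiply-sum step in scalar C. The sketch below is not AltiVec code: it assumes the step multiplies eight pairs of signed 16-bit lanes and adds each adjacent pair of products into one of four 32-bit accumulators. The names msum_model and convolve_model are hypothetical, and unlike the hardware instruction plain int arithmetic must not overflow, so it is only meant for small test values.

```c
#include <assert.h>

/* Scalar model of one multiply-sum step on eight signed 16-bit lanes:
   products are added pairwise into four 32-bit accumulators. */
void msum_model(int acc[4], const short a[8], const short b[8])
{
    for (int lane = 0; lane < 4; lane++)
        acc[lane] += a[2 * lane] * b[2 * lane]
                   + a[2 * lane + 1] * b[2 * lane + 1];
}

/* Dot product built on the model: 8 elements per step, scalar tail. */
int convolve_model(const short *a, const short *b, int n)
{
    int acc[4] = {0, 0, 0, 0};
    int out;
    assert(n >= 0);
    while (n >= 8) {
        msum_model(acc, a, b);
        a += 8;
        b += 8;
        n -= 8;
    }
    out = acc[0] + acc[1] + acc[2] + acc[3]; /* all four partial sums count */
    while (n--)
        out += *a++ * *b++;
    return out;
}
```

Summing only two of the four accumulators, as the MMX version does with its two 32-bit halves, would silently drop half of the products.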

Algorithm to convert infinitely long base 2^32 number to printable base 10

I'm representing an infinitely precise integer as an array of unsigned ints for processing on a GPU. For debugging purposes I'd like to print the base 10 representation of one of these numbers, but am having difficulty wrapping my head around it. Here's what I'd like to do:
//the number 4*(2^32)^2+5*(2^32)^1+6*(2^32)^0
unsigned int aNumber[3] = {4,5,6};
char base10TextRepresentation[50];
convertBase2To32ToBase10Text(aNumber,base10TextRepresentation);
Any suggestions on how to approach this problem?
Edit: Here's a complete implementation thanks to drhirsch
#include <string.h>
#include <stdio.h>
#include <stdint.h>
#define SIZE 4
uint32_t divideBy10(uint32_t * number) {
uint32_t r = 0;
uint32_t d;
for (int i=0; i<SIZE; ++i) {
d = (number[i] + r*0x100000000) / 10;
r = (number[i] + r*0x100000000) % 10;
number[i] = d;
}
return r;
}
int zero(uint32_t* number) {
for (int i=0; i<SIZE; ++i) {
if (number[i] != 0) {
return 0;
}
}
return 1;
}
void swap(char *a, char *b) {
char tmp = *a;
*a = *b;
*b = tmp;
}
void reverse(char *str) {
int x = strlen(str);
for (int y = 0; y < x/2; y++) {
swap(&str[y],&str[x-y-1]);
}
}
void convertTo10Text(uint32_t* number, char* buf) {
int n = 0;
do {
int digit = divideBy10(number);
buf[n++] = digit + '0';
} while(!zero(number));
buf[n] = '\0';
reverse(buf);
}
int main(int argc, char** argv) {
uint32_t aNumber[SIZE] = {0,0xFFFFFFFF,0xFFFFFFFF,0xFFFFFFFF};
uint32_t bNumber[4] = {1,0,0,0};
char base10TextRepresentation[50];
convertTo10Text(aNumber, base10TextRepresentation);
printf("%s\n",base10TextRepresentation);
convertTo10Text(bNumber, base10TextRepresentation);
printf("%s\n",base10TextRepresentation);
}
If you have access to 64 bit arithmetic, it is easier. I would do something along the line of:
uint32_t divideBy10(uint32_t* number) {
    uint32_t r = 0;
    uint32_t d;
    for (int i=0; i<SIZE; ++i) {
        d = (number[i] + r*0x100000000) / 10;
        r = (number[i] + r*0x100000000) % 10;
        number[i] = d;
    }
    return r; // the remainder is the next decimal digit
}
void convertTo10Text(uint32_t* number, char* buf) {
    char *p = buf;
    do {
        uint32_t digit = divideBy10(number);
        *p++ = digit + '0';
    } while (!isEqual(number, zero));
    *p = '\0';
    reverse(buf); // digits were produced least significant first
}
isEqual() and reverse() left to be implemented. divideBy10 divides by 10 and returns the remainder.
Fundamentally you need classic decimal printing using digit production by dividing your number by ten (in your base 2^32) repeatedly and using the remainder as digits. You may not have a divide by (anything, let alone) 10 routine, which is probably the key source of your problem.
If you are working in C or C++, you can get a complete arbitrary-precision arithmetic package from the GNU Multiple Precision Arithmetic Library (GMP). Most other widely used languages have similar packages available.
Of course, if you have too much free time, you can always implement multiprecision division yourself. You're already borrowing terminology from Knuth; he also supplies the multiprecision algorithms in Seminumerical Algorithms.
If it is .NET, take a look at this implementation of a BigInteger class.
How about using long doubles? On x86 the 80-bit extended format gives you a 64-bit mantissa, but accuracy is lost for larger values when using floating-point numbers.
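The loss of accuracy is easy to demonstrate with a plain double, which is assumed here to be an IEEE 754 binary64 with a 53-bit mantissa; long double behaves the same way, only with a platform-dependent and usually larger limit. The helper name fits_in_double is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if v survives a round trip through double unchanged. */
int fits_in_double(uint64_t v)
{
    double d = (double)v;
    /* Guard against d == 2^64, which cannot be converted back to uint64_t. */
    return d < 18446744073709551616.0 && (uint64_t)d == v;
}
```

Under that assumption, 2^53 round-trips exactly but 2^53 + 1 does not, so an integer wider than the mantissa cannot be stored without losing its low bits.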

Resources