What does "Unsigned modulo 256" mean in the context of image decoding - c

Because I'm masochistic I'm trying to write something in C to decode an 8-bit PNG file (it's a learning thing, I'm not trying to reinvent libpng...)
I've got to the point when the stuff in my deflated, unfiltered data buffer unmistakably resembles the source image (see below), but it's still quite, erm, wrong, and I'm pretty sure there's something askew with my implementation of the filtering algorithms. Most of them are quite simple, but there's one major thing I don't understand in the docs, not being good at maths or ever having taken a comp-sci course:
Unsigned arithmetic modulo 256 is used, so that both the inputs and outputs fit into bytes.
What does that mean?
If someone can tell me that I'd be very grateful!
For reference (and I apologise for the crappy C), my noddy implementation of the filtering algorithms described in the docs looks like:
unsigned char paeth_predictor (unsigned char a, unsigned char b, unsigned char c) {
// a = left, b = above, c = upper left
char p = a + b - c; // initial estimate
char pa = abs(p - a); // distances to a, b, c
char pb = abs(p - b);
char pc = abs(p - c);
// return nearest of a,b,c,
// breaking ties in order a,b,c.
if (pa <= pb && pa <= pc) return a;
else if (pb <= pc) return b;
else return c;
}
void unfilter_sub(char* out, char* in, int bpp, int row, int rowlen) {
for (int i = 0; i < rowlen; i++)
out[i] = in[i] + (i < bpp ? 0 : out[i-bpp]);
}
void unfilter_up(char* out, char* in, int bpp, int row, int rowlen) {
for (int i = 0; i < rowlen; i++)
out[i] = in[i] + (row == 0 ? 0 : out[i-rowlen]);
}
void unfilter_paeth(char* out, char* in, int bpp, int row, int rowlen) {
char a, b, c;
for (int i = 0; i < rowlen; i++) {
a = i < bpp ? 0 : out[i - bpp];
b = row < 1 ? 0 : out[i - rowlen];
c = i < bpp ? 0 : (row == 0 ? 0 : out[i - rowlen - bpp]);
out[i] = in[i] + paeth_predictor(a, b, c);
}
}
And the images I'm seeing:
Source: http://img220.imageshack.us/img220/8111/testdn.png
Output: http://img862.imageshack.us/img862/2963/helloworld.png

It means that, in the algorithm, whenever an arithmetic operation is performed, it is performed modulo 256, i.e. if the result is greater than 255 then it "wraps" around (256 becomes 0, 257 becomes 1, and so on). The result is that all values will always fit into 8 bits and not overflow.
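Unsigned types already behave this way by mandate, and if you use unsigned char (and a byte on your system is 8 bits, which it probably is), then any result you store back into an unsigned char is reduced modulo 256 automatically. Be aware that C promotes the operands to int first, so an intermediate expression such as a + b - c can be negative or exceed 255; the wrap only happens when the value is converted back to unsigned char.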
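As a minimal sketch of how this applies to the code in the question, assuming 8-bit bytes: the PNG specification requires the Paeth predictor itself to be computed exactly, without overflow, so its intermediate values go in an int, and only the final addition is reduced modulo 256 by storing into an unsigned char. The name unfilter_sub_row and its simplified parameter list are illustrative, not taken from the question.

```c
#include <stdlib.h>

/* a = left, b = above, c = upper left. The estimate p can range from
   -255 to 510, so it is held in an int rather than a char. */
unsigned char paeth_predictor(unsigned char a, unsigned char b, unsigned char c)
{
    int p  = (int)a + (int)b - (int)c;   /* initial estimate */
    int pa = abs(p - (int)a);            /* distances to a, b, c */
    int pb = abs(p - (int)b);
    int pc = abs(p - (int)c);
    if (pa <= pb && pa <= pc) return a;  /* ties broken in order a, b, c */
    if (pb <= pc) return b;
    return c;
}

/* Reverse the Sub filter for one row (hypothetical helper). */
void unfilter_sub_row(unsigned char *out, const unsigned char *in,
                      int bpp, int rowlen)
{
    for (int i = 0; i < rowlen; i++) {
        unsigned char left = (i < bpp) ? 0 : out[i - bpp];
        /* the cast performs the "modulo 256" step */
        out[i] = (unsigned char)(in[i] + left);
    }
}
```

With a = 200, b = 200, c = 100 the estimate is 300, which does not fit in a char; that is the kind of input on which a char-based predictor goes wrong.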

It means only the last 8 bits of the result are used. 2^8=256, so the last 8 bits of an unsigned value v are the same as (v%256).
For example, 2+255=257, or 100000001 in binary; the last 8 bits of 257 are 00000001, i.e. 1, and 257%256 is also 1.

In 'simple language' it means that you never go "out" of your byte size.
For example in C# if you try this it will fail:
byte test = 255 + 255;
(1,13): error CS0031: Constant value '510' cannot be converted to a
'byte'
byte test = (byte)(255 + 255);
(1,13): error CS0221: Constant value '510' cannot be converted to a
'byte' (use 'unchecked' syntax to override)
For every calculation you have to do modulo 256 (C#: % 256).
Instead of writing % 256 you can also do AND 255:
(175 + 205) mod 256 = (175 + 205) AND 255
Some C# samples:
byte test = ((255 + 255) % 256);
// test: 254
byte test = ((255 + 255) & 255);
// test: 254
byte test = ((1 + 379) % 256);
// test: 124
byte test = ((1 + 379) & 0xFF);
// test: 124
Note that you sometimes can simplify a byte-series:
(byteVal1 + byteVal2 + byteVal3) % 256
= (((byteVal1 % 256) + (byteVal2 % 256)) % 256 + (byteVal3 % 256)) % 256
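The same equivalences can be sketched in C, where (unlike C#) converting to an 8-bit unsigned type performs the reduction silently. The three function names below are made up for illustration; all three are expected to return the same value for any inputs whose sum does not overflow unsigned.

```c
#include <stdint.h>

/* Reduce explicitly with the remainder operator. */
uint8_t add_mod256_rem(unsigned x, unsigned y)
{
    return (uint8_t)((x + y) % 256u);
}

/* Reduce by masking off everything above the low 8 bits. */
uint8_t add_mod256_mask(unsigned x, unsigned y)
{
    return (uint8_t)((x + y) & 0xFFu);
}

/* Let the conversion to an 8-bit unsigned type do the reduction. */
uint8_t add_mod256_cast(unsigned x, unsigned y)
{
    return (uint8_t)(x + y);
}
```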

Related

How to generate random 64-bit unsigned integer in C

I need to generate random 64-bit unsigned integers using C. I mean, the range should be 0 to 18446744073709551615. RAND_MAX is 1073741823.
I found some solutions in the links which might be possible duplicates, but the answers mostly concatenate some rand() results or perform some incremental arithmetic operations. So results are always 18 digits or 20 digits. I also want outcomes like 5, 11, 33387, not just 3771778641802345472.
By the way, I really don't have much experience with C, but any approach, code samples and ideas could be beneficial.
Concerning "So results are always 18 digits or 20 digits."
See @Thomas's comment. If you generate random numbers long enough, code will create ones like 5, 11 and 33387. Very small numbers are simply rare amongst all 64-bit numbers: only 100,000 of the 2^64 possible values are below 100,000, so even at 1,000,000,000 numbers/second the first such value is expected only after about two days.
rand() simply returns random bits. A simplistic method pulls 1 bit at a time:
uint64_t rand_uint64_slow(void) {
uint64_t r = 0;
for (int i=0; i<64; i++) {
r = r*2 + rand()%2;
}
return r;
}
Assuming RAND_MAX is some power of 2 - 1 as in OP's case 1073741823 == 0x3FFFFFFF, take advantage that at least 15 bits are generated each time. The following code will call rand() 5 times - a tad wasteful (with 30 bits per call, 3 calls would be enough). Instead bits shifted out could be saved for the next random number, but that brings in other issues. Leave that for another day.
uint64_t rand_uint64(void) {
uint64_t r = 0;
for (int i=0; i<64; i += 15 /*30*/) {
r = r*((uint64_t)RAND_MAX + 1) + rand();
}
return r;
}
A portable loop count method avoids the 15 /*30*/ - But see 2020 edit below.
#if RAND_MAX/256 >= 0xFFFFFFFFFFFFFF
#define LOOP_COUNT 1
#elif RAND_MAX/256 >= 0xFFFFFF
#define LOOP_COUNT 2
#elif RAND_MAX/256 >= 0x3FFFF
#define LOOP_COUNT 3
#elif RAND_MAX/256 >= 0x1FF
#define LOOP_COUNT 4
#else
#define LOOP_COUNT 5
#endif
uint64_t rand_uint64(void) {
uint64_t r = 0;
for (int i=LOOP_COUNT; i > 0; i--) {
r = r*(RAND_MAX + (uint64_t)1) + rand();
}
return r;
}
The autocorrelation effects commented here are caused by a weak rand(). C does not specify a particular method of random number generation. The above relies on rand() - or whatever base random function employed - being good.
If rand() is sub-par, then code should use other generators. Yet one can still use this approach to build up larger random numbers.
[Edit 2020]
Hallvard B. Furuseth provides a nice way to determine the number of bits in RAND_MAX when it is a Mersenne Number - a power of 2 minus 1.
#define IMAX_BITS(m) ((m)/((m)%255+1) / 255%255*8 + 7-86/((m)%255+12))
#define RAND_MAX_WIDTH IMAX_BITS(RAND_MAX)
_Static_assert((RAND_MAX & (RAND_MAX + 1u)) == 0, "RAND_MAX not a Mersenne number");
uint64_t rand64(void) {
uint64_t r = 0;
for (int i = 0; i < 64; i += RAND_MAX_WIDTH) {
r <<= RAND_MAX_WIDTH;
r ^= (unsigned) rand();
}
return r;
}
If you don't need cryptographically secure pseudo random numbers, I would suggest using MT19937-64. It is a 64 bit version of Mersenne Twister PRNG.
Please, do not combine rand() outputs and do not build upon other tricks. Use existing implementation:
http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt64.html
If you have a sufficiently good source of random bytes (like, say, /dev/random or /dev/urandom on a Linux machine), you can simply consume 8 bytes from that source and concatenate them. If they are independent and uniformly distributed, you're set.
If you don't, you MAY get away with doing the same, but there are likely to be some artefacts in your pseudo-random generator that give a toe-hold for all sorts of hi-jinx.
Example code assuming we have an open binary FILE *source:
/* Both versions need <stdio.h> and <stdint.h>. */
/* Implementation #1, slightly more elegant than looping yourself */
uint64_t random64_v1(void)
{
    uint64_t rv;
    size_t count;
    do {
        count = fread(&rv, sizeof(rv), 1, source);
    } while (count != 1);   /* retry until a full 8 bytes were read */
    return rv;
}
/* Implementation #2 */
uint64_t random64_v2(void)
{
    uint64_t rv = 0;
    int c;
    for (size_t i = 0; i < sizeof(rv); i++) {
        do {
            c = fgetc(source);
        } while (c < 0);    /* retry on EOF or error */
        rv = (rv << 8) | (c & 0xff);
    }
    return rv;
}
If you replace "read random bytes from a randomness device" with "get bytes from a function call", all you have to do is to adjust the shifts in method #2.
You're vastly more likely to get a "number with many digits" than one with a "small number of digits": of all the numbers between 0 and 2 ** 64, roughly 95% have 19 or more decimal digits, so really that is what you will mostly get.
If you are willing to use a repetitive pseudo random sequence and you can deal with a bunch of values that will never happen (like even numbers? ... don't use just the low bits), an LCG or MCG are simple solutions. Wikipedia: Linear congruential generator can get you started (there are several more types including the commonly used Wikipedia: Mersenne Twister). And this site can generate a couple prime numbers for the modulus and the multiplier below. (caveat: this sequence will be guessable and thus it is NOT secure)
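#include <stdio.h>
#include <stdint.h>
#include <inttypes.h>
uint64_t
mcg64(void)
{
    static uint64_t i = 1;
    /* Note: the 64-bit product wraps modulo 2^64 before % is applied. */
    return (i = (164603309694725029ull * i) % 14738995463583502973ull);
}
int
main(int ac, char * av[])
{
    for (int i = 0; i < 10; i++)
        printf("%016" PRIx64 "\n", mcg64());   /* print as 16 hex digits */
}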
I have tried this code here and it seems to work fine there.
#include <stdio.h>
#include <time.h>
#include <stdlib.h>
#include <math.h>
int main(){
srand(time(NULL));
int a = rand();
int b = rand();
int c = rand();
int d = rand();
long e = (long)a*b;
e = labs(e);
long f = (long)c*d;
f = labs(f);
long long answer = (long long)e*f;
printf("value %lld",answer);
return 0;
}
I ran a few iterations and i get the following outputs :
value 1869044101095834648
value 2104046041914393000
value 1587782446298476296
value 604955295827516250
value 41152208336759610
value 57792837533816000
If you have 32 or 16-bit random value - generate 2 or 4 randoms and combine them to one 64-bit with << and |.
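uint64_t rand_uint64(void) {
    // Assuming RAND_MAX is at least 2^30 - 1, i.e. rand() yields 30 or more random bits.
    uint64_t r = rand() & 0x3FFFFFFF;
    r = r<<30 | (rand() & 0x3FFFFFFF);
    r = r<<30 | (rand() & 0x3FFFFFFF);   // the top 26 of the 90 bits are shifted out
    return r;
}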
#include <stdio.h>
#include <stdbool.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>
unsigned long long int randomize(unsigned long long int uint_64);
int main(void)
{
srand(time(0));
unsigned long long int random_number = randomize(18446744073709551615ULL);
printf("%llu\n",random_number);
random_number = randomize(123);
printf("%llu\n",random_number);
return 0;
}
unsigned long long int randomize(unsigned long long int uint_64)
{
char buffer[100] , data[100] , tmp[2];
//convert llu to string,store in buffer
sprintf(buffer, "%llu", uint_64);
//store buffer length
size_t len = strlen(buffer);
//x : store converted char to int, rand_num : random number , index of data array
int x , rand_num , index = 0;
//condition that prevents the program from generating number that is bigger input value
bool Condition = 0;
//iterate over buffer array
for( int n = 0 ; n < len ; n++ )
{
//store the first character of buffer
tmp[0] = buffer[n];
tmp[1] = '\0';
//convert it to integer,store in x
x = atoi(tmp);
if( n == 0 )
{
//if first iteration,rand_num must be less than or equal to x
rand_num = rand() % ( x + 1 );
//if generated random number does not equal to x,condition is true
if( rand_num != x )
Condition = 1;
//convert the digit to its character and store it in the data array; increment index
data[index] = rand_num + '0';
index++;
}
//if not first iteration,do the following
else
{
if( Condition )
{
rand_num = rand() % ( 10 );
data[index] = rand_num + '0';
index++;
}
else
{
rand_num = rand() % ( x + 1 );
if( rand_num != x )
Condition = 1;
data[index] = rand_num + '0';
index++;
}
}
}
data[index] = '\0';
char *ptr ;
//convert the data array to unsigned long long int
unsigned long long int ret = strtoull(data,&ptr,10);
return ret;
}

How to generate a random number based on a byte array?

Suppose I have an array of bytes from a secure PRNG, and I need to generate a number between 1 and 10 using that data, how would I do that correctly?
Think of the array as one big unsigned integer. Then the answer is simple:
(Big_Number % 10) + 1
So all that is needed is a method to find the modulus 10 of big integers. Process the array one byte at a time (Horner's rule), reducing modulo 10 at each step:
#include <limits.h>
#include <stdlib.h>
int ArrayMod10(const unsigned char *a, size_t n) {
    int mod10 = 0;
    int base = (UCHAR_MAX + 1) % 10;   // 256 % 10 == 6 with 8-bit bytes
    for (size_t i = n; i-- > 0; ) {
        mod10 = (base*mod10 + a[i]) % 10;
    }
    return mod10;
}
int test10(size_t n) {
    unsigned char a[n];
    // fill array with your secure PRNG
    for (size_t i = 0; i<n; i++) a[i] = rand();
    return ArrayMod10(a, n) + 1;
}
There will be a slight bias as 256^n is not a multiple of 10. With large n, this will rapidly decrease in significance.
Untested code: Detect if a biased result occurred. Calling code could repeatedly call this function with new a array values to get an unbiased result on the rare occasions when bias occurs.
// requires <limits.h>, <stdbool.h> and <stddef.h>
int ArrayMod10BiasDetect(const unsigned char *a, size_t n, bool *biasptr) {
    bool bias = true;
    int mod10 = 0;
    int base = (UCHAR_MAX + 1) % 10; // Note base is usually 6: 256%10, 65536%10, etc.
    for (size_t i = n; i-- > 0; ) {
        mod10 = (base*mod10 + a[i]) % 10;
        if (i > 0) {
            // upper bytes: anything below the maximum rules out the biased top range
            if (a[i] < UCHAR_MAX) bias = false;
        } else {
            // lowest byte: only the top `base` values are biased
            if (a[i] < UCHAR_MAX + 1 - base) bias = false;
        }
    }
    *biasptr = bias;
    return mod10;
}
As per the comments follow-up, it seems what you need is modulus operator [%].
You may also need to check the related wiki.
Note: Every time we use the modulo operator on a random number, there is a chance of running into modulo bias, which skews the otherwise fair distribution of random numbers. You have to take care of that.
For a detailed discussion on this, please see this question and related answers.
It depends on a bunch of things. Secure PRNG sometimes makes long byte arrays instead of integers, let's say it is 16 bytes long array, then extract 32 bit integer like so: buf[0]*0x1000000+buf[1]*0x10000+buf[2]*0x100+buf[3] or use shift operator. This is random so big-endian/little-endian doesn't matter.
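unsigned char randbytes[16];
//...
const unsigned char *p = randbytes;
//assumes unsigned int is at least 32 bits; the parentheses matter because + and | bind differently from <<
unsigned int rand1 = ((unsigned int)p[0] << 24) | ((unsigned int)p[1] << 16) | ((unsigned int)p[2] << 8) | p[3]; p += 4;
unsigned int rand2 = ((unsigned int)p[0] << 24) | ((unsigned int)p[1] << 16) | ((unsigned int)p[2] << 8) | p[3]; p += 4;
unsigned int rand3 = ((unsigned int)p[0] << 24) | ((unsigned int)p[1] << 16) | ((unsigned int)p[2] << 8) | p[3]; p += 4;
unsigned int rand4 = ((unsigned int)p[0] << 24) | ((unsigned int)p[1] << 16) | ((unsigned int)p[2] << 8) | p[3];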
Then use % on the integer
P.S. That was a long answer. If you want a number between 1 and 10, just use % 10 on the first byte and add 1.
OK, so this answer is in Java until I get to my Eclipse C/C++ IDE:
public final static int simpleBound(Random rbg, int n) {
final int BYTE_VALUES = 256;
// sanity check, only return positive numbers
if (n <= 0) {
throw new IllegalArgumentException("Oops");
}
// sanity check: choice of value 0 or 0...
if (n == 1) {
return 0;
}
// sanity check: does not fit in byte
if (n > BYTE_VALUES) {
throw new IllegalArgumentException("Oops");
}
// optimization for n = 2^y
if (Integer.bitCount(n) == 1) {
final int mask = n - 1;
return retrieveRandomByte(rbg) & mask;
}
// you can skip to this if you are sure n = 10
// z is upper bound, and contains floor(z / n) blocks of n values
final int z = (BYTE_VALUES / n) * n;
int x;
do {
x = retrieveRandomByte(rbg);
} while (x >= z);
return x % n;
}
So n is the maximum value in a range [0..n), i.e. n is exclusive. For a range [1..10] simply increase the result with 1.
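A rough C counterpart of the rejection loop above might look like the following. It is only a sketch: it assumes 8-bit bytes, it takes the random bytes from a caller-supplied buffer instead of a generator object, and the name bounded_from_bytes and its "return 0 when the bytes run out" convention are invented for this example. Unlike the Java version it returns a value in [1, n] directly.

```c
#include <stddef.h>

/* Returns a value in [1, n] drawn from the bytes in buf, or 0 if every
   byte had to be rejected (the caller should then fetch more bytes).
   Requires 1 <= n <= 256. */
int bounded_from_bytes(const unsigned char *buf, size_t len, int n)
{
    if (n < 1 || n > 256) return 0;
    int z = (256 / n) * n;      /* number of byte values in complete blocks of n */
    for (size_t i = 0; i < len; i++) {
        int x = buf[i];
        if (x < z)              /* accept: x lies in a complete block */
            return x % n + 1;
    }
    return 0;                   /* ran out of bytes */
}
```

For n = 10 the bound z is 250, so the byte values 250 to 255 are skipped and each result from 1 to 10 is produced by exactly 25 byte values.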

Unusual condition (* ((uint64_t *) buf) == NEG)

I'm studying some code I found on the internet that caught my attention:
#include <stdio.h>
#include <stdint.h>
#define NEG ~0x0LL
void ITOC(int8_t *vec, int n)
{
int8_t *p = vec;
for(; n; n /= 10) *p++ = n % 10;
}
void ncmp(int8_t *buf, int y)
{
int tmp, i = 0;
for (; y ; y/=10)
{
tmp = y % 10;
for(i = 0; i < 8; i++)
if(buf[i] == tmp && buf[i] != -1)
{
buf[i] = -1;
break;
}
}
}
int main(void)
{
int8_t buf[8];
int y = 21 ,z = 60, n = 1260;
*((uint64_t*) buf) = NEG;
ITOC(buf, n);
ncmp(buf, y);
ncmp(buf, z);
if( *((uint64_t*) buf) == NEG )
printf("%d = %d * %d\n", n, y, z);
return 0;
}
The part I do not understand is this line:
if( *((uint64_t*) buf) == NEG )
If the variables have these values :
y = 21 z = 60 n = 1260
The condition is true but if these values contain:
y = 18 z = 81 n = 1458
In this case the first position of buf is -1, so if the if only compares the first position with NEG, it should also be true.
Can someone explain what happens?
I don't know where you found this code, but it certainly doesn't do what you think it does.
What you think it does is check some kind of multiplication. Probably because it prints "n = y * z" at the end. But what it actually does is it takes the digits of n, and remove the digits of y and z. If all digits were removed, it prints that message. So for example:
1111 = 11 * 11 true
1234 = 12 * 34 true
1500 = 10 * 50 true
1500 = 30 * 50 false
1458 = 18 * 81 false
1458 = 14 * 58 true
1458 = 45 * 18 true
At the top of your code you can see that
#define NEG ~0x0LL
Therefore NEG is the bit-wise inverse of 0x0LL, which stands for (long long)0. Therefore, NEG is a long long with all bits set to one.
To understand your problem:
First, fix undefined behavior, allocate your buf (you need stdlib.h)
int8_t * buf;
buf = malloc(sizeof(* buf) * 8);
Then attach a debugger to buf with the expression (uint64_t *)buf and its view property as vector of uint8 (or equivalent from your debugger). This way you can actually see what's done with your variable and why it fails on those specific values.
Something to keep in mind: setting buf[i] to -1 sets all eight bits of that byte (0xff), because the element type is int8_t.
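One way to sidestep the pointer cast altogether is to fill and compare the eight bytes with memset and memcmp. The sketch below keeps the behaviour of the original program (digits of y and z are crossed off the digits of n, and the result is true when nothing is left), but the function names are made up here and it assumes non-negative inputs with at most 8 decimal digits in n.

```c
#include <string.h>

/* Cross off each decimal digit of y from buf, at most once per digit. */
static void remove_digits(unsigned char buf[8], int y)
{
    for (; y; y /= 10) {
        unsigned char d = (unsigned char)(y % 10);
        for (int i = 0; i < 8; i++) {
            if (buf[i] == d) { buf[i] = 0xFF; break; }
        }
    }
}

/* Returns 1 if every digit of n was crossed off by the digits of y and z. */
int digits_match(int n, int y, int z)
{
    unsigned char buf[8], all_ones[8];
    memset(buf, 0xFF, sizeof buf);            /* same effect as storing NEG */
    memset(all_ones, 0xFF, sizeof all_ones);
    unsigned char *p = buf;
    for (; n; n /= 10) *p++ = (unsigned char)(n % 10);
    remove_digits(buf, y);
    remove_digits(buf, z);
    /* compares all 8 bytes, just like the 64-bit comparison */
    return memcmp(buf, all_ones, sizeof buf) == 0;
}
```

Because all eight bytes take part in the comparison, a single leftover digit anywhere in buf makes the result false, which is why 1458 with 18 and 81 fails.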
Your code simply doesn't do what you think it does.
As Nit said, you are defining NEG as 64 bits set to 1.
then what you do:
*((uint64_t*) buf) = NEG
so you are storing 64bits set to one in a 64bits of data pointed to by buf.
then, when you do the checking:
if( *((uint64_t*) buf) == NEG )
you compare 64bits pointed to by buf to 64 bits set to one.
what you seem to want to do is comparing only 8bits:
if( *((uint8_t*) buf) == (uint8_t)NEG )
I'm just reacting to:
, if the if only compares the first position
because the if compares 64 bits of data, as requested by the cast :)

Rotating an array of bits in C

I've just started learning C and I'm having some problems with some code I want to write.
Basically I have this struct that is a bit array, with the number of bits in the array, and a pointer to a buffer of chars, that stores the bits.
My strategy for rotating the bit array is simply taking the number of rotations (mod the length to avoid full rotations) and using a simple reversal algorithm to rotate the array.
EDIT:
However, my problem is that I want to rotate the bits in the actual buffer.
I also want to be able to rotate a subsequence of bits within the entire bit array. So for 1101101, I might want to rotate (0-indexed from the left) the subsequence starting at index 2 and ending at index 5. I'm not entirely sure how to use my char buffer to do this.
Thanks for the help!
struct arrayBits{
size_t numBits;
char *buf;
};
The buf array holds 8-bit integers, not bools as I previously mentioned.
The way that I can access and set an individual bit is just by indexing into the byte that holds the bit I want (so for an array ab, ab->buf[index_of_desired_bit/8]) and then performing some bitwise operations on it to change the value, for performance reasons.
EDIT: Thanks to everyone for all the suggestions. I've looked at all of them and I believe I understand the code better. Here's the code I ended up writing, however, I think there are some problems with it.
While it passes some of my basic test cases, it seems to run a little too fast on an bitarray of size 98775 bits, randomly filled. By this I mean, is there some case in which my code just outright fails and crashes? The test cases do three rotations, in a row, on the full 98775-bit array. One rotation of -98775/4 (<--this is a size_t, so wrap around?), one rotation of 98775/4, and then a final rotation of 98775/2.
Is there something I'm missing or some problem I'm not seeing?
/*Reverse a bit array*/
/*v1.1: basic bit reversal w/o temp variable*/
static void arrayReversal(bitarray_t *ba, size_t begin, size_t end){
while(begin < end)
{
bitarray_set(ba, begin, (bitarray_get(ba, begin) ^ bitarray_get(ba, end))); /*x = x ^ y*/
bitarray_set(ba, end, (bitarray_get(ba, begin) ^ bitarray_get(ba, end))); /*y = x ^ y*/
bitarray_set(ba, begin, (bitarray_get(ba, begin) ^ bitarray_get(ba, end))); /*x = x ^ y*/
begin++;
end--;
}
}
/*Main Rotation Routine*/
void bitarray_rotate(bitarray_t *ba, size_t bit_off, size_t bit_len, ssize_t bit_right_amount) {
assert(bit_off + bit_len <= ba->bit_sz);
assert(bit_off + bit_len > 0);
if(bit_off + bit_len > ba->bit_sz || bit_off + bit_len < 0)
{
printf("\nError: Indices out of bounds\n");
return;
}
/*Find index to split bitarray on*/
if(bit_len == 0) return; //Rotate only 1 bit i.e. no rotation
size_t reversal_index;
reversal_index = modulo(-bit_right_amount, bit_len);
if(reversal_index == 0) return; //No rotation to do
/*3 bit string reversals*/
assert(reversal_index - 1 + bit_off < ba->bit_sz);
/* Reverse A*/
arrayReversal(ba, bit_off, reversal_index - 1 + bit_off);
assert(reversal_index + bit_off < ba->bit_sz);
/*Reverse B*/
arrayReversal(ba, reversal_index + bit_off, (bit_off + bit_len - 1));
/*Reverse ArBr*/
arrayReversal(ba, bit_off, (bit_off + bit_len -1));
}
Well the easy way to start is to consider how to rotate the bits in a single value. Let's say that you have x, which is an N-bit value and you want to rotate it by k places. (I'm only going to look at rotating upwards/left, it is easy to convert to downwards/right). The first thing to observe is that if k=N then x is unchanged. So before rotating we want to reduce k modulo N to throw away complete rotations.
Next we should observe that during the rotation the k upper-bits will move to the bottom of the value, and the lower N-k bits will move up k places. This is the same as saying that the top k-bits move down N-k places. The reason that we phrase it this way is that C has shift operators, but not rotation.
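In pseudo-C we can say:
#define N (sizeof(type)*8)
type rotate(type x, int k) {   /* assumes 0 < k < N */
    type lower = (x & ((1 << (N-k)) - 1)) << k;   /* low N-k bits move up k places */
    type upper = x >> (N-k) & ((1 << k) - 1);     /* top k bits move down N-k places */
    return upper | lower;
}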
This takes care of the simple atomic case, simply replace type with char or int as appropriate. If type is unsigned then the mask on the value of upper is unnecessary.
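For a concrete, compilable instance of the single-value case, here is a minimal sketch for a 32-bit unsigned value. The name rotl32 is chosen for illustration; the k == 0 check is there because shifting a 32-bit value by 32 is undefined in C.

```c
#include <stdint.h>

/* Rotate x left (upwards) by k bit positions. */
uint32_t rotl32(uint32_t x, unsigned k)
{
    k %= 32;                    /* discard complete rotations */
    if (k == 0) return x;       /* avoids the undefined shift by 32 below */
    return (x << k) | (x >> (32 - k));
}
```

Since the type is unsigned, no mask is needed on the part that is shifted down.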
The next thing to consider is rotating in an array of values. If you think of the above code as glueing together two halves of a value then for the more complicated case we need to glue together upper and lower parts from different places in the array. If k is small then these places are adjacent in the array, but when k>N we are rotating through more than one intermediate word.
In particular if we are rotating up k places then we are moving bits from k/N words away in the array, and the N bits can span floor(k/N) and ceil(k/N) locations away in the array. Ok, so now we're ready to put it all together. For each word in the array the new upper N-(k mod N) bits will be the lower bits of floor(k/N) words away, and the new lower (k mod N) bits will be the upper bits of ceil(k/N) words away.
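In the same pseudo-C (i.e. replace type with what you are using) we can say:
#define N (sizeof(type)*8)
#define ARR_SIZE ...
/* assumes 0 <= k < N*ARR_SIZE and that out does not overlap x */
void rotate(type *x, int k, type *out) {
    int r = k % N;
    int upperOff = k/N;          /* floor(k/N) */
    int lowerOff = (k+N-1)/N;    /* ceil(k/N) */
    for (int i=0; i<ARR_SIZE; i++) {
        int lowerPos = (i + ARR_SIZE - lowerOff) % ARR_SIZE;
        int upperPos = (i + ARR_SIZE - upperOff) % ARR_SIZE;
        if (r == 0) { out[i] = x[upperPos]; continue; }   /* whole-word move */
        /* new upper N-r bits: the lower bits of the word floor(k/N) away */
        type upper = (x[upperPos] & ((1 << (N-r)) - 1)) << r;
        /* new lower r bits: the upper bits of the word ceil(k/N) away */
        type lower = x[lowerPos] >> (N-r) & ((1 << r) - 1);
        out[i] = upper | lower;
    }
}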
Anyway, that's a lot more than I was intending to write so I'll quit now. It should be easy enough to convert this to a form that works inplace on a single array, but you'll probably want to fix the types and the range of k first in order to bound the temporary storage.
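To make the array case concrete, here is a small sketch for byte-sized words. It assumes the array is read as one big number with x[0] as the least significant byte, writes into a separate output buffer, and uses a made-up name (rotate_bytes_up); a bit array with a different bit order inside each byte would need the shifts mirrored.

```c
#include <stddef.h>
#include <stdint.h>

/* Rotate the n-byte array x "up" by k bits into out, treating x[0] as the
   least significant byte. out must not overlap x. */
void rotate_bytes_up(const uint8_t *x, size_t n, size_t k, uint8_t *out)
{
    if (n == 0) return;
    k %= n * 8;                          /* discard complete rotations */
    size_t q = k / 8;                    /* whole bytes to move */
    unsigned r = (unsigned)(k % 8);      /* remaining bits */
    for (size_t i = 0; i < n; i++) {
        uint8_t src_hi = x[(i + n - q) % n];           /* floor(k/8) bytes away */
        uint8_t src_lo = x[(i + 2 * n - q - 1) % n];   /* one byte further */
        out[i] = (r == 0) ? src_hi
                          : (uint8_t)((src_hi << r) | (src_lo >> (8 - r)));
    }
}
```

For example, the two-byte value 0x0081 stored as {0x81, 0x00} becomes 0x0102, stored as {0x02, 0x01}, after rotating up by one bit.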
If you have any more problems in this area then one place to look is bitmap sprite graphics. For example this rotation problem was used to implement scrolling many, many moons ago in 8-bit games.
I would suggest a pointer/offset to a starting point of a bit in the buffer instead of rotating. Feel free to overload any operator that might be useful, operator[] comes to mind.
A rotate(n) would simply be an offset+=n operation. But I find the purpose of your comment - "However, my problem is that I want to rotate the actual buffer" - confusing.
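In plain C (which has no operator overloading) the offset idea could be sketched as below. The struct and function names are hypothetical, bits are assumed to be stored least significant bit first within each byte, and the sketch only covers rotating the whole array, not a subsequence of it.

```c
#include <stddef.h>

struct rot_bits {
    size_t numBits;        /* number of valid bits in buf */
    size_t offset;         /* logical bit 0 lives at physical bit `offset` */
    unsigned char *buf;    /* physical bit i is bit (i % 8) of buf[i / 8] */
};

/* Read logical bit i (0 <= i < numBits). */
int rot_get(const struct rot_bits *rb, size_t i)
{
    size_t phys = (rb->offset + i) % rb->numBits;
    return (rb->buf[phys / 8] >> (phys % 8)) & 1;
}

/* Rotate so that the bit at logical index n becomes logical index 0.
   No bits are moved; only the starting offset changes. */
void rot_rotate(struct rot_bits *rb, size_t n)
{
    rb->offset = (rb->offset + n % rb->numBits) % rb->numBits;
}
```

The rotation itself is constant time, at the cost of one extra addition and modulo on every bit access.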
You don't need an extra buffer for rotate (only for output).
You should implement a function for one rotate and loop this, eg: (right-shift variation)
char *itoa2(char *s,size_t i)
{
*s=0;
do {
memmove(s+1,s,strlen(s)+1);
*s='0'+(i&1);
} while( i>>=1 );
return s;
}
size_t bitrotateOne(size_t i)
{
return i>>1 | (i&1) << (sizeof i<<3)-1;
}
...
size_t i=12,num=17;
char b[129];
while( num-- )
{
i = bitrotateOne(i);
puts( itoa2(b,i) );
}
Since your criteria is so complex, I think the easiest way to do it would be to step through each bit and set where it would be in your new array. You could speed it up for some operations by copying a whole character if it is outside the shifted bits, but I can't think of how to reliably do shifting taking into account all the variables because the start and end of the shifted sequence can be in the middle of bytes and so can the end of the entire bits. The key is to get the new bit position for a bit in the old array:
j = (i < startBit || i >= startBit + length) ? i :
((i - startBit + shiftRightCount) % length) + startBit;
Code:
#include "stdafx.h"
#include <stdlib.h>
#include <string.h>
typedef struct {
size_t numBits;
unsigned char *buf;
} ARRAYBITS;
// format is big endian, shifting left 8 bits will shift all bytes to a lower index
ARRAYBITS rotateBits(ARRAYBITS *pOriginalBits, int startBit, int length, int shiftRightCount);
void setBit(unsigned char *buf, int bit, bool isSet);
bool checkBit(unsigned char *buf, int bit);
ARRAYBITS fromString(char *onesAndZeros);
char *toString(ARRAYBITS *pBits);
int _tmain(int argc, _TCHAR* argv[])
{
char input[1024];
ARRAYBITS bits = fromString("11110000110010101110"); // 20 bits
ARRAYBITS bitsA = rotateBits(&bits, 0, bits.numBits, 1);
ARRAYBITS bitsB = rotateBits(&bits, 0, bits.numBits, -1);
ARRAYBITS bitsC = rotateBits(&bits, 6, 8, 4);
ARRAYBITS bitsD = rotateBits(&bits, 6, 8, -2);
ARRAYBITS bitsE = rotateBits(&bits, 6, 8, 31);
ARRAYBITS bitsF = rotateBits(&bits, 6, 8, -31);
printf("Starting : %s\n", toString(&bits));
printf("All right 1: %s\n", toString(&bitsA));
printf("All left 1 : %s\n", toString(&bitsB));
printf("\n");
printf(" : ********\n");
printf("Starting : %s\n", toString(&bits));
printf("6,8,4 : %s\n", toString(&bitsC));
printf("6,8,-2 : %s\n", toString(&bitsD));
printf("6,8,31 : %s\n", toString(&bitsE));
printf("6,8,-31 : %s\n", toString(&bitsF));
fgets(input, sizeof input, stdin);
}
ARRAYBITS rotateBits(ARRAYBITS *pOriginalBits, int startBit, int length, int shiftRightCount)
{
// 0-8 == 1, 9-16 == 2, 17-24 == 3
ARRAYBITS newBits;
int i = 0, j = 0;
int bytes = 0;
while (shiftRightCount < 0)
shiftRightCount += length;
shiftRightCount = shiftRightCount % length;
newBits.numBits = pOriginalBits->numBits;
if (pOriginalBits->numBits <= 0)
return newBits;
bytes = ((pOriginalBits->numBits -1) / 8) + 1;
newBits.buf = (unsigned char *)malloc(bytes);
memset(newBits.buf, 0, bytes);
for (i = 0; i < pOriginalBits->numBits; i++) {
j = (i < startBit || i >= startBit + length) ? i : ((i - startBit + shiftRightCount) % length) + startBit;
if (checkBit(pOriginalBits->buf, i))
{
setBit(newBits.buf, j, true);
}
}
return newBits;
}
void setBit(unsigned char *buf, int bit, bool isSet)
{
int charIndex = bit / 8;
unsigned char c = 1 << (bit & 0x07);
if (isSet)
buf[charIndex] |= c;
else
buf[charIndex] &= (c ^ 255);
}
bool checkBit(unsigned char *buf, int bit)
{
// address of char is (bit / 8), bit within char is (bit & 7)
int index = bit / 8;
int b = bit & 7;
int value = 1 << b;
return ((buf[index] & value) > 0);
}
ARRAYBITS fromString(char *onesAndZeros)
{
int i;
ARRAYBITS bits;
int charCount;
bits.numBits = strlen(onesAndZeros);
charCount = ((bits.numBits -1) / 8) + 1;
bits.buf = (unsigned char *)malloc(charCount);
memset(bits.buf, 0, charCount);
for (i = 0; i < bits.numBits; i++)
{
if (onesAndZeros[i] != '0')
setBit(bits.buf, i, true);
}
return bits;
}
char *toString(ARRAYBITS *pBits)
{
char *buf = (char *)malloc(pBits->numBits + 1);
int i;
for (i = 0; i < pBits->numBits; i++)
{
buf[i] = checkBit(pBits->buf, i) ? '1' : '0';
}
buf[i] = 0;
return buf;
}
I suggest you use bit-level operations (>>,<<,~,&,|) rather than wasting space using int. Even so, using an int array, to rotate, pass the left & right index of substring:
void rotate ( struct arrayBits a, int left , int right )
{
    int i;
    int last_bit = ( *( a.buf + right ) == 1 ) ? 1 : 0;   /* bit that wraps around */
    /* copy from the right end down so no element is overwritten before it is read */
    for( i = right ; i > left ; i-- )
    {
        *( a.buf + i ) = *( a.buf + i - 1 );
    }
    *( a.buf + left ) = last_bit;
}
Example:
If struct_array is 010101,
rotate (struct_array,0,5); => rotates whole string 1 int to right
o/p: 101010
rotate (struct_array,2,4); => rotates substring 1 int to right
o/p: 01 001 1
To rotate by more than one position, call the rotate() function on the substring that many times (calling it size_of_substring times brings the bits back to where they started).

Print large base 256 array in base 10 in c

I have an array of unsigned chars in c I am trying to print in base 10, and I am stuck. I think this will be better explained in code, so, given:
unsigned char n[3];
n[0] = 1;
n[1] = 2;
n[2] = 3;
I would like to print 197121.
This is trivial with small base 256 arrays. One can simply compute 1 * 256^0 + 2 * 256^1 + 3 * 256^2 (with ^ meaning exponentiation here).
However, if my array was 100 bytes large, then this quickly becomes a problem. There is no integral type in C that is 100 bytes large, which is why I'm storing numbers in unsigned char arrays to begin with.
How am I supposed to efficiently print this number out in base 10?
I am a bit lost.
There's no easy way to do it using only the standard C library. You'll either have to write the function yourself (not recommended), or use an external library such as GMP.
For example, using GMP, you could do:
#include <stdio.h>
#include <gmp.h>
unsigned char n[100]; // number to print
mpz_t num;
mpz_init(num); // initialise num before first use
mpz_import(num, 100, -1, 1, 0, 0, n); // convert byte array into GMP format
mpz_out_str(stdout, 10, num); // print num to stdout in base 10
mpz_clear(num); // free memory for num
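If pulling in a library is not an option, the conversion can also be written by hand with repeated long division by 10. The following is only a sketch under a few assumptions: 8-bit bytes, the least significant byte at index 0 (as in the question), an input array that may be overwritten, and a function name (base256_to_decimal) invented for this example. It runs in time quadratic in the array length, which is unproblematic for 100 bytes but much slower than GMP for very large numbers.

```c
#include <stddef.h>

/* Convert the len-byte little-endian number in bytes[] to a decimal string
   in out. bytes[] is used as scratch space and ends up all zero.
   out must hold at least 3*len + 2 chars. Returns the number of digits. */
size_t base256_to_decimal(unsigned char *bytes, size_t len, char *out)
{
    size_t ndigits = 0;
    size_t top = len;                        /* bytes[top..len) are zero */
    while (top > 0 && bytes[top - 1] == 0) top--;
    while (top > 0) {
        unsigned rem = 0;
        for (size_t i = top; i-- > 0; ) {    /* divide by 10, most significant byte first */
            unsigned cur = rem * 256u + bytes[i];
            bytes[i] = (unsigned char)(cur / 10u);
            rem = cur % 10u;
        }
        out[ndigits++] = (char)('0' + rem);  /* digits appear least significant first */
        while (top > 0 && bytes[top - 1] == 0) top--;
    }
    if (ndigits == 0) out[ndigits++] = '0';
    for (size_t i = 0; i < ndigits / 2; i++) {   /* reverse into reading order */
        char t = out[i];
        out[i] = out[ndigits - 1 - i];
        out[ndigits - 1 - i] = t;
    }
    out[ndigits] = '\0';
    return ndigits;
}
```

Called on the three bytes 1, 2, 3 from the question, it produces the string 197121.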
When I saw this question, I intended to solve it, but at that moment I was very busy.
This last weekend I managed to find a few precious hours of free time, so I took up my pending challenge.
First of all, I suggest you consider the response above. I have never used the GMP library, but I'm sure it's a better solution than handmade code.
Also, you might be interested in analysing the code of the bc calculator; it can work with big numbers, and I used it to test my own code.
OK, if you are still interested in code you write yourself (using only the C language and the standard C library), maybe I can give you something.
First, a little bit of theory. In basic number theory (at the level of modular arithmetic) there is an algorithm that inspired my solution: the multiply-and-square algorithm for computing a^N modulo m:
Result := 1;
for i := k until i = 0
if n_i = 1 then Result := (Result * a) mod m;
if i != 0 then Result := (Result * Result) mod m;
end for;
where k is the number of binary digits of N minus one, and n_i is the i-th binary digit of N. For instance (N is the exponent):
N = 44 -> 1 0 1 1 0 0
k = 5
n_5 = 1
n_4 = 0
n_3 = 1
n_2 = 1
n_1 = 0
n_0 = 0
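For ordinary machine integers, the pseudocode above can be sketched in C as follows. The function name pow_mod is made up, the loop always starts at bit 63 instead of at k (the leading zero bits only square the value 1, which is harmless), and m is assumed to be at most 2^32 so that every product fits in 64 bits.

```c
#include <stdint.h>

/* Computes a^n mod m by scanning the bits of n from most to least
   significant. Assumes 0 < m <= 2^32 so that products fit in 64 bits. */
uint64_t pow_mod(uint64_t a, uint64_t n, uint64_t m)
{
    uint64_t result = 1 % m;
    a %= m;
    for (int i = 63; i >= 0; i--) {
        if ((n >> i) & 1)
            result = (result * a) % m;        /* multiply step for a 1 bit */
        if (i != 0)
            result = (result * result) % m;   /* square before the next bit */
    }
    return result;
}
```

The big-number code in this answer follows the same bit-scanning pattern, but replaces the machine multiplications with multiplications on arrays of digits.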
When we perform a modulo operation, as with integer division, we can lose part of the number, so we only have to modify the algorithm so that it does not discard relevant data.
Here is my code (be aware that it is ad hoc code with a strong dependency on my computer's architecture. Basically I rely on the data sizes of the C types, so be careful, because your data sizes may not be the same):
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
enum { SHF = 31, BMASK = 0x1 << SHF, MODULE = 1000000000UL, LIMIT = 1024 };
unsigned int scaleBigNum(const unsigned short scale, const unsigned int lim, unsigned int *num);
unsigned int pow2BigNum(const unsigned int lim, unsigned int *nsrc, unsigned int *ndst);
unsigned int addBigNum(const unsigned int lim1, unsigned int *num1, const unsigned int lim2, unsigned int *num2);
unsigned int bigNum(const unsigned short int base, const unsigned int exp, unsigned int **num);
int main(void)
{
unsigned int *num, lim;
unsigned int *np, nplim;
int i, j;
for(i = 1; i < LIMIT; ++i)
{
lim = bigNum(i, i, &num);
printf("%i^%i == ", i, i);
for(j = lim - 1; j > -1; --j)
printf("%09u", num[j]);
printf("\n");
free(num);
}
return 0;
}
/*
bigNum: Compute number base^exp and store it in num array
#base: Base number
#exp: Exponent number
#num: Pointer to array where it stores big number
Return: Array length of result number
*/
unsigned int bigNum(const unsigned short int base, const unsigned int exp, unsigned int **num)
{
unsigned int m, lim, mem;
unsigned int *v, *w, *k;
//Note: mem is the number of array elements to allocate (dynamic memory version)
mem = ( (unsigned int) (exp * log10( (float) base ) / 9 ) ) + 3;
v = (unsigned int *) malloc( mem * sizeof(unsigned int) );
w = (unsigned int *) malloc( mem * sizeof(unsigned int) );
for(m = BMASK; ( (m & exp) == 0 ) && m; m >>= 1 ) ;
v[0] = (m) ? 1 : 0;
for(lim = 1; m > 1; m >>= 1)
{
if( exp & m )
lim = scaleBigNum(base, lim, v);
lim = pow2BigNum(lim, v, w);
k = v;
v = w;
w = k;
}
if(exp & 0x1)
lim = scaleBigNum(base, lim, v);
free(w);
*num = v;
return lim;
}
/*
scaleBigNum: Make an (num[] <- scale*num[]) big number operation
#scale: Scalar that multiply big number
#lim: Length of source big number
#num: Source big number (array of unsigned int). Update it with new big number value
Return: Array length of operation result
Warning: This function can write out of bounds if num has not been reallocated beforehand (when necessary). bigNum does that for us
*/
unsigned int scaleBigNum(const unsigned short scale, const unsigned int lim, unsigned int *num)
{
unsigned int i;
unsigned long long int n, t;
for(t = 0, i = 0; i < lim; ++i)
{
n = ( (unsigned long long int) scale * num[i] ) + t; // add the carry before reducing
num[i] = (n % MODULE); // always smaller than MODULE
t = (n / MODULE); // carry into the next element
}
num[i] = t;
return ( (num[i]) ? lim + 1 : lim );
}
/*
pow2BigNum: Make a (dst[] <- src[] * src[]) big number operation
#lim: Length of source big number
#src: Source big number (array of unsigned int)
#dst: Destination big number (array of unsigned int)
Return: Array length of operation result
Warning: This function can write out of bounds if dst has not been allocated large enough beforehand. bigNum does that for us
*/
unsigned int pow2BigNum(const unsigned int lim, unsigned int *src, unsigned int *dst)
{
unsigned int i, j;
unsigned long long int n, t;
unsigned int k, c;
for(c = 0, dst[0] = 0, i = 0; i < lim; ++i)
{
for(j = i, n = 0; j < lim; ++j)
{
n = ( (unsigned long long int) src[i] * src[j] );
k = i + j;
if(i != j)
{
t = 2 * (n % MODULE);
n = 2 * (n / MODULE);
// (i + j)
dst[k] = ( (k > c) ? ((c = k), 0) : dst[k] ) + (t % MODULE);
++k; // (i + j + 1)
dst[k] = ( (k > c) ? ((c = k), 0) : dst[k] ) + ( (t / MODULE) + (n % MODULE) );
++k; // (i + j + 2)
dst[k] = ( (k > c) ? ((c = k), 0) : dst[k] ) + (n / MODULE);
}
else
{
dst[k] = ( (k > c) ? ((c = k), 0) : dst[k] ) + (n % MODULE);
++k; // (i + j)
dst[k] = ( (k > c) ? ((c = k), 0) : dst[k] ) + (n / MODULE);
}
for(k = i + j; k < (lim + j); ++k)
{
dst[k + 1] += (dst[k] / MODULE);
dst[k] %= MODULE;
}
}
}
i = lim << 1;
return ((dst[i - 1]) ? i : i - 1);
}
/*
addBigNum: Make a (num2[] <- num1[] + num2[]) big number operation
#lim1: Length of source num1 big number
#num1: First source operand big number (array of unsigned int). Should be smaller than second
#lim2: Length of source num2 big number
#num2: Second source operand big number (array of unsigned int). Should be equal or greater than first
Return: Array length of operation result or 0 if num1[] is longer than num2[] (doesn't do any op)
Warning: This function can write out of bounds if num2 has not been reallocated beforehand
*/
unsigned int addBigNum(const unsigned int lim1, unsigned int *num1, const unsigned int lim2, unsigned int *num2)
{
unsigned long long int n;
unsigned int i;
if(lim1 > lim2)
return 0;
for(num2[lim2] = 0, n = 0, i = 0; i < lim1; ++i)
{
n = num2[i] + num1[i] + (n / MODULE);
num2[i] = n % MODULE;
}
for(n /= MODULE; n; ++i)
{
n += num2[i];
num2[i] = n % MODULE; // keep the element below MODULE
n /= MODULE; // carry into the next element
}
return (lim2 > i) ? lim2 : i;
}
To compile:
gcc -o bgn <name>.c -Wall -O3 -lm # -lm links the math library, needed for the log function
To check the result, feed the program's output directly into bc. A simple shell script:
#!/bin/bash
select S in ` awk -F '==' '{print $1 " == " $2 }' | bc`;
do
0;
done;
echo "Test Finished!";
We have an array of unsigned int (4 bytes each), and each element stores 9 decimal digits (a value modulo 1000000000UL); hence num[0] holds the first (least significant) 9 digits, num[1] holds digits 10 to 18, num[2]...
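To make that layout concrete, here is a small sketch that turns such an array back into a decimal string. limbs_to_str is a hypothetical helper, not part of the code above; unlike the printf loop in main, it prints the most significant element without zero padding.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Formats a little-endian base-10^9 number as decimal text.
   out must have room for at least 9 * len + 1 bytes (2 bytes if len == 0).
   Returns the number of characters written. */
int limbs_to_str(const unsigned int *num, unsigned int len, char *out)
{
    int pos;
    unsigned int i;

    assert(num != NULL && out != NULL);
    if (len == 0) {
        strcpy(out, "0");
        return 1;
    }
    pos = sprintf(out, "%u", num[len - 1]);         /* top element: no leading zeros */
    for (i = len - 1; i-- > 0; )
        pos += sprintf(out + pos, "%09u", num[i]);  /* lower elements: exactly 9 digits */
    return pos;
}
```

For example, the array {1, 2} represents 2 * 10^9 + 1 and is formatted as 2000000001.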
I started out with conventional (static) memory, but an improvement is to use dynamic memory. OK, but how long should the array be (that is, how much memory do we need to allocate)? Using the bc calculator (bc -l, with the math library) we can determine how many digits a number has:
l(a^N) / l(10) // natural logarithm converted to base-10 logarithm
If we know the number of digits, we know how many integers we need:
( l(a^N) / (9 * l(10)) ) + 1 // Truncate result
If you work with values such as (2^k)^N, you can evaluate the logarithm with this expression:
( k*N*l(2)/(9*l(10)) ) + 1 // Truncate result
to determine the exact length of the integer array. Example:
256^800 = 2^(8*800) ---> l(2^(8*800))/(9*l(10)) + 1 = 8*800*l(2)/(9*l(10)) + 1
The constant 1000000000UL (10^9) is very important. A constant like 10000000000UL (10^10) doesn't work because it can produce an undetected overflow (try what happens with the number 16^16 and a 10^10 constant), and a smaller constant such as 100000000UL (10^8) is correct but requires more memory and more steps. 10^9 is the key constant for a 32-bit unsigned int and a 64-bit unsigned long long int.
The code has two parts, Multiply (easy) and Power by 2 (harder). Multiply is just a multiplication by a scalar that propagates the overflow from one element to the next. It relies on the distributive property: k(A + B + C) = kA + kB + kC, so if the number is A*10^18 + B*10^9 + C, the result is kA*10^18 + kB*10^9 + kC. Obviously, kC can produce a number bigger than 999 999 999, but never bigger than 0xFF FF FF FF FF FF FF FF. A number wider than 64 bits can never occur in this multiplication because C is a 32-bit unsigned integer and k is a 16-bit unsigned short. In the worst case, we will have this number:
k = 0x FF FF;
C = 0x 3B 9A C9 FF; // 999999999
n = k*C = 0x 3B 9A | 8E 64 36 01;
n % 1000000000 = 0x 3B 99 CA 01;
n / 1000000000 = 0x FF FE;
After multiplying kB we need to add the 0x FF FE carried over from the previous multiplication of C ( B = kB + (C / module) ), and so on (the 64-bit intermediate leaves 18 bits of headroom, enough to guarantee correct values).
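The worst case above can be checked with a few lines of C. This is a minimal sketch, assuming a 32-bit unsigned int and a 64-bit unsigned long long as in this answer; mul_limb and MODULE_SKETCH are hypothetical names used only for the illustration.

```c
#include <assert.h>

#define MODULE_SKETCH 1000000000ULL  /* 10^9, same value as MODULE above */

/* Multiplies one 9-digit element by a 16-bit scalar and adds the carry
   coming from the previous (less significant) element.
   Stores the reduced element in *limb and returns the new carry. */
unsigned int mul_limb(unsigned short k, unsigned int *limb, unsigned int carry_in)
{
    unsigned long long n;

    assert(*limb < MODULE_SKETCH);
    n = (unsigned long long)k * *limb + carry_in;  /* at most about 2^46, fits easily */
    *limb = (unsigned int)(n % MODULE_SKETCH);
    return (unsigned int)(n / MODULE_SKETCH);
}
```

Calling it with k = 0xFFFF on an element holding 999999999 leaves 999934465 (0x3B99CA01) in the element and returns a carry of 65534 (0xFFFE), matching the figures above.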
Power is more complex but is, in essence, the same problem (multiplication and addition), so here are some tips about the power code:
Data types are important, very important.
If you multiply an unsigned integer by an unsigned integer, you get another unsigned integer. Use an explicit cast to unsigned long long int so you don't lose data.
Always use the unsigned modifier, don't forget it!
Power by 2 can directly modify elements up to 2 indexes ahead of the current index.
gdb is your friend.
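The tip about casting is easy to demonstrate. The sketch below uses the fixed-width types from stdint.h so the behaviour does not depend on the platform's int size as long as int is 32 bits; both function names are made up for the example.

```c
#include <stdint.h>

/* Product computed in 32 bits: the high half is silently discarded. */
uint32_t mul_truncated(uint32_t a, uint32_t b)
{
    return a * b;
}

/* Widening one operand first makes the multiplication happen in 64 bits. */
uint64_t mul_widened(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}
```

With a = b = 999999999, mul_widened returns 999999998000000001, while mul_truncated keeps only the low 32 bits of that value.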
I've also developed another function that adds big numbers. I haven't tested this last one as much, but I think it works well. Don't be cruel with me if it has a bug.
...and that's all!
P.S. 1: Developed on an
Intel(R) Pentium(R) 4 CPU 1.70GHz
Data sizes (in bytes):
unsigned short: 2
unsigned int: 4
unsigned long int: 4
unsigned long long int: 8
A number such as 256^1024 takes:
real 0m0.059s
user 0m0.033s
sys 0m0.000s
A loop that computes i^i for i = 1 ... 1024 takes:
real 0m40.716s
user 0m14.952s
sys 0m0.067s
For numbers such as 65355^65355, the time spent is insane.
P.S. 2: My answer is very late, but I hope my code will be useful.
P.S. 3: Sorry, explaining myself in English is one of my worst handicaps!
Last update: I have just had an idea. The same algorithm with a different implementation could improve the response time and reduce the amount of memory used (we could use all the bits of each unsigned int). The secret: n^2 = n * n = n * (n - 1 + 1) = n * (n - 1) + n.
(I will not write this new code now, but if someone is interested, maybe after exams...)
I don't know if you still need a solution, but I wrote an article about this problem. It shows a very simple algorithm that converts an arbitrarily long number in base X to the corresponding number in base Y. The algorithm is written in Python, but it is only a few lines long and doesn't use any Python magic. I needed such an algorithm for a C implementation, too, but decided to describe it in Python for two reasons. First, Python is very readable for anyone who understands algorithms written in pseudocode and, second, I am not allowed to post the C version, because I wrote it for my company. Just have a look and you will see how easily this problem can be solved in general. An implementation in C should be straightforward...
Here is a function that does what you want:
#include <math.h>
#include <stddef.h> // for size_t
double getval(unsigned char *arr, size_t len)
{
double ret = 0;
size_t cur;
for(cur = 0; cur < len; cur++)
ret += arr[cur] * pow(256, cur);
return ret;
}
That looks perfectly readable to me. Just pass the unsigned char * array you want to convert and the size. Note that it won't be perfect - for arbitrary precision, I suggest looking into the GNU MP BigNum library, as has been suggested already.
As a bonus, I don't like your storing your numbers in little-endian order, so here's a version if you want to store base-256 numbers in big-endian order:
#include <stddef.h> // for size_t
double getval_big_endian(unsigned char *arr, size_t len)
{
double ret = 0;
size_t cur;
for(cur = 0; cur < len; cur++)
{
ret *= 256;
ret += arr[cur];
}
return ret;
}
Just things to consider.
It may be too late or too irrelevant to make this suggestion, but could you store each byte as two base 10 digits (or one base 100) instead of one base 256? If you haven't implemented division yet, then that implies all you have is addition, subtraction, and maybe multiplication; those shouldn't be too hard to convert. Once you've done that, printing it would be trivial.
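To illustrate the base-100 idea, here is a rough sketch of addition and printing when every byte holds a value from 0 to 99, least significant byte first. Both function names are hypothetical, and the layout is an assumption chosen for the example.

```c
#include <stddef.h>

/* a += b, where a and b are little-endian base-100 numbers of len bytes each.
   Returns the carry out of the most significant byte (0 or 1). */
unsigned int add_base100(unsigned char *a, const unsigned char *b, size_t len)
{
    unsigned int carry = 0;
    for (size_t i = 0; i < len; i++) {
        unsigned int s = a[i] + b[i] + carry;
        a[i] = (unsigned char)(s % 100);
        carry = s / 100;
    }
    return carry;
}

/* Writes the decimal text (with leading zeros) into out,
   which must hold 2 * len + 1 bytes. Each byte becomes exactly two digits. */
void base100_to_str(const unsigned char *a, size_t len, char *out)
{
    for (size_t i = 0; i < len; i++) {
        unsigned char d = a[len - 1 - i];   /* most significant byte first */
        out[2 * i] = (char)('0' + d / 10);
        out[2 * i + 1] = (char)('0' + d % 10);
    }
    out[2 * len] = '\0';
}
```

Printing needs no division of the whole number at all, which is the point of the suggestion; the price is that each byte stores fewer than 7 bits of information.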
As I was not satisfied with the other answers provided, I decided to write an alternative solution myself:
#include <stdlib.h>
#define BASE_256 256
char *largenum2str(unsigned char *num, unsigned int len_num)
{
int temp;
char *str, *b_256 = NULL, *cur_num = NULL, *prod = NULL, *prod_term = NULL;
unsigned int i, j, carry = 0, len_str = 1, len_b_256, len_cur_num, len_prod, len_prod_term;
//Get 256 as an array of base-10 chars we'll use later as our second operand of the product
for ((len_b_256 = 0, temp = BASE_256); temp > 0; len_b_256++)
{
b_256 = realloc(b_256, sizeof(char) * (len_b_256 + 1));
b_256[len_b_256] = temp % 10;
temp = temp / 10;
}
//Our first operand (prod) is the last element of our num array, which we'll convert to a base-10 array
for ((len_prod = 0, temp = num[len_num - 1]); temp > 0; len_prod++)
{
prod = realloc(prod, sizeof(*prod) * (len_prod + 1));
prod[len_prod] = temp % 10;
temp = temp / 10;
}
while (len_num > 1) //We'll stay in this loop as long as we still have elements in num to read
{
len_num--; //Decrease the length of num to keep track of the current element
//Convert this element to a base-10 unsigned char array
for ((len_cur_num = 0, temp = num[len_num - 1]); temp > 0; len_cur_num++)
{
cur_num = (char *)realloc(cur_num, sizeof(char) * (len_cur_num + 1));
cur_num[len_cur_num] = temp % 10;
temp = temp / 10;
}
//Multiply prod by 256 and save that as prod_term
len_prod_term = 0;
prod_term = NULL;
for (i = 0; i < len_b_256; i++)
{ //Repeat this loop 3 times, one for each element in {6,5,2} (256 as a reversed base-10 unsigned char array)
carry = 0; //Set the carry to 0
prod_term = realloc(prod_term, sizeof(*prod_term) * (len_prod + i)); //Allocate memory to save prod_term
for (j = i; j < (len_prod_term); j++) //If we have digits from the last partial product of the multiplication, add it here
{
prod_term[j] = prod_term[j] + prod[j - i] * b_256[i] + carry;
if (prod_term[j] > 9)
{
carry = prod_term[j] / 10;
prod_term[j] = prod_term[j] % 10;
}
else
{
carry = 0;
}
}
while (j < (len_prod + i)) //No remaining elements of the former prod_term, so take only into account the results of multiplying mult * b_256
{
prod_term[j] = prod[j - i] * b_256[i] + carry;
if (prod_term[j] > 9)
{
carry = prod_term[j] / 10;
prod_term[j] = prod_term[j] % 10;
}
else
{
carry = 0;
}
j++;
}
if (carry) //A carry may be present in the last term. If so, allocate memory to save it and increase the length of prod_term
{
len_prod_term = j + 1;
prod_term = realloc(prod_term, sizeof(*prod_term) * (len_prod_term));
prod_term[j] = carry;
}
else
{
len_prod_term = j;
}
}
free(prod); //We don't need prod anymore, prod will now be prod_term
prod = prod_term;
len_prod = len_prod_term;
//Add prod (formerly prod_term) to our current number of the num array, expressed in a b-10 array
carry = 0;
for (i = 0; i < len_cur_num; i++)
{
prod[i] = prod[i] + cur_num[i] + carry;
if (prod[i] > 9)
{
carry = prod[i] / 10;
prod[i] -= 10;
}
else
{
carry = 0;
}
}
while (carry && (i < len_prod))
{
prod[i] = prod[i] + carry;
if (prod[i] > 9)
{
carry = prod[i] / 10;
prod[i] -= 10;
}
else
{
carry = 0;
}
i++;
}
if (carry)
{
len_prod++;
prod = realloc(prod, sizeof(*prod) * len_prod);
prod[len_prod - 1] = carry;
carry = 0;
}
}
str = malloc(sizeof(char) * (len_prod + 1)); //Allocate memory for the return string
for (i = 0; i < len_prod; i++) //Convert the numeric result to its representation as characters
{
str[len_prod - 1 - i] = prod[i] + '0';
}
str[i] = '\0'; //Terminate our string
free(b_256); //Free memory
free(prod);
free(cur_num);
return str;
}
The idea behind it all derives from simple math. For any base-256 number, its base-10 representation can be calculated as:
num[i]*256^i + num[i-1]*256^(i-1) + (···) + num[2]*256^2 + num[1]*256^1 + num[0]*256^0
which can be rewritten in nested form as:
(((((num[i])*256 + num[i-1])*256 + (···))*256 + num[2])*256 + num[1])*256 + num[0]
So all we have to do is start from the most significant element and, step by step, multiply the running result by 256 and add the next element to it, and so on... That way we get the base-10 number.
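For comparison, the same nested evaluation can be written much more compactly if the decimal digits are kept in a single preallocated buffer. This is only a sketch under the same assumption as the function above (num[0] is the least significant byte); bytes_to_decimal is a made-up name.

```c
#include <stdlib.h>

/* Converts a little-endian base-256 number to a decimal string.
   The caller must free() the result. Returns NULL on allocation failure. */
char *bytes_to_decimal(const unsigned char *num, size_t len)
{
    /* 256^len < 1000^len, so 3 * len decimal digits always suffice. */
    size_t cap = 3 * len + 1, ndigits = 1, i, j;
    unsigned char *dig = calloc(cap, 1);  /* decimal digits, least significant first */
    char *str;

    if (dig == NULL)
        return NULL;
    for (i = len; i-- > 0; ) {            /* most significant byte first */
        unsigned int carry = num[i];      /* the element to add after multiplying */
        for (j = 0; j < ndigits; j++) {
            unsigned int v = dig[j] * 256u + carry;
            dig[j] = (unsigned char)(v % 10);
            carry = v / 10;
        }
        while (carry) {                   /* the number grew: append new digits */
            dig[ndigits++] = (unsigned char)(carry % 10);
            carry /= 10;
        }
    }
    str = malloc(ndigits + 1);
    if (str != NULL) {
        for (j = 0; j < ndigits; j++)
            str[j] = (char)('0' + dig[ndigits - 1 - j]);
        str[ndigits] = '\0';
    }
    free(dig);
    return str;
}
```

Each outer iteration performs one "multiply by 256 and add the next element" step of the nested formula, working directly on the decimal digits.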
