Let me preface this by saying that I have extremely limited experience with ASM, and even less with SIMD.
But it happens that I have the following MMX/SSE-optimised code that I would like to port to AltiVec instructions for use on PPC/Cell processors.
This is probably a big ask. Even though it's only a few lines of code, I've had no end of trouble trying to work out what's going on here.
The original function:
static inline int convolve(const short *a, const short *b, int n)
{
int out = 0;
union {
__m64 m64;
int i32[2];
} tmp;
tmp.i32[0] = 0;
tmp.i32[1] = 0;
while (n >= 4) {
tmp.m64 = _mm_add_pi32(tmp.m64,
_mm_madd_pi16(*((__m64 *)a),
*((__m64 *)b)));
a += 4;
b += 4;
n -= 4;
}
out = tmp.i32[0] + tmp.i32[1];
_mm_empty();
while (n --)
out += (*(a++)) * (*(b++));
return out;
}
Any tips on how I might rewrite this to use AltiVec instructions?
My first attempt looks something like this, but it's not entirely (or even remotely) correct.
static inline int convolve_altivec(const short *a, const short *b, int n)
{
int out = 0;
union {
vector unsigned int m128;
int i64[2];
} tmp;
vector unsigned int zero = {0, 0, 0, 0};
tmp.i64[0] = 0;
tmp.i64[1] = 0;
while (n >= 8) {
tmp.m128 = vec_add(tmp.m128,
vec_msum(*((vector unsigned short *)a),
*((vector unsigned short *)b), zero));
a += 8;
b += 8;
n -= 8;
}
out = tmp.i64[0] + tmp.i64[1];
while (n --)
out += (*(a++)) * (*(b++));
return out;
}
You're not far off - I fixed a few minor problems, cleaned up the code a little, added a test harness, and it seems to work OK now:
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <altivec.h>
static int convolve_ref(const short *a, const short *b, int n)
{
int out = 0;
int i;
for (i = 0; i < n; ++i)
{
out += a[i] * b[i];
}
return out;
}
static inline int convolve_altivec(const short *a, const short *b, int n)
{
int out = 0;
union {
vector signed int m128;
int i32[4];
} tmp;
const vector signed int zero = {0, 0, 0, 0};
assert(((unsigned long)a & 15) == 0);
assert(((unsigned long)b & 15) == 0);
tmp.m128 = zero;
while (n >= 8)
{
tmp.m128 = vec_msum(*((vector signed short *)a),
*((vector signed short *)b), tmp.m128);
a += 8;
b += 8;
n -= 8;
}
out = tmp.i32[0] + tmp.i32[1] + tmp.i32[2] + tmp.i32[3];
while (n --)
out += (*(a++)) * (*(b++));
return out;
}
int main(void)
{
const int n = 100;
vector signed short _a[n / 8 + 1];
vector signed short _b[n / 8 + 1];
short *a = (short *)_a;
short *b = (short *)_b;
int sum_ref, sum_test;
int i;
for (i = 0; i < n; ++i)
{
a[i] = rand();
b[i] = rand();
}
sum_ref = convolve_ref(a, b, n);
sum_test = convolve_altivec(a, b, n);
printf("sum_ref = %d\n", sum_ref);
printf("sum_test = %d\n", sum_test);
printf("%s\n", sum_ref == sum_test ? "PASS" : "FAIL");
return 0;
}
(Warning: all of my Altivec experience comes from working on Xbox360/PS3 - I'm not sure how different they are from other Altivec platforms).
First off, you should check your pointer alignment. Most vector load (and store) operations expect 16-byte aligned addresses. If the address isn't aligned, things will usually carry on without warning, but you won't get the data you were expecting.
It's possible (but slower) to do unaligned loads, but you basically have to read a bit before and after your data and combine the two. See Apple's Altivec page. I've also done it before using the lvlx and lvrx load instructions and then ORing the results together.
Next up, I'm not sure your multiplies and adds are the same. I've never used either _mm_madd_pi16 or vec_msum, so I'm not positive they're equivalent. You should step through in a debugger and make sure they give you the same output for the same input data. Another possible difference is that they may treat overflow differently (e.g. modular vs. saturate).
Last but not least, you're computing 4 ints at a time instead of 2. So your union should hold 4 ints, and you should sum all 4 of them at the end.
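To make the lane layout concrete, here is a plain C model of one vec_msum step on signed shorts, as I understand the modular (non-saturating) form: each of the four 32-bit lanes accumulates the sum of two adjacent 16-bit products. _mm_madd_pi16 produces the same pairwise sums, but over two lanes and without the accumulator. The name msum_model is made up for illustration; this is a sketch for checking results by hand, not a replacement for the intrinsic.

```c
#include <stdint.h>

/* Scalar model of one vec_msum step on signed shorts (modular form):
   lane i accumulates a[2i]*b[2i] + a[2i+1]*b[2i+1]. */
static void msum_model(const int16_t a[8], const int16_t b[8], int32_t acc[4])
{
    for (int lane = 0; lane < 4; lane++) {
        int32_t p0 = (int32_t)a[2 * lane] * b[2 * lane];
        int32_t p1 = (int32_t)a[2 * lane + 1] * b[2 * lane + 1];
        /* Add in unsigned arithmetic so wraparound is well defined in C;
           converting back to int32_t is implementation-defined but wraps
           on the usual two's-complement compilers. */
        acc[lane] = (int32_t)((uint32_t)acc[lane] + (uint32_t)p0 + (uint32_t)p1);
    }
}
```

Summing the four lanes at the end gives the same dot product as the scalar reference loop, which is why the union needs four ints rather than two.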
I am writing some code to deal with numbers in C which are bigger than 8 bytes in size (don't fit into unsigned long). In this example I will use 16 bytes (128 bits) as the width. The numbers are unsigned and integers (no decimal places). They are stored as an array of unsigned chars eg:
unsigned char n[16];
I have managed to get addition to work (it behaves like an unsigned number in C, so if you had the number 0xffffffffffffffffffffffffffffffff (2**128 - 1) and added 1, you would get 0), but I cannot get subtraction to work. I would assume it would be similar code to addition, but I can't work out what to change.
Addition code:
//a and b are numbers
unsigned char *add(unsigned char *a, unsigned char *b){
unsigned char *c = malloc(NUM_SIZE);
//d is the carry and c is the output number
unsigned short d = 0;
if(!c){
return NULL;
}
for(int i = 0; i < NUM_SIZE; i++){
c[i] = 0;
}
for(int i = NUM_SIZE * 2 - 1; i >= 0; i--){
d += a[i % NUM_SIZE] + b[i % NUM_SIZE];
c[i % NUM_SIZE] = d % 256;
d >>= 8;
}
return c;
}
NUM_SIZE is defined as 16 (the width of the number in bytes)
What I have tried:
//changing the signs to minuses
d -= a[i % NUM_SIZE] - b[i % NUM_SIZE];
//changing the some signs to minuses
d -= a[i % NUM_SIZE] + b[i % NUM_SIZE];
//or
d += a[i % NUM_SIZE] - b[i % NUM_SIZE];
//looping through the number backwards
for(int i = 0; i < NUM_SIZE * 2; i++)
Just an idea (not compiled):
void not( unsigned char* a, unsigned int n )
{
for ( unsigned int i = 0; i < n; ++i )
a[i] = ~a[i];
}
void inc( unsigned char* a, unsigned int n )
{
for ( unsigned int i = 0; i < n; ++i )
if ( ++a[i] )
return;
}
void add( unsigned char* c, unsigned char* a, unsigned char* b, unsigned int n )
{
for ( unsigned int i = 0, r = 0; i < n; ++i )
c[i] = r = a[i] + b[i] + ( r >> 8 );
}
void sub( unsigned char* c, unsigned char* a, unsigned char* b, unsigned int n )
{
not( b, n );
add( c, a, b, n );
not( b, n ); // revert
inc( c, n );
}
You may want to use arbitrary-precision arithmetic, a.k.a. bigint or bignum. You should use a library for that (because bignum algorithms are very clever and use some assembler code). I recommend GMPlib. See also this.
NUM_SIZE * 2 does not make sense with malloc(NUM_SIZE); ... for(int i = NUM_SIZE * 2 - 1. Only a loop of NUM_SIZE iterations is needed.
Repaired code
#include <limits.h> // UCHAR_MAX
#include <stdlib.h> // malloc
#define NUM_SIZE 8
//a - b
unsigned char *sub(const unsigned char *a, const unsigned char *b) {
unsigned char *c = malloc(NUM_SIZE);
if (!c) {
return NULL;
}
// zeroing `c[]` not needed. Retain that code if desired
int d = 0; // Use signed accumulator to save the "borrow"
// drop *2
for (int i = NUM_SIZE - 1; i >= 0; i--) {
d += a[i] - b[i]; // Perform the subtraction
c[i] = d; // Save the 8 least significant bits in c[]
d = (d - c[i]) / (UCHAR_MAX+1); // Form the "borrow" for the next loop
}
// If d<0 at this point, b was greater than a
return c;
}
Various performance improvements can be made, but important to get functionality correct first.
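For comparison, the addition can take the same single-pass shape. This is only a sketch under a few assumptions: bytes are 8 bits, the most significant byte is stored first (as in the code above), and the caller supplies the output buffer. The name add_be and its parameters are invented for illustration.

```c
#include <stddef.h>

/* c = a + b over n bytes, most significant byte first.
   Returns the carry out of the top byte (0 or 1). */
static unsigned add_be(unsigned char *c, const unsigned char *a,
                       const unsigned char *b, size_t n)
{
    unsigned carry = 0;
    for (size_t i = n; i-- > 0; ) {          /* walk from the last byte to the first */
        unsigned sum = (unsigned)a[i] + b[i] + carry;
        c[i] = (unsigned char)(sum & 0xFF);  /* keep the low 8 bits */
        carry = sum >> 8;                    /* pass the overflow to the next byte */
    }
    return carry;
}
```

A returned carry of 1 means the result wrapped around, which matches the "add 1 to all ones and get 0" behaviour described in the question.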
Numbers have a "base" that determines the range of each digit (e.g. "base 10" is decimal).
One uint8_t is a single digit in "base 256". One uint16_t is a single digit in "base 65536". One uint32_t is a single digit in "base 4294967296".
For mathematical operations, performance is heavily affected by the number of digits. By using a larger base you need fewer digits for the same number, which improves performance (until you exceed the CPU's native word size).
For subtraction of unsigned numbers:
#define DIGITS 4
int subtract(uint32_t *result, uint32_t *src1, uint32_t *src2) {
int carry = 0;
int oldCarry;
int i;
for(i = 0; i < DIGITS; i++) {
oldCarry = carry;
if(src1[i] < src2[i]) { // borrow when the digit we subtract from is the smaller one
carry = 1;
} else if( (src2[i] == src1[i]) && (oldCarry != 0) ) {
carry = 1;
} else {
carry = 0;
}
result[i] = src1[i] - src2[i] - oldCarry;
}
return carry;
}
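If a 64-bit type is available, the borrow can also be read straight off a wider intermediate instead of being derived from comparisons. The following is a sketch under that assumption, with digits stored least significant first; sub_digits is a hypothetical name, not part of the code above.

```c
#include <stddef.h>
#include <stdint.h>

/* result = a - b over n base-2^32 digits, least significant digit first.
   Returns the final borrow (0 or 1). */
static int sub_digits(uint32_t *result, const uint32_t *a,
                      const uint32_t *b, size_t n)
{
    uint64_t borrow = 0;
    for (size_t i = 0; i < n; i++) {
        /* Unsigned 64-bit subtraction wraps; if the true difference is
           negative, the upper 32 bits of diff become all ones. */
        uint64_t diff = (uint64_t)a[i] - b[i] - borrow;
        result[i] = (uint32_t)diff;
        borrow = (diff >> 32) & 1;
    }
    return (int)borrow;
}
```

A non-zero return value means b was larger than a, and result then holds the difference modulo 2^(32*n).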
Your compiler may provide a 128-bit type such as __int128_t. If it does not, you can define a struct with hi and lo members of the biggest type you have. In C++ you can also add operators similar to the ones you know from the other int_t types.
typedef struct uint128 {
uint64_t lo, hi; // lo comes first if you want to use little-endian else hi comes first
} uint128_t;
If you want to double the size, you use uint128_t in a similar struct.
Edit:
A simple function to increment the uint128 (the reference parameter makes this C++; in C, pass a pointer instead):
uint128_t& uint128_increase(uint128_t& value) {
// increase the low part, it is 0 if it was overflown
// so increase hi
if (!(++value.lo)) {
++value.hi;
};
return value;
};
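The same carry idea extends from increment to full addition. Here is a plain C sketch, assuming a lo/hi struct like the one above; the names u128 and u128_add are made up so they don't clash with any compiler-provided type.

```c
#include <stdint.h>

/* Hypothetical 128-bit unsigned value built from two 64-bit halves. */
typedef struct {
    uint64_t lo, hi;
} u128;

static u128 u128_add(u128 a, u128 b)
{
    u128 r;
    r.lo = a.lo + b.lo;                  /* wraps modulo 2^64 */
    r.hi = a.hi + b.hi + (r.lo < a.lo);  /* the low half wrapped iff r.lo < a.lo */
    return r;
}
```

The comparison r.lo < a.lo is the usual way to detect unsigned overflow without a wider type.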
Edit:
A runtime scaled version of ints, I use words, because it is faster in accessing memory:
typedef struct uint_dynamic {
// the length as a multiple of the wordsize
size_t length;
size_t* words;
} uint_dynamic_t;
uint_dynamic_t& uint_dynamic_increase(uint_dynamic_t& value) {
size_t* ptr = value.words; size_t i = value.length;
while(i && !(++*ptr)) { ++ptr; --i; };
return value;
};
Or if you want some constant size, put it clearly into a struct.
#define uint_fixed_SIZE (16 / sizeof(size_t))
typedef struct uint_fixed {
size_t words[uint_fixed_SIZE];
} uint_fixed_t;
uint_fixed_t& uint_fixed_increase(uint_fixed_t& value) {
size_t* ptr = value.words; size_t i = uint_fixed_SIZE;
while(i && !(++*ptr)) { ++ptr; --i; };
return value;
};
This can be rewritten as a #define macro, where you replace the specific values with parameters. You get similar functionality by defining the specific values and then including a file:
File fixed_int.h
// note that here is no #ifndef FILE_H or #pragma once
// to reuse the file
#define _concat1(a, b) a ## b
#define _concat(a, b) _concat1(a, b)
// number of size_t words needed to hold fixed_int_size bits, rounded up
#define _size ((fixed_int_size + sizeof(size_t) * 8 - 1) / (sizeof(size_t) * 8))
#ifndef fixed_int_name
#define _name _concat(uint_, fixed_int_size)
#else
#define _name fixed_int_name
#endif
#define _name_(member) _concat(_concat(_name, _), member)
typedef struct _name {
size_t words[_size];
} _name_(t);
_name_(t)& _name_(increase)(_name_(t)& value) {
size_t* ptr = value.words; size_t i = _size;
while(i && !(++*ptr)) { ++ptr; --i; };
return value;
};
// undef all defines!
#undef _concat1
#undef _concat
#undef _size
#undef _name
#undef _name_
File my_ints.h
//...
// the following lines define the type uint128_t and the function uint128_t& uint128_increase(uint128_t&)
#define fixed_int_name uint128 // is optional
#define fixed_int_size 128
#include"fixed_int.h"
#undef fixed_int_size
#undef fixed_int_name
//...
I want to do the two's complement of a float value.
unsigned long Temperature ;
Temperature = (~(unsigned long)(564.48))+1;
But the problem is that the cast loses information, 564 instead of 564.48.
Can I do the two's complement without a loss of information?
That is a very weird thing to do; floating-point numbers are not stored as 2s complement, so it doesn't make a lot of sense.
Anyway, you can perhaps use the good old union trick:
union {
float real;
unsigned long integer;
} tmp = { 564.48 };
tmp.integer = ~tmp.integer + 1;
printf("I got %f\n", tmp.real);
When I tried it (on ideone) it printed:
I got -0.007412
Note that this relies on unspecified behavior, so it's possible it might break if your compiler does not implement the access in the most straightforward manner. This is distinct from undefined behavior (which would make the code invalid), but still not optimal. Someone did tell me that newer standards make it clearer, but I've not found an exact reference so ... consider yourself warned.
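If you would rather not lean on the union, copying the bytes with memcpy is a common alternative. The sketch below assumes float is a 32-bit IEEE 754 type and uses uint32_t so the integer is exactly the same width; the function names are invented for illustration.

```c
#include <stdint.h>
#include <string.h>

_Static_assert(sizeof(float) == sizeof(uint32_t), "sketch assumes a 32-bit float");

/* Copy the object representation of a float into an integer. */
static uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Copy an integer bit pattern back into a float. */
static float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Two's complement of the bit pattern, reinterpreted as a float again. */
static float twos_complement_bits(float f)
{
    return bits_to_float(~float_to_bits(f) + 1u);
}
```

As with the union version, the resulting float is just whatever value the negated bit pattern happens to encode; it is not the arithmetic negation of the input.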
You can't use ~ over floats (it must be an integer type):
#include <stdio.h>
void print_binary(size_t const size, void const * const ptr)
{
unsigned char *b = (unsigned char *) ptr;
unsigned char byte;
int i, j;
for (i = size - 1; i >= 0; i--) {
for (j = 7; j >= 0; j--) {
byte = b[i] & (1 << j);
byte >>= j;
printf("%u", byte);
}
}
printf("\n");
}
int main(void)
{
float f = 564.48f;
char *p = (char *)&f;
size_t i;
print_binary(sizeof(f), &f);
for (i = 0; i < sizeof(float); i++) {
p[i] = ~p[i];
}
print_binary(sizeof(f), &f);
f += 1.f;
return 0;
}
Output:
01000100000011010001111010111000
10111011111100101110000101000111
Of course, print_binary is only there to check the result, so you can remove it. As pointed out by @barakmanos, print_binary assumes little-endian byte order; the rest of the code is not affected by endianness:
#include <stdio.h>
int main(void)
{
float f = 564.48f;
char *p = (char *)&f;
size_t i;
for (i = 0; i < sizeof(float); i++) {
p[i] = ~p[i];
}
f += 1.f;
return 0;
}
Casting a floating-point value to an integer value changes the "bit contents" of that value.
In order to perform two's complement on the "bit contents" of a floating-point value:
float f = 564.48f;
unsigned long Temperature = ~*(unsigned long*)&f+1;
Make sure that sizeof(long) == sizeof(float), or use double instead of float.
I had a short interview where a question is like this: set an integer value to be 0xaa55 at address 0x*****9.
The only thing I noticed is that the address given is not aligned on a word boundary, so setting an int *p to the address should not work. Is the answer then just to use an unsigned char *p and assign the value byte-wise? Is that the point of this interview question? There is no point in doing this in real life, is there?
You need to get back to the interviewer with a number of subsidiary questions:
What is the size in bytes of an int?
Is the machine little-endian or big-endian?
Does the machine handle non-aligned access automatically?
What is the performance penalty for handling non-aligned access automatically?
What is the point of this?
The chances are that someone is thinking of marshalling data the quick and dirty way.
You're right that one basic process is to write the bytes via a char * or unsigned char * that is initialized to the relevant address. The answers to my subsidiary questions 1 and 2 determine the exact mechanism to use, but for a 2-byte int in little-endian format, you might use:
unsigned char *p = 0x*****9; // Copied from question!
unsigned int v = 0xAA55;
*p++ = v & 0xFF;
v >>= 8;
*p = v & 0xFF;
You can generalize to 4-byte or 8-byte integers easily; handling big-endian integers is a bit more fiddly.
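A minimal sketch of that generalization, assuming 8-bit bytes and a value that fits in 64 bits. The names store_le and store_be are made up; the only real difference between the two is the direction in which the destination bytes are filled.

```c
#include <stddef.h>
#include <stdint.h>

/* Write the low nbytes bytes of v at p, least significant byte first. */
static void store_le(unsigned char *p, uint64_t v, size_t nbytes)
{
    for (size_t i = 0; i < nbytes; i++) {
        p[i] = (unsigned char)(v & 0xFF);
        v >>= 8;
    }
}

/* Write the low nbytes bytes of v at p, most significant byte first. */
static void store_be(unsigned char *p, uint64_t v, size_t nbytes)
{
    for (size_t i = nbytes; i-- > 0; ) {
        p[i] = (unsigned char)(v & 0xFF);
        v >>= 8;
    }
}
```

Because every access goes through an unsigned char pointer, p can be any address, aligned or not.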
I assembled some timing code to see what the relative costs were. Tested on a MacBook Pro (2.3 GHz Intel Core i7, 16 GiB 1333 MHz DDR3 RAM, Mac OS X 10.7.5, home-built GCC 4.7.1), I got the following times for the non-optimized code:
Aligned: 0.238420
Marshalled: 0.931727
Unaligned: 0.243081
Memcopy: 1.047383
Aligned: 0.239070
Marshalled: 0.931718
Unaligned: 0.242505
Memcopy: 1.060336
Aligned: 0.239915
Marshalled: 0.934913
Unaligned: 0.242374
Memcopy: 1.049218
When compiled with optimization, I got segmentation faults, even without -DUSE_UNALIGNED — which puzzles me a bit. Debugging was not easy; there seemed to be a lot of aggressive inline optimization which meant that variables could not be printed by the debugger.
The code is below. The Clock type and the timer.h header (and timer.c source) are not shown, but can be provided on request (see my profile). They provide high resolution timing across most platforms (Windows is shakiest).
#include <string.h>
#include <stdio.h>
#include "timer.h"
static int array[100000];
enum { ARRAY_SIZE = sizeof(array) / sizeof(array[0]) };
static int repcount = 1000;
static void uac_aligned(int value)
{
int *base = array;
for (int i = 0; i < repcount; i++)
{
for (int j = 0; j < ARRAY_SIZE - 2; j++)
base[j] = value;
}
}
static void uac_marshalled(int value)
{
for (int i = 0; i < repcount; i++)
{
char *base = (char *)array + 1;
for (int j = 0; j < ARRAY_SIZE - 2; j++)
{
*base++ = value & 0xFF;
value >>= 8;
*base++ = value & 0xFF;
value >>= 8;
*base++ = value & 0xFF;
value >>= 8;
*base = value & 0xFF;
value >>= 8;
}
}
}
#ifdef USE_UNALIGNED
static void uac_unaligned(int value)
{
int *base = (int *)((char *)array + 1);
for (int i = 0; i < repcount; i++)
{
for (int j = 0; j < ARRAY_SIZE - 2; j++)
base[j] = value;
}
}
#endif /* USE_UNALIGNED */
static void uac_memcpy(int value)
{
for (int i = 0; i < repcount; i++)
{
char *base = (char *)array + 1;
for (int j = 0; j < ARRAY_SIZE - 2; j++)
{
memcpy(base, &value, sizeof(int));
base += sizeof(int);
}
}
}
static void time_it(int value, const char *tag, void (*function)(int value))
{
Clock c;
char buffer[32];
clk_init(&c);
clk_start(&c);
(*function)(value);
clk_stop(&c);
printf("%-12s %12s\n", tag, clk_elapsed_us(&c, buffer, sizeof(buffer)));
}
int main(void)
{
int value = 0xAA55;
for (int i = 0; i < 3; i++)
{
time_it(value, "Aligned:", uac_aligned);
time_it(value, "Marshalled:", uac_marshalled);
#ifdef USE_UNALIGNED
time_it(value, "Unaligned:", uac_unaligned);
#endif /* USE_UNALIGNED */
time_it(value, "Memcopy:", uac_memcpy);
}
return(0);
}
memcpy((void *)0x23456789, &(int){0xaa55}, sizeof(int));
Yes, you may need to deal with unaligned multi-byte values in real life. Imagine your device exchanges data with another device. For example, this data may be a message structure sent over a network or a file structure saved to disk. The format of that data may be predefined and not under your control, and the definition of the data structure may not account for the alignment (or even endianness) restrictions of your device. In these situations you'll need to take care when accessing these unaligned multi-byte values.
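As a small illustration of the receiving side, here is a sketch that reads a 32-bit big-endian field from an arbitrary offset in a message buffer. The function name load_be32 and the big-endian wire format are assumptions for the example, not something fixed by the question.

```c
#include <stdint.h>

/* Assemble a 32-bit value from four bytes stored most significant first.
   Works at any address because it only reads single bytes. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) |
           ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |
            (uint32_t)p[3];
}
```

Reading byte by byte sidesteps both the alignment and the endianness of the host, at the cost of a few shifts.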
I wrote this code to do the IEEE 754 floating point arithmetic on a 4byte string.
It takes in the bytes, converts them to binary and with the binary I get the sign, exponent, and mantissa and then do the calculation.
It all works just about perfectly: 0xDEADBEEF gives me 6259853398707798016 and the true answer is 6.259853398707798016E18. These are the same value, and I won't have anything this large in the project I'm working on; all other, smaller values put the decimal point in the correct place.
Here is my code:
float calcByteValue(uint8_t data[]) {
int i;
int j = 0;
int index;
int sign, exp;
float mant;
char bits[8] = {0};
int *binary = malloc(32*sizeof *binary);
for (index = 0;index < 4;index++) {
for (i = 0;i < 8;i++,j++) {
bits[i] = (data[index] >> 7-i) & 0x01;
if (bits[i] == 1) {
binary[j] = 1;
} else {
binary[j] = 0;
}
}
printf("\nindex(%d)\n", index);
}
sign = getSign(&(binary[0]));
mant = getMant(&(binary[0]));
exp = getExp(&(binary[0]));
printf("\nBinary: ");
for (i = 0;i < 32;i++)
printf("%d", binary[i]);
printf("\nsign:%d, exp:%d, mant:%f\n",sign, exp, mant);
float f = pow(-1.0, sign) * mant * pow(2,exp);
printf("\n%f\n", f);
return f;
}
//-------------------------------------------------------------------
int getSign(int *bin) {
return bin[0];
}
int getExp (int *bin) {
int expInt, i, b, sum;
int exp = 0;
for (i = 0;i < 8;i++) {
b = 1;
b = b<<(7-i);
if (bin[i+1] == 1)
exp += bin[i+1] * b;
}
return exp-127;
}
float getMant(int *bin) {
int i,j;
float b;
float m = 0; // must start at zero, it is accumulated below
int manBin[24] = {0};
manBin[0] = 1;
for (i = 1,j=9;j < 32;i++,j++) {
manBin[i] = bin[j];
printf("%d",manBin[i]);
}
for (i = 0;i < 24;i++) {
m += manBin[i] * pow(2,-i);
}
return m;
}
Now, my teacher told me that there is a much easier way where I can just take in the stream of bytes, and turn it into a float and it should work. I tried doing it that way but could not figure it out if my life depended on it.
I'm not asking you to do my homework for me, I have it done and working, but I just need to know if I could have done it differently/easier/more efficiently.
EDIT: there are a couple special cases I need to handle, but it's just things like if the exponent is all zeros blah blah blah. Easy to implement.
The teacher probably had this in mind:
char * str; // your deadbeef
float x;
memcpy(&x, str, sizeof(float));
I would advise against it, for the issues with endianness. But if your teacher wants it, he shall have it.
I think you want a union - just create a union where one member is a 4 character array, and the other a float. Write the first, then read the second.
Looking at what your code does then the "4 byte string" looks like it already contains the binary representation of a 32 bit float, so it already exists in memory at the address specified by data in big endian byte order.
You could probably cast the array data to a float pointer and dereference that (if you can assume the system you are running on is big endian and that data will be correctly aligned for the float type on your platform).
Alternatively if you need more control (for example to change the byte order or ensure alignment) you could look into type punning using a union of a uint8_t array and a float. Copy the bytes into your union's uint8_t array and then read the float member.
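One way to keep control over the byte order without depending on the host is to assemble the 32-bit pattern explicitly and then copy it into a float. This is a sketch, assuming float is a 32-bit IEEE 754 type and the input bytes are big-endian as in the question; float_from_be_bytes is a made-up name.

```c
#include <stdint.h>
#include <string.h>

/* Interpret four big-endian bytes as an IEEE 754 single-precision value. */
static float float_from_be_bytes(const uint8_t data[4])
{
    /* Build the bit pattern with shifts, so host byte order does not matter. */
    uint32_t bits = ((uint32_t)data[0] << 24) |
                    ((uint32_t)data[1] << 16) |
                    ((uint32_t)data[2] << 8)  |
                     (uint32_t)data[3];
    float f;
    memcpy(&f, &bits, sizeof f); /* assumes sizeof(float) == 4 */
    return f;
}
```

For example, the bytes 0x41 0xC8 0x00 0x00 decode to 25.0, with no manual handling of sign, exponent, or mantissa.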
Here is my working code:
unsigned char val[4] = {0, 0, 0xc8, 0x41};
cout << val << endl;
cout << "--------------------------------------------" << endl;
float f = *(float*)&val;
cout << f << endl;
return 0;
I'm representing an infinitely precise integer as an array of unsigned ints for processing on a GPU. For debugging purposes I'd like to print the base 10 representation of one of these numbers, but am having difficulty wrapping my head around it. Here's what I'd like to do:
//the number 4*(2^32)^2+5*(2^32)^1+6*(2^32)^0
unsigned int aNumber[3] = {4,5,6};
char base10TextRepresentation[50];
convertBase2To32ToBase10Text(aNumber,base10TextRepresentation);
Any suggestions on how to approach this problem?
Edit: Here's a complete implementation thanks to drhirsch
#include <string.h>
#include <stdio.h>
#include <stdint.h>
#define SIZE 4
uint32_t divideBy10(uint32_t * number) {
uint32_t r = 0;
uint32_t d;
for (int i=0; i<SIZE; ++i) {
d = (number[i] + r*0x100000000) / 10;
r = (number[i] + r*0x100000000) % 10;
number[i] = d;
}
return r;
}
int zero(uint32_t* number) {
for (int i=0; i<SIZE; ++i) {
if (number[i] != 0) {
return 0;
}
}
return 1;
}
void swap(char *a, char *b) {
char tmp = *a;
*a = *b;
*b = tmp;
}
void reverse(char *str) {
int x = strlen(str);
for (int y = 0; y < x/2; y++) {
swap(&str[y],&str[x-y-1]);
}
}
void convertTo10Text(uint32_t* number, char* buf) {
int n = 0;
do {
int digit = divideBy10(number);
buf[n++] = digit + '0';
} while(!zero(number));
buf[n] = '\0';
reverse(buf);
}
int main(int argc, char** argv) {
uint32_t aNumber[SIZE] = {0,0xFFFFFFFF,0xFFFFFFFF,0xFFFFFFFF};
uint32_t bNumber[4] = {1,0,0,0};
char base10TextRepresentation[50];
convertTo10Text(aNumber, base10TextRepresentation);
printf("%s\n",base10TextRepresentation);
convertTo10Text(bNumber, base10TextRepresentation);
printf("%s\n",base10TextRepresentation);
}
If you have access to 64 bit arithmetic, it is easier. I would do something along the line of:
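uint32_t divideBy10(uint32_t* number) {
    uint32_t r = 0;
    uint32_t d;
    for (int i=0; i<SIZE; ++i) {
        d = (number[i] + r*0x100000000) / 10;
        r = (number[i] + r*0x100000000) % 10;
        number[i] = d; // store the quotient word; r carries into the next word
    }
    return r; // the final remainder is the next decimal digit
}
void convertTo10Text(uint32_t* number, char* buf) {
    char* p = buf;
    do {
        int digit = divideBy10(number);
        *p++ = digit + '0';
    } while (!isEqual(number, zero));
    *p = '\0';
    reverse(buf); // digits were produced least significant first
}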
isEqual() and reverse() left to be implemented. divideBy10 divides by 10 and returns the remainder.
Fundamentally you need classic decimal printing using digit production by dividing your number by ten (in your base 2^32) repeatedly and using the remainder as digits. You may not have a divide by (anything, let alone) 10 routine, which is probably the key source of your problem.
If you are working in C or C++, you can get a complete arbitrary-precision arithmetic package from the GNU Bignum package. Most other widely used languages have similar packages available.
Of course, if you have too much free time, you can always implement multiprecision division yourself. You're already borrowing terminology from Knuth; he also supplies the multiprecision algorithms in Seminumerical Algorithms.
If it is .NET, take a look at this implementation of a BigInteger class.
How about using long doubles? On x86 the 80-bit extended format gives you a 64-bit mantissa, but I guess accuracy is lost once the number needs more bits than that.