reduce() is not working as expected in Kotlin for a sum operation on some values - arrays

val arr = arrayListOf(256741038, 623958417, 467905213, 714532089, 938071625)
arr.sort()
val max = arr.slice(1 until arr.size).reduce { x, ars -> x + ars }
I want the maximum sum of 4 out of the 5 elements in the array, but I am getting an answer which is not expected:
max = -1550499952
I don't know what is going wrong here, because it works for many cases but not for this one.
The expected output would be:
max = 2744467344

If you ever see a negative number appearing out of nowhere, that's a sign that you've got an overflow. The largest number an Int can represent is 2147483647 - add 1 to that and you get -2147483648, a negative number.
That's because signed integers represent negative numbers with a 1 in the most significant bit of the binary representation. Your largest positive number is 0111 (except with 32 bits not 4!), then you add 1 and it ticks over to 1000, the largest negative number. Then as you add to that, it moves towards zero, until you have 1111 (which is -1). Add another 1 and it overflows (there's no space to represent 10000) and you're back at zero, 0000.
Anyway, the point is you're adding lots of big numbers together and an Int can't hold the result. It keeps overflowing, so you lose the bigger digits (it can't represent more than ~2 billion), and the result can be negative depending on where the overflow ends up, i.e. which half of the binary range it lands in.
You can fix this by using Longs instead (64-bits, max values +/- 9 quintillion, lots of room):
// note the L's to make them Longs
val arr = arrayListOf(256741038L, 623958417L, 467905213L, 714532089L, 938071625L)
arr.sort()
val max = arr.slice(1 until arr.size).reduce { x, ars -> x + ars }  // 2744467344

Why is log base 10 used in this code to convert int to string?

I saw a post explaining how to convert an int to a string. In the explanation there is a line of code to get the number of chars in a string:
(int)((ceil(log10(num))+1)*sizeof(char))
I’m wondering why log base 10 is used?
ceil(log10(num))+1 is incorrectly being used instead of floor(log10(num))+2.
The code is attempting to determine the amount of memory needed to store the decimal representation of the positive integer num as a string.
The two formulas presented above are equal except for numbers which are exact powers of 10, in which case the former version returns one less than the desired number.
For example, 10,000 requires 6 bytes, yet ceil(log10(10000))+1 returns 5. floor(log10(10000))+2 correctly returns 6.
How was floor(log10(num))+2 obtained?
A 4-digit number such as 4567 will be between 1,000 (inclusive) and 10,000 (exclusive), so it will be between 10³ (inclusive) and 10⁴ (exclusive), so log10(4567) will be between 3 (inclusive) and 4 (exclusive).
As such, floor(log10(num))+1 will return number of digits needed to represent the positive value num in decimal.
As such, floor(log10(num))+2 will return the amount of memory needed to store the decimal representation of the positive integer num as a string. (The extra char is for the NUL that terminates the string.)
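For illustration, a minimal sketch of an int-to-string helper sized with the corrected formula (the helper name is made up; it assumes num is positive):
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: convert a positive int to a freshly allocated string,
   sizing the buffer with floor(log10(num)) + 2 (digits plus terminating NUL). */
char *positive_int_to_string(int num)
{
    size_t size = (size_t)floor(log10(num)) + 2;
    char *s = malloc(size);
    if (s != NULL)
        snprintf(s, size, "%d", num);
    return s;
}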
I’m wondering why log base 10 is used?
I'm wondering the same thing. It uses a very complex calculation that happens at runtime, to save a couple bytes of temporary storage. And it does it wrong.
In principle, you get the number of digits in base 10 by taking the base-10 logarithm and flooring and adding 1. It comes exactly from the fact that
log10(1) = log10(10⁰) = 0
log10(10) = log10(10¹) = 1
log10(100) = log10(10²) = 2
and all numbers between 10 and 100 have their logarithms between 1 and 2 so if you floor the logarithm for any two digit number you get 1... add 1 and you get the number of digits.
But you do not need to do this at runtime. The maximum number of bytes needed for a 32-bit int in base 10 is 10 digits, negative sign and null terminator for 12 chars. The maximum you can save with the runtime calculation are 10 bytes of RAM, but it is usually temporary so it is not worth it. If it is stack memory, well, the call to log10, ceil and so forth might require far more.
In fact, we know the maximum number of bits needed to represent an integer: sizeof (int) * CHAR_BIT. This is greater than or equal to log2 of INT_MAX + 1. And we know that log2(x) =~ 3.32192809489 * log10(x), so we get a good (possibly floored) approximation of log10(INT_MAX) by just dividing sizeof (int) * CHAR_BIT by 3. Then add 1 because we were supposed to add 1 to the floored logarithm to get the number of digits, then 1 for a possible sign, and 1 for the null terminator, and we get
sizeof (int) * CHAR_BIT / 3 + 3
Unlike the one from your question, this is an integer constant expression, i.e. the compiler can easily fold it at compile time, and it can be used to set the size of a statically sized array. For 32 bits it gives 13, which is only one more than the 12 actually required; for 16 bits it gives 8, which is again only one more than the maximum required 7; and for 8 bits it gives 5, which is the exact maximum.
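For illustration, a minimal sketch that uses this constant expression to size a buffer at compile time (the macro name is made up):
#include <limits.h>
#include <stdio.h>

/* Integer constant expression: room for the digits, a possible sign,
   and the terminating null of any int printed in base 10. */
#define INT_DEC_CHARS (sizeof(int) * CHAR_BIT / 3 + 3)

int main(void)
{
    char buf[INT_DEC_CHARS];
    snprintf(buf, sizeof buf, "%d", INT_MIN);   /* worst case: sign + 10 digits */
    puts(buf);
    return 0;
}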
ceil(log10(num)) + 1 is intended to provide the number of characters needed for the output string.
For example, if num=101, the expression's value is 4, the correct length of '101' plus the null terminator.
But if num=100, the value is 3. This behavior is incorrect.
This is because it's allocating enough space for the number to fit in the string.
If, for example, you had the number 1034, log10(1034) = 3.0145.... ceil(3.0145) is 4, which is the number of digits in the number. The + 1 is for the null-terminator.
This isn't perfect though: take 1000, for example. Despite having four digits, log(1000) = 3, and ceil(3) = 3, so this will allocate space for too few digits. Plus, as #phuclv mentions below, the log() function is very time-consuming for this purpose, especially since the length of a number has a (relatively low) upper-bound.
The reason it's log base 10 is because, presumably, this function represents the number in decimal form. If, for example, it were hexadecimal, log base 16 would be used.
A number N has n decimal digits iff 10^(n-1) <= N < 10^n which is equivalent to n-1 <= log(N) < n or n = floor(log(N)) + 1.
Since double representation has only limited precision, floor(log(N)) may be off by 1 for certain values, so it is safer to allow for an extra digit, i.e. allocate floor(log(N)) + 2 characters, and then another char for the nul terminator, for a total of floor(log(N)) + 3.
The expression in the original question, ceil(log(N)) + 1, appears not to count the nul terminator, nor allow for the chance of rounding errors, so it is one shorter in general, and two shorter for powers of 10.

matrix multiplication in C and MATLAB, different result

I am using a 4th-order Runge-Kutta routine (4Rungekutta) to solve the ODE (Ax + Bu = x_dot) in MATLAB and C.
A is 5x5, x is 5x1, B is 5x1, u is 1x1; u is the output of a sine function (2500 points).
The outputs of 4Rungekutta in MATLAB and C are identical until the 45th iteration, but at the 45th iteration (of 2500) the outputs of A*x at the 2nd step of 4Rungekutta differ. Here are the matrices;
I have printed them with 30 decimals.
A and x are the same in MATLAB and C
A = [0, 0.100000000000000005551115123126, 0, 0, 0;
-1705.367199390822406712686643004417, -13.764624913971095665488064696547, 245874.405372532171895727515220642090, 0.000000000000000000000000000000, 902078.458362009725533425807952880859;
0, 0, 0, 0.100000000000000005551115123126, 0;
2.811622989796986438193471258273, 0, -572.221510883482778808684088289738, -0.048911651728553134921284595293, 0;
0, 0, -0.100000000000000005551115123126, 0, 0]
x = [0.071662614269441649028635765717 ;
45.870073568955461951190955005586;
0.000002088948888569741376840423;
0.002299524406171214990085571728;
0.000098982102875767145086331744]
but the results of A*x are not the same; the second element in MATLAB is -663.792187417201375865261070430279, and in C it is -663.792187417201489552098792046309.
MATLAB
A*x = [ 4.587007356895546728026147320634
-663.792187417201375865261070430279
0.000229952440617121520692600622
0.200180438762844026268084007825
-0.000000208894888856974158859866];
C
A*x = [4.587007356895546728026147320634
-663.792187417201489552098792046309
0.000229952440617121520692600622
0.200180438762844026268084007825
-0.000000208894888856974158859866];
Though the difference is small, I need this result to do a finite difference, and at that point the discrepancy becomes more obvious.
Does anyone know why?
How many digits do you consider you need? The first 16 digits of each number are equal, which is approximately the amount of data a double can normally represent and store internally. You cannot get more: even if you force your printing routines to print more digits, they will print rubbish. What happens is that you have asked your printing routines for, say, 120 digits, and they will print those, normally by repeatedly multiplying whatever remainder is left. As numbers are represented in base 2, you normally don't get zeros once past the internal precision of the number, and the printing implementations don't have to agree on the digits printed once there are no more bits represented in your number.
Suppose for a moment you have a hand calculator that only has 10 digits of precision, and you are given numbers of 120 digits. You begin to calculate and only get results with 10 digits... but you have been asked to print a report with 120-digit results. Well, as the overall calculation cannot be done with more than 10 digits, what can you do? You are using a calculator unable to give you the requested number of digits, and what's more, the number of base-10 digits in a 52-bit significand is not a whole number of digits (there are 15.65355977452702215111442252567364 decimal digits in a 52-bit significand). What can you do? You can fill with zeros (incorrect, most probably), you can fill those places with rubbish (which will never affect the leading 10-digit result), or you can go to Radio Shack and buy a 120-digit calculator. Floating point printing routines use a counter to specify how many times to go into a loop and get another digit; they normally stop when the counter reaches its limit, but don't make any extra effort to check whether you have gone crazy and asked for a huge number of digits. If you ask for 600 digits, you just get 600 loop iterations, but the digits will be fake.
You should expect a difference of about one part in 2^52 in a double number, as that is the number of binary digits used for the significand (this is approximately 2.220446049250313080847263336181641e-16, so you have to multiply this number by the value you printed to see roughly how large the rounding error is). If you multiply your number 663.792187417201375865261070430279 by that, you get 1.473914740073748177152126604805902e-13, which is an estimate of where the last valid digit of that number is. The error estimate will probably be far larger due to the large number of multiplications and sums required to compute each element. Anyway, a resolution of 1.0e-13 is very good (a subatomic difference, if the values were lengths in meters).
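For illustration, a small C sketch of that estimate, using the second element from your output (DBL_EPSILON is 2^-52):
#include <float.h>
#include <math.h>
#include <stdio.h>

int main(void)
{
    double x = 663.792187417201375865261070430279;

    /* DBL_EPSILON = 2^-52, the relative spacing of doubles near 1.0. */
    printf("relative precision  %.15e\n", DBL_EPSILON);
    printf("x * DBL_EPSILON     %.15e\n", x * DBL_EPSILON);            /* ~1.47e-13 */
    printf("actual ULP of x     %.15e\n", nextafter(x, INFINITY) - x); /* same order */
    return 0;
}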
EDIT
as an example, just consider the following program:
#include <stdio.h>

int main()
{
    printf("%.156f\n", 0.1);
}
if you run it you'll get:
0.100000000000000005551115123125782702118158340454101562500000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000
which is indeed the exact decimal value of the machine's base-2 floating point approximation to the number 0.1 (0.1 happens to be a periodic number when represented in base 2). Its representation is:
0.0001100110011(0011)*
so it cannot be represented exactly in 52 bits, as the pattern 0011 repeats indefinitely. At some point you have to cut it, and the printf routine then appends zeros to the right until it reaches the requested number of digits, as in the output above. (All numbers with a finite number of digits in base 2 are representable with a finite number of digits in base 10, but the converse is not true, because all the prime factors of 2 are in 10, but not all the prime factors of 10 are in 2.)
If you divide the difference between 0.1 and 0.1000000000000000055511151231257827021181583404541015625 by 0.1, you'll get 5.55111512312578270211815834045414e-17, which is 1/2^54, or about one quarter of the limit 1/2^52 I showed you above. That stored value is the closest number representable with a 52-bit significand to the number 0.1.

Generating random numbers in ranges from 32 bytes of random data, without bignum library

I have 32 bytes of random data.
I want to generate random numbers within variable ranges between 0-9 and 0-100.
If I used an arbitrary precision arithmetic (bignum) library, and treated the 32 bytes as a big number, I could simply do:
random = random_source % range;
random_source = random_source / range;
as often as I liked (with different ranges) until the product of the ranges nears 2^256.
Is there a way of doing this using only (fixed-size) integer arithmetic?
Certainly you can do this by doing base-256 long division (or push-up multiplication). It is just like the long division you learnt in primary school, but with bytes instead of digits. It involves doing a cascade of divides and remainders for each byte in turn. Note that you also need to be aware of how you are consuming the big number, and that as you consume it and it becomes smaller, there is an increasing bias against the larger values in the range. E.g. if you only have 110 left and you ask for rnd(100), each of the values 0-9 would be twice as likely as each of 10-99.
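For illustration, a minimal sketch of that byte-by-byte division in C, assuming the 32 bytes are stored most significant byte first and the range is small (at most a few hundred, as in the question):
#include <stddef.h>
#include <stdint.h>

/* Divide the big number in data[0..len-1] (most significant byte first)
   by range, in place, and return the remainder. The remainder is the
   random value in [0, range); the quotient left in data is the smaller
   random_source to reuse for the next draw. */
unsigned bignum_divmod(uint8_t *data, size_t len, unsigned range)
{
    unsigned rem = 0;
    for (size_t i = 0; i < len; i++)
    {
        unsigned t = rem * 256u + data[i];  /* bring down the next base-256 digit */
        data[i] = (uint8_t)(t / range);     /* quotient digit stays in the number */
        rem = t % range;                    /* carry the remainder to the right */
    }
    return rem;
}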
But, you don't really need the bignum techniques for this, you can use ideas from arithmetic encoding compression, where you build up the single number without actually ever dealing with the whole thing.
If you start by reading 4 bytes into an unsigned 32-bit (uint32_t) buffer, it has a range 0..4294967295, with a non-inclusive max of 4294967296. I will refer to this synthesised value as the "carry forward", and this exclusive max value is also important to record.
[For simplicity, you might start with reading 3 bytes to your buffer, generating a max of 16M. This avoids ever having to deal with the 4G value that can't be held in a 32 bit integer.]
There are 2 ways to use this, both with accuracy implications:
Stream down:
Do your modulo range. The modulo is your random answer. The division result is your new carry forward and has a smaller range.
Say you want 0..99, so you modulo by 100, your upper part has a range max 42949672 (4294967296/100) which you carry forward for the next random request
We can't feed another byte in yet...
Say you now want 0..9, so you modulo by 10, and now your upper part has a range 0..4294967 (42949672/10)
As max is less than 16M, we can now bring in the next byte. Multiply it by the current max 4294967 and add it to the carry forward. The max is also multiplied by 256 -> 1099511552
This method has a slight bias towards small values: roughly once in every "next max" draws, the available range of values will not be the full range, because the last value is truncated. By choosing to maintain 3-4 good bytes in max, that bias is minimised; it will only occur at most 1 in 16 million times.
The computational cost of this algorithm is the division of both the carry forward and the max by the random range, and then the multiply each time you feed in a new byte. I assume the compiler will optimise the modulo and the division into a single operation.
Stream up:
Say you want 0..99
Divide your max by range, to get the nextmax, and divide carryforward by nextmax. Now, your random number is in the division result, and the remainder forms the value you carry forward to get the next random.
When nextmax becomes less than 16M, simply multiply both nextmax and your carry forward by 256 and add in the next byte.
The downside of this method is that, depending on the division used to generate nextmax, the top value result (i.e. 99 or 9) is heavily biased against, OR sometimes you will generate the over-value (100) - this depends on whether you round up or down when doing the first division.
The computational cost here is again 2 divides, presuming the compiler optimiser blends div and mod operations. The multiply by 256 is fast.
In both cases you could choose to say that if the input carry forward value is in this "high bias range" then you will perform a different technique. You could even oscillate between the techniques - use the second in preference, but if it generates the over-value, then use the first technique, though on its own the likelihood is that both techniques will bias for similar input random streams when the carry forward value is near max. This bias can be reduced by making the second method generate -1 as the out-of-range, but each of these fixes adds an extra multiply step.
Note that in arithmetic encoding this overflow zone is effectively discarded as each symbol is extracted. It is guaranteed during decoding that those edge values won't happen, and this contributes to the slight suboptimal compression.
/* The 32 bytes in data are treated as a base-256 numeral following a "." (a
   radix point marking where fractional digits start). This routine
   multiplies that numeral by range, updates data to contain the fractional
   portion of the product, and returns the integer portion.
   8-bit bytes are assumed, or "t /= 256" could be changed to
   "t >>= CHAR_BIT". But then you have to check the sizes of int
   and unsigned char to consider overflow.
*/
int r(int range, unsigned char *data)
{
    // Start with 0 carried from a lower position.
    int t = 0;
    // Iterate through each byte.
    for (int i = 32; 0 < i;)
    {
        --i;
        // Multiply next byte by our multiplier and add the carried data.
        t = data[i] * range + t;
        // Store the low bits of the result.
        data[i] = t;
        // Carry the high bits of the result to the next position.
        t /= 256;
    }
    // Return the bits that carried out of the multiplication.
    return t;
}
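For illustration, consecutive draws with different ranges might look like this inside a function (assuming data has already been filled with 32 random bytes):
unsigned char data[32] = {0};   /* fill with 32 random bytes before drawing */
int a = r(100, data);           /* 0..99 */
int b = r(10, data);            /* 0..9, taken from what remains */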

Explaining the computation of the square root of two fixed point fractions

I have found this piece of code for the Blackfin 533, which has fract32, a fractional type that ranges from -1 to 1; it is in the 1.31 format.
I can't get why the pre-shifting is required for calculating the amplitude of a complex number (re, im). I know that if you want to multiply two numbers in 1.31 fractional format then you need to shift the result right by 62 bits.
GO_coil_D[0].re, and GO_coil_D[0].im are two fract32.
I can't get what the following code is doing :
norm[0] = norm_fr1x32(GO_coil_D[0].re);
norm[1] = norm_fr1x32(GO_coil_D[0].im);
shift = (norm[0] < norm[1]) ? (norm[0] - 1) : (norm[1] - 1);
vectorFundamentalStored.im = shl_fr1x32(GO_coil_D[0].im,shift);
vectorFundamentalStored.re = shl_fr1x32(GO_coil_D[0].re,shift);
vectorFundamentalStored.im = mult_fr1x32x32(vectorFundamentalStored.im, vectorFundamentalStored.im);
vectorFundamentalStored.re = mult_fr1x32x32(vectorFundamentalStored.re, vectorFundamentalStored.re);
amplitudeFundamentalStored = sqrt_fr16(round_fr1x32(add_fr1x32(vectorFundamentalStored.re,vectorFundamentalStored.im))) << 16;
amplitudeFundamentalStored = shr_fr1x32(amplitudeFundamentalStored,shift);
round_fr1x32 (fract32 f1) fract16 Rounds the 32-bit fract to a 16-bit fract using biased rounding.
norm_fr1x32 (fract32) int Returns the number of left shifts required to normalize the input variable so that it is either in the interval 0x40000000 to 0x7fffffff, or in the interval 0x80000000 to 0xc0000000. In other words, fract32 x; shl_fr1x32(x, norm_fr1x32(x)); returns a value in the range 0x40000000 to 0x7fffffff, or in the range 0x80000000 to 0xc0000000
1) If the most significant n bits of the fractional part are all '0' bits, and they are followed by a '1' bit, then n behaves like a floating point binary exponent of value n, and the remaining 31-n bits behave like the mantissa. Squaring the number doubles the number of leading '0' bits to 2*n and reduces the size of the mantissa to 31-2*n bits. This can lead to a loss of precision in the result of the squaring operation.
2) round_fr1x32 converts the 1.31 fraction to a 1.15 fraction, losing up to 16 more bits of precision.
Hopefully you can see that steps 1 and 2 can remove a lot of precision in the number. Pre-scaling the number reduces the number of leading '0' bits n as much as possible, resulting in less precision being lost at step 1. In fact, for one of the two numbers being squared and added, the number of leading '0' bits n will be zero, so squaring that number will still leave up to 31 bits of precision before it is added to the other number. (Step 2 will reduce that precision to 15 bits.)
Lastly, you are wrong about the result of multiplying two 1.31 fraction format numbers - the result needs to be shifted right by 31 bits, not 62 bits.
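For illustration, a plain-C model of the 1.31 multiply (ignoring the rounding and saturation that the real mult_fr1x32x32 intrinsic performs):
#include <stdint.h>

/* The 64-bit product of two Q1.31 values is a Q2.62 number, so shifting it
   right by 31 bits (not 62) brings it back to Q1.31. */
static int32_t q31_mult(int32_t a, int32_t b)
{
    int64_t product = (int64_t)a * b;   /* Q2.62 intermediate */
    return (int32_t)(product >> 31);    /* back to Q1.31 */
}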
Worked example:
Let's say the real part is 3/1024 and the imaginary part is 4/1024 in decimal, so the absolute value should be 5/1024 by pythagoras.
With no pre-scaling, the binary fractions are re=0.0000000011₂, im=0.0000000100₂. Squaring them gives re²=0.00000000000000001001₂, im²=0.00000000000000010000₂. Adding the squares gives abs²=0.00000000000000011001₂. Rounding to 15 fractional bits gives abs²=0.000000000000001₂. Taking the square root gives abs=0.000000010110101₂. This differs from the exact result 0.0000000101₂ by 0.000000000010101₂.
When prescaling, both fractions are shifted left by 6 bits, giving sre=0.0011₂, sim=0.0100₂ (I used the prefix 's' to mean 'scaled'). Squaring them gives sre²=0.00001001₂, sim²=0.00010000₂. Adding the squares gives sabs²=0.00011001₂. Rounding to 15 fractional bits does not change the value. Taking the square root gives sabs=0.01010000₂. Converting that to 1.31 format and shifting right by 6 bits gives abs=0.0000000101₂ which is exactly correct (5/1024 in decimal).

loop over 2^n states of n bits in C with n > 32

I'd like to have a loop in C over all possible 2^n states of n bits. For example if n=4 I'd like to loop over 0000, 0001, 0010, 0011, ..., 1110, 1111. The bits can be represented in any way, for example an integer array of length n with values 0 or 1, or a character array of length n with values "0" or "1", etc, it doesn't really matter.
For smallish n what I do is calculate x=2^n using integer arithmetic (both n and x are integers), then
for (i = 0; i < x; i++) {
    bits = convert_integer_to_bits(i);
    work_on_bits(bits);
}
Here 'bits' is in the chosen representation of bits; what has been useful so far is an integer array of length n with values 0 or 1 (but it could be anything else).
If n>32 this approach obviously doesn't work even with longs.
How would I work with n>32?
Specifically, do I really need to evaluate 2^n, or is there a tricky way of writing the loop which does not refer to the actual value of 2^n but nevertheless iterates 2^n times?
For n > 32 use unsigned long long. This will work for n up to 64. Still, for values even close to 50 you will have to wait a long time until the loop finishes.
It's not clear why you say that if n>32, it obviously won't work. Is your concern the width of bits, or is your concern the run time?
If you're concerned about number width, investigate a big math library such as http://gmplib.org/.
If you're concerned about run time... you won't live long enough for your loop to complete if the width is large enough, so get a different hobby ;) Seriously, figure out the rough run time of one iteration through your loop, multiply that by 4 billion, divide by 20 years, and you'll have an estimate of the number of generations of your descendants that will need to wait for the answer.
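If the concern is the width rather than the run time, you do not need to evaluate 2^n at all: treat the bit array itself as an odometer and add 1 with carry until the carry falls off the end. A minimal sketch (work_on_bits is the routine from the question; N = 40 is just an example):
#define N 40   /* example n > 32 */

/* Advance bits[] (values 0 or 1, least significant first) to the next state.
   Returns 0 after wrapping back to all zeros, i.e. after all 2^n states. */
static int next_state(int bits[], int n)
{
    for (int i = 0; i < n; i++)
    {
        if (bits[i] == 0) { bits[i] = 1; return 1; }
        bits[i] = 0;   /* this position overflows; carry into the next one */
    }
    return 0;          /* carried past the last bit: every state was visited */
}

int main(void)
{
    int bits[N] = {0};
    do {
        /* work_on_bits(bits); */
    } while (next_state(bits, N));
    return 0;
}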
